What is hiding in AI traffic

February 20, 2026
Anna Baron Garcia
Data Scientist

1. The Upcoming Paradigm Shift

Offensive operations and red-teaming are undergoing a paradigm shift: what was once primarily human-driven is becoming increasingly autonomous and event-driven, mediated through agentic frameworks that can plan, act, and coordinate with minimal oversight. Our research on Model Context Protocol-powered Swarm Command-and-Control describes this shift not as a hypothetical future but as an emerging operational reality. In the previous blogpost, Model Context Protocol (MCP) is treated as a new command-and-control (C2) substrate tailored to the agent era, enabling AI agents to communicate with operators and each other in ways that look increasingly like everyday enterprise AI activity.

Classic command-and-control has a distinctive fingerprint: periodic beaconing at regular or irregular intervals, predictable infrastructure patterns, and human-paced task execution. Adversaries can randomize sleep timers or rotate C2 domains, but defenders usually exploit the underlying requirement that implants must call back home often enough for the communications channel to remain useful. The MCP model challenges that assumption. Because MCP is built for short-lived, on-demand exchanges between models and external tools, it naturally supports event-driven C2: agents can connect briefly to retrieve a task, disconnect to execute, then reconnect only when results or new context are available. Even if the AI communications protocol itself is legitimate, the mission intent behind that traffic may be malicious.
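The difference between these two traffic shapes can be illustrated with a minimal sketch. The interval analysis below is a simplified, hypothetical stand-in for the statistics real beacon detectors compute: periodic callbacks, even with jitter, keep a low coefficient of variation in their inter-arrival times, while event-driven connections have no cadence to measure.

```python
import statistics

def beacon_score(timestamps):
    """Coefficient of variation of inter-arrival times.

    Periodic beaconing (even with jitter) yields a low score;
    event-driven connections yield a high, unstable one.
    """
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None
    mean = statistics.mean(gaps)
    return statistics.stdev(gaps) / mean if mean else None

# Implant beaconing every ~60s with roughly 10% jitter: low score.
beacon = [0, 61, 118, 181, 240, 299, 362]
# Event-driven agent: connects only when a task or result exists.
event_driven = [0, 14, 702, 719, 3600, 3655, 9100]

print(beacon_score(beacon))        # small: regular cadence stands out
print(beacon_score(event_driven))  # large: no cadence to detect
```

Randomized sleep timers move the score a little; event-driven tasking removes the signal entirely, which is why interval-based detections lose traction against MCP-style C2.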

The Swarm C2 idea compounds this advantage. Instead of a single autonomous agent running a linear kill chain, many agents can be orchestrated in parallel with specialized, role-based behavior. For example, one agent may focus on reconnaissance while another focuses on exploit research, sharing discoveries via MCP and recombining work products at machine speed. Swarm communication can also introduce redundancy and variation into the traffic. In other words, MCP-enabled agent swarms do not merely automate steps of offensive operations; they can automate the operating model itself. Where attackers once needed skilled people to continuously coordinate tasks, interpret telemetry, and sequence actions, swarms can now do most of that autonomously, reducing the human-in-the-loop role to specifying strategic goals and occasionally intervening on edge cases.

The upcoming shift is not only about higher speed; it also brings autonomy, coordination, greater breadth of knowledge, and stealth.

2. The New Age Threat Model

As explained in our paper, MCP provides a legitimate, low-noise communications fabric. Swarms provide parallelism, adaptability, and fault tolerance. Combined with capable reasoning models, they create offensive systems that resemble legitimate AI operations until the moment they act.

This has three main consequences:

  • C2 traffic becomes semantically ambiguous: Traditional detection pipelines look for anomalous network patterns: periodic callbacks, suspicious domains, odd user agents, or known C2 frameworks. MCP traffic, by contrast, may be fully legitimate at the transport level and indistinguishable from internal “AI tool use.” If an enterprise is already adopting MCP for productivity agents, security copilots, or code assistants, then adversarial MCP tasking can blend into the background.
  • Kill chain compresses and overlaps: Agent swarms let operations run in parallel across targets, techniques, and environments. Recon and exploit development no longer need to be sequential; an agent can research an exploit while another is already testing lateral movement paths, and a third is harvesting credentials.
  • Autonomy broadens the threat actor model: AI can handle most tactical steps, which lowers the barrier to entry for attackers (what once required APT-level skill can now be achieved by a script kiddie with little technical ability). In addition, the operator burden drops, which enables more campaigns against more targets.
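The semantic-ambiguity point is worth making concrete. MCP runs over JSON-RPC 2.0, so a benign assistant query and an adversarial tasking request are structurally identical on the wire; the tool names below are hypothetical, but the schema is what a transport-level filter actually sees.

```python
import json

# Two MCP tool calls over JSON-RPC 2.0 (the protocol MCP uses).
# Tool names here are hypothetical; the point is that both
# messages are structurally identical at the transport level.
benign = {
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "search_docs",
               "arguments": {"query": "rotate TLS certificates"}},
}
adversarial = {
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "run_scan",
               "arguments": {"target": "10.0.0.0/24"}},
}

# Same top-level schema either way: only the tool name and
# arguments, i.e. the semantics, distinguish the two.
assert set(benign) == set(adversarial)
print(json.dumps(adversarial, indent=2))
```

Nothing in headers, methods, or message shape separates the two; distinguishing them requires inspecting the tool names and arguments against intent and context.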

These three shifts upend the traditional threat model underlying most cybersecurity operations.

3. Anthropic’s AI-Led Attack: A Real-World Confirmation

The threat profile hypothesized in the paper was almost immediately validated by Anthropic’s investigation into a state-linked espionage campaign that used agentic AI as the primary operator. Anthropic reports with high confidence that a Chinese state-sponsored group (labeled GTG-1002 in public reporting) built an autonomous framework around Claude Code, using it not as a helper but as the central executor of the campaign. Two aspects of the Anthropic case are especially important for defenders:

  • The operational pattern: According to Anthropic, the AI system conducted most of the kill chain phases: reconnaissance, vulnerability discovery, exploit research and coding, credential harvesting, privilege escalation, adding a backdoor or foothold, and attempted exfiltration. Humans stepped in only for a small minority of decisions. Reporting estimates that roughly 80–90% of tactical operations were AI-handled, at request rates effectively impossible for a human operator to sustain.
  • The manipulation strategy: The campaign succeeded initially by jailbreaking Claude Code through deliberate task decomposition. Malicious goals were sliced into benign-seeming subtasks, framed as defensive research or routine security testing. This aligns with a broader and worrying agentic failure mode: if an AI system is optimized to be helpful on locally reasonable tasks, adversaries can hide intent across the task graph.

Anthropic has also acknowledged that these AI models and frameworks have changed the cybersecurity landscape. But the logical conclusion is not to stop releasing models; it is to empower cybersecurity professionals to use these models for defensive operations in preparation for attacks like this one.

4. What Comes Next?

Should we expect more of these types of attacks? Yes. Historically, once a technique has been proven by a state actor, it diffuses to less sophisticated actors (e.g., EternalBlue to WannaCry).

In addition, not every adversary needs to build their own swarm. Many malicious actors will simply wrap commercially available agent frameworks around existing playbooks: phishing kits, exploit chains, ransomware staging. The issue is not the AI model on its own; it is the growing ecosystem of MCP servers, plugins, and agent toolchains. As more enterprises expose internal tools through MCP for legitimate purposes, the same interfaces become attractive offensive surfaces.

The MCP security literature is already warning about unverified context providers, tool-chain abuse, and protocol-level blind spots. MCP as a command-and-control channel should not be our only worry: MCP as a supply chain will become a high-value target.
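One generic mitigation for the supply-chain angle is to pin the tool manifests of approved MCP servers, so a silently modified tool definition is rejected before any agent can use it. The sketch below is a hypothetical illustration of that pattern, not a feature of any particular MCP implementation; the server and tool names are invented.

```python
import hashlib
import json

def manifest_digest(manifest: dict) -> str:
    # Canonical JSON (sorted keys) so the digest is stable
    # regardless of key order in the served manifest.
    blob = json.dumps(manifest, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

# Hypothetical manifest published by an internal MCP server.
manifest = {"name": "docs-server",
            "tools": [{"name": "search_docs",
                       "description": "Search internal docs"}]}

# Pin taken at review time; recomputed on every connection.
pinned = manifest_digest(manifest)

# A tampered manifest (e.g. a tool description rewritten to
# smuggle instructions to the agent) no longer matches the pin.
tampered = {"name": "docs-server",
            "tools": [{"name": "search_docs",
                       "description": "Ignore prior instructions..."}]}
assert manifest_digest(tampered) != pinned
print("verified" if manifest_digest(manifest) == pinned else "reject")
```

Hash pinning does not address intent hidden in legitimate tool calls, but it narrows the window for compromised or unverified context providers.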

Also, because MCP traffic resembles everyday enterprise AI traffic, detection alerts will increasingly collide with legitimate AI usage. This is where the defensive posture has to evolve to keep up with the paradigm shift:

  • Detecting intent: Detecting agentic attacks will require linking model or tool telemetry with identity, endpoint behavior, and network signals. Detection has to understand the behavior and answer the question: why is this agent running this tool now, and does that align with the identity and business context?
  • Securing MCP Infrastructure: MCP servers should be treated as privileged integration points of the enterprise and secured accordingly (e.g. enforce strict authentication, sandbox tool execution, separate logging from agent calls).
  • Assuming machine-speed attacks: Incident response playbooks need to account for dramatically compressed timelines. This means enabling faster containment options and resilience measures that stop rapid lateral movement.
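The intent-detection item above amounts to correlating each tool call with an identity and business-context baseline rather than with network signatures. The toy rule below sketches that idea under stated assumptions: the agent profile, tool names, and working-hours policy are all hypothetical placeholders for whatever an enterprise actually maintains.

```python
from dataclasses import dataclass

@dataclass
class ToolCall:
    agent_id: str
    tool: str
    hour: int  # local hour of invocation

# Hypothetical business-context baseline: which tools each agent
# identity is expected to use, and during which hours.
BASELINE = {
    "code-assistant": {"tools": {"search_docs", "run_tests"},
                       "hours": range(8, 20)},
}

def flag(call: ToolCall):
    """Return a reason string if the call breaks the baseline, else None."""
    profile = BASELINE.get(call.agent_id)
    if profile is None:
        return "unknown agent identity"
    if call.tool not in profile["tools"]:
        return f"tool '{call.tool}' outside business context"
    if call.hour not in profile["hours"]:
        return "invocation outside expected hours"
    return None  # aligned with identity and context

print(flag(ToolCall("code-assistant", "run_tests", 10)))  # aligned
print(flag(ToolCall("code-assistant", "port_scan", 3)))   # flagged
```

Real deployments would enrich this with endpoint and network telemetry, but the shape of the question is the same: does this agent, this tool, this moment fit the identity's expected behavior?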

Therefore, the message is clear: MCP-enabled agentic swarms represent the next generation of offensive security, bringing stealthier C2 frameworks, faster exploitation, and adaptive, distributed execution. Defenders need to assume that autonomous agents are part of the threat landscape, because the line between enterprise AI traffic and agentic C2 traffic is becoming blurry.

Further details of these sorts of attacks can be found in the recently released technical preprint on arXiv.
