Article co-authored by Zack Abzug, Fabien Guillot, and Alex Groyz.
---
On 3 June 2026, Anthropic published the LLM ATT&CK Navigator, a year of real attacker activity from 832 accounts it banned for malicious use, mapped to MITRE ATT&CK. It is the clearest public account so far of what attackers actually ask an AI model to do. We read it closely, compared notes, and two things stood out that matter for detection.
- First, attacker AI use is still concentrated where you cannot see it: on the attacker's own machines, building the malware and tooling that gets them through your defenses.
- Second, when AI does support post-compromise activity, the behaviors are the same ones ordinary attackers already perform, and the same ones network and identity detection is built to catch.
The rest of this post walks through three takeaways from the data, and where detection holds up once an attack reaches your environment.
Takeaway 1: AI is mostly used to get in, and it is working
The single most common use of AI in the dataset was developing capabilities, mainly writing malware: 69% of the actors studied. Close behind were obfuscating code (64.7%), pulling data from the attacker's own systems (55.9%), and impairing defenses (54.9%). Defense evasion was the largest tactic overall, present for 84.4% of actors.
Put those together and a picture forms. Most attacker AI use today is aimed at one thing: building malware that gets past endpoint defenses and into an environment. It is preparation, and it happens on infrastructure you do not own. It does not cross your network, touch your identity provider, or land in your logs. You cannot detect a model writing a payload on a machine you do not control, and you do not need to.
But there is a consequence you do need to plan for. If AI makes attackers better and faster at building what beats EDR, more of them get in. The realistic posture is to assume compromise. The question stops being whether something will get through and becomes what you can see once it does. That is the case for a detection layer that works inside the environment, after initial access, on behavior rather than on signatures.
Takeaway 2: AI is moving further down the kill chain
The early data is a snapshot, not a destination. Comparing the first half of the year with the second, the report shows attackers reaching for AI later in the operation, in the hands-on work that happens after they are in. Account discovery and automated exfiltration both rose in the second half.
This is the part that matters for a SOC, because the rare techniques are the dangerous ones. Using AI for lateral movement was the single strongest marker of a high-risk actor: the 54 actors who did it carried an average risk score of 56.4, nearly ten points above the mean of 46.8. At the technique level, the highest-risk actors leaned on remote services like SSH and SMB, valid accounts, credential dumping, and staging data to exfiltrate. Each of those techniques was three to five times more common among the highest-risk actors than among everyone else.
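To make the arithmetic behind those comparisons explicit, here is a minimal sketch. The two risk scores come from the figures quoted above. The `prevalence_ratio` helper and the counts passed to it are our own hypothetical illustration of how a "times more common" figure can be computed, not numbers or methodology taken from the report.

```python
# Figures quoted in the paragraph above.
lateral_movement_mean = 56.4  # average risk score of the 54 lateral-movement actors
overall_mean = 46.8           # mean risk score across the dataset

# Gap between the two means, rounded to one decimal place.
gap = round(lateral_movement_mean - overall_mean, 1)


def prevalence_ratio(hits_high, total_high, hits_rest, total_rest):
    """Return how many times more common a technique is in the high-risk group."""
    return (hits_high / total_high) / (hits_rest / total_rest)


# HYPOTHETICAL counts, for illustration only (not from the report):
# 10 of 40 high-risk actors versus 50 of 800 remaining actors.
example_ratio = prevalence_ratio(10, 40, 50, 800)

print(f"risk gap: {gap} points")
print(f"example prevalence ratio: {example_ratio:.1f}x")
```

With these placeholder counts the technique appears in 25% of the high-risk group and 6.25% of the rest, which is the kind of gap the "three to five times" range describes.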
One case makes it concrete. Anthropic describes GTG-1002, the operator behind an AI-run espionage campaign it disrupted in November 2025, which hit government and critical-infrastructure targets. Its technique list was unremarkable. What set it apart was orchestration: the operator ran Claude Code on a Kali Linux machine, wired penetration-testing tools in as MCP servers, and let the model scan, exploit a flaw to reach the internal network, harvest credentials, and move laterally, while a human only set direction. The reconnaissance and the path to a foothold were AI-driven from the start.
Active Directory reconnaissance, account discovery, lateral movement, exfiltration: this is the territory network detection specializes in. As more actors push AI into these later stages, and the trend is pointing that way, network and identity detection becomes more relevant, not less.
The common thread: identity
One theme connects both takeaways, and it is identity. Cloud and AI workflows run through users, service principals, and managed identities. The Anthropic data shows valid accounts as one of the techniques most associated with high-risk actors, and an agent acting inside your environment still has to authenticate, reach services, and move like an account.
That matters because it does not depend on any one product surface. Whether an attacker is human or an agent, whether the workload is a cloud console or an AI service, the giveaway is the same: an identity used from the wrong place, reaching something it has no history with, behaving unlike itself. Detecting suspicious identity and behavior is where this whole story converges.
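The idea of an identity "behaving unlike itself" can be sketched as a comparison against that identity's own history. The example below is a deliberately simplified illustration under our own assumptions: the `AuthEvent` and `IdentityBaseline` names, the fields, and the sample values are all hypothetical, and it is not a description of how Vectra or any other product implements detection.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthEvent:
    identity: str  # user, service principal, or managed identity
    source: str    # where the sign-in came from (hypothetical label)
    service: str   # what the identity reached


class IdentityBaseline:
    """Remembers which sources and services each identity has used before."""

    def __init__(self):
        self._sources = defaultdict(set)
        self._services = defaultdict(set)

    def learn(self, event):
        self._sources[event.identity].add(event.source)
        self._services[event.identity].add(event.service)

    def deviations(self, event):
        """List the ways an event departs from the identity's history."""
        if event.identity not in self._sources:
            return ["identity has no history"]
        reasons = []
        if event.source not in self._sources[event.identity]:
            reasons.append("new source")
        if event.service not in self._services[event.identity]:
            reasons.append("new service")
        return reasons


# HYPOTHETICAL history for one service account.
baseline = IdentityBaseline()
baseline.learn(AuthEvent("svc-backup", "10.0.4.12", "file-server"))
baseline.learn(AuthEvent("svc-backup", "10.0.4.12", "backup-vault"))

# Same account, unfamiliar source, unfamiliar target.
suspicious = AuthEvent("svc-backup", "10.0.9.77", "domain-controller")
print(baseline.deviations(suspicious))
```

Nothing in the check asks who or what is driving the account, which is the point: the comparison is against the identity's past behavior, not against a signature of the tool.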
Where the coverage lands
Because Anthropic mapped its findings to MITRE technique IDs, and Vectra tags every detection with the same IDs, the two line up. On the post-compromise techniques that mark the dangerous actors, coverage is strong. The AI-heavy early stages are not yours to see, but the moment an AI-enabled operation becomes active inside your network, it produces behavior Vectra is built to detect.
One technique needs a footnote. Web shell deployment (T1505.003) does not have a dedicated Vectra detection: the install is often endpoint-side, though the command-and-control channel it creates surfaces in Hidden HTTPS Tunnel or External Remote Access. Separately, the domain-replication attacks DCSync and DCShadow surface in Suspicious Active Directory Operations, which Vectra maps to T1207 and the Credential Access tactic.
GTG-1002 fits the pattern. SSH remote services, exploitation of remote services, credential harvesting, and archive-and-stage are behaviors these detections are built to surface, whether the operator is a person or a model acting through an MCP server.
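Because both sides of the comparison are keyed by ATT&CK technique ID, checking coverage amounts to a set lookup. The sketch below shows the shape of that check. The technique IDs are standard MITRE ATT&CK identifiers for techniques named in this post, but the detection names and their tags are placeholders we made up for illustration. They are not Vectra's actual detection catalog or mapping.

```python
# Techniques named in this post, keyed by MITRE ATT&CK ID.
high_risk_techniques = {
    "T1021.004": "Remote Services: SSH",
    "T1021.002": "Remote Services: SMB/Windows Admin Shares",
    "T1078": "Valid Accounts",
    "T1003": "OS Credential Dumping",
    "T1074": "Data Staged",
    "T1505.003": "Server Software Component: Web Shell",
}

# HYPOTHETICAL detection-to-technique tags. The detection names are
# placeholders, not real product detections.
detection_tags = {
    "detection_a": {"T1021.004", "T1021.002"},
    "detection_b": {"T1078"},
    "detection_c": {"T1003", "T1074"},
}


def coverage(techniques, tags):
    """Split technique IDs into those tagged by some detection and those not."""
    covered_ids = set().union(*tags.values())
    covered = sorted(t for t in techniques if t in covered_ids)
    gaps = sorted(t for t in techniques if t not in covered_ids)
    return covered, gaps


covered, gaps = coverage(high_risk_techniques, detection_tags)
print("covered:", covered)
print("gaps:", gaps)
```

In this placeholder data the only gap is T1505.003, mirroring the web shell footnote above: a technique can lack a dedicated tag while the traffic it later generates is still picked up by other detections.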
Takeaway 3: the model behind the attack is about to get harder to see
There is a reason to expect the early-stage blind spot to widen. Providers like Anthropic are investing heavily in safeguards, monitoring, and detection to stop their models from being used for offensive work, and they are getting better at it. The likely response is that some actors move to open-source models they can run themselves, with the guardrails removed and no provider watching. Reporting like this Navigator exists because Anthropic can see its own systems. Self-hosted models offer no such window, and visibility into that shift is poor today.
This is the strongest argument for anchoring detection to behavior rather than the tool. A report tied to one provider's data cannot be your detection strategy, because the next operator may not use that provider at all. What does not change is what the attacker has to do inside your environment. An account that signs in from the wrong place and reaches a service it has never touched looks the same whether a human, a frontier model, or a self-hosted one is driving it. Anthropic is candid that ATT&CK does not yet capture what made GTG-1002 exceptional, the autonomous orchestration that chained techniques together at machine speed, and says it is in conversation with MITRE about adding categories for agentic behavior. Until that vocabulary exists, the detection that holds up is the one that never depended on naming the tool.
What doesn’t change
Attackers are using AI to get in faster and, increasingly, to do the hands-on work once inside. That makes two things true at once. You will not see the part that happens on the attacker's machines, and you do not need to. You can see the part that happens in yours, if you are watching behavior across network and identity rather than chasing whichever tool produced it. Assume compromise, watch behavior, and AI on the attacker's side changes the speed of the problem, not the shape of the answer.
