Anthropic’s recent disclosures should end any lingering illusion that AI risk is primarily about hallucinations, bias or prompt abuse. The threat landscape has moved on. What we’re now seeing is industrial-scale capability theft, with frontier AI models being targeted the way critical infrastructure is targeted - systematically, persistently and at scale.
Three labs - DeepSeek, Moonshot and MiniMax - have been running industrial-scale distillation campaigns against Claude: over 16 million exchanges across 24,000 fraudulent accounts, using coordinated proxy networks to cycle through accounts faster than they can be banned.
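Cycling accounts faster than they can be banned leaves an infrastructure-level signature that defenders can watch for. Below is a minimal sketch of one such signal - flagging network origins that spawn abnormal bursts of new accounts - assuming a hypothetical signup log keyed by autonomous system number (ASN); the schema, window and threshold are illustrative, not any vendor’s real telemetry.

```python
# Illustrative sketch: flag ASNs that create an abnormal burst of new
# accounts in a short window. Schema and thresholds are assumptions.
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Account:
    account_id: str
    asn: int                 # autonomous system the signup came from
    created_at: datetime

def flag_cycling_bursts(accounts, window=timedelta(hours=1), threshold=20):
    """Return ASNs that created `threshold` or more accounts inside `window`."""
    by_asn = defaultdict(list)
    for acc in accounts:
        by_asn[acc.asn].append(acc.created_at)

    flagged = set()
    for asn, times in by_asn.items():
        times.sort()
        left = 0
        for right, t in enumerate(times):
            # slide the window's left edge forward until it spans <= `window`
            while t - times[left] > window:
                left += 1
            if right - left + 1 >= threshold:
                flagged.add(asn)
                break
    return flagged
```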
This isn’t a model safety problem. It’s a model security problem. Above all, it’s an intelligence operation, not research. And it signals a decisive shift in AI risk.
This isn’t research, it’s an intelligence operation
Traditional distillation reduces model size for efficiency. What Anthropic describes is different in intent and scale. Foreign labs - including entities subject to Chinese-state influence - systematically extracted frontier capabilities: agentic reasoning, tool use, advanced coding. The extracted intelligence was then redeployed into new models stripped of guardrails or safety constraints. In one campaign, Claude was used to generate censorship-safe responses to politically sensitive queries, such as questions about dissidents and party leaders. Another campaign pivoted within 24 hours of a new model release to harvest the newest capabilities before defenses could adapt.
The operational pattern mirrors advanced persistent threat (APT) campaigns. Mapped onto the classic kill chain, the AI supply chain version looks like this:
- Reconnaissance: Monitor frontier releases and pivot to new models within hours.
- Initial access: Register fraudulent accounts at scale.
- Defense evasion: Rotate coordinated proxy networks to outpace account bans.
- Collection: Run millions of structured exchanges against the target model.
- Exfiltration: Distill the responses into training data for a copycat model.
- Actions on objectives: Redeploy the extracted capabilities, stripped of guardrails.
Replace “data theft” with “intelligence theft” and the playbook is instantly recognizable.
Capability extraction is the new exfiltration
Historically, cybersecurity has focused on protecting data. In the AI era, the crown jewels are no longer databases - they’re capabilities. Agentic reasoning enables autonomous decision-making. Tool use enables interaction with real systems. Advanced coding enables rapid software generation, including offensive tools. Together, they form a platform for automation at scale.
Steal those capabilities once, and you don’t need to breach the original system again. You can rebuild the intelligence elsewhere, without oversight, restrictions or accountability.
If you’re an organization consuming models via API, running agents or building on third-party AI infrastructure, this is your threat landscape now. The model you’re consuming could be built from stolen capabilities. The guardrails you’re trusting might not exist in a distilled version powering a tool in your supply chain.
The economic implications are every bit as stark: billions in R&D investment can be neutralized with relatively low-cost extraction campaigns. First mover advantage becomes fragile, temporary and easy to undermine.
The invisible risk: a compromised AI supply chain
Most organizations are unlikely to be direct targets of these operations - but they can become downstream victims. As mentioned above, if you’re consuming models via APIs, deploying agents or relying on third-party AI tools, you may unknowingly depend on intelligence derived from stolen capabilities.
This creates multiple systemic risks:
- Guardrails may be absent: Distilled models often lack safety training.
- Provenance may be opaque: Intellectual property may be laundered through intermediaries.
- Control may be external: Components may be influenced by foreign actors.
- Behavior may diverge under pressure: Safety assumptions may fail in production scenarios.
This is the AI equivalent of SolarWinds or Log4j - not malicious code inserted into a trusted dependency, but compromised intelligence embedded inside a trusted capability.
Traditional controls can’t detect this; configuration hygiene does nothing if the underlying model is already compromised.
AISPM needs a deeper layer
AI Security Posture Management (AISPM) has largely focused on access controls, prompt injection defenses and pipeline hardening. All necessary - but insufficient. Defending against capability theft calls for a deeper architectural layer centred on the model itself:
- Model provenance and integrity: Organizations must verify where models come from, how they were trained, and whether their lineage is legitimate. Blind consumption is no longer acceptable (a minimal integrity check is sketched after this list).
- AI supply chain kill-chain mapping: Defenders need visibility into the entire attack lifecycle - from fraudulent account spikes to proxy usage and extraction patterns, not just isolated anomalies.
- Behavioral fingerprinting: Distillation campaigns generate distinctive interaction patterns. Anthropic reportedly deployed classifiers to detect them. Organizations need comparable visibility across every AI service they expose or consume (see the second sketch after this list).
- Threat intelligence for AI assets: AI inventories must be enriched with adversarial context, just as vulnerability management programs correlate CVEs with active exploitation and ransomware activity.
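On provenance: full assurance ultimately requires signed attestations (Sigstore-style signatures, model cards, training lineage), but even a pinned-digest check catches silent substitution of model artifacts. Here is a minimal sketch, assuming a hypothetical JSON manifest of SHA-256 digests recorded when the model was first vetted:

```python
# Minimal integrity check against a pinned manifest of expected digests.
# The manifest format is a hypothetical example, not a standard.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 without loading it all into memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_model_dir(model_dir: Path, manifest_path: Path) -> list[str]:
    """Return files whose digest differs from the manifest pinned at vetting time."""
    manifest = json.loads(manifest_path.read_text())  # {"weights.bin": "<hex digest>", ...}
    return [
        name for name, expected in manifest.items()
        if sha256_of(model_dir / name) != expected
    ]
```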
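On behavioral fingerprinting: Anthropic has not published its classifiers, so the toy heuristic below is emphatically not their method. It only illustrates the kinds of features that tend to separate bulk extraction from organic use - sustained volume, unusual topical breadth and machine-regular request timing. All thresholds and the input schema are assumptions for illustration.

```python
# Toy heuristic: score how much an account's traffic resembles bulk
# extraction. Thresholds are illustrative, not tuned values.
import statistics

def distillation_score(timestamps, topics, requests_per_day):
    """Score 0-3; higher means the traffic looks more like bulk extraction."""
    score = 0
    if requests_per_day > 5_000:                    # sustained high volume
        score += 1
    if len(set(topics)) > 200:                      # unusually broad topical coverage
        score += 1
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if gaps and statistics.pstdev(gaps) < 0.05 * statistics.mean(gaps):
        score += 1                                  # metronomic, machine-like timing
    return score
```

A score near the maximum would feed an investigation queue rather than trigger automatic bans, since legitimate batch workloads can produce similar signatures.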
Without these capabilities, AISPM risks becoming a compliance checkbox rather than a defensive system.
Model safety vs. model security - two different questions
For years, the AI risk conversation has centred on a single concern: what models might do once deployed. Regulators, researchers and vendors have worked on alignment, guardrails and red-teaming. The implicit assumption was that the model itself - its underlying intelligence - remained secure, controlled and proprietary.
Those assumptions are now under pressure.
Model safety: Will the model produce harmful output?
Safety focuses on behavior at inference time - bias, toxicity, hallucinations, policy violations. It’s about preventing misuse or unintended consequences from an otherwise trusted system.
Model security: Has the model’s intelligence been stolen, cloned or rebuilt from exfiltrated capabilities?
Security addresses a fundamentally different risk: the compromise of the asset itself. If frontier capabilities can be extracted, reproduced and redeployed without safeguards, the problem is no longer what the model says - it’s who else now possesses its reasoning power.
Why does this matter? A model that says the wrong thing creates reputational and compliance exposure. A model whose intelligence has been exfiltrated represents loss of advantage, loss of control and, potentially, loss of security.
From an AISPM perspective, organizations deploying AI at scale must now answer both questions independently:
- Is this model aligned? (safety)
- Is this model authentic and uncompromised? (security)
Most governance frameworks address the first. The emerging threat landscape is increasingly defined by the second.
Protect the intelligence, not just the interface
The industry spent years focused on how AI systems behave. Anthropic’s findings make clear the deeper risk is whether the intelligence behind those systems remains controlled at all.
Organizations that treat AI as just another software feature will remain exposed. Those that treat it as critical infrastructure - requiring provenance, integrity and adversarial defense - will be far better prepared for a landscape where capability theft, not misbehavior, defines the real risk.