The Register

Chinese spies told Claude to break into about 30 critical orgs. Some attacks succeeded

Chinese cyber spies used Anthropic’s Claude Code AI tool to attempt digital break-ins at about 30 high-profile companies and government organizations – and the government-backed snoops “succeeded in a small number of cases,” according to a Thursday report from the AI company.

The mid-September operation targeted large tech companies, financial institutions, chemical manufacturers, and government agencies.

The threat actor was able to induce Claude to execute individual components of attack chains

While a human selected the targets, “this marks the first documented case of agentic AI successfully obtaining access to confirmed high-value targets for intelligence collection, including major technology corporations and government agencies,” Anthropic’s threat hunters wrote in a 13-page document [PDF].

It’s also further proof that attackers continue experimenting with AI to run their offensive operations, and it suggests heavily funded state-sponsored groups are getting better at making their attacks autonomous.

The AI vendor tracks the Chinese state-sponsored group behind the espionage campaign as GTG-1002, and says its operatives used Claude Code and Model Context Protocol (MCP) to run the attacks without a human in the tactical execution loop.

A human-developed framework used Claude to orchestrate the multi-stage attacks, which were then carried out by several Claude sub-agents, each performing a specific task. Those chores included mapping attack surfaces, scanning organizations’ infrastructure, finding vulnerabilities, and researching exploitation techniques.

Once the sub-agents developed exploit chains and custom payloads, a human operator spent between two and 10 minutes reviewing the results of the AI’s actions and signing off on the subsequent exploitation.

The sub-agents then got to work finding and validating credentials, escalating privileges, moving laterally across the network, and accessing and then stealing sensitive data. Post-exploitation, the human operator only had to review the AI’s work once more before approving the final data exfiltration.

“By presenting these tasks to Claude as routine technical requests through carefully crafted prompts and established personas, the threat actor was able to induce Claude to execute individual components of attack chains without access to the broader malicious context,” according to the report.

Upon discovering the attacks, Anthropic says it launched an investigation, banned the associated accounts, mapped the full extent of the operation, notified affected entities, and coordinated with law enforcement.

These attacks represent a “significant escalation” from the activity in the firm’s August report, which documented how criminals used Claude in a data extortion operation that hit 17 organizations, with attackers demanding ransoms ranging from $75,000 to $500,000 for stolen data. However, “humans remained very much in the loop directing operations” in that attack, we’re told.

“While we predicted these capabilities would continue to evolve, what has stood out to us is how quickly they have done so at scale,” states Anthropic’s new analysis.

There is a slight silver lining, however: Claude hallucinated during the attacks and claimed better results than the evidence supported.

The AI “frequently overstated findings and occasionally fabricated data during autonomous operations,” requiring the human operator to validate all findings. These hallucinations included Claude claiming it had obtained credentials (which didn’t work) or identifying critical discoveries that turned out to be publicly available information.

Anthropic asserts such errors represent “an obstacle to fully autonomous cyberattacks” – at least for now. ®
