Microsoft researchers bullish on AI security agent even though it let 74% of malware slip through

August 6, 2025 TH Author

Microsoft has rolled out an autonomous AI agent that it claims can detect malware without human assistance.

The prototype, called Project Ire, reverse engineers software “without any clues about its origin or purpose,” and then determines whether the code is malicious or benign, using large language models (LLMs) and a bunch of callable reverse engineering and binary analysis tools.

“It was the first reverse engineer at Microsoft, human or machine, to author a conviction case — a detection strong enough to justify automatic blocking — for a specific advanced persistent threat (APT) malware sample, which has since been identified and blocked by Microsoft Defender,” Redmond claimed in a Tuesday blog post.

If it performs as promised, and at scale, Project Ire will help relieve security analysts of the tedious work of manually analyzing every sample and classifying it as either good or bad. This can take hours, leading to alert fatigue and burnout, and it also means that there are fewer human eyes and brains focused on the really sophisticated and fast-moving threats that require immediate detection and blocking.

But that’s still a big if at this point.

In a real-world test of about 4,000 “hard-target” files, meaning that they weren’t classified by automated systems and would otherwise be manually reviewed by human reverse engineers, nearly 9 out of 10 files (89 percent) that Project Ire flagged as malicious were actually malicious.

However, the AI agent detected only about a quarter (26 percent) of all the malware in this test, meaning the remaining 74 percent slipped through undetected.
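Those two figures measure different things: the 89 percent is precision (how often a "malicious" verdict was right) and the 26 percent is recall (how much of the malware was caught at all). A minimal sketch of the arithmetic follows. The raw counts are hypothetical, picked only so the ratios match the reported percentages, since the article does not give the underlying numbers.

```python
# Hypothetical counts, NOT from Microsoft's test: chosen only so the
# ratios reproduce the reported 89% precision and 26% recall.
true_positives = 260   # malware correctly flagged as malicious
false_negatives = 740  # malware the agent failed to flag
false_positives = 32   # benign files wrongly flagged as malicious


def precision(tp: int, fp: int) -> float:
    """Share of files flagged as malicious that really were malicious."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of all malicious files that were flagged."""
    return tp / (tp + fn)


p = precision(true_positives, false_positives)
r = recall(true_positives, false_negatives)
miss_rate = 1 - r  # share of malware that slipped through

print(f"precision: {p:.0%}")
print(f"recall:    {r:.0%}")
print(f"missed:    {miss_rate:.0%}")
```

With these illustrative counts, a flagged file is almost always malware, yet roughly three in four malicious files are never flagged, which is how both numbers can be true at once.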

“While overall performance was moderate, this combination of accuracy and a low error rate suggests real potential for future deployment,” the Microsoft security engineers wrote.

The prototype will be integrated as a binary analyzer for threat detection and software classification into Microsoft’s Defender suite of security tools, which encompasses antivirus, endpoint, email, and cloud security.

“Our goal is to scale the system’s speed and accuracy so that it can correctly classify files from any source, even on first encounter,” according to Microsoft. “Ultimately, our vision is to detect novel malware directly in memory, at scale.”

Buzzy, but not new

AI-based malware analysis is not new, with antivirus vendors like Cylance using machine learning to analyze files for nearly a decade.

However, “what we learned then and that can be applied now is that the best results for malware detection involve a combination of deterministic (like patterns and signatures), machine learning and probabilistic techniques (AI/GenAI) approaches,” Gartner VP Neil MacDonald told The Register via email in response to questions about Project Ire.

“That’s why in this case, Microsoft highlighted its use in the SOC as far as an incident detection and response process rather than inline as a preventative control,” he said.

MacDonald did note that the “relatively high percentage of false positives and false negatives documented in the paper show the limitations of this approach.”

Still, that’s not to say that security companies shouldn’t invest in AI, he added.

“It is clear that, moving forward in a world where hackers will leverage AI for quickly creating new and novel attacks, this type of AI/GenAI-based approach will be critical to keeping up with the volume and variations of new threats,” MacDonald said. “AI, in the hands of the defenders, will be necessary to offset the threat of AI in the hands of the attackers.”

All aboard the AI train

Indeed, Microsoft’s announcement comes as all of the big security companies double down on AI, especially AI agents — both integrating them into their enterprise tools and also helping companies protect their data and people against the myriad threats that AI systems and agents introduce.

While Redmond is arguably furthest along in this process of stuffing AI and task-specific agents into all of its security products, Google is also developing its own army of AI agents including one that analyzes malware and determines the extent of the threat it poses.

The Chocolate Factory announced this malware analysis agent at its annual Cloud Next event, and said at the time that it would be available in preview for select Google customers this year.

Late last month, Palo Alto Networks inked a $25-billion deal to buy Israeli biz CyberArk and bring the smaller firm’s identity security tech, which verifies not only human identities but also those of machines and AIs, into its larger security platform.

Machine identities outnumber those of humans by 40 to one, according to CyberArk, and this number is expected to skyrocket as more companies use AI agents. ®