AI security risk assessment using Counterfit

Today, we are releasing Counterfit, an automation tool for security testing AI systems as an open-source project. Counterfit helps organizations conduct AI security risk assessments to ensure that the algorithms used in their businesses are robust, reliable, and trustworthy.

AI systems are increasingly used in critical areas such as healthcare, finance, and defense. Consumers must have confidence that the AI systems powering these important domains are secure from adversarial manipulation. For instance, one of the recommendations from Gartner’s Top 5 Priorities for Managing AI Risk Within Gartner’s MOST Framework published in Jan 20211 is that organizations “Adopt specific AI security measures against adversarial attacks to ensure resistance and resilience,” noting that “By 2024, organizations that implement dedicated AI risk management controls will successfully avoid negative AI outcomes twice as often as those that do not.”

However, performing security assessments of production AI systems is nontrivial. Microsoft surveyed 28 organizations, spanning Fortune 500 companies, governments, non-profits, and small and medium sized businesses (SMBs), to understand the current processes in place to secure AI systems. We found that 25 out of 28 businesses indicated they don’t have the right tools in place to secure their AI systems and that security professionals are looking for specific guidance in this space.

This tool was born out of our own need to assess Microsoft’s AI systems for vulnerabilities with the goal of proactively securing AI services, in accordance with Microsoft’s responsible AI principles and Responsible AI Strategy in Engineering (RAISE) initiative. Counterfit started as a corpus of attack scripts written specifically to target individual AI models, and then morphed into a generic automation tool to attack multiple AI systems at scale.

Today, we routinely use Counterfit as part of our AI red team operations. We have found it helpful to automate techniques in MITRE’s Adversarial ML Threat Matrix and replay them against Microsoft’s own production AI services to proactively scan for AI-specific vulnerabilities. Counterfit is also being piloted in the AI development phase to catch vulnerabilities in AI systems before they hit production.

To ensure that Counterfit addresses a broader set of security professionals’ needs, we engaged with a diverse profile of partners spanning large organizations, SMBs, and governmental organizations to test the tool against their ML models in their environments.

“AI is increasingly used in industry; it is vital to look ahead to securing this technology particularly to understand where feature space attacks can be realized in the problem space. The release of open-source tools from an organization such as Microsoft for security practitioners to evaluate the security of AI systems is both welcome and a clear indication that the industry is taking this problem seriously.”

Matilda Rhode, Senior Cybersecurity Researcher, Airbus

Three key ways Counterfit is flexible

As a result of internal and external engagements, Counterfit is flexible in three key ways:

  1. Counterfit is environment agnostic—it can help assess AI models hosted in any cloud environment, on-premises, or on the edge.
  2. Counterfit is model agnostic—the tool abstracts the internal workings of their AI models so that security professionals can focus on security assessment.
  3. Counterfit strives to be data agnostic—it works on AI models using text, images, or generic input.

Under the hood, Counterfit is a command-line tool that provides a generic automation layer for adversarial AI frameworks such as Adversarial Robustness Toolbox and TextAttack. Our tool makes published attack algorithms accessible to the security community and helps to provide an extensible interface from which to build, manage, and launch attacks on AI models.

Designed for security professionals

Counterfit uses workflows and terminology similar to popular offensive tools that security professionals are already familiar with, such as Metasploit or PowerShell Empyre. Security professionals can benefit from the tool in the following ways:

  • Penetration testing and red teaming AI systems: The tool comes preloaded with published attack algorithms that can be used to bootstrap red team operations to evade and steal AI models. Since attacking AI systems also involves elements of traditional exploitation, security professionals can use the target interface and built-in cmd2 scripting engine to hook into Counterfit from existing offensive tools. Additionally, the target interface can allow for granular control over network traffic. We recommend using Counterfit alongside Adversarial ML Threat Matrix, which is an ATT&CK style framework released by MITRE and Microsoft for security analysts to orient to threats against AI systems.

Demo of Microsoft Counterfit used to scan an AI model for model evasion vulnerability.

  • Vulnerability scanning for AI systems: The tool can help scan AI models using published attack algorithms. Security professionals can use the defaults, set random parameters, or customize them for broad vulnerability coverage of an AI model. Organizations with multiple models in their AI system can use Counterfit’s built-in automation to scan at scale. Optionally, Counterfit enables organizations to scan AI systems with relevant attacks any number of times to create baselines. Running this system regularly, as vulnerabilities are addressed, also helps to measure ongoing progress toward securing AI systems.
  • Logging for AI systems: Counterfit also provides logging to record the attacks against a target model. Telemetry may help data science and engineering teams improve their understanding of failure modes in their AI systems.

This tool is part of broader efforts at Microsoft to empower engineers to securely develop and deploy AI systems. We recommend using it alongside the following resources:

  • For security analysts to orient to threats against AI systems, Microsoft, in collaboration with MITRE, released an ATT&CK style Adversarial ML Threat Matrix complete with case studies of attacks on production ML systems.
  • For security incident responders, we released our own bug bar to systematically triage attacks on ML systems.
  • For industry practitioners and security professionals to develop muscle in defending and attacking ML systems, we hosted a realistic Machine Learning Evasion Competition.
  • For developers, we released threat modeling guidance specifically for ML systems.
  • For engineers and policymakers, Microsoft, in collaboration with Berkman Klein Center at Harvard University, released a taxonomy documenting various ML failure modes.

Learn more

To learn more about this effort:

To learn more about Microsoft Security solutions visit our website. Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.


1Gartner, Top 5 Priorities for Managing AI Risk Within Gartner’s MOST Framework, Avivah Litan, et al., 15 January 2021.

READ MORE HERE