DEF CON to set thousands of hackers loose on LLMs

This year’s DEF CON AI Village has invited hackers to show up, dive in, and find bugs and biases in large language models (LLMs) built by OpenAI, Google, Anthropic, and others.

The collaborative event, which AI Village organizers describe as “the largest red teaming exercise ever for any group of AI models,” will host “thousands” of people, including “hundreds of students from overlooked institutions and communities,” all of whom will be tasked with finding flaws in the LLMs that power today’s chatbots and generative AI.

Think: traditional bugs in code, but also problems more specific to machine learning, such as bias, hallucinations, and jailbreaks, all of which ethics and security professionals now have to grapple with as these technologies scale.

DEF CON is set to run from August 10 to 13 this year in Las Vegas, USA.

“Traditionally, companies have solved this problem with specialized red teams. However this work has largely happened in private,” said Sven Cattell, the founder of AI Village, in a statement. “The diverse issues with these models will not be resolved until more people know how to red team and assess them.”

The data scientist wants to see bug bounties and live hacking events adapted more broadly to cover ML model-based systems. “These fill two needs with one deed, addressing the harms and growing the community of researchers that know how to help,” Cattell said.

For those participating in the red teaming this summer, the AI Village will provide laptops and timed access to LLMs from various vendors. Currently this includes models from Anthropic, Google, Hugging Face, Nvidia, OpenAI, and Stability. The village people’s announcement also mentions this is “with participation from Microsoft,” so perhaps hackers will get a go at Bing. We’ve asked for clarification about this.

Red teams will also have access to an evaluation platform developed by Scale AI.

There will be a capture-the-flag-style point system to promote the testing of “a wide range of harms,” according to the AI Village. Whoever gets the most points wins a high-end Nvidia GPU.

The event is also supported by the White House Office of Science and Technology Policy; America’s National Science Foundation’s Computer and Information Science and Engineering (CISE) Directorate; and the Congressional AI Caucus.

The announcement also comes as US Vice President Kamala Harris and other senior Biden administration officials meet the bosses of OpenAI, Anthropic, Microsoft, and Google to discuss the risks AI poses to individuals and national security.

And separately, Rumman Chowdhury, co-founder of the Bias Buccaneers, a group of experts who champion algorithmic transparency, discussed the need for AI red teams at last month’s RSA Conference.

The AI Village hosted its first machine-learning public bias bounty at DEF CON two years ago. ®
