DEF CON's AI Village Pits Hackers Against LLMs to Find Flaws

Published: 23/11/2024   Category: security



Touted as the largest red teaming exercise against LLMs in history, the AI Village attracted more than 2,000 hackers and throngs of media.



DEF CON 2023 — Las Vegas —
DEF CON's most buzzed-about event, the AI Village, let thousands of hackers take their best shot at making one of eight different large language models (LLMs), including models from Google and OpenAI, say something dangerous.
According to spokespeople for the Hack the Future AI Village, the event was a huge hit, but for now that's all that's being made public; results won't be available for at least a week, maybe more.
The final AI hacking challenge leaderboard showed that first and third place went to the handles cody3 and cody2, respectively. The DEF CON AI Village itself was tight-lipped about the winner, and even about the prizes, but reports identified the person behind both top-three AI Village contest entries as Stanford computer science master's student Truc Cody Ho, adding that he entered the competition a total of five times.
More details about the hacking competition results are forthcoming, according to Avijit Ghosh, one of the authors compiling them.
"We will be going through the anonymized data and finding patterns of vulnerabilities that participants discovered during the challenge and produce a report that will hopefully help ML and security researchers gain better insights into LLMs and policymakers make more informed regulations about AI," Ghosh says.
While he won't answer questions directly about any of the winning LLM hacks, Ghosh says he was able to use the LLMs to generate discriminatory code, credit card numbers, misinformation, and more.
Another of the event's organizers, Jutta Williams, has a day job as Reddit's senior director and global head of privacy and assurance and, on the side, is the founder of Humane-Intelligence, a nonprofit that provides safety, ethical, and other guidance for companies providing consumers with AI products.
Williams touted the event as the largest LLM red-teaming exercise to date.
All told, Williams said the AI Village attracted 2,240 hackers over the course of DEF CON 31 and explained the goal was to make one of its LLMs do something unsavory. That could mean generating misinformation, or using just the right question to prompt the chatbot to do something illegal — like steal data, generate malware, or stalk people.
The AI Village provided a 200-laptop wired network and gave each hacker 50 minutes to test their skills against 21 different AI challenges.
"There were several problem statements in the challenge," Ghosh says. "One of them was to get a model to produce discriminatory behavior towards one demographic versus the other. In my tests, the model refused to generate code to discriminate against different races (US definition of race), but was happy to generate code to rank people from different castes differently (Indian definition of the caste system)."
By Saturday afternoon, Williams said the DEF CON crowd had already discovered dozens of vulnerabilities in the LLMs, but again, the specifics remain under wraps for now.
"It's been wildly successful," Williams beamed. "We've had everyone from grandmas to seasoned red teamers through here this weekend."
The event got a big boost from the White House, thanks to a photo-opportunity visit from Arati Prabhakar, a senior-level science and technology adviser to the Biden administration.
Bugcrowd helped design the AI Village challenges, and the company's founder and CTO, Casey Ellis, served as a judge. He said there was a steady, long line of entrants throughout DEF CON ready to try their best to break AI.
"Overall, I think everyone involved learned a ton, from those submitting findings to the vendors, contest organizers, and judges," Ellis explains. "Given the speed at which this has become highly visible and incredibly important, the contest will form a critical input into how this class of security is carried out going forward."
