Google: Big Sleep AI Agent Puts SQLite Software Bug to Bed

Published: 23/11/2024   Category: security



A research tool by the company found a vulnerability in the SQLite open source database, demonstrating the defensive potential of using LLMs to find vulnerabilities in applications before they're publicly released.



Google has discovered its first real-world vulnerability using an artificial intelligence (AI) agent that company researchers are designing expressly for this purpose. The discovery of a memory-safety flaw in a production version of a popular open source database by the company's Big Sleep large language model (LLM) project is the first of its kind, and it has tremendous defensive potential for organizations, the Big Sleep team wrote in a recent Project Zero blog post.
Big Sleep, a collaboration between the company's Project Zero and DeepMind groups, discovered an exploitable stack buffer underflow in SQLite, a widely used open source database engine.
Specifically, Big Sleep discovered a pattern in the code of a publicly released version of SQLite that creates "a potential edge case that needs to be handled by all code that uses the field," the researchers noted. A function in the code failed to correctly handle that edge case, resulting in a write into a stack buffer with a negative index when handling a query with a constraint on the rowid column, thus creating an exploitable flaw, according to the post.
Google reported the bug to SQLite developers in early October. They fixed it on the same day and before it appeared in an official release of the database, so users were not affected.
"We believe this is the first public example of an AI agent finding a previously unknown exploitable memory-safety issue in widely used real-world software," the Big Sleep team wrote in the post. While this may be true, it's not the first time an LLM-based reasoning system autonomously found a flaw in the SQLite database engine, Google acknowledged.
An LLM model called Atlantis from a group of AI experts called Team Atlanta discovered six zero-day flaws in SQLite3 and even autonomously identified and patched one of them during the AI Cyber Challenge organized by ARPA-H, DARPA, and the White House, the team revealed in a blog post in August.
In fact, one of Team Atlanta's discoveries, a null-pointer dereference flaw in SQLite, inspired the Big Sleep team "to see if we could find a more serious vulnerability," according to the post.
Google and other software development teams already use a process called fuzz-testing, colloquially known as fuzzing, to help find flaws in applications before release. Fuzzing targets software with deliberately malformed inputs to see whether it crashes, so that developers can investigate and fix the cause.
In fact, Google earlier this year released an AI-boosted fuzzing framework as an open source resource to help developers and researchers improve how they find software vulnerabilities. The framework automates manual aspects of fuzz-testing and uses LLMs to write project-specific code to boost code coverage.
While fuzzing has helped significantly reduce the number of flaws in production software, developers need a more powerful approach to find the bugs that are difficult or impossible to find this way, such as variants of previously found and patched vulnerabilities, the Big Sleep team wrote.
"As this trend continues, it's clear that fuzzing is not succeeding at catching such variants, and that for attackers, manual variant analysis is a cost-effective approach," the team wrote in the post.
Moreover, variant analysis is a better fit for current LLMs because it provides them with a starting point, such as the details of a previously fixed flaw, for a search, and thus removes a lot of ambiguity from AI-based vulnerability testing, according to Google. In fact, at this point in the evolution of LLMs, the lack of this type of starting point can cause confusion, the team noted.
"We're hopeful that AI can narrow this gap," the Big Sleep team wrote. "We think that this is a promising path towards finally turning the tables and achieving an asymmetric advantage for defenders."
Google's Big Sleep is still in its research phase, and using AI-based automation to identify software flaws is, overall, a new discipline. However, there already are tools available that developers can use to get a jump on finding vulnerabilities in software code before public release.
Late last month, researchers at Protect AI released Vulnhuntr, a free, open source static code analyzer that can find zero-day vulnerabilities in Python codebases using Anthropic's Claude artificial intelligence (AI) model.
Indeed, Google's discovery shows promising progress for the future of using AI to help developers troubleshoot software before flaws seep into production versions.
"Finding vulnerabilities in software before it's even released means that there's no scope for attackers to compete: the vulnerabilities are fixed before attackers even have a chance to use them," Google's Big Sleep team wrote.
