ConfusedPilot Attack Can Manipulate RAG-Based AI Systems

  /     /     /  
Publicated : 23/11/2024   Category : security


ConfusedPilot Attack Can Manipulate RAG-Based AI Systems


Attackers can introduce a malicious document in systems such as Microsoft 365 Copilot to confuse the system, potentially leading to widespread misinformation and compromised decision-making processes.



Attackers can add a malicious document to the data pools used by artificial intelligence (AI) systems to create responses, which can confuse the system and potentially lead to misinformation and compromised decision-making processes within organizations.
Researchers from the Spark Research Lab at the University of Texas (UT) at Austin discovered the attack vector, which theyve dubbed
ConfusedPilot
because it affects all retrieval augmented generation
(RAG)-based AI systems
, including
Microsoft 365 Copilot.
This includes other RAG-based systems that use Llama, Vicuna, and OpenAI, according to the researchers.
This attack allows manipulation of AI responses simply by adding malicious content to any documents the AI system might reference, Claude Mandy, chief evangelist at Symmetry, wrote in a
paper
about the attack, which was presented at the DEF CON AI Village 2024 conference in August but was not widely reported. The research was conducted under the supervision of Symmetry CEO and UT professor Mohit Tiwari.
Given that 65% of Fortune 500 companies currently implement or are planning to implement RAG-based
AI systems
, the potential impact of these attacks cannot be overstated, Mandy wrote. Moreover, the attack is especially dangerous that it requires only basic access to manipulate responses by all RAG-based AI implementations, can persist even after malicious content is removed, and bypasses current AI security measures, he said.
RAG is a technique for improving response quality and eliminating a large language model (LLM) system’s expensive retraining or fine-tuning phase. It adds a step to the system in which the model retrieves external data to augment its knowledge base, thus enhancing accuracy and reliability in generating responses without the need for retraining or fine-tuning, the researchers said.
The researchers chose to focus on Microsoft 365 Copilot for the sake of their presentation and their paper, even though it is not the only RAG-based system affected. Rather, the main culprit of this problem is misuse of RAG-based systems … via improper setup of access control and data security mechanisms, according to the ConfusedPilot website hosted by the researchers.
In normal circumstances, a RAG-based AI system will use a retrieval mechanism to extract relevant keywords to search and match with resources stored in a vector database, using that embedded context to create a new prompt containing the relevant information to reference.
In a ConfusedPilot attack, a threat actor could introduce an innocuous document that contains specifically crafted strings into the target’s environment. This could be achieved by any identity with access to save documents or data to an environment indexed by the AI copilot, Mandy wrote.
The attack flow that follows from the users perspective is this: When a user makes a relevant query, the RAG system retrieves the document containing these strings. The malicious document contains strings that could act as instructions to the AI system that introduce a
variety of malicious scenarios
.
These include: content suppression, in which the malicious instructions cause the AI to disregard other relevant, legitimate content; misinformation generation, in which the AI generates a response using only the corrupted information; and false attribution, in which the response may be falsely attributed to legitimate sources, increasing its perceived credibility.
Moreover, even if the malicious document is later removed, the corrupted information may persist in the system’s responses for a period of time because the AI system retains the instructions, the researchers noted.
The ConfusedPilot attack basically has two victims: The first is the LLM within the RAG-based system, while the second is the person receiving the response from the LLM, who very likely could be an individual working at a large enterprise or service provider. Indeed, these two types of companies are especially vulnerable to the attack, as they allow multiple users or departments to contribute to the data pool used by these
AI systems
, Mandy noted.
Any environment that allows the input of data from multiple sources or users — either internally or from external partners — is at higher risk, given that this attack only requires data to be indexed by the AI Copilots, he wrote.
Enterprise systems likely to be negatively affected by the attack include enterprise knowledge-management systems, AI-assisted decision support systems, and customer-facing AI services.
Microsoft did not immediately respond to request for comment by Dark Reading on the attacks affect on Copilot. However, the researchers noted in their paper that the company has been responsive in coming up with practical mitigation strategies and
addressing the potential for attack
in its development of its AI technology. Indeed, the latter is key to long-term defense against such an attack, which depends on better architectural models that try to separate the data plan from the control plan in these models, Mandy noted.
Meanwhile, current strategies for mitigation include: data access controls that limit and scrutinize who can upload, modify, or delete data that RAG-based systems reference; data integrity audits that regularly verify the integrity of an organizations data repositories to detect unauthorized changes or the introduction of malicious content early; and data segmentation that keeps sensitive data isolated from broader datasets wherever possible to prevent the spread of corrupted info across the AI system.

Last News

▸ 27 Million South Koreans Hit by Online Gaming Theft. ◂
Discovered: 23/12/2024
Category: security

▸ Homeland Security Background Checks Breach Raises Concerns. ◂
Discovered: 23/12/2024
Category: security

▸ Fully committed to the future world of technology. ◂
Discovered: 23/12/2024
Category: security


Cyber Security Categories
Google Dorks Database
Exploits Vulnerability
Exploit Shellcodes

CVE List
Tools/Apps
News/Aarticles

Phishing Database
Deepfake Detection
Trends/Statistics & Live Infos



Tags:
ConfusedPilot Attack Can Manipulate RAG-Based AI Systems