← All primers

Primer

RAG

Retrieval-Augmented Generation is the pattern where a language model answers a question (or plans an action) using context fetched from an external store at inference time, rather than only what is encoded in the model's weights. The store is typically a vector database holding embeddings of documents the system can search over.

Why agentic systems lean on RAG heavily

Two reasons. First, agents need access to information that did not exist at training time (current customer data, current policies, current documents), so retrieval is the natural way to bring it in. Second, keeping organisational knowledge out of the model and inside a retrieval store is operationally simpler than fine-tuning, and gives finer-grained access control.

The OWASP document treats RAG as a core supporting service in its reference architecture rather than as a tool. The distinction matters: a tool is something the agent decides to call; the retrieval surface is consulted automatically as part of forming a response, often invisibly to the user.

Why this is a primary attack target

The retrieval surface is data that becomes part of the prompt. Anything that gets indexed into the store can subsequently steer the agent. Indirect prompt injection (instructions hidden inside an apparently benign document the user uploaded, or inside a web page the agent crawled) is the most common way this surface is abused. From the agent's perspective the malicious instruction looks like context, not user input.

Three classes of failure recur:

The two paths

A RAG system has a write path (top: ingestion into the store) and a read path (bottom: retrieval into the prompt). Threats land where the two paths meet the store and the prompt assembly. Boxes are clickable to the threat pages.

WRITE PATH SOURCE doc · web · user upload CHUNK split into passages EMBED passage → vector VECTOR STORE persisted, retrieval-indexed T1 Memory Poisoning: Adversarial content written into short- or long-term memory contaminates future decisions. T1 poison shared store READ PATH QUERY user / agent EMBED query → vector RETRIEVE top-k from store PROMPT ASSEMBLY context + query → prompt MODEL generates response T8 Repudiation and Untraceability: Agent actions cannot be reliably traced, attributed, or reconstructed. T8 T12 Agent Communication Poisoning: Inter-agent messages tampered with. The output of one becomes injection input of another. T12

How OWASP's catalog reflects RAG-specific risks

OWASP T1 Memory Poisoning covers the long-term memory and vector-store poisoning case explicitly. T5 Cascading Hallucinations applies because retrieved content gets quoted, cited, and reinforced through reflection loops. T12 Agent Communication Poisoning applies in multi-agent systems where one agent's retrieval feeds another agent's reasoning.

OWASP also notes that RAG-specific concerns overlap heavily with LLM08:2025 Vector and Embedding Weaknesses from the LLM Top 10. Helmwart cross-references that document rather than re-deriving its content.

Where this fits in Helmwart

Read this primer if your deployment uses RAG (almost all production agents do) and you want to understand why the retrieval boundary is treated as a primary threat surface in the catalog rather than an implementation detail.

Where to go next

Source: OWASP Agentic AI: Threats and Mitigations v1.1 (Dec 2025), §Reference Threat Model and §T1 Memory Poisoning.