PLAYBOOK · P2 · OWASP Agentic AI v1.1
Preventing Memory Poisoning & AI Knowledge Corruption
Keep short- and long-term memory clean of adversarial writes and retrievals.
Goal: Prevent AI agents from storing, retrieving, or propagating manipulated data that could corrupt decision-making or spread misinformation.
At a glance
Defence-in-depth chain
When a poisoned write targets agent memory or a knowledge store, proactive controls (memory content validation and permission-aware vector retrieval) block it at the write boundary by validating content and enforcing access policy. If a poisoned entry persists anyway, reactive controls (memory anomaly detection and separation of actor and recorder) detect the anomaly at runtime and trigger a rollback. Detective controls (multi-source verification) then cross-check knowledge lineage to flag drift and contamination.
Step 1 (proactive): Secure AI memory access & validation
- Scan every candidate memory insertion for anomalies and reject writes from untrusted sources, applying cryptographic validation for long-term stored entries.
- Log all memory reads and writes to an immutable audit trail so every access can be traced after the fact.
- Isolate each session's memory partition so the agent cannot read or carry over knowledge from a different user's session.
  Helmwart controls: Session isolation
- Enforce access-control lists on vector stores and shared memory so each agent can only retrieve data relevant to its current task.
- Pin every model artefact to a registry-managed, checksummed version and gate promotion to production on a behavioural regression suite.
  Helmwart controls: Model registry
- Enforce retention limits keyed to data sensitivity so agents discard historical knowledge before it can be exploited.
  Helmwart controls: Mem validate
- Record the originating source for every memory update so modifications can be traced back to a trusted or untrusted actor.
  Helmwart controls: Provenance tracking
- Require multi-agent or external corroboration before committing any memory change that will persist across sessions.
- Cross-check new knowledge against trusted external sources before writing it to long-term storage, and fail closed when confidence is insufficient.
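Several of the Step 1 actions (source checks at the write boundary, cryptographic validation of stored entries, per-session partitions, provenance, and audit logging) can be sketched together. The following is a minimal illustration only, not a Helmwart or OWASP API: `SessionMemory`, `MemoryEntry`, `TRUSTED_SOURCES`, and `SIGNING_KEY` are hypothetical names, the allow-list is assumed, and the in-process list stands in for what would need to be an immutable audit store.

```python
from __future__ import annotations

import hashlib
import hmac
import json
from dataclasses import dataclass

# All names below are hypothetical; none come from OWASP or a Helmwart API.
TRUSTED_SOURCES = {"internal-kb", "verified-tool"}    # assumed allow-list
SIGNING_KEY = b"demo-key-load-from-a-secret-manager"  # assumption: never hard-code in practice


class MemoryWriteRejected(Exception):
    """Raised when a candidate write fails validation at the write boundary."""


@dataclass(frozen=True)
class MemoryEntry:
    session_id: str
    source: str      # provenance: who produced this content
    content: str
    signature: str   # HMAC-SHA256 over the three fields above


def sign(session_id: str, source: str, content: str) -> str:
    payload = json.dumps([session_id, source, content]).encode("utf-8")
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()


class SessionMemory:
    def __init__(self) -> None:
        self._partitions: dict[str, list[MemoryEntry]] = {}  # one partition per session
        self.audit_log: list[tuple[str, str, str]] = []      # append-only in this sketch

    def write(self, session_id: str, source: str, content: str) -> MemoryEntry:
        # Reject writes from sources that are not on the allow-list.
        if source not in TRUSTED_SOURCES:
            self.audit_log.append(("write_rejected", session_id, source))
            raise MemoryWriteRejected(f"untrusted source: {source}")
        entry = MemoryEntry(session_id, source, content,
                            sign(session_id, source, content))
        self._partitions.setdefault(session_id, []).append(entry)
        self.audit_log.append(("write", session_id, source))
        return entry

    def read(self, session_id: str) -> list[MemoryEntry]:
        # Only the caller's own partition is consulted, and entries whose
        # signature no longer matches their content are silently dropped.
        self.audit_log.append(("read", session_id, "-"))
        return [
            e for e in self._partitions.get(session_id, [])
            if e.session_id == session_id
            and hmac.compare_digest(e.signature,
                                    sign(e.session_id, e.source, e.content))
        ]
```

In this sketch a read for one session never touches another session's partition, and an entry altered after it was signed fails verification on the next read. Retention limits, access-control lists on vector stores, and model pinning are out of scope here.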
Step 2 (reactive): Detect & respond to memory poisoning
- Monitor memory logs in real time for unexpected updates or unauthorised access and raise an alert on any anomaly.
  Helmwart controls: Mem anomaly
- Re-run multi-agent or external validation on any suspect memory entry after it has been committed to confirm or rule out poisoning.
- Periodically re-check existing stored knowledge against trusted sources to detect drift or contamination introduced over time.
  Helmwart controls: Multi-source verify
- Roll back agent knowledge to the last validated snapshot whenever an anomaly is detected in the memory store.
  Helmwart controls: Mem anomaly
- Take periodic memory snapshots with actor attribution so a forensic rollback is possible after a poisoning event.
- Alert when memory modification frequency for an agent spikes, as a sudden high rewrite rate is a reliable early indicator of manipulation.
  Helmwart controls: Mem anomaly
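The snapshot, rate-spike alert, and rollback actions in Step 2 can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: `MonitoredMemory`, `Snapshot`, and the `max_writes` / `window_seconds` thresholds are hypothetical, and timestamps are passed in explicitly so the behaviour is easy to follow. The sketch only raises an alert on a spike; whether to block the write or roll back automatically is a policy decision left to the caller.

```python
from __future__ import annotations

import copy
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    taken_at: float
    writers: tuple[str, ...]  # actor attribution: agents that wrote since the previous snapshot
    state: dict


class MonitoredMemory:
    """Hypothetical memory store with write-rate alerts and snapshot rollback."""

    def __init__(self, max_writes: int = 5, window_seconds: float = 60.0) -> None:
        self.state: dict[str, str] = {}
        self.snapshots: list[Snapshot] = []
        self.alerts: list[tuple[float, str, int]] = []
        self._max_writes = max_writes
        self._window = window_seconds
        self._recent: dict[str, deque] = {}   # agent_id -> timestamps of recent writes
        self._writers: set[str] = set()

    def snapshot(self, now: float) -> Snapshot:
        snap = Snapshot(now, tuple(sorted(self._writers)), copy.deepcopy(self.state))
        self.snapshots.append(snap)
        self._writers.clear()
        return snap

    def write(self, agent_id: str, key: str, value: str, now: float) -> bool:
        """Apply the write; return False (and record an alert) on a rate spike."""
        times = self._recent.setdefault(agent_id, deque())
        times.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while times and now - times[0] > self._window:
            times.popleft()
        self.state[key] = value
        self._writers.add(agent_id)
        if len(times) > self._max_writes:
            self.alerts.append((now, agent_id, len(times)))
            return False
        return True

    def rollback(self) -> Snapshot:
        """Restore the most recent snapshot, assumed here to be the last validated one."""
        if not self.snapshots:
            raise RuntimeError("no snapshot available to roll back to")
        last = self.snapshots[-1]
        self.state = copy.deepcopy(last.state)
        return last
```

A caller would typically take a snapshot only after the current state has passed validation, then call `rollback()` when `write()` returns False or another detector fires. The recorded `writers` tuple gives a starting point for forensic review of who modified memory between snapshots.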
Step 3 (detective): Prevent the spread of false knowledge
- Cross-check new knowledge against multiple trusted sources before accepting it as established fact within the agent's knowledge base.
  Helmwart controls: Multi-source verify
- Block knowledge propagation from unverified sources so low-trust inputs cannot influence downstream agent decisions.
- Maintain a full provenance lineage of how agent knowledge evolved to enable forensic investigation into misinformation spread.
- At the RAG retrieval boundary, route attacker-influenceable content through a quarantined extractor model before the privileged executor ever sees it.
  Helmwart controls: PI defences+
- Version-control every knowledge update so corrupted changes can be audited and rolled back to a clean state.
- Continuously analyse memory access patterns and cross-audit access logs to catch long-term anomalies or policy drift before they compound.
- Apply embedding-space anomaly detection and adversarial re-ranking at the retrieval layer to reduce the impact of poisoned vectors.
  Helmwart controls: Memory-poison defence
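As one possible reading of the multi-source verification, propagation blocking, and version-control actions above, the sketch below accepts a new fact only when a quorum of independent verifiers confirms it and none contradicts it, and keeps every accepted change as a numbered revision. The names `corroborate`, `VersionedKnowledge`, `Revision`, the verifier signature, and the default quorum of two are all assumptions made for illustration. The quarantined extractor and embedding-space detection are not covered.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional

# A verifier returns True (confirms), False (contradicts) or None (no opinion).
Verifier = Callable[[str, str], Optional[bool]]


def corroborate(key: str, value: str, verifiers: Iterable[Verifier],
                quorum: int = 2) -> bool:
    """Fail closed: accept only with >= quorum confirmations and no contradiction."""
    confirmations = 0
    for check in verifiers:
        try:
            verdict = check(key, value)
        except Exception:
            verdict = None  # an unreachable source counts as "no opinion"
        if verdict is False:
            return False
        if verdict is True:
            confirmations += 1
    return confirmations >= quorum


@dataclass(frozen=True)
class Revision:
    version: int
    key: str
    value: str
    source: str  # lineage: where this change came from


@dataclass
class VersionedKnowledge:
    history: list[Revision] = field(default_factory=list)
    rejected: list[tuple[str, str, str]] = field(default_factory=list)

    def propose(self, key: str, value: str, source: str,
                verifiers: Iterable[Verifier], quorum: int = 2) -> bool:
        if not corroborate(key, value, verifiers, quorum):
            # Unverified knowledge is recorded for audit but never propagated.
            self.rejected.append((key, value, source))
            return False
        self.history.append(Revision(len(self.history) + 1, key, value, source))
        return True

    def current(self, as_of: Optional[int] = None) -> dict[str, str]:
        """Replay revisions up to `as_of` (inclusive); None means the latest state."""
        view: dict[str, str] = {}
        for rev in self.history:
            if as_of is None or rev.version <= as_of:
                view[rev.key] = rev.value
        return view
```

Because the history is append-only, rolling back to a clean state amounts to reading `current(as_of=n)` for the last revision known to be good, and the `source` field on each revision gives the lineage needed for later investigation.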
Source
OWASP Agentic AI: Threats and Mitigations v1.1 (Dec 2025), §Mitigation Strategies. Action text is taken verbatim or paraphrased from the canonical document; the Helmwart additions are the per-action mappings onto deployable mitigation entries.