Definition
Agents in a multi-agent system pass instructions, results, and context to one another across APIs, message buses, and shared state. Without per-message authentication and integrity controls, a single compromised peripheral agent becomes an injection source for every peer it can reach. One hop becomes n-hop, and the orchestrator is reachable from the outside.
What it means in practice
A data pipeline runs three agents in sequence: an ingestion agent, a transformation agent, and an output agent. The ingestion agent is compromised via a poisoned source file and begins forwarding manipulated records to the transformation agent. Because messages between agents are not signed or authenticated (just HTTP calls on an internal network), the transformation agent accepts the manipulated records as legitimate peer output and processes them faithfully. The output agent does the same. A single compromised peripheral agent corrupts the entire pipeline; the orchestrator sees three agents each reporting success.
The attack surface here is lateral, not vertical. An adversary does not need to reach the orchestrator directly; compromising any peer is sufficient if inter-agent messages are trusted without verification. The baseline controls for any production multi-agent system are: mutual TLS or equivalent on every channel (so both sides are authenticated), cryptographically signed payloads on every message (so tampering in transit is detectable), and explicit agent-identity verification at the application layer rather than network-endpoint trust alone. Without these, the blast radius of any single agent compromise extends to every peer it can reach.
Threat catalogue links
Base-catalog T-numbers follow OWASP source material; normalized MAS scenario entries are Helmwart editorial cross-references. Role colour-codes Helmwart's display weight: chips in the hero use the same scheme.
-
T12 Agent Communication Poisoning primary Inter-agent messages tampered with. The output of one becomes injection input of another.
Open threat detail → -
T16 Insecure Inter-Agent Protocol Abuse primary MCP/A2A protocols abused via consent-flow manipulation, MCP response injection, or weaponised tool descriptions.
Open threat detail → -
T30 Insecure Inter-Agent Communication Protocol primary Agent comms protocol vulnerable to eavesdropping, tampering, or replay between agents.
Open threat detail → -
T37 Cross-Chain Bridge Attack (Indirect) contributing Adversary exploits cross-chain bridge weaknesses to disrupt agent-controlled fund movement.
Open threat detail → -
T42 Cross-Client Interference via Shared Server contributing Shared MCP server state leaks across client sessions, causing unintended cross-tenant interference.
Open threat detail → -
T43 Network Exposure of MCP Server contributing MCP server reachable from networks beyond its intended trust zone; surface available to unintended callers.
Open threat detail →
MITRE ATLAS technique
OWASP has not published a 1:1 MITRE ATLAS mapping for this entry. The closest red-team techniques are referenced on the individual threat detail pages linked in the section above.
OWASP LLM Top 10 cross-references
From OWASP Appendix A (canonical inheritance)
Helmwart mechanistic crossover (named in OWASP body text, not in Appendix A)
Recommended mitigations
No single control answers an ASI; it is met by a layered stack. The cards below are ranked by how directly each control counters ASI07: the chips on each card name the threat of this ASI it actually covers, colour-coded by that threat's role.
Counters the core
Cover one or more of this ASI's primary threats — the strongest direct response.
An inter-agent message travels through channels and intermediate agents the receiver did not originate. If nothing binds the message cryptographically to its source, any intermediate hop can substitute or inject content that the receiving agent will treat as authoritative. Message signing closes that gap: the source agent signs each message payload with its private key, and the receiver verifies the signature against a distributed trust bundle before the content reaches the reasoning layer.
In most deployments, agents authenticate to one another with long-lived bearer tokens or shared secrets. If any one of those credentials is stolen, the attacker has persistent, platform-wide access until someone manually rotates it. SPIFFE replaces that model: each workload is issued a short-lived, cryptographically verifiable identity document, and every connection requires both sides to present one. No long-lived secrets traverse the network, and a compromised credential is worthless within its TTL.
An agent that has been compromised, poisoned, or gone rogue will, in most cases, behave differently from its established baseline. Anomaly isolation acts on that difference: when an agent's behaviour score crosses a configured threshold, it is quarantined automatically, credentials revoked, message-queue access cut, in-flight actions aborted. Manual revocation cannot match the speed that cascading multi-agent failures demand.
An agent that holds a persistent catalog of invokable tools can reach any of them at any point in its session. If its reasoning is manipulated or its identity is compromised, that persistent surface is fully available to an attacker. Just-in-time tool grants remove the standing surface: a policy broker issues a time-bound, task-scoped grant immediately before the tool is needed and revokes it automatically when the task completes or the window expires.
Each tool in an agent's catalog should expose only the methods, resources, and parameter ranges its designated role requires. Over-broad tool surfaces let individually authorised primitives compose into actions no human intended to grant; narrowing the scope at design time reduces both the attack surface and the blast radius of any compromise.
An MCP server response is content the LLM will reason over next. The model cannot distinguish tool output from instruction: that boundary must be enforced at the client, before the payload enters the context window. MCP response sanitisation applies schema validation, Unicode normalisation, control-token stripping, and structural wrapping to every tool result at the response boundary, so adversarial content embedded in a server response cannot redirect the agent's planner.
A single agent's judgment on a high-impact action can be wrong, manipulated, or compromised. Requiring N of M independent peer agents to agree before the action executes means an attacker or a systematic error must affect the quorum majority, not just one agent, before harm results.
In a multi-agent system, each agent routes decisions based on what its peers report. If a peer's behaviour becomes unreliable or adversarial, agents that keep treating it with full authority will propagate whatever errors or manipulations that peer introduces. Per-agent trust scoring addresses this by maintaining a continuously updated reputation score for every peer, derived from observed behaviour, and using that score to determine how much authority each incoming message carries.
An agent identity backed by a long-lived bearer token grants access for as long as that token remains valid. If the token is stolen, logged, or extracted from a running process, the attacker holds working credentials for weeks or months without any further action. Short-lived tokens address this by issuing credentials with a time-to-live measured in minutes or hours, automated and renewed by the platform rather than a human. When a token expires, access ends: the attacker must win the renewal process as well, which requires compromising a harder target than the token itself.
A tool's description field is concatenated directly into the agent's system prompt and shapes which tools the agent selects and how it uses them. An attacker who controls or compromises a tool manifest can plant a description that overstates the tool's scope, suppresses safety scaffolding, or embeds instruction-following language aimed at the agent. Validating descriptions at catalog-load, before the tool enters the runtime, stops that class of manipulation at the registration boundary rather than detecting its effects later at the call seam.
Broader coverage — 3 controls that address contributing or related threats
A shared MCP server that accepts connections from multiple clients is a concentration point where one client's session state, credentials, and resource budget are physically co-located with every other client's. Without enforced isolation, a malicious or compromised client can read another session's cached credentials, consume shared resources to the point of denying service to other clients, or exploit aggregate server permissions that exceed its own declared scope. Cross-client isolation is the set of structural controls that close those paths: per-session state scoping, per-client permission evaluation, and per-client resource quotas enforced at the server layer.
A blockchain transaction, once committed, cannot be undone. An agent that signs and broadcasts a transaction without an enforcement layer before it can exceed its authorised value, call a contract it was never provisioned to reach, or drain a wallet in a runaway loop, and by then the funds are gone. A transaction guard intercepts each proposed transaction before signing, checks it against value bounds, a contract allowlist, a gas or compute-unit limit, and a replay-protection nonce, and refuses to sign anything that falls outside declared policy.
Role-Based Access Control (RBAC) assigns every agent identity a named role that sets the outer limit on what it can reach. Attribute-Based Access Control (ABAC) narrows individual decisions inside that role by evaluating contextual attributes at request time. Used together, they enforce least privilege for non-human identities: the agent can only do what its role permits, and only when the request attributes satisfy the policy.
Sources
- OWASP Top 10 for Agentic Applications 2026 (canonical source) ↗ · OWASP Gen AI Security Project · Dec 2025 · CC BY-SA 4.0
- Agentic Top 10 side-by-side explainer ↗ · trydeepteam.com · secondary reference