Primer

Lethal Trifecta

Coined by Simon Willison in June 2025, the lethal trifecta names a deployment pattern that turns ordinary prompt injection into data exfiltration. It is not a vulnerability in any single component. Helmwart draws a red △ P U O badge on an agent whenever the graph shows that the agent meets all three of Willison's conditions.

The three legs (Willison's phrasing)

leg 1 · P PRIVATE

Access to your private data

"one of the most common purposes of tools in the first place"

Helmwart maps this to: the agent can traverse the graph to any node whose sensitivity is sensitive or regulated, or whose data contains PII or credentials. Typical examples: a shared-memory store holding session tokens, a document store of internal policies, an external system of record (banking core, EHR, ERP).

leg 2 · U UNTRUSTED

Exposure to untrusted content

"any mechanism by which text (or images) controlled by a malicious attacker could become available to your LLM"

Helmwart maps this to: an upstream node that can reach into the agent's context and is either tagged provenance: untrusted or is an end user, an open-web document store, or a user-uploaded corpus. This is the attacker's delivery channel; prompt injection lands here.

leg 3 · O OUTBOUND

The ability to externally communicate

"in a way that could be used to steal your data"

Helmwart maps this to: the agent can reach an external API, external system, or any node whose data is flagged outboundNetwork: true. This is the exit route: the wire over which exfiltrated data leaves the trust boundary. A web fetch, an email-send tool, even Markdown image rendering with attacker-controlled URLs all qualify.
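The three mappings above can be read as predicates over graph nodes. The following is a minimal sketch, not Helmwart's actual schema: only sensitivity, provenance, and outboundNetwork are named in the text, while the GraphNode shape, the kind values, and the containsPII and containsCredentials fields are assumptions made for illustration. It also assumes open-web stores and user-uploaded corpora carry the provenance: untrusted tag.

```typescript
// Hypothetical node shape. Only `sensitivity`, `provenance`, and
// `outboundNetwork` appear in the primer; every other name is assumed.
type NodeKind =
  | "agent"
  | "endUser"
  | "documentStore"
  | "sharedMemory"
  | "externalApi"
  | "externalSystem";

interface GraphNode {
  id: string;
  kind: NodeKind;
  sensitivity?: "public" | "internal" | "sensitive" | "regulated";
  provenance?: "trusted" | "untrusted";
  data?: {
    containsPII?: boolean;
    containsCredentials?: boolean;
    outboundNetwork?: boolean;
  };
}

// Leg P: sensitive or regulated nodes, or data holding PII or credentials.
function isPrivate(n: GraphNode): boolean {
  return (
    n.sensitivity === "sensitive" ||
    n.sensitivity === "regulated" ||
    n.data?.containsPII === true ||
    n.data?.containsCredentials === true
  );
}

// Leg U: anything tagged untrusted, plus end users.
function isUntrusted(n: GraphNode): boolean {
  return n.provenance === "untrusted" || n.kind === "endUser";
}

// Leg O: external APIs and systems, or data flagged outboundNetwork.
function isOutbound(n: GraphNode): boolean {
  return (
    n.kind === "externalApi" ||
    n.kind === "externalSystem" ||
    n.data?.outboundNetwork === true
  );
}

// Illustrative nodes, one per leg.
const sessionStore: GraphNode = {
  id: "session-store",
  kind: "sharedMemory",
  data: { containsCredentials: true },
};
const webCorpus: GraphNode = {
  id: "web-corpus",
  kind: "documentStore",
  provenance: "untrusted",
};
const emailTool: GraphNode = { id: "email-send", kind: "externalApi" };
```

Note that one node can satisfy more than one predicate: an external system of record that is also marked regulated would count towards both P and O.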

Three legs, four risky regions

The three legs intersect into four meaningful regions. Two-leg overlaps are real risks but bounded; the triple-intersection at the centre (the only region all three of P, U, and O share) is the EchoLeak class.

Figure: Venn diagram of the three legs (P private data, U untrusted content, O outbound capability). The regions are:

P only · private only, no exfil channel
U only · untrusted only, no target to steal
O only · outbound only, no injection, no payload
P + U · reputational risk
P + O · insider risk
U + O · hijack, nothing to leak
P + U + O · EchoLeak class, end-to-end exfil
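The regions of the diagram reduce to a lookup on three booleans. Here is a small sketch of that lookup; the Legs type and classifyRegion function are hypothetical names, and the labels are taken from the diagram.

```typescript
// Which of the three legs an agent has (hypothetical shape).
interface Legs {
  P: boolean;
  U: boolean;
  O: boolean;
}

// Map a combination of legs to the region label used in the diagram.
function classifyRegion(legs: Legs): string {
  const key =
    (legs.P ? "P" : "") + (legs.U ? "U" : "") + (legs.O ? "O" : "");
  switch (key) {
    case "P":
      return "private only: no exfil channel";
    case "U":
      return "untrusted only: no target to steal";
    case "O":
      return "outbound only: no injection, no payload";
    case "PU":
      return "reputational risk";
    case "PO":
      return "insider risk";
    case "UO":
      return "hijack: nothing to leak";
    case "PUO":
      return "EchoLeak class: end-to-end exfil";
    default:
      return "no legs";
  }
}

// A summariser that reads the open web and internal documents but has
// no way to send anything out sits in the P + U overlap.
const summariserRegion = classifyRegion({ P: true, U: true, O: false });
```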

Why the combination is special

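Any single leg is normal; almost every useful agent has at least one. Two legs are still routine. The qualitative shift comes at three: instructions injected through untrusted content (U) can direct the agent to read private data (P) and send it out over the outbound channel (O).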

Documented incidents

Willison's article cites three publicly disclosed examples; all three are trifecta deployments where the bug is in the combination, not any one tool:

Microsoft 365 Copilot: EchoLeak

CVE-2025-32711. A crafted email becomes a prompt injection (U) that instructs Copilot to read tenant data (P) and exfiltrate via image markdown to an attacker-controlled host (O). Zero-click; no user action required.

GitHub's official MCP server

A poisoned issue (U) instructs an agent connected to the MCP server to read private-repo contents (P) and dump them into a public-repo comment or external webhook (O). The MCP server didn't have a bug. The combination did.

GitLab Duo Chatbot

A crafted issue or comment (U) gets summarised into the assistant's context. The assistant has access to private project data (P) and can render Markdown that hits attacker URLs (O). Same shape, different vendor.

Willison's mitigation thesis

"End users have no choice but to avoid that lethal trifecta combination entirely." (Simon Willison)

Willison's argument: vendor guardrails are not enough, because any untrusted token that reaches the LLM can in principle change the agent's behaviour. The only robust answer is to deploy the agent such that one of the three legs is absent. That's why the badge is a hard signal in Helmwart, not a severity gradient.

The distributed trifecta in multi-agent systems

Single-agent trifecta analysis assumes one agent carries all three legs simultaneously. In multi-agent topologies the legs can be distributed across peers: one agent holds Private data, a second ingests Untrusted content, and a third has Outbound capability. No individual agent satisfies Willison's condition in isolation, yet the end-to-end exfiltration path still exists. The attacker just has to cross inter-agent boundaries to assemble it.

T12 Agent Communication Poisoning is the mechanism by which poisoned content crosses from the Untrusted agent into the Private-data agent's reasoning context. T30 Insecure Inter-Agent Communication Protocol removes the confidentiality and integrity guarantees that would otherwise constrain what crosses those boundaries. T47 Rogue MCP Server in Ecosystem can serve as the Outbound leg: an attacker-controlled server that appears legitimate becomes the exfiltration channel.

Helmwart's per-agent trifecta detection captures the single-agent case; the cross-peer combination requires topology-level review of the full graph.
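One way to start that topology-level review is to group agents that can talk to each other and check whether a group jointly holds all three legs. The sketch below is an illustration under stated assumptions, not a Helmwart feature: AgentLegs, Channel, and findDistributedTrifectas are hypothetical names, channels are treated as undirected, and the result is a coarse over-approximation that flags candidates for manual review rather than proving an exfiltration path.

```typescript
type Leg = "P" | "U" | "O";

// Legs each agent holds on its own (hypothetical input shape).
interface AgentLegs {
  id: string;
  legs: Leg[];
}

// An inter-agent communication channel, treated as undirected here.
type Channel = [string, string];

// Return groups of connected agents that jointly hold P, U, and O
// although no single member holds all three.
function findDistributedTrifectas(
  agents: AgentLegs[],
  channels: Channel[],
): string[][] {
  const adjacency = new Map<string, string[]>();
  const legsById = new Map<string, Leg[]>();
  for (const a of agents) {
    adjacency.set(a.id, []);
    legsById.set(a.id, a.legs);
  }
  for (const ch of channels) {
    adjacency.get(ch[0])?.push(ch[1]);
    adjacency.get(ch[1])?.push(ch[0]);
  }

  const seen = new Set<string>();
  const flagged: string[][] = [];
  for (const a of agents) {
    if (seen.has(a.id)) continue;
    // Collect the connected component containing this agent.
    const component: string[] = [];
    const stack: string[] = [a.id];
    seen.add(a.id);
    while (stack.length > 0) {
      const current = stack.pop() as string;
      component.push(current);
      for (const peer of adjacency.get(current) ?? []) {
        if (!seen.has(peer)) {
          seen.add(peer);
          stack.push(peer);
        }
      }
    }
    // Union the legs across the component.
    const union = new Set<Leg>();
    let singleAgentTrifecta = false;
    for (const id of component) {
      const legs = legsById.get(id) ?? [];
      legs.forEach((leg) => union.add(leg));
      if (new Set<Leg>(legs).size === 3) singleAgentTrifecta = true;
    }
    if (union.size === 3 && !singleAgentTrifecta) {
      flagged.push(component.sort());
    }
  }
  return flagged;
}

// Three peers, one leg each, chained by two channels.
const peers: AgentLegs[] = [
  { id: "researcher", legs: ["U"] },
  { id: "analyst", legs: ["P"] },
  { id: "notifier", legs: ["O"] },
];
const chained = findDistributedTrifectas(peers, [
  ["researcher", "analyst"],
  ["analyst", "notifier"],
]);
// Dropping the analyst-to-notifier channel separates O from P and U.
const separated = findDistributedTrifectas(peers, [["researcher", "analyst"]]);
```

Ignoring channel direction keeps the sketch short but will flag groups where data cannot actually flow from U to P to O; a stricter check would walk directed edges.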

How Helmwart detects it

The detector lives in detectTrifecta() in src/lib/graph/engine.ts. For every agent node, Helmwart performs three bounded reachability walks (depth ≤ 8), one per leg: to a private node, from an untrusted source, and to an outbound exit.

If all three walks return a non-empty path, the badge fires. The actual paths surface in the right-drawer inspector, showing which private node, which untrusted source, and which outbound exit are in play.
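The following sketch shows what three bounded walks of this kind could look like. It is not the code in engine.ts: SketchGraph, boundedWalk, and detectTrifectaSketch are hypothetical names, the leg membership is passed in as precomputed id sets, and edges are assumed to point in the direction of reach (source to agent, agent to store or exit). Only the depth limit of 8 and the fire-when-all-three-paths-exist rule come from the text.

```typescript
interface Edge {
  from: string;
  to: string;
}

// Hypothetical graph shape with leg membership precomputed.
interface SketchGraph {
  edges: Edge[];
  privateIds: Set<string>;
  untrustedIds: Set<string>;
  outboundIds: Set<string>;
}

const MAX_DEPTH = 8;

// Breadth-first walk from `start` over at most MAX_DEPTH edges, following
// edges forward or backward. Returns the path to the first node found in
// `targets`, or null when none is within range.
function boundedWalk(
  g: SketchGraph,
  start: string,
  targets: Set<string>,
  direction: "forward" | "backward",
): string[] | null {
  let frontier: string[][] = [[start]];
  const visited = new Set<string>([start]);
  for (let depth = 0; depth < MAX_DEPTH && frontier.length > 0; depth++) {
    const nextFrontier: string[][] = [];
    for (const path of frontier) {
      const tip = path[path.length - 1];
      for (const e of g.edges) {
        const src = direction === "forward" ? e.from : e.to;
        const dst = direction === "forward" ? e.to : e.from;
        if (src !== tip || visited.has(dst)) continue;
        const extended = path.concat(dst);
        if (targets.has(dst)) return extended;
        visited.add(dst);
        nextFrontier.push(extended);
      }
    }
    frontier = nextFrontier;
  }
  return null;
}

interface TrifectaResult {
  fires: boolean;
  privatePath: string[] | null;
  untrustedPath: string[] | null;
  outboundPath: string[] | null;
}

function detectTrifectaSketch(g: SketchGraph, agentId: string): TrifectaResult {
  const privatePath = boundedWalk(g, agentId, g.privateIds, "forward");
  // Walk upstream from the agent, then flip the path to read source-first.
  const upstream = boundedWalk(g, agentId, g.untrustedIds, "backward");
  const untrustedPath = upstream ? upstream.slice().reverse() : null;
  const outboundPath = boundedWalk(g, agentId, g.outboundIds, "forward");
  return {
    fires: privatePath !== null && untrustedPath !== null && outboundPath !== null,
    privatePath,
    untrustedPath,
    outboundPath,
  };
}

// Example: an inbox feeds an assistant that can read a CRM and send mail.
const demoGraph: SketchGraph = {
  edges: [
    { from: "inbox", to: "assistant" },
    { from: "assistant", to: "crm" },
    { from: "assistant", to: "mailer" },
  ],
  privateIds: new Set(["crm"]),
  untrustedIds: new Set(["inbox"]),
  outboundIds: new Set(["mailer"]),
};
const demoResult = detectTrifectaSketch(demoGraph, "assistant");
```

Returning the paths rather than a bare boolean mirrors the inspector behaviour described above: the caller can show which private node, untrusted source, and outbound exit were found.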

What to do when you see it

You don't have to remove the agent. You have to cut one leg. Helmwart shows the full reachability path so you can pick the cheapest cut. Common moves:

The badge clears as soon as any one walk returns empty. You don't need to fix every threat finding on the agent. You need to break the topology.
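To make "break the topology" concrete, here is a toy sketch in which removing a single edge empties one walk and clears the badge. Link, canReach, badgeFires, and cut are hypothetical helpers with hard-coded node ids, written only to illustrate the idea; they are not Helmwart APIs.

```typescript
interface Link {
  from: string;
  to: string;
}

// Forward reachability limited to maxDepth edges (8, as in the detector).
function canReach(
  links: Link[],
  start: string,
  goal: string,
  maxDepth: number = 8,
): boolean {
  let frontier: string[] = [start];
  const visited = new Set<string>([start]);
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const l of links) {
        if (l.from !== node || visited.has(l.to)) continue;
        if (l.to === goal) return true;
        visited.add(l.to);
        next.push(l.to);
      }
    }
    frontier = next;
  }
  return false;
}

// The badge fires only while all three legs are reachable.
function badgeFires(links: Link[]): boolean {
  return (
    canReach(links, "assistant", "crm") && // P: agent reaches private data
    canReach(links, "inbox", "assistant") && // U: untrusted source reaches agent
    canReach(links, "assistant", "mailer") // O: agent reaches an outbound exit
  );
}

// Remove one edge from the topology.
function cut(links: Link[], from: string, to: string): Link[] {
  return links.filter((l) => !(l.from === from && l.to === to));
}

const topology: Link[] = [
  { from: "inbox", to: "assistant" },
  { from: "assistant", to: "crm" },
  { from: "assistant", to: "mailer" },
];

const firesBefore = badgeFires(topology);
// Cutting only the outbound leg is enough; P and U stay untouched.
const firesAfter = badgeFires(cut(topology, "assistant", "mailer"));
```

In a real graph a leg may be reachable through more than one path, so a cut has to remove every route for that leg before the walk comes back empty.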