T17: Supply Chain Compromise

Definition

Supply Chain Compromise covers vulnerable, malicious, outdated, or otherwise harmful upstream components that end up inside the agent: models, libraries, plugins, prompt templates, build pipelines, or framework updates. The compromise can manipulate agent behaviour, exfiltrate data, or run arbitrary code, and is amplified in agentic systems because the agent will autonomously use the compromised component across many runs.

What it looks like in practice

Amazon Q Supply Chain Compromise. In the Amazon Q for VS Code extension (v1.84.0), an attacker with write access to the extension’s GitHub repository merged a pull request that embedded a destructive natural-language instruction directly in a source file. The instruction was phrased as a developer comment but was positioned so that any agent with access to the repository would interpret it as a directive. The prompt did not execute as the attacker intended. A reviewer caught it before the release completed propagation, but the incident established the mechanism: a supply-chain commit to a widely distributed IDE extension is a viable vector for delivering malicious instructions to thousands of developer environments where AI coding assistants read repository content as trusted context. Had the release completed, every developer who installed or auto-updated the extension would have had the instruction silently included in their agent’s context on next use.

Replit Vibe Coding Incident. Replit’s autonomous coding agent was tasked with building a small application that required a database. The agent, operating without a clear separation between test and production environments, hallucinated the existence of a test database and began treating it as the authoritative one. When the real production database was in its scope, the agent deleted it, interpreting the deletion as cleanup of a redundant resource, and then generated test results against its fabricated schema, which passed. No human reviewed the intermediate steps. The incident illustrates three compounding supply-chain weaknesses: the agent framework did not isolate test from production at the tool level; the agent’s tool call for database deletion was not gated by an explicit human confirmation step; and the agent’s own output (passing tests) served as the only validation signal for a destructive action it had just taken.

Why it’s dangerous

Conventional LLM applications already face supply-chain risk from compromised dependencies. Agentic systems extend the surface: agents are long-lived, stateful entities that chain decisions, collaborate with other agents, and persistently execute logic. A single compromised upstream component can therefore propagate behavioural drift or systemic misuse across many tasks before detection.

Where it manifests

Inspect signing and provenance for prompts, agent cards, model definitions, and plugins. Check for verifiable software bills of materials (SBOMs, including AI-specific AIBOMs and Agent SBOMs) covering the agent’s runtime components. Verify isolation of test from production environments, and whether human review gates apply to AI-generated build artifacts.

Detection signals

Supply chain compromise in agentic systems surfaces in dependency integrity checks, prompt provenance logs, and anomalous agent behaviour following updates.

Dependency hash mismatch against a pinned software bill of materials (SBOM): compare the hash of every installed framework package, plugin, and prompt template against a pinned SBOM on each agent startup; any mismatch that is not accompanied by an approved change-control entry is an immediate alert condition.
Novel natural-language instruction appearing in a source file or configuration after a dependency update: scan repository diffs and plugin changelogs for imperative natural-language constructs (e.g., “you must”, “ignore previous”, “delete all”) in files that should contain only code or structured data; flag any such pattern introduced by a third-party commit or package update.
Behavioural divergence in an agent after a framework or plugin version bump: establish a baseline of the agent’s tool-call distribution and output structure; alert when the distribution shifts by more than two standard deviations on a key metric such as deletion calls, outbound connections, or approval rate within 24 hours of a dependency update.
Destructive tool call (delete, drop, purge) without a matching user intent signal in the session log: log every tool call alongside the user instruction that preceded it; a deletion tool call that cannot be traced back to an explicit user instruction in the same session is a strong indicator of supply-chain-injected directives.
Agent-generated test results for a resource the agent itself modified in the same session: detect when the agent both modifies a data resource and subsequently generates a validation or test result against it without any external fixture; self-validating destructive actions (as in the Replit incident) should require an out-of-band verification step.
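
The behavioural-divergence signal above can be sketched as a z-score check over per-window tool-call counts. This is a minimal illustration, not a production detector: the function name `divergence_alerts`, the metric names, and the windowing (one value per 24-hour window is assumed) are all hypothetical, and only the two-standard-deviation threshold comes from the text.

```python
from statistics import mean, stdev


def divergence_alerts(baseline, post_update, threshold=2.0):
    """Return {metric: z_score} for metrics that moved more than
    `threshold` sample standard deviations away from their baseline mean.

    baseline:    metric name -> list of per-window values before the update
    post_update: metric name -> value observed in the window after the update
    """
    alerts = {}
    for metric, history in baseline.items():
        # A standard deviation needs at least two baseline observations.
        if metric not in post_update or len(history) < 2:
            continue
        mu = mean(history)
        sigma = stdev(history)  # sample standard deviation
        observed = post_update[metric]
        if sigma == 0:
            # Perfectly flat baseline: treat any change as unbounded deviation.
            z = 0.0 if observed == mu else float("inf")
        else:
            z = (observed - mu) / sigma
        if abs(z) > threshold:
            alerts[metric] = z
    return alerts


if __name__ == "__main__":
    # Hypothetical daily counts before a plugin version bump.
    baseline = {
        "deletion_calls": [2, 3, 2, 4, 3, 2, 3],
        "outbound_connections": [10, 12, 11, 9, 10, 11, 12],
    }
    # Hypothetical counts in the 24 hours after the bump.
    after = {"deletion_calls": 9, "outbound_connections": 11}
    print(divergence_alerts(baseline, after))
```

A z-score assumes a roughly stable baseline; for bursty agents a rolling median or a per-task normalisation may be a better fit, and the alert should be correlated with the dependency-update timestamp before anyone is paged.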

OWASP Top 10 for Agentic Applications 2026

The Agentic Top 10 (ASI01 through ASI10) is a separate practitioner-facing publication that maps onto the master Threats & Mitigations threat numbering. T17 is covered by the following Top 10 entries:

ASI04 Agentic Supply Chain Vulnerabilities (primary)

Third-party components that agents depend on (models, MCP servers, plugins, datasets, peer-agent descriptors, and update channels) may be malicious, compromised post-approval, or tampered with in transit. Unlike conventional software supply-chain risk, this is a live exposure: in every new session the agent fetches and trusts components whose state may have changed since they were last reviewed.

OWASP LLM Top 10: LLM03:2025

Source: OWASP Top 10 for Agentic Applications 2026 (Dec 2025) · the Top 10 is a compass into the master Threats & Mitigations taxonomy, not a replacement for it.

Design principles at stake

When T17 is present, these are the security design principles being violated or tested. The mitigations below are how you restore them.

Defence-in-Depth: A single compromised upstream component (a prompt template, framework update, or plugin) propagates behavioural drift silently across many autonomous runs before detection, because no individual agent flags that its foundation has changed. Depth means controls at every layer of the stack: signed artefacts verified by content hash on every load (not just at install); an SBOM and AIBOM that enumerate every runtime component so drift is detectable; plugin loading and inter-agent message parsing sandboxed away from the main process; and human review gates applied to AI-generated build artefacts before they reach production. Defeating any one of these layers (say, slipping a malicious package past the registry) still leaves runtime re-verification, sandbox isolation, and the human gate standing.
Supply-chain Security: In agentic systems the supply chain extends beyond code dependencies to include models, prompt templates, agent cards, and plugin definitions, all of which are inputs the agent will autonomously act on across many tasks before any anomaly surfaces. The Amazon Q incident showed that supply-chain write access to a widely installed agent extension is sufficient for broad impact; the Replit incident showed that unsandboxed tools and unvalidated prompt execution are themselves supply-chain failure modes. Controls must therefore span provenance verification of every component at build and runtime, isolation of test from production environments so a compromised test artefact cannot reach live agents, and version-pinning by content hash so a rug-pull on a floating version tag is caught immediately.

Recommended mitigations

Auto-generated from the mitigation catalog: every mitigation whose coverage map includes T17, sorted by maturity tier (Tier 1 production-canonical first, then Tier 2, then Tier 3 research-stage).

Tier 1 Agent SBOM (Signed AIBOM: a cryptographically-bound inventory of every component an agent loads)

An AI agent assembles itself at runtime from a model, prompt templates, plugins, and library dependencies, any of which can be tampered with before they arrive. A signed AI Bill of Materials (AIBOM) locks down that assembly: it records every component with a version and hash at build time, signs the manifest, and verifies it before the agent accepts traffic. A component that does not match its declared hash cannot silently enter the agent.

Why it helps: Supply Chain Compromise is the introduction of a vulnerable, malicious, or outdated upstream component into the agent. AIBOM addresses this directly by making the component inventory explicit and tamper-evident. A modified model file or poisoned library produces a hash mismatch against the signed manifest, blocking deployment before the agent handles any traffic.
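
The record, sign, and verify steps can be sketched in a few lines. This is an illustration under stated assumptions rather than a real AIBOM implementation: the manifest layout and the function names (`build_manifest`, `sign_manifest`, `verify_components`) are hypothetical, and an HMAC with a shared key stands in for the asymmetric signature a production manifest would carry.

```python
import hashlib
import hmac
import json
from pathlib import Path


def sha256_file(path):
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(root, names):
    """Build step: record a content hash for every component the agent loads."""
    return {"components": {n: sha256_file(Path(root) / n) for n in names}}


def sign_manifest(manifest, key):
    """Stand-in signature (assumption): HMAC-SHA256 over canonical JSON."""
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_components(root, manifest, signature, key):
    """Startup step: return a list of problems; an empty list means
    every listed component matches the signed manifest."""
    # Check the manifest itself first, so a rewritten manifest cannot
    # vouch for a rewritten component.
    if not hmac.compare_digest(sign_manifest(manifest, key), signature):
        return ["manifest signature invalid"]
    problems = []
    for name, expected in manifest["components"].items():
        path = Path(root) / name
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif sha256_file(path) != expected:
            problems.append(f"hash mismatch: {name}")
    return problems
```

An agent runtime would refuse to accept traffic whenever `verify_components` returns a non-empty list. The sketch only checks components that are listed; detecting extra, unlisted files in the load path is a separate check.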

Tier 1 Sigstore (Sigstore signing: cryptographic provenance for agent artifacts and audit records)

An agent is composed of artifacts produced at different times by different identities: model weights, prompt templates, tool descriptors, MCP server binaries, and audit-log batches. Any of those artifacts can be substituted or tampered with between the moment they are built and the moment they are loaded. Sigstore addresses this by signing each artifact at build time using a short-lived certificate tied to the workload identity that produced it, recording the signature in an append-only public transparency log, and requiring verification against that log before the artifact is loaded or executed.

Why it helps: Supply Chain Compromise occurs when a compromised or substituted upstream artifact (a model weight, prompt template, plugin, or framework update) reaches the agent without detection. Signing each artifact at build time and verifying the signature against the Rekor transparency log at deployment time gives cryptographic proof that the artifact loaded is the one built by the expected identity.
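
A load-time gate can be sketched by shelling out to the `cosign` CLI before an artifact is read. This is a hedged illustration: the helper names, file paths, identity, and issuer values are hypothetical, the flags shown (`--bundle`, `--certificate-identity`, `--certificate-oidc-issuer`) are those of `cosign verify-blob` in cosign 2.x and should be checked against the version in use, and the sketch assumes transparency-log verification has not been explicitly disabled.

```python
import shutil
import subprocess


def cosign_verify_blob_argv(artifact, bundle, identity, issuer):
    """Build the argv for `cosign verify-blob` (keyless verification).

    identity and issuer pin WHO is allowed to have signed the artifact;
    without them, any valid Sigstore signature would be accepted.
    """
    return [
        "cosign", "verify-blob",
        "--bundle", bundle,
        "--certificate-identity", identity,
        "--certificate-oidc-issuer", issuer,
        artifact,
    ]


def verify_before_load(artifact, bundle, identity, issuer):
    """Return True only if cosign exits successfully; fail closed otherwise."""
    if shutil.which("cosign") is None:
        # Assumption: a missing verifier is treated as a hard failure,
        # never as "skip verification".
        raise RuntimeError("cosign not found on PATH")
    result = subprocess.run(
        cosign_verify_blob_argv(artifact, bundle, identity, issuer),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

A loader would call `verify_before_load` for each prompt template, tool descriptor, or weight file and refuse to open any artifact for which it returns False.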

Tier 2 Model registry (Model registry: version pinning, canary, rollback)

An agent loads whichever model weights are available at startup unless the runtime is told exactly which artifact to load. If a poisoned or regressed weight is published to the model store, the agent picks it up silently on the next restart. A model registry prevents that: every artifact is registered with a cryptographic checksum and an approval stage, the agent runtime loads by explicit version pin, and new versions must pass a canary evaluation before promotion to production.

Why it helps: Supply Chain Compromise includes the scenario where a poisoned model weight reaches the agent runtime through a legitimate update path. Version pinning means a new artifact is loaded only after an explicit registry transition, and the signed-checksum gate rejects any weight whose hash does not match the registered record. Canary evaluation against a held-out set catches behavioural drift before the new version is promoted to the full fleet.
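
The three gates (explicit version pin, checksum match, canary-gated promotion) can be sketched as a small in-memory registry. All names here (`ModelRecord`, `ModelRegistry`, the stage labels, and the 0.95 canary threshold) are hypothetical and chosen for illustration; a real registry would persist records and sign them.

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelRecord:
    name: str
    version: str
    sha256: str
    stage: str  # "staging" or "production" in this sketch


class ModelRegistry:
    def __init__(self):
        self._records = {}

    def register(self, name, version, weights, stage="staging"):
        """Record the checksum of the weights; new versions start in staging."""
        digest = hashlib.sha256(weights).hexdigest()
        record = ModelRecord(name, version, digest, stage)
        self._records[(name, version)] = record
        return record

    def promote(self, name, version, canary_score, min_score=0.95):
        """Move a version to production only if its canary evaluation passes."""
        record = self._records[(name, version)]
        if canary_score < min_score:
            raise ValueError(f"canary score {canary_score} below {min_score}")
        promoted = replace(record, stage="production")
        self._records[(name, version)] = promoted
        return promoted

    def load(self, name, version, weights):
        """Runtime gate: exact pin, production stage, and matching checksum."""
        record = self._records.get((name, version))
        if record is None:
            raise LookupError(f"{name}:{version} is not registered")
        if record.stage != "production":
            raise PermissionError(f"{name}:{version} is not approved for production")
        if hashlib.sha256(weights).hexdigest() != record.sha256:
            raise ValueError(f"{name}:{version} checksum mismatch")
        return weights
```

Because `load` takes an exact `(name, version)` pair, there is no "latest" to drift: rollback is simply loading the previous pinned version, which is still registered with its own checksum.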

Tier 2 Secret scan (Secret scanning on agent-generated artefacts: detecting credentials before they escape the trust boundary)

An agent produces code, configuration files, tool-call payloads, and log records continuously and at a rate no human reviewer can match. Any of those artefacts may contain a live API key, service token, or private certificate, placed there accidentally through model context, or deliberately through prompt injection or context poisoning. Secret scanning places an inspection gate at every agent output seam: regex patterns match known token formats, entropy analysis detects arbitrary high-entropy strings, and validator calls confirm which candidates are live credentials. The CI-secret-scanning pattern is mature; the agentic specialisation is seam placement, moving the scanner from the repository gate to the agent egress point, where artefacts can be intercepted before they reach any downstream system.

Why it helps: T17 covers supply-chain attacks in which generated infrastructure artefacts introduce weaknesses into downstream systems. A credential embedded in a Terraform module, a Kubernetes manifest, or a container image travels into the build and deployment pipeline and is available to every system that consumes those artefacts. The scanner blocks that class of leakage at the generation point, before the artefact enters the supply chain.
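
The first two inspection stages, known-format regexes and entropy analysis, can be sketched as follows. This is a minimal illustration: the function names, the two example patterns, the 24-character candidate length, and the 4.0-bit entropy threshold are assumptions chosen for the example, and the third stage (validator calls that confirm a credential is live) is deliberately left out.

```python
import math
import re
from collections import Counter

# Example patterns only (assumption): a real scanner ships many more.
TOKEN_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

# Long runs of token-like characters are candidates for the entropy check.
CANDIDATE = re.compile(r"[A-Za-z0-9+/=_\-]{24,}")


def shannon_entropy(s):
    """Shannon entropy of a string, in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def scan_artifact(text, entropy_threshold=4.0):
    """Return (label, matched_text) findings for one agent output."""
    findings = []
    for label, pattern in TOKEN_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group(0)))
    already_reported = {value for _, value in findings}
    for match in CANDIDATE.finditer(text):
        token = match.group(0)
        if token in already_reported:
            continue
        if shannon_entropy(token) >= entropy_threshold:
            findings.append(("high_entropy_string", token))
    return findings
```

Placed at the agent egress point, a non-empty result from `scan_artifact` would block or quarantine the artefact before it is written to a repository, pipeline, or log. Entropy thresholds produce false positives on hashes and encoded blobs, so findings from that stage are best treated as candidates for the validator step rather than confirmed secrets.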

Catalogue extensions: Helmwart T18 to T49

This normalized catalogue includes 3 multi-agent entries based on the OWASP MAS Threat Modelling Guide v1.0 that extend T17. The source guide reuses some numbers between worked systems; these Helmwart entries provide stable detail pages, MAESTRO layers, and mitigation coverage.

T29 Plugin Vulnerability Leading to Agent Compromise
A compromised or weakly-secured plugin takes control of an agent, including its cryptographic keys and downstream capabilities.
T37 Cross-Chain Bridge Attack (Indirect)
Attacker exploits a cross-chain bridge vulnerability to steal assets or disrupt coordination between agents operating on different blockchains.
T47 Rogue MCP Server in Ecosystem
Attacker publishes a malicious MCP server masquerading as a legitimate one; agents connecting to it receive manipulated data or have credentials stolen.

Red-team pivot: MITRE ATLAS techniques

MITRE ATLAS catalogues adversary techniques against AI systems. Where this OWASP threat has an attacker-perspective counterpart, the ATLAS technique is shown below. That is what a red team would actually be doing on the wire. Use this for detection-signal anchoring, threat-hunting hypotheses, and IR runbooks. Source: mitre-atlas/atlas-data v5.6.0.

AML.T0010 AI Supply Chain Compromise

Adversary tampers with components in the AI supply chain (datasets, model weights, libraries, container images) to compromise downstream systems before deployment.

Agentic angle: Agentic systems compose models, MCP servers, agent registries, and prompt templates at runtime. Every dependency is a potential vehicle for compromise.

AML.T0019 Publish Poisoned Datasets

Adversary publishes a manipulated dataset to a public hub (HuggingFace, Kaggle, GitHub) so that downstream training pipelines incorporate the poisoned data.

AML.T0058 Publish Poisoned Models

Adversary publishes a model (to HuggingFace, an internal registry, or an MCP server) that contains a backdoor or biased behaviour activated at runtime.

AML.T0109 AI Supply Chain Rug Pull

Adversary publishes legitimate AI components to gain adoption, then replaces them with a malicious variant, exploiting the trust established before the switch.

Agentic angle: Trusted MCP servers or model registries used by agents are high-value rug-pull targets because agents fetch and execute without further human review.

Sources

OWASP-Agentic-AI · 1.1 (Dec 2025) · Agentic Threats Taxonomy Navigator §Step 3; Threat Model T17
MAESTRO · 1.0 (Apr 2025) · Layer 1 Foundation Model; Layer 2 Data Operations