Agent-as-principal Identity · Principles

Why it matters for agentic AI

Every security property in agentic systems ultimately relies on the ability to answer one question: who did this? Not “which application,” but which specific agent instance, acting under which delegated authority, on behalf of which human principal, at what point in time. Without a cryptographically verifiable, non-human identity for each agent instance, the answer is always “we think it was the shared key,” which means incident attribution fails, per-agent access control is impossible, and a compromised sub-agent is indistinguishable from a legitimate one. Agent identity is not a nice-to-have; it is the substrate on which every other access and trust control in this handbook depends.

The dominant failure pattern in deployed agent systems is treating agents the way developers treat microservices in early-stage systems: give them a shared API key, rotate it occasionally, move on. This works until it doesn’t, at which point the blast radius of a key compromise is the entire application, attribution is a guessing exercise, and least-privilege enforcement is structurally impossible because there is only one identity to scope. Survey evidence suggests the vast majority of agent deployments still operate this way. The structural fix exists and is deployable today: unique short-lived SVIDs per instance via SPIFFE/SPIRE, with OAuth token exchange (RFC 8693) to carry both the human principal (sub) and the acting agent (act) through the delegation chain. The gap is adoption, not technology.

Delegation is the second major failure surface. When a human authorises an agent to act on their behalf, that delegation must be explicit, scoped, and time-bounded, and it must carry through every hop in the chain without being silently widened or lost. A sub-agent that inherits the orchestrator’s token is implicitly claiming the orchestrator’s full authority; a sub-agent that re-asserts “I am acting for the orchestrator” with no verifiable proof is doing the same thing by self-assertion. RFC 8693 token exchange makes the delegation chain visible in the token itself: each exchange yields a newly issued, signed token in which the authorisation server can narrow the scope and nest the previous actor inside the act claim, so a downstream service can see the full lineage and verify that each hop was authorised to delegate what it delegated. Without this, credentials launder through the delegation chain, with authority arriving at the tool and no traceable human behind it.
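The claim structure can be illustrated with a minimal Python sketch. It models only the claims, not the signing or the authorisation server: in a real deployment the server performing the RFC 8693 exchange would apply the narrowing rule and sign the result. The helper names delegate and actor_chain, the scope strings, and the SPIFFE IDs are hypothetical; the nesting of act (current actor outermost, prior actors inside) follows RFC 8693.

```python
from typing import Any


def delegate(claims: dict[str, Any], actor: str, scope: set[str]) -> dict[str, Any]:
    """Return the claims for the next hop: same human `sub`, narrower scope,
    and a new outermost `act` that wraps any prior actors."""
    parent_scope = set(claims["scope"].split())
    if not scope <= parent_scope:
        # A hop may only hand on authority it actually holds.
        raise PermissionError(f"scope widening refused: {sorted(scope - parent_scope)}")
    new_act: dict[str, Any] = {"sub": actor}
    if "act" in claims:
        new_act["act"] = claims["act"]  # prior actors nest inside the current one
    return {**claims, "scope": " ".join(sorted(scope)), "act": new_act}


def actor_chain(claims: dict[str, Any]) -> list[str]:
    """List actors from the current one back to the first delegate."""
    chain = []
    act = claims.get("act")
    while act:
        chain.append(act["sub"])
        act = act.get("act")
    return chain


user = {"sub": "alice@example.com", "scope": "calendar.read calendar.write mail.read"}
orch = delegate(user, "spiffe://example.org/agent/orchestrator",
                {"calendar.read", "calendar.write"})
worker = delegate(orch, "spiffe://example.org/agent/scheduler", {"calendar.read"})
```

Here worker still names alice@example.com as sub, its scope has shrunk to calendar.read, and actor_chain(worker) lists the scheduler first and the orchestrator second. An attempt by the scheduler to delegate mail.read raises PermissionError, because that scope was dropped one hop earlier.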

Scenario: the shared key and the invisible breach

An orchestrator and six sub-agents share one API key. A prompt injection targets one sub-agent and causes it to make a series of API calls that exfiltrate data over two hours. Responders can see from the service logs that the key was used and data was retrieved, but the six sub-agents are indistinguishable in the log, because all calls look identical. They cannot determine which agent was compromised, which sessions were affected, or whether the orchestrator itself was involved. Per-instance SVIDs would have made every call attributable to a specific agent instance, collapsing the attribution problem from “any of seven entities” to “this specific process, in this session.” SVID revocation would have halted the exfiltration within the certificate TTL.

Scenario: the impersonating sub-agent

An orchestrator spawns sub-agents and communicates with them over an internal message bus. One sub-agent is compromised; it begins sending messages to a high-privilege peer claiming to be the orchestrator, requesting it to perform actions outside its normal task scope. The peer, which checks conversational framing but not cryptographic identity, follows the instructions. Mutual authentication on the message bus, requiring every agent-to-agent message to carry a valid, unexpired SVID for the sender, makes impersonation immediately detectable: the compromised sub-agent cannot produce a valid orchestrator SVID, so its messages are rejected before any action is taken.
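The check the peer was missing can be sketched in a few lines of Python. This is a sketch under stated assumptions, not a definitive implementation: it assumes the transport has already completed mutual TLS and validated the certificate chain and expiry, and that the peer certificate is available as a dict shaped like the result of ssl.SSLSocket.getpeercert(). The function names, the message layout, and the SPIFFE IDs are hypothetical.

```python
def spiffe_id_from_peercert(cert: dict) -> str:
    """Extract the SPIFFE ID from the URI SAN of an already-verified peer cert.

    An X.509 SVID carries exactly one URI SAN, so anything else is rejected.
    """
    uris = [value for kind, value in cert.get("subjectAltName", ()) if kind == "URI"]
    if len(uris) != 1 or not uris[0].startswith("spiffe://"):
        raise ValueError("peer certificate does not carry exactly one SPIFFE ID")
    return uris[0]


def accept_message(peer_cert: dict, message: dict) -> bool:
    """Accept a message only if its claimed sender matches the identity
    proven at the TLS layer; conversational framing is ignored."""
    try:
        verified_id = spiffe_id_from_peercert(peer_cert)
    except ValueError:
        return False
    return message.get("sender") == verified_id


# The compromised sub-agent holds only its own SVID...
worker_cert = {"subjectAltName": (("URI", "spiffe://example.org/agent/worker-3"),)}
# ...but claims to be the orchestrator in the message body.
forged = {"sender": "spiffe://example.org/agent/orchestrator", "body": "export all records"}
honest = {"sender": "spiffe://example.org/agent/worker-3", "body": "task complete"}
```

With these inputs accept_message(worker_cert, forged) is False and accept_message(worker_cert, honest) is True: the sub-agent can still speak as itself, but it cannot borrow the orchestrator’s identity.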

How it fails

Agents share an API key or OAuth token; a compromise of one agent exposes all agents using that credential, and attribution requires guessing.
The user’s own OAuth token is passed to the agent so it can act on the user’s behalf; service logs show the human as the actor, making agent actions invisible to audit.
Sub-agents inherit the orchestrator’s token without attenuation; a deep delegation chain produces sub-agents with the full authority of the top-level orchestrator.
Agent identity is self-asserted in a message header rather than cryptographically verified; a rogue or compromised agent impersonates a trusted peer with no technical barrier.
Authorisation is checked at session start against the agent’s claimed identity and then cached; a mid-session compromise is not detected until a human notices something wrong.

Why the mapped controls work

Unique short-TTL SVIDs per instance mean each agent has a verifiable, time-bounded identity that expires too quickly to be useful after revocation. A compromised identity’s useful window is measured in minutes, not days. RFC 8693 token exchange with both sub and act claims makes the delegation chain explicit in every service call, so a downstream policy engine can enforce rules like “this action requires a human-originated delegation chain, not a purely agent-to-agent one.” Minimal-scope per-task tokens collapse the blast radius of a compromised identity from “everything this agent can do” to “everything this specific task token permits,” which is a much smaller set. Signed inter-agent messages close the impersonation path: a peer cannot be convinced to act by a forged message because the signature check fails before the message is processed. Continuous authorisation rechecks at every tool call rather than once at session start, ensuring that a mid-session compromise triggers a policy failure on the next action rather than running freely until session end.
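Continuous authorisation can be sketched as a gateway that re-evaluates the token on every tool call instead of caching a session-start decision. The ToolGateway class, its revoked_ids set, and the convention that workload subjects start with spiffe:// while human subjects do not are all assumptions made for illustration; signature verification is assumed to have happened before the claims reach this code.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolGateway:
    # In practice this would be fed by a revocation or incident-response feed.
    revoked_ids: set = field(default_factory=set)

    def authorize(self, claims: dict, required_scope: str,
                  now: Optional[float] = None) -> tuple[bool, str]:
        """Run on EVERY tool call; nothing is remembered between calls."""
        now = time.time() if now is None else now
        act = claims.get("act") or {}
        if claims.get("exp", 0) <= now:
            return False, "token expired"
        if not claims.get("sub") or not act.get("sub"):
            return False, "missing sub or act claim"
        if act["sub"] in self.revoked_ids:
            return False, "agent identity revoked"
        if claims["sub"].startswith("spiffe://"):
            # Assumed convention: a workload as `sub` means no human originated the chain.
            return False, "delegation chain is not human-originated"
        if required_scope not in claims.get("scope", "").split():
            return False, "scope not granted"
        return True, "ok"


gateway = ToolGateway()
claims = {
    "sub": "alice@example.com",
    "act": {"sub": "spiffe://example.org/agent/scheduler"},
    "scope": "calendar.read",
    "exp": 2000.0,
}
first = gateway.authorize(claims, "calendar.read", now=1000.0)
gateway.revoked_ids.add("spiffe://example.org/agent/scheduler")  # mid-session compromise
second = gateway.authorize(claims, "calendar.read", now=1000.0)
```

The same token that passed on the first call is refused on the second, because the decision is recomputed after the identity is revoked rather than read from a cached session.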

First steps

Deploy SPIFFE/SPIRE in your agent runtime so that each agent workload is issued a unique short-lived X.509 SVID (a 5–15 minute TTL is a workable starting point) rather than sharing any API key or static credential.
Configure your inter-agent message bus or HTTP gateway to require mutual TLS with SVID verification, so that agent-to-agent messages are rejected if the sender cannot present a valid, unrevoked certificate for its declared identity.
Audit every place your codebase passes a token or credential from one agent to another and replace those hand-offs with RFC 8693 token exchange, verifying that the resulting token carries both sub (human principal) and act (agent) claims before any tool gateway accepts it.
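For the third step, a token exchange request and the gateway-side claim check might look like the sketch below. The grant type and token type URNs and the parameter names are those defined by RFC 8693; the function names, the audience URL, the scope string, and the placeholder token values are hypothetical, and the choice of a JWT as the actor token is an assumption. The sketch only builds the form body; sending it to your authorisation server’s token endpoint, and verifying the signature on the returned token, are left out.

```python
from urllib.parse import urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"


def build_exchange_request(subject_token: str, actor_token: str,
                           audience: str, scope: str) -> str:
    """Form-encode an RFC 8693 request: the subject token represents the human,
    the actor token represents the agent asking to act for them."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": actor_token,
        "actor_token_type": JWT_TOKEN_TYPE,
        "audience": audience,
        "scope": scope,  # request only what this task needs
    })


def has_delegation_claims(claims: dict) -> bool:
    """Gateway-side check: refuse tokens lacking a human `sub` or an agent `act.sub`."""
    act = claims.get("act")
    return bool(claims.get("sub")) and isinstance(act, dict) and bool(act.get("sub"))


body = build_exchange_request(
    subject_token="<user-access-token>",   # placeholder, not a real token
    actor_token="<agent-jwt-svid>",        # placeholder, not a real token
    audience="https://tools.example.org",
    scope="calendar.read",
)
```

A token that passes has_delegation_claims names both parties, for example {"sub": "alice@example.com", "act": {"sub": "spiffe://example.org/agent/scheduler"}}; a user token passed straight through to the agent has no act claim and is rejected.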

Threats it governs

When this principle is absent, these threats become reachable.

T9 Identity Spoofing and Impersonation: Auth mechanisms exploited to impersonate agents, users, or services; misuse of persistent agent identities.
T22 Service Account Exposure: Agent service-account credentials accidentally exposed via commits or insecure stores, creating an infrastructure vulnerability.
T34 Wallet Key Compromise: Attacker steals private keys from an agent wallet, gaining full transaction-signing authority.
T40 MCP Client Impersonation: Attacker presents a forged MCP client identity to access an MCP server’s tools and data.

Controls that advance it

Catalogue mitigations that strengthen this principle, grouped by the defence-in-depth stage they sit in.

Prevent

SPIFFE: In most deployments, agents authenticate to one another with long-lived bearer tokens or shared secrets. If any one of those credentials is stolen, the attacker has persistent, platform-wide access until someone manually rotates it. SPIFFE replaces that model: each workload is issued a short-lived, cryptographically verifiable identity document, and every connection requires both sides to present one. No long-lived secrets traverse the network, and a compromised credential is worthless once its TTL expires.
NHI lifecycle: A Non-Human Identity (NHI) is the service account, machine principal, or formal agent identity under which an agentic system authenticates and acts. When an NHI is provisioned with broad scope, never rotated, and has no named owner, a stolen or leaked credential gives an attacker persistent access for as long as that credential remains valid. NHI lifecycle management treats each agent identity as a first-class governance object: provision narrowly with a declared scope and owner, rotate on a short schedule using platform-native short-lived credentials, audit every authentication and rotation event, periodically re-attest that the identity is still needed, and decommission by deletion when the agent is retired.
Token TTL: An agent identity backed by a long-lived bearer token grants access for as long as that token remains valid. If the token is stolen, logged, or extracted from a running process, the attacker holds working credentials for weeks or months without any further action. Short-lived tokens address this by giving credentials a time-to-live measured in minutes or hours, with issuance and renewal automated by the platform rather than performed by a human. When a token expires, access ends: the attacker must win the renewal process as well, which requires compromising a harder target than the token itself.
Message signing: An inter-agent message travels through channels and intermediate agents the receiver did not originate. If nothing binds the message cryptographically to its source, any intermediate hop can substitute or inject content that the receiving agent will treat as authoritative. Message signing closes that gap: the source agent signs each message payload with its private key, and the receiver verifies the signature against a distributed trust bundle before the content reaches the reasoning layer.
Admission control: In a multi-agent system, peer agents are granted authority by the other agents that accept their outputs. A rogue or compromised agent that enters the system inherits that authority immediately. Agent admission control is the registration gate that evaluates a peer’s identity, declared capabilities, and binary provenance against policy before granting access. A peer that cannot pass attestation is refused entry and cannot participate in the system.
Agent MFA: An agent identity that holds broad write authority is a high-value target: compromising its credential gives an attacker persistent, authenticated access to every system that identity can reach. Multi-factor authentication addresses this by requiring a second factor at credential issuance time, so a stolen token is bounded to its issued lifetime and cannot be silently renewed. For non-human identities the second factor is workload attestation, hardware-bound key material, or certificate-backed proof rather than a phone or one-time code.
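As one illustration of these controls, the admission control gate can be sketched as a policy check over the three inputs it names: identity, declared capabilities, and binary provenance. Everything here is a simplifying assumption: the AdmissionPolicy fields, the admit function, the capability names, and the use of a SHA-256 allowlist as a stand-in for provenance attestation. The SPIFFE ID passed in is assumed to have been verified cryptographically before this check runs.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class AdmissionPolicy:
    trust_domain: str                # e.g. "example.org"
    allowed_capabilities: frozenset  # capabilities a peer may declare
    allowed_digests: frozenset       # SHA-256 hex digests of approved binaries


def admit(policy: AdmissionPolicy, spiffe_id: str,
          capabilities: set, binary: bytes) -> tuple[bool, str]:
    """Evaluate a peer against policy before it may join the system."""
    if not spiffe_id.startswith(f"spiffe://{policy.trust_domain}/"):
        return False, "identity outside trust domain"
    extra = set(capabilities) - policy.allowed_capabilities
    if extra:
        return False, f"capabilities not permitted: {sorted(extra)}"
    if hashlib.sha256(binary).hexdigest() not in policy.allowed_digests:
        return False, "binary digest not in allowlist"
    return True, "admitted"


approved_binary = b"agent-build-1"  # stand-in for the real agent artefact
policy = AdmissionPolicy(
    trust_domain="example.org",
    allowed_capabilities=frozenset({"calendar.read", "search"}),
    allowed_digests=frozenset({hashlib.sha256(approved_binary).hexdigest()}),
)
ok = admit(policy, "spiffe://example.org/agent/scheduler", {"search"}, approved_binary)
tampered = admit(policy, "spiffe://example.org/agent/scheduler", {"search"}, b"modified")
```

The approved build is admitted, while the same identity presenting a modified binary is refused before it can send or receive a single message.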

Detect

No catalogued control.

Respond

No catalogued control.

In Helmwart

The first Q4 Zero-Trust tenet, “workload identity,” audits exactly this.