Accountability · Principles

Why it matters for agentic AI

In a traditional system, accountability is straightforward: a user authenticates, takes an action, and a log entry names them. Multi-agent pipelines dissolve that clarity. When a harmful outcome emerges from a chain of autonomous agents (for example, an orchestrator delegates to sub-agent A, which calls tool B, which triggers agent C), the post-incident question “who is responsible?” has no clean answer unless accountability was wired into the architecture before the chain ever ran. The absence of a prior answer is not a bureaucratic inconvenience; it is a security gap. Without a named responsible human at every link, there is no one with the authority or the obligation to halt a runaway pipeline, no target for regulatory scrutiny, and no incentive for any component owner to tighten their controls.

The operative concept here is Human-Anchored, Intent-Bound Delegation (HAID): every delegation hop must be traceable to a named human officer, and each hop must narrow, not transfer, the scope of authority. The orchestrator does not hand an open cheque to the sub-agent; it signs a scope-attenuating caveat that describes exactly what the sub-agent is authorised to do on the human’s behalf. The result is an authority-lineage register: an auditable chain in which every action in the pipeline can be walked back to a human who accepted responsibility for it when they approved the delegation. This is the structural answer to accountability laundering: the pattern by which sub-agent chains diffuse responsibility until no individual party feels they own the outcome.
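The narrowing rule and the walk back to a named human can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: the Delegation structure, the anchor, delegate, and lineage helpers, and the officer and agent names are all hypothetical, and scopes are modelled as plain sets of action names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Delegation:
    """One hop in an authority-lineage register (hypothetical structure)."""
    holder: str                      # agent receiving authority at this hop
    scope: frozenset                 # actions the holder may perform
    parent: Optional["Delegation"]   # None only for the human-anchored root
    officer: Optional[str] = None    # named human officer; set on the root


def anchor(officer: str, holder: str, scope: set) -> Delegation:
    """Root hop: a named human officer approves the top-level task."""
    return Delegation(holder, frozenset(scope), parent=None, officer=officer)


def delegate(parent: Delegation, holder: str, scope: set) -> Delegation:
    """Create a child hop, refusing any scope the parent does not hold."""
    requested = frozenset(scope)
    if not requested <= parent.scope:
        extra = sorted(requested - parent.scope)
        raise PermissionError(f"{holder} requested scope beyond parent: {extra}")
    return Delegation(holder, requested, parent=parent)


def lineage(hop: Delegation) -> list:
    """Walk a hop back to the officer who anchored the chain."""
    chain = []
    node = hop
    while node is not None:
        chain.append(node.holder)
        if node.parent is None:
            chain.append(f"officer:{node.officer}")
        node = node.parent
    return chain


# Illustrative chain: each hop holds a subset of the scope above it.
root = anchor("j.doe", "orchestrator", {"read_ticket", "issue_refund", "email_customer"})
refunds = delegate(root, "refund-agent", {"read_ticket", "issue_refund"})
payments = delegate(refunds, "payment-agent", {"issue_refund"})
print(lineage(payments))
```

Because delegate only accepts a subset of the parent scope, authority can shrink at each hop but never grow, and lineage always terminates at the officer recorded on the root.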

Non-repudiation is the technical mechanism that makes the register believable. Each delegation hop must produce a signed artefact (a scoped token, a capability caveat, a signed task authorisation) that cannot later be repudiated. Without cryptographic signing, an agent or its operator can always claim that the action fell within authority it never actually held, or that the scope it was granted was wider than the register shows. Per-action attribution to a human officer, enforced not by convention but by the system’s inability to issue a credential without a named signatory, is what converts the register from a paper trail into a security control.

Scenario: the unowned pipeline failure

A financial services firm deploys a five-agent pipeline: a planning agent, three specialist sub-agents, and a final execution agent. The execution agent sends a large unauthorised transfer. The post-incident review finds that each agent was configured by a different team, each team considered itself a “platform provider” rather than an accountable decision-maker, and no delegation register exists. The transfer cannot be attributed to any single human authorisation because no such authorisation was ever explicitly captured. Under HAID, the human officer who approved the top-level task would have signed a scope-narrowing delegation chain; the execution agent’s credential would have been issued only for the specific funds and payee the officer authorised, so the unauthorised transfer could not have been issued under that credential, whatever went wrong upstream.

Scenario: the laundered approval

A customer-service agent is granted broad authority to “resolve complaints.” It delegates to a refund sub-agent, which delegates to a payment sub-agent. The payment sub-agent issues a refund far outside any policy threshold. Each delegation was individually “approved,” but only in the weak sense that no one explicitly refused it. Because there is no scope-narrowing at each hop and no authority-lineage register, the refund cannot be connected to any human who consciously authorised it. The accountability chain was laundered across three hops until it vanished. A signed caveat at each delegation, for example “refunds for this customer, up to £X, in resolution of ticket #Y,” would have made the out-of-bounds action technically non-issuable.
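A caveat of that shape can be checked mechanically before any refund is issued. The sketch below assumes hypothetical RefundCaveat and RefundRequest structures and made-up identifiers and amounts; it shows only the policy check, not how the caveat would be signed or transported.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class RefundCaveat:
    """Scope granted for one complaint (hypothetical structure)."""
    customer_id: str
    ticket_id: str
    max_amount: Decimal
    approved_by: str   # named human officer who accepted responsibility


@dataclass(frozen=True)
class RefundRequest:
    customer_id: str
    ticket_id: str
    amount: Decimal


def violations(caveat: RefundCaveat, request: RefundRequest) -> list:
    """Return every way the request falls outside the caveat."""
    problems = []
    if request.customer_id != caveat.customer_id:
        problems.append("customer not covered by caveat")
    if request.ticket_id != caveat.ticket_id:
        problems.append("ticket not covered by caveat")
    if request.amount <= 0 or request.amount > caveat.max_amount:
        problems.append("amount outside authorised limit")
    return problems


def issue_refund(caveat: RefundCaveat, request: RefundRequest) -> str:
    """Refuse to issue anything the caveat does not cover."""
    problems = violations(caveat, request)
    if problems:
        raise PermissionError("; ".join(problems))
    return f"refund of {request.amount} on {request.ticket_id} attributed to {caveat.approved_by}"


# Illustrative values only.
caveat = RefundCaveat("cust-1", "T-100", Decimal("50.00"), "j.doe")
in_bounds = RefundRequest("cust-1", "T-100", Decimal("20.00"))
out_of_bounds = RefundRequest("cust-1", "T-100", Decimal("5000.00"))
print(issue_refund(caveat, in_bounds))
print(violations(caveat, out_of_bounds))
```

The out-of-bounds request never reaches the payment step, and every refund that does go through carries the name of the officer who approved the caveat.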

How it fails

No delegation register exists, so actions cannot be traced to a responsible human after the fact.
Authority is transferred rather than narrowed at each hop, so sub-agents accumulate scope their originating human never consciously granted.
Delegations are not signed, so any party in the chain can later disclaim their role or misrepresent the scope they were given.
Accountability is siloed by team (“we own the platform, not the decision”), creating a responsibility vacuum no individual feels obliged to fill.
Incident response has no designated owner, so the chain is never halted promptly when something goes wrong.

Why the mapped controls work

An authority-lineage register makes the full delegation chain legible before an incident, not only after. It forces the question “under whose authority does this agent act?” to be answered at design time, which surfaces accountability gaps that would otherwise remain invisible until they become liabilities. Signed, scope-attenuating delegation tokens turn each hop into a verifiable narrowing of authority: the sub-agent literally cannot acquire privileges the orchestrator did not explicitly grant, and the signing means neither party can later misrepresent the scope. Non-repudiable per-action attribution ties every consequential action to the human officer who authorised the task, closing the laundering path and giving regulators, auditors, and internal review a clear trail from outcome to decision-maker.
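One way to make attenuation verifiable is to chain a keyed hash over each added caveat, loosely in the style of macaroon tokens. The sketch below is an assumption-laden illustration, not the catalogue's prescribed mechanism: the function names, token layout, and key are hypothetical, and because HMAC is symmetric it shows tamper-evident narrowing only. Non-repudiation towards a third party would need asymmetric signatures, which are omitted here.

```python
import hashlib
import hmac


def _mac(key: bytes, message: str) -> bytes:
    """HMAC-SHA256 of a text message under the given key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def mint(root_key: bytes, identifier: str) -> dict:
    """Issuer creates the initial token, bound to the approved task."""
    return {"id": identifier, "caveats": [], "sig": _mac(root_key, identifier)}


def attenuate(token: dict, caveat: str) -> dict:
    """Any holder may add a caveat; the new signature chains from the old one."""
    return {
        "id": token["id"],
        "caveats": token["caveats"] + [caveat],
        "sig": _mac(token["sig"], caveat),
    }


def verify(root_key: bytes, token: dict) -> bool:
    """Recompute the chain; a dropped or edited caveat breaks the match."""
    sig = _mac(root_key, token["id"])
    for caveat in token["caveats"]:
        sig = _mac(sig, caveat)
    return hmac.compare_digest(sig, token["sig"])


# Illustrative key and caveats only; never hard-code a real key.
root_key = b"demo-root-key-not-for-production"
token = mint(root_key, "task-123 approved-by=j.doe")
narrowed = attenuate(attenuate(token, "action=issue_refund"), "max_amount=50")

# A holder who strips the last caveat cannot produce a matching signature.
forged = dict(narrowed, caveats=narrowed["caveats"][:-1])
print(verify(root_key, narrowed), verify(root_key, forged))
```

Here verify confirms only that the caveat list is intact; the service enforcing the token still has to evaluate each caveat against the requested action.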

First steps

Map every agent-to-agent delegation hop in your system today and name the human officer responsible for each; any hop with no named owner is an unacceptable accountability gap that must be resolved before the pipeline goes to production.
Replace any shared API key or bearer token passed between agents with RFC 8693 token exchange so that every inter-agent call carries both the originating human (sub) and the acting agent (act) in a verifiable, signed claim.
Enable append-only structured logging for every tool call (for example, OpenTelemetry traces exported to a write-once store such as Amazon S3 with Object Lock or Azure immutable Blob Storage) and confirm in a tabletop drill that an auditor can reconstruct the full delegation chain from the logs alone within thirty minutes of an incident.
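For the token-exchange step, the request parameters and the nested act claim are defined by RFC 8693. The sketch below builds the form body for an exchange request and reads the delegation chain out of a set of already-validated claims. The token strings, audience, scope, and subject names are placeholders, treating both input tokens as access tokens is an assumption, and signature validation of the issued token is out of scope here.

```python
from urllib.parse import urlencode

# Identifiers defined by RFC 8693.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def exchange_request_body(subject_token: str, actor_token: str,
                          audience: str, scope: str) -> str:
    """Form-encoded body for a token-exchange request to the token endpoint."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,          # represents the originating human
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": actor_token,              # represents the acting agent
        "actor_token_type": ACCESS_TOKEN_TYPE,
        "audience": audience,
        "scope": scope,
    })


def delegation_chain(claims: dict) -> list:
    """Return [subject, current actor, prior actors...] from validated claims.

    Per RFC 8693, the outermost act claim is the current actor and nested
    act claims are prior actors.
    """
    chain = [claims["sub"]]
    act = claims.get("act")
    while act is not None:
        chain.append(act["sub"])
        act = act.get("act")
    return chain


# Placeholder values for illustration.
body = exchange_request_body("human-token", "agent-token",
                             "https://payments.example", "refund:issue")
claims = {
    "sub": "officer:j.doe",
    "act": {"sub": "agent:payment", "act": {"sub": "agent:orchestrator"}},
}
print(delegation_chain(claims))
```

Logging the output of delegation_chain with each tool call gives the auditor in the tabletop drill both the originating human and every agent that acted on their behalf.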

Threats it governs

When this principle is absent, these threats become reachable.

T8
Repudiation and Untraceability: Agent actions cannot be reliably traced, attributed, or reconstructed.

Controls that advance it

Catalogue mitigations that strengthen this principle, grouped by the defence-in-depth stage they sit in.

Prevent

No catalogued control.

Detect

Split actor: An agent that writes its own audit log can omit, alter, or suppress any record of its own actions. This is not a theoretical risk: an attacker who controls the acting identity controls the evidence. Actor/recorder separation is the structural fix. The identity that performs an action and the identity that records it are different principals, with non-overlapping permissions, so no single compromise can both execute and erase.
Identity monitoring: An AI agent operates under a non-human identity (NHI): a service principal, a task role, or a workload credential. That identity produces a stream of access events that, for a well-scoped agent, forms a narrow and predictable behavioural baseline. Identity monitoring applies User and Entity Behaviour Analytics (UEBA) to that stream, alerting when an observed access pattern deviates statistically from the baseline. Because agent behavioural distributions are tighter than those of human users, a deviation is a higher-confidence signal, and a spoofed or stolen credential used from the wrong workload origin is exactly the anomaly the technique is built to detect.
Cross-system audit: An agent that operates across HR, Finance, cloud, and SaaS systems accumulates permissions at each boundary, often without any single team seeing the combined picture. That drift typically goes unnoticed until a quarterly review finds it, by which point a compromised or misconfigured agent may have had weeks of unchecked reach. Cross-system scope auditing closes that window by continuously reconciling the agent's actual entitlements against a declared baseline across every system it touches and raising a ticket the moment drift is detected.
Insider program: Privileged-access personnel are the human layer behind every agentic system. A person with legitimate administrative credentials can tamper with logs, manipulate approval gates, or extract training data through authorised channels, and no technical control prevents it when the access itself is valid. An insider threat program addresses that gap: it governs who holds operator access, what they agree to, how quickly credentials are revoked on departure, and whether anomalous behaviour is surfaced before damage accumulates.
Provenance tracking: When an agent produces a claim derived from retrieved data, that claim needs a record of where it came from: the source document, version, and retrieval time. Without that record, a downstream verifier cannot distinguish a well-grounded output from a fabricated one, a tampered one, or a poisoned one. Provenance tracking attaches source attribution to every claim, carries it through each transformation in the pipeline, and surfaces it in audit logs and user-facing interfaces.
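Of the detective controls above, the cross-system scope audit reduces to a simple reconciliation: compare what the agent is declared to hold with what each system reports it holds. A minimal sketch follows, assuming entitlements can be exported from each system as sets of strings; the system names and permission strings are invented for illustration.

```python
def entitlement_drift(declared: dict, observed: dict) -> dict:
    """Compare declared and observed entitlements per system.

    Returns only the systems where the two differ, listing permissions the
    agent holds but should not ("unexpected") and permissions it should
    hold but does not ("missing").
    """
    drift = {}
    for system in sorted(set(declared) | set(observed)):
        want = set(declared.get(system, ()))
        have = set(observed.get(system, ()))
        extra, missing = have - want, want - have
        if extra or missing:
            drift[system] = {"unexpected": sorted(extra), "missing": sorted(missing)}
    return drift


# Illustrative baseline and snapshot for one agent identity.
declared = {"finance": {"invoices:read"}, "hr": {"directory:read"}}
observed = {
    "finance": {"invoices:read", "payments:write"},
    "hr": {"directory:read"},
    "crm": {"contacts:export"},
}
print(entitlement_drift(declared, observed))
```

A non-empty result is the trigger for the ticket; running the comparison on every snapshot rather than once a quarter is what shortens the window of unchecked reach.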

Respond

Legal hold: An audit trail is only useful if its records cannot be altered after the fact. Without a storage-layer enforcement mechanism, a sufficiently privileged attacker (or a compromised recorder identity) can overwrite or delete the records that document what happened. Legal hold and WORM retention solve this by placing audit records in storage that the provider itself enforces as immutable: no user, including account root, can modify or delete a locked object within the retention window. Legal hold extends that protection indefinitely for active incidents, lifted only through an out-of-band authority outside the normal operations team.

In Helmwart

Related to the agent-identity and observability principles; not scored directly.