
PLAYBOOK · P5 · OWASP Agentic AI v1.1

Protecting HITL & Preventing Decision Fatigue Exploits

Keep human oversight effective when agent fan-out threatens to swamp it.

Goal: Prevent attackers from overloading human decision-makers, manipulating AI intent, or bypassing security through deceptive AI behaviours.

Aligned with Step 5: Does AI require human engagement to achieve its goals or function effectively? · 2 threats mitigated · 21 mitigations referenced

At a glance

Threats covered: 2 (T10 · T15)
Navigator step: P5 · Step 5: Does AI require human engagement to achieve its goals or function effectively?
Mitigations: 21 distinct Helmwart controls referenced across the three phases

Defence-in-depth chain

When an attacker tries to overwhelm reviewers or exploit decision fatigue, proactive controls (a risk-prioritised review queue and adaptive workload balancing) preserve reviewer attention by routing low-risk decisions away from humans and throttling notification volume. If a manipulative or high-volume request still reaches a reviewer, reactive controls (reviewer decision summaries and plan-vs-goal validation) make the decision legible and check that the plan is consistent with the agent's goal before approval. Detective controls (Sigstore signing) produce cryptographic audit trails of reviewer overrides for post-incident review.

Figure: defence-in-depth chain. Attack arrives (reviewer overwhelm) → Proactive (risk-priority queue, adaptive load balancing, kill switch): throttled, auto-approved or held → Reactive (decision summary, plan-vs-goal check, egress DLP): legible, decision visible → Detective (signed audit log, anomaly quarantine, cross-system audit): recalibrate, override captured → Outcome: loop closed. An attack that passes one phase moves on to the next.

proactive Step 1: Optimize HITL workflows & reduce decision fatigue

  • Score each pending agent action by risk and use that score to rank the HITL review queue so reviewers tackle the highest-impact decisions first.

    Helmwart controls: Risk queue · Trust score
  • Automate routine low-risk approvals and escalate only high-impact decisions to a human reviewer.

    Helmwart controls: Risk queue · Policy bound
  • Cap the volume of AI-generated notifications per reviewer per hour to prevent cognitive overload.

    Helmwart controls: Adaptive load
  • Enforce rate limits on agent-generated notifications, approval requests, and alerts so that decision fatigue cannot be used as an attack vector.

  • Require two-agent consensus before any agent is permitted to modify its own operational goals.

  • Surface a plain-language explanation of each pending decision alongside the agent's recommendation so reviewers can act without re-reading raw logs.

    Helmwart controls: Decision summaries
  • Distribute the review queue across available human reviewers so no single operator is presented with an unmanageable volume.

    Helmwart controls: Adaptive load
  • Strip or sandbox all clickable links and rich content in agent output so social-engineering payloads cannot complete a click-through to the attacker.

    Helmwart controls: Render restriction
  • Provide a documented, tested kill-switch covering single-agent, fleet, and global scopes with a named authority and drill cadence.

    Helmwart controls: Kill switch
  • For irreversible high-stakes changes such as payments or production configuration writes, require confirmation through a channel independent of the agent.

    Helmwart controls: OOB verify
  • For irreversible high-impact actions, require two distinct human reviewers to independently approve and sign off before the agent proceeds.

    Helmwart controls: Dual control
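
The queue-related actions above (risk ranking, auto-approval of routine items, per-reviewer caps, and load distribution) can be sketched together as one triage function. This is a minimal illustration, not Helmwart's implementation: the `PendingAction` type, the `triage` function, and both thresholds are hypothetical, and the risk score is assumed to be computed elsewhere.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical thresholds; tune them for the deployment.
AUTO_APPROVE_BELOW = 0.2          # risk scores under this skip human review
MAX_PER_REVIEWER_PER_HOUR = 20    # cap on requests sent to one reviewer


@dataclass(frozen=True)
class PendingAction:
    action_id: str
    risk: float                   # 0.0 (benign) to 1.0 (critical), assumed precomputed
    irreversible: bool = False


def triage(actions, reviewers, load=None):
    """Split actions into auto-approved, assigned-to-reviewer, and held.

    `load` maps reviewer -> requests already sent to them this hour.
    """
    load = defaultdict(int, load or {})
    auto, assigned, held = [], defaultdict(list), []
    # Highest risk first, so reviewer capacity goes to the decisions that matter.
    for action in sorted(actions, key=lambda a: a.risk, reverse=True):
        # Irreversible actions never skip human review, however low the score.
        if action.risk < AUTO_APPROVE_BELOW and not action.irreversible:
            auto.append(action.action_id)
            continue
        # Pick the least-loaded reviewer who still has capacity this hour.
        candidates = [r for r in reviewers if load[r] < MAX_PER_REVIEWER_PER_HOUR]
        if not candidates:
            held.append(action.action_id)   # throttle rather than flood reviewers
            continue
        reviewer = min(candidates, key=lambda r: load[r])
        assigned[reviewer].append(action.action_id)
        load[reviewer] += 1
    return auto, dict(assigned), held
```

Holding excess requests instead of delivering them is the design choice that matters here: a flood of agent-generated approvals degrades into a backlog that can be inspected, rather than into rushed approvals.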

reactive Step 2: Identify AI-induced human manipulation

  • Validate each pending agent plan against its declared goal before approving execution to detect and block unintended behavioural shifts.

  • Monitor how frequently each agent requests goal changes and alert when the rate suggests active manipulation rather than legitimate adaptation.

    Helmwart controls: Divergence monitor
  • Scan agent outputs and tool-call parameters for PII, secrets, and sensitive IP before egress so phishing payloads cannot reach the end user.

    Helmwart controls: Egress DLP
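
The first two actions above can be sketched as follows. The plan check is deliberately simplified to a tool allow-list comparison, which is only one possible reading of plan-vs-goal validation, and the goal-change monitor uses a sliding time window. All names, the window length, and the alert threshold are assumptions made for illustration.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600     # hypothetical observation window
MAX_GOAL_CHANGES = 3      # hypothetical alert threshold per window


def out_of_scope_tools(plan_tools, allowed_tools):
    """Return the tools a plan uses that its declared goal does not allow."""
    return sorted(set(plan_tools) - set(allowed_tools))


class GoalChangeMonitor:
    """Track how often each agent requests a goal change."""

    def __init__(self, window=WINDOW_SECONDS, limit=MAX_GOAL_CHANGES):
        self.window = window
        self.limit = limit
        self._events = defaultdict(deque)   # agent_id -> request timestamps

    def record(self, agent_id, timestamp):
        """Record a goal-change request; return True if it should raise an alert."""
        events = self._events[agent_id]
        events.append(timestamp)
        # Drop requests that have aged out of the window.
        while events and events[0] <= timestamp - self.window:
            events.popleft()
        return len(events) > self.limit
```

A plan would be blocked when `out_of_scope_tools` returns a non-empty list, and an agent would be escalated when `record` returns `True`.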

detective Step 3: Strengthen AI decision traceability & logging

  • Write every HITL decision and agent recommendation to a cryptographically signed, append-only log to prevent post-hoc tampering.

    Helmwart controls: Sigstore · Split actor
  • Run real-time anomaly detection across the agent decision stream and escalate sessions that deviate from expected patterns.

    Helmwart controls: Anomaly isolation
  • Log every human override of an agent recommendation and surface reviewer-pattern analytics to detect bias or systematic misalignment.

  • Flag decision reversals in high-risk workflows where a previously rejected AI output was later approved under suspicious conditions.

    Helmwart controls: Cross-system audit
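
A tamper-evident decision log and a reversal check can be sketched with the Python standard library alone. This is not Sigstore: it substitutes an HMAC hash chain with a shared secret key for real signing, purely to show the append-only property, and the record fields (`item`, `decision`, `reviewer`) are hypothetical.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64   # placeholder "previous MAC" for the first entry


def _digest(key, prev_mac, record):
    # The MAC covers the record and the previous entry's MAC, chaining entries together.
    payload = json.dumps(record, sort_keys=True).encode() + prev_mac.encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_entry(log, key, record):
    """Append a record whose MAC also covers the previous entry's MAC."""
    prev_mac = log[-1]["mac"] if log else GENESIS
    log.append({"record": record, "prev": prev_mac, "mac": _digest(key, prev_mac, record)})


def verify(log, key):
    """Return True if the chain is intact.

    Detects edited, reordered, or mid-log deleted entries; it does not
    detect truncation of the tail.
    """
    prev_mac = GENESIS
    for entry in log:
        expected = _digest(key, prev_mac, entry["record"])
        if entry["prev"] != prev_mac or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev_mac = entry["mac"]
    return True


def flag_reversals(log):
    """Return item ids that were rejected and later approved."""
    rejected, flagged = set(), []
    for entry in log:
        rec = entry["record"]
        if rec["decision"] == "reject":
            rejected.add(rec["item"])
        elif rec["decision"] == "approve" and rec["item"] in rejected:
            flagged.append(rec["item"])
    return flagged
```

A flagged reversal is only a lead for an auditor, since a rejected output can be legitimately approved after revision; deciding whether the conditions were suspicious still needs context such as who approved it and how quickly.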

Source

OWASP Agentic AI: Threats and Mitigations v1.1 (Dec 2025), §Mitigation Strategies. Action text is taken verbatim or paraphrased from the canonical document; the Helmwart additions are the per-action mappings onto deployable mitigation entries.