MITIGATION · m-out-of-band-verify
Out-of-band verification — independent-channel confirmation for irreversible agent actions
An agent that can propose payments, update banking details, or modify production configuration is, by construction, a manipulation surface. If the only thing standing between a proposed change and its execution is the agent's own UI, a successful prompt injection or RAG poisoning attack needs no further step to turn a malicious proposal into an approved one. Out-of-band verification breaks that dependency by routing a one-use confirmation code through a channel that is structurally separate from the agent's primary interaction channel, so an attacker who controls the agent's context cannot complete the approval without also compromising the user's registered secondary device.
At a glance
TL;DR
- Before any irreversible high-stakes action (payment, beneficiary update, production-config write, account deletion) commits, the agent issues a one-use confirmation code through a structurally independent channel (SMS, push notification, or hardware token) that the user must return before the action executes.
- The agent's primary interaction channel is the manipulation surface for prompt injection and RAG poisoning. An attacker who controls that channel cannot reach the secondary channel; defeating OOB verification requires compromising both simultaneously.
- Directly closes the OWASP Agentic AI v1.1 AI-Powered Invoice Fraud scenario: the attacker substitutes banking details in the agent UI, but the user must independently confirm those same details via the secondary channel before the transfer commits.
- Channel assurance varies: SMS is vulnerable to SIM-swap and SS7 attacks; push notifications and hardware-bound FIDO2/WebAuthn credentials carry higher assurance for the highest-stakes action classes.
How it behaves
What it is
An agent's output is potentially attacker-controlled. Prompt injection, RAG poisoning, and direct model compromise can all cause an agent to propose actions the operator or user did not authorise. If the agent's primary interaction channel is the only path a user sees before approving an irreversible action, any of those attacks is sufficient to manufacture approval.
Out-of-band verification addresses this by introducing a structurally independent confirmation step. For a defined set of high-stakes operations (payments, beneficiary or banking-detail updates, production-configuration writes, account deletions) the agent issues a time-limited, single-use confirmation code through a secondary channel that is registered separately from the agent's primary interface: an SMS message to a verified phone number, a push notification to a registered mobile app, or a hardware-token challenge. The secondary-channel message should carry the proposed action details alongside the code, so the user reviews what will actually execute rather than what the agent's UI displays. The user must return that code through the primary channel before the action executes. The agent's UI cannot complete the approval on its own.
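The issue-and-return flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `OobGate`, `Challenge`, `digest_action`, the 120-second lifetime, and the six-digit code format are all hypothetical names and values chosen for the example, and `send` stands in for whatever secondary-channel provider a deployment uses. The sketch binds each code to a hash of the action details so that a code issued for one payee cannot approve another.

```python
import hashlib
import hmac
import json
import secrets
import time
from dataclasses import dataclass


@dataclass
class Challenge:
    action_digest: str   # SHA-256 of the exact action the code approves
    code: str            # one-use confirmation code sent out of band
    expires_at: float    # Unix timestamp after which the code is void
    used: bool = False


def digest_action(action: dict) -> str:
    """Canonical hash of the action, so the code is bound to these exact details."""
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class OobGate:
    """Hypothetical gate placed in front of irreversible agent actions."""

    def __init__(self, send, ttl_seconds=120, clock=time.time):
        self._send = send        # callable(message): delivers via the secondary channel
        self._ttl = ttl_seconds  # assumed lifetime; tune per action class
        self._clock = clock
        self._pending = {}

    def request(self, action: dict) -> str:
        """Issue a code for `action` and deliver it together with the details."""
        challenge_id = secrets.token_hex(8)
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[challenge_id] = Challenge(
            digest_action(action), code, self._clock() + self._ttl
        )
        # The message carries the details so the user reviews what will execute.
        self._send(f"Confirm {json.dumps(action, sort_keys=True)} with code {code}")
        return challenge_id

    def confirm(self, challenge_id: str, code: str, action: dict) -> bool:
        """True only for a fresh, unused, matching code on the same action."""
        ch = self._pending.get(challenge_id)
        if ch is None or ch.used or self._clock() > ch.expires_at:
            return False
        if not hmac.compare_digest(ch.code.encode(), code.encode()):
            return False
        if not hmac.compare_digest(ch.action_digest, digest_action(action)):
            return False
        ch.used = True  # single use: a replayed code is rejected
        return True
```

The caller passes `confirm` the action that is about to execute, not the one originally proposed, so details swapped after the code was issued fail the digest check. A production gate would also want attempt limiting and durable storage, both omitted here.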
The structural guarantee is channel independence. An attacker who controls the agent's context, via a poisoned tool output, an injected document, or a compromised retrieval corpus, has no path to the secondary channel. Defeating the verification requires compromising both channels simultaneously, which is a materially harder attack than compromising one. Real-world attackers do occasionally achieve this through SIM swap or OAuth-token theft, but at far lower rates than single-channel compromise.
OWASP Agentic AI v1.1 §T15 names this scenario explicitly as AI-Powered Invoice Fraud: the agent substitutes attacker-controlled banking details into an outgoing invoice; the user approves the substituted details in the agent's UI; the funds transfer to the attacker. OOB verification closes that path because the user must confirm the banking details through a channel the attacker has not reached. T9 Identity Spoofing is also reduced because possession of the registered secondary device cannot be acquired through prompt injection alone.
Detection signals
- Delivery failure rate per channel (SMS, email, push). A sustained spike points to a carrier or provider outage that is silently degrading the verification path.
- UI approval followed by OOB rejection on the same action. This pattern is a candidate phishing-via-agent signal: the user recognised the proposed action as fraudulent only when it reached the independent channel.
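Both signals can be derived from an event log of verification attempts. The sketch below assumes a hypothetical `OobEvent` record and event kinds (`delivered`, `delivery_failed`, `ui_approved`, `oob_rejected`); a real deployment would map its own provider webhooks and audit events onto something similar, and would choose its own alert thresholds.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class OobEvent:
    action_id: str
    channel: str  # e.g. "sms", "push", or "primary" for agent-UI events
    kind: str     # "delivered", "delivery_failed", "ui_approved", "oob_rejected"


def delivery_failure_rate(events):
    """Per-channel share of delivery attempts that failed."""
    attempts, failures = Counter(), Counter()
    for e in events:
        if e.kind in ("delivered", "delivery_failed"):
            attempts[e.channel] += 1
        if e.kind == "delivery_failed":
            failures[e.channel] += 1
    return {channel: failures[channel] / attempts[channel] for channel in attempts}


def approved_then_rejected(events):
    """Action IDs approved in the agent UI and later rejected out of band."""
    approved, flagged = set(), []
    for e in events:  # events are assumed to be in chronological order
        if e.kind == "ui_approved":
            approved.add(e.action_id)
        elif e.kind == "oob_rejected" and e.action_id in approved:
            flagged.append(e.action_id)
    return flagged
```

An OOB rejection with no preceding UI approval is not flagged by the second function, since it does not match the approve-then-reject pattern described above.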
Threats it covers
- WHY IT HELPS: Identity Spoofing through indirect prompt injection can cause an agent to act under false authority, but it cannot manufacture possession of the user's registered secondary device. OOB verification requires physical or credential access to that device before the spoofed action can commit.
- WHY IT HELPS: Human Manipulation via AI-powered invoice fraud works by substituting attacker-controlled payment details into an outgoing transaction that the user then approves in the agent's UI. OOB verification requires the user to confirm the same details through a channel the attacker has not compromised, so approval of the substituted details in the primary channel is no longer sufficient to commit the transaction.
- WHY IT HELPS: Model instability leading to erratic high-stakes proposals is bounded by OOB verification: an unstable model that proposes an anomalous financial action must still obtain independent-channel confirmation before that action commits.
- WHY IT HELPS: Model inconsistency across agent instances can produce contradictory proposals; OOB verification surfaces the inconsistency to a human who can reconcile before committing to an irreversible action.
Principle coverage
Defence-in-Depth stage: Prevent — and it advances:
- Human Oversight (HITL / HOTL): OOB verification enforces human oversight at the approval boundary where it matters most. The user receives the proposed action details through a channel the agent does not control, so oversight is structurally independent of whether the agent's primary context has been manipulated.
- Reversibility / Dry-run / Hold periods: OOB verification is a pre-commitment gate placed at the irreversibility boundary. An action that cannot be undone commits only with both a primary-channel proposal and a secondary-channel approval, never with one alone.
- Safety / Harm-limitation: An agent that can propose payments or account-detail changes is a direct path to consequential harm if a single channel is compromised. OOB verification requires the user to confirm through a structurally independent channel, so no single manipulation of the agent's context is sufficient to commit a harmful irreversible action.
- Contestability / Redress: OOB verification gives the user a second, independent opportunity to contest a proposed action before it commits. The confirmation request arrives through a channel outside the agent's influence, and declining or ignoring it is sufficient to stop the action without any further appeal.
Design & governance principles (open design, economy of mechanism, accountability, …) are architectural, not advanced by a single placed control.
Implementation options
Five implementation options spanning the normative standard, SaaS OOB APIs, payment-domain OOB (3DS), and hardware-bound passkeys. Most deployments compose two: a SaaS OOB provider for general approval flows and a hardware-token option for the highest-stakes action classes where SMS channel trust is insufficient.
NIST SP 800-63B OOB Authenticator: Federal-grade specification of OOB device requirements. The device is uniquely addressable, the secondary channel is encrypted, and the device authenticates using approved cryptography or a SIM card or equivalent.
Why choose it: Use as the normative baseline against which any OOB provider is evaluated. Prohibits VoIP and email as OOB channels. No SDK; you wire providers to the spec.
Twilio Verify API: Verify v2 API supporting SMS, voice, WhatsApp, push (Silent Device Approval), TOTP, and passkeys. Per-verification pricing, REST and webhook integration.
Why choose it: Best for teams without an existing identity-provider OOB offering. Per-message cost is negligible for sparse high-stakes actions. The developer controls which channel each verification attempt uses, allowing SMS as fallback with push as primary.
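The push-primary, SMS-fallback arrangement mentioned above can be sketched independently of any provider SDK. `send_with_fallback` and the `senders` mapping are hypothetical: each callable stands in for a wrapper around the provider's client, so the sketch runs without credentials and makes no claim about any specific API.

```python
def send_with_fallback(recipient, senders, order=("push", "sms")):
    """Try each channel in order; return the name of the channel that accepted.

    `senders` maps a channel name to a callable(recipient) -> bool. These are
    hypothetical stand-ins for real provider calls.
    """
    for channel in order:
        sender = senders.get(channel)
        if sender is None:
            continue  # channel not configured for this deployment
        try:
            if sender(recipient):
                return channel
        except Exception:
            # Treat a provider error as a failed attempt and try the next channel.
            continue
    return None  # nothing delivered: the caller should fail closed
```

A `None` result means no confirmation request reached the user, so the gated action should stay blocked rather than proceed unverified.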
Auth0 Guardian push approvals: Guardian MFA for iOS/Android, with push via FCM/APNs/AWS SNS and a 30-second TOTP fallback. Triggered programmatically from Auth0 Actions.
Why choose it: Best for teams already on Auth0. Push-first with OTP fallback; no SMS channel in Guardian itself. Approval flow is programmatic: api.multifactor.enable in an Auth0 Action gates the HITL approval on push confirmation.
Stripe 3D Secure: Issuing-bank-controlled OOB for card payments, in which the cardholder confirms via OTP to mobile, biometric, or bank password. Required under PSD2/SCA.
Why choose it: The correct choice for agent-initiated card payments in EEA, UK, India, and Australia. Applicability is narrow: it covers card-payment actions only, not other high-stakes actions. The Payment Intents API handles the redirect flow.
FIDO2 / WebAuthn cross-device: CTAP 2.2 Bluetooth-proximity cross-device authentication, in which the user's phone confirms an action initiated on a laptop via QR code and Bluetooth proximity check. Hardware-bound credentials; passkeys supported across all major OSes.
Why choose it: Highest assurance; correct when the threat model includes SIM-swap or SS7 interception, because hardware-bound credentials do not depend on the phone number or the carrier network. Highest integration effort: registration flow, device management, and fallback path are all required.
Trade-offs
- Latency is high on most channels: SMS OOB adds 15 to 45 seconds of wall-clock wait and email can add minutes, while push delivery is sub-second (the user's own response time still applies). Acceptable for irreversible high-stakes actions; unacceptable for routine ones.
- Cost is low: per-message fees for SMS and push; per-active-user pricing for identity-provider OOB.
- UX friction is medium: users tolerate OOB for actions they recognise as high-stakes; applying it uniformly produces channel fatigue and reflexive approval without reading the confirmation content.
- Dev effort is medium: the provider integration is straightforward; the per-action-class routing policy that decides which actions trigger OOB is sustained design work.
When NOT to use
- Do not apply OOB to routine, low-value, or easily reversible actions. Channel fatigue causes users to approve reflexively without reading the OOB message, which defeats the control entirely.
- Do not deploy without a reliable secondary-device registration flow. Users without a registered device will be locked out rather than protected.
- Do not use SMS OOB as the primary control when the threat model includes SIM-swap or SS7 interception. The secondary channel is itself the attack surface in that scenario; use FIDO2/WebAuthn instead.
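The three exclusions above amount to a routing policy, which can be sketched as a small decision function. Everything here is an assumption made for illustration: the `Route` and `ActionContext` names, the `HIGH_STAKES` action-class labels, the preference order, and the choice to drop SMS entirely (rather than demote it) when SIM-swap is in the threat model. A real policy would be tuned per deployment.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    NO_OOB = "no_oob"  # routine or reversible: skip OOB to avoid channel fatigue
    SMS = "sms"
    PUSH = "push"
    FIDO2 = "fido2"
    BLOCK = "block"    # no acceptable registered channel: hold the action


@dataclass(frozen=True)
class ActionContext:
    action_class: str               # hypothetical labels, see HIGH_STAKES
    reversible: bool
    registered_channels: frozenset  # channels the user has enrolled
    sim_swap_in_threat_model: bool


HIGH_STAKES = {"payment", "beneficiary_update", "prod_config_write", "account_deletion"}


def route(ctx: ActionContext) -> Route:
    """Pick the verification route for one proposed action."""
    if ctx.action_class not in HIGH_STAKES or ctx.reversible:
        return Route.NO_OOB
    # Assumed preference order: highest-assurance channel first.
    preference = [Route.FIDO2, Route.PUSH]
    if not ctx.sim_swap_in_threat_model:
        preference.append(Route.SMS)
    for candidate in preference:
        if candidate.value in ctx.registered_channels:
            return candidate
    return Route.BLOCK
```

`Route.BLOCK` makes the lockout case explicit: a user with no acceptable registered device cannot complete the action, which is why a reliable registration flow has to exist before the gate is switched on.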
Limitations
- OOB assumes a trustworthy secondary channel. SIM-swap and SS7 attacks defeat SMS-based OOB specifically; OAuth-token theft and malware on the user's phone can undermine the secondary channel more broadly.
- The routing policy that determines which actions trigger OOB is where the real design work lives, and where miscalibration produces either false security (too few triggers) or channel fatigue (too many).
Maturity tier reasoning
- Tier 2 because OOB primitives are Tier 1 mature in financial-services and identity-provider infrastructure; the agentic application is operational composition: wiring an existing OOB endpoint into the agent's HITL gate for irreversible high-stakes actions.
- What keeps it from Tier 1 is the absence of an agentic-AI-specific OOB profile specifying which action classes warrant OOB and which channels are acceptable per class; every deployment tunes from scratch.
Last verified against upstream docs: 2026-05-30.