EVIDENCE TRAIL
Reflection-loop depth cap
Verbatim excerpts from the upstream sources cited on the mitigation page, with what each source does and does not prove. The "Reflection Loop Trap" scenario is named verbatim in OWASP Agentic AI Threats & Mitigations v1.1 under Intent Breaking & Goal Manipulation. The academic iteration-bound design pattern appears in both Reflexion (Shinn et al. 2023) and Self-Refine (Madaan et al. 2023).
Last cross-checked against upstream sources: · 7 sources
References
Each entry shows what the source supports and what it does not prove.
OWASP Agentic AI — Threats & Mitigations v1.1
§Intent Breaking & Goal Manipulation — Scenario 4 "Reflection Loop Trap"
"Scenario 4: Reflection Loop Trap – An attacker triggers infinite or excessively deep self-analysis cycles in an AI, consuming resources and preventing it from making real-time decisions, effectively paralyzing the system."
Supports: Verbatim source for the "reflection loop trap" threat pattern this control directly addresses. Names infinite or excessively deep self-analysis as the attack vector and compute exhaustion / decision paralysis as the effect.
Does not prove: That the Reflection Loop Trap is a T4 Resource Overload scenario. It appears in the Intent Breaking & Goal Manipulation section, not the T4 section. The two threats are related but distinct: T4 frames overload as a DoS concern, while this scenario frames it as a reasoning-integrity concern. Both rationales apply to the cap control.
OWASP Agentic AI — Threats & Mitigations v1.1
§T4 Resource Overload — Description
"Resource Overload occurs when attackers deliberately exhaust an AI agent's computational power, memory, or external service dependencies, leading to system degradation or failure. Unlike traditional DoS attacks, AI agents are especially vulnerable due to resource-intensive inference tasks, multi-service dependencies, and concurrent processing demands, making them susceptible to delays, decision paralysis, or cascading failures across interconnected systems."
Supports: Establishes T4 Resource Overload as the primary threat taxonomy entry that a reflection-depth cap mitigates. Names decision paralysis and cascading failures as concrete effects of uncapped compute exhaustion in agentic systems.
Does not prove: T4 does not name reflection loops or iteration depth as an attack vector. Its four scenarios cover inference-time exploitation, multi-agent resource exhaustion, API quota depletion, and memory cascade failure. The Reflection Loop Trap scenario is listed under Intent Breaking & Goal Manipulation (T6) instead, not under T4.
OWASP Agentic AI — Threats & Mitigations v1.1
§Mitigation Strategies — Playbook 3: Securing AI Tool Execution · Step 3: Prevent AI Resource Exhaustion (Detective)
"Enforce auto-suspension of AI processes that exceed predefined resource consumption thresholds. … Limit concurrent AI-initiated system modification requests. Prevent mass tool executions that could inadvertently trigger denial-of-service (DoS) conditions."
Supports: Names predefined resource-consumption thresholds and execution limits as the operational mitigation for Resource Overload. The reflection-depth cap is the equivalent control applied to the iteration dimension rather than the concurrent-request dimension.
Does not prove: Playbook 3 addresses resource limits at the tool-invocation and process-suspension layer; it does not name reflection-loop depth specifically. Helmwart applies the same threshold-enforcement pattern to the reflection-iteration counter.
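The threshold-enforcement pattern described above can be sketched for a reflection-iteration counter as follows. This is a minimal illustration, not Helmwart's implementation: the names ReflectionDepthCap, ReflectionDepthExceeded, run_with_cap and reflect_once are hypothetical, and the threshold value is whatever a deployment chooses.

```python
from dataclasses import dataclass


class ReflectionDepthExceeded(RuntimeError):
    """Raised when a reflection loop reaches its predefined depth threshold."""


@dataclass
class ReflectionDepthCap:
    """Hypothetical counter that enforces a predefined reflection-depth threshold."""

    max_depth: int  # assumed to be set per deployment
    depth: int = 0

    def enter(self) -> None:
        """Record one reflection pass, or refuse it once the cap is reached."""
        if self.depth >= self.max_depth:
            raise ReflectionDepthExceeded(
                f"reflection depth cap of {self.max_depth} reached"
            )
        self.depth += 1


def run_with_cap(reflect_once, state, cap):
    """Repeat reflect_once until it reports completion or the cap suspends the loop.

    reflect_once is a hypothetical callable returning (new_state, done).
    """
    while True:
        try:
            cap.enter()
        except ReflectionDepthExceeded:
            # Auto-suspension: stop iterating and hand back the last state.
            return state, "suspended"
        state, done = reflect_once(state)
        if done:
            return state, "completed"
```

A loop that never reports completion is suspended after max_depth passes, while a loop that finishes earlier returns normally and leaves the remaining budget unused.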
OWASP LLM Top 10 v2025 — LLM10: Unbounded Consumption
§LLM10:2025 Unbounded Consumption — Continuous Input Overflow vulnerability description
"Continuously sending inputs that exceed the LLM's context window can lead to excessive computational resource use, resulting in service degradation and operational disruptions."
Supports: Establishes the upstream LLM Top 10 entry that T4 Resource Overload explicitly extends for agentic systems. The pattern of unbounded input or unbounded iterations driving excessive compute is the same threat class that a depth cap bounds.
Does not prove: LLM10 does not name reflection loops or iteration counts as a specific vector — it focuses on input length and API request volume. Helmwart applies the same bounding principle to the self-iteration dimension.
Shinn et al. — Reflexion: Language Agents with Verbal Reinforcement Learning (2023)
§3 Reflexion — Algorithm pseudocode and AlfWorld experimental setup
"while M_e not pass or t < max trials do … In practice, we bound mem by a maximum number of stored experiences, Ω (usually set to 1-3) to adhere to max context LLM limitations. … if the number of actions taken in the current environment exceeds 30 (inefficient planning), we self-reflect. … we truncate the agent's memory to the last 3 self-reflections (experiences)."
Supports: Primary academic source for the reflection-loop pattern this control bounds. Reflexion's own pseudocode uses a max trials bound in the loop exit condition. The paper also sets explicit numeric bounds (Ω = 1–3 stored experiences, a 30-action limit, a 3-reflection memory window) as operational parameters, which indicates that bounded iteration is built into this reflection design rather than added as an afterthought.
Does not prove: Reflexion treats the bounds primarily as a quality/context-window constraint, not a security control. The security framing — that adversarial input can trigger excessively deep loops — is Helmwart's application of the paper's design pattern to a threat context.
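The two bounds quoted from Reflexion, a trial limit and a truncated reflection memory, can be sketched together as below. This is an illustrative sketch and not the paper's code: attempt, evaluate and reflect are hypothetical callables, max_trials=5 is an arbitrary default, and memory_size=3 mirrors the "last 3 self-reflections" window in the excerpt. The sketch also exits as soon as the evaluator passes, which is an assumption about the intended behaviour of the quoted loop condition.

```python
from collections import deque


def reflexion_style_loop(attempt, evaluate, reflect, max_trials=5, memory_size=3):
    """Retry a task with self-reflection, bounded by trials and memory size.

    attempt(memory) -> output, evaluate(output) -> bool and
    reflect(output) -> str are hypothetical callables supplied by the caller.
    Returns (output or None, trials used, remaining reflections).
    """
    # deque(maxlen=...) silently drops the oldest reflection when full.
    memory = deque(maxlen=memory_size)
    for trial in range(1, max_trials + 1):
        output = attempt(list(memory))
        if evaluate(output):
            return output, trial, list(memory)
        memory.append(reflect(output))
    # Trial budget exhausted without a passing output.
    return None, max_trials, list(memory)
```

Because the trial budget is the range of the for loop, no input to attempt or reflect can extend the number of passes beyond max_trials.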
Madaan et al. — Self-Refine: Iterative Refinement with Self-Feedback (2023)
§2 Self-Refine — Algorithm description and §4 Experimental Setup
"This process is repeated either for a specified number of iterations or until M determines that no further refinement is necessary. … The stopping condition stop(fb_t, t) either stops at a specified timestep t, or extracts a stopping indicator (e.g. a scalar stop score) from the feedback. … the FEEDBACK-REFINE iterations continue until the desired output quality or task-specific criterion is reached, up to a maximum of 4 iterations."
Supports: Shows that Self-Refine's iterative self-critique loop carries an explicit iteration bound as part of its stopping condition. The paper's Algorithm 1 requires a stop(·) function, and the experiments specify a hard maximum of 4 iterations. This supports treating a depth cap as a design requirement of the reflection pattern, not an optional add-on.
Does not prove: Self-Refine bounds iterations for quality and cost reasons, not security reasons. The adversarial framing — that a prompt can drive the loop to pathological depth — is applied by Helmwart to this design pattern.
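A stopping condition of the kind quoted above, combining a fixed iteration limit with a stop indicator read from the feedback, might look like the following sketch. It is not the Self-Refine authors' code: the dictionary-shaped feedback with a "stop" key, and the generate, give_feedback and refine callables, are assumptions made for illustration. The default of 4 iterations is taken from the excerpt.

```python
def stop(feedback, t, max_iterations=4):
    """Stop at a fixed timestep or when the feedback carries a stop indicator.

    feedback is assumed to be a dict with an optional boolean "stop" key.
    """
    return t >= max_iterations or bool(feedback.get("stop", False))


def self_refine_style(generate, give_feedback, refine, prompt, max_iterations=4):
    """Run feedback-refine rounds until stop() fires.

    Returns the final output and the number of refinements applied.
    """
    output = generate(prompt)
    t = 0
    while True:
        feedback = give_feedback(output)
        if stop(feedback, t, max_iterations):
            return output, t
        output = refine(output, feedback)
        t += 1
```

The iteration limit is checked independently of the feedback content, so feedback that never signals completion still ends the loop after max_iterations refinements.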
MITRE ATLAS AML.M0004 — Restrict Number of AI Model Queries
AML.M0004 — Description
"Limit the total number and rate of queries a user can perform."
Supports: ATLAS's canonical mitigation for bounding model query volume. A reflection-depth cap applies the same "limit the total number … of queries" principle to the self-call dimension, within a single agent session rather than across users.
Does not prove: AML.M0004 is framed around external user query rate-limiting (a DoS/model-extraction defence), not internal agent self-iteration. The mapping to reflection-loop depth requires extending the principle one layer inward.
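Extending the query-limit principle one layer inward could be sketched as a per-session budget on an agent's calls to its own model. The class and method names below (SelfCallBudget, try_consume) are hypothetical and do not come from ATLAS or any library. The sketch bounds only the total number of self-calls; the rate component of AML.M0004 is left out.

```python
from collections import Counter


class SelfCallBudget:
    """Hypothetical per-session limit on an agent's calls to its own model."""

    def __init__(self, max_calls_per_session):
        self.max_calls = max_calls_per_session
        self.used = Counter()  # session_id -> self-calls consumed so far

    def try_consume(self, session_id):
        """Return True and count the call if the session still has budget."""
        if self.used[session_id] >= self.max_calls:
            return False
        self.used[session_id] += 1
        return True
```

Each session is tracked separately, so one session exhausting its budget does not affect another.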