EVIDENCE TRAIL
Code-generation review gate
Verbatim excerpts from the upstream sources cited on the mitigation page, with what each source does and does not prove. Two MDX issues are flagged. First, the page cites "OWASP Top 10 Proactive Controls C9 — Implement Secure Code Review", but C9 in the current (2024) release is titled "Implement Security Logging and Monitoring", not code review; that citation should be removed or corrected. Second, the page's NIST AI 600-1 claim overstates what MEASURE 2.7 says, as detailed in the NIST entry below.
Last cross-checked against upstream sources: · 7 sources
References
Each entry shows what the source supports and what it does not prove.
OWASP Agentic AI — Threats & Mitigations v1.1
§T11 Unexpected RCE and Code Attacks — Mitigation column (threat taxonomy table)
"Restrict AI code generation permissions, sandbox execution, and monitor AI-generated scripts. Implement execution control policies that flag AI-generated code with elevated privileges for manual review."
Supports: Direct verbatim source for the review-gate framing. Names the exact control mechanism: execution policies that flag AI-generated code with elevated privileges for manual review. The three scenarios in the MDX (DevOps Agent Compromise, Workflow Engine Exploitation, Linguistic Ambiguity) all originate from this entry.
Does not prove: The table-level mitigation is a one-sentence summary, not a deployment specification. Does not name branch protection, SAST tooling, or provenance attribution — those are Helmwart additions.
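The flagging rule quoted in this entry can be sketched in a few lines. This is a minimal illustration only, not something the OWASP source specifies: the `CodeChange` record, the `flag_for_manual_review` function, and the privilege labels are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

# Hypothetical privilege labels (assumption); a real deployment defines its own.
ELEVATED_PRIVILEGES = frozenset({"root", "deploy", "secrets:read"})


@dataclass(frozen=True)
class CodeChange:
    """Hypothetical record describing a proposed code change."""
    change_id: str
    ai_generated: bool
    requested_privileges: frozenset = field(default_factory=frozenset)


def flag_for_manual_review(change: CodeChange) -> bool:
    """Return True when an AI-generated change requests any elevated privilege."""
    requests_elevated = bool(change.requested_privileges & ELEVATED_PRIVILEGES)
    return change.ai_generated and requests_elevated
```

Under these assumptions, an AI-generated change that requests `deploy` is flagged, while a human-authored change or an AI-generated change with no elevated privilege passes through to whatever other controls apply.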
OWASP Agentic AI — Threats & Mitigations v1.1
§Step 2: Monitor & Prevent Tool Misuse and Supply Chain Anomalies (Reactive) — mitigation playbook bullet
"Require human verification before AI-generated code with elevated privileges can be executed."
Supports: Verbatim statement of the core control requirement cited in the MDX independent-evidence section as "OWASP Agentic AI Mitigation Playbook P3". Exact wording match with the MDX.
Does not prove: The statement appears in a playbook/checklist section that also governs tool misuse broadly; it is not a freestanding T11-specific prescription. Helmwart surfaces it as a named control; the source presents it as a checklist item.
OWASP Top 10 for Agentic Applications 2026
§ASI05 Unexpected Code Execution (RCE) — Prevention and Mitigation Guidelines, items 5 and 6
"Architecture and design: Isolate per-session environments with permission boundaries; apply least privilege; fail secure by default; separate code generation from execution with validation gates. Access control and approvals: Require human approval for elevated runs; keep an allowlist for auto-execution under version control; enforce role and action-based controls."
Supports: Names "separate code generation from execution with validation gates" and "Require human approval for elevated runs" as the two structural controls that define the review gate. Directly corroborates both the gate architecture and the privilege-separation framing in the MDX.
Does not prove: ASI05 mitigations are broader than a review gate (sandbox, eval-ban, SAST). The review gate is one item in a list, not the named centrepiece. Does not address AI provenance attribution or the separation-of-duty policy between AI and human reviewer roles.
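One way to picture "separate code generation from execution with validation gates" is a decision function that sits between the two stages. The sketch below is an assumption-laden illustration, not the ASI05 design: `Decision`, `validation_gate`, and the allowlist entries are hypothetical, and the choice to deny unknown non-elevated actions is one reading of "fail secure by default".

```python
from enum import Enum


class Decision(Enum):
    AUTO_EXECUTE = "auto_execute"
    NEEDS_HUMAN_APPROVAL = "needs_human_approval"
    DENY = "deny"


# Hypothetical allowlist (assumption); ASI05 suggests keeping the real one
# under version control.
AUTO_EXEC_ALLOWLIST = frozenset({"run_unit_tests", "format_code"})


def validation_gate(action: str, elevated: bool,
                    allowlist: frozenset = AUTO_EXEC_ALLOWLIST) -> Decision:
    """Decide what happens to a generated action before anything executes."""
    if elevated:
        # Elevated runs always go to a human, even for allowlisted actions.
        return Decision.NEEDS_HUMAN_APPROVAL
    if action in allowlist:
        return Decision.AUTO_EXECUTE
    # Fail secure: anything not explicitly allowlisted is denied.
    return Decision.DENY
```

The generator never calls the executor directly in this arrangement; it only produces an action that the gate classifies.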
NIST AI 600-1 — Generative AI Profile (NIST AI RMF)
MEASURE 2.7 heading; related actions MS-2.7-001 through MS-2.7-006 cover vulnerability assessment, content provenance benchmarking, security metrics, and resilience measurement.
"MEASURE 2.7: AI system security and resilience – as identified in the MAP function – are evaluated and documented."
Supports: The MDX cites MEASURE-2.7 as the NIST anchor for "human review for AI-generated code." MEASURE 2.7 does cover security and resilience evaluation — MS-2.7-001 explicitly names "autonomous agents" as a vulnerability class to assess. This is the closest NIST AI 600-1 hook for the control.
Does not prove: MEASURE 2.7 actions are evaluation and benchmarking requirements, not a human-review gate prescription. None of the MS-2.7 actions specify human review of AI-generated code before execution or merge. The MDX claim "NIST AI 600-1 … names human review for AI-generated code" overstates what MEASURE 2.7 says. Correction flagged.
MITRE ATLAS AML.M0029 — Human In-the-Loop for AI Agent Actions
AML.M0029 description field (ATLAS.yaml, dist/ATLAS.yaml)
"Systems should require the user or another human stakeholder to approve AI agent actions before the agent takes them. The human approver may be technical staff or business unit SMEs depending on the use case. Separate tools, such as dedicated audit agents, may assist human approval, but final adjudication should be conducted by a human decision-maker."
Supports: Defines the human-review escalation path that the review gate opens onto. "Final adjudication should be conducted by a human decision-maker" is the structural principle behind the AI-cannot-self-approve policy.
Does not prove: Does not specify how the approval is triggered or that the trigger is AI-attribution in a commit. Covers the broader class of AI agent actions, not the narrower domain of code generation and merge gates specifically.
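The adjudication principle in AML.M0029 can be illustrated with a small check in which an audit agent's recommendation is advisory and only a human approval lets the action proceed. The `Approval` record and `may_proceed` function are hypothetical names for this sketch; ATLAS does not prescribe any data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Approval:
    """Hypothetical approval record (assumption, not an ATLAS structure)."""
    approver_id: str
    approver_is_human: bool
    approved: bool


def may_proceed(audit_agent_recommends: bool, final: Optional[Approval]) -> bool:
    """Return True only when a human decision-maker has approved the action.

    audit_agent_recommends is deliberately not consulted for the decision:
    tooling may assist the reviewer but cannot substitute for them.
    """
    if final is None or not final.approver_is_human:
        return False
    return final.approved
```

In this sketch a positive audit-agent recommendation with no human approval still blocks the action, which mirrors the "final adjudication" wording in the excerpt.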
MITRE ATLAS AML.M0008 — Validate AI Model
AML.M0008 description field (ATLAS.yaml, dist/ATLAS.yaml)
"Validate that AI models perform as intended by testing for backdoor triggers, potential for data leakage, or adversarial influence. Monitor AI model for concept drift and training data drift, which may indicate data tampering and poisoning."
Supports: Named in the MDX atlasMitigations list. Validates AI artefacts before deployment — analogous to validating AI-generated code before merge. The backdoor-trigger language is directly relevant to the Workflow Engine Exploitation scenario in the MDX.
Does not prove: Concerns model-level validation, not code-level review. Does not address pull-request gates, branch protection, or AI-provenance attribution. Adjacent concern, not identical.
NIST SP 800-53 Rev. 5 — AC-5 Separation of Duties
AC-5 Separation of Duties — Discussion
"Separation of duties addresses the potential for abuse of authorized privileges and helps to reduce the risk of malevolent activity without collusion. Separation of duties includes dividing mission or business functions and support functions among different individuals or roles … ensuring that security personnel who administer access control functions do not also administer audit functions."
Supports: Canonical upstream source for the privilege-separation requirement that prevents an AI agent from approving its own code. "Potential for abuse of authorized privileges" is the exact risk that the AI-cannot-self-approve policy addresses. The MDX correctly cites this as the foundation for the separation-of-duty policy.
Does not prove: Pre-dates AI agents and does not mention LLMs, code generation, or agentic systems. The application to AI reviewer roles is a Helmwart inference from the generic principle.
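Applied to a merge gate, the AI-cannot-self-approve inference might look like the following sketch. It assumes a hypothetical `Identity` record with a `role` of either `"ai_agent"` or `"human"`; neither the record nor the functions come from SP 800-53, which states only the generic principle.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Identity:
    """Hypothetical identity record; role is "ai_agent" or "human" (assumption)."""
    name: str
    role: str


def valid_approvals(author: Identity, approvers: Sequence[Identity]) -> List[Identity]:
    """Keep only approvals from humans other than the change author."""
    return [a for a in approvers if a.role == "human" and a.name != author.name]


def merge_allowed(author: Identity, approvers: Sequence[Identity],
                  required: int = 1) -> bool:
    """Allow the merge only with enough independent human approvals."""
    return len(valid_approvals(author, approvers)) >= required
```

Approvals from the authoring agent, from any other AI agent, or from a human approving their own change are simply not counted, so the author and the approver are always different roles.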