3-Agent Cross-Review: Independent Eyes on Every Deliverable

How independent reviewers catch failures that self-review structurally cannot.


Running in production since early 2025 — before "adversarial review" became standard terminology in the AI engineering community.


The Problem: Single-Reviewer Blindness

An engineer reviewing their own work is structurally incapable of catching certain classes of error. This is not a quality issue with any individual reviewer; it is a property of the review context being contaminated by the authoring context.

When the same person who built the analysis turns around to review it:

  • The reasoning that justified each decision is still in their head, so the decisions look obviously correct
  • Assumptions made during authoring feel like established facts
  • Missing requirements are invisible because they were never considered in the first place
  • Edge cases the author did not think of remain unconsidered

This is the engineering equivalent of a developer reviewing their own code. They might catch typos. They almost never catch architectural misalignment.

The Evidence

Before we enforced cross-review on our own work, the failure pattern was consistent: plans shipped with unstated assumptions; implementations diverged from plans because the same person rationalized the drift; the few self-reviews that did happen caught surface issues but missed structural problems. Independent review compliance was running at 4 percent. With cross-review enforced, that number is now consistently above 80 percent — and the issues caught are qualitatively different.


Three Independent Reviewers

For every plan and every deliverable, we engage three independent reviewers. Each operates in a fresh context, without access to the authoring session's reasoning or rationalizations.

Reviewer Role           | Primary Focus                   | Review Strengths
------------------------|---------------------------------|-------------------------------------------------------------------------
Orchestrator-reviewer   | Task framing and plan coherence | Scope verification, cross-file consistency, convention alignment
Implementation-reviewer | Code and analysis correctness   | Off-by-one errors, test coverage, file-path verification, diff analysis
Adversarial reviewer    | Architecture and completeness   | Missing edge cases, requirement gaps, large-context structural review

The key word is independent. A reviewer operating in the same session as the author is not independent — they share the same context contamination. Cross-review works because each reviewer starts with only the artifact and the specification, not the authoring session's history of dead ends, retries, and rationalizations.
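
To make the independence constraint concrete, here is a minimal Python sketch. Every name in it (ReviewRequest, fresh_context_review, the role strings) is illustrative rather than our internal tooling; the point is what the review request does and does not contain:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ReviewRequest:
        """What a reviewer receives: the artifact and the spec, nothing else."""
        artifact: str       # the plan or deliverable under review
        specification: str  # the requirements it must satisfy

    def fresh_context_review(role: str, request: ReviewRequest) -> str:
        # Stand-in for invoking a reviewer in a brand-new session.
        return f"{role}: reviewed artifact against specification"

    def run_cross_review(artifact: str, specification: str) -> dict:
        # The request carries no authoring-session history: no rationale,
        # no dead ends, no retries. Each reviewer sees only what a client
        # would see.
        request = ReviewRequest(artifact, specification)
        roles = ("orchestrator-reviewer", "implementation-reviewer",
                 "adversarial reviewer")
        return {role: fresh_context_review(role, request) for role in roles}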


The Two-Stage Review Model

Cross-review happens at two distinct stages, not just one. Skipping either stage produces predictable failure modes.

Stage 1: Plan Review

Author creates implementation plan
    |
    v
Implementation-reviewer reviews plan --> finds execution concerns
    |   "This approach won't handle concurrent access"
    v
Adversarial reviewer reviews plan --> finds architectural concerns
    |   "This duplicates logic already in the existing toolchain"
    v
Author updates plan based on both reviews
    |
    v
Client / project lead approves the updated plan

Stage 2: Artifact Review

Approved plan is executed
    |
    v
Orchestrator-reviewer checks against plan
    |   "The plan called for 7 checkpoints; only 5 are present"
    v
Adversarial reviewer checks architecture
    |   "This component should follow the same protocol as its peer"
    v
Issues resolved --> deliverable ships

Both stages are mandatory. Plans receive adversarial review before approval. Code, analysis output, and deliverable artifacts receive adversarial review before completion. There is no path to ship that bypasses either stage.
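
A compact sketch of that control flow, in Python. The helper functions are placeholders for the real steps; what matters is the structure: neither stage can be skipped, and MAJOR findings always loop back through re-review.

    from dataclasses import dataclass, field

    @dataclass
    class Findings:
        major: list = field(default_factory=list)   # blocking issues
        minor: list = field(default_factory=list)   # non-blocking issues

    # Placeholders for the real steps; illustrative only.
    def review_plan(plan, spec):         return Findings()
    def review_artifact(artifact, plan): return Findings()
    def resolve(item, findings):         return item
    def execute(plan):                   return f"artifact built from: {plan}"

    def ship(plan: str, spec: str) -> str:
        # Stage 1: plan review. MAJOR findings force an update and re-review.
        findings = review_plan(plan, spec)
        while findings.major:
            plan = resolve(plan, findings)
            findings = review_plan(plan, spec)
        # (Client / project lead approval happens here.)

        # Stage 2: artifact review against the approved plan.
        artifact = execute(plan)
        findings = review_artifact(artifact, plan)
        while findings.major:
            artifact = resolve(artifact, findings)
            findings = review_artifact(artifact, plan)
        return artifact   # no path to ship bypasses either stage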


Why Different Reviewers Catch Different Things

Different review perspectives have characteristic blind spots and strengths. These are not random — they are consistent enough to be exploited deliberately.

Failure Mode                          | Best Caught By           | Why
--------------------------------------|--------------------------|---------------------------------------------------------
Plan-implementation drift             | Orchestrator-reviewer    | Maintains the plan context, can cross-reference
Off-by-one errors in code or formulas | Implementation-reviewer  | Strong at line-level analysis
Missing edge cases in plan            | Adversarial reviewer     | Large context window, good at "what about..." questions
Scope creep                           | Any independent reviewer | Fresh context = no investment in existing approach
Incorrect file or reference paths     | Implementation-reviewer  | Good at verifying paths exist where claimed
Architectural inconsistency           | Adversarial reviewer     | Holds the entire repo structure in context
Unstated assumptions                  | Any independent reviewer | Author's assumptions are not in the reviewer's context

The probability that all three reviewers miss the same issue is far lower than the probability that any single one misses it. That is the structural reason cross-review works.
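
For intuition, a back-of-the-envelope calculation. The miss rate below is an assumed, illustrative figure, not a measured one:

    # Assumed for illustration: a single reviewer misses a given issue
    # 30 percent of the time. With three independent reviewers, the issue
    # ships only if all three miss it.
    p_single_miss = 0.30
    p_all_miss = p_single_miss ** 3
    print(round(p_all_miss, 3))   # 0.027: roughly an 11x reduction

Real reviewers are never perfectly independent, so the true gain is smaller, but the direction of the effect is what the table above exploits.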


How the Workflow Is Enforced

Cross-review is not voluntary. It is enforced at the strongest level we operate — technical gates that block operations rather than warnings that suggest review.

Gate                 | Trigger                 | What It Enforces
---------------------|-------------------------|------------------------------------------
Plan-execution gate  | Implementation start    | Blocks execution without plan review
PR creation gate     | Pull request opened     | Blocks if no review evidence is present
Pre-push review gate | Push to shared branches | Blocks feature commits lacking review
Ship gate            | Final delivery          | Blocks without test pairing and review

The gates check four evidence sources: review results files, recorded reviews in the planning archive, formal review reports, and review keywords in commit messages. If any source has evidence for the change in question, the gate passes. If none do, the gate blocks — regardless of who initiated the operation or what deadline pressure exists.
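
As a sketch, the gate logic reduces to a single any() over the four evidence sources. The paths, filenames, and commit keyword here are hypothetical stand-ins for the real locations:

    from pathlib import Path

    def has_review_evidence(change_id: str, commit_message: str) -> bool:
        """True if ANY of the four evidence sources confirms a review."""
        sources = (
            Path(f"reviews/{change_id}.results").exists(),      # review results file
            Path(f"planning/{change_id}/review.md").exists(),   # planning-archive record
            Path(f"reports/{change_id}-review.md").exists(),    # formal review report
            "Reviewed-by:" in commit_message,                   # commit-message keyword
        )
        return any(sources)

    def gate(change_id: str, commit_message: str) -> None:
        if not has_review_evidence(change_id, commit_message):
            # Blocks unconditionally: no seniority or deadline override.
            raise SystemExit(f"BLOCKED: no review evidence for {change_id}")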


Review Verdicts

Reviews resolve to one of three explicit verdicts:

Verdict | Meaning                    | Action
--------|----------------------------|-------------------------------
APPROVE | No issues found            | Proceed to ship
MINOR   | Small issues, non-blocking | Fix recommended; not required
MAJOR   | Significant issues found   | Must fix before shipping

MAJOR findings require resolution and re-review. A MAJOR verdict never counts as passing evidence on its own; only resolution followed by a clean re-review closes the gate.
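
In sketch form (illustrative logic, not the production implementation), the verdict rule is that only the latest verdict counts, and it must not be MAJOR:

    from enum import Enum

    class Verdict(Enum):
        APPROVE = "approve"   # no issues: proceed to ship
        MINOR = "minor"       # non-blocking: fix recommended, not required
        MAJOR = "major"       # blocking: must fix, then re-review

    def gate_closes(review_history: list) -> bool:
        # A MAJOR verdict never counts as passing evidence on its own;
        # only a later re-review resolving to APPROVE or MINOR closes the gate.
        if not review_history:
            return False      # no review at all
        return review_history[-1] is not Verdict.MAJOR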


What Cross-Review Actually Catches

Real examples from production engagements:

  1. Configuration mismatches between policy and implementation. A review caught a threshold configured at one value in policy but a different value in the enforcing hook — a silent drift that would have made our own configuration misleading.
  2. Default-mode misalignment. A reviewer noticed that the "advisory" mode was the default for a strict-by-stated-intent enforcement script. The contradiction was caught and corrected before it shipped to clients.
  3. Concurrency failure modes that only appear at scale. Reviewers identified file-locking races, file-handle issues, and policy hooks blocking final writes — failure modes that would only have surfaced under client load, where they would have been far costlier.

None of these were caught by self-review. All were caught by an independent reviewer with no investment in the original approach.


Key Takeaway

Cross-review works because it exploits a structural property: independent reviewers have non-overlapping blind spots. A deliverable reviewed through three different lenses, with three different sets of assumptions, in three fresh contexts, has been examined more thoroughly than any single review can achieve.

The enforcement is what makes it real. Without technical gates, review compliance was 4 percent. With gates, every shippable artifact is examined. The combination of independent reviewers and automatic enforcement creates a quality floor that no single reviewer can produce alone.

Want engineering deliverables with three layers of independent review built into the workflow? Contact ACE Engineering to learn how this pattern applies to your project governance.

Three reviewers. Every deliverable. No exceptions.

Cross-review is not a checklist. It is enforced at every shipping gate so that the work you receive has already been adversarially examined.

Talk to ACE Engineering • Download Capability Summary (PDF, 1 page)