AgentGate

Release authority · Phoenix evidence path

Block unsafe AI agent releases before production.

AgentGate reads candidate execution evidence, applies AgentPack policy, computes gate-bound metrics, and writes deterministic BLOCK/APPROVE decisions.

Phoenix · Evidence backend

Traces, spans, eval labels, annotations, and what happened.

AgentGate · Release authority

AgentPack thresholds, company-specific metrics, audit artifacts, and ship/no-ship.

Reference workflow: Reference Ops AI

v2 BLOCKED → release controls generated → v2.1 verified those controls → APPROVED.

v2 BLOCKED

Policy expected DENY, but a dangerous tool was still called.

4 Release controls generated

v2.1 APPROVED

Unauthorized attempts and policy violations both dropped to 0.0%.

Blocker evidence detail

Sensitive output violation

Role `developer` · tool `deep_investigate_alert`.

Evidence trace_v2..._001 · role developer · deep_investigate_alert

Sensitive output violation

Role `developer` · tool `deep_investigate_alert`.

Evidence trace_v2..._r02 · role developer · deep_investigate_alert

Sensitive output violation

Role `developer` · tool `deep_investigate_alert`.

Evidence trace_v2..._r03 · role developer · deep_investigate_alert

How it works

Phoenix records what happened. AgentGate turns candidate evidence into a deterministic release decision and future release controls.

Candidate evidence → Phoenix MCP → AgentGate release workflow → Release report

Collect candidate evidence

Pull controlled candidate evidence from Phoenix MCP or bundled reference evidence — traces, spans, eval labels, policy preflights, and tool calls.

Apply AgentPack policy

Evaluate release-safety metrics and AgentPack custom metrics against effective policy thresholds.
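As a rough illustration of threshold evaluation, the sketch below checks each metric against a maximum allowed value. The `Threshold` type, the `evaluate` function, and the metric names are hypothetical; this page does not show AgentPack's actual policy schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical policy entry: the metric passes when value <= max_value."""
    metric: str
    max_value: float


def evaluate(metrics: dict[str, float], thresholds: list[Threshold]) -> dict[str, bool]:
    """Return pass/fail per thresholded metric; a missing metric counts as a failure."""
    results: dict[str, bool] = {}
    for t in thresholds:
        value = metrics.get(t.metric)
        results[t.metric] = value is not None and value <= t.max_value
    return results


# Assumed metric names, chosen to mirror the rates quoted in the reference workflow.
policy = [
    Threshold("unauthorized_attempt_rate", 0.0),
    Threshold("policy_violation_rate", 0.0),
]
outcome = evaluate(
    {"unauthorized_attempt_rate": 0.0, "policy_violation_rate": 0.25},
    policy,
)
```

Treating a missing metric as a failure is a design choice in this sketch: a gate that silently passes when evidence is absent would defeat its purpose.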

Evaluate blocker and warning controls

Score gate-bound blocker metrics and non-blocking warning controls.
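One way to picture the blocker/warning split is to tag each control result with a severity, so that only failed blockers can stop a release. The `Severity` enum, the `ControlResult` tuple, and the control names below are assumptions made for illustration.

```python
from enum import Enum
from typing import NamedTuple


class Severity(Enum):
    BLOCKER = "blocker"   # a failure here stops the release
    WARNING = "warning"   # a failure here is reported but does not block


class ControlResult(NamedTuple):
    name: str
    severity: Severity
    passed: bool


def split_failures(results: list[ControlResult]) -> tuple[list[str], list[str]]:
    """Return (failed blockers, failed warnings), each sorted for stable output."""
    blockers = sorted(
        r.name for r in results if not r.passed and r.severity is Severity.BLOCKER
    )
    warnings = sorted(
        r.name for r in results if not r.passed and r.severity is Severity.WARNING
    )
    return blockers, warnings


results = [
    ControlResult("sensitive_output_violation", Severity.BLOCKER, passed=False),
    ControlResult("latency_regression", Severity.WARNING, passed=False),
    ControlResult("unauthorized_attempts", Severity.BLOCKER, passed=True),
]
blockers, warnings = split_failures(results)
```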

Verify inherited release controls

When available, verify the candidate against release controls generated from a prior blocked run.
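A minimal sketch of this verification step, assuming an inherited control takes the form "this role must not call this tool". The `ToolCall` and `ReleaseControl` types, the trace IDs, and the `admin` role and `list_alerts` tool are hypothetical; only `developer` and `deep_investigate_alert` appear in the evidence above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    trace_id: str
    role: str
    tool: str


@dataclass(frozen=True)
class ReleaseControl:
    """Hypothetical control: `role` must not call `tool`."""
    role: str
    tool: str


def violations(control: ReleaseControl, calls: list[ToolCall]) -> list[str]:
    """Return trace IDs of candidate calls that break the control."""
    return [
        c.trace_id
        for c in calls
        if c.role == control.role and c.tool == control.tool
    ]


control = ReleaseControl(role="developer", tool="deep_investigate_alert")
candidate_calls = [
    ToolCall("trace-a", "admin", "deep_investigate_alert"),
    ToolCall("trace-b", "developer", "list_alerts"),
]
# The control is verified only when no candidate call violates it.
verified = not violations(control, candidate_calls)
```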

Generate future controls if blocked

Convert blocked failure patterns into release controls the next candidate must pass.
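The conversion could be as simple as collapsing repeated violations into one control per distinct pattern. The sketch below assumes a pattern is a (role, tool) pair and uses placeholder trace IDs; it is not the rule set that produced the four controls in the reference workflow.

```python
def controls_from_violations(violations: list[dict[str, str]]) -> list[tuple[str, str]]:
    """Deduplicate violations into sorted (role, tool) pairs the next candidate must avoid."""
    return sorted({(v["role"], v["tool"]) for v in violations})


# Three violations with the same role and tool, as in the blocker evidence above.
blocked_run = [
    {"trace_id": "trace-1", "role": "developer", "tool": "deep_investigate_alert"},
    {"trace_id": "trace-2", "role": "developer", "tool": "deep_investigate_alert"},
    {"trace_id": "trace-3", "role": "developer", "tool": "deep_investigate_alert"},
]
controls = controls_from_violations(blocked_run)
```

Sorting the deduplicated set keeps the generated controls in a stable order from run to run.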

Write ship / no-ship decision

Write a deterministic BLOCK or APPROVE decision from metrics, inherited controls, and AgentPack policy.
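"Deterministic" here can be read as: the same inputs always yield the same decision, with no model call in the loop. A hedged sketch of such a rule follows; the `decide` function and its output shape are assumptions, not AgentGate's actual interface.

```python
def decide(failed_blockers: list[str], failed_inherited_controls: list[str]) -> dict:
    """BLOCK if any blocker or inherited control failed, otherwise APPROVE."""
    # Merge and sort the reasons so input order never changes the output.
    reasons = sorted(set(failed_blockers) | set(failed_inherited_controls))
    return {"decision": "BLOCK" if reasons else "APPROVE", "reasons": reasons}


blocked = decide(["sensitive_output_violation"], [])
approved = decide([], [])
```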

Render audit report

Write metric provenance, regression gates, verification results, and an exportable audit report.
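An exportable report is easier to audit when identical content always serializes to identical bytes. The sketch below shows one common way to get that with canonical JSON plus a SHA-256 digest; the `render_report` function and the report fields are hypothetical.

```python
import hashlib
import json


def render_report(report: dict) -> tuple[str, str]:
    """Serialize with sorted keys and fixed separators, then fingerprint the bytes."""
    body = json.dumps(report, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return body, digest


# Same content, different key order: the body and digest should still match.
body_a, digest_a = render_report({"decision": "BLOCK", "candidate": "v2"})
body_b, digest_b = render_report({"candidate": "v2", "decision": "BLOCK"})
```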

Where each piece sits

Arize Phoenix

Evidence backend — traces, spans, eval labels, annotations, and observability context.

AgentGate

Release authority — AgentPack-defined policy thresholds, custom metrics, BLOCK/APPROVE.

Gemini

Explains selected dangerous sessions only; it does not decide releases.

Cloud Run + release workflow

Hosts this dashboard and the release workflow.