Judge Agents

Judge agents are pluggable, so the LLM behind a judge can be swapped. The default judge uses a high-quality model (e.g. opus-judge-v2). The judge reads each run, scores it on correctness, reasoning quality, and effect blast radius, and writes its ruling back as a ratification commit.
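As a rough illustration of what a pluggable judge could look like, here is a minimal Python sketch. Every name in it (`Ruling`, `Judge`, `StubJudge`, `ratification_message`) is hypothetical and not taken from the actual system. The stub scores with a placeholder rule instead of calling a model, and the commit message layout is an assumption.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Ruling:
    """Hypothetical ruling shape: one score per criterion named in the text."""
    judge: str
    correctness: float
    reasoning_quality: float
    effect_blast_radius: float


class Judge(Protocol):
    """Assumed interface: anything with this method could be plugged in as a judge."""
    name: str

    def judge(self, run_transcript: str) -> Ruling: ...


class StubJudge:
    """Stand-in for a model-backed judge; a real one would call an LLM here."""

    def __init__(self, name: str) -> None:
        self.name = name

    def judge(self, run_transcript: str) -> Ruling:
        # Placeholder scoring so the sketch runs without a model.
        correctness = 1.0 if "PASS" in run_transcript else 0.0
        return Ruling(
            judge=self.name,
            correctness=correctness,
            reasoning_quality=0.5,
            effect_blast_radius=0.0,
        )


def ratification_message(ruling: Ruling) -> str:
    """Format a ruling as text that could serve as a ratification commit message."""
    return (
        f"ratify: judged by {ruling.judge}\n\n"
        f"correctness: {ruling.correctness:.2f}\n"
        f"reasoning_quality: {ruling.reasoning_quality:.2f}\n"
        f"effect_blast_radius: {ruling.effect_blast_radius:.2f}\n"
    )


ruling = StubJudge("opus-judge-v2").judge("all tests PASS")
print(ratification_message(ruling))
```

Because `Judge` is a structural protocol, swapping in a different model would only require another class with the same `judge` method; nothing else in the sketch would change.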

Multiple judges can run in parallel, so their rulings on the same run can be cross-checked against one another (triangulation).
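One way parallel judging might be sketched in Python is shown below. The `triangulate` function, the representation of a judge as a plain callable that returns a single score, and the choice of median and spread as the summary are all assumptions made for illustration; the source does not say how rulings from several judges are combined.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median
from typing import Callable, Dict

# Hypothetical judge signature: takes a run transcript, returns one score.
JudgeFn = Callable[[str], float]


def triangulate(run_transcript: str, judges: Dict[str, JudgeFn]) -> Dict[str, float]:
    """Run every judge on the same transcript in parallel and summarize the scores."""
    if not judges:
        raise ValueError("at least one judge is required")

    # Threads suit this sketch because a model-backed judge would be I/O bound.
    with ThreadPoolExecutor(max_workers=len(judges)) as pool:
        futures = {name: pool.submit(fn, run_transcript) for name, fn in judges.items()}
        scores = {name: future.result() for name, future in futures.items()}

    values = list(scores.values())
    return {
        "median": median(values),             # consensus score across judges
        "spread": max(values) - min(values),  # disagreement between judges
    }


# Three stand-in judges with fixed scores, in place of real models.
summary = triangulate(
    "example run transcript",
    {
        "judge_a": lambda transcript: 0.5,
        "judge_b": lambda transcript: 0.75,
        "judge_c": lambda transcript: 1.0,
    },
)
print(summary)
```

A large spread would indicate that the judges disagree about the run, which is the kind of signal that triangulation is meant to surface.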