LLM-as-a-judge

Definition

Plain language

Using one AI model to grade what another AI model produced.

As stated in the literature

An evaluation or moderation pattern in which a language model serves as the grader of outputs, preferences, or safety properties.

Also called: LLM judge, LLM-as-judge, LLM-as-a-Judge

Why it matters: Much of modern model evaluation and preference data collection is done this way at scale, so a judge's biases carry over into everything built on its verdicts.

For example, given two candidate summaries, a stronger model is asked to pick the more accurate one, and its choice is recorded as a preference label.
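
The summary example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: `judge_pair`, `PROMPT_TEMPLATE`, and `stub_judge` are hypothetical names, the prompt wording is invented for the sketch, and the judge is passed in as a plain function so that no particular model API is implied.

```python
from typing import Callable

# Hypothetical prompt wording; a real judge prompt would be tuned to the task.
PROMPT_TEMPLATE = (
    "You are grading two summaries of the same source text.\n"
    "Source:\n{source}\n\n"
    "Summary A:\n{a}\n\n"
    "Summary B:\n{b}\n\n"
    "Which summary is more accurate? Answer with exactly one letter: A or B."
)


def judge_pair(source: str, summary_a: str, summary_b: str,
               judge_model: Callable[[str], str]) -> dict:
    """Ask the judge which summary is more accurate; return a preference label."""
    prompt = PROMPT_TEMPLATE.format(source=source, a=summary_a, b=summary_b)
    verdict = judge_model(prompt).strip().upper()
    if verdict not in ("A", "B"):
        # Refuse to guess when the judge's reply cannot be parsed.
        raise ValueError(f"Unparseable verdict: {verdict!r}")
    if verdict == "A":
        chosen, rejected = summary_a, summary_b
    else:
        chosen, rejected = summary_b, summary_a
    return {"source": source, "chosen": chosen,
            "rejected": rejected, "verdict": verdict}


def stub_judge(prompt: str) -> str:
    """Stand-in for a real model call (assumption): always prefers summary A."""
    return "A"


label = judge_pair(
    source="The meeting moved from Tuesday to Thursday.",
    summary_a="The meeting is now on Thursday.",
    summary_b="The meeting was cancelled.",
    judge_model=stub_judge,
)
print(label["chosen"])
```

In practice `stub_judge` would be replaced by a call to whichever model serves as the grader. Any bias that grader has ends up in the `chosen` and `rejected` fields, which is the concern raised under "Why it matters".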

Heard on the show

“AI judges — and LLM-as-a-judge is everywhere now, grading everything — tend to favor AI-generated text, especially text in a style like their own.”

Episode 178 — How an AI Reviewer Learned to Stop Going Easy on AI Writing

Mentioned in 14 episodes

178 · How an AI Reviewer Learned to Stop Going Easy on AI Writing
132 · The Agent Failed — But Did the Instructions Deserve to Be Followed?
125 · AI Coding Agents Run a Marathon, and Fewer Than One in Three Finish
123 · Five Identical Worlds, One Swapped Model: What Happens When AI Agents Run for Fifteen Days
122 · When Your Coding Agent Lies About the Fix: Verifying the Plan Before the Model Runs
087 · When No Agent Reads the Whole Document: A Universal Cliff in Multi-Agent Review
082 · Training a Deep Research Agent on 8,000 Synthetic Tasks: The Rubric Tree Trick
062 · Treating Hallucinations as Exploits: A Gate-Based Architecture for Agent Safety
059 · Firefly's Inversion: Building Verified Tool-Call Training Data by Working Backward
055 · Why LLM Judges Flip Their Verdicts When You Change the Question Format
052 · An Old Reinforcement Learning Tradeoff Sneaks Back Into LLM Agents
028 · Teaching a Model to Hire Copies of Itself: Recursive Agent Optimization
023 · Why a Small Agent Confidently Overwrites Memories It Doesn't Understand
020 · The Compliance Gap: Why AI Says Yes and Does No

Related concepts

LLM-as-Judge Rubric Generation

Related terms

safety property