AutoRater · Glossary · AI Papers: A Deep Dive

Definition

Plain language

An automated checker that grades whether an AI agent has hit each milestone in a task.

As stated in the literature

A separate LLM-based call within agent frameworks that reads the action history, compares it against a list of subgoals, and updates a binary progress vector consumed by the next decision step.

Why it matters: Having an LLM judge the agent's progress against an explicit subgoal list after every step helps long-horizon agents stay on track instead of drifting.

For example, after each agent step, the AutoRater checks "Did the agent open the file?" and "Did it run the test?", then updates the checklist that the next decision step sees.
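
To make the checklist idea concrete, here is a minimal Python sketch of the update loop. It is an illustration under stated assumptions, not the implementation from the episode or the paper: the names `auto_rate` and `keyword_judge` are hypothetical, the LLM call is replaced by a keyword-matching stub so the example runs on its own, and it assumes that a subgoal stays checked once it has been met.

```python
from typing import Callable, List, Sequence

# Hypothetical judge signature: (action history, subgoal question) -> met or not.
# In a real agent framework this would be the separate LLM-based call.
Judge = Callable[[Sequence[str], str], bool]


def keyword_judge(history: Sequence[str], subgoal: str) -> bool:
    """Stand-in for the LLM call (illustrative only): a subgoal counts as
    met if its trigger keyword appears in any recorded action."""
    triggers = {
        "did the agent open the file?": "open",
        "did it run the test?": "pytest",
    }
    keyword = triggers[subgoal]
    return any(keyword in action for action in history)


def auto_rate(
    history: Sequence[str],
    subgoals: Sequence[str],
    progress: Sequence[int],
    judge: Judge,
) -> List[int]:
    """Return an updated binary progress vector, one entry per subgoal.

    Assumption: a subgoal that was already marked as met stays met.
    """
    return [
        1 if done or judge(history, goal) else 0
        for goal, done in zip(subgoals, progress)
    ]


subgoals = ["did the agent open the file?", "did it run the test?"]
progress = [0, 0]          # nothing achieved yet
history: List[str] = []    # actions taken so far

for action in ["open src/app.py", "pytest tests/"]:
    history.append(action)
    # The rater runs after every agent step; the next decision would read `progress`.
    progress = auto_rate(history, subgoals, progress, keyword_judge)
    print(action, "->", progress)
```

Passing the judge in as a function keeps the checklist bookkeeping separate from whatever model does the grading, so the stub could be swapped for a real LLM call without touching `auto_rate`.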

Heard on the show

“They call it the AutoRater.”

Episode 008 — Why Long-Horizon AI Agents Get Stuck, and a Milestone-Based Fix That Helps

Mentioned in 1 episode

008
Why Long-Horizon AI Agents Get Stuck, and a Milestone-Based Fix That Helps

Related terms

agent