Definition
Rubric generation is the automatic production of structured grading criteria for a task. A common setup uses a strong model to write the rubric and one or more weaker models to apply it. This lets evaluation scale, but it inherits the failure modes of LLM-as-judge, such as position bias, verbosity bias, and self-preference.
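
The write-then-apply split described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Model` callable type, the `Criterion` class, the `generate_rubric` and `apply_rubric` functions, and the `name | description | weight` line format are all hypothetical names chosen for this example, and the stub functions stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical model interface: any callable mapping a prompt to text.
Model = Callable[[str], str]


@dataclass(frozen=True)
class Criterion:
    name: str
    description: str
    weight: float


def generate_rubric(task: str, writer: Model) -> List[Criterion]:
    """Ask the (strong) writer model for criteria, one per line,
    formatted as 'name | description | weight' (an assumed format)."""
    prompt = (
        "Write grading criteria for this task, one per line as "
        f"name | description | weight:\n{task}"
    )
    rubric = []
    for line in writer(prompt).splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        try:
            weight = float(parts[2])
        except ValueError:
            continue  # skip lines whose weight is not a number
        rubric.append(Criterion(parts[0], parts[1], weight))
    return rubric


def apply_rubric(response: str, rubric: List[Criterion], graders: List[Model]) -> float:
    """Return the weighted fraction of criteria met. Each criterion is
    decided by a strict majority vote among the (weaker) grader models."""
    total = sum(c.weight for c in rubric)
    if total == 0:
        return 0.0
    earned = 0.0
    for c in rubric:
        prompt = (
            f"Criterion: {c.name}: {c.description}\n"
            f"Response: {response}\nAnswer YES or NO."
        )
        votes = sum(g(prompt).strip().upper().startswith("YES") for g in graders)
        if votes * 2 > len(graders):
            earned += c.weight
    return earned / total


# Stubs standing in for real model calls (purely illustrative).
def stub_writer(prompt: str) -> str:
    return "correctness | States that 2 + 2 = 4 | 2\nbrevity | Uses under 20 words | 1"


def stub_grader_a(prompt: str) -> str:
    return "YES" if "correctness" in prompt else "NO"


def stub_grader_b(prompt: str) -> str:
    return "NO"


rubric = generate_rubric("Explain what 2 + 2 equals.", stub_writer)
score = apply_rubric("2 + 2 = 4", rubric, [stub_grader_a, stub_grader_a, stub_grader_b])
print(f"{len(rubric)} criteria, score = {score:.2f}")
```

Majority voting across several graders is one way to damp the noise of any single weak judge, but it does not remove biases the graders share, which is why the LLM-as-judge failure modes carry over.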