Definition
A rating number that ranks players based on who beats whom.
A rating system originating in chess that converts the outcomes of pairwise comparisons into one scalar score per competitor; used in the DeepMind Erdős evolutionary agent to rank proof sketches through tournaments in which an LLM acts as judge.
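Example
To make "pairwise outcomes into scalar scores" concrete, here is a minimal sketch of the standard Elo update in Python. The 400-point scale, the K-factor of 32, and the starting rating of 1500 are conventional chess values assumed for illustration; the source does not say which parameters the agent uses. The function names (`expected_score`, `update_elo`, `rank_by_elo`) are hypothetical, and the list of (winner, loser) pairs stands in for whatever verdicts a judge would return.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k=32 is an assumed K-factor, not a value taken from the source.
    """
    e_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - e_a)
    # Whatever A gains, B loses, so the rating total is conserved.
    return rating_a + delta, rating_b - delta


def rank_by_elo(items, outcomes, initial=1500.0, k=32.0):
    """Rank items from a sequence of (winner, loser) pairs, best first."""
    ratings = {item: initial for item in items}
    for winner, loser in outcomes:
        ratings[winner], ratings[loser] = update_elo(
            ratings[winner], ratings[loser], 1.0, k
        )
    return sorted(ratings, key=ratings.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical verdicts: each pair is (winner, loser).
    verdicts = [("a", "b"), ("a", "c"), ("b", "c")]
    print(rank_by_elo(["a", "b", "c"], verdicts))
```

Because each update depends on the current ratings, the final scores depend on the order in which comparisons are processed; a real tournament would typically run many comparisons so that this ordering effect washes out.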