Definition
A multi-discipline, olympiad-style benchmark covering mathematics and physics at advanced competition difficulty. It is used to evaluate advanced reasoning models, in evaluation suites alongside MATH and AIME.