Definition
A standard set of competition math problems used to test reasoning models.
A 500-problem subset of the MATH benchmark, drawn from competition mathematics in subjects including algebra, geometry, and number theory.
Also called: MATH