Definition
A benchmark of research-grade mathematics problems used to test the mathematical reasoning of AI systems.
An Epoch AI benchmark of advanced mathematics problems, organized into difficulty tiers (with Tier 4 problems comparable to short-term research projects for PhD mathematicians) and used to evaluate the mathematical reasoning capabilities of AI systems.
Also called: FrontierScience-Research