Definition
A math reasoning benchmark drawn from quantitative reasoning tasks.
A benchmark suite of mathematical and scientific reasoning problems originally introduced with Google's Minerva model; used as one of the canonical math evaluation sets.
A math reasoning benchmark drawn from quantitative reasoning tasks.
A benchmark suite of mathematical and scientific reasoning problems originally introduced with Google's Minerva model; used as one of the canonical math evaluation sets.