
GSM-Infinite

Definition

A benchmark of math word problems where you can dial in how many reasoning steps each problem requires.

A procedurally generated grade-school math benchmark with controllable arithmetic depth, used to test how reasoning quality scales with sequential computation in long-context and hybrid models.
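
To make "controllable arithmetic depth" concrete, here is a minimal toy sketch of a procedural generator. It is not GSM-Infinite's actual generator: the function name `make_problem`, the `depth` and `seed` parameters, and the bins-of-apples template are all hypothetical, chosen only to illustrate how one knob can set the number of sequential steps.

```python
import random


def make_problem(depth, seed=0):
    """Toy generator (hypothetical, not the benchmark's own code).

    Returns (problem_text, answer), where solving the problem takes
    exactly `depth` arithmetic steps, each depending on the previous one.
    """
    rng = random.Random(seed)  # seeded so the same inputs give the same problem
    value = rng.randint(1, 9)
    lines = [f"Bin 0 holds {value} apples."]

    for i in range(1, depth + 1):
        operand = rng.randint(2, 5)
        if rng.random() < 0.5:
            # Additive step: the new quantity is defined relative to the last bin.
            value = value + operand
            lines.append(f"Bin {i} holds {operand} more apples than bin {i - 1}.")
        else:
            # Multiplicative step, again chained to the previous bin.
            value = value * operand
            lines.append(f"Bin {i} holds {operand} times as many apples as bin {i - 1}.")

    lines.append(f"How many apples does bin {depth} hold?")
    return "\n".join(lines), value


if __name__ == "__main__":
    text, answer = make_problem(depth=3, seed=42)
    print(text)
    print("Answer:", answer)
```

Raising `depth` lengthens the chain without changing the difficulty of any single step, so in this sketch accuracy as a function of `depth` isolates how well a model sustains sequential computation.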

Mentioned in 1 episode

085
Why Long-Context Models Might Need Compute, Not Capacity, Before Eviction