
Epoch Capabilities Index

Definition

Plain language

A single score that ranks AI models by how well they do across many standard tests.

As stated in the literature

An aggregate capability metric, maintained by Epoch AI, that combines model performance on benchmarks such as MMLU and GPQA. Recent inverse-scaling analyses of LLM forecasting use it as the capability axis.

Also called: ECI

Why it matters: A composite score smooths out quirks of any one benchmark and lets researchers study how a downstream behavior scales with overall capability.

For example, when comparing how well two models forecast world events, researchers can plot performance against each model's Epoch Capabilities Index instead of debating which single benchmark to use.
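As a rough illustration of that kind of comparison, the sketch below fits a least-squares line of forecasting error against a capability score. Everything in it is an assumption made for illustration: the model names, the capability values, and the Brier scores are made-up placeholders, not real Epoch Capabilities Index values or results from any paper, and a straight-line fit is only one simple way to look at the trend.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Return the least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Hypothetical placeholder data: (capability score, Brier score).
# These are NOT real ECI values or measured forecasting results.
# Brier score is a forecasting error measure, so lower is better.
models = {
    "model_a": (100.0, 0.20),
    "model_b": (110.0, 0.22),
    "model_c": (120.0, 0.21),
    "model_d": (130.0, 0.25),
}

capability = [cap for cap, _ in models.values()]
brier = [err for _, err in models.values()]

slope = ols_slope(capability, brier)

# A positive slope means error rises with capability, which is the
# "more capable models forecast worse" pattern in this toy data.
trend = "worse" if slope > 0 else "better or flat"
print(f"Brier change per capability point: {slope:.4f} ({trend})")
```

Using one composite score on the x-axis keeps the comparison to a single fit, instead of repeating it once per benchmark and reconciling the results.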

Heard on the show

“The paper is using something called the Epoch Capabilities Index — basically an aggregate of how well a model does on standard benchmarks — to rank models.”

Episode 069 — When Smarter Models Forecast Worse: The Hidden Failure Mode in LLM Predictions

Mentioned in 1 episode

Episode 069 · When Smarter Models Forecast Worse: The Hidden Failure Mode in LLM Predictions

Related terms

capability, GPQA, MMLU