Glossary · Term

mean of means


Definition

A common but subtly wrong way to average grouped data: each group's average counts equally regardless of the group's size, so items in small groups get over-weighted.

A pooling error in which the per-batch loss is computed as the average of per-rank averages rather than as the sum of per-token losses divided by the total token count. When batch sizes (token counts) are uneven across ranks, this biases the gradient toward tokens on the smaller ranks. It is a known bug class in some SFT pipelines.
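A minimal sketch of the difference, in plain Python. The loss values, the two-rank layout, and the function names are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from any particular training library.

```python
# Hypothetical per-token losses on two ranks holding uneven token counts.
rank_losses = [
    [2.0] * 8,  # rank 0: 8 tokens, each with loss 2.0
    [4.0] * 2,  # rank 1: 2 tokens, each with loss 4.0
]


def mean_of_means(groups):
    """Average each group first, then average the group averages (the error)."""
    per_group = [sum(g) / len(g) for g in groups]
    return sum(per_group) / len(per_group)


def pooled_mean(groups):
    """Sum every token's loss, then divide by the total token count."""
    total = sum(sum(g) for g in groups)
    count = sum(len(g) for g in groups)
    return total / count


mom = mean_of_means(rank_losses)  # (2.0 + 4.0) / 2
pooled = pooled_mean(rank_losses)  # (16.0 + 8.0) / 10

print(f"mean of means: {mom}")
print(f"pooled mean:   {pooled}")
```

With these numbers the mean of means is 3.0 while the pooled mean is 2.4. Under the mean of means, each of the two tokens on rank 1 carries a weight of 1/4 and each of the eight tokens on rank 0 carries 1/16, so the tokens on the smaller rank count four times as much. The two quantities agree only when every group holds the same number of tokens.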

Also called: mean-of-means

Mentioned in 1 episode

  1. 009
    How Two Silent Library Bugs Quietly Invalidated a Wave of Reasoning Papers