FineWeb-Edu

Definition

A large cleaned-up collection of educational web text used to train language models.

A pretraining corpus derived from FineWeb by keeping only the pages that a quality classifier scores as highly educational. It is widely used to train open-weight models.

Related term: FineWeb (the parent corpus; FineWeb-Edu is a filtered subset of it, not a synonym)

Mentioned in 1 episode

  1. 033
    Echo: The Paper Arguing You Never Needed a KV Cache for Retrieval