Glossary · Term

DeepSpeed


Definition

A widely used software toolkit that helps train very large models.

Microsoft's open-source library for distributed training of large neural networks. Its features include ZeRO sharding (partitioning optimizer state, gradients, and parameters across devices), CPU offloading, and gradient accumulation.
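
As a rough illustration of how those three features are typically switched on, the sketch below builds a DeepSpeed-style configuration as a plain Python dictionary. The key names follow DeepSpeed's JSON config format as commonly documented; the specific values (micro-batch of 4, 8 accumulation steps, ZeRO stage 2, 8 GPUs) and the helper `effective_batch_size` are hypothetical and chosen only for the example.

```python
import json


def effective_batch_size(config: dict, world_size: int) -> int:
    """Samples consumed per optimizer step across all GPUs.

    Hypothetical helper: micro-batch per GPU x accumulation steps x GPU count.
    """
    return (
        config["train_micro_batch_size_per_gpu"]
        * config["gradient_accumulation_steps"]
        * world_size
    )


# Illustrative values only; tune them for the actual model and hardware.
ds_config = {
    # Samples processed by each GPU in one forward/backward pass.
    "train_micro_batch_size_per_gpu": 4,
    # Gradient accumulation: sum gradients over 8 passes before stepping.
    "gradient_accumulation_steps": 8,
    "zero_optimization": {
        # ZeRO stage 2 shards optimizer state and gradients across GPUs.
        "stage": 2,
        # CPU offloading: keep optimizer state in host memory.
        "offload_optimizer": {"device": "cpu"},
    },
}

if __name__ == "__main__":
    # The dictionary can be written out as the JSON config file.
    print(json.dumps(ds_config, indent=2))
    print("effective batch size:", effective_batch_size(ds_config, world_size=8))
```

With these assumed numbers, each optimizer step would cover 4 × 8 × 8 = 256 samples, which is why a change to any one of the three settings quietly changes the effective batch size.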

Mentioned in 1 episode

  1. Episode 009: How Two Silent Library Bugs Quietly Invalidated a Wave of Reasoning Papers

Related concepts