Concept · 6 episodes

Knowledge Distillation

Definition

Knowledge distillation trains a smaller “student” model to reproduce the outputs of a larger “teacher” model, often its full output probability distribution rather than just its top prediction. The result is a much cheaper model that retains a large fraction of the teacher’s capability. It’s a common way for labs to turn a flagship model into a lineup of smaller, deployable models.
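
To make "mimic the outputs" concrete, here is a minimal sketch of one widely used formulation: the student is penalized by the KL divergence between the teacher's and the student's temperature-softened output distributions. The function names, the example logits, and the temperature value are illustrative assumptions, not details from any episode, and real training code would compute this over batches with an autodiff framework.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Log-probabilities of a temperature-scaled softmax (numerically stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The temperature (an assumed hyperparameter) flattens both distributions
    so the student also learns from the teacher's lower-ranked choices.
    The T^2 factor is the usual correction that keeps gradient magnitudes
    comparable across temperatures.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2

# Hypothetical logits for a 3-class problem.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 1.5, 0.2]
print(distillation_loss(student, teacher))
```

The loss is zero when the student's logits match the teacher's and grows as the two distributions diverge. In practice this term is often combined with an ordinary cross-entropy loss on ground-truth labels.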

Episodes covering this