Glossary · Term

SFT

← all terms

Definition

Training a model by showing it examples of correct answers and having it imitate them.

Supervised Fine-Tuning, a post-training stage in which a model is trained on labeled demonstrations via standard next-token prediction.

Also called: supervised fine-tuning

Mentioned in 10 episodes

  1. 066
    Why Giving an AI Agent More Tools Can Make It Worse at Using a Computer
  2. 048
    How a 30B Open Model Reached Olympiad Gold With the Right Recipe
  3. 047
    When Agent Benchmarks Lie: The Harness Problem in Open-Source AI
  4. 032
    A Sticky-Note for Every Layer: Letting Transformers Remember What They Were Just Thinking
  5. 022
    Training the Model Spec Directly: An Alignment Lever Aimed at the Say-Do Gap
  6. 021
    Ten Thousand Examples Beat the Full Industrial Pipeline for Search Agents
  7. 011
    When RL Actually Teaches Agents Something New, And When It Doesn't
  8. 009
    How Two Silent Library Bugs Quietly Invalidated a Wave of Reasoning Papers
  9. 007
    Exploration Hacking: When Models Sabotage Their Own RL Training
  10. 003
    How to Pick the Best of Sixteen Coding Agent Rollouts

Related concepts