Definition
SFT (Supervised Fine-Tuning) trains a pretrained model on (input, target output) pairs to teach a specific behavior or format. It’s the simplest post-training method and the first step in most modern alignment pipelines before any RL.
Episodes covering this
Worth reading next
Papers we haven't done a deep dive on yet, but would recommend on this topic.