Definition
Post-training is everything you do to a model after pretraining: supervised fine-tuning (SFT), preference tuning with methods such as RLHF and DPO, safety training, and tool-use training. Most of what users actually experience as “model behavior” comes from post-training, not pretraining.
Episodes covering this
Worth reading next
Papers we haven't done a deep dive on yet, but would recommend on this topic.