fine-tuning · Glossary · AI Papers: A Deep Dive

Definition

Plain language

Taking a model that already knows a lot and giving it more lessons so it gets better at one specific task.

As stated in the literature

Continuing the training of a pretrained model on a smaller task-specific dataset, updating its weights to adapt the general model to a particular use case.

Also called: Fine-tuning, fine-tune, fine-tuned, finetuning, finetune, finetuned, finetunes

Why it matters: It's the standard way to adapt a generic foundation model to a specific domain or task without training from scratch.

For example, you might take a general language model and fine-tune it on a few thousand customer-support transcripts so it adopts your company's tone and product knowledge.

Heard on the show

“… GRPO is reinforcement-learning fine-tuning where the model's answers get scored by a hand-built reward function and the model is nudged toward …”

Episode 199 — Finding a Model's Hidden Behaviors Without Knowing What You're Looking For

Mentioned in 59 episodes

Related concepts

Capability Elicitation Deliberative Alignment KL Divergence LoRA Midtraining Recursive Agent Optimization Structured Trace Formatting Training Methods

Related terms

pretraining weights