Training
Fine-tuning
The process of further training a pre-trained language model on a smaller, task-specific dataset to specialize its behavior or domain knowledge.
Fine-tuning is the process of taking a pre-trained large language model and continuing to train it on a smaller, curated dataset to specialize it for a specific task, domain, or style. The model retains its general language ability while adapting weights to the new objective.
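Mechanically, this is just more gradient descent on new data, usually at a much lower learning rate than pre-training. Below is a minimal supervised fine-tuning sketch using Hugging Face's transformers Trainer; the checkpoint name, dataset file, and hyperparameters are placeholder assumptions, not prescriptions.

```python
# Minimal supervised fine-tuning sketch (assumed setup, not a recipe).
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments,
                          DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "gpt2"  # placeholder; any causal LM checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Task-specific dataset: one JSON object with a "text" field per line.
dataset = load_dataset("json", data_files="train.jsonl")["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ft-out",
        num_train_epochs=3,
        per_device_train_batch_size=4,
        learning_rate=2e-5,  # far lower than typical pre-training rates
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```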
Common reasons to fine-tune:
- Domain expertise — legal, medical, financial language
- Brand voice — consistent tone for a product
- Task specialization — function-calling reliability, structured output
- Performance — a smaller fine-tuned model can outperform a larger general-purpose one on a narrow task
Modern practice favors parameter-efficient fine-tuning (PEFT) methods such as LoRA and QLoRA, which train only small adapter matrices on top of frozen base weights. This cuts the number of trainable parameters by orders of magnitude and sharply reduces VRAM requirements, making fine-tuning practical on a single GPU. Full fine-tuning (updating all weights) is generally reserved for the largest-scale projects.
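As an illustration, here is roughly what attaching a LoRA adapter looks like with the peft library. The checkpoint and the target_modules choice are assumptions; which attention projections to adapt varies by model architecture.

```python
# Sketch of parameter-efficient fine-tuning with LoRA via peft.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder checkpoint

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank adapter matrices
    lora_alpha=16,              # scaling factor applied to adapter updates
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's attention projection; model-specific
    task_type="CAUSAL_LM",
)

# Base weights stay frozen; only the adapter matrices receive gradients.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total params
```

The resulting model drops into the same Trainer loop as before; only the adapter weights are updated and saved.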
Fine-tuning is distinct from:
- Pre-training: initial training on a massive general corpus (web text, books, code)
- RLHF / DPO: alignment from human preferences (often a stage of fine-tuning)
- Prompt engineering: changing only the input, not the model
- RAG: retrieving context at inference time, not modifying the model
For most product use cases in 2026, RAG and prompt engineering reach acceptable quality without fine-tuning. Fine-tuning becomes worth it when you have a narrow, repeatable task and at least a few hundred high-quality examples.
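For a sense of scale, here is a hedged sketch of what such a dataset might look like on disk, with hypothetical prompt/response pairs serialized into the JSONL format consumed by the training sketch above.

```python
import json

# Hypothetical dataset: a few hundred curated prompt/response pairs,
# written one JSON object per line into train.jsonl.
examples = [
    {
        "prompt": "Summarize this support ticket in one sentence:\n...",
        "response": "Customer reports login failures after the v2.3 update.",
    },
    # ... a few hundred more, each reviewed for quality and consistency
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        record = {"text": ex["prompt"] + "\n" + ex["response"]}
        f.write(json.dumps(record) + "\n")
```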