Training

Fine-tuning

The process of further training a pre-trained language model on a smaller, task-specific dataset to specialize its behavior or domain knowledge.

Fine-tuning takes a pre-trained large language model and continues training it on a smaller, curated dataset to specialize it for a particular task, domain, or style. The model retains its general language ability while its weights adapt to the new objective.
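
As a concrete illustration, here is a minimal sketch of fine-tuning as continued training: a small pre-trained causal LM updated on a handful of domain examples with the standard language-modeling loss. The model name, examples, and hyperparameters are placeholders, not a recommended recipe.

```python
# Minimal sketch: continue training a pre-trained causal LM on a tiny,
# hypothetical task dataset. Model, examples, and hyperparameters are
# placeholders, not a tuned recipe.
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for any pre-trained causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical curated examples; a real run would use hundreds or more.
examples = [
    "Q: What is a covenant clause? A: A contractual promise to act or refrain from acting.",
    "Q: What is force majeure? A: A clause excusing non-performance after extraordinary events.",
]

optimizer = AdamW(model.parameters(), lr=5e-5)  # small LR: adapt, don't overwrite
model.train()
for epoch in range(3):
    for text in examples:
        batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
        # Causal-LM objective: labels are the input ids; the model shifts
        # them internally so each position predicts the next token.
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

The only things that differ from pre-training are the data and the learning rate; the loop and the loss are the same, which is why fine-tuning is best understood as continued training.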

Common reasons to fine-tune:

  • Domain expertise — legal, medical, financial language
  • Brand voice — consistent tone for a product
  • Task specialization — function-calling reliability, structured output
  • Performance — a smaller fine-tuned model can outperform a larger general one on a narrow task

Modern practice favors parameter-efficient fine-tuning (PEFT) methods such as LoRA and QLoRA, which train only small adapter matrices on top of frozen base weights. Because the frozen parameters need no gradients or optimizer state, VRAM requirements fall sharply; the QLoRA paper, for example, reports fine-tuning a 65B-parameter model on a single 48 GB GPU. Full fine-tuning (updating all weights) is reserved for the largest-scale projects.
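
As a sketch of what PEFT looks like in code, the snippet below wraps a base model in a LoRA adapter using Hugging Face's peft library. The rank, scaling factor, and target modules are illustrative values, not tuned ones, and the target module names depend on the base architecture.

```python
# Minimal LoRA sketch with the peft library: base weights stay frozen,
# only the low-rank adapter matrices receive gradients.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model

config = LoraConfig(
    r=8,                        # adapter rank: width of the low-rank update
    lora_alpha=16,              # scaling applied to the adapter output
    lora_dropout=0.05,
    target_modules=["c_attn"],  # attention projection to adapt (GPT-2 naming)
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% trainable
```

The wrapped model drops into the same training loop as before; since only the adapter weights change, a saved checkpoint is megabytes rather than gigabytes.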

Fine-tuning is distinct from:

  • Pre-training: the initial large-scale training on a broad general corpus
  • RLHF / DPO: alignment from human preferences (often a stage of fine-tuning)
  • Prompt engineering: changing only the input, not the model
  • RAG: retrieving context at inference time, not modifying the model

For most product use cases in 2026, RAG and prompt engineering reach acceptable quality without fine-tuning. Fine-tuning becomes worth it when you have a narrow, repeatable task and at least a few hundred high-quality examples.
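
To make "a few hundred high-quality examples" concrete, one common on-disk shape is a JSONL file of chat-formatted records, one example per line. The task, file name, and messages below are all hypothetical:

```python
# Hypothetical fine-tuning dataset in the common chat JSONL shape:
# one JSON object per line, each holding a full example conversation.
import json

examples = [
    {
        "messages": [
            {"role": "system", "content": "Extract the parties from the contract text."},
            {"role": "user", "content": "This agreement is between Acme Corp and Bolt LLC."},
            {"role": "assistant", "content": '{"parties": ["Acme Corp", "Bolt LLC"]}'},
        ]
    },
    # ...a few hundred more, covering the task's real input variety
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```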
