Foundations
Hallucination
When a language model generates information that sounds plausible and confident but is factually incorrect, fabricated, or unsupported by its training data or sources.
A hallucination is output from an AI model — typically a large language model — that is fluent, confident, and factually wrong. Common forms: invented citations to nonexistent papers, fabricated quotes, made-up case law, wrong API signatures, or plausible-but-fake biographical details. The model is not “lying”; it is producing a statistically likely continuation that happens to be untrue.
Causes include: gaps in training data, conflicting information in training, ambiguous prompts that invite confabulation, and the fundamental nature of next-token prediction (which optimizes for plausibility, not truth).
Mitigation strategies (minimal code sketches for several of these follow the list):
- Retrieval-Augmented Generation (RAG): ground answers in a verified knowledge base
- Citations: require the model to cite sources from the prompt
- Reasoning models: longer chain-of-thought reduces some classes of error
- Verifier models: a second model checks the first model’s claims
- Lower temperature: less creative sampling at the cost of variety
- System prompts: explicit “say ‘I don’t know’ if uncertain”
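
A minimal sketch of the RAG idea, combined with a citation requirement. The retriever here is a toy word-overlap scorer over an in-memory knowledge base, and `generate` is a hypothetical stand-in for whatever LLM client is actually used; a real system would retrieve via embeddings or a search index.

```python
# Minimal RAG sketch: retrieve supporting passages, then build a prompt that is
# grounded in (and must cite) those passages.

KNOWLEDGE_BASE = [
    "The Eiffel Tower was completed in 1889 and is 330 metres tall.",
    "Python 3.12 was released in October 2023.",
    "Mitochondria produce most of a eukaryotic cell's ATP.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Toy retriever: rank passages by word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def grounded_prompt(question: str) -> str:
    """Instruct the model to answer only from the retrieved context and to cite it."""
    passages = retrieve(question)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using ONLY the numbered context below, citing source numbers like [1]. "
        'If the context does not contain the answer, reply "I don\'t know."\n\n'
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

print(grounded_prompt("How tall is the Eiffel Tower?"))
# answer = generate(grounded_prompt(...))  # hypothetical LLM call
```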
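The verifier pattern can reuse the same sources: a second pass (often a second model) audits the first answer claim by claim. This sketch again uses a hypothetical `generate` callable; `fake_generate` is a stub so the example runs on its own.

```python
# Verifier sketch: a second model checks the first answer's claims against the sources.

def verify(answer: str, sources: list[str], generate) -> str:
    """Ask a checking model to label each claim SUPPORTED or UNSUPPORTED."""
    source_block = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    check_prompt = (
        "For each factual claim in the answer below, say SUPPORTED or UNSUPPORTED "
        "with respect to the sources, and cite the source number.\n\n"
        f"Sources:\n{source_block}\n\nAnswer to check:\n{answer}\n"
    )
    return generate(check_prompt)

def fake_generate(prompt: str) -> str:
    """Stand-in so the sketch runs; a real system would call an LLM here."""
    return "Claim 1 (height 330 m): SUPPORTED [1]"

print(verify(
    "The Eiffel Tower is 330 metres tall.",
    ["The Eiffel Tower was completed in 1889 and is 330 metres tall."],
    fake_generate,
))
```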
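"Lower temperature" has a concrete mechanical meaning: logits are divided by the temperature before the softmax, so small values sharpen the distribution toward the most likely token. The numbers below are illustrative, not from any particular model.

```python
import math
import random

def sample_with_temperature(logits: list[float], temperature: float = 0.7) -> int:
    """Return an index sampled from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]
for t in (1.0, 0.2):
    picks = [sample_with_temperature(logits, t) for _ in range(1000)]
    print(f"temperature={t}: share of top token = {picks.count(0) / 1000:.2f}")
```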
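And a system prompt that explicitly licenses uncertainty, shown in the chat-message format most LLM APIs accept. The user question and the generic `generate` call are illustrative placeholders; the message list itself is the point.

```python
messages = [
    {
        "role": "system",
        "content": (
            "You are a careful assistant. If you are not confident an answer is "
            'correct, say "I don\'t know" rather than guessing. Never invent '
            "citations, quotes, or numbers."
        ),
    },
    {"role": "user", "content": "Who won the 1987 Tour de France?"},
]
# response = generate(messages)  # hypothetical, provider-specific call
```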
Hallucination rates have dropped substantially from the GPT-3.5 era to current frontier models, but the problem is not solved. Production AI systems still require careful evaluation, and users need to understand that LLM output is not authoritative without verification.