Foundations

Hallucination

When a language model generates information that sounds plausible and confident but is factually incorrect, fabricated, or unsupported by its training data or sources.

A hallucination is output from an AI model — typically a large language model — that is fluent, confident, and factually wrong. Common forms: invented citations to nonexistent papers, fabricated quotes, made-up case law, wrong API signatures, or plausible-but-fake biographical details. The model is not “lying”; it is producing a statistically likely continuation that happens to be untrue.

Causes include gaps in the training data, conflicting information within it, ambiguous prompts that invite confabulation, and the fundamental nature of next-token prediction, which optimizes for plausibility rather than truth.

Mitigation strategies (a sketch combining several of these follows the list):

  • Retrieval-Augmented Generation (RAG): ground answers in a verified knowledge base
  • Citations: require the model to cite sources from the prompt
  • Reasoning models: longer chain-of-thought reduces some classes of error
  • Verifier models: a second model checks the first model’s claims
  • Lower temperature: less creative sampling at the cost of variety
  • System prompts: an explicit instruction to say “I don’t know” when uncertain
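
Several of these strategies can be combined in a single call. Below is a minimal sketch, assuming the OpenAI Python SDK; the model name, the retrieve() helper, and its hard-coded passages are illustrative placeholders for a real retriever and whatever model you actually deploy.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM = (
    "Answer ONLY from the numbered passages provided. "
    "Cite passages like [1], [2] after each claim. "
    "If the passages do not contain the answer, reply exactly: I don't know."
)

def retrieve(question: str) -> list[str]:
    """Placeholder retriever: return passages from a verified knowledge base."""
    return [
        "The 2017 paper 'Attention Is All You Need' introduced the Transformer architecture.",
        "Transformers replaced recurrence with self-attention over the input sequence.",
    ]

def grounded_answer(question: str) -> str:
    passages = retrieve(question)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # illustrative model name
        temperature=0,         # less "creative" sampling
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": f"Passages:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content

print(grounded_answer("What architecture did 'Attention Is All You Need' introduce?"))
```

Grounding plus a refusal instruction does not eliminate hallucination, but it narrows the model’s answer space to what the passages support, and the required citations make unsupported claims easier to spot.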

Hallucination rates have dropped substantially from the GPT-3.5 era to current frontier models, but the problem is not solved. Production AI systems still require careful evaluation, and users need to understand that LLM output is not authoritative without verification.
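
“Careful evaluation” can start very simply. The sketch below is a deliberately crude lexical-overlap screen, not a verifier model or an entailment classifier: it flags answer sentences whose content words are mostly absent from the source passages. The function names, the threshold, and the example data are illustrative.

```python
import re

def content_words(text: str) -> set[str]:
    """Lowercase alphabetic tokens longer than 3 characters; a stand-in for real NLP."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}

def flag_unsupported(answer: str, sources: list[str], threshold: float = 0.5) -> list[str]:
    """Return answer sentences whose content words are mostly missing from the sources.

    A crude lexical screen: it catches blatant fabrications but misses
    paraphrases and subtle factual errors.
    """
    source_vocab = set().union(*(content_words(s) for s in sources)) if sources else set()
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = content_words(sentence)
        if not words:
            continue
        support = len(words & source_vocab) / len(words)
        if support < threshold:
            flagged.append(sentence)
    return flagged

sources = ["The Transformer architecture was introduced in 2017 by Vaswani et al."]
answer = ("The Transformer architecture was introduced in 2017. "
          "It won the Turing Award the same year.")
print(flag_unsupported(answer, sources))  # flags the fabricated second sentence
```

A production system would replace this with an NLI model or a second LLM pass over each claim, but even a heuristic like this catches the most blatant fabrications in regression tests.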
