An AI hallucination happens when a model produces information that is false, unverifiable, or simply invented (facts, names, quotes, citations, numbers). The model sounds confident but is wrong or has made the content up.
Types of hallucinations
- Fabrication: invented facts, people, events, or citations.
- Confabulation: plausible-sounding but incorrect explanations.
- Outdated/obsolete answers: correct once but no longer true.
- Hallucinated sources: citations or quotes that don’t actually exist.
- Semantic drift: correct concept used in wrong context (e.g., mixing two diseases).
Why hallucinations happen (short)
- Statistical prediction, not truth-seeking: models predict likely next tokens; they do not verify claims against reality.
- Training gaps: missing or biased data; rare facts poorly represented.
- Decoding choices & temperature: sampling can produce more creative but less accurate outputs (a short sampling sketch follows this list).
- Overconfidence: models aren’t well-calibrated on uncertainty.
- Prompt ambiguity: vague prompts cause the model to “fill in” details.
- No grounding: no access to external knowledge/verification unless retrieval is used.
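To make the decoding point concrete, below is a minimal sketch of temperature-scaled sampling over a toy next-token distribution. The candidate tokens and logit values are invented for illustration; real models work over huge vocabularies, but the effect of temperature on the probabilities is the same.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy candidates for the next token after "The capital of Australia is".
# The logit values are made up for illustration only.
tokens = ["Canberra", "Sydney", "Melbourne"]
logits = [3.0, 2.0, 1.0]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: " + ", ".join(f"{tok}={p:.2f}" for tok, p in zip(tokens, probs)))

# Sampling at temperature 2.0 picks the plausible-but-wrong "Sydney" far more
# often than greedy decoding would, which is one way decoding choices feed
# hallucination-like errors.
random.seed(0)
sample = random.choices(tokens, weights=softmax_with_temperature(logits, 2.0), k=10)
print(sample)
```

At a temperature of 0.2 almost all of the probability mass sits on the top candidate; at 2.0 the distribution is much flatter, so a plausible-but-wrong continuation gets sampled regularly.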
How to spot them
- Check for specifics (dates, exact figures, author names, titles) that can be verified; the extraction sketch after this list shows one way to pull them out.
- Look for links/citations — verify they exist.
- Ask the model to explain its source or provide step-by-step reasoning.
- Watch for inconsistent details across the same conversation.
- If an answer sounds too confident about obscure facts, be suspicious.
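One lightweight way to apply the first two checks is to mechanically pull out the verifiable specifics before reading further. Here is a minimal sketch in Python; the regular expressions are rough heuristics of my own and will both miss and over-match.

```python
import re

# Rough, illustrative patterns only; they will miss cases and over-match
# (e.g. the URL pattern keeps trailing punctuation).
PATTERNS = {
    "year":        r"\b(?:19|20)\d{2}\b",
    "number":      r"\b\d+(?:\.\d+)?%?",
    "url":         r"https?://\S+",
    "author_year": r"\b[A-Z][a-z]+ et al\.,? \d{4}\b",
}

def extract_checkable_claims(answer: str) -> dict:
    """Pull out the concrete, verifiable bits of a model answer for manual checking."""
    return {name: re.findall(rx, answer) for name, rx in PATTERNS.items()}

answer = ("GDP grew 3.4% in 2019, according to Smith et al., 2018 "
          "(https://example.org/report).")
print(extract_checkable_claims(answer))
```

Everything it extracts (years, figures, URLs, author-year citations) is something you can search for independently.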
How to reduce hallucinations — for users
- Ask for sources: “Cite sources and include links or page titles.”
- Ask for uncertainty: “How confident are you (0–100%)?” or “Which part is uncertain?”
- Constrain the task: ask for summaries of verifiable facts only.
- Use retrieval: when accuracy matters, combine the model with a search or your documents (RAG).
- Lower creativity: set a lower temperature or deterministic decoding if available (see the sketch after this list).
- Request chain-of-thought or stepwise checks for complex factual chains.
- Verify important facts externally — treat model output as draft, not final authority.
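As an example of several of these points combined, here is a minimal sketch that asks for sources and uncertainty in the prompt and requests deterministic decoding. It assumes the OpenAI Python client and a placeholder model name; other providers expose an equivalent temperature setting.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "When was the Hubble Space Telescope launched?\n"
    "Cite a source (title + URL) and state which parts of your answer "
    "you are uncertain about. If you cannot verify a fact, say so."
)

# temperature=0 requests (near-)deterministic decoding, which trades creativity
# for more repeatable, usually more conservative answers.
response = client.chat.completions.create(
    model="gpt-4o-mini",          # placeholder; substitute your provider's model
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
print(response.choices[0].message.content)
```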
How to reduce hallucinations — for developers / teams
- Grounding / RAG: connect the model to retrieval over trusted corpora and cite the retrieved sources (a minimal sketch follows this list).
- Tooling & verifiers: post-processing modules that fact-check or call APIs for validation.
- Calibration: train models to express uncertainty or abstain on low-confidence answers.
- RLHF + targeted fine-tuning: penalize hallucinations and reward truthful behavior.
- Constrained decoding / prompts: force formats that make hallucination easier to catch (e.g., require numbered sources).
- Human-in-the-loop: route low-confidence outputs to human reviewers.
- Monitoring & metrics: measure hallucination rate on benchmarks and production logs; use datasets like TruthfulQA, fact-check corpora, or custom tests.
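To illustrate the grounding idea, here is a minimal, self-contained sketch: retrieve the best-matching passages from a trusted corpus and build a prompt that forces the model to answer only from them and to cite passage IDs. The corpus, word-overlap scoring, and prompt wording are illustrative assumptions; a production pipeline would typically use embeddings and a vector store.

```python
# Minimal grounding sketch: retrieve supporting passages from a trusted corpus
# and constrain the model to answer only from them, citing passage IDs.

CORPUS = {
    "doc-1": "The Hubble Space Telescope was launched in April 1990.",
    "doc-2": "The James Webb Space Telescope launched on 25 December 2021.",
    "doc-3": "Hubble orbits Earth at an altitude of roughly 540 km.",
}

def retrieve(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Rank passages by naive word overlap with the query and keep the top k."""
    q_words = set(query.lower().split())
    scored = sorted(
        CORPUS.items(),
        key=lambda item: len(q_words & set(item[1].lower().split())),
        reverse=True,
    )
    return scored[:k]

def grounded_prompt(question: str) -> str:
    """Build a prompt that restricts the model to the retrieved passages."""
    passages = retrieve(question)
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return (
        "Answer using ONLY the passages below and cite passage IDs in brackets. "
        "If the passages do not contain the answer, reply 'I can't verify this.'\n\n"
        f"{context}\n\nQuestion: {question}"
    )

print(grounded_prompt("When was the Hubble Space Telescope launched?"))
```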
Practical prompt templates
- Verification-first: “Answer briefly and list three supporting sources (title + URL). If you can’t find a reliable source, say ‘I can’t verify this.’”
- Conservative reply: “Give a short answer and then a bullet list of which facts you are uncertain about and why.”
- Stepwise check: “Provide your answer in steps and label which steps need external verification.”
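If you reuse these templates often, keeping them in one place as code can help; the registry and wrapper function below are purely illustrative.

```python
# Illustrative template registry; wording copied from the list above.
TEMPLATES = {
    "verification_first": (
        "{question}\n\nAnswer briefly and list three supporting sources "
        "(title + URL). If you can't find a reliable source, say \"I can't verify this.\""
    ),
    "conservative": (
        "{question}\n\nGive a short answer and then a bullet list of which "
        "facts you are uncertain about and why."
    ),
    "stepwise": (
        "{question}\n\nProvide your answer in steps and label which steps "
        "need external verification."
    ),
}

def build_prompt(question: str, style: str = "verification_first") -> str:
    """Wrap a user question in one of the anti-hallucination templates."""
    return TEMPLATES[style].format(question=question)

print(build_prompt("Who discovered penicillin?", "conservative"))
```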
Example: detect a hallucinated citation
If the model says: “Smith et al., 2018, Journal of X showed …” — search for that paper (title, authors, year). If you can’t find it, treat it as likely hallucinated.
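Part of that lookup can be automated. The sketch below queries the public Crossref REST API (api.crossref.org) for works that loosely match the citation string; the use of the requests library and the matching approach are assumptions on my part, and an empty result only means “could not verify”, not “definitely fake”.

```python
import requests

def crossref_candidates(citation: str, rows: int = 5) -> list[dict]:
    """Query Crossref's public API for works that loosely match a citation string."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": citation, "rows": rows},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["message"]["items"]

citation = "Smith et al., 2018, Journal of X"
candidates = crossref_candidates(citation)
if not candidates:
    print("No match found: treat the citation as unverified.")
for item in candidates:
    title = (item.get("title") or ["<no title>"])[0]
    year = (item.get("issued", {}).get("date-parts") or [[None]])[0][0]
    print(f"- {title} ({year})")

# A human still has to judge whether any candidate is actually the cited paper;
# finding no candidate at all is a strong signal the citation was hallucinated.
```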
When hallucinations are most dangerous
- Medical, legal, financial, safety-critical advice — always verify with experts or authoritative sources.
- Any decision with legal/regulatory consequences or large cost.