Concept · 10 episodes

Hallucination

Definition

Hallucination is when a language model confidently produces content that is factually false or fabricated, such as nonexistent citations, invented APIs, or made-up history. It’s the most user-visible failure mode of LLMs and a major open problem for any deployment where being wrong is expensive.

Episodes covering this

  1. 072
    A Robot Made Graphene Without Help, And Caught Itself Hallucinating
    Shi, Zheng, Juan et al. · Princeton University · 29 min · May 23, 2026
  2. 070
    When Models Know the Answer But Say the Wrong Thing Anyway
    Yeom, Sok, Kim et al. · Graduate School of Data Science · 22 min · May 22, 2026
  3. 069
    When Smarter Models Forecast Worse: The Hidden Failure Mode in LLM Predictions
    Merrill, Lee, Karger · Forecasting Research Institute / UC Berkeley · 30 min · May 22, 2026
  4. 067
    An AI Just Solved a 1996 Erdős Problem—and the Simplest Agent Won
    Tsoukalas, Kovsharov, Shirobokov et al. · Google DeepMind · 31 min · May 22, 2026
  5. 062
    Treating Hallucinations as Exploits: A Gate-Based Architecture for Agent Safety
    Zhang, Zheng, Yang · Shenzhen University · 24 min · May 20, 2026
  6. 059
    Firefly's Inversion: Building Verified Tool-Call Training Data by Working Backward
    Lu, Wang, Lu et al. · Northeastern University · 22 min · May 20, 2026
  7. 058
    Why Upgrading Your AI Auditor to a Smarter Model Can Make Your System Less Safe
    Liu, Holz, Ye et al. · University of Chinese Academy of Sciences · 32 min · May 19, 2026
  8. 043
    When 'This Is False' Doesn't Stick: Why Models Learn the Lie Anyway
    Mayne, McKinney, Dubiński et al. · University of Oxford · 18 min · May 14, 2026
  9. 037
    Why Hallucination Detectors Miss Stale Facts: A Geometric Story About What Models Know But Don't Say
    Elbadry, Heakl, Zhang et al. · Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) · 27 min · May 12, 2026
  10. 025
    The Missing Gradient Term That Predicts Sycophancy in RLHF
    Gauthier, Bach, Jordan · Inria · 22 min · May 07, 2026

Worth reading next

Papers we haven't done a deep dive on yet, but would recommend on this topic.