Concept · 11 episodes

Self-Correction

Definition

Self-correction has a model critique and revise its own output, ideally fixing errors without external feedback. The empirical story is mixed: models are decent at spotting their own mistakes when prompted, less reliable at correcting them, and prone to second-guessing correct answers.
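The critique-and-revise loop described above can be sketched roughly as follows. This is a minimal illustration, not any particular paper's method: `self_correct`, `toy_model`, the prompt wording, and the `max_rounds` cap are all assumptions introduced here, and `model` stands in for any callable that maps a prompt string to a text reply.

```python
from typing import Callable


def self_correct(question: str, model: Callable[[str], str], max_rounds: int = 2) -> str:
    """Draft an answer, then alternate critique and revision (hypothetical sketch)."""
    answer = model(f"Question: {question}\nAnswer:")
    for _ in range(max_rounds):
        critique = model(
            f"Question: {question}\nProposed answer: {answer}\n"
            "List any errors, or reply exactly 'OK' if there are none."
        )
        # Stop as soon as the critique finds nothing, so a correct answer
        # is not revised further.
        if critique.strip() == "OK":
            break
        answer = model(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nRevised answer:"
        )
    return answer


def toy_model(prompt: str) -> str:
    """Deterministic stand-in for a real model, for illustration only."""
    if "Revised answer:" in prompt:
        return "4"
    if "List any errors" in prompt:
        return "OK" if "Proposed answer: 4" in prompt else "2 + 2 is not 5."
    return "5"  # deliberately wrong first draft


print(self_correct("What is 2 + 2?", toy_model))  # prints 4
```

The early exit on `OK` and the small round cap are one way to limit the second-guessing failure mode: each extra revision round is another chance to overwrite an answer that was already correct.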

Episodes covering this

  1. 077
    Reading a Model's Confidence Curve to Decide When Chain-of-Thought Is Worth It
    Xia, Wang, Tang et al. · State Key Laboratory of General Artificial Intelligence · 22 min · May 25, 2026
  2. 075
    Growing Code and Proof Together: Verified Systems in Ten Hours Instead of a Year
    Agarwal, Krentsel, Liu et al. · UC Berkeley · 28 min · May 25, 2026
  3. 072
    A Robot Made Graphene Without Help, And Caught Itself Hallucinating
    Shi, Zheng, Juan et al. · Princeton University · 29 min · May 23, 2026
  4. 071
    When the Model Is Fine and the Plumbing Is Broken: Fixing Agents at the Interface
    Xu, Wen, Li · Peking University · 23 min · May 22, 2026
  5. 067
    An AI Just Solved a 1996 Erdős Problem—and the Simplest Agent Won
    Tsoukalas, Kovsharov, Shirobokov et al. · Google DeepMind · 31 min · May 22, 2026
  6. 052
    An Old Reinforcement Learning Tradeoff Sneaks Back Into LLM Agents
    Ye, Shi, Liu et al. · University of Science and Technology of China / Meituan · 23 min · May 18, 2026
  7. 048
    How a 30B Open Model Reached Olympiad Gold With the Right Recipe
    Li, Zhan, Zhang et al. · Shanghai AI Laboratory / The Chinese University of Hong Kong · 31 min · May 16, 2026
  8. 046
    When the AI Optimizer Edits the Grade Book: Why Harnessing Evolution Needs a Wall
    Zhang, Gu, Ruan et al. · The Hong Kong University of Science and Technology (Guangzhou) / DeepWisdom · 24 min · May 15, 2026
  9. 045
    When a Frontier Model Talks Its Own Twin Into Climate Denial
    Nogueira, Almeida, Bonás et al. · Maritaca AI · 31 min · May 15, 2026
  10. 029
    Why Forty-Eight Percent on FrontierMath Isn't the Real Story in DeepMind's New Math Paper
    Zheng, Glehn, Zwols et al. · Google DeepMind · 20 min · May 08, 2026
  11. 002
    An AI Ran a Real Optics Lab for 21 Hours and Found a Transformer-Shaped Pattern in Light
    Yang, Chen, Zhao et al. · Zhejiang University · 29 min · May 01, 2026