Definition
In self-correction, a model critiques and revises its own output, ideally fixing errors without external feedback. The empirical evidence is mixed: models are decent at spotting their own mistakes when prompted, less reliable at correcting them, and prone to second-guessing answers that were already correct.
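
The critique-and-revise cycle can be sketched as a short loop. This is a minimal illustration, not a reference implementation: the `Model` interface (any function from a prompt string to a completion string), the prompt wording, the `NO ERRORS` stop signal, and the scripted `toy_model` stand-in are all assumptions made for this example.

```python
from typing import Callable

# Hypothetical interface: any function mapping a prompt string to a completion.
Model = Callable[[str], str]


def self_correct(model: Model, task: str, max_rounds: int = 2) -> str:
    """Draft an answer, then alternate critique and revision with the same model."""
    answer = model(f"Task: {task}\nAnswer:")
    for _ in range(max_rounds):
        critique = model(
            f"Task: {task}\nProposed answer: {answer}\n"
            "List any errors, or reply exactly NO ERRORS."
        )
        # Stop once the critique step accepts the answer, so an accepted
        # answer is not revised again.
        if critique.strip().upper() == "NO ERRORS":
            break
        answer = model(
            f"Task: {task}\nProposed answer: {answer}\n"
            f"Critique: {critique}\nRevised answer:"
        )
    return answer


def toy_model(prompt: str) -> str:
    """Scripted stand-in for a real model so the loop runs offline."""
    if prompt.rstrip().endswith("Revised answer:"):
        return "4"  # revision step
    if "NO ERRORS" in prompt:  # critique step
        return "NO ERRORS" if "Proposed answer: 4" in prompt else "2 + 2 is not 5."
    return "5"  # first draft, deliberately wrong


print(self_correct(toy_model, "What is 2 + 2?"))
```

With the scripted stand-in, the first draft is `5`, the critique flags it, the revision yields `4`, and the second critique accepts it. The early exit only limits unnecessary revisions; a real model can still issue a spurious critique of a correct answer, which is the second-guessing failure described above.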