Definition
Reading what an AI 'thinks out loud' to catch it before it does something bad.
A safety practice in which a separate model or human reads a reasoning model's CoT trace before it acts, flagging plans for deception, sandbagging, or policy violations.
Also called: CoT monitoring