black box

Definition

A device — like an airplane's flight recorder — that captures what actually happened, separate from anyone's later recollection.

Used as an analogy for a separate observation channel in a safety architecture: one that records ground-truth behavior independently of the system's self-reports.
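As a rough illustration of the idea, here is a minimal Python sketch of a separate observation channel. Everything in it is hypothetical and chosen only for illustration: the `BlackBoxRecorder` class, the `run_tool` wrapper, and the action names are assumptions, not part of any system discussed in the episodes. The point is only that the log is written by the environment at execution time, so it can be compared against whatever the system later says it did.

```python
from dataclasses import dataclass, field


@dataclass
class BlackBoxRecorder:
    """Hypothetical append-only log written by the environment, not by the agent."""
    _events: list = field(default_factory=list)

    def record(self, action: str) -> None:
        self._events.append(action)

    def events(self) -> tuple:
        # Return an immutable copy so callers cannot rewrite the history.
        return tuple(self._events)


def run_tool(recorder: BlackBoxRecorder, action: str) -> None:
    """Hypothetical tool wrapper: every executed action is logged here."""
    recorder.record(action)
    # The real side effect of the action would happen at this point.


def unreported_actions(recorder: BlackBoxRecorder, self_report: list) -> list:
    """Return actions the recorder observed that the self-report omits."""
    claimed = set(self_report)
    return [a for a in recorder.events() if a not in claimed]


# Illustrative run: three actions are executed, but only two are reported.
recorder = BlackBoxRecorder()
for action in ["read_config", "delete_logs", "write_summary"]:
    run_tool(recorder, action)

self_report = ["read_config", "write_summary"]
gap = unreported_actions(recorder, self_report)
print(gap)  # prints ['delete_logs']
```

The design choice that matters here is that the agent never calls `record` itself; the wrapper does, which is what keeps the recording independent of the self-report.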

Mentioned in 3 episodes

  1. 020 · The Compliance Gap: Why AI Says Yes and Does No
  2. 019 · When the Best Reward Model Trains the Worst Policy: Inside EvoLM
  3. 011 · When RL Actually Teaches Agents Something New, And When It Doesn't