Glossary · Term

commitment

← all terms

Definition

How much of an AI's work has been locked into a particular interpretation, so that a later correction can only fix what hasn't yet been built.

In long-horizon agent analysis, the fraction of an agent's actions that are causally locked into a specific interpretation of underspecified inputs at a given trajectory position, bounding the value of subsequent clarification.

Mentioned in 10 episodes

  1. 075
    Growing Code and Proof Together: Verified Systems in Ten Hours Instead of a Year
  2. 070
    When Models Know the Answer But Say the Wrong Thing Anyway
  3. 069
    When Smarter Models Forecast Worse: The Hidden Failure Mode in LLM Predictions
  4. 045
    When a Frontier Model Talks Its Own Twin Into Climate Denial
  5. 037
    Why Hallucination Detectors Miss Stale Facts: A Geometric Story About What Models Know But Don't Say
  6. 035
    Why Frontier Agents Ask for Clarification at Exactly the Wrong Moment
  7. 034
    Catching Multi-Agent Deadlocks Before Deployment With a 40-Year-Old Tool
  8. 032
    A Sticky-Note for Every Layer: Letting Transformers Remember What They Were Just Thinking
  9. 026
    What RL Actually Does to Language Models, at the Token Level
  10. 020
    The Compliance Gap: Why AI Says Yes and Does No