Glossary · Term

inference-masking

← all terms

Definition

A trick where slow bookkeeping happens during the time an AI is already waiting on a network call.

A DeltaBox technique that overlaps checkpoint work with the LLM inference round-trip the agent is already blocked on, hiding state-management cost behind perceived idle time.

Mentioned in 1 episode

  1. 068
    The OS Trick That Makes Tree Search Practical for Coding Agents