Definition
Test-time compute is the amount of computation a model spends per query at inference — sampling more candidates, running longer chains-of-thought, searching deeper. Trading inference compute for capability has been one of the biggest stories of the last couple of years.
Episodes covering this
Worth reading next
Papers we haven't done a deep dive on yet, but would recommend on this topic.