Concept · 8 episode(s)

Inference Cost

← all concepts

Definition

Inference cost is the price — in dollars, watts, or latency — of running a model once it’s trained. At scale it dwarfs training cost, and it’s often the binding constraint on which features can actually ship.

Episodes covering this

Worth reading next

Papers we haven't done a deep dive on yet, but would recommend on this topic.