Definition
AI efficiency covers techniques for reducing the compute, memory, energy, or latency cost of AI systems at a given capability level: quantization, distillation, sparsity, better serving stacks, and smarter scheduling. As models become more useful, efficiency increasingly determines what can actually be deployed, not just what is possible in principle.
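To make one of the listed techniques concrete, here is a minimal sketch of quantization, assuming a simple symmetric per-tensor int8 scheme. The weight values and the helper names `quantize_int8` and `dequantize` are hypothetical, chosen only for illustration; production systems typically use finer-grained scales and calibration data.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round each weight to the nearest integer step and clamp to the int8 range.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate float values."""
    return [scale * v for v in q]


# Hypothetical float32 weights (illustrative values, not from a real model).
weights = array("f", [0.12, -0.5, 0.33, 0.9, -0.77, 0.01])

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

bytes_fp32 = weights.itemsize * len(weights)  # storage for the float32 copy
bytes_int8 = q.itemsize * len(q)              # storage for the int8 copy
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32: {bytes_fp32} bytes, int8: {bytes_int8} bytes")
print(f"largest reconstruction error: {max_error:.5f}")
```

The int8 copy needs a quarter of the storage, and each restored weight differs from the original by at most half a quantization step (`scale / 2`). That is the trade the definition describes: lower memory cost in exchange for a small, bounded loss of precision, which may or may not affect capability depending on the model.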