Glossary · Term

SGLang

Definition

An open-source LLM serving framework built for fast, high-throughput inference, with support for structured generation and runtime optimizations.

Mentioned in 1 episode

  1. Episode 027 · When AI Agents Build the Serving Stack: A Bet on Bespoke Infrastructure