A popular open-source system for serving large language models efficiently.
An open-source, high-throughput inference engine for large language models, widely used as a default serving stack.