Definition
A wide-ranging multiple-choice benchmark covering many academic subjects.
Massive Multitask Language Understanding, a benchmark of four-option multiple-choice questions across 57 subjects, spanning STEM, the humanities, the social sciences, and professional fields, used to evaluate the breadth of knowledge in language models.
Also called: M-M-L-U