MMLU

Definition

A wide-ranging multiple-choice benchmark covering many academic subjects.

Massive Multitask Language Understanding (MMLU) is a benchmark of four-option multiple-choice questions spanning 57 subjects, from elementary mathematics to law and medicine, used to evaluate the breadth of knowledge in language models.

Also called: M-M-L-U

Mentioned in 5 episodes

  1. 074
    How a Fifteen-Hundred-Dollar Training Run Matched Llama and Gemma on Reasoning
  2. 060
    When Splitting One Model Across Three Agents Doubles Its Accuracy
  3. 058
    Why Upgrading Your AI Auditor to a Smarter Model Can Make Your System Less Safe
  4. 055
    Why LLM Judges Flip Their Verdicts When You Change the Question Format
  5. 025
    The Missing Gradient Term That Predicts Sycophancy in RLHF