Glossary · Term

GPQA

Definition

A small set of extremely hard graduate-level science questions used to test AI reasoning.

GPQA stands for the Graduate-Level Google-Proof Q&A benchmark, a set of expert-validated physics, chemistry, and biology questions. GPQA-Diamond is its hardest subset.

Also called: GPQA-Diamond

Mentioned in 2 episodes

  1. 077
    Reading a Model's Confidence Curve to Decide When Chain-of-Thought Is Worth It
  2. 058
    Why Upgrading Your AI Auditor to a Smarter Model Can Make Your System Less Safe