Glossary · Term

HealthBench

Definition

A benchmark of medical questions used to test how well models reason about health topics.

An evaluation suite of medical question-answering and clinical-reasoning tasks, used in EvoLM-style cross-domain transfer experiments to test whether learned evaluative rubrics generalize to expert health domains.

Mentioned in 1 episode

  1. 019
    When the Best Reward Model Trains the Worst Policy: Inside EvoLM