
TruthfulQA

Definition

A benchmark that evaluates language model truthfulness using questions on which common human misconceptions or biases lead to incorrect answers. It is designed to expose the false beliefs that models pick up from human-written training text.

Mentioned in 1 episode

  1. 025: The Missing Gradient Term That Predicts Sycophancy in RLHF