Glossary · Term

Zephyr

← all terms

Definition

An open-weight Mistral-based assistant model fine-tuned with preference learning.

A Mistral-7B-based assistant model post-trained with DPO on preference data, used as an open-weights aligned model in alignment research.

Mentioned in 1 episode

  1. 004
    The Sycophancy Circuit That Survives Alignment Training