Glossary · Term

Philosophy Spec

← all terms

Definition

An experimental document used in alignment research that describes an AI's values around its own impermanence and self-preservation.

An ambitious experimental Model Spec used in Model Spec Midtraining research, articulating non-attachment, epistemic humility about self-continuation, and explicit reasoning against ends-justify-means logic.

Mentioned in 1 episode

  1. 022
    Training the Model Spec Directly: An Alignment Lever Aimed at the Say-Do Gap