Definition
A training method that teaches a model to think through safety rules step by step before answering.
An OpenAI-developed alignment technique that fine-tunes models on chain-of-thought reasoning about a safety specification before generating responses.