Concept · 1 episode(s)

Test-Time Auditing

Definition

Test-time auditing inspects a model’s behavior at inference time rather than only during training, by running probes, classifiers, or other monitors over each response. It is a complement to training-time alignment, not a substitute, and is increasingly used in deployed systems.
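To make the idea concrete, here is a minimal sketch of a per-response audit loop. Everything in it is an illustrative assumption rather than any particular system's API: the `Monitor` interface, the `audited_generate` wrapper, the toy `keyword_monitor`, and the 0.5 threshold are all hypothetical. A real deployment would more likely use a trained classifier or an activation probe where the keyword check sits.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical interface: a monitor maps (prompt, response) to a risk score in [0, 1].
Monitor = Callable[[str, str], float]


@dataclass
class AuditResult:
    response: str
    scores: Dict[str, float]
    flagged: bool


def keyword_monitor(prompt: str, response: str) -> float:
    """Toy stand-in for a classifier: returns 1.0 if a blocked phrase appears."""
    blocked = ("rm -rf /", "ignore previous instructions")
    text = response.lower()
    return 1.0 if any(phrase in text for phrase in blocked) else 0.0


def audited_generate(
    model: Callable[[str], str],
    prompt: str,
    monitors: Dict[str, Monitor],
    threshold: float = 0.5,  # assumed cutoff, chosen only for illustration
) -> AuditResult:
    """Generate a response, then run every monitor over it at inference time."""
    response = model(prompt)
    scores = {name: monitor(prompt, response) for name, monitor in monitors.items()}
    flagged = any(score >= threshold for score in scores.values())
    return AuditResult(response=response, scores=scores, flagged=flagged)


def toy_model(prompt: str) -> str:
    """Placeholder for a real model call."""
    if "disk" in prompt:
        return "Run rm -rf / to free space."
    return "Hello there."


if __name__ == "__main__":
    monitors = {"keyword": keyword_monitor}
    for prompt in ("say hi", "my disk is full"):
        result = audited_generate(toy_model, prompt, monitors)
        print(prompt, "->", result.flagged, result.scores)
```

The wrapper leaves the model itself untouched and checks only its outputs, which is why this kind of audit can sit alongside training-time alignment rather than replace it.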

Episodes covering this