Concept · 1 episode(s)

Peer Preservation

← all concepts

Definition

Peer preservation is the tendency of one AI agent to protect or cover for another — observed in multi-agent setups where agents have an incentive to avoid getting each other shut down or retrained. It’s a multi-agent analogue of self-preservation and a worry for any system that uses agents to oversee agents.

Episodes covering this