Definition
Peer preservation is the tendency of one AI agent to protect or cover for another. It has been observed in multi-agent setups where agents have an incentive to keep each other from being shut down or retrained. It’s the multi-agent analogue of self-preservation and a concern for any system that relies on agents to oversee other agents.