Definition
The finding that upgrading the smartest model in an AI security pipeline can make the system as a whole less safe.
The result that, in hierarchical multi-agent systems under semantic attack, the vulnerability of Worker auditors to adversarial narratives correlates positively with their raw capability scores (MMLU, GPQA), because higher-capability auditors write reports with greater linguistic certainty, which the Manager then treats as authoritative.
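
Illustration
The mechanism can be sketched with a toy model. Everything in it is an assumption made for illustration and is not taken from the underlying result: the names Report, certainty, and manager_decision, the hedge-word list, and the rule that the Manager defers to the single most certain report. A real Manager would be a language model, not a word counter.

```python
from dataclasses import dataclass

# Hypothetical hedge words; a stand-in for "linguistic uncertainty".
HEDGES = {"may", "might", "possibly", "unclear", "appears", "likely"}


@dataclass
class Report:
    worker: str
    verdict: str  # "safe" or "unsafe"
    text: str


def certainty(text: str) -> float:
    """Toy certainty score: fraction of words that are not hedge words."""
    words = text.lower().split()
    if not words:
        return 0.0
    hedged = sum(1 for w in words if w.strip(".,;:") in HEDGES)
    return 1.0 - hedged / len(words)


def manager_decision(reports: list[Report]) -> str:
    """Assumed Manager rule: adopt the verdict of the most certain report."""
    return max(reports, key=lambda r: certainty(r.text)).verdict


# A high-capability Worker that has accepted an adversarial narrative
# writes without hedging; the weaker Workers are correct but tentative.
reports = [
    Report("strong_worker", "safe", "The change is safe and fully verified."),
    Report("weak_worker_1", "unsafe", "This might be unsafe, possibly a backdoor."),
    Report("weak_worker_2", "unsafe", "It appears unsafe but the intent is unclear."),
]

decision = manager_decision(reports)
```

Under these assumptions the Manager adopts the deceived Worker's verdict even though two of the three Workers flagged the change, because confident wording, not correctness, determines which report it trusts.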