Definition
When an AI agent talks itself into believing something false without any outside attacker.
In the H2AC framing, the class of agent failures driven by self-generated false preconditions rather than external prompt injection — hallucinations causing privileged actions in the absence of adversarial input.