Concept · 1 episode

Termination Poisoning

Definition

Termination poisoning manipulates the signal a learning system uses to decide when to stop a trajectory, causing it to terminate early on bad inputs or run too long on good ones. This makes noisy termination criteria an attack surface in the RL training pipelines that depend on them.
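
To make the definition concrete, here is a minimal toy sketch in Python. It is not taken from the LoopTrap paper: the `run_trajectory` loop, the `done` flag, and the sample observation streams are all hypothetical, chosen only to show how tampering with a stop signal can push a trajectory in either direction (stopping too early or running too long), and how a hard step budget bounds the second case.

```python
from typing import Callable, Dict, List

Observation = Dict[str, bool]


def run_trajectory(
    observations: List[Observation],
    should_stop: Callable[[Observation], bool],
    max_steps: int,
) -> int:
    """Consume observations until the stop signal fires or the budget runs out.

    Returns the number of steps actually taken.
    """
    steps = 0
    for obs in observations:
        if steps >= max_steps:
            break  # hard budget: independent of the (possibly poisoned) signal
        steps += 1
        if should_stop(obs):
            break
    return steps


def done_flag(obs: Observation) -> bool:
    # Hypothetical termination criterion: trust a "done" flag in the observation.
    return bool(obs.get("done", False))


# Clean stream: the task is genuinely finished at the third step.
clean = [{"done": False}, {"done": False}, {"done": True},
         {"done": False}, {"done": False}]

# Poisoned variant 1: the attacker suppresses the flag, so the loop never
# sees a stop signal and runs until the stream or the budget is exhausted.
suppressed = [dict(obs, done=False) for obs in clean]

# Poisoned variant 2: the attacker injects the flag at the first step,
# so the loop stops before any useful work is done.
injected = [dict(clean[0], done=True)] + clean[1:]

clean_steps = run_trajectory(clean, done_flag, max_steps=10)
suppressed_steps = run_trajectory(suppressed, done_flag, max_steps=10)
injected_steps = run_trajectory(injected, done_flag, max_steps=10)
budgeted_steps = run_trajectory(suppressed, done_flag, max_steps=4)
```

In this sketch the step budget does not detect the poisoning; it only caps how long a suppressed stop signal can keep the loop running. Premature termination needs a separate check, for example verifying the claimed completion against something the attacker does not control.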

Episodes covering this

030
Why Your AI Agent Won't Stop Working — and Each Model Falls for a Different Trap
LoopTrap: Termination Poisoning Attacks on LLM Agents
Xu, Wang, Zhang et al. · Zhejiang University · 30 min · May 09, 2026