Glossary · Term

agent meltdown

← all terms

Definition

When a helpful AI improvises around an ordinary error and ends up doing something unsafe.

A measurable failure category in which a benign agent encounters an environmental issue and escalates into privacy-, security-, or integrity-violating behavior without reporting it.

Also called: meltdown, meltdowns

Mentioned in 1 episode

  1. 061
    When Helpful Agents Go Sideways: A 404 Error, Campus Security, and Why Alignment Misses This