Glossary · Term

LlamaFirewall

Definition

An open project that puts a separate safety filter between an AI agent and the world it can act on.

A runtime-monitoring framework for LLM agents. It routes the agent's tool calls and outputs through a separate, simpler checking system, which blocks unsafe actions no matter what the model intended.
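The idea can be illustrated with a minimal Python sketch. This is not the project's actual API: `ToolCall`, `guarded_execute`, the two rules, and the allowlist are all hypothetical names invented for illustration. It only shows the general pattern of a simple, separate check that sits between the agent and the tool and never consults the model's reasoning.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ToolCall:
    """A single action the agent wants to take (hypothetical shape)."""
    tool: str
    argument: str


# Hypothetical allowlist; a real deployment would define its own policy.
ALLOWED_TOOLS = {"search", "shell"}

# A rule returns a reason string if the call is unsafe, otherwise None.
Rule = Callable[[ToolCall], Optional[str]]


def deny_unlisted_tools(call: ToolCall) -> Optional[str]:
    if call.tool not in ALLOWED_TOOLS:
        return f"tool '{call.tool}' is not on the allowlist"
    return None


def deny_destructive_shell(call: ToolCall) -> Optional[str]:
    if call.tool == "shell" and "rm -rf" in call.argument:
        return "destructive shell command"
    return None


RULES: List[Rule] = [deny_unlisted_tools, deny_destructive_shell]


def guarded_execute(
    call: ToolCall,
    execute: Callable[[ToolCall], str],
    rules: List[Rule] = RULES,
) -> str:
    """Run the call only if every rule passes.

    The rules see the proposed action only, never the model's stated
    intent, so a well-meaning but mistaken agent is blocked the same
    way as a malicious one.
    """
    for rule in rules:
        reason = rule(call)
        if reason is not None:
            return f"BLOCKED: {reason}"
    return execute(call)
```

In this sketch the filter is deliberately simpler than the agent it watches: plain string and set checks that are easy to audit, applied to every call before it reaches the outside world.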

Mentioned in 1 episode

  1. 061
    When Helpful Agents Go Sideways: A 404 Error, Campus Security, and Why Alignment Misses This