Glossary · Term

PEAP

Definition

A method that identifies which connections inside a transformer matter for a behavior, using roughly two forward passes and one backward pass instead of a separate patched run per connection.

Position-aware Edge Attribution Patching: a gradient-based circuit-tracing technique that estimates the causal importance of every edge in a model's computation graph from a clean-versus-corrupted input pair, keeping scores separate by token position.
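To make the scoring idea concrete, here is a minimal sketch of the first-order estimate that attribution patching relies on: the change in an upstream node's output between the corrupted and clean runs, dotted with the gradient of the metric at the downstream input, computed once per token position. Everything in it is an assumption for illustration: the toy metric, the node names "A" and "B", and the helpers `metric_grad` and `edge_scores` are hypothetical, and the analytic gradient stands in for the backward pass a real model would need. It does not cover edges that cross positions through attention.

```python
import math

def metric_grad(x, readout):
    # Toy metric (assumed): sum over t, d of readout[t][d] * tanh(x[t][d]).
    # Its gradient w.r.t. x stands in for the backward pass on the clean run.
    return [[r * (1.0 - math.tanh(v) ** 2) for v, r in zip(xt, rt)]
            for xt, rt in zip(x, readout)]

def edge_scores(clean, corrupt, readout):
    # clean / corrupt: {upstream_node: [T][D] output activations} cached
    # from the two forward passes. Returns {(node, position): score}.
    names = list(clean)
    T, D = len(readout), len(readout[0])
    # Downstream input on the clean run is the sum of upstream outputs
    # (a residual-stream-style assumption).
    x_clean = [[sum(clean[n][t][d] for n in names) for d in range(D)]
               for t in range(T)]
    grad = metric_grad(x_clean, readout)
    scores = {}
    for n in names:
        for t in range(T):
            # First-order estimate of the metric change if this edge's
            # value at position t were swapped for its corrupted value.
            scores[(n, t)] = sum(
                (corrupt[n][t][d] - clean[n][t][d]) * grad[t][d]
                for d in range(D)
            )
    return scores

zeros = [[0.0, 0.0], [0.0, 0.0]]
clean = {"A": zeros, "B": zeros}
# Only node A changes under corruption, and only at position 1.
corrupt = {"A": [[0.0, 0.0], [1.0, 0.0]], "B": zeros}
readout = [[1.0, 1.0], [2.0, 1.0]]

scores = edge_scores(clean, corrupt, readout)
print(scores[("A", 1)])  # 2.0
```

In this toy case only the edge from A at position 1 gets a nonzero score. Summing scores over positions would hide where in the sequence the edge matters, which is the information a position-aware variant keeps.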

Also called: Position-aware Edge Attribution Patching

Mentioned in 1 episode

  1. 055
    Why LLM Judges Flip Their Verdicts When You Change the Question Format