Definition
A method for identifying which connections inside a transformer matter for a given behavior, using two forward passes (one clean, one corrupted) and a backward pass instead of a separate run per connection.
Position-aware Edge Attribution Patching: a gradient-based circuit-tracing technique that estimates the causal importance of every edge in a model's computation graph, distinguished by token position, from a pair of clean and corrupted inputs.
Also called: Position-aware Edge Attribution Patching
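
Example (illustrative sketch, not the reference implementation): the gradient-based estimate can be shown on a toy single-edge "model" in plain Python. Everything here is an assumption made for illustration: the scalar activations, the tanh metric, the function names, and the sign convention (corrupted minus clean). In a real transformer the activations are vectors and the gradient comes from an actual backward pass. The score for the edge at each token position is the change in the sender's output between runs, multiplied by the gradient of the metric with respect to the receiver's input on the clean run.

```python
import math


def metric(receiver_inputs, readout):
    """Toy metric: readout-weighted sum of tanh(receiver input) over positions."""
    return sum(c * math.tanh(x) for c, x in zip(readout, receiver_inputs))


def metric_grad(receiver_inputs, readout):
    """Analytic d(metric)/d(receiver input) per position.

    Stands in for the backward pass of a real model.
    """
    return [c * (1.0 - math.tanh(x) ** 2) for c, x in zip(readout, receiver_inputs)]


def positional_edge_scores(clean_out, corrupt_out, other_clean, readout):
    """First-order estimate of the metric change from patching the edge per position.

    clean_out / corrupt_out: sender output at each position on each run.
    other_clean: everything else feeding the receiver on the clean run.
    """
    receiver_in = [s + o for s, o in zip(clean_out, other_clean)]
    grads = metric_grad(receiver_in, readout)
    return [(zc - z) * g for z, zc, g in zip(clean_out, corrupt_out, grads)]


def exact_patch_effect(clean_out, corrupt_out, other_clean, readout, pos):
    """Ground truth for comparison: actually patch one position and re-run."""
    base = [s + o for s, o in zip(clean_out, other_clean)]
    patched = list(base)
    patched[pos] = corrupt_out[pos] + other_clean[pos]
    return metric(patched, readout) - metric(base, readout)


# Hypothetical activations for three token positions.
clean_out = [0.5, -0.2, 0.1]
corrupt_out = [0.4, 0.3, 0.1]
other_clean = [0.0, 0.1, 0.2]
readout = [1.0, 1.0, 1.0]

scores = positional_edge_scores(clean_out, corrupt_out, other_clean, readout)
exact = [
    exact_patch_effect(clean_out, corrupt_out, other_clean, readout, p)
    for p in range(len(clean_out))
]

for pos, (s, e) in enumerate(zip(scores, exact)):
    print(f"position {pos}: estimated {s:+.4f}, exact {e:+.4f}")
```

The estimate is a linear approximation, so it tracks the exact patching effect closely when the clean and corrupted activations are near each other and drifts as the gap grows. Keeping a separate score per position, rather than summing over positions, is what lets the same edge count as important at one token and irrelevant at another.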