Definition
A test where researchers copy a model's mid-computation state from one prompt into another to see if it changes the answer.
An interpretability intervention that transplants intermediate activations from one prompt format (e.g., rating) into another (e.g., classification) to test whether judgment is portable across output formats.
Also called: FTI