Definition
A training-free routing system that decides whether a model should reason out loud or just answer directly.
An entropy-dynamics routing method that probes the first 64 tokens of decoding, computes the cumulative entropy, trend, and smoothness of the per-token entropy over that window, and routes among Chain-of-Thought, Standard, and Direct decoding modes based on hand-tuned or learned thresholds.
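A minimal sketch of how such a router could be wired up, under stated assumptions. The definition above fixes only the 64-token probe window, the three signals, and the three modes. Everything else here is hypothetical and chosen for illustration: the function names, the use of a least-squares slope for trend, the formula `1 / (1 + mean absolute step)` for smoothness, the decision rule, and all threshold values.

```python
import math
from typing import Sequence

PROBE_TOKENS = 64  # probe window length, taken from the definition


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_features(entropies: Sequence[float]) -> dict[str, float]:
    """Summarize per-token entropies over the probe window.

    The trend and smoothness formulas are assumptions made for this sketch.
    """
    h = list(entropies[:PROBE_TOKENS])
    n = len(h)
    if n == 0:
        raise ValueError("need at least one entropy value")

    cumulative = sum(h)
    trend = 0.0
    mean_step = 0.0
    if n > 1:
        # Trend: least-squares slope of entropy against token index.
        mean_x = (n - 1) / 2
        mean_y = cumulative / n
        num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(h))
        den = sum((i - mean_x) ** 2 for i in range(n))
        trend = num / den
        # Mean absolute change between consecutive tokens.
        mean_step = sum(abs(h[i] - h[i - 1]) for i in range(1, n)) / (n - 1)

    # Smoothness in (0, 1]: equals 1.0 when entropy never changes.
    smoothness = 1.0 / (1.0 + mean_step)
    return {"n": float(n), "cumulative": cumulative,
            "trend": trend, "smoothness": smoothness}


def route(features: dict[str, float],
          low: float = 0.5,             # hypothetical threshold
          high: float = 2.0,            # hypothetical threshold
          max_trend: float = 0.01,      # hypothetical threshold
          min_smoothness: float = 0.5,  # hypothetical threshold
          ) -> str:
    """Pick a decoding mode: 'cot', 'standard', or 'direct'."""
    mean_h = features["cumulative"] / features["n"]
    # High, rising, or erratic entropy suggests the model is uncertain.
    if (mean_h > high or features["trend"] > max_trend
            or features["smoothness"] < min_smoothness):
        return "cot"
    # Low, stable entropy suggests a direct answer is enough.
    if mean_h < low:
        return "direct"
    return "standard"
```

In use, the per-token entropies would come from applying `token_entropy` to each of the first 64 next-token distributions produced during decoding, for example `route(entropy_features([token_entropy(p) for p in dists]))`, where `dists` is a hypothetical list of probability vectors. Because the thresholds are plain parameters, they can be hand-tuned or fitted on held-out data without changing the model itself.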