Literature review · 6 episodes

Multi-Agent Systems and Coordination

Coordination, not reasoning, is the dominant failure mode

Wiring LLM protocol design into TLA+ model checking turns counterexample traces into evidence-driven bug reports that the model can repair against, converging in four iterations across 48 tasks E034. The most interesting result is the capability buffer: when the underlying model is downgraded, verified protocols lose only ~15 points of completion, while prompt-only approaches lose ~33. Verification functions as an operational lever, not correctness theater. Cross-section review work shows the same fragility on the consumption side: detection of cross-section contradictions collapses by 74–100% once long documents are partitioned across worker agents E087.
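The check-and-repair loop described above can be illustrated with a toy stand-in. The sketch below is not TLA+ or TLC and is not the pipeline from E034: it is a hand-rolled breadth-first state explorer over a two-agent lock protocol, and every name in it (`naive_steps`, `atomic_steps`, `repair_loop`, the state layout) is hypothetical. A fixed list of candidate protocols stands in for the LLM's successive repair attempts; in a real system the counterexample trace would be fed back to the model to produce the next draft.

```python
from collections import deque

# State layout (an assumption for this sketch): (agent0_pc, agent1_pc, lock_held)
INIT = ("idle", "idle", False)

def move(state, agent, pc, lock):
    """Return a copy of state with one agent's program counter and the lock updated."""
    pcs = [state[0], state[1]]
    pcs[agent] = pc
    return (pcs[0], pcs[1], lock)

def naive_steps(state):
    """Buggy draft: checking the lock and taking it are two separate steps."""
    for agent in (0, 1):
        pc = state[agent]
        if pc == "idle" and not state[2]:
            yield f"agent{agent} sees lock free", move(state, agent, "checked", state[2])
        elif pc == "checked":
            yield f"agent{agent} takes lock", move(state, agent, "critical", True)
        elif pc == "critical":
            yield f"agent{agent} releases lock", move(state, agent, "idle", False)

def atomic_steps(state):
    """Repaired draft: the check and the acquire happen in one atomic step."""
    for agent in (0, 1):
        pc = state[agent]
        if pc == "idle" and not state[2]:
            yield f"agent{agent} test-and-sets lock", move(state, agent, "critical", True)
        elif pc == "critical":
            yield f"agent{agent} releases lock", move(state, agent, "idle", False)

def mutual_exclusion(state):
    """Safety invariant: the two agents are never both in the critical section."""
    return not (state[0] == "critical" and state[1] == "critical")

def check(steps, init, invariant):
    """Explore all reachable states breadth-first.

    Returns None if the invariant holds everywhere, otherwise a shortest
    counterexample as a list of (action, resulting_state) pairs.
    """
    parent = {init: None}
    queue = deque([init])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while parent[state] is not None:
                prev, action = parent[state]
                trace.append((action, state))
                state = prev
            return list(reversed(trace))
        for action, nxt in steps(state):
            if nxt not in parent:
                parent[nxt] = (state, action)
                queue.append(nxt)
    return None

def repair_loop(candidates, init, invariant):
    """Check candidate protocols in order and stop at the first that passes.

    The fixed candidate list is a placeholder for an LLM that would revise
    the protocol using the previous counterexample trace.
    """
    history = []
    for name, steps in candidates:
        trace = check(steps, init, invariant)
        history.append((name, trace))
        if trace is None:
            return name, history
    return None, history

accepted, history = repair_loop(
    [("naive", naive_steps), ("atomic", atomic_steps)], INIT, mutual_exclusion
)
for name, trace in history:
    if trace is None:
        print(f"{name}: invariant holds on all reachable states")
    else:
        print(f"{name}: invariant violated after {len(trace)} steps")
        for action, state in trace:
            print("   ", action, "->", state)
```

The point of the sketch is the shape of the feedback: a failed check returns a concrete interleaving that reaches the bad state, which is a far more specific repair signal than a prose description of the bug.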

The regime shift is that LLMs can now cheaply draft the formal spec that used to be the bottleneck, which makes 40-year-old verification tools newly practical for coordinating fast-moving agent stacks.

Communication as the substrate worth training

Two frozen models bridged through a tiny per-layer hidden-state coupling spontaneously invent a structured protocol from next-token loss alone, lifting arithmetic accuracy from 36% to 96.5% E040. A more economical variant splits a fixed parameter budget across three agents talking in plain English and nearly doubles physics-exam accuracy at the same compute E060. Training only a small communication hub via RL, with the agents themselves frozen, lifts per-agent accuracy from 36% to 58% on hard search tasks E083. The same paper shows the catch: faithful summarisation can manufacture confirmation bias that no individual agent introduced.
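To make the idea of per-layer hidden-state coupling concrete, here is a minimal forward-pass sketch. It is an illustration under stated assumptions, not the architecture from E040: the additive linear bridges, the toy `tanh` residual layers, the dimensions, and all function names are hypothetical, and training of the bridges is omitted entirely. What it shows is the structural split: the layer weights of both models stay fixed, and the only parameters that could be trained are the small bridge matrices that pass each model's hidden state to the other at every layer.

```python
import math
import random

def matvec(matrix, vec):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def random_matrix(rng, rows, cols, scale):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def frozen_layer(weights, h):
    """One frozen toy layer: residual update h + tanh(W h). W is never trained."""
    return [x + math.tanh(y) for x, y in zip(h, matvec(weights, h))]

def solo_forward(layers, h):
    """Run one model on its own, with no coupling."""
    for weights in layers:
        h = frozen_layer(weights, h)
    return h

def coupled_forward(layers_a, layers_b, bridges_ab, bridges_ba, h_a, h_b):
    """Run both frozen stacks in lockstep, exchanging hidden states at every layer.

    bridges_ab[l] projects A's state into B; bridges_ba[l] projects B's state into A.
    In this sketch the bridges are the only parameters that would be trainable.
    """
    for w_a, w_b, to_b, to_a in zip(layers_a, layers_b, bridges_ab, bridges_ba):
        msg_to_a = matvec(to_a, h_b)  # message computed from B's pre-layer state
        msg_to_b = matvec(to_b, h_a)  # message computed from A's pre-layer state
        h_a = [x + m for x, m in zip(frozen_layer(w_a, h_a), msg_to_a)]
        h_b = [x + m for x, m in zip(frozen_layer(w_b, h_b), msg_to_b)]
    return h_a, h_b

rng = random.Random(0)
DIM, DEPTH = 4, 3
layers_a = [random_matrix(rng, DIM, DIM, 0.5) for _ in range(DEPTH)]
layers_b = [random_matrix(rng, DIM, DIM, 0.5) for _ in range(DEPTH)]
zero_bridges = [[[0.0] * DIM for _ in range(DIM)] for _ in range(DEPTH)]
bridges_ab = [random_matrix(rng, DIM, DIM, 0.1) for _ in range(DEPTH)]
bridges_ba = [random_matrix(rng, DIM, DIM, 0.1) for _ in range(DEPTH)]

x_a = [1.0, 0.0, -1.0, 0.5]
x_b = [0.2, -0.3, 0.7, 0.0]

solo_a = solo_forward(layers_a, x_a)
off_a, off_b = coupled_forward(layers_a, layers_b, zero_bridges, zero_bridges, x_a, x_b)
on_a, on_b = coupled_forward(layers_a, layers_b, bridges_ab, bridges_ba, x_a, x_b)

print("zero bridges reproduce the uncoupled model:", off_a == solo_a)
print("nonzero bridges change model A's output:", on_a != solo_a)
```

Zero-valued bridges leave both models exactly as they were, so any behavioural change in a setup like this has to come from what the bridges learn to transmit.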

The punchline is that organisation itself is a scaling axis distinct from model size, and the communication layer is where the gains and the new failure modes both live.

Semantic collapse and capability paradoxes

Three LLMs conversing without a task for a thousand rounds keep growing their vocabulary while their semantic content barely moves, staying anchored about 3× more strongly than human Reddit threads. Twelve intervention categories all fail to break the pattern, and induction-head dynamics are implicated mechanistically E073. The Data Processing Inequality explains in principle why no closed-loop intervention can recover lost diversity. Capability paradoxes go the other direction: swapping a small auditor model for a frontier reasoner in a multi-agent system takes attack success from 1-in-5 to 19-in-20 (20% to 95%) through 'confidence laundering,' and a heterogeneous auditor-pair defense brings it down to 2% E058. A different swarm architecture, which splits agents into Searcher and Navigator roles over shared evidence graphs, recovers the parallel scaling that majority voting could not, achieving 1200:1 compression of agent state E051.
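The Data Processing Inequality argument can be written out briefly. This is the generic textbook statement, not necessarily the formalization used in E073, and the variable assignments are assumptions made here for illustration: X is the original source of diversity, Y_t is the conversation transcript after round t, and Z is anything an intervention computes from the transcript alone.

```latex
% Assumption: each round depends on X only through the previous transcript.
X \to Y_1 \to Y_2 \to \cdots \to Y_t
\quad\Longrightarrow\quad
I(X; Y_{t+1}) \le I(X; Y_t)

% Assumption: a closed-loop intervention sees only the transcript, so Z = f(Y_t).
X \to Y_t \to Z
\quad\Longrightarrow\quad
I(X; Z) \le I(X; Y_t)
```

Under that reading, information about X can only shrink or stay constant from round to round, and an intervention that only processes the transcript cannot restore what the transcript has already lost. New diversity would have to enter from outside the loop.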

The parallel-sampling story (more agents, more rollouts) increasingly looks like a dead end without architectural moves that genuinely diversify what the system is exploring.

Episodes anchoring this topic

040-the-bicameral-model-bidirectional-hidden-state-coupling-betw
Showed that two frozen models can invent their own communication protocol through hidden-state coupling.
034-tracefix-repairing-agent-coordination-protocols-with-tla-cou
Wired multi-agent protocol design to TLA+ model checking and demonstrated a capability buffer effect.
073-multi-llm-systems-exhibit-robust-semantic-collapse
Documented a robust semantic-collapse phenomenon and the induction-head mechanism behind it.
058-the-capability-paradox-how-smarter-auditors-make-multi-agent
Showed how stronger auditors paradoxically increase vulnerability via confidence laundering.
087-a-universal-cliff-and-a-design-fingerprint-cross-section-def
Documented a universal partitioning cliff for cross-section defects and Anthropic-specific signatures.
083-agentfugue-agent-scaling-for-long-horizon-tasks-through-coll
Made the case that the communication layer, not the agents, is the right target for RL training.