RAO · Glossary · AI Papers: A Deep Dive

Definition

Plain language

A training method that teaches an AI agent to spawn copies of itself for sub-tasks and learn from the whole tree of attempts.

As stated in the literature

Recursive Agent Optimization, a training framework in which a single shared policy learns to recursively delegate sub-tasks to child agent instances, with per-node local rewards combining own-task success with average child success.

Also called: Recursive Agent Optimization

Why it matters: Teaching one policy to recursively delegate is a path to agents that scale gracefully to long, branching tasks instead of trying to cram everything into a single conversation.

For example, a research agent stuck on a hard task can spawn three child copies to investigate sub-questions. Each child is rewarded for its own sub-question, and the parent's reward blends its own result with the average success of its children.
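The per-node reward in the definition can be made concrete with a small sketch. The glossary entry only says that each node's local reward combines own-task success with average child success; the exact formula is not given here. The code below therefore assumes a simple weighted blend, and the names `Node`, `local_reward`, `tree_rewards`, and the `child_weight` parameter are hypothetical, chosen for illustration rather than taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One agent instance in the delegation tree (hypothetical structure)."""
    name: str
    own_success: float  # 1.0 if this node solved its own task, 0.0 otherwise
    children: List["Node"] = field(default_factory=list)


def local_reward(node: Node, child_weight: float = 0.5) -> float:
    """Blend own-task success with the mean own-task success of direct children.

    ASSUMPTION: a convex combination with weight `child_weight`; the source
    does not specify how the two terms are combined. A node with no children
    is assumed to receive just its own-task success.
    """
    if not node.children:
        return node.own_success
    avg_child = sum(c.own_success for c in node.children) / len(node.children)
    return (1.0 - child_weight) * node.own_success + child_weight * avg_child


def tree_rewards(root: Node, child_weight: float = 0.5) -> Dict[str, float]:
    """Compute a local reward for every node in the tree (iterative walk)."""
    rewards: Dict[str, float] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        rewards[node.name] = local_reward(node, child_weight)
        stack.extend(node.children)
    return rewards


# The research-agent scenario: a parent that succeeds, with three children,
# two of which answer their sub-questions.
root = Node("parent", 1.0, [
    Node("child_a", 1.0),
    Node("child_b", 0.0),
    Node("child_c", 1.0),
])
rewards = tree_rewards(root)
```

In this sketch every node gets its own reward signal, which is what lets a single shared policy be trained on the whole tree of attempts rather than only on the root's final answer. How much weight the children's average should carry is a design choice left open here.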

Heard on the show

“The paper, in full, is ‘Recursive Agent Optimization.’”

Episode 028 — Teaching a Model to Hire Copies of Itself: Recursive Agent Optimization

Mentioned in 1 episode

028
Teaching a Model to Hire Copies of Itself: Recursive Agent Optimization

Related terms

agent policy