Definition
Agent scaffolding is the control flow wrapped around a language model that turns it into an agent: the prompt structure, tool-call loop, retry logic, planning steps, and memory plumbing. Two agents built on the same base model can perform very differently depending on scaffolding, which makes it a major confound in capability evaluations.
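To make the pieces in that definition concrete, here is a minimal sketch of a scaffold, not any particular framework's implementation. Everything in it is hypothetical and chosen for illustration: the `Scaffold` class, the `CALL <tool> <arg>` / `FINAL <answer>` reply format, and the `scripted_model` stub that stands in for a real language model. It shows the prompt structure, the tool-call loop, a retry budget for malformed replies, and a simple list used as memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Assumption: a "model" is any callable from prompt text to reply text,
# and a "tool" is any callable from an argument string to a result string.
Model = Callable[[str], str]
Tool = Callable[[str], str]


@dataclass
class Scaffold:
    """Hypothetical scaffold: prompt structure, tool loop, retries, memory."""
    model: Model
    tools: Dict[str, Tool]
    max_steps: int = 5
    max_retries: int = 2
    memory: List[str] = field(default_factory=list)

    def build_prompt(self, task: str) -> str:
        # Prompt structure: task, available tools, and the history so far.
        history = "\n".join(self.memory)
        return (
            f"Task: {task}\n"
            f"Tools: {', '.join(sorted(self.tools))}\n"
            f"History:\n{history}\n"
            "Reply with 'CALL <tool> <arg>' or 'FINAL <answer>'."
        )

    def run(self, task: str) -> Optional[str]:
        retries = 0
        for _ in range(self.max_steps):
            reply = self.model(self.build_prompt(task)).strip()
            if reply.startswith("FINAL "):
                return reply[len("FINAL "):]
            parts = reply.split(" ", 2)
            if len(parts) == 3 and parts[0] == "CALL" and parts[1] in self.tools:
                # Tool-call loop: run the tool and record the result in memory.
                result = self.tools[parts[1]](parts[2])
                self.memory.append(f"{reply} -> {result}")
                continue
            # Retry logic: note the malformed reply and ask again,
            # giving up once the retry budget is exhausted.
            retries += 1
            if retries > self.max_retries:
                return None
            self.memory.append(f"Invalid reply: {reply!r}")
        return None


def scripted_model(prompt: str) -> str:
    """Stub standing in for a real model: its first reply is malformed."""
    if "-> 5" in prompt:
        return "FINAL 5"
    if "Invalid reply" in prompt:
        return "CALL add 2 3"
    return "add 2 and 3 please"


def add(arg: str) -> str:
    a, b = arg.split()
    return str(int(a) + int(b))


# Same "model", two scaffolds: only the retry budget differs.
print(Scaffold(scripted_model, {"add": add}).run("add 2 and 3"))                 # prints 5
print(Scaffold(scripted_model, {"add": add}, max_retries=0).run("add 2 and 3"))  # prints None
```

The last two lines illustrate the confound in miniature: the stub model is identical in both runs, yet the scaffold that tolerates one malformed reply solves the task and the one with no retry budget fails.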
Episodes covering this
Worth reading next
Papers we haven't done a deep dive on yet, but would recommend on this topic.
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
- InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents
- SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
- OpenHands: An Open Platform for AI Software Developers as Generalist Agents
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
- LASER: LLM Agent with State-Space Exploration for Web Navigation