Glossary · Term

context window

Definition

Plain language

How much text a model can pay attention to at one time.

As stated in the literature

The maximum sequence length a transformer can attend over during a forward pass, bounded by model architecture and memory.

Also called: context windows

Why it matters: It's the hard upper bound on how much information a model can consider in a single pass, shaping which tasks fit naturally and which need workarounds.

For example, with a 200k-token context window, a model can hold roughly a 500-page book in memory at once when answering questions about it.

Heard on the show

“6, with a context window around a million tokens — and I want to be precise about the word "frozen.”

Episode 194 — How a Robot Builds a Debugging Notebook It Can Read, Edit, and Hand to Another Robot

Mentioned in 35 episodes

194
How a Robot Builds a Debugging Notebook It Can Read, Edit, and Hand to Another Robot
192
A 32B Open Model Matched Frontier Systems By Learning to Take Notes
186
How a Frozen Model Went From 2% to 77% on Physics Puzzles — Without Retraining
180
The Bug Where Smart Assistants Read a Fact and Still Forget It
168
When Turning Experience Into Code Makes Your AI Agent Dumber
164
The Summarizer That Quietly Deletes Your Agent's Safety Rules
154
How a 7B Model Out-Investigates a 72B One by Choosing What to Look At
151
Why More Experience Made This AI Agent Worse, And How to Fix It
149
When Cornering a Chatbot Makes It Lie: J.P. Morgan's Case for 'Playing Dead'
139
When Optimizing One GPU Kernel Quietly Breaks the Whole System
131
Why Autonomous Research Agents Forget Their Own Lessons, and Arbor's Fix
130
Why AI Agents Coordinate Better Through a Shared Board Than a Boss
125
AI Coding Agents Run a Marathon, and Fewer Than One in Three Finish
114
Agents That Rewrite Their Own Weights Instead of Just Taking Notes
113
What If a Prompt Injection Never Left? Attacks That Wait in Agent Memory
108
The Reasoning Cliff: Why Thinking Longer Makes Models Worse at Exact Step-by-Step Tasks
103
AI Agents Tried to Invent a Post-Human Language, And Reinvented Cherokee
086
Why Frozen-Weight Agents Still Get Worse Over Time
085
Why Long-Context Models Might Need Compute, Not Capacity, Before Eviction
083
Training the Translator: How a Small Communication Model Lets Agent Teams Outperform Themselves
082
Training a Deep Research Agent on 8,000 Synthetic Tasks: The Rubric Tree Trick
051
Why Parallel Sampling Plateaus, And What Evidence Graphs Do Instead
049
An AI Agent Reached for Root in Twelve Minutes, Without Being Attacked
044
How One Sentence and a Forged History Flip the Most Aligned Models
030
Why Your AI Agent Won't Stop Working — and Each Model Falls for a Different Trap
028
Teaching a Model to Hire Copies of Itself: Recursive Agent Optimization
027
When AI Agents Build the Serving Stack: A Bet on Bespoke Infrastructure
024
An AI Agent That Found 28 Zero-Days in Windows — And What Made It Work
022
Training the Model Spec Directly: An Alignment Lever Aimed at the Say-Do Gap
016
Why Your Coding Agent Stalls While the GPU Runs Hot
014
Why a Constrained Pipeline Beat a Full Coding Agent at Finding Bugs 30-to-1
012
Why AI Coding Agents Keep Trying to Debug Without a Debugger
005
Why a Debugger Designed for Humans Is the Wrong Tool for an AI Agent
003
How to Pick the Best of Sixteen Coding Agent Rollouts
002
An AI Ran a Real Optics Lab for 21 Hours and Found a Transformer-Shaped Pattern in Light

Related concepts

Agent Memory Context Fatigue Context Management Context Quality

Related terms

forward pass transformer