Definition
Andrej Karpathy's open challenge to autonomously improve a small GPT training script under a tight compute budget.
An open benchmark in which agents have five minutes of single-GPU wall-clock training to minimize validation bits-per-byte on a tokenized web corpus by modifying any part of a baseline GPT training script.