Glossary · Term

CUA-Gym

Definition

A pipeline for automatically generating verified training tasks and environments for AI agents that operate real software.

A framework for scaling RLVR (reinforcement learning with verifiable rewards) training of computer-use agents. It uses a Generator/Discriminator pair separated by an information barrier to synthesize verified (task, environment, reward) tuples across desktop apps and 94 synthesized web mocks, producing roughly 32K training tuples.
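
To make the shape of a (task, environment, reward) tuple concrete, here is a minimal illustrative sketch. It is not CUA-Gym's actual code: every name in it (`Candidate`, `TrainingTuple`, `generator_notes`, `discriminator_view`, `verify`, the toy discriminator) is hypothetical, and it assumes one plausible reading of the information barrier, namely that the Discriminator judges a candidate without seeing anything private to the Generator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Toy stand-in for an environment's observable state (assumption for illustration).
State = Dict[str, str]


@dataclass(frozen=True)
class Candidate:
    """What a hypothetical Generator proposes."""
    task: str                        # natural-language instruction
    environment: State               # initial state of the app or web mock
    reward: Callable[[State], bool]  # True if a final state solves the task
    generator_notes: str             # private to the Generator (hypothetical field)


@dataclass(frozen=True)
class TrainingTuple:
    """The (task, environment, reward) tuple that reaches training."""
    task: str
    environment: State
    reward: Callable[[State], bool]


def discriminator_view(c: Candidate) -> TrainingTuple:
    # Information barrier (as assumed here): drop the Generator's private notes.
    return TrainingTuple(c.task, c.environment, c.reward)


def verify(candidates: List[Candidate],
           discriminator: Callable[[TrainingTuple], bool]) -> List[TrainingTuple]:
    # Keep only the candidates the Discriminator accepts.
    kept = []
    for c in candidates:
        view = discriminator_view(c)
        if discriminator(view):
            kept.append(view)
    return kept


def toy_discriminator(t: TrainingTuple) -> bool:
    # Toy check: reject tasks whose reward already passes on the initial state.
    return not t.reward(t.environment)


candidates = [
    Candidate("Rename report.txt to final.txt", {"file": "report.txt"},
              lambda s: s.get("file") == "final.txt", "rename via context menu"),
    Candidate("Make sure the file is report.txt", {"file": "report.txt"},
              lambda s: s.get("file") == "report.txt", "already true"),
]
verified = verify(candidates, toy_discriminator)
```

In this toy run only the first candidate survives, because the second task is already satisfied by its starting state. A real discriminator would presumably run far richer checks; the sketch only shows where the barrier sits and what a verified tuple carries.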

Mentioned in 1 episode

  1. 080
    How a Two-Agent Trick Unlocked Large-Scale Training for Computer-Use Agents