Definition
A training pipeline that builds AI research agents using fully synthetic tasks graded against detailed checklists.
A deep-research agent training framework from Ohio State and Amazon that uses LLM-generated rubric trees as the unifying primitive for both task synthesis and reward computation. It combines these with a context condenser, supervised fine-tuning (SFT) on teacher trajectories, and Group Relative Policy Optimization (GRPO) with a capped fact-checking bonus.
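To make the reward side of the definition concrete, here is a minimal sketch of how a rubric tree could be turned into a scalar reward with a capped fact-checking bonus, and how GRPO then normalizes rewards within a group of rollouts. The definition above does not specify the details, so everything here is an assumption for illustration: the `RubricNode` structure, the weighted-average aggregation, and the `bonus_per_fact` and `bonus_cap` values are all hypothetical. Only the group-relative normalization (subtract the group mean, divide by the group standard deviation) follows the standard GRPO formulation.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class RubricNode:
    """One node of a hypothetical rubric tree (structure assumed, not from the source)."""
    description: str
    weight: float = 1.0
    children: list["RubricNode"] = field(default_factory=list)
    satisfied: bool = False  # leaf verdict, e.g. produced by an LLM judge


def rubric_score(node: RubricNode) -> float:
    """Score in [0, 1]: leaves are pass/fail, inner nodes take a weighted average."""
    if not node.children:
        return 1.0 if node.satisfied else 0.0
    total_weight = sum(child.weight for child in node.children)
    weighted = sum(child.weight * rubric_score(child) for child in node.children)
    return weighted / total_weight


def total_reward(root: RubricNode, verified_facts: int,
                 bonus_per_fact: float = 0.05, bonus_cap: float = 0.2) -> float:
    """Rubric score plus a fact-checking bonus that cannot exceed bonus_cap.

    The per-fact value and the cap are illustrative assumptions.
    """
    bonus = min(bonus_per_fact * verified_facts, bonus_cap)
    return rubric_score(root) + bonus


def grpo_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Group-relative advantages: normalize each reward by the group mean and std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # A toy rubric: one satisfied leaf and one sub-rubric that is half satisfied.
    root = RubricNode("report quality", children=[
        RubricNode("answers the question", satisfied=True),
        RubricNode("evidence", children=[
            RubricNode("cites a primary source", satisfied=True),
            RubricNode("reports the date range", satisfied=False),
        ]),
    ])
    group = [total_reward(root, verified_facts=n) for n in (0, 2, 10)]
    print(group)
    print(grpo_advantages(group))
```

The cap is the point of the design: past `bonus_cap`, additional verified facts no longer raise the reward, so a policy has nothing to gain from padding a report with easily checked claims at the expense of the rubric itself.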