Definition
A high-scoring software-engineering agent system used as a baseline.
An agentic coding system that scores around 64% on SWE-bench Verified under its native harness but produces malformed output under other harnesses; cited as evidence of harness fragility in open-source agent training.