Definition
A simulated science-lab environment for testing whether AI agents can carry out experimental procedures.
A text-based interactive environment, built on a PDDL-style simulator, for evaluating LLM agents on multi-step science tasks such as growing plants or measuring temperatures.
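
To make the definition concrete, here is a minimal sketch of what such an environment could look like. Everything in it is an assumption for illustration: the names `ToyLabEnv`, `ACTIONS`, `GOAL`, and `run_episode`, the three commands, and the fact names are hypothetical and are not the real simulator's API. "PDDL-style" is read here as: each action has preconditions over a set of facts and adds new facts when it applies.

```python
# Hypothetical PDDL-style action table (illustration only):
# command text -> (precondition facts, facts added, observation text)
ACTIONS = {
    "plant seed": (set(), {"seed_planted"}, "You plant the seed in the pot."),
    "water pot": ({"seed_planted"}, {"soil_wet"}, "You water the pot."),
    "wait": ({"seed_planted", "soil_wet"}, {"plant_grown"},
             "The seed sprouts into a plant."),
}
GOAL = {"plant_grown"}  # the task is solved once every goal fact holds


class ToyLabEnv:
    """Toy text environment: the state is a set of facts that currently hold."""

    def __init__(self):
        self.facts = set()

    def step(self, command):
        """Apply one text command; return (observation, done)."""
        if command not in ACTIONS:
            return "Unknown action.", False
        preconditions, added, text = ACTIONS[command]
        if not preconditions <= self.facts:
            # Preconditions unmet: the state does not change.
            return "Nothing happens.", False
        self.facts |= added
        return text, GOAL <= self.facts


def run_episode(env, commands):
    """Feed commands (e.g. produced by an agent) to the environment in order."""
    transcript = []
    for command in commands:
        observation, done = env.step(command)
        transcript.append((command, observation))
        if done:
            return transcript, True
    return transcript, False
```

In an evaluation, the list of commands would come from the LLM agent one step at a time, each chosen after reading the previous observation. The task counts as solved only if the agent issues the steps in an order the simulator accepts.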