Definition
A benchmark of logic-grid puzzles in which each clue narrows down the possible assignments, such as who lives where and who owns what.
A benchmark suite of multi-clue logic-grid puzzles used to test whether agentic LLM systems can offload constraint reasoning to formal solvers and recover the correct assignments.
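To make the definition concrete, here is a minimal sketch of the kind of puzzle involved and of what handing the constraint reasoning to a program might look like. The three-house puzzle, its four clues, and the `solve` function are all invented for illustration; they are not taken from the benchmark. Exhaustive enumeration stands in here for a formal solver such as a SAT or SMT engine, which an actual system would more likely call.

```python
from itertools import permutations

# Hypothetical toy puzzle (not from the benchmark): three houses in a row,
# each with one resident and one pet.
PEOPLE = ("Alice", "Bob", "Carol")
PETS = ("cat", "dog", "fish")
HOUSES = (1, 2, 3)


def solve():
    """Return every assignment {house: (person, pet)} that satisfies all clues."""
    solutions = []
    for person_houses in permutations(HOUSES):
        house_of = dict(zip(PEOPLE, person_houses))   # person -> house
        for pet_houses in permutations(HOUSES):
            pet_house = dict(zip(PETS, pet_houses))   # pet -> house
            clues = (
                house_of["Alice"] == 1,                     # Alice lives in house 1
                pet_house["dog"] == house_of["Alice"] + 1,  # dog is one house right of Alice
                pet_house["fish"] == house_of["Bob"],       # Bob owns the fish
                house_of["Carol"] != 3,                     # Carol is not in house 3
            )
            if all(clues):
                resident = {h: p for p, h in house_of.items()}  # house -> person
                solutions.append(
                    {h: (resident[h], pet) for pet, h in pet_house.items()}
                )
    return solutions


if __name__ == "__main__":
    for solution in solve():
        for house in sorted(solution):
            print(house, *solution[house])
```

Each clue is written as a Boolean constraint over the candidate assignment, and a puzzle is well-posed when exactly one assignment satisfies all of them. Enumeration is only feasible at this toy size, which is one reason larger grids are better suited to a dedicated solver.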