Glossary · Term

KalshiBench

← all terms

Definition

A benchmark that scores AI models on predicting outcomes from a real prediction market.

An LLM forecasting evaluation suite built on Kalshi prediction-market questions, reporting Brier-style metrics; cited as another current benchmark whose scoring rules cannot detect distributional commitment failures.

Mentioned in 1 episode

  1. 069
    When Smarter Models Forecast Worse: The Hidden Failure Mode in LLM Predictions