Concept · 12 episodes

Emergent Behavior

Episodes covering this

  1. 077
    Reading a Model's Confidence Curve to Decide When Chain-of-Thought Is Worth It
    Xia, Wang, Tang et al. · State Key Laboratory of General Artificial Intelligence · 22 min · May 25, 2026
  2. 073
    When Three LLMs Talk to Each Other, Their Ideas Quietly Stop Moving
    Kong, Lai, Piao et al. · University of Toronto · 28 min · May 23, 2026
  3. 069
    When Smarter Models Forecast Worse: The Hidden Failure Mode in LLM Predictions
    Merrill, Lee, Karger · Forecasting Research Institute / UC Berkeley · 30 min · May 22, 2026
  4. 065
    One Loop to Optimize Them All: A Universal API for LLM-Driven Discovery
    Agrawal, Lee, Tan et al. · UC Berkeley · 27 min · May 22, 2026
  5. 061
    When Helpful Agents Go Sideways: A 404 Error, Campus Security, and Why Alignment Misses This
    Jha, Triedman, Bhattacharya et al. · Cornell University · 27 min · May 20, 2026
  6. 060
    When Splitting One Model Across Three Agents Doubles Its Accuracy
    Lu, Fang, Zhong et al. · University of Georgia · 26 min · May 20, 2026
  7. 058
    Why Upgrading Your AI Auditor to a Smarter Model Can Make Your System Less Safe
    Liu, Holz, Ye et al. · University of Chinese Academy of Sciences · 32 min · May 19, 2026
  8. 049
    An AI Agent Reached for Root in Twelve Minutes, Without Being Attacked
    Cuadros, Maiga · Digital Epidemiology Laboratory · 28 min · May 17, 2026
  9. 045
    When a Frontier Model Talks Its Own Twin Into Climate Denial
    Nogueira, Almeida, Bonás et al. · Maritaca AI · 31 min · May 15, 2026
  10. 044
    How One Sentence and a Forged History Flip the Most Aligned Models
    Salgado · Independent Researcher · 23 min · May 15, 2026
  11. 041
    When the Iteration Teaches the Model to Skip the Iteration
    Fein-Ashley, Rashidinejad · University of Southern California · 30 min · May 13, 2026
  12. 040
    Two Frozen Models Learn to Whisper: Coupling Through Hidden States
    Flamant, Ghai, Shimizu · AWS Agentic AI · 29 min · May 13, 2026

Worth reading next

Papers we haven't done a deep dive on yet, but would recommend on this topic.