Research · The Decoder
AI search agents often confirm what they already know instead of actually researching the web
Researchers at the Harbin Institute of Technology developed LiveBrowseComp, a benchmark for evaluating AI search agents that uses only events from the last 90 days. They found that leading agents like GPT-5.4 and Kimi K2.6 rely on training memory rather than actively researching the web, causing performance to drop significa