Research · Hacker News · 28 May 2026

Five frontier LLMs disagree on 67% of 1k real-world fact-check claims

A study evaluated five frontier large language models on 1,000 real-world fact-check claims and found that the models disagreed with one another on 67% of them, or roughly 670 claims. The result points to inconsistency across model outputs and raises concerns about relying on these models for factual tasks.
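The summary does not say how the study counts a disagreement. As a rough illustration only, the sketch below assumes a claim counts as disagreed whenever the models' verdicts are not all identical. The function name `disagreement_rate`, the verdict labels, and the toy data are hypothetical and not taken from the study.

```python
from typing import Sequence


def disagreement_rate(verdicts_per_claim: Sequence[Sequence[str]]) -> float:
    """Return the fraction of claims whose model verdicts are not all identical.

    Assumption: any non-unanimous claim counts as a disagreement. The study
    may use a different definition.
    """
    if not verdicts_per_claim:
        raise ValueError("need at least one claim")
    # A claim is a disagreement if it has more than one distinct verdict.
    disagreements = sum(
        1 for verdicts in verdicts_per_claim if len(set(verdicts)) > 1
    )
    return disagreements / len(verdicts_per_claim)


# Hypothetical toy data: three claims, one verdict from each of five models.
toy_verdicts = [
    ["true", "true", "true", "true", "true"],           # unanimous
    ["true", "false", "true", "true", "true"],          # one dissenting model
    ["false", "false", "unverifiable", "false", "true"],  # three-way split
]

rate = disagreement_rate(toy_verdicts)
print(f"Disagreement rate: {rate:.0%}")
```

Under this definition a single dissenting model is enough to mark a claim as disagreed, so the rate tends to rise as more models are compared. A stricter measure, such as the absence of a majority verdict, would give a lower figure on the same data.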
