Research · Hacker News

Five frontier LLMs disagree on 67% of 1k real-world fact-check claims

A study evaluated five frontier large language models (LLMs) on 1,000 real-world fact-check claims and found that they disagreed on 67% of them. The research highlights inconsistencies in model outputs and raises concerns about the models' reliability on factual tasks.
