Research · The Decoder · 16 May 2026

New benchmark confirms AI video generators look stunning but still can't reason about the world

WorldReasonBench evaluates video models on physical and logical plausibility rather than image quality. ByteDance's Seedance 2.0 ranks first, ahead of Veo 3.1 and Sora 2. Commercial systems score about twice as high as open-source ones, and logical reasoning remains the hardest task.
