Research · The Decoder
New benchmark confirms AI video generators look stunning but still can't reason about the world
WorldReasonBench evaluates video models on physical and logical plausibility rather than image quality. ByteDance's Seedance 2.0 ranks first, ahead of Veo 3.1 and Sora 2, while commercial systems score about twice as high as open-source ones and logical reasoning remains the hardest task.