Models · MarkTechPost · 20 May 2026

NVIDIA AI Releases Nemotron-Labs-Diffusion: A Tri-Mode Language Model with 6× Tokens Per Forward Over Qwen3-8B

NVIDIA researchers released Nemotron-Labs-Diffusion, a language model family that combines autoregressive, diffusion-based parallel, and self-speculation decoding in one architecture. It comes in 3B, 8B, and 14B sizes with base, instruct, and vision-language variants.
