Models · The Decoder
ByteDance's "iLLaDA" is a diffusion language model that keeps up with Qwen2.5
Researchers from Renmin University and ByteDance have released iLLaDA, an 8-billion-parameter diffusion language model that generates text differently from standard autoregressive models. The base model matches Qwen2.5, but it falls behind after fine-tuning.