Tools · MarkTechPost

Xiaomi MiMo and TileRT Push a 1-Trillion-Parameter Model Past 1000 Tokens Per Second on Commodity GPUs

Xiaomi’s MiMo team and TileRT released MiMo-V2.5-Pro-UltraSpeed, a serving mode for the MiMo-V2.5-Pro model. The system is reported to decode more than 1,000 tokens per second on a 1-trillion-parameter model using a single 8-GPU commodity node.

Read the full story at MarkTechPost →