Models · MarkTechPost
StepFun Releases StepAudio 2.5 Realtime: An End-to-End Voice Model with Roleplay-Specific RLHF and Paralinguistic Comprehension
StepFun released StepAudio 2.5 Realtime, an end-to-end real-time speech model with customizable persona features. It supports Chinese and English, connects through a WebSocket API, and ranked first across five benchmark dimensions in April testing.