Mistral ships Voxtral TTS — first multilingual text-to-speech model from the lab, 9 languages with low-latency streaming and custom voices in Mistral Studio
Mistral has released Voxtral TTS, its first multilingual text-to-speech model. It supports 9 languages, streams output with low latency, and offers custom voice profiles through Mistral Studio. API access is live. The release extends Mistral's open-weight catalog from text and code into speech and voice, a tier that matters for European sovereign-AI procurement.
The TTS extension is the strategic piece. Mistral has built the broadest open-weight catalog in the frontier-adjacent tier: Large 3, Small 4, Medium 3.5 (Magistral + Devstral), Ministral 3, Leanstral, and Forge. Voxtral now adds the speech axis. For European procurement buyers comparing fully open-weight production stacks with US proprietary offerings (ElevenLabs, OpenAI TTS, Google Cloud Speech), Voxtral is the first Tier-1 European candidate that meets sovereign-AI procurement criteria with no licensing carve-outs.
Custom voice profiles in Mistral Studio match the workflow ElevenLabs has built its enterprise pipeline around: voice cloning for narration, dubbing, accessibility, and brand-voice consistency. Combined with Medium 3.5 becoming the default coding model in Vibe CLI, Mistral's product surface now spans LLM chat, coding agents, and speech generation, all open-weight and all routed through one product layer.
Sources: Mistral AI Newsroom, Voxtral TTS · Releasebot, Mistral Voxtral TTS release notes