Microsoft Expands Azure AI Speech with Multilingual Voices and Avatars

Microsoft's Azure AI Speech gains multilingual voices and lifelike avatars intended to make conversational interactions feel more natural.


Microsoft has rolled out updates for its Azure AI Speech services, introducing new multilingual voices and advanced human avatars. These improvements target more natural user interactions and engagement by offering diverse and realistic auditory and visual experiences.

Wide Range of Multilingual Voices

An important upgrade in this release is the addition of various multilingual voices such as en-GB-AdaMultilingualNeural and es-ES-IsidoraMultilingualNeural. These voices provide a broad spectrum of accents and inflections, enhancing the authenticity of AI interactions, especially in chatbot applications.
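As a rough illustration of how a chatbot might address one of these voices, the sketch below builds an SSML document that switches language mid-response using the standard `<lang>` element. The helper name `build_ssml` and the sample phrases are hypothetical, and which locales a given multilingual voice actually supports should be checked against Microsoft's voice list.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(voice: str, segments: list[tuple[str, str]]) -> str:
    """Return an SSML string that speaks each (locale, text) pair with one voice.

    Hypothetical helper for illustration; not part of any Microsoft SDK.
    """
    # Each <lang> element overrides the document default set on <speak>.
    body = "".join(
        f"<lang xml:lang={quoteattr(locale)}>{escape(text)}</lang>"
        for locale, text in segments
    )
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-GB">'
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )


ssml = build_ssml(
    "en-GB-AdaMultilingualNeural",
    [("en-GB", "Welcome back."), ("es-ES", "¿En qué puedo ayudarte?")],
)
print(ssml)
```

The resulting string would then be passed to whatever synthesis call the application uses; that step is omitted here because it depends on the SDK and credentials in use.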

The update also brings two new U.S.-based voices designed specifically for call center use. These voices are optimized to provide clearer, more natural-sounding conversations, aiming to elevate the customer service experience.

New Human Avatars

Microsoft has also introduced five new human avatars into Azure AI Speech. These avatars feature improved sound quality and are compatible with the Azure OpenAI GPT-4o model, offering seamless interaction between live chat avatars and the GPT-4o model. Sample code is available to help with the integration of avatars and the model.

A new Text Stream API aims to expedite text-to-speech (TTS) functionalities. The API processes input in segments rather than entire responses, reducing latency. This is particularly beneficial for real-time applications, live events, and interactive AI dialogues. Developers can find sample code for the Text Stream API on GitHub.
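The latency benefit of working on segments can be sketched on the client side. The example below is not the Text Stream API and not Microsoft's implementation. It is a minimal, assumption-laden illustration of the same idea: text arriving in fragments (for example, tokens from a language model) is grouped into speakable segments as soon as a sentence ends, so synthesis could begin before the full response exists. The function `segment_stream`, the sentence-boundary rule, and the `max_chars` limit are all hypothetical.

```python
from typing import Iterable, Iterator

SENTENCE_END = ".!?"  # naive boundary rule; would also split "3.5" after the dot


def segment_stream(chunks: Iterable[str], max_chars: int = 200) -> Iterator[str]:
    """Yield a segment whenever a sentence ends or the buffer reaches max_chars."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            # Position of the earliest sentence-ending character, or -1 if none.
            hits = [i for i in (buffer.find(c) for c in SENTENCE_END) if i != -1]
            cut = min(hits) if hits else -1
            if cut == -1 and len(buffer) >= max_chars:
                cut = max_chars - 1  # no boundary yet: force a flush
            if cut == -1:
                break  # wait for more input
            segment, buffer = buffer[: cut + 1].strip(), buffer[cut + 1 :]
            if segment:
                yield segment
    # Flush whatever is left once the input stream ends.
    if buffer.strip():
        yield buffer.strip()


fragments = ["Hello the", "re. How are", " you? Fine"]
for segment in segment_stream(fragments):
    # A real client would hand each segment to the speech service here.
    print(segment)
```

With the sample fragments, the first sentence becomes available after the second fragment arrives rather than after the whole response, which is the effect the article attributes to segment-based processing.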

Regional Availability and Expansion

The updated voices and features are now in public preview in the East US, West Europe, and Southeast Asia regions. Additionally, the avatar service has expanded to Sweden Central, North Europe, and South Central US, enabling more developers and businesses to use Azure AI Speech's advanced features.

Incorporating Azure AI Content Safety, the batch synthesis process for text-to-speech avatars includes measures to detect and prevent harmful content. Microsoft has also adopted the C2PA Standard to ensure transparency in AI-generated video content.

Luke Jones
Luke has been writing about all things tech for more than five years. He follows Microsoft closely to bring you the latest news about Windows, Office, Azure, Skype, HoloLens and all the rest of its products.