
Microsoft Unveils New AI Text-to-Speech Voices for Azure OpenAI

Microsoft's new AI neural voices aim to improve speech-driven chatbots and voice assistants.


Microsoft has rolled out four new AI neural voices for text-to-speech (TTS) applications, designed for integration with Azure OpenAI Service. The voices are intended to enhance speech-based chatbots, voice assistants, and conversational agents.

Voices Optimized for Conversational Scenarios

The newly introduced voices are en-US-AndrewNeural, en-US-BrianNeural, and en-US-EmmaNeural (all US English), plus zh-CN-YunjieNeural (Chinese). They have been fine-tuned for conversational contexts and are currently in public preview in three regions: East US, Southeast Asia, and West Europe. Microsoft has published samples of these voices, which it says deliver more natural and fluid speech than its existing neural voices.

“…friendly, and optimistic about life, always eager to assist others and share intriguing or practical knowledge. The speaking style of the voice resembles a conversation with an acquaintance over a cup of tea, maintaining a natural and unexaggerated tone.” This statement from Microsoft emphasizes the persona and tone behind each voice.

Technological Advancements Behind the Voices

Microsoft's continuous efforts to enhance text-to-speech (TTS) modeling techniques have led to significant improvements in the quality of AI voices. Recent projects like DelightfulTTS 2 and MuLanTTS have bridged the quality gap between AI voices and professional human recordings, and have played a pivotal role in producing voices that sound more natural and realistic. This progress forms the foundation for the newly introduced AI voices.

Developers can seamlessly integrate these voices into their applications using the Azure Speech SDK or REST API. The Azure Bot Framework also offers capabilities to craft intelligent bots that can utilize these new neural TTS voices.
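As a rough illustration of the REST route, the following Python sketch builds a synthesis request for one of the new voices using only the standard library. It is a minimal sketch, not Microsoft's reference code: the helper names (`build_ssml`, `build_tts_request`, `synthesize`) are hypothetical, the endpoint and header names follow the Speech service REST API as commonly documented, and the region identifiers are assumed to correspond to the three preview regions named above. A valid Speech resource key is needed to actually send the request.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr

# Assumed Azure region identifiers for East US, Southeast Asia, West Europe.
PREVIEW_REGIONS = ("eastus", "southeastasia", "westeurope")


def build_ssml(text, voice="en-US-AndrewNeural"):
    """Wrap plain text in a minimal SSML document for the given voice."""
    # Derive the locale (e.g. "en-US") from the voice name prefix.
    lang = "-".join(voice.split("-")[:2])
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


def build_tts_request(text, key, region="eastus", voice="en-US-AndrewNeural"):
    """Build (but do not send) a POST request to the TTS REST endpoint."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # Speech resource key
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        "User-Agent": "tts-sketch",  # arbitrary application name
    }
    body = build_ssml(text, voice).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def synthesize(text, key, region, voice, out_path):
    """Send the request and write the returned audio bytes to a file."""
    request = build_tts_request(text, key, region, voice)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

Calling `synthesize("Hello there", key, "eastus", "en-US-EmmaNeural", "hello.wav")` with a real key would be expected to write a WAV file. Preview voices may only be served from the regions listed above, so a request to another region could fail.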

Microsoft's extensive offering includes over 400 neural voices, spanning more than 140 languages and locales. This vast array ensures developers and businesses have a plethora of choices to provide enriched conversational experiences to their users.

Last Updated on December 28, 2023 9:49 am CET

Source: Microsoft
Luke Jones
Luke has been writing about Microsoft and the wider tech industry for over 10 years. With a degree in creative and professional writing, Luke looks for the interesting spin when covering AI, Windows, Xbox, and more.