Microsoft has rolled out updates to its Azure AI Speech services, adding new multilingual voices and more advanced human avatars. The additions are intended to make voice and video interactions with AI sound and look more natural.
Wide Range of Multilingual Voices
A key upgrade in this release is the addition of several multilingual voices, including en-GB-AdaMultilingualNeural and es-ES-IsidoraMultilingualNeural. These voices cover a broad range of accents and inflections, which makes AI interactions sound more authentic, especially in chatbot applications.
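As a rough illustration of how one of these voices might be selected, the sketch below builds an SSML request string in Python. The voice name comes from the release; the helper name build_ssml and its parameters are hypothetical and not part of any Microsoft SDK. It assumes the standard SSML voice and lang elements and does not call the service.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(voice_name, text, base_lang="en-GB", spoken_lang=None):
    """Return an SSML string that selects a voice and, optionally, a spoken language.

    Hypothetical helper for illustration; it only builds text and sends nothing.
    """
    body = escape(text)  # escape &, <, > so the markup stays well-formed
    if spoken_lang:
        # Ask a multilingual voice to speak a language other than its default.
        body = f"<lang xml:lang={quoteattr(spoken_lang)}>{body}</lang>"
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(base_lang)}>"
        f"<voice name={quoteattr(voice_name)}>{body}</voice>"
        "</speak>"
    )


ssml = build_ssml(
    "en-GB-AdaMultilingualNeural",
    "Hola, ¿en qué puedo ayudarte?",
    spoken_lang="es-ES",
)
print(ssml)
```

The resulting string would then be passed to whichever synthesis client the application uses. Building the markup separately keeps voice selection easy to change per conversation.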
The update also adds two new U.S. voices designed specifically for call center use. They are optimized for clearer, more natural-sounding conversations, with the goal of improving the customer service experience.
New Human Avatars
Microsoft has also added five new human avatars to Azure AI Speech. These avatars feature improved sound quality and are compatible with the Azure OpenAI GPT-4o model, allowing live chat avatars to interact with the model. Sample code is available to help developers integrate text-to-speech avatars with GPT-4o.
A new Text Stream API aims to speed up text-to-speech (TTS). The API processes input in segments rather than waiting for an entire response, which reduces latency. This is particularly useful for real-time applications, live events, and interactive AI dialogues. Developers can find sample code for the Text Stream API on GitHub.
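The idea of synthesizing segments instead of whole responses can be sketched in plain Python. The example below is not the Text Stream API itself. It is a hypothetical buffer (segment_stream, with an assumed max_chars limit) that collects incoming text fragments, such as tokens from a language model, and releases a segment as soon as a sentence ends, so a TTS engine could start speaking before the full reply has arrived.

```python
from typing import Iterable, Iterator

SENTENCE_END = (".", "!", "?")


def segment_stream(chunks: Iterable[str], max_chars: int = 80) -> Iterator[str]:
    """Yield speakable segments from a stream of text fragments.

    Hypothetical illustration: a segment is released at the first sentence-ending
    character, or once the buffer reaches max_chars without one.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            # Position just past the first sentence terminator, if any.
            cut = next(
                (i + 1 for i, ch in enumerate(buffer) if ch in SENTENCE_END), None
            )
            if cut is None and len(buffer) >= max_chars:
                cut = max_chars  # force a flush so latency stays bounded
            if cut is None:
                break  # wait for more text
            segment, buffer = buffer[:cut].strip(), buffer[cut:]
            if segment:
                yield segment
    if buffer.strip():
        yield buffer.strip()  # flush whatever remains at end of stream


fragments = ["Hel", "lo there. How", " are you", "? Fine"]
segments = list(segment_stream(fragments))
print(segments)
```

Each yielded segment would be handed to the synthesizer immediately. Splitting on punctuation alone is a simplification: it would also break on decimal points and abbreviations, which a production segmenter would need to handle.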
Regional Availability and Expansion
The updated voices and features are now in public preview in the East US, West Europe, and Southeast Asia regions. Additionally, the avatar service has expanded to Sweden Central, North Europe, and South Central US, making Azure AI Speech's advanced features available to more developers and businesses.
The batch synthesis process for text-to-speech avatars incorporates Azure AI Content Safety to detect and prevent harmful content. Microsoft has also adopted the C2PA standard to provide transparency for AI-generated video content.
Last Updated on November 7, 2024 3:44 pm CET