Custom Neural Voice gives developers the ability to create synthetic voices by tapping into Microsoft’s text-to-speech technology. While it is now generally available, Microsoft says access to the tool remains limited.
That means developers must still apply for approval from Microsoft before using it. There are no location-based restrictions, however: the tool is available through Azure Cognitive Services in most Azure cloud regions.
Microsoft built Custom Neural Voice on its machine learning technology. With the latest developments, the company can combine prosody improvements with AI to reproduce a speaker’s voice. Prosody is the tone and duration of each unit of sound (phoneme) that makes up words and distinguishes them from one another.
Some machine learning models process the acoustics to predict prosody, while another model converts the resulting sequence into synthesized speech. Microsoft says the result is a more natural-sounding reproduced voice.
With the tool, Azure Cognitive Services customers can create ultra-realistic custom voices for their apps. Microsoft considers these voices so lifelike that the tool requires its own code of ethics.
That’s partly why the company is limiting access by requiring prospective users to apply. Whenever a recording is made, the voice actor or speaker must acknowledge they are aware the technology is being used and understand what it can do.
“We require customers to make very clear it’s a synthetic voice,” says Sarah Bird, AI lead for Cognitive Services at Microsoft Azure AI. “When it’s not immediately obvious in context, [customers must] explicitly disclose it’s synthetic in a way that’s perceivable by users and not buried in terms.”
Using the service appears to be very easy. Users simply record their voice and upload it to Custom Neural Voice for training. The AI will automatically create a unique voice for the recording with no developer input needed. Microsoft points to several scenarios where partners are already using this technology:
- “AT&T/Warner Bros. They recently launched a first-of-its-kind creative and interactive experience at the AT&T Experience Store in Dallas, TX where customers can talk directly to Bugs Bunny.
- Progressive. Using the voice of Flo, the iconic Progressive Insurance spokesperson, Progressive created the Flo chatbot to streamline the customer inquiry process and deliver personalized experiences.
- Duolingo. To help make learning a new language feel attainable and applicable with quirky characters and quality content, Duolingo created a diverse cast of stylized voices using the Duolingo curriculum.”
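For developers wondering what calling such a voice might look like, here is a minimal sketch. It assumes a trained custom voice is selected by name in SSML (Speech Synthesis Markup Language), the same way stock text-to-speech voices are. The function `build_ssml` and the voice name `MyBrandVoice` are hypothetical, made up for illustration. The sketch only builds the request markup and does not contact any Microsoft service.

```python
import xml.etree.ElementTree as ET

# W3C namespace for SSML documents.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, voice_name: str, lang: str = "en-US") -> str:
    """Return an SSML document asking for `text` to be spoken by `voice_name`.

    `voice_name` is a placeholder for whatever name a trained custom
    voice was given; it is an assumption, not a real deployed voice.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    voice = ET.SubElement(speak, "voice", {"name": voice_name})
    # ElementTree escapes characters such as "&" and "<" on serialization.
    voice.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Hello, this is a synthetic voice.", "MyBrandVoice"))
```

Opening the spoken text with a line that identifies the voice as synthetic, as the example does, is one simple way to meet the disclosure requirement Bird describes.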