Artificial Intelligence (AI) bots are one of Microsoft’s core areas of research. The company now says it can give its existing bots even more power thanks to a breakthrough in voice recognition. Bots are now able to analyze human voices and speak at the same time.
The multi-skill capability gives AI bots the tools to have more natural conversations with users. Engineers at Microsoft say the AI can now predict what someone will say next and hold real conversations by knowing when to pause and when it’s OK to interrupt someone.
With this breakthrough, Microsoft could make its virtual assistant Cortana more expressive. The first benefit is that the assistant could hold more natural conversations with users. More importantly, Cortana would be able to understand more and carry out more tasks.
It is an avenue tech giants are scrambling towards. Despite significant improvements in virtual assistant technology, naturalistic conversations are still mostly lacking. Indeed, most users would agree that interacting with Cortana or rivals like Alexa is still at a fairly basic level.
Microsoft’s breakthrough could improve Cortana in the long run, but the company is focusing on AI bots first. To that end, the technology is making its debut on the Xiaoice bot in China and the Rinna bot in Japan.
The former is important as it works with Xiaomi’s new Yeelight smart speaker. Like the Chinese company’s other devices, the Yeelight is almost a direct copy of a rival, in this instance the Amazon Echo Dot. However, what Xiaomi lacks in originality, it makes up for in popularity. The Yeelight is almost certain to become a smash hit in China, so having Microsoft’s Xiaoice on board is a nice link for the company in a country where it has notoriously struggled.
The new conversational AI abilities will expand to other products soon. Microsoft says more devices will get the tech in the next six months. Speaking to VentureBeat, Ying Wang, the director of Microsoft’s Zo AI, confirmed the bot will get the new technology on Skype soon.
“If Xiaoice is telling a story, she will not be easily interrupted by murmurs and chats, unless there is explicit intent from the user to stop. Similarly, on Yeelight, when Xiaoice is handling a high-value IoT task, such as charging status of a robot vacuum, Xiaoice will choose to skip non-explicit intent from users, such as injections like ‘umm’ or ‘huh’,” Wang said.