Voice recording can be tricky, especially in crowded places or when privacy is important. Shouting above a crowd is impractical and often results in fuzzy audio. Microsoft Research has showcased a new voice input interface called SilentVoice that supports voice input with almost no sound leaking to people nearby.
The Microsoft Research team was at ACM CHI 2018 to present SilentVoice. At its core, the technology captures speech produced while breathing in, rather than while breathing out as in normal speech. The result is a whisper-like voice that a microphone held close to the mouth can record clearly.
One upside of this approach is the ability to talk quietly even with a lot of surrounding noise. Microsoft explains how SilentVoice uses an “ingressive speech” method to keep noise out of the recording and maintain clarity:
“SilentVoice is a new voice input interface device that penetrates the speech-based natural user interface (NUI) in daily life. The proposed “ingressive speech” method enables placement of a microphone very close to the front of the mouth without suffering from pop-noise, capturing very soft speech sounds with a good S/N ratio.
“It realizes ultra-small (less than 39dB(A)) voice leakage, allowing us to use voice input without annoying surrounding people in public and mobile situations as well as offices and homes.”
SilentVoice separates ingressive speech from normal speech by measuring airflow. Microsoft says the technology works with 98.8% accuracy, and no activation words or phrases are needed.
“It can be used for voice-activated systems with a specially trained voice recognizer; evaluation results yield word error rates (WERs) of 1.8% (speaker-dependent condition), and 7.0% (speaker-independent condition) with a limited dictionary of 85 command sentences. A whisper-like natural voice can also be used for real-time voice communication.”
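For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. It is not Microsoft's evaluation code, and the function name `word_error_rate` and the sample commands are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name), not taken from the SilentVoice work.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up command sentence: one wrong word out of four.
    print(word_error_rate("turn on the light", "turn off the light"))
```

By this definition, a WER of 1.8% means fewer than two word errors per hundred reference words.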
As this is a Microsoft Research project, it is still experimental and under development. While SilentVoice won’t be with us anytime soon, you can check out the full ACM CHI presentation in the video above.