Microsoft Teams has introduced spatial audio, a feature that aims to improve communication and reduce meeting fatigue during audio and video conferences. Microsoft confirmed the feature in an official announcement on its Tech Community blog.
Spatial Audio for a More Natural Listening Experience
Spatial audio works by creating the illusion of physical space in audio and video calls. It separates the voices of individual participants, making it seem as if they are positioned in different locations in the room. This results in a more natural listening experience, similar to an in-person conversation.
Microsoft's implementation of spatial audio aligns the perceived audio location of each participant with their video representation. This makes it easier for users to track who is speaking and to follow the conversation when several people talk at the same time, which in turn reduces meeting fatigue and cognitive load.
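To make the idea of aligning audio position with video position concrete, here is a minimal Python sketch. It is not Microsoft's implementation, which has not been published in this detail and may well rely on more elaborate techniques. The sketch assumes each participant's video tile has a horizontal center between 0.0 (left edge of the screen) and 1.0 (right edge) and uses simple constant-power stereo panning. The names pan_gains and mix_stereo are hypothetical and exist only for this illustration.

```python
import math

def pan_gains(tile_center_x):
    """Map a tile's horizontal position (0.0 = left, 1.0 = right)
    to (left_gain, right_gain) using constant-power panning."""
    x = min(max(tile_center_x, 0.0), 1.0)  # clamp to the screen
    angle = x * math.pi / 2                # 0 .. 90 degrees
    return math.cos(angle), math.sin(angle)

def mix_stereo(participants):
    """Mix mono voices into stereo frames.

    participants: list of (tile_center_x, mono_samples) pairs.
    Returns a list of (left, right) sample pairs.
    """
    length = max((len(samples) for _, samples in participants), default=0)
    left = [0.0] * length
    right = [0.0] * length
    for tile_center_x, samples in participants:
        left_gain, right_gain = pan_gains(tile_center_x)
        for i, sample in enumerate(samples):
            left[i] += left_gain * sample
            right[i] += right_gain * sample
    return list(zip(left, right))

# Illustrative input: one voice on the far left tile, one on the far right.
frames = mix_stereo([(0.0, [1.0, 1.0]), (1.0, [0.5])])
```

In this sketch, a participant whose tile sits in the middle of the screen is played equally in both ears, while a participant on the left is played mostly in the left ear. A monophonic mix, by contrast, would add all voices into one channel with no positional cue.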
The Science Behind Spatial Audio
The science behind the benefits of spatial audio is rooted in the concept of binaural hearing, where we use both ears to help identify and distinguish the sources of sounds in the physical world. This is in contrast to most audio and video communication applications today that provide monophonic audio, where speech signals from different participants are transmitted in a single audio channel.
The concept of spatial audio is supported by the well-known study of the “Cocktail Party Effect” that outlines the brain's ability to focus auditory attention on a single speech stimulus while filtering other sources of sound. The study found that two ears (and separate channels) help the listener process speech more efficiently compared to using a single channel.
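One of the cues that binaural hearing relies on is the small difference in arrival time of a sound at the two ears. The Python sketch below estimates that difference with Woodworth's classic spherical-head approximation. It is offered only as an illustration of the underlying acoustics, not as a description of how Teams renders audio. The head radius is an assumed average value, and the function name interaural_time_difference is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at about 20 °C
HEAD_RADIUS_M = 0.0875      # assumed average head radius

def interaural_time_difference(azimuth_deg):
    """Estimate the arrival-time difference between the ears, in seconds.

    Uses Woodworth's spherical-head approximation for a distant source,
    valid for azimuths from 0 (straight ahead) to 90 (directly to one side).
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))

for azimuth in (0, 30, 60, 90):
    itd_us = interaural_time_difference(azimuth) * 1e6
    print(f"{azimuth:>2} degrees: {itd_us:.0f} microseconds")
```

Under these assumptions the difference is zero for a source straight ahead and grows to well under a millisecond for a source directly to one side. A single monophonic channel cannot carry this cue, which is one reason separate left and right channels help listeners tell speakers apart.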
Device Support and Limitations
Spatial audio is generally available in the Teams desktop applications and can be turned on under Settings -> Devices. Users need a stereo-capable device, such as a wired headset or a laptop with stereo speakers. Bluetooth devices are currently not supported because of protocol limitations, but devices that use the next-generation Bluetooth LE Audio standard and can play stereo during calls will be supported in the future.
Microsoft is working on future releases that extend spatial audio to all users, including those on satellite servers. The company also plans to support receiving music mode while spatial audio is on, and to enable spatial audio for live interpretation with volume control for main floor and interpreter audio.
Current Feature Limitations According to Microsoft
Device support:
- Currently we support wired headsets for spatial audio. They can be wired USB headsets or headsets connected to the computer audio jack. Some wireless headsets connected to the computer via USB dongle known to support stereo playback during a call are also supported.
- We also support stereo open speakers (built-in or external speakers).
- Native Bluetooth devices do not support stereo during a call, therefore spatial audio is not available. New Bluetooth standard LE Audio capable devices may support stereo in calls. For these devices, spatial audio will be supported.
Infrastructure related:
- When a conference call has more than 100 users, some users who are typically in listening mode will be moved to a satellite server. Currently, spatial audio is not supported for users on a satellite server. When such users speak, they are typically moved back to the central media server, and spatial audio may become available. In future releases, all users will be supported for spatial audio.
Music mode:
- Users can turn on music mode while receiving spatial audio. In this case, they will send audio in music mode (32 kHz sampling and 128 kbps). However, they will not be able to receive music mode when spatial audio is enabled. In order to receive music mode, the user needs to turn off spatial audio. Future releases will support receiving music mode in spatial audio.
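For a sense of scale, the short Python sketch below works through the two music mode figures quoted above. The sample rate and bitrate come from Microsoft's note. The 16-bit mono uncompressed format used for comparison is purely an assumption for illustration, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 32_000        # from the note above
ENCODED_BITRATE_BPS = 128_000  # from the note above

ASSUMED_BIT_DEPTH = 16         # assumption, for comparison only
ASSUMED_CHANNELS = 1           # assumption, for comparison only

def highest_representable_frequency_hz(sample_rate_hz):
    """Nyquist limit: half the sampling rate."""
    return sample_rate_hz / 2

def uncompressed_bitrate_bps(sample_rate_hz, bit_depth, channels):
    """Bitrate of raw PCM audio with the given format."""
    return sample_rate_hz * bit_depth * channels

nyquist_hz = highest_representable_frequency_hz(SAMPLE_RATE_HZ)
raw_bps = uncompressed_bitrate_bps(SAMPLE_RATE_HZ, ASSUMED_BIT_DEPTH, ASSUMED_CHANNELS)
compression_ratio = raw_bps / ENCODED_BITRATE_BPS

print(f"Highest representable frequency: {nyquist_hz / 1000:.0f} kHz")
print(f"Raw PCM under the assumed format: {raw_bps / 1000:.0f} kbps")
print(f"Ratio of raw to encoded bitrate: {compression_ratio:.0f}:1")
```

A 32 kHz sampling rate can represent frequencies up to 16 kHz. Under the assumed 16-bit mono format, the raw stream would be 512 kbps, so 128 kbps corresponds to roughly a 4:1 reduction.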
Impact to Live Interpretation users:
- In Live Interpretation mode, when spatial audio is turned on, main floor audio and interpreter audio will be heard at the same volume from different directions depending on the video location of the main floor speaker and the interpreter. To go back to traditional main floor audio ducking, simply disable spatial audio. Future releases will enable spatial audio for live interpretation where volume control for main floor and interpreter audio will be available.