Microsoft has announced an update to its Teams platform: an upcoming voice cloning tool intended to improve multilingual communication during meetings.
Set to launch in early 2025, the “Interpreter in Teams” feature will replicate users’ voices in real time across nine supported languages, making translated conversations feel more personal.
Bringing Personalization to Multilingual Conversations
The new Interpreter in Teams tool will allow participants to speak in their native language, with their voice automatically cloned and translated into other languages. Initially, the tool will support English, French, German, Italian, Japanese, Korean, Portuguese, Mandarin Chinese, and Spanish.
Jared Spataro, Microsoft’s CMO, described the appeal of the feature this way: “Imagine being able to sound just like you in a different language.” The goal is a more natural flow of conversation for global teams. “Interpreter in Teams provides real-time speech-to-speech translation during meetings, and you can opt to have it simulate your speaking voice for a more personal and engaging experience.”
AI-Powered Translation and Privacy Assurances
This voice cloning capability builds on Microsoft’s work in AI-powered speech synthesis and goes a step beyond traditional automated translation tools. Microsoft has stressed user privacy, noting that no biometric data will be stored or used for model training, which addresses potential concerns about data security. Users decide whether the feature is active: they give consent through a prompt during a meeting or by enabling “voice simulation consent” in settings.
Broader Integration of AI in Microsoft Teams
Interpreter in Teams is part of Microsoft’s push to enhance AI features within its communication suite. Other updates include multilingual meeting transcripts, allowing real-time captions in up to 31 languages, and Copilot’s ability to summarize shared documents and meeting content. These tools aim to reduce the need for manual file review, creating a smoother user experience.
The introduction of Teams Super Resolution marks another step in video communication improvements. This feature uses NPUs (Neural Processing Units) in Copilot+ PCs to improve video call quality, even with unstable internet connections.
NPUs are specialized chips designed for efficient processing of machine learning tasks, contributing to better video and image clarity. Windows developers can expect related API access in January, which will include object erasure and image analysis capabilities.
Historical Context: Microsoft’s Journey in Language Tech
Microsoft’s venture into language technology is rooted in its earlier projects. In 2017, Microsoft launched the PowerPoint Live Presentation Translator, capable of handling real-time group conversations with up to 100 participants in 60 languages. The app was developed to facilitate seamless multilingual interactions using mobile devices and PCs, building on Microsoft’s deep neural network technology for more natural translations.
This functionality catered to both enterprises and educational institutions by bridging language gaps during presentations and supporting audiences with hearing impairments.
Competing Technologies and Industry Context
The rise of voice translation tools from competitors puts Microsoft’s latest move into perspective. DeepL, a well-known name in text-based translation, recently introduced DeepL Voice, which features tools for both virtual and in-person use.
The service includes real-time captioning and mobile-friendly translations, expanding the market for AI-driven communication solutions. ElevenLabs has similarly carved out a space in voice synthesis technology, offering cross-language speech capabilities that retain the original speaker’s voice characteristics.
Addressing Challenges and Security Concerns
As voice cloning and real-time translation gain traction, data privacy and security remain pressing concerns. Microsoft has highlighted that while Interpreter in Teams will process voice data during use, it will not store or utilize it for future training, aligning with data protection standards such as GDPR.
This focus on security comes amid incidents reported by ReliaQuest involving phishing schemes targeting Microsoft Teams users. Attackers have used QR codes in chat messages, impersonating IT staff to lure employees into credential-stealing sites. Additionally, a report from Netskope outlined the increase in QR code phishing, which bypasses traditional security scans and leads users to malicious pages.