ElevenLabs, an AI voice-generation platform, has officially left beta and released its new foundational deep learning model, Eleven Multilingual v2. The model supports nearly 30 languages, a significant expansion of the company's voice generation and cloning capabilities. Users can now apply ElevenLabs' text-to-speech and voice-cloning tools across all of the supported languages.
Innovative Features and Authentic Voices
The new model automatically identifies nearly 30 written languages and generates speech in each with a high degree of authenticity. A standout feature is that a speaker's unique voice characteristics, whether the voice is synthetic or cloned, remain consistent across all supported languages. Content creators can therefore keep the same voice across multiple languages.
Commitment to Universal Accessibility
Mati Staniszewski, the CEO and co-founder of ElevenLabs, is passionate about the company's mission. “Our dream has always been to make all content universally accessible in any language and in any voice,” he remarked. With the release of Eleven Multilingual v2, Staniszewski believes the company is a step closer to realizing this dream, and he envisions a future where human-quality AI voices are available in every possible dialect. He also expressed optimism about AI's potential to break down more linguistic barriers.
“Our text-to-speech generation tools help level the playing field and bring top quality spoken audio capabilities to all the creators out there. Those benefits now extend to multilingual applications across almost 30 languages. Eventually we hope to cover even more languages and voices with help of AI, and eliminate the linguistic barriers to content. At ElevenLabs, we believe these leaps in accessibility will ultimately foster greater creativity, innovation, and diversity.”
Broad Applications and Potential Impact
The multilingual speech generation tool has applications across several sectors. In gaming, developers can localize in-game experiences and audio content for international players, offering them a more immersive experience in their own language. In education, schools and institutions can instantly deliver accurate audio content in the desired language, improving students' comprehension and pronunciation. The tool also helps content creators make their work more accessible, particularly for people with visual impairments or additional learning needs.
ElevenLabs' commitment to innovation is further highlighted by its collaborations with leading content creators and studios, including AI video generator D-ID and audiobook publisher Storytel. As the company looks to the future, it plans to introduce mechanisms that promote human-AI collaboration, allowing users to share and develop voices on the platform.
Addressing Past Controversies
Despite these advances, the platform has had its share of controversy. While in beta, it was abused by 4chan users to make celebrities appear to say offensive things, and by AI enthusiasts to attack voice actors who spoke out against voice-cloning technology. In response, ElevenLabs has added new safeguards, including restricting voice cloning to paid accounts and introducing an AI detection tool.