HomeWinBuzzer NewsOpenAI Enhances ChatGPT Plus with Real-Time Natural Voice Mode

OpenAI Enhances ChatGPT Plus with Real-Time Natural Voice Mode

Despite controversy on its initial introduction, OpenAI is now slowly releasing Voice Mode to ChatGPT following internal updates.

-

OpenAI is currently releasing its advanced Voice Mode to select users of ChatGPT Plus. With this feature, OpenAI aims to create more fluid, real-time dialogues within the mobile app available for both iOS and Android. The company initially unveiled the capability of voice interactions through the release of its GPT-4o AI model

With the new voice feature, ChatGPT can now engage in conversations better suited to human interaction, recognizing tonal variations like humor and sarcasm. This upgrade bypasses the need for speech-to-text translation, significantly reducing interaction lag.

Beginning and Controversies

Unveiled during OpenAI's Spring Update event in May 2024, the feature initially used a voice named Sky, which resembled Scarlett Johansson. Johansson objected, revealing she had declined to lend her voice for ChatGPT

During the demonstration at its event, OpenAI showed the voice assistant's capability to solve math problems almost instantaneously. However, the feature raised legal issues due to the default “Sky” voice's similarity to Scarlett Johansson's voice, which led to its removal following legal scrutiny. OpenAI consequently removed Sky and delayed Voice Mode, affirming it was not meant to imitate her voice.

Post-demo, OpenAI has worked on improving voice interaction safety and credibility. The system restricts itself to four pre-defined voices, deliberately avoiding celebrity voice replication. Safeguards also prevent the AI from handling requests involving violent or copyrighted material.

Gradual User Rollout and Prospective Plans

Selected users will receive emails with access details for the new feature. OpenAI aims to extend the rollout through the fall, eventually offering it to all Plus subscribers. Demonstrations have shown the feature's usefulness in education, fashion advice, and assistance for the visually impaired. These scenarios accentuate the practical advantages of more natural AI communication.

The launch occurs amidst growing concerns over AI misuse in fraud or impersonation. While the new mode doesn't enable voice cloning, there is still potential for deceptive use in interactions where the AI nature isn't immediately clear.

OpenAI's Focus and Future Goals

Mira Murati, OpenAI's Chief Technology Officer, discussed the feature's collaborative potential on X, underlining a commitment to making AI useful and cooperative. OpenAI emphasized safety protocols, mentioning extensive testing by over 100 external reviewers across 45 languages.

This development distinguishes OpenAI amid competitors like Meta's Llama model and Anthropic's Claude while presenting a challenge to startups that specialize in expressive voice AI, like Hume.

Luke Jones
Luke Jones
Luke has been writing about Microsoft and the wider tech industry for over 10 years. With a degree in creative and professional writing, Luke looks for the interesting spin when covering AI, Windows, Xbox, and more.

Recent News

Mastodon