Stability AI Debuts Stable Audio Bringing Text to Audio Generation

The Stable Audio platform aims to serve not only musicians who want to produce samples but also the wider market of creators.


Stability AI has unveiled Stable Audio, marking its first move into AI-driven music and sound generation. The new product uses generative AI to produce high-quality audio quickly through a user-friendly online interface. Users can access a free basic version of Stable Audio to create and download music clips up to 20 seconds long. A ‘Pro’ subscription is also available, offering 90-second tracks suitable for commercial use.

Empowering Music Enthusiasts and Professionals

Emad Mostaque, the CEO of Stability AI, said the company is excited to use its expertise to build a tool that supports music creators. He remarked, “Our hope is that Stable Audio will empower music enthusiasts and creative professionals to generate new content with the help of AI, and we look forward to the endless innovations it will inspire.”

The platform is tailored not only for musicians who want samples for their compositions but also for creators of all kinds. Its defining feature is the ability to generate music tracks from descriptive text prompts supplied by the user, together with a specified track duration.

A Blend of Technology and Creativity

Stable Audio uses techniques similar to those behind Stability AI's image generation tool, Stable Diffusion. The core difference is that the AI generates audio instead of images.

The audio generation process utilizes a diffusion model, specifically trained on audio, to craft novel audio clips. This model was meticulously trained using music and associated metadata from AudioSparx, a popular audio licensing library. This collaboration aims to yield both economic and creative dividends for all stakeholders involved.

The platform's distinctiveness lies in its capability to produce high-fidelity, 44.1 kHz music suitable for commercial purposes through latent diffusion. This architecture conditions audio based on text metadata, audio file duration, and starting time, granting users enhanced control over the content and duration of the generated audio.
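To make the three conditioning signals in the paragraph above more concrete, here is a minimal Python sketch, not Stability AI's actual code. The `Conditioning` class, the field names `seconds_start` and `seconds_total`, and both helper functions are hypothetical names chosen for illustration. The only figures taken from the article are the 44.1 kHz sample rate and the 20-second and 90-second clip lengths.

```python
from dataclasses import dataclass

SAMPLE_RATE_HZ = 44_100  # 44.1 kHz, the output rate quoted in the article


@dataclass(frozen=True)
class Conditioning:
    """Hypothetical container for the three conditioning signals."""
    prompt: str           # text metadata describing the desired audio
    seconds_start: float  # starting time of the clip within the full track
    seconds_total: float  # duration of the full track


def num_samples(duration_s: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of samples per channel in a clip of the given duration."""
    return round(duration_s * sample_rate_hz)


def make_conditioning(prompt: str, seconds_total: float,
                      seconds_start: float = 0.0) -> Conditioning:
    """Validate the timing values and bundle them with the text prompt."""
    if seconds_total <= 0:
        raise ValueError("seconds_total must be positive")
    if not 0 <= seconds_start < seconds_total:
        raise ValueError("seconds_start must fall inside the track")
    return Conditioning(prompt, seconds_start, seconds_total)


if __name__ == "__main__":
    cond = make_conditioning("warm lo-fi piano, 80 BPM", seconds_total=20.0)
    print(cond)
    print(num_samples(20.0))  # 882000 samples for a 20-second clip
    print(num_samples(90.0))  # 3969000 samples for a 90-second track
```

The arithmetic shows why duration is worth conditioning on: at 44.1 kHz, a 90-second track holds 4.5 times as many samples per channel as a 20-second clip.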

Growing Audio AI Market

Stability AI is not the only company exploring audio AI. Last month, Meta launched AudioCraft, an open-source platform for creating AI audio.

Users can access AudioCraft through a web interface or a mobile app, and select from various genres, moods, instruments, and effects. They can also upload their own audio samples or recordings, and use them as inputs for the AI.

The platform can generate music and audio for different purposes, such as podcasts, videos, games, ads, or personal enjoyment. Users can also share their creations with other users on the platform, or export them to other apps or devices. AudioCraft aims to provide a fun and easy way for anyone to create original and high-quality music and audio.

Google is also in the AI audio space through a collaboration with Universal Music. Specifically, the two companies are working on a licensing system for AI songs. Under the proposed system, artists would grant Google and Universal a license to use their voices for AI-generated songs. In return, they would receive a share of the royalties generated by those songs. The amount of royalties would be based on a number of factors, including the popularity of the song and the length of time the artist's voice is used.

Luke Jones
Luke has been writing about all things tech for more than five years. He is following Microsoft closely to bring you the latest news about Windows, Office, Azure, Skype, HoloLens and all the rest of their products.