Stability AI's Stable Audio 2.0 lets users create full songs (up to 3 minutes) with text prompts and their own audio samples.


Stability AI has unveiled Stable Audio 2.0, the latest version of its audio generation model. The update lets users upload their own audio samples and transform them into full-length songs using text prompts. Whereas its predecessor was limited to 90-second clips, Stable Audio 2.0 can generate tracks up to three minutes long, roughly the length of a typical radio song.

Enhanced Features and Accessibility

Beyond the longer maximum length, Stable Audio 2.0 gives users more control than the first version did. Users can adjust how strongly the prompt steers the result and how much the uploaded audio is modified. The model can also generate sound effects, such as crowd noise or ambient sound. Unlike OpenAI’s Voice Engine, which remains limited to a select group of users, Stable Audio is available for free on Stability AI's website, and the company plans to offer it through an API as well.

Challenges and Future Directions

Despite these enhancements, Stable Audio 2.0 has clear limitations. User feedback indicates that the model can produce compositions that resemble songs, complete with intros and outros, but the generated audio sometimes lacks the emotional depth and clarity of human-produced music. In one test, the prompt “folk pop song with American vibes” produced a track that was partially successful but included segments likened to “whale sounds,” an example of how unpredictable and occasionally strange AI-generated music can be.

Stability AI has also addressed concerns about the use of copyrighted material with its model. The company has partnered with Audible Magic and uses its content recognition technology to screen audio that users upload. The aim is to keep copyrighted content from being used without authorization and to protect the interests of original artists and content creators.

