General Discussion
AI can now generate CD-quality music from text, and it's only getting better
https://arstechnica.com/information-technology/2023/09/ai-can-now-generate-cd-quality-music-from-text-and-its-only-getting-better/
Musicians: Speak now or forever hold your beats.
BENJ EDWARDS - 9/13/2023, 3:59 PM
Imagine typing "dramatic intro music" and hearing a soaring symphony or writing "creepy footsteps" and getting high-quality sound effects. That's the promise of Stable Audio, a text-to-audio AI model announced Wednesday by Stability AI that can synthesize music or sounds from written descriptions. Before long, similar technology may challenge musicians for their jobs.
-snip-
Now Stability and Harmonai want to break into commercial AI audio production with Stable Audio. Judging by production samples, it seems like a significant audio quality upgrade from previous AI audio generators we've seen.
-snip-
To train its model, Stability partnered with stock music provider AudioSparx and licensed a data set "consisting of over 800,000 audio files containing music, sound effects, and single-instrument stems, as well as corresponding text metadata." After feeding 19,500 hours of audio into the model, Stable Audio knows how to imitate certain sounds it has heard on command because the sounds have been associated with text descriptions of them within its neural network.
-snip-
As it stands, it's looking like we might be on the edge of production-quality AI-generated music with Stable Audio, considering its audio fidelity. Will musicians be happy if they get replaced by AI models? Likely not, if history has shown us anything from AI protests in the visual arts field. For now, a human can easily outclass anything AI can generate, but that may not be the case for long. Either way, AI-generated audio may become another tool in a professional's audio production toolbox.
-snip-
Stability.ai is of course the company behind image generator Stable Diffusion, which has been sued by artists. The company did not have the rights to the images used to train Stable Diffusion. They say they have a licensed data set this time.
Stability.ai announcement, with some audio samples: https://stability.ai/research/stable-audio-efficient-timing-latent-diffusion
And yes, I did register and try Stable Audio, using keywords that would fit a type of music I love.
I was hoping the results would not be as good as the article suggested. The AI music generated from text prompts that I've heard in the past wasn't as good, and Google also realized there were huge copyright concerns with their MusicLM AI - https://www.democraticunderground.com/103492667 - and so far they've released only a test model, which is offered in part because users help train the AI by choosing the better of two options generated.
The music sample that I got from my text prompt (only one was generated) took longer to generate than typical AI text and images. I didn't time it, but I think it was over a minute, though that's probably due in part to high demand already. Still, it was much better than what I'd heard from other AI music generators.
It wouldn't have sounded bad in my favorite blues bar.
And no, I'm not going to tell you what prompt I used. The way generative AI works, you get different results from identical prompts (which is why Google wants users to help train their AI by choosing one of two options, and generative AI companies often give you 4 options simultaneously).
A few days ago I posted about Queen's Brian May and his concerns about AI: https://www.democraticunderground.com/100218263791
Stable Audio will, if anything, make him more apprehensive.
Damn damn damn...
Blues Heron
(8,388 posts)
highplainsdem
(60,016 posts)
emulatorloo
(46,135 posts)
company to train it, rather than artists. It definitely sounds like stock music to me.
highplainsdem
(60,016 posts)
often are.
Nothing like classic rock...
Elessar Zappa
(16,385 posts)
that most top 40 pop music has been musically extremely repetitive since at least the 50s (and probably long before). It might be a little worse now but not by much. People essentially like to hear the same chord progressions, drum loops, bass lines, etc. over and over again. There is a lot of fairly original music out there now, just as there was back in the 60s and 70s, but in general you won't hear it on the radio.
newdayneeded
(2,493 posts)
for the first time in years. I swear it's just the same chorus for 3 1/2 minutes!
highplainsdem
(60,016 posts)
audio samples in their announcement.
Scrivener7
(58,344 posts)
Hugin
(37,440 posts)
Because if that's so, the source music was originally played by humans. That would also explain why the quality is so good.
So, in essence, this is exactly the same functionality that chatbots use, stitching together text replies from a huge reservoir of Internet scrapings. Except in this case they've upped the exploitation of the talents of the human studio musicians who provided the source.
Elessar Zappa
(16,385 posts)
I won't listen to it even if it's good. I want to support actual human artists. Same with TV/movie scripts.
GenXer47
(1,204 posts)
As a jazz musician who used to practice like an Olympian, I had to accept long ago that the music industry/profession/hobbyland is full to the brim with lazy, untalented hacks who just want to show off and maybe get laid.
And, I had to grasp that this profession is completely useless from a utilitarian point of view. We could survive (blandly) without it.
So, when I hear that AI can out-compose pretty much anyone, the lemonade from this lemon could be thirst-quenching for jazz musicians, who specialize in the love and beauty of live improvisation with other human beings. THAT is what the audience is there for - to passively participate in the incredible intimacy of two or more human beings creating harmony and dissonance without ever discussing it beforehand - to the point that we can predict what each other will play. I've "made love" to hundreds of other musicians in this way and it's a feeling on par with actual sex, or perhaps landing a jumbo jet full of human souls.
AI will scorch the landscape of musical hacks/attention seekers, and finally open up the space for those of us who "get it".
Hugin
(37,440 posts)
I can't say I disagree with any of your points.
I will say what is going on is a little deeper, though.
Let's say someone is recognized as the best and most talented at performing with a particular instrument. How this works is a third party lifts their riffs, with or without compensation, and loads them into a machine which presents the riffs custom mixed on demand. In essence, that performer is competing against themselves.
It's a situation lamented by many, as it occurs even now with musicians trying to change their lineup and getting thwarted by recordings of their earlier work that they no longer control.
My point being it's still people with cucumbers in their pants who are benefiting. They are simply different people and arguably less deserving.
tinrobot
(11,952 posts)
People have been writing to a formula since the Brill Building was a thing.
Brian Eno used to compose music by throwing dice.
And people have been using computer algorithms to compose music since at least the 1980s, when MIDI became a thing.
The only thing new about this is the algorithm.
Chainfire
(17,757 posts)
An automobile will never replace my horse!