An open-source music generator has been trained on 20,000 hours of licensed music and produces short samples based on textual and melody prompts.
Ever wondered what Johann Sebastian Bach’s chilling Toccata and Fugue in D minor would sound like if combined with an “80s driving pop song with heavy drums and synth pads in the background”? Neither have we, but Meta’s new AI music generator has an answer.
MusicGen, a project by a research team at Meta, is available as a demo on Hugging Face, where anyone can try it out. It generates 12 seconds of audio from a text description, though generation can take several minutes.
Example prompts include a “cheerful country song with acoustic guitars” and a “light and cheery EDM track, with syncopated drums, eery pads, and strong emotions bpm: 130.”
Alternatively, one can provide reference audio for the model to extract a melody from, and it will then condition its output on both the description and the melody.
The accompanying paper reports that MusicGen performs slightly better than comparable systems, including MusicLM, a text-to-music generator released last month by search giant Google.
“Unlike existing methods like MusicLM, MusicGen doesn’t require a self-supervised semantic representation,” the researchers said, adding that it produces high-quality samples with “only 50 auto-regressive steps per second of audio.”
According to the researchers, MusicGen was trained on 20,000 hours of licensed music, including an internal dataset of 10,000 high-quality tracks, as well as music data from Shutterstock and Pond5.
The release of another text-to-music generator comes amid fears among creators about the proliferation of AI-generated music and its impact on the industry.