Riffusion
Riffusion is the Stable Diffusion v1.5 model with no architectural modifications, only fine-tuned on images of spectrograms paired with text. Audio processing happens downstream of the model: the generated spectrogram image is converted to sound afterward.
It can generate effectively unlimited variations of a prompt by varying the seed. The same web UIs and techniques used with Stable Diffusion, such as img2img, inpainting, negative prompts, and interpolation, work out of the box.
An audio spectrogram is a visual way to represent the frequency content of a sound clip. The x-axis represents time, and the y-axis represents frequency. The color of each pixel gives the amplitude of the audio at the frequency and time given by its row and column.
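As a rough illustration of the description above, the sketch below builds a magnitude spectrogram from a test tone with a short-time Fourier transform. The function name `magnitude_spectrogram` and the frame size, hop, and sample rate are assumptions chosen for the example; Riffusion's own spectrogram settings are not stated here and may differ.

```python
import numpy as np

def magnitude_spectrogram(samples, frame_size=1024, hop=256):
    """Return a (frequency_bins, time_frames) array of STFT magnitudes."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(samples) - frame_size) // hop
    # Slice the signal into overlapping, windowed frames (one per time column).
    frames = np.stack([
        samples[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    # FFT each frame, then transpose so rows are frequency and columns are time.
    return np.abs(np.fft.rfft(frames, axis=1)).T

sample_rate = 44100                        # assumed sample rate in Hz
t = np.arange(sample_rate) / sample_rate   # one second of timestamps
tone = np.sin(2 * np.pi * 440.0 * t)       # 440 Hz sine wave

spec = magnitude_spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())   # row with the most energy
peak_hz = peak_bin * sample_rate / 1024      # convert the row index to Hz
print(spec.shape, peak_bin, round(peak_hz, 1))
```

Each column of `spec` is one moment in time and each row is one frequency band, so a steady tone shows up as a single bright horizontal line. Rendering the array as an image (typically with the frequency axis flipped so low frequencies sit at the bottom) gives the kind of picture the model is trained on.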
Diffusion models can condition their output not only on a text prompt but also on other images. This is useful for modifying a sound while preserving the structure of an original clip you like. The denoising strength parameter controls how far the result deviates from the original clip and moves toward the new prompt.
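To make the role of denoising strength more concrete, here is a minimal sketch of how common img2img implementations interpret it: the strength sets what fraction of the denoising schedule is actually run on the noised input image. The helper name `img2img_schedule` is hypothetical, and this is an assumption about typical Stable Diffusion tooling rather than a description of Riffusion's exact code.

```python
def img2img_schedule(num_inference_steps, strength):
    """Return (start_step, steps_run) for a denoising strength in [0, 1].

    Hypothetical helper: low strength skips most of the schedule and stays
    close to the input; strength 1.0 runs every step and ignores the input.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_run = min(int(num_inference_steps * strength), num_inference_steps)
    start_step = num_inference_steps - steps_run
    return start_step, steps_run

for strength in (0.2, 0.5, 0.8, 1.0):
    start, run = img2img_schedule(50, strength)
    print(f"strength={strength}: skip {start} steps, denoise for {run}")
```

Under this reading, a low strength keeps most of the original spectrogram's structure because only the last few denoising steps are applied, while a high strength gives the prompt more room to reshape the clip.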
Related Music Generation Tools
Soundraw
SOUNDRAW is an innovative composition tool for creators. Create songs that match your content perfectly in minutes and with no music composition knowledge.
Solaria
SOLARIA is a native-English AI vocalist that provides a professional-quality singer for your projects at any time. With user control over all aspects of the melody and detailed parameters for various characteristics of the voice, SOLARIA is a high-quality singer with near-infinite options.
Mubert
Mubert is an AI-powered generative music platform that creates unique and royalty-free music for content creators.