Stable Audio Open
Open-weight text-to-audio — 47-second sound effects and music.
Open Source · 6–8 GB VRAM
- Min VRAM: 6 GB
- GPU class: Entry GPU
- Quant: FP16
Actually Free · Open Source · Watermark-Free · Hobbyist-Friendly
AudioCraft
Meta's text-to-music & sound-effect model family.
Runs locally · Entry GPU (6–8 GB)
MusicGen small and medium fit in 8 GB; large needs 16 GB.
AudioCraft bundles MusicGen, AudioGen, and EnCodec from Meta FAIR. MusicGen produces 30-second music clips from a text prompt or a melody reference; AudioGen generates ambient sound effects. MusicGen was the benchmark open-source baseline for music generation before Stable Audio Open.
MIT-licensed code; CC-BY-NC weights (non-commercial).
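As a rough illustration of the VRAM guidance and the 30-second clip length above, here is a minimal sketch of generating a clip locally. The helper names `pick_checkpoint` and `generate_clip` are hypothetical, and the choice of the small checkpoint for 6 GB cards is an assumption (this page only states that small/medium fit in 8 GB and large needs 16 GB). The AudioCraft calls themselves (`MusicGen.get_pretrained`, `set_generation_params`, `generate`, `audio_write`) come from the `audiocraft` package, which must be installed separately.

```python
from typing import Optional


def pick_checkpoint(vram_gb: float) -> Optional[str]:
    """Hypothetical helper: map available VRAM to a MusicGen checkpoint id."""
    if vram_gb >= 16:
        return "facebook/musicgen-large"   # page: large needs 16 GB
    if vram_gb >= 8:
        return "facebook/musicgen-medium"  # page: small/medium fit in 8 GB
    if vram_gb >= 6:
        # Assumption: on a 6 GB entry GPU, fall back to the smallest model.
        return "facebook/musicgen-small"
    return None  # below the 6-8 GB entry class listed on this page


def generate_clip(prompt: str, vram_gb: float,
                  out_stem: str = "clip", seconds: float = 30.0) -> str:
    """Generate one clip from a text prompt and write it as a WAV file."""
    # Imported lazily so pick_checkpoint works without AudioCraft installed.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    checkpoint = pick_checkpoint(vram_gb)
    if checkpoint is None:
        raise ValueError("Not enough VRAM for any checkpoint in this sketch")

    model = MusicGen.get_pretrained(checkpoint)
    model.set_generation_params(duration=seconds)
    wav = model.generate([prompt])  # tensor shaped [batch, channels, samples]

    # audio_write appends the ".wav" suffix to the stem by default.
    audio_write(out_stem, wav[0].cpu(), model.sample_rate, strategy="loudness")
    return out_stem + ".wav"
```

A call such as `generate_clip("lo-fi piano with vinyl crackle", vram_gb=8)` would download the medium checkpoint on first use. Note that the weights are CC-BY-NC, so output from this sketch is subject to the non-commercial terms.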