Stable Audio Open
Open-weight text-to-audio — 47-second sound effects and music.
Open Source · 6–8 GB VRAM · Runs locally
Actually Free · Open Source · Watermark-Free · Hobbyist-Friendly · API
Updated 2026-01-20
Hardware requirements
Runs locally · Entry GPU (6–8 GB)
Min VRAM: 6 GB
Rec. VRAM: 8 GB
Min RAM: 16 GB
Rec. RAM: 16 GB
Disk: 10 GB
GPU class: Entry GPU
11.8+ · No Apple Silicon · GPU Required · Quant: FP16
47-second generations need ~8 GB VRAM at full precision.
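As a rough illustration of how the VRAM figures above might be used, the sketch below sorts a GPU into one of three tiers. The thresholds (6 GB minimum, 8 GB recommended) come from this listing; the function name `vram_tier` and the tier labels are hypothetical and are not part of any Stable Audio Open tooling.

```python
# Thresholds taken from the hardware table in this listing.
MIN_VRAM_GB = 6
REC_VRAM_GB = 8


def vram_tier(available_gb: float) -> str:
    """Classify a GPU by VRAM against the listed requirements.

    The labels are illustrative only (an assumption, not an official scheme).
    """
    if available_gb < MIN_VRAM_GB:
        return "below minimum"
    if available_gb < REC_VRAM_GB:
        return "minimum"
    return "recommended"


if __name__ == "__main__":
    for gb in (4, 6, 8, 12):
        print(f"{gb} GB -> {vram_tier(gb)}")
```

A card in the "minimum" tier meets the listed floor but falls short of the ~8 GB quoted for full 47-second generations.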
What is Stable Audio Open?
Stability AI's open-weight audio diffusion model. It generates 44.1 kHz stereo audio of up to 47 seconds from a text prompt and is optimised for sound effects, foley, and short musical loops rather than full songs. It runs locally, and the Stability AI Community License allows commercial use below a revenue threshold.
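To put the output format in concrete terms, the short calculation below works out how many sample frames a maximum-length clip contains and how large it would be as uncompressed audio. The sample rate, channel count, and 47-second limit are from the description above; the 16-bit PCM sample width is an assumption chosen for illustration, and the helper names are hypothetical.

```python
SAMPLE_RATE_HZ = 44_100   # from the listing: 44.1 kHz
CHANNELS = 2              # from the listing: stereo
MAX_DURATION_S = 47       # from the listing: up to 47 seconds
BYTES_PER_SAMPLE = 2      # assumption: 16-bit PCM when saved as WAV


def frames_for(duration_s: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of sample frames (one frame = one sample per channel)."""
    return round(duration_s * sample_rate_hz)


def pcm_bytes(duration_s: float,
              channels: int = CHANNELS,
              bytes_per_sample: int = BYTES_PER_SAMPLE) -> int:
    """Size of the raw audio payload, excluding any file header."""
    return frames_for(duration_s) * channels * bytes_per_sample


if __name__ == "__main__":
    frames = frames_for(MAX_DURATION_S)
    size = pcm_bytes(MAX_DURATION_S)
    print(f"{frames:,} frames per channel")         # 2,072,700
    print(f"{size:,} bytes (~{size / 1e6:.1f} MB)")  # 8,290,800 bytes (~8.3 MB)
```

Under these assumptions a full-length clip is about 8.3 MB of raw audio, so a batch of generated sound effects uses little disk space compared with the 10 GB listed for the model itself.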
Pros & cons
Pros
- True 44.1 kHz stereo output
- Genuinely usable for foley / SFX work
- Runs on consumer GPUs (8 GB+)
Cons
- Not a music generator — vocals & long compositions are out of scope
- Community License has revenue clauses
What's actually free?
Stability AI Community License: free for individuals and for businesses under $1M ARR.
✓ Actually Free · Open Source · Watermark-Free