AI Tools Finder

Stable Audio Open

Open-weight text-to-audio — 47-second sound effects and music.

Open Source · 6–8 GB VRAM · Runs locally
Actually Free · Open Source · Watermark-Free · Hobbyist-Friendly · API
Updated 2026-01-20

Hardware requirements

Runs locally · Entry GPU (6–8 GB)

Min VRAM: 6 GB
Rec. VRAM: 8 GB
Min RAM: 16 GB
Rec. RAM: 16 GB
Disk: 10 GB
GPU class: Entry GPU

11.8+ · No Apple Silicon · GPU Required · Quant: FP16

47-second generations need ~8 GB VRAM at full precision.
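
As a rough illustration of how the figures in this section could be turned into a pre-flight check, the sketch below compares a machine's specs against the listed minimums and recommendations. The names `Requirements` and `check_hardware` are hypothetical and not part of any Stable Audio Open tooling; the thresholds are simply the values listed here. On an NVIDIA system, the VRAM figure could be read from something like `torch.cuda.get_device_properties(0).total_memory`, but the sketch takes plain numbers so that it runs without a GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    # Thresholds copied from the listing above (all in GB).
    min_vram_gb: float = 6.0
    rec_vram_gb: float = 8.0
    min_ram_gb: float = 16.0
    disk_gb: float = 10.0


def check_hardware(vram_gb, ram_gb, free_disk_gb, req=Requirements()):
    """Classify a machine as 'unsupported', 'minimum', or 'recommended'.

    Hypothetical helper: it only compares numbers and does not probe hardware.
    """
    below_minimum = (
        vram_gb < req.min_vram_gb
        or ram_gb < req.min_ram_gb
        or free_disk_gb < req.disk_gb
    )
    if below_minimum:
        return "unsupported"
    if vram_gb >= req.rec_vram_gb:
        return "recommended"
    return "minimum"


if __name__ == "__main__":
    # A 6 GB card with 16 GB RAM and 20 GB free disk meets only the minimum.
    print(check_hardware(vram_gb=6, ram_gb=16, free_disk_gb=20))
```

A machine in the "minimum" tier may still struggle with full 47-second generations, since those are listed as needing about 8 GB of VRAM.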

What is Stable Audio Open?

Stable Audio Open is Stability AI's open-weight audio diffusion model. It generates 44.1 kHz stereo audio, up to 47 seconds long, from a text prompt, and is optimised for sound effects, foley, and short musical loops rather than full songs. It runs locally, and the Stability AI Community License permits commercial use within its revenue limits.
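
To give a sense of scale for those output figures, the short sketch below works out how many sample frames a maximum-length clip contains and how large it would be as uncompressed audio. The 16-bit PCM sample width is an assumption made for the arithmetic, not something stated in this listing, and `pcm_size_bytes` is a hypothetical helper name.

```python
SAMPLE_RATE_HZ = 44_100  # from the listing: 44.1 kHz
CHANNELS = 2             # from the listing: stereo
MAX_SECONDS = 47         # from the listing: maximum clip length


def pcm_size_bytes(seconds, sample_rate=SAMPLE_RATE_HZ,
                   channels=CHANNELS, bytes_per_sample=2):
    """Size of raw PCM audio in bytes.

    bytes_per_sample=2 assumes 16-bit samples (an assumption, see lead-in).
    """
    frames = int(seconds * sample_rate)
    return frames * channels * bytes_per_sample


if __name__ == "__main__":
    frames = MAX_SECONDS * SAMPLE_RATE_HZ
    size = pcm_size_bytes(MAX_SECONDS)
    print(f"{frames:,} frames per channel")          # 2,072,700 frames per channel
    print(f"{size:,} bytes ({size / 2**20:.1f} MiB)")  # 8,290,800 bytes (7.9 MiB)
```

Under that assumption, a full-length clip is roughly 8 MB as an uncompressed WAV, which is small next to the 10 GB of disk listed for the model itself.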

Pros & cons

Pros

  • True 44.1 kHz stereo output
  • Genuinely usable for foley / SFX work
  • Runs on consumer GPUs (8 GB+)

Cons

  • Not a music generator — vocals & long compositions are out of scope
  • Community License has revenue clauses

What's actually free?

Stability AI Community License: free for individuals and for organizations with under $1M ARR.

✓ Actually Free · Open Source · Watermark-Free