AI Tools Finder

AudioCraft (MusicGen)

Meta's text-to-music & sound-effect model family.

Open Source · 8–16 GB VRAM · Runs locally
Actually Free · No Signup · Open Source · Watermark-Free · Hobbyist-Friendly
Updated 2024-07-19

Hardware requirements

Runs locally · Entry GPU (6–8 GB)

8–16 GB VRAM
Min VRAM: 8 GB
Rec. VRAM: 16 GB
Min RAM: 16 GB
Rec. RAM: 32 GB
Disk: 15 GB
GPU class: Entry GPU
11.8+ · Apple Silicon ✓ · GPU Required · Quant: FP16

MusicGen small/medium fits in 8 GB; large needs 16 GB.
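The sizing note above can be turned into a small lookup. This is only a sketch: the thresholds are the ones quoted in this listing, the function name `pick_checkpoint` is hypothetical, and the checkpoint identifiers are assumed to be the Hugging Face names under which the MusicGen weights are published.

```python
from typing import Optional


def pick_checkpoint(vram_gb: float) -> Optional[str]:
    """Map available VRAM to a MusicGen checkpoint name.

    Thresholds follow this listing (8 GB minimum, 16 GB for large);
    they are rough guidance, not measured limits.
    """
    if vram_gb >= 16:
        return "facebook/musicgen-large"
    if vram_gb >= 8:
        # small also fits here; medium is chosen as the larger of the two
        return "facebook/musicgen-medium"
    # Below the listed minimum: no recommendation
    return None


if __name__ == "__main__":
    for gb in (6, 8, 12, 16, 24):
        print(gb, "GB ->", pick_checkpoint(gb))
```

Returning `None` below 8 GB mirrors the listed minimum; smaller cards may still work with the small checkpoint, but that is outside what this page states.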


What is AudioCraft (MusicGen)?

AudioCraft bundles MusicGen, AudioGen, and EnCodec from Meta FAIR. MusicGen generates music clips of up to 30 seconds from a text prompt, optionally guided by a melody reference; AudioGen generates sound effects and ambient audio. It was the benchmark open-source baseline for music generation before Stable Audio Open.
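A minimal text-to-music sketch with the `audiocraft` Python package might look like the following. It assumes `audiocraft` and PyTorch are installed and that the `facebook/musicgen-small` checkpoint can be downloaded. The helper names `clamp_duration` and `generate_clips`, the output file stems, and the 1-second lower bound are illustrative choices, not part of the library.

```python
from typing import List

MAX_SECONDS = 30.0  # clip length cap quoted in this listing
MIN_SECONDS = 1.0   # assumed lower bound, chosen for this sketch only


def clamp_duration(seconds: float) -> float:
    """Keep the requested duration inside the range this sketch supports."""
    return max(MIN_SECONDS, min(float(seconds), MAX_SECONDS))


def generate_clips(prompts: List[str], seconds: float = 10.0,
                   checkpoint: str = "facebook/musicgen-small") -> List[str]:
    """Generate one clip per prompt and return the written file stems."""
    # Imported lazily so clamp_duration works without audiocraft installed.
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    model = MusicGen.get_pretrained(checkpoint)
    model.set_generation_params(duration=clamp_duration(seconds))
    wavs = model.generate(prompts)  # tensor shaped [batch, channels, samples]

    stems = []
    for i, wav in enumerate(wavs):
        stem = f"clip_{i}"
        # Writes clip_<i>.wav with loudness normalization
        audio_write(stem, wav.cpu(), model.sample_rate, strategy="loudness")
        stems.append(stem)
    return stems


if __name__ == "__main__":
    print(generate_clips(["lo-fi hip hop beat with warm piano"], seconds=8))
```

The first call downloads the weights, so expect a delay and several gigabytes of disk use before any audio is produced.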

Pros & cons

Pros

  • For a long time the highest-quality open-source option for music generation
  • Melody conditioning lets you steer the output with a reference tune
  • Includes the EnCodec audio codec, which is useful as a building block

Cons

  • Weights are licensed for non-commercial use only (research and personal projects)
  • 30-second cap; longer outputs need stitching
  • Slower iteration than Stable Audio Open at similar quality
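One simple way to stitch clips past the 30-second cap is a linear crossfade over a short overlap. The sketch below is a generic audio technique and not something AudioCraft prescribes; it assumes both clips are mono, share a sample rate, and are already loaded as lists of float samples. The function name `crossfade_stitch` is hypothetical.

```python
from typing import List


def crossfade_stitch(a: List[float], b: List[float], overlap: int) -> List[float]:
    """Join two mono clips, blending the last `overlap` samples of `a`
    into the first `overlap` samples of `b` with a linear ramp."""
    if overlap < 0 or overlap > min(len(a), len(b)):
        raise ValueError("overlap must be between 0 and the shorter clip length")
    if overlap == 0:
        return list(a) + list(b)

    head = a[:len(a) - overlap]   # part of a that is kept unchanged
    tail = b[overlap:]            # part of b that is kept unchanged
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of b rises across the overlap
        mixed.append(a[len(a) - overlap + i] * (1.0 - w) + b[i] * w)
    return head + mixed + tail


if __name__ == "__main__":
    first = [1.0] * 4
    second = [0.0] * 4
    print(crossfade_stitch(first, second, overlap=2))
```

A crossfade only hides the seam. It does not keep tempo, key, or phrasing consistent between clips, so prompts for adjacent segments still need to be chosen with care.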

What's actually free?

MIT-licensed code; CC-BY-NC weights (non-commercial).

✓ Actually Free · No Signup · Open Source · Watermark-Free

Alternatives

Stable Audio Open

Open-weight text-to-audio — 47-second sound effects and music.

Open Source · 6–8 GB VRAM
Min VRAM: 6 GB
GPU class: Entry GPU
Quant: FP16
Actually Free · Open Source · Watermark-Free · Hobbyist-Friendly