explaingit

facebookresearch/audiocraft

23,296 | Jupyter Notebook | Audience · developer | Complexity · 3/5 | Maintained | License | Setup · moderate

TLDR

AI library for generating music and sound effects from text descriptions. Write what you want to hear, and it creates matching audio clips.

Mindmap

mindmap
  root((AudioCraft))
    What it does
      Generate music from text
      Create sound effects
      Compress audio efficiently
      Watermark AI audio
    Models included
      MusicGen
      AudioGen
      EnCodec
      AudioSeal
    Use cases
      Background music for apps
      Sound design experiments
      Audio research
    Tech stack
      PyTorch
      Python
      Neural networks

Things people build with this

USE CASE 1

Generate background music for games, videos, or apps by typing a description of the sound you want.

USE CASE 2

Create sound effects and ambient audio (rain, crowds, footsteps) for projects without recording or licensing existing audio.

USE CASE 3

Experiment with AI-guided music composition using specific chords, melodies, or drum patterns as constraints.

USE CASE 4

Research and fine-tune audio generation models on your own datasets using the included training code.

Tech stack

Python · PyTorch · Jupyter Notebook

Getting it running

Difficulty · moderate | Time to first run · 30min

PyTorch installation and model weights download can take 10-15 minutes depending on internet speed and GPU availability.

Model weights are released for non-commercial use under a separate license (CC-BY-NC 4.0); the code itself is MIT-licensed.

In plain English

AudioCraft is a research library from Meta (Facebook Research) that lets you generate audio and music using AI. Give it a text description like "upbeat jazz with piano and drums" and it produces a matching audio clip; no musical knowledge or instruments are needed.

The library bundles several AI models. MusicGen generates music from text descriptions and can also follow a melody you hum or upload. AudioGen generates environmental sounds from text, such as rain, crowd noise, or footsteps. EnCodec is a neural audio compressor that converts audio into a compact form and back, which the other models use internally. There is also AudioSeal for adding invisible watermarks to AI-generated audio, and JASCO for music generation guided by specific chords, melodies, or drum patterns.

Under the hood everything is built on PyTorch, a popular framework for deep learning research. The models are pre-trained, so you can run them without training anything yourself: install the library and call the model with your text prompt. Training code is also included for researchers who want to fine-tune or build on top of these models.

You would use AudioCraft when prototyping apps that need background music generation, when doing audio research, or when experimenting with AI-generated sound design. It requires Python 3.9 and PyTorch. Model weights are available for non-commercial use under a separate license.
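As a rough illustration of "install the library and call the model with your text prompt", the sketch below wraps MusicGen and AudioGen behind one function. It assumes the `audiocraft` package is installed and that the `facebook/musicgen-small` and `facebook/audiogen-medium` checkpoints are the ones you want; `slugify`, `generate_clip`, and the `CHECKPOINTS` table are hypothetical names introduced here, not part of the library. Verify the calls against the repo before relying on them.

```python
import re

# Assumed checkpoint choices; other sizes are published as well.
CHECKPOINTS = {
    "music": "facebook/musicgen-small",
    "sfx": "facebook/audiogen-medium",
}


def slugify(prompt: str, max_len: int = 40) -> str:
    """Hypothetical helper: turn a prompt into a filesystem-friendly file stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "clip"


def generate_clip(prompt: str, kind: str = "music", duration: float = 8.0) -> str:
    """Generate one clip from a text prompt and return the written file name."""
    if kind not in CHECKPOINTS:
        raise ValueError(f"kind must be one of {sorted(CHECKPOINTS)}")

    # Imported lazily so the helpers above work without audiocraft installed.
    from audiocraft.data.audio import audio_write
    from audiocraft.models import AudioGen, MusicGen

    model_cls = MusicGen if kind == "music" else AudioGen
    model = model_cls.get_pretrained(CHECKPOINTS[kind])  # downloads weights on first use
    model.set_generation_params(duration=duration)  # clip length in seconds

    wav = model.generate([prompt])  # one prompt in, a batch of one waveform out
    stem = slugify(prompt)
    # audio_write appends the ".wav" extension to the stem by default.
    audio_write(stem, wav[0].cpu(), model.sample_rate, strategy="loudness")
    return f"{stem}.wav"
```

Calling `generate_clip("upbeat jazz with piano and drums")` would be the music case; passing `kind="sfx"` with a prompt such as "heavy rain on a tin roof" would route the same call through AudioGen. The first run is slow because the weights are fetched, and a GPU is strongly advisable for anything beyond short clips.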

Copy-paste prompts

Prompt 1
How do I use AudioCraft's MusicGen to generate a 30-second background music track from a text description?
Prompt 2
Show me how to generate environmental sound effects like rain or traffic noise using AudioCraft's AudioGen model.
Prompt 3
How can I guide music generation in AudioCraft using a specific melody or chord progression I provide?
Prompt 4
What's the simplest way to install AudioCraft and generate my first audio clip from a text prompt?
Prompt 5
How do I add an invisible watermark to audio generated by AudioCraft using AudioSeal?

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.