explaingit

nari-labs/dia

Analysis updated 2026-05-18

Stars · 19,294 | Language · Python | Audience · developer | Complexity · 3/5 | Setup · hard

TLDR

Open-weight AI model that generates realistic multi-speaker dialogue audio from scripts, with voice cloning and natural sounds like laughter and sighing.

Mindmap

mindmap
  root((Dia))
    What it does
      Multi-speaker dialogue
      Voice cloning
      Natural sounds
      Emotion control
    Tech stack
      Python
      PyTorch
      CUDA
      Hugging Face
    Use cases
      Podcast generation
      Interactive voice apps
      Dialogue content
      Voice acting replacement
    Audience
      Audio developers
      Content creators
      Product builders

What do people build with it?

USE CASE 1

Build a podcast generator that creates multi-speaker conversations from scripts without hiring voice actors.

USE CASE 2

Create interactive voice apps where characters have distinct, cloneable voices that respond naturally to user input.

USE CASE 3

Generate dialogue-heavy content like audiobook chapters, radio dramas, or interview simulations with realistic speaker variation.

USE CASE 4

Prototype voice-based products with emotion and tone control by conditioning the model on audio prompts.

What is it built with?

Python, PyTorch, CUDA, Hugging Face Transformers

How does it compare?

Repo | nari-labs/dia | anthropics/claude-plugins-official | comet-ml/opik
Stars | 19,294 | 19,291 | 19,288
Language | Python | Python | Python
Setup difficulty | hard | easy | moderate
Complexity | 3/5 | 2/5 | 3/5
Audience | developer | developer | developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1h+

Requires CUDA/GPU setup, large model downloads from Hugging Face, and PyTorch compilation.
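Before starting the longer setup, it can help to check which of these prerequisites are already present. The following is a minimal preflight sketch, not part of the Dia repo: the function name `preflight` is hypothetical, and treating `nvidia-smi` on the PATH as a sign of a usable GPU driver, and `huggingface_hub` as the download client, are assumptions.

```python
import importlib.util
import shutil


def preflight() -> dict:
    """Report whether the prerequisites named in the setup note appear present.

    Hypothetical helper for illustration; it only inspects the local
    environment and does not import or run any model code.
    """
    return {
        # True if a PyTorch package can be found in this environment.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # True if the NVIDIA driver CLI is on PATH (assumed proxy for a GPU setup).
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        # True if the Hugging Face Hub client can be found (assumed download path).
        "huggingface_hub_installed": importlib.util.find_spec("huggingface_hub") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A "missing" entry only means the check did not find that piece; the repo's own install instructions remain the reference for what is actually required.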

Use freely for any purpose, including commercial use. Keep the license notice and disclose any changes you make; the license also includes a patent grant.

In plain English

Dia is an open-weight text-to-speech model from Nari Labs that generates realistic multi-speaker dialogue from a written script. Unlike typical text-to-speech tools that synthesize a single narrator voice, Dia produces back-and-forth conversations between two distinct speakers, complete with nonverbal sounds such as laughter, coughing, sighing, and gasping. You give it a script with speaker tags like [S1] and [S2], and it outputs audio that sounds like a real two-person conversation.

It also supports voice cloning: provide a short audio sample and Dia will match that voice's tone and style. Emotion and tone can be steered by conditioning the model on an audio prompt.

You'd use Dia if you're building a podcast generator, dialogue-based content, interactive voice apps, or any product that needs lifelike multi-speaker audio without expensive voice actors. The model has 1.6 billion parameters, runs on NVIDIA GPUs, currently supports English only, and is available through Hugging Face Transformers. The stack is Python, with PyTorch and CUDA required for inference.
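The [S1]/[S2] script format described above can be assembled programmatically. This is a minimal sketch that only builds the text input; it does not call Dia. The helper name `build_script` is hypothetical, and joining all turns into a single space-separated string is an assumption about how the script should be laid out.

```python
from typing import Iterable, Tuple

# Speaker tags named in the description above.
SPEAKER_TAGS = ("[S1]", "[S2]")


def build_script(turns: Iterable[Tuple[int, str]]) -> str:
    """Join (speaker_number, text) turns into one tagged script string.

    Hypothetical helper: speaker_number must be 1 or 2, and runs of
    whitespace inside each line are collapsed to single spaces.
    """
    parts = []
    for speaker, text in turns:
        if speaker not in (1, 2):
            raise ValueError(f"speaker must be 1 or 2, got {speaker!r}")
        cleaned = " ".join(text.split())
        if not cleaned:
            raise ValueError("empty dialogue line")
        parts.append(f"{SPEAKER_TAGS[speaker - 1]} {cleaned}")
    return " ".join(parts)


if __name__ == "__main__":
    script = build_script([
        (1, "Did you hear the new episode?"),
        (2, "I did, it was great."),
    ])
    print(script)
```

Validating speaker numbers up front keeps a malformed script from reaching the model, where a bad tag would be harder to diagnose from the audio alone.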

Copy-paste prompts

Prompt 1
How do I set up Dia to generate a two-speaker conversation from a script with [S1] and [S2] tags?
Prompt 2
Show me how to clone a voice in Dia using a short audio sample and apply it to a dialogue script.
Prompt 3
What's the process for conditioning Dia's output on an emotion or tone using an audio prompt?
Prompt 4
How do I run Dia inference on an NVIDIA GPU using PyTorch and what are the minimum hardware requirements?
Prompt 5
Can you walk me through generating a podcast episode with multiple speakers using Dia's Hugging Face model?

Frequently asked questions

What is dia?

Open-weight AI model that generates realistic multi-speaker dialogue audio from scripts, with voice cloning and natural sounds like laughter and sighing.

What language is dia written in?

Mainly Python. The stack also includes PyTorch and CUDA.

What license does dia use?

Use freely for any purpose, including commercial use. Keep the license notice and disclose any changes you make; the license also includes a patent grant.

How hard is dia to set up?

Setup difficulty is rated hard, with roughly 1h+ to a first successful run.

Who is dia for?

Mainly developers.

Verify against the repo before relying on details.