explaingit

duixcom/duix-avatar

12,971 · C · Audience · vibe coder · Complexity · 4/5 · Setup · hard

TLDR

Duix.Avatar is a local desktop tool that creates realistic AI-generated videos of a digital copy of you. Provide a face video and a voice sample, type a script, and it produces a video of your likeness speaking it, entirely on your own machine.

Mindmap

mindmap
  root((Duix Avatar))
    What it does
      AI video of your face
      Voice cloning
      Lip sync output
    Input
      Face video clip
      Voice sample
      Text script
    Output
      Realistic video
      8 languages
    Setup
      Docker required
      NVIDIA GPU
      Local only

Things people build with this

USE CASE 1

Create a realistic video of yourself delivering a presentation or product demo without needing to film yourself again.

USE CASE 2

Generate avatar videos in eight languages from a single text script without hiring voice actors or recording new audio.

USE CASE 3

Build a conversational AI avatar product using the open API for real-time interactive video responses in a web or mobile app.

Tech stack

C · Docker · NVIDIA GPU

Getting it running

Difficulty · hard · Time to first run · 1 day+

Requires an NVIDIA GPU, at least 32GB RAM, roughly 130GB of disk space, and downloading about 70GB of Docker images before anything runs.
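Before committing to the 70GB download, it can help to check a machine against those numbers. The following is a minimal sketch in Python, not part of Duix.Avatar itself: the `Machine` class and the `unmet_requirements` and `probe_local` helpers are hypothetical names made up for illustration, the thresholds are simply the figures quoted above, and treating `nvidia-smi` on the PATH as a sign of an NVIDIA GPU is an assumption. The standard library has no portable way to read installed RAM, so that value is entered by hand.

```python
from __future__ import annotations

import shutil
from dataclasses import dataclass

# Thresholds copied from the requirements quoted above (assumption:
# the repo's README may state different or more detailed figures).
MIN_RAM_GB = 32
MIN_DISK_GB = 130


@dataclass
class Machine:
    """Hypothetical description of the machine being checked."""
    ram_gb: float
    free_disk_gb: float
    has_nvidia_gpu: bool
    has_docker: bool


def unmet_requirements(m: Machine) -> list[str]:
    """Return a human-readable list of requirements the machine fails."""
    problems = []
    if not m.has_nvidia_gpu:
        problems.append("no NVIDIA GPU detected")
    if not m.has_docker:
        problems.append("Docker not found")
    if m.ram_gb < MIN_RAM_GB:
        problems.append(f"RAM {m.ram_gb:.0f}GB < {MIN_RAM_GB}GB required")
    if m.free_disk_gb < MIN_DISK_GB:
        problems.append(
            f"free disk {m.free_disk_gb:.0f}GB < {MIN_DISK_GB}GB required"
        )
    return problems


def probe_local(path: str = ".") -> tuple[float, bool, bool]:
    """Probe what the standard library can see on this machine.

    Returns (free disk in GB, nvidia-smi on PATH, docker on PATH).
    Finding nvidia-smi is only a heuristic for a working NVIDIA driver.
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    has_gpu = shutil.which("nvidia-smi") is not None
    has_docker = shutil.which("docker") is not None
    return free_gb, has_gpu, has_docker


if __name__ == "__main__":
    free_gb, has_gpu, has_docker = probe_local(".")
    # RAM is entered by hand; replace 16 with your installed RAM in GB.
    machine = Machine(ram_gb=16, free_disk_gb=free_gb,
                      has_nvidia_gpu=has_gpu, has_docker=has_docker)
    for problem in unmet_requirements(machine):
        print("UNMET:", problem)
```

An empty result only means the headline numbers are satisfied. It says nothing about GPU memory, driver versions, or Docker's GPU support, which still need checking against the repo.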

In plain English

Duix.Avatar is an open-source desktop tool for creating AI-generated videos of a realistic digital copy of yourself. You provide a short video clip of your face and a voice sample, and the software builds a virtual model that can be made to say whatever you write or record. The result is a video of your likeness speaking the content you provided.

Once installed, the tool runs entirely on your local computer with no internet connection required. All processing happens on your machine, so no footage, voice samples, or generated videos are sent to any server.

It runs on Windows 10 (build 19042 or later) and Ubuntu 22.04. Setup is Docker-based: you download several Docker images totaling roughly 70 gigabytes and run them with a single command. The system also requires an NVIDIA graphics card, at least 32 gigabytes of RAM, and substantial disk space (around 130 gigabytes combined for the Windows version).

Once it is running, you interact through a desktop application where you upload your appearance and voice samples, then type or speak the text you want the avatar to deliver. The avatar lip-syncs to the audio and reflects natural speech patterns. Scripts can be written in eight languages: English, Japanese, Korean, Chinese, French, German, Arabic, and Spanish.

The company behind the project, Duix.com, says it previously sold this technology commercially and that more than 10,000 businesses have used it. The open-source release makes the same underlying system available at no cost. A real-time interactive mode is also available through a separate open API for developers who want to build conversational avatar applications.

Copy-paste prompts

Prompt 1
I have Duix.Avatar running via Docker. Help me understand what makes a good face video and voice sample so I get the most realistic avatar output.
Prompt 2
My Duix.Avatar output has visible lip-sync issues on certain sounds. What can I adjust in the settings or input to improve the quality?
Prompt 3
I want to create avatar videos in Spanish using Duix.Avatar. Walk me through writing a script and selecting the right language settings.
Prompt 4
Help me use the Duix.Avatar real-time API to build a simple web page where a user types a message and gets back a short avatar video response.
Prompt 5
The Duix.Avatar docs say I need 32GB of RAM but I only have 16GB. Are there lighter model options or settings to reduce memory usage?

Verify against the repo before relying on details.