Generate spoken audio from text in over 20 languages using a pre-trained model with a single pip install and terminal command.
Train a custom AI voice model on recordings of a specific speaker to produce that person's voice style from text.
Build a voice assistant or audiobook generator that synthesizes natural-sounding speech without a cloud API.
Deploy a trained voice model to an Android or iOS app using TFLite for on-device speech generation.
Quick inference via pip is straightforward; training a custom voice requires a GPU and a prepared audio dataset.
Mozilla TTS is a Python library for converting text into spoken audio using AI. It was built by Mozilla's research team and covers the full pipeline from typed words to a finished audio file. The library has been used to build products in over 20 languages.

The system works in two main stages. First, a text-to-spectrogram model (such as Tacotron2 or Glow-TTS) converts text into a spectrogram, a visual representation of sound frequencies over time. Second, a vocoder model (such as WaveRNN or MelGAN) converts that spectrogram into an actual audio waveform you can listen to. You can mix and match models for each stage depending on how much you care about speed versus audio quality.

If you just want to generate speech from existing pre-trained voices, you can install it in one line via pip and run it from the terminal. If you want to train your own voice model on a custom dataset, you clone the code, prepare your audio data, write a short configuration file, and run a training script. The repository includes tools to check your dataset for quality issues before training, and training logs are shown both in the terminal and in TensorBoard, a visual monitoring tool.

The library also includes a speaker encoder, which learns to represent different voices as numeric vectors (embeddings). This enables multi-speaker models that can produce different voice styles from a single trained model. Training can run across multiple GPUs for speed, and trained models can be converted to TensorFlow or a compact format called TFLite for deployment on mobile devices. A demo server is included for testing models through a web interface. Pre-trained models are available for download from the project's wiki.
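The two-stage design described above (text to spectrogram, then spectrogram to waveform) can be pictured with a toy sketch. Nothing below is Mozilla TTS code or its API: the function names, the one-frame-per-character "model", and the sine-wave "vocoder" are hypothetical stand-ins, chosen only to show how two independent stages plug together and can be swapped.

```python
import math
from typing import Callable, List

Spectrogram = List[List[float]]  # frames x frequency bins
Waveform = List[float]           # raw audio samples


def toy_text_to_spectrogram(text: str, n_bins: int = 4) -> Spectrogram:
    """Hypothetical stand-in for stage 1 (the role Tacotron2 or Glow-TTS plays).

    Emits one frame per character, with all energy in a single bin.
    """
    frames = []
    for ch in text:
        active = ord(ch) % n_bins
        frames.append([1.0 if b == active else 0.0 for b in range(n_bins)])
    return frames


def toy_vocoder(spec: Spectrogram,
                samples_per_frame: int = 8,
                sample_rate: int = 8000) -> Waveform:
    """Hypothetical stand-in for stage 2 (the role WaveRNN or MelGAN plays).

    Renders each frame as a sum of sine waves, one per active bin.
    """
    wave = []
    for i, frame in enumerate(spec):
        for s in range(samples_per_frame):
            t = (i * samples_per_frame + s) / sample_rate
            wave.append(sum(amp * math.sin(2 * math.pi * (b + 1) * 200.0 * t)
                            for b, amp in enumerate(frame)))
    return wave


def synthesize(text: str,
               text2spec: Callable[[str], Spectrogram],
               vocoder: Callable[[Spectrogram], Waveform]) -> Waveform:
    """Chain the two stages; either one can be replaced independently."""
    return vocoder(text2spec(text))


wave = synthesize("hello", toy_text_to_spectrogram, toy_vocoder)
print(len(wave))  # 5 characters * 8 samples per frame = 40
```

Because `synthesize` only depends on the shape of each stage's input and output, swapping in a faster or higher-quality stage is a one-argument change, which mirrors the mix-and-match trade-off between speed and audio quality mentioned above.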
Verify against the repo before relying on details.