Build a voice assistant that reads any text aloud in one of nine supported languages
Generate narration audio for videos, podcasts, or audiobooks from a written script
Add text-to-speech to a web app or desktop app without hosting a large model
Windows users must install the espeak-ng speech engine separately via a standalone installer before the library will work.
Kokoro is a text-to-speech model and its accompanying Python library. You give it a string of text and it produces audio of someone speaking that text. The underlying model has 82 million parameters, which makes it relatively small compared to many speech synthesis systems, yet the README states it produces quality comparable to larger models while running faster. The model weights are released under the Apache license, which means you can use them in commercial projects or personal work without cost.

The library is installable from PyPI with a single pip command. Basic usage involves creating a pipeline object, passing text to it along with a voice identifier, and iterating over the results, which come back as chunks of audio data you can play or save to a WAV file.

The library supports multiple languages including American and British English, Spanish, French, Hindi, Italian, Japanese, Brazilian Portuguese, and Mandarin Chinese. You select the language when creating the pipeline by passing a language code. Different voice options are available and are specified by name when generating audio. Under the hood, the library uses a companion package called misaki for converting written text into phonemes, which is the step of figuring out how words should sound before generating audio.

Setup notes in the README cover Windows (where you install the espeak-ng speech engine separately via an installer), Mac on Apple Silicon (where a specific environment variable enables GPU acceleration), and a conda configuration file for resolving dependency conflicts. The library can also be run on Google Colab without a local installation.

The project acknowledges the StyleTTS 2 architecture as its foundation and mentions a Discord community. The name Kokoro is a Japanese word meaning heart or spirit.
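The basic usage described above (create a pipeline with a language code, pass it text and a voice name, iterate over the audio chunks, save them to a WAV file) might look roughly like the sketch below. Several details are assumptions rather than facts stated on this page: the `KPipeline` class name, the `"a"` language code for American English, the `"af_heart"` voice name, the three-part result tuple, and the 24000 Hz sample rate all come from recollection of the project's README and should be checked against the repo. `write_chunks_to_wav` and `synthesize` are hypothetical helper names, not part of the library. The WAV writing uses only the Python standard library, so that part works without Kokoro installed.

```python
import struct
import wave

# Assumption: Kokoro produces 24 kHz mono audio; verify against the repo.
SAMPLE_RATE = 24000


def write_chunks_to_wav(chunks, path, sample_rate=SAMPLE_RATE):
    """Write an iterable of float sample sequences (-1.0..1.0) to a mono 16-bit WAV.

    Hypothetical helper, not part of Kokoro. Returns the number of samples written.
    """
    total = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(2)          # 16-bit PCM
        wav.setframerate(sample_rate)
        for chunk in chunks:
            # Clamp to the valid range, then scale floats to signed 16-bit integers.
            samples = [max(-1.0, min(1.0, float(s))) for s in chunk]
            pcm = struct.pack("<%dh" % len(samples),
                              *(int(s * 32767) for s in samples))
            wav.writeframes(pcm)
            total += len(samples)
    return total


def synthesize(text, path, lang_code="a", voice="af_heart"):
    """Speak `text` into a WAV file at `path` using Kokoro.

    Assumptions: `KPipeline`, the `lang_code` values, the voice name, and the
    (graphemes, phonemes, audio) result shape follow the README as recalled.
    """
    from kokoro import KPipeline  # imported lazily so the helper above works alone

    pipeline = KPipeline(lang_code=lang_code)
    # Each result is assumed to unpack into text, phonemes, and an audio array
    # (a tensor or array exposing .tolist()).
    chunks = (audio.tolist() for _, _, audio in pipeline(text, voice=voice))
    return write_chunks_to_wav(chunks, path)
```

With the library installed, a call such as `synthesize("Hello from Kokoro.", "hello.wav")` would be the whole workflow. Writing chunks as they arrive, rather than concatenating them first, keeps memory use flat for long scripts such as audiobook chapters.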
Verify against the repo before relying on details.