Convert text to a WAV file from a C++ app using TtsRestClient
Transcribe a recorded sound file with AsrRestClient and get a string back
Stream generated voice over a WebSocket as the text arrives
Run live ASR with voice activity detection inside a desktop application
Requires a C++17 compiler, CMake 3.16 or newer, and a Gradium API key (the examples read it from the GRADIUM_API_KEY environment variable). Linux and macOS builds fetch libwebsockets at configure time.
GradiumPP is a C++ library that lets a program talk to Gradium, an online service for turning text into spoken audio (text-to-speech, or TTS) and turning recorded speech into written text (automatic speech recognition, or ASR). The library does the network plumbing for you, so a C++ application can send a sentence and get back an audio file, or send a sound file and get back a transcript, without manually building the HTTP requests.
The code targets C++17 and builds with CMake 3.16 or newer. It works on Windows, using the built-in WinHTTP networking, and on Linux or macOS through a library called libwebsockets. The README shows that you build it with two cmake commands, and dependencies are fetched automatically over the internet during the build. You can turn off the example programs with a build flag if you only want the library itself.
There are two kinds of clients in the library. REST clients send a single request and wait for the full reply, which is useful for converting a whole file or a chunk of text. WebSocket clients keep an open connection for streaming, which is what you want for live captioning or for speaking generated text as it arrives.
The README's quick-start examples show creating a TtsRestClient with an API key, choosing a voice and an output format such as WAV, and writing the returned bytes to a file, plus an AsrRestClient that transcribes a recording into a string. Voice identifiers, audio formats, and model names are all exposed as compile-time constants inside named namespaces, for example gradium::voices::en::american::emma or gradium::tts::output_formats::pcm_16000. The API key is read from the GRADIUM_API_KEY environment variable in the examples.
The transport layer is replaceable: you can pass your own mock HTTP or WebSocket implementation into a client, which is useful for testing or for unusual networking setups.
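
The replaceable transport described above is a form of dependency injection, and the general pattern can be sketched in plain C++17. This is an illustration only, not GradiumPP's actual API: HttpTransport, MockTransport, TtsClient, synthesize, api_key_from_env, and the example.invalid URL are all hypothetical names chosen for this sketch. Check the repository for the real interfaces and signatures. The only detail taken from the summary is the GRADIUM_API_KEY variable name.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout: none of these are GradiumPP's real types.
namespace sketch {

// Minimal transport abstraction: one POST that returns the raw response body.
struct HttpTransport {
    virtual ~HttpTransport() = default;
    virtual std::string post(const std::string& url,
                             const std::string& api_key,
                             const std::string& body) = 0;
};

// Test double: records each request and returns a canned reply, no network.
class MockTransport : public HttpTransport {
public:
    explicit MockTransport(std::string canned) : canned_(std::move(canned)) {}

    std::string post(const std::string& url,
                     const std::string& /*api_key*/,
                     const std::string& body) override {
        urls.push_back(url);
        bodies.push_back(body);
        return canned_;
    }

    std::vector<std::string> urls;    // every URL the client asked for
    std::vector<std::string> bodies;  // every request body the client sent

private:
    std::string canned_;
};

// Client that depends only on the abstraction, so the transport can be swapped.
class TtsClient {
public:
    TtsClient(std::string api_key, std::shared_ptr<HttpTransport> transport)
        : api_key_(std::move(api_key)), transport_(std::move(transport)) {}

    // Sends the text through whichever transport was injected.
    std::string synthesize(const std::string& text) {
        return transport_->post("https://example.invalid/tts", api_key_, text);
    }

private:
    std::string api_key_;
    std::shared_ptr<HttpTransport> transport_;
};

// Reads the key from the environment; returns an empty string when unset.
inline std::string api_key_from_env() {
    const char* value = std::getenv("GRADIUM_API_KEY");
    return value ? std::string(value) : std::string();
}

// Usage: exercise the client against the mock and check what was sent.
inline void self_check() {
    auto mock = std::make_shared<MockTransport>("fake-audio-bytes");
    TtsClient client("test-key", mock);
    assert(client.synthesize("hello") == "fake-audio-bytes");
    assert(mock->bodies.size() == 1 && mock->bodies[0] == "hello");
}

}  // namespace sketch
```

Because the client holds only a pointer to the abstract transport, a unit test can assert on the exact requests it would have sent without an API key or a network connection. A production build would inject a platform-specific implementation instead.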
Six example programs ship with the library, covering REST TTS, streaming TTS, multiplexed streams, REST ASR, live ASR with voice activity detection, and managing voices and credits. The license is Apache 2.0.
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.