fatehmtd/gradiumpp

★ 3 · C++ · Audience · developer · Complexity · 3/5 · Active · License · Setup · moderate

Mindmap

mindmap
  root((GradiumPP))
    Inputs
      Text
      Audio file
      API key
    Outputs
      WAV audio
      Transcripts
      Streamed chunks
    Use Cases
      Add TTS to a desktop app
      Live captioning
      Streaming generated voice
    Tech Stack
      Cpp17
      CMake
      WinHTTP
      libwebsockets

Things people build with this

USE CASE 1

Convert text to a WAV file from a C++ app using TtsRestClient

USE CASE 2

Transcribe a recorded sound file with AsrRestClient and get a string back

USE CASE 3

Stream generated voice over a WebSocket as the text arrives

USE CASE 4

Run live ASR with voice activity detection inside a desktop application
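
For the first use case, the last step is saving the returned audio bytes to disk. The sketch below shows only that step in standard C++; the function name `write_bytes` is hypothetical and not part of GradiumPP, and the call that actually produces the bytes is left out because the exact `TtsRestClient` signature is not shown on this page.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not a GradiumPP function): writes a byte buffer to a
// file in binary mode, replacing any existing file. Returns true on success.
bool write_bytes(const std::string& path,
                 const std::vector<std::uint8_t>& bytes) {
    // Binary mode matters on Windows, where text mode would rewrite 0x0A bytes.
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;  // could not open the file for writing
    }
    out.write(reinterpret_cast<const char*>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
    out.flush();
    return static_cast<bool>(out);  // false if the write or flush failed
}
```

Assuming the service returns a complete WAV payload, the bytes can be written unchanged; a raw PCM format would need a header added separately before most players could open it.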

Tech stack

C++17 · CMake · WinHTTP · libwebsockets

Getting it running

Difficulty · moderate · Time to first run · 30 min

Needs CMake 3.16 or newer and a Gradium API key in the GRADIUM_API_KEY environment variable. Linux and macOS builds fetch libwebsockets at configure time.
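
A quick way to check the key before constructing any client is to read the variable with the standard `std::getenv`. This is a minimal sketch; the helper names `read_env` and `read_gradium_api_key` are hypothetical and not part of GradiumPP, and only the variable name `GRADIUM_API_KEY` comes from this page.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: returns the value of an environment variable, or
// std::nullopt when the variable is unset or empty.
std::optional<std::string> read_env(const char* name) {
    const char* value = std::getenv(name);
    if (value == nullptr || *value == '\0') {
        return std::nullopt;
    }
    return std::string(value);
}

// Hypothetical wrapper for the variable name the examples use.
std::optional<std::string> read_gradium_api_key() {
    return read_env("GRADIUM_API_KEY");
}
```

Returning `std::optional` lets the caller fail early with a clear message when the key is missing, rather than sending an empty key to the service.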

Apache 2.0, free to use, modify, and ship commercially with attribution and a notice file.

In plain English

GradiumPP is a C++ library that lets a program talk to Gradium, an online service for turning text into spoken audio (text-to-speech, or TTS) and turning recorded speech into written text (automatic speech recognition, or ASR). The library does the network plumbing for you, so a C++ application can send a sentence and get back an audio file, or send a sound file and get back a transcript, without manually building the HTTP requests.

The code targets C++17 and builds with CMake 3.16 or newer. It works on Windows, using the built-in WinHTTP networking, and on Linux or macOS through a library called libwebsockets. The README shows that you build it with two cmake commands, and dependencies are fetched automatically over the internet during the build. You can turn off the example programs with a build flag if you only want the library itself.

There are two kinds of clients in the library. REST clients send a single request and wait for the full reply, which is useful for converting a whole file or a chunk of text. WebSocket clients keep an open connection for streaming, which is what you want for live captioning or for speaking generated text as it arrives.

The README's quick-start examples show creating a TtsRestClient with an API key, choosing a voice and an output format such as WAV, and writing the returned bytes to a file, plus an AsrRestClient that transcribes a recording into a string. Voice identifiers, audio formats, and model names are all exposed as compile-time constants inside named namespaces, for example gradium::voices::en::american::emma or gradium::tts::output_formats::pcm_16000. The API key is read from the GRADIUM_API_KEY environment variable in the examples.

The transport layer is replaceable: you can pass your own mock HTTP or WebSocket implementation into a client, which is useful for testing or for unusual networking setups.

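
The replaceable-transport idea can be illustrated with plain dependency injection. The sketch below is generic C++ and does not reproduce GradiumPP's real interfaces: `HttpTransport`, `MockTransport`, `SketchAsrClient`, and the `example.invalid` URL are all hypothetical names chosen for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical transport interface; the library's actual one may differ.
struct HttpTransport {
    virtual ~HttpTransport() = default;
    virtual std::string post(const std::string& url,
                             const std::string& body) = 0;
};

// Test double: records each request and returns a canned reply, so no
// network access is needed.
class MockTransport : public HttpTransport {
public:
    explicit MockTransport(std::string reply) : reply_(std::move(reply)) {}

    std::string post(const std::string& url,
                     const std::string& body) override {
        requests.push_back(url + " " + body);
        return reply_;
    }

    std::vector<std::string> requests;  // what the client tried to send

private:
    std::string reply_;
};

// Hypothetical client that receives its transport from the caller instead
// of creating one itself.
class SketchAsrClient {
public:
    explicit SketchAsrClient(std::shared_ptr<HttpTransport> transport)
        : transport_(std::move(transport)) {}

    // Sends the audio through the injected transport and returns the reply.
    std::string transcribe(const std::string& audio) {
        return transport_->post("https://example.invalid/asr", audio);
    }

private:
    std::shared_ptr<HttpTransport> transport_;
};
```

In a unit test, constructing the client with a `MockTransport` lets you assert on both the returned transcript and the recorded request without touching the network; check the repository for the real interface names before adapting this.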
Six example programs ship with the library, covering REST TTS, streaming TTS, multiplexed streams, REST ASR, live ASR with voice activity detection, and managing voices and credits. The license is Apache 2.0.

Copy-paste prompts

Prompt 1

Build GradiumPP on macOS with cmake and run the REST TTS example

Prompt 2

Send a sentence to Gradium with TtsRestClient using the emma voice and write a WAV

Prompt 3

Stream live captions with the WebSocket ASR client and VAD enabled

Prompt 4

Swap the network transport for a mock HTTP impl in a unit test

Prompt 5

Switch the output format constant from WAV to PCM 16000 for streaming TTS


Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.