AudioBook KJ is an experimental studio for turning long-form text into a narrated audiobook or video project. The README is direct that this is a public source snapshot rather than a finished product, and that anyone running it should expect to adjust the code locally. Generated media, local databases, virtual environments, node modules, private voice references, and manuscript content are intentionally excluded from the repo.
The README lays out seven rough workflows. Script import and cleanup pulls text in, cleans up Markdown, and splits long content into chunks, with optional rewriting through Gemini CLI. AI direction and metadata extracts characters, scenes, and storyboard hints. Text-to-speech turns lines into audio clips through the Python backend and local model tooling. An audio timeline mixes narration, music, and sound effects using pydub and FFmpeg. A visual asset workflow connects generated images or video to timeline clips. A Chrome extension called FlowKit acts as a bridge between Google Flow in the browser and the local backend. The last stage exports the assembled audio and video to a final file.
The frontend is React with Vite and Tailwind CSS, using TanStack Query, React Flow, Axios, and Lucide icons. The backend is FastAPI on Uvicorn, with PyTorch, Torchaudio, and Hugging Face Transformers handling the AI and audio side. The project requires Node 20.19+ or 22.12+, Python 3.10 or 3.11, and FFmpeg for export. A CUDA-capable GPU is recommended for local TTS, since the project uses Torch and OmniVoice.
Gemini CLI is optional. Several helper endpoints call the gemini command directly for script cleanup, prompt enhancement, entity extraction, and storyboard generation. The README warns users to install the official @google/gemini-cli npm package rather than look-alikes, and notes that some calls pass --skip-trust, which the user should review before letting Gemini modify files. If Gemini CLI is missing, the main frontend still loads, but those endpoints fail.
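Because a missing tool only surfaces when an endpoint or export fails, a small preflight check can be handy before starting the backend. The following is a minimal sketch, not code from the repo: the helper names (python_supported, check_tools, summarize) are hypothetical, and it only tests whether the node, ffmpeg, and gemini commands resolve on PATH, not whether their versions meet the stated requirements.

```python
import shutil
import sys

# Hypothetical preflight helper; none of these names come from the repo.
REQUIRED_TOOLS = ("node", "ffmpeg")  # listed as requirements in the README
OPTIONAL_TOOLS = ("gemini",)         # Gemini CLI is optional


def python_supported(version=None):
    """Return True for Python 3.10 or 3.11, the versions the README lists."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in {(3, 10), (3, 11)}


def check_tools(which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in REQUIRED_TOOLS + OPTIONAL_TOOLS}


def summarize(tools):
    """Turn the check_tools() result into a list of warning strings."""
    warnings = []
    for name in REQUIRED_TOOLS:
        if tools.get(name) is None:
            warnings.append(f"missing required tool: {name}")
    for name in OPTIONAL_TOOLS:
        if tools.get(name) is None:
            warnings.append(
                f"optional tool not found: {name} "
                "(endpoints that call it will fail)"
            )
    return warnings


if __name__ == "__main__":
    if not python_supported():
        print("warning: Python 3.10 or 3.11 is expected")
    for line in summarize(check_tools()):
        print("warning:", line)
```

Passing the lookup function as a parameter keeps the check testable without touching the real PATH. A version check for Node could be added by parsing the output of node --version, which is left out here.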
The FlowKit Chrome extension lives at audiobook_builder/flowkit_extension and is loaded as an unpacked extension through Developer mode. It expects the local backend to be running, and the README is explicit that it requests broad browser permissions because it bridges local tooling with Google Flow URLs. The README advises reviewing manifest.json, background.js, and side_panel.js before pairing it with a personal Google account.
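As a starting point for that review, the permissions an extension requests can be listed straight from its manifest. This is a rough sketch under assumptions: it reads the standard Chrome manifest keys (permissions, host_permissions, and content_scripts[].matches), the function names and the set of "broad" patterns are illustrative choices, and the actual contents of FlowKit's manifest are not known here. It does not replace reading background.js and side_panel.js.

```python
import json
from pathlib import Path

# Illustrative set of match patterns that cover every site; an assumption,
# not an exhaustive list.
BROAD_PATTERNS = {"<all_urls>", "*://*/*", "http://*/*", "https://*/*"}


def load_manifest(path):
    """Read and parse an extension's manifest.json."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_manifest(manifest):
    """Collect API permissions and host patterns from a manifest dict."""
    api_permissions = list(manifest.get("permissions", []))
    host_patterns = list(manifest.get("host_permissions", []))
    # Content scripts declare their own match patterns.
    for script in manifest.get("content_scripts", []):
        host_patterns.extend(script.get("matches", []))
    broad = sorted({p for p in host_patterns if p in BROAD_PATTERNS})
    return {
        "api_permissions": api_permissions,
        "host_patterns": host_patterns,
        "broad_host_patterns": broad,
    }


if __name__ == "__main__":
    manifest_path = Path("audiobook_builder/flowkit_extension/manifest.json")
    if manifest_path.exists():
        report = summarize_manifest(load_manifest(manifest_path))
        for key, values in report.items():
            print(f"{key}: {values}")
    else:
        print(f"manifest not found at {manifest_path}")
```

Run from the repository root, this prints each requested permission so that broad entries stand out before the extension is loaded in Developer mode.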
Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.