explaingit

78/xiaozhi-esp32

📈 Trending · 26,552 · C++ · Audience · vibe coder · Complexity · 3/5 · Active · License · Setup · hard

TLDR

A DIY voice AI chatbot that runs on cheap ESP32 microcontrollers, connects to cloud AI models like Qwen over Wi-Fi, and can control smart home devices through voice commands.

Mindmap

mindmap
  root((repo))
    What it does
      Voice chatbot on ESP32
      Connects to cloud AI
      Controls smart devices
      Offline wake detection
    Hardware
      70+ dev boards
      M5Stack CoreS3
      ESP32-S3-BOX3
      OLED/LCD display
    Features
      MCP integration
      Speaker recognition
      Emoji reactions
      Wi-Fi connectivity
    Getting started
      Free xiaozhi.me account
      Qwen AI model
      Breadboard assembly
      Afternoon build time
    Tech stack
      C++ firmware
      ESP32 SDK
      Wi-Fi protocols
    Use cases
      Smart home control
      Desktop interaction
      Information search
      Hardware automation

Things people build with this

USE CASE 1

Build a voice-controlled smart home hub that responds to commands and controls lights, locks, and appliances.

USE CASE 2

Create a portable AI assistant in your pocket that works offline for wake-word detection and online for conversations.

USE CASE 3

Assemble a desktop companion that can search the web, read emails, or control your computer through voice.

USE CASE 4

Prototype a custom voice interface for robotics or IoT projects using standard development boards.

Tech stack

C++ · ESP32 · Wi-Fi · Qwen · DeepSeek · MCP

Getting it running

Difficulty · hard · Time to first run · 1 day+

Requires ESP32 hardware, Wi-Fi configuration, cloud API credentials (Qwen/DeepSeek), and embedded C++ toolchain setup.

Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

XiaoZhi is a DIY AI chatbot that runs on ESP32 microcontrollers: tiny, inexpensive chips (often costing just a few dollars) used for building physical hardware projects. The result is a voice-enabled AI assistant you can build yourself, using either a breadboard and components or one of 70+ supported off-the-shelf development boards.

The chatbot connects to large AI models like Qwen or DeepSeek over Wi-Fi, enabling natural voice conversations on a pocket-sized physical device. It supports offline wake-word detection (so it listens for its name without sending audio to the cloud), has an optional OLED/LCD display that shows emoji reactions, and can recognize individual speakers' voices.

What makes it especially interesting is its integration with MCP (Model Context Protocol), a standardized way for AI models to control external devices and services. This means the chatbot can control smart home devices, interact with your computer desktop, search for information, or manage hardware peripherals like LEDs and motors, all through voice commands.

For a hardware-curious vibe coder or maker, this is the kind of project you can assemble in an afternoon with a development board like the M5Stack CoreS3 or ESP32-S3-BOX3. Personal users can register a free account on xiaozhi.me and use it with the Qwen real-time AI model at no cost.

The embedded firmware is written in C++, and documentation is available in English, Chinese, and Japanese. It's an active, well-supported project that bridges physical maker culture with modern conversational AI.
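To make the MCP idea above more concrete, here is a minimal sketch of how a device might expose a named "tool" that an AI model can call to switch an LED. It is an illustration under stated assumptions, not the project's actual code: `ToolRegistry`, `ToolArgs`, `Led`, `make_demo_registry`, and the tool name `led.set` are all hypothetical, the LED is simulated in memory, and arguments are passed as a plain string map rather than parsed from a real MCP message.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical illustration only: none of these names are taken from the
// xiaozhi-esp32 source.
using ToolArgs = std::map<std::string, std::string>;
using ToolHandler = std::function<std::string(const ToolArgs&)>;

class ToolRegistry {
public:
    // Register a named tool that the AI model is allowed to invoke.
    void add(const std::string& name, ToolHandler handler) {
        assert(handler && "a tool needs a callable handler");
        tools_[name] = std::move(handler);
    }

    // Dispatch a call by tool name; unknown names yield an error string.
    std::string call(const std::string& name, const ToolArgs& args) const {
        auto it = tools_.find(name);
        if (it == tools_.end()) {
            return "error: unknown tool '" + name + "'";
        }
        return it->second(args);
    }

private:
    std::map<std::string, ToolHandler> tools_;
};

// Simulated peripheral state standing in for a real GPIO-driven LED.
struct Led {
    bool on = false;
};

// Build a registry exposing one tool, "led.set", bound to the given LED.
// The LED must outlive the returned registry (it is captured by reference).
inline ToolRegistry make_demo_registry(Led& led) {
    ToolRegistry registry;
    registry.add("led.set", [&led](const ToolArgs& args) -> std::string {
        auto it = args.find("state");
        if (it == args.end()) return "error: missing 'state'";
        if (it->second == "on") led.on = true;
        else if (it->second == "off") led.on = false;
        else return "error: state must be 'on' or 'off'";
        return std::string("ok: led is ") + (led.on ? "on" : "off");
    });
    return registry;
}
```

In the MCP specification, a tool invocation arrives as a JSON-RPC `tools/call` request carrying a tool name and an arguments object; the sketch skips the transport and JSON parsing and keeps only the name-to-handler dispatch. Verify against the repo how the firmware actually registers and serves its tools.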

Copy-paste prompts

Prompt 1
How do I set up XiaoZhi on an ESP32-S3-BOX3 board and connect it to the Qwen AI model?
Prompt 2
Show me how to add MCP integration to make my XiaoZhi chatbot control smart home devices via voice commands.
Prompt 3
What are the steps to enable offline wake-word detection on XiaoZhi so it listens for its name without cloud calls?
Prompt 4
How can I customize the OLED display on XiaoZhi to show different emoji reactions based on the AI's responses?
Prompt 5
Walk me through building a XiaoZhi chatbot from scratch using a breadboard and ESP32 microcontroller.

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.