explaingit

daybreak-u/chineseocr_lite

12,285 · C++ · Audience: developer · Complexity: 3/5 · Setup: moderate

TLDR

A lightweight OCR tool that reads Chinese (and other) text from images using three tiny AI models totalling under 5 MB, fast enough to run on phones and low-powered devices without a GPU.

Mindmap

mindmap
  root((chineseocr lite))
    What it does
      Chinese text from images
      Lightweight 4.7 MB models
      No GPU required
    Three AI models
      Text region detection
      Orientation detection
      Character recognition
    Interfaces
      Web browser upload
      Command line JSON
      Python library
    Platform support
      Android JVM bindings
      C++ native
      .NET bindings

Things people build with this

USE CASE 1

Extract Chinese text from scanned documents or photos through a browser-based upload interface.

USE CASE 2

Batch-process a folder of images from the command line and save recognized text as JSON files.

USE CASE 3

Integrate on-device Chinese text recognition into an Android app using the provided Java and Kotlin bindings.

USE CASE 4

Add OCR to a Python script that handles both horizontal and vertical Chinese text from images.
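As a rough illustration of the batch use case, the sketch below walks a folder and writes one JSON file of results. It does not call chineseocr_lite directly: `recognize` is a hypothetical callable that you would wrap around whichever interface you use, and `batch_ocr`, `IMAGE_SUFFIXES`, and the output layout are assumptions made for this example, not part of the project.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List

# Assumed set of image extensions; adjust to whatever your inputs use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def batch_ocr(
    folder: str,
    recognize: Callable[[Path], List[str]],
    out_file: str,
) -> Dict[str, List[str]]:
    """Run `recognize` on every image in `folder` and save the results as JSON.

    `recognize` is a placeholder for the real OCR call: it takes an image
    path and returns the recognized text lines.
    """
    results: Dict[str, List[str]] = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = recognize(path)

    # ensure_ascii=False keeps Chinese characters readable in the output file.
    Path(out_file).write_text(
        json.dumps(results, ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return results
```

Passing the OCR step in as a function keeps the file handling independent of the backend, so the same loop would work with the Python library or with a wrapper around the command-line tool.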

Tech stack

Python · C++ · ONNX · NCNN · MNN · Java · Kotlin

Getting it running

Difficulty · moderate · Time to first run · 30 min

Python setup is a single pip install; the C++, Android, and .NET integrations require additional build steps.

In plain English

ChineseOCR Lite is a lightweight tool for reading text out of images, with a focus on Chinese characters. OCR stands for optical character recognition, which means converting a photo or scan of printed text into actual text a computer can work with. This project is built to be small and fast: the three AI models it uses add up to only about 4.7 megabytes total, so it can run on phones and low-powered devices without needing a graphics card.

The system uses three models working together. The first detects where text appears in the image. The second figures out the orientation of each detected text block. The third reads the actual characters. It can handle text written vertically as well as horizontally, which is important for Chinese content where vertical layouts are common.

The project supports several ways to run it. There is a web interface where you can upload an image and see the recognized text in a browser. There is a command-line tool that takes an image file and outputs the recognized text as JSON, which is useful for batch processing or connecting it to other programs. It also provides example code for C++, Java and Kotlin via JVM bindings, Android apps, and .NET, so developers working in different programming languages can integrate it into their own software.

The underlying models are provided in the ONNX format, a standard format for AI models, which means they can run on CPU without any special setup. For developers who need to run the models on a phone or in very constrained environments, there are also versions formatted for NCNN and MNN, which are frameworks commonly used for on-device AI in mobile apps.

Setup for the Python version is a single pip install command, and a simple web server starts with one Python command. The README is written in Chinese and the project is aimed primarily at developers working with Chinese-language text recognition tasks.
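The three-model flow described above (detect regions, classify orientation, read characters) can be sketched as a small pipeline. This is only an illustration of the control flow: `detect`, `classify_angle`, and `recognize` are hypothetical stand-ins for the real models, and `TextLine`, `run_pipeline`, and the `min_score` threshold are names invented for this example rather than the project's API.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: x, y, width, height


@dataclass
class TextLine:
    box: Box      # where the text was found
    angle: int    # orientation reported by the second stage, in degrees
    text: str     # characters read by the third stage
    score: float  # recognition confidence in [0, 1]


def run_pipeline(
    image: Any,
    detect: Callable[[Any], List[Box]],
    classify_angle: Callable[[Any, Box], int],
    recognize: Callable[[Any, Box, int], Tuple[str, float]],
    min_score: float = 0.5,
) -> List[TextLine]:
    """Chain the three stages and drop low-confidence lines."""
    lines: List[TextLine] = []
    for box in detect(image):                        # stage 1: find text regions
        angle = classify_angle(image, box)           # stage 2: orientation of the region
        text, score = recognize(image, box, angle)   # stage 3: read the characters
        if score >= min_score:
            lines.append(TextLine(box, angle, text, score))
    return lines
```

Keeping each stage as a separate callable mirrors the separation between the three models, and the `min_score` filter shows one place where results below a confidence threshold could be discarded.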

Copy-paste prompts

Prompt 1
Using chineseocr_lite's Python API, write a script that reads every image in a folder and saves the recognized Chinese text from each one into a single JSON file.
Prompt 2
I want to add chineseocr_lite to an Android app to scan Chinese receipts. Show me how to use the JVM bindings to pass a Bitmap and get back the recognized text.
Prompt 3
Set up the chineseocr_lite web server locally and show me how to make a POST request with a base64-encoded image to get OCR results as JSON.
Prompt 4
I'm building a Python pipeline to process scanned Chinese documents. Show me how to use chineseocr_lite to detect text regions and filter out results below a confidence threshold.

Verify against the repo before relying on details.