oaid/tengine

★ 4,522 · C++ · Audience · developer · Complexity · 4/5 · License · Apache 2.0 · Setup · hard

Mindmap

mindmap
  root((repo))
    What it does
      Runs AI models
      Edge deployment
      Format conversion
    Tech stack
      C++ core
      ARM optimized
      Multi-format support
    Use cases
      IoT devices
      Mobile inference
      Embedded AI
    Audience
      Embedded devs
      AI engineers
      IoT builders
    Performance
      Low latency
      8-bit mode
      Multi-core ARM


Things people build with this

USE CASE 1

Deploy a trained image recognition model onto a Raspberry Pi or ARM-based IoT device without rewriting it for the new hardware.

USE CASE 2

Convert and run models from TensorFlow, ONNX, or Caffe on resource-constrained embedded systems with a single engine.

USE CASE 3

Speed up on-device AI inference with 8-bit integer quantization, which cuts MobileNet v1 latency on a single ARM A72 core from about 109 ms to about 64 ms (roughly 40% lower).

USE CASE 4

Build Android or Linux edge applications that run neural networks locally without needing a cloud connection.
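
Use case 3 above depends on 8-bit quantization, so a small illustration of the idea may help. The sketch below shows symmetric per-tensor INT8 quantization, one common scheme, chosen here only because it is easy to follow. It is not taken from Tengine's source, and the names `QuantizedTensor`, `quantize_int8`, and `dequantize` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Illustrative only: symmetric per-tensor INT8 quantization.
// Assumption: this is a generic scheme, not Tengine's actual implementation.
struct QuantizedTensor {
    std::vector<std::int8_t> values;  // quantized data, 1 byte per element
    float scale;                      // real value is approximately scale * quantized value
};

QuantizedTensor quantize_int8(const std::vector<float>& input) {
    // Find the largest magnitude so that it maps to the integer 127.
    float max_abs = 0.0f;
    for (float v : input) max_abs = std::max(max_abs, std::fabs(v));

    QuantizedTensor out;
    // Fall back to a scale of 1 for an all-zero tensor to avoid dividing by zero.
    out.scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
    assert(out.scale > 0.0f);

    out.values.reserve(input.size());
    for (float v : input) {
        // Round to the nearest integer step, then clamp into [-127, 127].
        long q = std::lround(v / out.scale);
        q = std::max(-127L, std::min(127L, q));
        out.values.push_back(static_cast<std::int8_t>(q));
    }
    return out;
}

std::vector<float> dequantize(const QuantizedTensor& t) {
    // Map each 8-bit integer back to an approximate float value.
    std::vector<float> out;
    out.reserve(t.values.size());
    for (std::int8_t q : t.values) out.push_back(t.scale * static_cast<float>(q));
    return out;
}
```

Each quantized element occupies one byte instead of the four bytes of a 32-bit float, and the round trip through `dequantize` recovers each value to within half a quantization step. Smaller data and integer arithmetic are the usual reasons INT8 inference runs faster on ARM cores.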

Tech stack

C++ · ARM · ONNX · Caffe · TensorFlow · MXNet

Getting it running

Difficulty · hard · Time to first run · 1 day+

Requires an ARM-based embedded device or a cross-compilation toolchain. Realistic benchmarking needs target hardware such as the RK3399. Community support is available through GitHub issues, a QQ group, and email.

Apache 2.0: free to use, modify, and distribute in personal or commercial projects, as long as the license notice is kept.

In plain English

Tengine is a lightweight inference engine developed by OPEN AI LAB, designed to run trained AI models on embedded and edge devices in IoT scenarios. The typical use case is taking a neural network model that was trained on a powerful server and deploying it on low-power hardware such as ARM-based processors, where resources are constrained.

The core problem it addresses is fragmentation: AI models are produced with various frameworks and formats such as Caffe, ONNX, TensorFlow, and MXNet, and they need to run on a wide variety of hardware. Tengine can load models from all of those formats and execute them across different chip architectures, handling the translation and optimization work so developers do not have to do it separately for each target device.

Internally the engine is split into five modules: a core module providing basic system components, an operator module that defines the mathematical building blocks of neural networks (convolution, pooling, activation functions, and more), a serializer module for loading saved models, an executor module that runs the computation with optimizations for multi-core ARM processors, and a driver module that interfaces with specific hardware.

The benchmark table in the README shows inference times for two popular lightweight models on an RK3399 processor with a single A72 core. Running MobileNet v1 in 32-bit floating point takes about 109 ms, dropping to around 64 ms in 8-bit integer mode. These numbers give a sense of the kind of hardware and latency the project targets.

The project is licensed under Apache 2.0 and offers support through GitHub issues, a QQ group, and email. A companion repository shares Android and Linux application examples built on top of Tengine.
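
The MobileNet v1 figures quoted above can be turned into a speedup factor and a relative latency reduction with a few lines of arithmetic. The sketch below uses only the two numbers from the text. The helper names `speedup` and `latency_reduction` are hypothetical and not part of Tengine.

```cpp
#include <cassert>

// Latencies quoted in the text (RK3399, single A72 core, MobileNet v1).
constexpr double kFp32LatencyMs = 109.0;  // 32-bit floating point
constexpr double kInt8LatencyMs = 64.0;   // 8-bit integer mode

// Speedup factor: how many times faster the improved run is than the baseline.
double speedup(double baseline_ms, double improved_ms) {
    assert(improved_ms > 0.0);
    return baseline_ms / improved_ms;
}

// Fraction of the baseline latency that was removed, in the range 0.0 to 1.0.
double latency_reduction(double baseline_ms, double improved_ms) {
    assert(baseline_ms > 0.0);
    return (baseline_ms - improved_ms) / baseline_ms;
}
```

With these inputs, `speedup(kFp32LatencyMs, kInt8LatencyMs)` comes out at about 1.7 and `latency_reduction(kFp32LatencyMs, kInt8LatencyMs)` at about 0.41, so the 8-bit mode removes roughly two fifths of the floating-point latency for this model.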

Copy-paste prompts

Prompt 1

I have a MobileNet v1 model trained in TensorFlow. Show me how to load and run it using the Tengine C++ API on an ARM Linux device.

Prompt 2

How do I convert an ONNX model to run with Tengine on an RK3399 processor? Walk me through the serializer and executor setup.

Prompt 3

Write a C++ code snippet that loads a Caffe model with Tengine, runs inference on a single image, and prints the top result.

Prompt 4

Explain how to enable 8-bit integer (INT8) quantization mode in Tengine to reduce inference time on a single-core ARM device.

Prompt 5

What are the five internal modules of Tengine and how do they work together to run a neural network model on embedded hardware?


Verify against the repo before relying on details.