iree-org/iree

★ 3,759 C++ Audience · researcher Complexity · 5/5 License Setup · hard

Mindmap

mindmap
  root((iree))
    What it does
      Compile AI models
      Run on any hardware
      Optimize for target device
    Supported Backends
      NVIDIA via CUDA
      AMD via ROCm
      Vulkan and Metal
      Standard CPUs
    Input Formats
      PyTorch models
      JAX models
      ONNX models
    Setup
      pip install iree-base-compiler
      pip install iree-base-runtime
      Apache 2.0 license

Things people build with this

USE CASE 1

Compile a PyTorch or JAX model with iree-base-compiler and deploy the result to a mobile device or embedded system.

USE CASE 2

Benchmark AI inference performance across different hardware backends (GPU, CPU, Vulkan) using the same compiled model.

USE CASE 3

Integrate IREE into a production pipeline to run the same trained model on data center GPUs and edge devices without rewriting inference code.
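Use cases 2 and 3 both rest on compiling one model once per backend and then running each artifact on its matching device. The sketch below only builds the command lines for that workflow and executes nothing. The backend and device names in `TARGETS`, the `--iree-hal-target-backends` flag, and the output naming scheme are assumptions based on commonly documented IREE usage. Flag names have changed between releases, so check `iree-compile --help` and `iree-benchmark-module --help` for the installed version.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Target:
    compiler_backend: str  # assumed value for iree-compile's --iree-hal-target-backends
    runtime_device: str    # assumed value for iree-benchmark-module's --device


# Assumed pairings of compiler backend and runtime device; verify per release.
TARGETS = {
    "cpu": Target("llvm-cpu", "local-task"),
    "vulkan": Target("vulkan-spirv", "vulkan"),
    "cuda": Target("cuda", "cuda"),
}


def compile_command(target_name, mlir_path):
    """Return the argv that would compile `mlir_path` for one target."""
    target = TARGETS[target_name]
    # Hypothetical naming scheme: model.mlir -> model_cpu.vmfb
    output = f"{Path(mlir_path).stem}_{target_name}.vmfb"
    return [
        "iree-compile",
        f"--iree-hal-target-backends={target.compiler_backend}",
        mlir_path,
        "-o",
        output,
    ]


def benchmark_command(target_name, vmfb_path, function):
    """Return the argv that would benchmark one exported function."""
    target = TARGETS[target_name]
    return [
        "iree-benchmark-module",
        f"--device={target.runtime_device}",
        f"--module={vmfb_path}",
        f"--function={function}",
    ]


if __name__ == "__main__":
    # Print the commands for every target instead of running them.
    for name in TARGETS:
        print(" ".join(compile_command(name, "model.mlir")))
        print(" ".join(benchmark_command(name, f"model_{name}.vmfb", "main")))
```

Functions that take arguments also need `--input=` values on the benchmark command line, which this sketch leaves out. The file name `model.mlir` and the function name `main` are placeholders.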

Tech stack

C++, Python, MLIR, LLVM, CUDA, Vulkan

Getting it running

Difficulty · hard Time to first run · 1h+

Python packages are on PyPI for a quick start, but hardware-specific targets (CUDA, ROCm) require matching drivers and toolchains on the host machine.

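Apache 2.0 with LLVM Exceptions. Use it freely for any purpose, including commercial, as long as you keep the copyright and license notices. The LLVM Exceptions waive those notice requirements for compiled output that embeds parts of the project, and they ease combining it with GPLv2-licensed code.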

In plain English

IREE (Intermediate Representation Execution Environment, pronounced "eerie") is a compiler and runtime toolkit for machine learning models. It takes a trained model, authored in a framework such as PyTorch or JAX or exported to ONNX, and compiles it into an efficient form for a specific piece of hardware. The same model can be compiled for a data center GPU, a laptop, a phone, or an embedded device, which is what the project means when it calls itself "retargetable."

Under the hood, IREE is built on MLIR, a compiler infrastructure developed as part of the LLVM project that makes it easier to build compilers for multiple hardware targets. IREE lowers the model through a series of intermediate representations and produces code tuned for the target device. Supported hardware backends include NVIDIA GPUs via CUDA, AMD GPUs via ROCm, cross-platform GPU access via Vulkan and Metal, and standard CPUs.

The project is used in real deployments. In April 2025, AMD submitted an IREE-based image generation implementation to the MLPerf benchmark suite, a standard industry benchmark for AI inference performance. IREE is also part of the LF AI & Data Foundation, which operates under the Linux Foundation.

For developers, IREE is available as two Python packages on PyPI: iree-base-compiler for the compilation step and iree-base-runtime for running the compiled output. The project is licensed under Apache 2.0 with LLVM Exceptions, and active development discussions happen on a Discord server and mailing lists.
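The compile-then-run split between the two packages can be sketched in a few lines of Python. This is a minimal illustration, assuming both packages and NumPy are installed. It follows the pattern shown in IREE's Python binding examples as best understood here, not a definitive recipe. The tiny hand-written MLIR module stands in for a real exported model, the function name `compile_and_run_on_cpu` is hypothetical, and the `llvm-cpu` backend and `local-task` driver names are assumptions that may differ between releases.

```python
# A tiny hand-written MLIR module standing in for a real exported model.
SIMPLE_MUL_MLIR = """
func.func @simple_mul(%lhs: tensor<4xf32>, %rhs: tensor<4xf32>) -> tensor<4xf32> {
  %result = arith.mulf %lhs, %rhs : tensor<4xf32>
  return %result : tensor<4xf32>
}
"""


def compile_and_run_on_cpu(lhs, rhs):
    """Compile SIMPLE_MUL_MLIR for the host CPU and run it on two float32 arrays."""
    # Imported here so this file can be loaded without the IREE packages installed.
    from iree import compiler as ireec  # from iree-base-compiler
    from iree import runtime as ireert  # from iree-base-runtime

    # Compilation step: MLIR text in, serialized module bytes out.
    # "llvm-cpu" is an assumed backend name; check your installed version.
    vmfb = ireec.tools.compile_str(SIMPLE_MUL_MLIR, target_backends=["llvm-cpu"])

    # Runtime step: load the compiled bytes on the CPU driver and call the function.
    config = ireert.Config("local-task")  # assumed CPU driver name
    ctx = ireert.SystemContext(config=config)
    vm_module = ireert.VmModule.copy_buffer(ctx.instance, vmfb)
    ctx.add_vm_module(vm_module)

    # An unnamed MLIR module is exposed under the default name "module".
    simple_mul = ctx.modules.module["simple_mul"]
    return simple_mul(lhs, rhs).to_host()
```

With the packages installed, calling the function with two NumPy arrays of four `float32` values should return their elementwise product. Retargeting means changing the backend passed to the compiler and the driver passed to the runtime while the model itself stays the same.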

Copy-paste prompts

Prompt 1

Using IREE's Python API, how do I compile a PyTorch model with iree-base-compiler and run it with iree-base-runtime on CPU?

Prompt 2

I want to target Vulkan with IREE. What compilation flags do I pass to iree-base-compiler to produce a Vulkan-compatible artifact?

Prompt 3

How do I install iree-base-compiler and iree-base-runtime from PyPI and run a simple JAX model through the IREE pipeline?

Prompt 4

Show me how to check which hardware backends IREE supports on my machine after installing the runtime package.

Verify against the repo before relying on details.