explaingit

mistralai/mistral-inference

10,804 | Jupyter Notebook | Audience · researcher | Complexity · 4/5 | License | Setup · hard

TLDR

The official Python library for running Mistral AI's open-weight language models locally on your own GPU, with support for chat, function calling, image input via Pixtral, and fine-tuning with LoRA.

Mindmap

mindmap
  root((mistral-inference))
    What it does
      Run models locally
      Chat and demo CLI
      Fine-tune with LoRA
    Models supported
      Mistral 7B
      Mixtral 8x7B 8x22B
      Codestral Pixtral
    Requirements
      GPU required
      xformers library
      Hugging Face Hub
    Features
      Function calling
      Image input
      Jupyter tutorials

Things people build with this

USE CASE 1

Run a local Mistral 7B chatbot on your own GPU without sending data to any external API.

USE CASE 2

Fine-tune a Mistral model on a custom dataset using LoRA to specialize it for a specific task.

USE CASE 3

Use Pixtral's image-input support to build a local pipeline that describes or reasons about images.

Tech stack

Python, PyTorch, xformers, Jupyter Notebook, Hugging Face

Getting it running

Difficulty · hard | Time to first run · 1 day+

Requires a compatible NVIDIA GPU and the xformers package. The large Mixtral models need multiple GPUs.
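
The requirements above can be checked before attempting an install. The following is a minimal sketch using only the Python standard library, not part of mistral-inference itself. The function name `preflight` is hypothetical, and using `nvidia-smi` on the PATH as a proxy for a working NVIDIA driver is an assumption rather than a guarantee that the GPU is compatible.

```python
import importlib.util
import shutil


def preflight() -> dict:
    """Report whether the basic prerequisites appear to be present.

    Hypothetical helper for illustration; it only looks for tools and
    packages, it does not verify GPU memory or driver versions.
    """
    return {
        # nvidia-smi ships with the NVIDIA driver, so its absence
        # suggests there is no usable driver on this machine.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # find_spec locates a top-level package without importing it.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "xformers_installed": importlib.util.find_spec("xformers") is not None,
    }


if __name__ == "__main__":
    for name, ok in preflight().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A result of `missing` for any entry is a hint to fix the environment first. A result of `ok` only means the tool or package was found, not that the model will fit in GPU memory.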

Most Mistral models allow free commercial use, but Codestral and Mistral Large are licensed for non-commercial research only. Check each model's license before use.

In plain English

Mistral Inference is the official Python library for running Mistral AI's language models on your own hardware. Mistral AI is a French AI company that releases open-weight large language models, meaning the model files are publicly available for download and can be run locally rather than only through a cloud API.

The library requires a GPU (graphics card) to install and run, because it depends on a GPU-acceleration package called xformers. Once installed, you download the model weights you want and point the library's command-line tools at the folder. The main commands are mistral-demo for a quick test and mistral-chat for an interactive conversation. Larger models like the 8x7B and 8x22B Mixtral variants need multiple GPUs and are launched with the torchrun command.

Models are available in two ways: direct download links (tar archives from Mistral's servers) or through the Hugging Face Hub using a Python download helper. The library supports the full range of Mistral's model lineup, including Mistral 7B, the Mixtral mixture-of-experts models, Codestral (a code-focused variant), Mathstral (math-focused), Mistral Nemo, Mistral Large, and Pixtral (which can process images). Most models allow commercial use, but Codestral and Mistral Large carry a non-commercial research license.

Beyond chatting, the library supports function calling (letting the model invoke tools you define), fine-tuning on your own data using a technique called LoRA, and image input for the Pixtral models. Tutorials are included as Jupyter notebooks and can be opened directly in Google Colab. Documentation is at docs.mistral.ai and community support is available via a Discord server.
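
The download-then-launch workflow described above can be sketched in Python. This is an illustration under stated assumptions, not the library's own API. `download_weights` and `build_command` are hypothetical helper names. `snapshot_download` is the Hugging Face Hub function assumed to be the "download helper" mentioned above. The `--instruct` and `--max_tokens` flags and the `torchrun --nproc-per-node N --no-python` prefix are assumed from the repo's README at the time of writing, so check them against the current repo. The file list is left as a parameter because the required files differ per model.

```python
import shlex
from pathlib import Path
from typing import List, Optional


def download_weights(repo_id: str, local_dir: str, files: List[str]) -> str:
    """Fetch selected files of a model repo from the Hugging Face Hub.

    Hypothetical wrapper. Requires the third-party huggingface_hub
    package, imported lazily so the rest of this sketch runs without it.
    """
    from huggingface_hub import snapshot_download

    Path(local_dir).mkdir(parents=True, exist_ok=True)
    return snapshot_download(
        repo_id=repo_id, allow_patterns=files, local_dir=local_dir
    )


def build_command(tool: str, model_dir: str, num_gpus: int = 1,
                  instruct: bool = False,
                  max_tokens: Optional[int] = None) -> List[str]:
    """Build (but do not run) a mistral-demo or mistral-chat command line."""
    if tool not in ("mistral-demo", "mistral-chat"):
        raise ValueError(f"unknown tool: {tool}")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    cmd: List[str] = []
    if num_gpus > 1:
        # Assumed multi-GPU launch form: torchrun starts one process per GPU.
        cmd += ["torchrun", "--nproc-per-node", str(num_gpus), "--no-python"]
    cmd += [tool, model_dir]
    if tool == "mistral-chat":
        # Flags assumed from the README; verify before relying on them.
        if instruct:
            cmd.append("--instruct")
        if max_tokens is not None:
            cmd += ["--max_tokens", str(max_tokens)]
    return cmd


if __name__ == "__main__":
    # Paths are placeholders for wherever the weights were unpacked.
    print(shlex.join(build_command("mistral-chat", "/models/7B-instruct",
                                   instruct=True, max_tokens=256)))
    print(shlex.join(build_command("mistral-demo", "/models/8x7B", num_gpus=2)))
```

Building the command as a list keeps it safe to hand to `subprocess.run` without shell quoting issues, and printing it first makes it easy to compare against the README before anything is launched.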

Copy-paste prompts

Prompt 1
I have downloaded Mistral 7B weights locally. Using mistral-inference, show me how to start mistral-chat and have a conversation with the model.
Prompt 2
How do I fine-tune Mistral 7B on a custom dataset using LoRA with the mistral-inference library? Show me the key steps and commands.
Prompt 3
Using mistral-inference with Pixtral, how do I pass a local image to the model and ask it to describe what is in the picture?
Prompt 4
I want to run Mixtral 8x7B locally. What GPU setup do I need, and what torchrun command do I use to launch it with mistral-inference?

Verify against the repo before relying on details.