jada42/mlx-mamba3

Analysis updated 2026-05-18

★ 3
Python
Audience · researcher
Complexity · 4/5
License · MIT
Setup · moderate

Mindmap

mindmap
  root((mlx-mamba3))
    What it does
      Mamba-3 on Mac
      No CUDA needed
      Metal GPU acceleration
    Modes
      SISO mode
      MIMO mode
      Hybrid attention
    Features
      LoRA fine-tuning
      Weight serialization
      Benchmarking
    Tech stack
      Python
      MLX framework
      safetensors
    Audience
      Researchers
      Mac developers


What do people build with it?

USE CASE 1

Run Mamba-3 text generation locally on an Apple Silicon Mac without Nvidia hardware.

USE CASE 2

Fine-tune a Mamba-3 language model on your own text data using LoRA on your Mac.

USE CASE 3

Benchmark Mamba-3 inference speed on Apple Silicon hardware.
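
Benchmarks of this kind usually report throughput in tokens per second. The sketch below shows one way to measure it with only the standard library. It does not use the repository's benchmarking script: `tokens_per_second` and `fake_decode_step` are hypothetical names, and the sleep call is a placeholder for one real decode step.

```python
import time

def tokens_per_second(step_fn, num_tokens, warmup=5):
    """Time `num_tokens` calls to `step_fn` and return the throughput."""
    for _ in range(warmup):          # warm-up calls are excluded from timing
        step_fn()
    start = time.perf_counter()
    for _ in range(num_tokens):
        step_fn()
    elapsed = time.perf_counter() - start
    return num_tokens / elapsed

def fake_decode_step():
    """Hypothetical stand-in for generating one token with a real model."""
    time.sleep(0.001)

rate = tokens_per_second(fake_decode_step, num_tokens=50)
print(f"{rate:.1f} tokens/s")
```

MLX evaluates arrays lazily, so a real step function would need to force evaluation (for example with `mx.eval`) inside the timed loop. Otherwise the timer may stop before the GPU work has actually run.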

What is it built with?

Python, MLX, PyTorch, safetensors

How does it compare?

	jada42/mlx-mamba3	0marildo/imago	agentlexi/agent-lexi
Stars	3	3	3
Language	Python	Python	Python
Setup difficulty	moderate	easy	moderate
Complexity	4/5	2/5	4/5
Audience	researcher	general	vibe coder

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate
Time to first run · 30 min

Requires an Apple Silicon Mac and specific Python dependencies including MLX and PyTorch.

Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

Mamba-3 is a neural network architecture that processes text using a mathematical technique called state-space modeling. Compared with the transformer models that power most modern AI tools, it is designed to handle long sequences more efficiently.

This repository brings a working version of Mamba-3 to Apple Silicon Macs, using a framework called MLX that runs directly on the Mac's built-in GPU. Most existing Mamba-3 code was written for Linux machines with Nvidia graphics cards (CUDA), which means Mac users were locked out of experimenting with the architecture locally. This project rebuilds the full model in pure Python and MLX, so anyone with an M1, M2, or M3 Mac can run it without any Linux tools or extra hardware.

The implementation covers three main configurations. SISO mode (single-input, single-output) handles simple channel-by-channel processing. MIMO mode (multi-input, multi-output) uses matrix projections for more expressive mixing. Hybrid mode alternates between Mamba-3 blocks and standard attention layers. All three are reported to match the outputs of the original PyTorch code numerically, with a maximum error below 0.00001 (1e-5).

Beyond basic inference, the repository includes LoRA fine-tuning support, which lets users adapt a pre-trained model to new text data using mixed-precision training on the Mac GPU. Weights can be saved and loaded in the standard safetensors format. A benchmarking script reports roughly 469 tokens per second during text generation on an M1 Pro.

The codebase is organized into a core Python package covering model definition, cache management, weight loading, the generation loop, and training utilities, plus example scripts for text generation, hybrid model use, and a small fine-tuning demo on a toy dataset. A test suite compares numerical outputs against the PyTorch reference at each correctness boundary. The project is licensed under MIT and is actively maintained, with continuous integration running on each commit.
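
To make "state-space modeling" and the 1e-5 error check more concrete, here is a minimal NumPy sketch. It assumes a generic linear state-space layer with a fixed diagonal transition, which is much simpler than Mamba-3 itself and is not code from this repository. The names `ssm_scan` and `ssm_conv` are hypothetical. The sketch computes the same output two ways (a step-by-step recurrence and an equivalent convolution) and compares them by maximum absolute error, the same style of check used when validating a port against a reference.

```python
import numpy as np

def ssm_scan(a, b, c, x):
    """Recurrent form: h_t = a * h_{t-1} + b * x_t, y_t = dot(c, h_t)."""
    h = np.zeros_like(a)
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        h = a * h + b * x_t      # update the hidden state (diagonal transition)
        y[t] = np.dot(c, h)      # read out one output value per step
    return y

def ssm_conv(a, b, c, x):
    """Convolution form: y_t = sum_k K[k] * x[t-k], K[k] = sum(c * a**k * b)."""
    n = len(x)
    powers = a[None, :] ** np.arange(n)[:, None]   # shape (n, state_dim)
    kernel = powers @ (c * b)                       # shape (n,)
    return np.convolve(x, kernel)[:n]               # keep the causal part

# Illustrative sizes and random parameters (assumptions, not real weights).
rng = np.random.default_rng(0)
state_dim, seq_len = 4, 32
a = rng.uniform(0.5, 0.95, state_dim)   # decay factors below 1 keep the state stable
b = rng.normal(size=state_dim)
c = rng.normal(size=state_dim)
x = rng.normal(size=seq_len)

y_scan = ssm_scan(a, b, c, x)
y_conv = ssm_conv(a, b, c, x)
max_err = float(np.max(np.abs(y_scan - y_conv)))
print(f"max abs error: {max_err:.2e}")
```

The recurrent form keeps only a fixed-size state per step, which is why state-space models are attractive for long sequences. A parity test against a reference implementation follows the same pattern: run both on identical inputs and assert that the maximum absolute difference stays under a tolerance.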

Copy-paste prompts

Prompt 1

Using the mlx-mamba3 library, write a Python script that loads a Mamba-3 model and generates text from a prompt on my Mac.

Prompt 2

Show me how to fine-tune a Mamba-3 model with LoRA using mlx-mamba3 on a small custom text dataset.

Prompt 3

How do I benchmark Mamba-3 inference speed with mlx-mamba3 on an Apple M-series Mac?

Prompt 4

Write a Python script using mlx-mamba3 to run a hybrid Mamba-3 plus attention model for text generation.

Frequently asked questions

What is mlx-mamba3?

A Python library that runs the Mamba-3 neural network architecture natively on Apple Silicon Macs, without needing Nvidia hardware or Linux.

What language is mlx-mamba3 written in?

Mainly Python. The stack also includes MLX, PyTorch, and safetensors.

What license does mlx-mamba3 use?

MIT. Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

How hard is mlx-mamba3 to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is mlx-mamba3 for?

Mainly researchers.


Verify against the repo before relying on details.