Analysis updated 2026-06-24
Read clean from-scratch implementations of LSTM, attention, and gradient boosting
Prototype and modify a Q-learning or Dyna-Q agent against an OpenAI Gym env
Use as a reference when implementing ML algorithms for a class or paper
| | ddbourgin/numpy-ml | meta-llama/codellama | nvidia/megatron-lm |
|---|---|---|---|
| Stars | 16,340 | 16,327 | 16,322 |
| Language | Python | Python | Python |
| Setup difficulty | easy | hard | hard |
| Complexity | 3/5 | 4/5 | 5/5 |
| Audience | researcher | developer | researcher |
Figures from each repo's GitHub metadata at analysis time.
Reinforcement-learning modules need OpenAI Gym installed separately.
numpy-ml is a Python library that implements a wide range of machine learning algorithms using only NumPy, a fundamental Python library for numerical computing. Most production ML tools hide the math behind high-level abstractions, which helps when building quickly but makes it hard to see what is actually happening. numpy-ml takes the opposite approach: the code is meant to be readable ("somewhat legible", as the README itself puts it) rather than optimized for speed, which makes it a resource for learning and experimentation.
The library covers algorithms across several categories: neural network layers (including LSTM, attention, convolution, and normalization layers), classical machine learning models (decision trees, random forests, gradient boosting, linear and logistic regression), probabilistic models (Gaussian mixture models, hidden Markov models, Bayesian regression), reinforcement learning agents (Q-learning, Monte Carlo, Dyna-Q), and preprocessing utilities (text tokenization, Fourier transforms, feature encoding).
You would use numpy-ml to study how machine learning algorithms work under the hood, whether for a course, for research, or to build intuition before moving to a larger production framework. It is also useful as a prototyping sandbox where you can experiment with and modify algorithms without fighting a complex codebase. It installs as a Python package via pip. The reinforcement learning agents require OpenAI Gym, which can be installed alongside it.
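To give a feel for what "from scratch in NumPy" can look like, here is a minimal sketch of tabular Q-learning, one of the agent types listed above. It is not taken from numpy-ml and does not use its API: the function names `step` and `q_learning`, the five-state chain environment (a stand-in for a Gym environment), and all hyperparameter values are assumptions made for illustration.

```python
import numpy as np


def step(state, action, n_states):
    """Hypothetical 1-D chain: action 0 moves left, action 1 moves right.

    The episode ends with reward 1.0 when the right-most state is reached.
    """
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(n_states=5, n_episodes=300, alpha=0.5, gamma=0.9,
               epsilon=0.3, max_steps=10_000, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, 2))  # one row per state, one column per action

    for _ in range(n_episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                # Explore: pick an action uniformly at random.
                action = int(rng.integers(2))
            else:
                # Exploit: pick a greedy action, breaking ties at random.
                best = np.flatnonzero(Q[state] == Q[state].max())
                action = int(rng.choice(best))

            next_state, reward, done = step(state, action, n_states)

            # Q-learning target: bootstrap from the best next-state value,
            # except at terminal states, where there is nothing to bootstrap.
            target = reward + (0.0 if done else gamma * Q[next_state].max())
            Q[state, action] += alpha * (target - Q[state, action])

            state = next_state
            if done:
                break
    return Q


Q = q_learning()
print(np.round(Q, 3))
print("greedy action per non-terminal state:", np.argmax(Q[:-1], axis=1))
```

Under these assumptions the learned greedy policy should prefer moving right in every non-terminal state. Because the whole algorithm fits in a few dozen lines, changing the update rule or the exploration scheme is a matter of editing a line or two, which is the kind of experimentation this style of code is suited to.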
A Python library of from-scratch ML algorithms written only in NumPy. Designed to be readable for learning rather than fast for production.
Mainly Python. The stack also includes NumPy and OpenAI Gym.
Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.
Aimed mainly at researchers.
Verify against the repo before relying on details.