Learn how linear regression, decision trees, and neural networks actually work by reading and running their implementations.
Prepare for machine learning technical interviews by studying algorithm implementations from first principles.
Understand what happens inside clustering, dimensionality reduction, and generative models by running visualized examples.
Build intuition for deep learning by implementing convolutional layers, recurrent layers, and attention mechanisms yourself.
ML From Scratch is a collection of Python implementations of machine learning algorithms written from first principles with NumPy, the fundamental numerical computing library. Its goal is education: rather than providing optimized, production-ready code, it shows exactly how each algorithm works step by step, so the underlying math and logic stay visible and approachable.

The project covers a broad range of machine learning techniques organized into four categories. Supervised learning includes algorithms like linear regression, decision trees, support vector machines, and neural networks. Unsupervised learning includes clustering methods like k-means and DBSCAN, dimensionality reduction methods like PCA, and generative models like variational autoencoders and generative adversarial networks. Reinforcement learning includes deep Q-networks. The deep learning section covers building neural network layers from scratch, including convolutional layers, recurrent layers, batch normalization, and attention mechanisms.

Each implementation is accompanied by runnable example scripts that produce visualizations, such as an animated GIF of a GAN learning to generate handwritten digits or a graph of a regression model fitting temperature data. Running the scripts and watching the algorithms work makes abstract concepts concrete.

Use this repository when you are studying machine learning and want to understand what actually happens inside a model, rather than treating a high-level library like scikit-learn or PyTorch as a black box. It is also useful for preparing for technical interviews where implementation knowledge matters.

The tech stack is Python, with NumPy as the only significant dependency. Some examples also use scikit-learn for datasets and Matplotlib for plotting. The project is designed to be read and run locally rather than deployed.
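To give a sense of what "from first principles with NumPy" can look like, here is a minimal sketch of linear regression trained with batch gradient descent. It is not taken from the repository: the class name `LinearRegressionGD`, its parameters, and the synthetic data are all assumptions made for illustration, and the repository's own implementation may be structured differently.

```python
import numpy as np


class LinearRegressionGD:
    """Illustrative linear regression fitted by batch gradient descent.

    Hypothetical class for this summary, not the repository's API.
    """

    def __init__(self, learning_rate=0.1, n_iterations=2000):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
        self.weights = None
        self.bias = 0.0

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        for _ in range(self.n_iterations):
            # Residuals of the current linear prediction.
            error = X @ self.weights + self.bias - y
            # Gradients of the mean squared error with respect to w and b.
            grad_w = (2.0 / n_samples) * (X.T @ error)
            grad_b = 2.0 * error.mean()
            # Step against the gradient.
            self.weights -= self.learning_rate * grad_w
            self.bias -= self.learning_rate * grad_b
        return self

    def predict(self, X):
        return X @ self.weights + self.bias


# Synthetic data (assumed for the example): y = 3x + 2 plus small noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=200)

model = LinearRegressionGD().fit(X, y)
print("weight:", model.weights[0], "bias:", model.bias)
```

The fitted weight and bias should land close to the generating values of 3 and 2. Writing the update loop out by hand, instead of calling a library's `fit`, is the kind of visibility into the math that the project aims for.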
Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.