explaingit

wendesi/lihang_book_algorithm

5,836 | Python | Audience · researcher | Complexity · 3/5 | Setup · moderate

TLDR

Python implementations of every machine learning algorithm in Li Hang's Statistical Learning Methods textbook, organized chapter by chapter and mostly tested on the MNIST dataset. A hands-on companion for students working through the book.

Mindmap

mindmap
  root((lihang algorithms))
    Algorithms
      Perceptron
      Naive Bayes
      SVM
      AdaBoost
    Dataset
      MNIST digits
    Tech
      Python
      C++ for speed
    Audience
      ML students
      Textbook readers

Things people build with this

USE CASE 1

Run working Python code alongside each chapter of Li Hang's Statistical Learning Methods textbook to see the theory in action.

USE CASE 2

Test classic ML algorithms like SVM, Naive Bayes, and AdaBoost on the MNIST handwritten digit dataset.

USE CASE 3

Read the companion CSDN blog posts, written in Chinese, to understand the theory and the implementation together.

Tech stack

Python, C++

Getting it running

Difficulty · moderate | Time to first run · 30 min

The mixed Python/C++ version of the AdaBoost chapter requires compiling its C++ component before running.

In plain English

This repository contains Python implementations of every algorithm covered in "Statistical Learning Methods," a well-known Chinese textbook on machine learning written by Li Hang. The project works through the book chapter by chapter, translating each method from the theoretical text into runnable code.

The algorithms covered span a broad range of classic machine learning techniques. These include the perceptron (a basic binary classifier), K-nearest neighbors (a method that classifies new points by looking at nearby training examples), Naive Bayes (a probability-based classifier), decision trees, logistic regression, maximum entropy models, support vector machines, AdaBoost (a method that combines many weak classifiers into a stronger one), and hidden Markov models (used for sequence data like speech or text). There is also an extra implementation of a softmax classifier that goes beyond the book itself.

Most implementations are tested on the MNIST dataset, a standard collection of handwritten digit images commonly used to verify that a machine learning algorithm is working correctly. One algorithm (AdaBoost) also includes a version that mixes Python with C++ for performance reasons.

Each chapter entry in the README links to a companion blog post on CSDN (a Chinese developer platform) where the author walks through the implementation in more detail. The posts are written in Chinese and explain both the theory and the code. The repository is primarily a learning resource for people working through Li Hang's textbook who want to see the algorithms in working code alongside the mathematical explanations in the book.
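
To give a rough sense of what translating a method into runnable code can look like, here is a minimal sketch of the perceptron, the first algorithm listed above. It is not taken from the repository: the function names `train_perceptron` and `predict`, the parameters `lr` and `max_epochs`, and the three-point toy dataset are all assumptions made for illustration, and it uses plain Python lists instead of MNIST so it runs without any downloads.

```python
# Illustrative sketch only; names and data are hypothetical, not the repo's code.

def train_perceptron(samples, labels, lr=1.0, max_epochs=100):
    """Fit weights w and bias b with the classic perceptron update rule.

    samples: list of feature lists; labels: list of +1 / -1 values.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified, or exactly on the boundary
                # Move the separating hyperplane toward the misclassified point.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: training has converged
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Tiny linearly separable toy set (two positive points, one negative).
X = [[3, 3], [4, 3], [1, 1]]
y = [1, 1, -1]
w, b = train_perceptron(X, y)
predictions = [predict(w, b, x) for x in X]
```

On this toy set the loop stops once a full pass produces no mistakes, so `predictions` matches `y`. A binary perceptron like this would need an extra step, such as choosing two digit classes or thresholding the labels, before it could be applied to MNIST; how the repository handles that is best checked in its own code.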

Copy-paste prompts

Prompt 1
I'm reading Li Hang's Statistical Learning Methods and want to run the Python implementations from lihang_book_algorithm. Help me set up the environment and run the perceptron chapter code on MNIST.
Prompt 2
Using the SVM implementation in lihang_book_algorithm, explain how the algorithm works at a high level and show me how to test it on the MNIST digit dataset.
Prompt 3
The AdaBoost chapter in lihang_book_algorithm mixes Python and C++. Help me compile the C++ component and run the full example so I can verify my understanding of the boosting algorithm.
Prompt 4
I want to understand the Hidden Markov Model implementation in lihang_book_algorithm. Walk me through the code structure and explain what each part maps to in the textbook theory.

Verify against the repo before relying on details.