Follow along with the Li Hang textbook using runnable Python code that maps each step to the book's equations.
Study SVM, AdaBoost, or Hidden Markov Model implementations in clean, heavily annotated Python.
Run the K-means or PCA code to verify your understanding of a specific chapter's algorithm.
README and code comments are in Chinese, and the repo assumes you are already familiar with the Li Hang textbook.
This repository contains Python implementations of every algorithm from "Statistical Learning Methods" (统计学习方法) by Li Hang, a well-known Chinese machine learning textbook. The author's stated goal was to annotate every line of code and to mark key sections with the formulas from the book they correspond to, so that a reader can follow the code alongside the text and trace each step back to its equation.

The supervised learning section covers the perceptron (a single-layer linear classifier), K-nearest neighbors (classifying a point by the labels of its closest training examples), Naive Bayes, decision trees, logistic regression, maximum entropy models, support vector machines (SVM), AdaBoost, the EM algorithm (used to estimate parameters when a model has hidden or unobserved variables), and Hidden Markov Models (sequence models used in speech and language tasks). The unsupervised learning section covers K-means clustering, hierarchical clustering, principal component analysis (PCA, a method for reducing the number of variables in data), latent semantic analysis (LSA), probabilistic latent semantic analysis (PLSA), latent Dirichlet allocation (LDA), and PageRank.

The README is written primarily in Chinese, and the project is aimed at Chinese-speaking learners working through this specific textbook. A companion blog series explains the algorithms. One update note mentions that the author has signed a publishing contract to release a printed book based on this repository.

The license is Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): you can share and adapt the code for non-commercial purposes as long as you credit the original author and distribute any adapted version under the same license. Contributions from the community are welcome via pull requests.
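To give a feel for the "code annotated with its formula" style described above, here is a minimal sketch of the primal-form perceptron, the first algorithm in the supervised list. It is not taken from the repository: the function names `train_perceptron` and `predict`, the parameters `lr` and `max_epochs`, and the three-point toy dataset are all assumptions made for illustration, and the repo's own code may be organized quite differently.

```python
import numpy as np


def train_perceptron(X, y, lr=1.0, max_epochs=100):
    """Primal-form perceptron (illustrative sketch, not the repo's code).

    X: array of shape (n_samples, n_features)
    y: array of labels in {-1, +1}
    """
    w = np.zeros(X.shape[1])  # weight vector w, initialized to 0
    b = 0.0                   # bias b, initialized to 0
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            # A point is misclassified when y_i * (w . x_i + b) <= 0
            if y_i * (np.dot(w, x_i) + b) <= 0:
                w += lr * y_i * x_i  # w <- w + eta * y_i * x_i
                b += lr * y_i        # b <- b + eta * y_i
                mistakes += 1
        if mistakes == 0:
            # A full pass with no updates: the training set is separated
            break
    return w, b


def predict(X, w, b):
    # f(x) = sign(w . x + b), with sign(0) mapped to +1
    return np.where(X @ w + b >= 0, 1, -1)


# Hypothetical, linearly separable toy data: two positive points, one negative
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])

w, b = train_perceptron(X, y)
print("w =", w, "b =", b)
print("predictions:", predict(X, w, b))
```

The inline comments pair each update with the rule it implements, which is the kind of traceability the author says the repository aims for. The loop terminates early only if the data is linearly separable; otherwise it stops at `max_epochs`.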
Verify against the repo before relying on details.