explaingit

rasbt/python-machine-learning-book

12,613 | Jupyter Notebook | Audience · researcher | Complexity · 2/5 | Setup · easy

TLDR

Jupyter Notebook code examples from the first edition of 'Python Machine Learning' by Sebastian Raschka, covering classification, clustering, neural networks, and text analysis using scikit-learn and NumPy.

Mindmap

mindmap
  root((Python ML Book))
    Content
      13 chapters
      454 pages
      Code notebooks
    Topics
      Classification
      Regression
      Clustering
      Neural networks
    Libraries
      scikit-learn
      NumPy
      Theano GPU
    Format
      Jupyter Notebooks
      Per-chapter folders
    Notes
      First edition only
      Buy book for text

Things people build with this

USE CASE 1

Follow along with the 'Python Machine Learning' textbook by running the chapter notebooks interactively.

USE CASE 2

Study working code examples for classification, regression, clustering, and neural networks in Python.

USE CASE 3

Learn how to evaluate model accuracy and combine multiple models to improve prediction performance.

USE CASE 4

See how to embed a trained machine learning model into a web application using the chapter 9 example.
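As a rough illustration of use case 3, the sketch below combines the predictions of several classifiers by majority vote and compares accuracies. It is a minimal pure-Python sketch, not code taken from the book's notebooks: the helper names `majority_vote` and `accuracy` are hypothetical, and the labels and predictions are made up for illustration.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine class labels from several models by majority vote, sample by sample."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) yields the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels and per-model predictions (illustration only).
y_true = [0, 1, 1, 0, 1, 0]
model_a = [0, 1, 0, 0, 1, 1]
model_b = [0, 0, 1, 0, 1, 0]
model_c = [1, 1, 1, 0, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("model_a", model_a), ("model_b", model_b),
                    ("model_c", model_c), ("ensemble", ensemble)]:
    print(f"{name}: accuracy = {accuracy(y_true, preds):.2f}")
```

In this toy data each model errs on different samples, so the vote corrects every individual mistake. With real models the gain depends on how correlated their errors are.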

Tech stack

Python, Jupyter Notebook, NumPy, scikit-learn, Theano

Getting it running

Difficulty · easy | Time to first run · 30 min

The code examples require the printed or digital book for the explanatory text and math; the notebooks alone do not teach the concepts.
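Before opening the notebooks, it can help to check which of the libraries in the tech stack are importable in the current environment. The following is a small sketch using only the standard library; the function name `check_libraries` is hypothetical, and the module names `numpy`, `sklearn`, and `theano` are assumed to be the import names of the libraries listed above.

```python
import importlib
import importlib.util


def check_libraries(names=("numpy", "sklearn", "theano")):
    """Map each module name to its version string, None if it is not
    installed, or a short note if it is installed but fails to import."""
    report = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        try:
            module = importlib.import_module(name)
        except Exception:
            # Older libraries can fail to import on newer Python versions.
            report[name] = "installed, but import failed"
            continue
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_libraries().items():
        print(f"{name}: {version or 'not installed'}")
```

Which versions the notebooks expect is not stated here, so treat the output as a starting point and compare it against the repository's own instructions.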

In plain English

This repository holds the code examples from the first edition of "Python Machine Learning" by Sebastian Raschka, published by Packt Publishing in 2015. The book is 454 pages and covers machine learning from theory through working code. The code here is meant to accompany the printed or digital book, not to stand on its own, since the notebooks contain code but not the explanatory text and mathematical formulas from the book itself.

The content spans 13 chapters. Topics include training classification algorithms, preparing and cleaning data, reducing the number of variables in a dataset, evaluating how well a model works, combining multiple models to improve accuracy, analyzing text sentiment, embedding a trained model into a web application, regression for predicting numerical values, grouping unlabeled data, and training neural networks for image recognition. The final chapter covers speeding up neural network training using a library called Theano.

The main Python libraries used throughout the examples are NumPy (for numerical operations), scikit-learn (a widely used machine learning toolkit), and Theano (a numerical computation library that can use a graphics card to speed up calculations). Each chapter has its own folder with Jupyter Notebook files that can be opened interactively in a browser.

This repository covers the first edition only. A second edition exists in a separate GitHub repository with updated content. The book has been translated into German, Japanese, Italian, Chinese (both traditional and mainland editions), Korean, and Russian. The repository also links to free supplementary math and NumPy materials the author prepared for a separate book, covering algebra basics, calculus, and an introduction to NumPy, for readers who want background on the underlying mathematics.
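To make a few of these topics concrete (data preparation, reducing the number of variables, training a classifier, and evaluating it), here is a minimal sketch using current scikit-learn. It is not taken from the notebooks: the Iris dataset, the 70/30 split, and the choice of two principal components are assumptions made for illustration, and module paths in the first-edition notebooks may differ from the ones used here.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small dataset bundled with scikit-learn (assumed here for illustration).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing, preserving class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Standardize features, project onto 2 principal components, then classify.
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    LogisticRegression(),
)
pipeline.fit(X_train, y_train)

# score() reports classification accuracy on the held-out test set.
test_accuracy = pipeline.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Wrapping the steps in one pipeline means the scaler and PCA are fitted on the training split only, so the test accuracy is not influenced by information from the test samples.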

Copy-paste prompts

Prompt 1
Using scikit-learn as shown in the Python Machine Learning book, how do I train a logistic regression classifier and evaluate its accuracy on a test set?
Prompt 2
In chapter 8 of Python Machine Learning, how is sentiment analysis done on text data, and what preprocessing and model are used?
Prompt 3
How does the book's chapter on neural networks use Theano to speed up training with a GPU?
Prompt 4
Show me how to use PCA for dimensionality reduction on a dataset, following the approach in the Python Machine Learning first edition notebooks.
Prompt 5
How does the ensemble method in the Python Machine Learning book combine multiple classifiers to improve accuracy?

Verify against the repo before relying on details.