explaingit

iamseancheney/python_for_data_analysis_2nd_chinese_version

8,897
Audience · data
Complexity · 2/5
Setup · easy

TLDR

A Chinese translation of the second edition of 'Python for Data Analysis' by Wes McKinney, covering pandas, NumPy, and matplotlib for data manipulation, numerical computing, and visualization with Python 3.6.

Mindmap

mindmap
  root((Py Data Analysis))
    What it does
      Chinese translation
      Data analysis book
      Python 3.6 examples
    Libraries
      pandas
      NumPy
      matplotlib
    Setup
      Anaconda
      Jupyter notebook
    Audience
      Chinese speakers
      Data learners

Things people build with this

USE CASE 1

Learn data analysis with Python in Chinese by working through the translated McKinney book alongside its code examples.

USE CASE 2

Get up to speed on pandas and NumPy fundamentals using a structured book-format guide rather than scattered tutorials.

Tech stack

Python · pandas · NumPy · matplotlib · Jupyter

Getting it running

Difficulty · easy
Time to first run · 30 min

Requires an Anaconda installation and the code from the original book's GitHub repository. A Jupyter notebook environment is needed to run the examples.
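A quick way to confirm the environment is ready can be sketched as below. It assumes the libraries are importable under their usual names (`pandas`, `numpy`, `matplotlib`, `notebook`); the helper `check_environment` is hypothetical and not part of the repository.

```python
import importlib.util

# Import names assumed for the stack listed on this page; adjust if your setup differs.
REQUIRED = ["pandas", "numpy", "matplotlib", "notebook"]


def check_environment(modules=REQUIRED):
    """Return a dict mapping each module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

Anything reported as missing can usually be installed through Anaconda before opening the notebooks.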

In plain English

This repository contains a Chinese translation of the book "Python for Data Analysis, 2nd Edition," written by Wes McKinney, the creator of the pandas data library. The translation covers the second edition published in October 2017, which updated all code examples to Python 3.6 and brought the pandas and Anaconda references up to date compared to the first edition.

The book teaches readers how to work with data using Python, focusing on the pandas library for data manipulation, NumPy for numerical computing, and matplotlib for visualization. It also briefly covers StatsModels and scikit-learn, two additional libraries used for statistical modeling and machine learning.

The README notes that the third edition of the book has since been published, with further updates to pandas and Python versions. The translator also mentions a separate translation of a book about Polars, a newer data processing library written in the Rust programming language that has attracted attention for handling large datasets faster than pandas.

To use this translation, the README suggests downloading the accompanying code from the original book's GitHub repository, installing Anaconda (a Python distribution commonly used for data work), and opening the files in Jupyter notebook, which is a browser-based environment for running code alongside text and notes.

This repository is primarily aimed at Chinese-speaking readers who want to learn data analysis with Python using a translated version of the widely-read McKinney book.
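To give a flavor of the kind of code the book walks through, here is a minimal sketch that combines NumPy and pandas. The data is made up for illustration and the snippet is not taken from the book or the translation.

```python
import numpy as np
import pandas as pd

# Made-up sample data, purely for illustration.
df = pd.DataFrame(
    {
        "city": ["Beijing", "Shanghai", "Beijing", "Shanghai"],
        "sales": [10.0, 20.0, 30.0, 40.0],
    }
)

# NumPy: vectorized arithmetic over the whole column at once.
df["log_sales"] = np.log10(df["sales"])

# pandas: split the rows by city, then average each group.
mean_sales = df.groupby("city")["sales"].mean()
print(mean_sales)
```

In a Jupyter notebook, calling `mean_sales.plot(kind="bar")` would draw the result through matplotlib.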

Copy-paste prompts

Prompt 1
I'm following the Chinese translation of Python for Data Analysis. Help me understand the pandas GroupBy examples in the aggregation chapter.
Prompt 2
Set up Anaconda and Jupyter so I can run the code examples from Python for Data Analysis 2nd edition alongside this Chinese translation.
Prompt 3
Using the pandas techniques from this book, help me clean a messy CSV file by handling missing values, duplicate rows, and inconsistent date formats.
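As a rough idea of what the cleaning task in Prompt 3 might involve, the sketch below handles the three problems on a tiny made-up CSV. The column names, the median fill policy, and the per-row date parsing are assumptions chosen for illustration, not the book's prescribed method.

```python
import io

import pandas as pd

# Made-up CSV text standing in for a messy file on disk.
RAW = """name,joined,score
Ann,2021-01-05,90
Bob,01/06/2021,
Ann,2021-01-05,90
Cai,"Jan 7, 2021",75
"""

df = pd.read_csv(io.StringIO(RAW))

# Duplicate rows: keep the first occurrence of each identical row.
df = df.drop_duplicates().reset_index(drop=True)

# Missing values: fill gaps in `score` with the column median (one possible policy).
df["score"] = df["score"].fillna(df["score"].median())

# Inconsistent dates: parse each string on its own so formats may differ per row.
# Ambiguous values such as 01/06/2021 are read month-first by default.
df["joined"] = df["joined"].map(pd.to_datetime)

print(df)
```

Parsing row by row is slow on large files; it is used here only because it tolerates a different format on every line.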

Verify against the repo before relying on details.