Learn pandas from scratch using real messy datasets in interactive browser notebooks without installing anything
Filter, sort, group, and clean tabular data from CSV files using pandas
Combine multiple datasets by merging on a shared column, then analyze the result
Extract patterns from timestamped data and load tables from a SQL database into a pandas DataFrame
Runs entirely in the browser via Jupyter Lite with no installation required; all three datasets are included in the repo.
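As a rough illustration of the filter, group, sort, and merge operations listed above, here is a minimal sketch. It is not taken from the cookbook: the inline CSV, the column names (`complaint`, `borough`, `code`), and all values are invented stand-ins for a real file such as the 311 service-call data.

```python
import io

import pandas as pd

# Invented inline CSV standing in for a real file on disk (assumption, not repo data)
calls_csv = io.StringIO(
    "complaint,borough\n"
    "Noise,BROOKLYN\n"
    "Heating,BRONX\n"
    "Noise,BRONX\n"
    "Noise,BROOKLYN\n"
)
calls = pd.read_csv(calls_csv)

# Filter: keep only the rows whose complaint is "Noise"
noise = calls[calls["complaint"] == "Noise"]

# Group and sort: count noise complaints per borough, largest count first
noise_by_borough = noise.groupby("borough").size().sort_values(ascending=False)

# Merge: attach a second (also invented) table on the shared "borough" column
codes = pd.DataFrame({"borough": ["BROOKLYN", "BRONX"], "code": ["BK", "BX"]})
merged = calls.merge(codes, on="borough", how="left")

print(noise_by_borough)
print(merged)
```

With a real dataset, `pd.read_csv` would take a file path instead of the in-memory buffer; the rest of the calls stay the same.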
Pandas is a Python library for working with structured data like spreadsheets and CSV files. It is widely used in data analysis because it makes it fast to filter, sort, group, and combine large datasets. This cookbook is a collection of worked examples intended to help beginners get started with pandas using real datasets rather than toy examples.

The cookbook is organized as nine chapters, each in its own Jupyter Notebook file. Jupyter Notebooks are interactive documents where code and explanatory text are combined, so you can run each example step by step in your browser or on your own machine. The chapters start with the basics, like reading a CSV file and selecting rows or columns, and progress through more involved tasks: grouping data to find patterns, combining multiple datasets, extracting information from text, cleaning up messy data, working with dates and timestamps, and loading data from a SQL database.

All three real-world datasets used in the cookbook are included in the repository, so you can run every example immediately without hunting for data. The datasets are 311 service calls in New York City, bicycle path counts in Montreal, and hourly Montreal weather data for 2012.

You can try the cookbook in your browser via Jupyter Lite without installing anything. To run it locally, you clone the repository, install the dependencies with pip, and start Jupyter. A Docker option is also described for those who prefer containers.

The cookbook was written by Julia Evans, who notes in the README that the official pandas documentation is thorough but that many people find it hard to get started without concrete examples that show real-world messiness. The license is Creative Commons Attribution-ShareAlike 4.0. A Chinese translation of the repository exists separately.
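The later topics mentioned above, loading a table from a SQL database and finding patterns in timestamped data, can be sketched as follows. This is an illustrative example under stated assumptions, not code from the cookbook: the in-memory SQLite database, the `weather` table, its columns, and the readings are all invented for the demonstration.

```python
import sqlite3

import pandas as pd

# Build a throwaway in-memory SQLite database with invented hourly readings
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (time TEXT, temp_c REAL)")
conn.executemany(
    "INSERT INTO weather VALUES (?, ?)",
    [
        ("2012-01-01 00:00:00", -2.0),
        ("2012-01-01 12:00:00", 0.0),
        ("2012-02-01 00:00:00", -8.0),
        ("2012-02-01 12:00:00", -4.0),
    ],
)

# Load the table into a DataFrame, parsing the text column into timestamps
# and using it as the index
weather = pd.read_sql(
    "SELECT time, temp_c FROM weather",
    conn,
    parse_dates=["time"],
    index_col="time",
)
conn.close()

# Extract a pattern from the timestamps: mean temperature per calendar month
monthly_mean = weather.groupby(weather.index.month)["temp_c"].mean()
print(monthly_mean)
```

Grouping on `weather.index.month` is one of several ways to aggregate by time; other attributes of the datetime index, such as `hour` or `dayofweek`, work the same way.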
Verify against the repo before relying on details.