explaingit

ossu/data-science

21,155 · Audience · vibe coder · Complexity · 2/5 · Stale · Setup · easy

TLDR

Free, self-taught Data Science curriculum covering math, programming, databases, statistics, and machine learning through university courses and MOOCs.

Mindmap

mindmap
  root((repo))
    What it does
      Curated course list
      Progress tracking
      Two-year roadmap
    Topic areas
      Math foundations
      Programming basics
      Databases
      Statistics
      Machine learning
    Learning resources
      MIT OpenCourseWare
      Coursera courses
      edX courses
    Support
      Discord community
      Progress spreadsheet
      Fork and track

Things people build with this

USE CASE 1

Build a self-paced Data Science education plan without paying for bootcamps or university tuition.

USE CASE 2

Track your progress through a structured curriculum by forking the repo and checking off completed courses.

USE CASE 3

Learn foundational math, programming, and statistics needed for machine learning and data analysis roles.

USE CASE 4

Join a community of self-taught learners for support and accountability while working through the curriculum.
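The progress tracking described in use case 2 can be sketched in a few lines of Python. This is a minimal illustration, not something the repo ships: it assumes your fork marks courses with GitHub-style task-list items (`- [x]` for done, `- [ ]` for pending), and the function name `summarize_progress` and the sample checklist are hypothetical.

```python
import re

# Assumption: courses are tracked as GitHub-style task-list items,
# e.g. "- [x] Calculus" (done) and "- [ ] Databases" (not done).
TASK_ITEM = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def summarize_progress(markdown_text: str) -> tuple[int, int]:
    """Return (completed, total) task-list items found in the text."""
    done = 0
    total = 0
    for line in markdown_text.splitlines():
        match = TASK_ITEM.match(line)
        if match is None:
            continue  # not a task-list line
        total += 1
        if match.group(1).lower() == "x":
            done += 1
    return done, total


# Hypothetical checklist using topic names from the curriculum.
SAMPLE = """\
- [x] Introduction to Computer Science
- [x] Calculus
- [ ] Linear Algebra
- [ ] Databases
"""

done, total = summarize_progress(SAMPLE)
print(f"{done}/{total} courses completed")
```

To run it against a real fork, read the tracking file's contents into a string and pass that to `summarize_progress` instead of `SAMPLE`.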

Tech stack

Python · R

Getting it running

Difficulty · easy
Time to first run · 5 min
License could not be detected automatically. Check the repository's LICENSE file before use.

In plain English

The Open Source Society University (OSSU) Data Science repository is a curated, free study path that walks you through the equivalent of an undergraduate Data Science degree using open university courses. There is no software to install: the "product" is the curriculum itself, a structured list of courses in a sensible order that you work through on your own time. It is built around the Curriculum Guidelines for Undergraduate Programs in Data Science report from the American Statistical Association.

The README explains how to use the guide. Studying about 20 hours a week, you can finish in roughly two years, and a linked spreadsheet helps you estimate your end date. Some courses can be taken in parallel; others must be done in sequence, and a topic-progression graph shows the recommended ordering. To track your work, you fork the repository into your own GitHub account and tick off courses as you finish them, effectively using the repo as a personal kanban board.

The curriculum prefers MOOC-style courses because they suit self-paced learners, and it expects high-school maths and statistics as prerequisites. The course list covers Introduction to Data Science, Introduction to Computer Science, Data Structures and Algorithms, Databases, Calculus, Linear Algebra, Statistics and Probability, Data Science Tools and Methods, Machine Learning and Data Mining, and a Final Project. Python and R are the main programming languages taught.

You would use this if you want to learn data science seriously without paying for a degree and you prefer a planned path. Support is available through a Discord community and GitHub issues. The full README is longer than what was provided.
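The pacing figure in the summary (roughly two years at about 20 hours a week) implies a total workload that you can rescale to your own schedule. The sketch below is a rough stand-in for that arithmetic, not the linked spreadsheet itself: the total of 2,080 hours is an assumption derived from 104 weeks × 20 hours, and the function names and start date are hypothetical.

```python
import math
from datetime import date, timedelta

# Assumption: "two years at 20 hours a week" means 104 weeks * 20 h = 2080 h.
# The real workload depends on the courses you pick and your background.
ASSUMED_TOTAL_HOURS = 104 * 20


def weeks_to_finish(total_hours: float, hours_per_week: float) -> int:
    """Whole weeks needed to cover total_hours at a steady weekly pace."""
    if hours_per_week <= 0:
        raise ValueError("hours_per_week must be positive")
    return math.ceil(total_hours / hours_per_week)


def estimated_end_date(start: date, total_hours: float, hours_per_week: float) -> date:
    """Projected finish date, assuming no breaks in the schedule."""
    return start + timedelta(weeks=weeks_to_finish(total_hours, hours_per_week))


if __name__ == "__main__":
    start = date(2026, 1, 5)  # hypothetical start date
    for pace in (10, 20, 30):
        weeks = weeks_to_finish(ASSUMED_TOTAL_HOURS, pace)
        finish = estimated_end_date(start, ASSUMED_TOTAL_HOURS, pace)
        print(f"{pace} h/week: {weeks} weeks, finishing around {finish}")
```

Under these assumptions, halving the weekly pace to 10 hours doubles the duration to 208 weeks, about four years.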

Copy-paste prompts

Prompt 1
I want to learn Data Science from scratch. What's the best order to take these free courses: intro to CS, Python, linear algebra, statistics, databases, and machine learning?
Prompt 2
Help me create a study schedule for the OSSU Data Science curriculum assuming I have 20 hours per week available.
Prompt 3
What are the key prerequisites I need to master before starting the machine learning section of this curriculum?
Prompt 4
I'm using the OSSU Data Science repo to self-teach. How should I structure my own fork to track which courses I've completed?

Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.