Study for a data science job interview using topic-by-topic cheatsheets and a 150-question Q&A bank
Practice designing real-world ML systems with included case study prompts
Browse the author's portfolio projects as inspiration for your own data science work
Use the ebook collection as a reading list to build foundational knowledge in ML and statistics
No installation needed for most materials. Download or browse PDFs and notebooks directly. Running Jupyter notebooks requires Python and standard ML libraries like PyTorch or Keras.
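Before opening the notebooks, it can help to check which libraries are already present. The following is a minimal sketch, not something taken from the repository: the module names `notebook`, `torch`, and `keras` are assumptions based on the libraries named above, and the helper `missing_modules` is hypothetical. Individual notebooks may need other packages.

```python
from importlib.util import find_spec

# Assumed top-level module names; the repository's notebooks may need others.
ASSUMED_MODULES = ["notebook", "torch", "keras"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All assumed libraries are importable.")
```

The check only looks up each module without importing it, so it runs quickly even when heavy libraries are installed.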
Cracking the Data Science Interview is a collection of study materials, practice questions, and sample projects assembled by one developer to help people prepare for data science job interviews. It is not a course or an application; it is a curated repository of reference files grouped into several topic areas.

The cheatsheets section covers the concepts most commonly tested in interviews: SQL for querying databases, statistics and probability, linear algebra and other mathematics, machine learning fundamentals, deep learning, supervised and unsupervised learning, computer vision, and natural language processing. Many of these are downloadable PDF summaries meant for quick review before an interview. The ebooks section collects several books on practical data science and machine learning, including titles on Python-based machine learning, statistics for data science, and applying machine learning to finance. The question bank gathers interview questions sourced from platforms such as Analytics Vidhya and Interview Query, and includes a PDF of 150 commonly asked data science questions with answers. There is also a section of case study prompts that ask candidates to reason through how they would design machine learning systems for real-world scenarios.

Beyond study materials, the repository includes the author's own portfolio of past projects. These span recommendation systems built with PyTorch and Keras, machine learning work on taxi trip optimization and grocery basket prediction, computer vision projects on clothing classification and road segmentation, tweet classification, and data analysis on topics like World Cup soccer teams and Spotify artist styles. There is also a data journalism section with published stories. The repository is intended as both a study guide for job seekers and a portfolio reference for the author's own work.
← khanhnamle1994 on gitmyhub — every repo by this author, as a profile.
Verify against the repo before relying on details.