Practice BeautifulSoup by extracting an ordered list of movie titles from an HTML page.
Generate a personal movies.txt watchlist of the top 100 movies of all time.
Use an Internet Archive snapshot as a stable scraping target for a reproducible exercise.
Drop in as the Day 45 milestone in a 100-day Python coding course.
No code is provided; the reader installs BeautifulSoup and requests and writes the scraper from the README brief.
This repository is a small Python exercise that asks the reader to scrape the top 100 movies of all time from a webpage and save the result to a plain text file. The output file is called movies.txt and lists the titles in ascending order, starting from 1. The README gives a short example of what the first few lines should look like, with titles such as The Godfather, The Empire Strikes Back, The Dark Knight, and The Shawshank Redemption.
The stated purpose of the project is to practice using BeautifulSoup, a Python library that parses the HTML of a webpage and lets you pull pieces of data out of it. The README points to Empire's best movies list as the source, but also mentions that similar curated lists from Timeout or Stacker would work for the same exercise. The README contains no code beyond that example, only the brief on what the script should do.
The README includes one important note about the source link. Because live websites change layout often, the project recommends pointing the scraper at a snapshot stored on the Internet Archive. A specific archived URL from May 2020 is provided so that the page structure stays the same every time the script runs. This keeps the exercise reproducible even if the original page is later updated or moved.
The project looks like a single day of a longer learning series, judging by the repository name, which includes Day 45. The README describes no list of dependencies, no setup script, and no test suite. A reader is expected to install BeautifulSoup and an HTTP library such as requests on their own, fetch the archived page, find the HTML elements that hold the movie titles, and write the ordered list to disk.
The README is sparse, and that matters for anyone arriving at this repo. It mentions no license file and no contribution guide, and it does not describe the final solution.
The repository works best as a starting prompt for someone practicing web scraping in Python, rather than as a finished tool to install and run.
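The steps the brief implies (fetch the archived page, select the title elements, write them out in order) can be sketched as follows. This is a minimal sketch under stated assumptions, not the repository's solution, since the README provides none. The selector `h3.title`, the assumption that the page lists titles from 100 down to 1, and the helper names `extract_titles`, `write_watchlist`, and `scrape_to_file` are all hypothetical; inspect the archived page and adjust them to its real markup. The archived URL is deliberately not hard-coded here and should be copied from the README.

```python
from pathlib import Path

from bs4 import BeautifulSoup

# Assumption: each movie title sits in an <h3 class="title"> element.
# Check the archived page's HTML and change this selector if it differs.
TITLE_SELECTOR = "h3.title"


def extract_titles(html, selector=TITLE_SELECTOR, reverse=True):
    """Return the text of every element matching `selector`.

    Assumption: the page counts down from 100 to 1, so the list is
    reversed by default to get ascending order. Pass reverse=False
    if the page already starts at number 1.
    """
    soup = BeautifulSoup(html, "html.parser")
    titles = [tag.get_text(strip=True) for tag in soup.select(selector)]
    return titles[::-1] if reverse else titles


def write_watchlist(titles, path="movies.txt"):
    """Write one title per line to `path`, encoded as UTF-8."""
    Path(path).write_text("\n".join(titles) + "\n", encoding="utf-8")


def scrape_to_file(url, path="movies.txt"):
    """Fetch `url`, extract the titles, and save them to `path`."""
    # Imported here so the parsing helpers above work without requests.
    import requests

    response = requests.get(url, timeout=30)
    response.raise_for_status()  # stop early on HTTP errors
    write_watchlist(extract_titles(response.text), path)
```

To run it against the snapshot, call `scrape_to_file()` with the Internet Archive URL given in the README. Keeping the parsing in `extract_titles` separate from the network call means the selector can be tried on a saved copy of the page before any live request is made.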
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.