Download the full text of a paywalled article from the NYT or Financial Times as Markdown for research or AI summarization
Batch-fetch recent articles from a news site by scraping its RSS feed or sitemap with a single command
Run a cross-site keyword search across multiple publications and download matching articles in one batch
Pipe article JSON output into an AI processing pipeline using bpc-fetch's structured stdout format
Install with pip; Playwright and Chromium are also required. A Windows executable is available that downloads Chromium automatically.
bpc-fetch is a command-line tool that retrieves full articles from more than 930 subscription-only news and magazine websites and saves each article as clean Markdown with its images preserved. The project replicates what the Bypass Paywalls Clean browser extension does, but runs entirely from a terminal with no extension installed.

The tool supports sources in more than 40 countries. These include financial publications such as the Economist, Financial Times, Bloomberg, and the Wall Street Journal; US news sites including the New York Times and the Washington Post; European publications in German, French, Italian, and Spanish; and science and technology outlets including Wired, Nature, and MIT Technology Review. The full list covers 936 sites and can be printed by running a command in the terminal.

For each article URL you provide, the tool tries a sequence of retrieval strategies: impersonating Googlebot or Bingbot, manipulating the HTTP Referer header, running JavaScript inside a headless Chromium browser, or falling back to the Internet Archive if the direct approaches do not work. It starts with the strategy best suited to the site and falls back through the chain until it gets content.

Beyond single articles, bpc-fetch can discover recent articles from a site by checking its RSS feed, XML sitemap, or rendered homepage. A cross-site search mode lets you search by keyword, filter by time range, and download the matching articles in one batch.

Output is designed for automated pipelines: results go to standard output as JSON, progress messages go to standard error, and each response includes a suggestion for the next command to run.

Installation uses pip and requires Playwright and Chromium to be installed alongside it. A self-contained Windows executable is also available under Releases and downloads Chromium automatically on first run. The README does not clearly state a license.
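The ordered fallback through retrieval strategies described above can be illustrated with a minimal Python sketch. This is not bpc-fetch's code: the function name `fetch_with_fallback`, the strategy names, and the stub strategies are all hypothetical, and the stubs make no network requests. They only simulate a strategy returning nothing, raising an error, or succeeding.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical strategy type: takes a URL, returns article text or None.
Strategy = Callable[[str], Optional[str]]


def fetch_with_fallback(
    url: str, strategies: List[Tuple[str, Strategy]]
) -> Optional[Tuple[str, str]]:
    """Try each named strategy in order; return (name, content) of the first success."""
    for name, strategy in strategies:
        try:
            content = strategy(url)
        except Exception:
            # A failing strategy is skipped so the chain can continue.
            continue
        if content:
            return name, content
    return None  # every strategy failed or returned nothing


# Stand-in strategies (assumptions for illustration only).
def as_crawler(url: str) -> Optional[str]:
    return None  # simulate: site returned no usable content


def with_referer(url: str) -> Optional[str]:
    raise TimeoutError("simulated timeout")


def from_archive(url: str) -> Optional[str]:
    return f"# Archived copy of {url}"


chain = [
    ("crawler-user-agent", as_crawler),
    ("referer", with_referer),
    ("archive", from_archive),
]
result = fetch_with_fallback("https://example.com/story", chain)
print(result)
```

Keeping the chain as a plain list of named callables means that a per-site preference can be expressed simply by reordering the list before the call.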
← sophomoresty on gitmyhub — every repo by this author, as a profile.
Verify against the repo before relying on details.