Scrape product listings from Taobao or second-hand goods from Xianyu for price research.
Collect restaurant reviews from Dianping for market analysis or business intelligence.
Learn real-world web scraping patterns by studying standalone scrapers built for actual client projects.
Gather job listing data from Chinese job boards or business registration data from QiChaCha.
Each sub-project has its own requirements, and scrapers for Chinese platforms may need a valid account or session to reach authenticated pages.
ECommerceCrawlers is a Python collection of web scrapers targeting Chinese websites and platforms. Each scraper in the repository is a standalone project written by contributors to the group, and the README notes that about 80 percent were originally built for paying clients who agreed to open-source the work before it was added here. The collection is also presented as a learning resource for people studying web scraping techniques.
The repository covers more than twenty different targets, ranging from e-commerce platforms to social media, news sites, and business databases. Among the included scrapers are tools for Taobao (China's largest shopping platform), Dianping (a restaurant and business review site), and Xianyu (a second-hand goods marketplace). There are also scrapers for job listing sites, WeChat public accounts, Weibo (a large Chinese microblogging platform), Douban (a movie and music review site), and Baidu Tieba (a popular forum network). Travel booking data from Ctrip, business registration data from QiChaCha, and property listings from Anjuke and Tujia are also covered.
Each sub-project in the collection comes with its own readme explaining how the scraping process works for that particular site. The README describes the collection as practical examples that help someone new to crawling understand common problems and solutions, built around real targets rather than toy exercises.
The project is maintained on both GitHub and a Chinese code hosting platform called Gitee. The README is written in Chinese, and the project is clearly aimed at a Chinese-speaking audience familiar with these platforms.
Verify against the repo before relying on details.