Research which scraping behaviors have led to criminal prosecution in China before building a data collection tool.
Understand the specific laws (Criminal Law, Cybersecurity Law, Anti-Unfair Competition Law) that apply to scraping personal or commercial data.
Identify the thresholds at which scraping activity becomes a criminal offense rather than a civil matter in China.
The README is written entirely in Chinese, as is most of the other content.
This repository is a reference collection of legal cases, news articles, and relevant laws concerning web scraping violations in mainland China. It is aimed at developers who build data collection tools and want to understand where the legal lines are drawn, so they can avoid conduct that has led to criminal prosecution or civil penalties for others.

The cases are grouped into five categories of activity that have resulted in legal trouble. The first involves providing scraping services to organizations engaged in illegal activity, such as selling CAPTCHA-cracking tools. The second covers scraping and selling personal data belonging to individuals, including resumes, social security details, and account credentials. The third involves profiting from data that belongs to a commercial platform, such as reselling scraped listings or charging others for access to a scraping interface. The fourth covers cases where aggressive scraping caused a target server to go down, including one in which a developer and their manager were both convicted after their crawler sent 183 requests per second and brought down a government computing system. A fifth category is listed without a description.

Alongside the case summaries, the repository includes a section explaining the specific laws that apply to each type of violation. These draw from China's Criminal Law, the Cybersecurity Law, the Anti-Unfair Competition Law, and civil law statutes covering personal information protection. The descriptions quote specific articles and outline the thresholds at which behavior becomes a criminal offense; for example, illegally obtaining location or financial data on more than fifty people constitutes a serious violation.

The repository also links to analysis articles written by lawyers covering the legal risks facing data industry practitioners in China. The README is written entirely in Chinese.
Verify against the repo before relying on details.