Build a price-monitoring scraper that extracts product data from e-commerce pages and saves results directly to a database.
Scrape JavaScript-rendered pages by connecting a browser-automation block in the visual flowchart editor.
Trigger scraping jobs from another system by calling spider-flow's built-in HTTP API.
Set up a recurring scraper with automatic proxy rotation to collect data from multiple sources on a schedule.
Requires Java 1.8 or later and a relational database; the Selenium, Redis, and MongoDB features each need their plugin set up separately.
spider-flow is a visual web scraping platform that lets you build scrapers by drawing a flowchart rather than writing code. You connect blocks in a diagram to define what the scraper should fetch, what data to extract, and where to store the results. The README is written in Chinese, but the features are documented in a structured list.
The platform can extract data from web pages using several methods: XPath (a query language that selects nodes by their path in the document tree), CSS selectors (matching elements by tag, class, ID, or attribute), JsonPath (for JSON data), and regular expressions. It handles pages that load their content dynamically through JavaScript or AJAX requests, not just static HTML. Proxy support is included, and cookies are managed automatically.
Scraped data can be saved directly to a database using standard SQL operations (select, insert, update, delete), or written to files. Multiple database connections can be configured. A task monitoring panel and log viewer let you track which scrapers are running and what happened during each run. The platform also exposes an HTTP API so other systems can trigger scraper jobs programmatically.
A plugin system extends the core platform. Available plugins include Selenium (for browser automation), Redis (for caching or queuing), MongoDB, cloud object storage, an IP proxy pool, an OCR plugin for reading text from images, and an email plugin. Custom functions and custom executor plugins can also be written.
The project includes a disclaimer stating that it should not be used for illegal purposes or in ways that violate websites' terms of service. It requires Java 1.8 or higher and is released under the MIT license.
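To make the extraction methods above more concrete, here is a minimal Python sketch of what three of them do conceptually. It is not spider-flow code: in spider-flow these selectors are configured on flowchart blocks rather than written by hand. The sample markup, the JSON payload, and all function names are hypothetical. The JSON helper walks a dotted key path as a simplified stand-in for JsonPath, and CSS selectors are left out because the Python standard library has no CSS selector engine.

```python
import json
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product snippet. Real pages are rarely valid XML
# and usually need a tolerant HTML parser instead.
PAGE = """
<div class="product" id="p-1001">
  <h2 class="title">Desk Lamp</h2>
  <span class="price">Price: $24.99</span>
</div>
"""

# Hypothetical JSON payload, like one returned by an AJAX endpoint.
PAYLOAD = '{"product": {"id": 1001, "stock": {"available": 7}}}'


def extract_by_xpath(markup: str, path: str) -> str:
    """Select an element by its place in the tree (ElementTree's XPath subset)."""
    node = ET.fromstring(markup.strip()).find(path)
    return node.text if node is not None and node.text else ""


def extract_by_regex(text: str, pattern: str) -> str:
    """Return the first capture group of the first match, or an empty string."""
    match = re.search(pattern, text)
    return match.group(1) if match else ""


def extract_by_key_path(raw: str, dotted_path: str):
    """Walk a dotted key path such as 'product.stock.available'."""
    value = json.loads(raw)
    for key in dotted_path.split("."):
        value = value[key]
    return value


title = extract_by_xpath(PAGE, "./h2[@class='title']")
price = float(extract_by_regex(PAGE, r"\$(\d+\.\d{2})"))
available = extract_by_key_path(PAYLOAD, "product.stock.available")
print(title, price, available)  # prints: Desk Lamp 24.99 7
```

Each helper returns a plain value, which mirrors the flow described above: one step fetches a page, a later step pulls fields out of it, and a final step stores them.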
Verify against the repo before relying on details.