Analysis updated 2026-05-18
Query a Parquet file stored in Cloudflare R2 from inside a Cloudflare Worker and return results as JSON.
Run SQL analytics on a large Iceberg table hosted on S3 from a browser-based dashboard tool.
Filter and paginate rows from a Parquet dataset over HTTP without loading the full file into memory.
Append rows to an Iceberg table from a serverless edge function.
| | earonesty/lakeql | 0labs-in/vision-link | arviahq/arvia |
|---|---|---|---|
| Stars | 4 | 4 | 4 |
| Language | TypeScript | TypeScript | TypeScript |
| Setup difficulty | easy | moderate | moderate |
| Complexity | 3/5 | 3/5 | 3/5 |
| Audience | developer | developer | developer |
Figures from each repo's GitHub metadata at analysis time.
LakeQL is a JavaScript library that runs SQL queries against large data files stored in object storage, such as Amazon S3 or Cloudflare R2. It works with two popular formats: Parquet (a compressed columnar file format commonly used in data analytics) and Apache Iceberg (a table format built for large datasets with versioning and partitioning).
What sets it apart is that it is written in pure JavaScript, with no WebAssembly or native code, so it runs anywhere JavaScript runs. Most similar tools require a full database engine or a WebAssembly binary, which limits where they can be used. LakeQL is designed for environments where those options do not work: Cloudflare Workers (serverless edge functions with strict memory limits), browser-based tools, and other JavaScript-only runtimes. Instead of loading an entire data file into memory, it reads only the portions it needs using HTTP range reads, which keeps memory usage low even on large datasets.
You can query data using SQL, a JavaScript method-chaining API, or a JSON query format. A typical use case is a Cloudflare Worker that queries a Parquet file in an R2 bucket and returns the results as JSON in response to a web request, without spinning up a database server. The library also supports writing Parquet files and appending to Iceberg tables. Geospatial queries are supported through an optional module that handles H3, a geographic grid system.
The library documents which SQL features it supports and which it deliberately rejects, rather than silently giving wrong results. LakeQL is tested against reference implementations from Spark and PyIceberg, with row-by-row comparisons against DuckDB, to check that its results are consistent with established tools. It is published as an npm package and is MIT-licensed.
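To make the HTTP range read idea concrete, here is a minimal sketch of fetching only the footer of a Parquet file rather than the whole object. It is not LakeQL's API: the helper names `suffixRange`, `parseParquetTail`, and `fetchFooterBytes` are hypothetical, and the sketch assumes an unencrypted Parquet file on a server that honors `Range` requests. It relies only on the Parquet file layout, in which a file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`.

```typescript
// Illustrative sketch only; these helpers are hypothetical and not part of LakeQL.

/** Build a Range header value that asks for the last `n` bytes of a resource. */
function suffixRange(n: number): string {
  return `bytes=-${n}`;
}

/**
 * Parse the 8-byte tail of a Parquet file:
 * a 4-byte little-endian footer length followed by the magic "PAR1".
 * Returns the footer length in bytes.
 */
function parseParquetTail(tail: Uint8Array): number {
  if (tail.length !== 8) throw new Error("expected exactly 8 bytes");
  const magic = String.fromCharCode(tail[4], tail[5], tail[6], tail[7]);
  if (magic !== "PAR1") throw new Error("not a Parquet file (bad magic)");
  return new DataView(tail.buffer, tail.byteOffset, 8).getUint32(0, true);
}

/** Fetch only the footer metadata bytes of the Parquet file at `url`. */
async function fetchFooterBytes(url: string): Promise<Uint8Array> {
  // First request: just the last 8 bytes, to learn the footer length.
  const tailRes = await fetch(url, { headers: { Range: suffixRange(8) } });
  if (tailRes.status !== 206) throw new Error("server did not honor the Range request");
  const footerLength = parseParquetTail(new Uint8Array(await tailRes.arrayBuffer()));

  // Second request: the footer plus the 8-byte tail that follows it.
  const res = await fetch(url, { headers: { Range: suffixRange(footerLength + 8) } });
  if (res.status !== 206) throw new Error("server did not honor the Range request");
  const bytes = new Uint8Array(await res.arrayBuffer());

  // Drop the trailing length + magic; what remains is the Thrift-encoded footer.
  return bytes.subarray(0, footerLength);
}
```

The footer lists where each row group and column chunk sits in the file, so a reader can follow up with further range requests for only the columns and row groups a query touches. Decoding the Thrift-encoded footer is left out of this sketch.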
A pure-JavaScript library for querying Parquet and Iceberg data files with SQL, built for Cloudflare Workers and edge runtimes where native database tools cannot run.
Mainly TypeScript. The stack also includes JavaScript and Parquet.
Use freely for any purpose, including commercial use, as long as you keep the copyright notice.
Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.
Audience: mainly developers.
Verify against the repo before relying on details.