explaingit

earonesty/lakeql

Analysis updated 2026-05-18

Stars · 4
Language · TypeScript
Audience · developer
Complexity · 3/5
License · MIT
Setup · easy

TLDR

A pure-JavaScript library for querying Parquet and Iceberg data files with SQL, built for Cloudflare Workers and edge runtimes where native database tools cannot run.

Mindmap

mindmap
  root((lakeql))
    What it does
      SQL on Parquet files
      Queries Iceberg tables
    Where it runs
      Cloudflare Workers
      Node js
      Browser
    Storage adapters
      S3
      Cloudflare R2
      HTTP
    Features
      Streaming reads
      Geospatial H3
      Write Parquet

What do people build with it?

USE CASE 1

Query a Parquet file stored in Cloudflare R2 from inside a Cloudflare Worker and return results as JSON.

USE CASE 2

Run SQL analytics on a large Iceberg table hosted on S3 from a browser-based dashboard tool.

USE CASE 3

Filter and paginate rows from a Parquet dataset over HTTP without loading the full file into memory.

USE CASE 4

Append rows to an Iceberg table from a serverless edge function.
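
The filter-and-paginate pattern in use case 3 can be sketched in plain TypeScript. This is a generic illustration and not lakeql's API: the names `filterRows` and `page` are hypothetical, and the batches come from a synchronous iterable for brevity, where a real reader would likely yield them asynchronously as range reads complete.

```typescript
/** Lazily yield rows that match `predicate`, consuming one batch at a time. */
function* filterRows<T>(
  batches: Iterable<T[]>,
  predicate: (row: T) => boolean,
): Generator<T> {
  for (const batch of batches) {
    for (const row of batch) {
      if (predicate(row)) yield row;
    }
  }
}

/** Collect one page of rows, stopping iteration as soon as the page is full. */
function page<T>(rows: Iterable<T>, offset: number, limit: number): T[] {
  const out: T[] = [];
  if (limit <= 0) return out;
  let seen = 0;
  for (const row of rows) {
    if (seen++ < offset) continue; // skip rows that belong to earlier pages
    out.push(row);
    if (out.length === limit) break; // closes the generator; later batches are never read
  }
  return out;
}
```

Because both steps are lazy, only the batches needed to fill the requested page are ever pulled from the source, so memory use is bounded by the batch size plus the page size.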

What is it built with?

TypeScript · JavaScript · Parquet · Apache Iceberg · SQL · Cloudflare Workers

How does it compare?

Repo              earonesty/lakeql   0labs-in/vision-link   arviahq/arvia
Stars             4                  4                      4
Language          TypeScript         TypeScript             TypeScript
Setup difficulty  easy               moderate               moderate
Complexity        3/5                3/5                    3/5
Audience          developer          developer              developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy
Time to first run · 30 min
Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

LakeQL is a JavaScript library that lets you run SQL queries against large data files stored in object storage, such as Amazon S3 or Cloudflare R2. It works with two popular formats: Parquet (a compressed columnar file format commonly used in data analytics) and Apache Iceberg (a table format built for large datasets with versioning and partitioning).

What sets it apart is that it is written in pure JavaScript with no WebAssembly or native code, so it runs anywhere JavaScript runs. Most similar tools require a full database engine or a WebAssembly binary, which limits where they can be used. LakeQL is designed for environments where those options do not work: Cloudflare Workers (serverless edge functions with strict memory limits), browser-based tools, and other JavaScript-only runtimes.

Instead of loading an entire data file into memory, it reads only the portions it needs using HTTP range reads, which keeps memory usage low even on large datasets. You can query data using SQL, a JavaScript method-chaining API, or a JSON query format. A typical use case is a Cloudflare Worker that queries a Parquet file in an R2 bucket and returns the results as JSON in response to a web request, without spinning up a database server.

The library also supports writing Parquet files and appending to Iceberg tables. Geospatial queries are supported through an optional module that handles H3, a geographic grid system. The library documents which SQL features it supports and which it deliberately rejects, rather than silently giving wrong results.

LakeQL is tested against reference implementations from Spark and PyIceberg, with row-by-row comparisons against DuckDB, to check that its results are consistent with established tools. It is published as an npm package and is MIT-licensed.
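
To make the range-read idea concrete, here is a minimal sketch of how any client can locate a Parquet file's footer without downloading the whole file. It relies only on the Parquet file layout (the file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`) and on standard HTTP `Range` headers. It does not show lakeql's internals or API; `suffixRange`, `footerLength`, and `fetchFooterBytes` are hypothetical names chosen for illustration.

```typescript
// A Parquet file ends with: [footer metadata][4-byte little-endian footer length]["PAR1"]
const PARQUET_MAGIC = "PAR1";

/** Range header value requesting the last `n` bytes of a resource. */
function suffixRange(n: number): string {
  return `bytes=-${n}`;
}

/** Given the final 8 bytes of a Parquet file, return the footer metadata length. */
function footerLength(tail: Uint8Array): number {
  if (tail.length !== 8) throw new Error("expected exactly 8 trailing bytes");
  const magic = String.fromCharCode(tail[4], tail[5], tail[6], tail[7]);
  if (magic !== PARQUET_MAGIC) throw new Error("not a Parquet file");
  return new DataView(tail.buffer, tail.byteOffset, 8).getUint32(0, true);
}

/** Fetch only the footer metadata of a remote Parquet file using two range reads. */
async function fetchFooterBytes(url: string): Promise<Uint8Array> {
  // First read: the 8-byte tail, which says how long the footer is.
  const tailRes = await fetch(url, { headers: { Range: suffixRange(8) } });
  if (tailRes.status !== 206) throw new Error("server did not honor the Range header");
  const len = footerLength(new Uint8Array(await tailRes.arrayBuffer()));

  // Second read: the footer plus the 8-byte tail; keep only the footer bytes.
  const res = await fetch(url, { headers: { Range: suffixRange(len + 8) } });
  if (res.status !== 206) throw new Error("server did not honor the Range header");
  return new Uint8Array(await res.arrayBuffer()).subarray(0, len);
}
```

The footer describes where each row group and column chunk lives in the file, so a reader that has it can issue further range reads for just the columns and row groups a query touches.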

Copy-paste prompts

Prompt 1
I'm using lakeql in a Cloudflare Worker. Show me how to query a Parquet file in my R2 bucket and return filtered rows as JSON.
Prompt 2
How do I write a new Parquet file with lakeql from inside a Node.js script?
Prompt 3
Set up an Iceberg table query in lakeql that filters rows by partition and applies a delete file.
Prompt 4
Show me how to use lakeql's SQL dialect to run a GROUP BY aggregation on a large Parquet dataset.
Prompt 5
Explain the difference between lakeql's JavaScript builder API and the SQL API with a code example of each.

Frequently asked questions

What is lakeql?

A pure-JavaScript library for querying Parquet and Iceberg data files with SQL, built for Cloudflare Workers and edge runtimes where native database tools cannot run.

What language is lakeql written in?

Mainly TypeScript. The stack also includes JavaScript, Parquet, Apache Iceberg, SQL, and Cloudflare Workers.

What license does lakeql use?

The MIT license. Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

How hard is lakeql to set up?

Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.

Who is lakeql for?

Mainly developers.

Verify against the repo before relying on details.