jtablesaw/tablesaw

Analysis updated 2026-07-03

★ 3,751 | Java | Audience · data | Complexity · 2/5 | Setup · easy

Mindmap

mindmap
  root((Tablesaw))
    What it does
      Load and clean data
      Run statistics
      Plot charts
    Input Formats
      CSV and TSV
      JSON and Excel
      Databases and S3
    Visualization
      Scatter and histogram
      Time series
      Heatmaps and box plots
    Integrations
      Jupyter notebooks
      ML libraries
      Google Colab

What do people build with it?

USE CASE 1

Load a CSV dataset into Tablesaw, filter rows by a condition, group by a category column, and compute summary statistics without leaving Java.
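A minimal sketch of this workflow, assuming the `tablesaw-core` dependency is on the classpath. The table is built in memory so the example stays self-contained; the column names, values, and class name are hypothetical. With a real file, `Table.read().csv("sales.csv")` (file name assumed) would replace `sampleSales()`.

```java
import static tech.tablesaw.aggregate.AggregateFunctions.mean;
import static tech.tablesaw.aggregate.AggregateFunctions.stdDev;

import tech.tablesaw.api.DoubleColumn;
import tech.tablesaw.api.StringColumn;
import tech.tablesaw.api.Table;

final class SalesSummarySketch {

    // Hypothetical sample data standing in for a CSV file.
    static Table sampleSales() {
        return Table.create(
            "sales",
            StringColumn.create("region", "north", "south", "north", "south", "north"),
            DoubleColumn.create("amount", 120.0, 80.0, 200.0, 150.0, 40.0));
    }

    // Keep only the rows whose amount is strictly greater than the threshold.
    static Table above(Table sales, double threshold) {
        return sales.where(sales.doubleColumn("amount").isGreaterThan(threshold));
    }

    // One row per region, with the mean and standard deviation of amount.
    static Table byRegion(Table sales) {
        return sales.summarize("amount", mean, stdDev).by("region");
    }

    public static void main(String[] args) {
        Table large = above(sampleSales(), 100.0);
        System.out.println(large);
        System.out.println(byRegion(large));
    }
}
```

Filtering returns a new table rather than modifying the original, so `sampleSales()` can be reused for other summaries.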

USE CASE 2

Explore a dataset interactively in a Jupyter notebook using IJava and Tablesaw to produce histograms and scatter plots rendered in the browser.

USE CASE 3

Prepare and clean a dataset in Tablesaw, then pass it to a machine learning library such as Smile or DL4J for model training.

USE CASE 4

Import data from a relational database, join it with a local CSV file, and produce a time series chart without any Python or R tooling.
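The join step in this use case might look like the sketch below, assuming `tablesaw-core` is available. Two small in-memory tables stand in for the database result and the CSV file; the table names, column names, and values are hypothetical, and the database import and charting steps are left out.

```java
import tech.tablesaw.api.DoubleColumn;
import tech.tablesaw.api.IntColumn;
import tech.tablesaw.api.StringColumn;
import tech.tablesaw.api.Table;

final class JoinSketch {

    // Stand-in for rows imported from a relational database (hypothetical data).
    static Table orders() {
        return Table.create(
            "orders",
            IntColumn.create("customer_id", 1, 2, 2, 4),
            DoubleColumn.create("total", 10.0, 25.5, 7.25, 99.0));
    }

    // Stand-in for rows loaded from a local CSV file (hypothetical data).
    static Table customers() {
        return Table.create(
            "customers",
            IntColumn.create("customer_id", 1, 2, 3),
            StringColumn.create("name", "Ada", "Grace", "Linus"));
    }

    // Inner join on the shared key: only customer_id values present in both tables survive.
    static Table innerJoin(Table left, Table right) {
        return left.joinOn("customer_id").inner(right);
    }

    public static void main(String[] args) {
        System.out.println(innerJoin(orders(), customers()));
    }
}
```

Order 4 has no matching customer and customer 3 has no orders, so neither appears in the joined table.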

What is it built with?

Java, Maven, Plot.ly

How does it compare?

	jtablesaw/tablesaw	undertow-io/undertow	termux/termux-api
Stars	3,751	3,749	3,755
Language	Java	Java	Java
Setup difficulty	easy	moderate	moderate
Complexity	2/5	4/5	2/5
Audience	data	developer	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy | Time to first run · 30 min

No specific license terms were mentioned in the explanation.

In plain English

Tablesaw is a Java library that lets developers work with data tables directly in their code, much as Python programmers use pandas. It handles the full lifecycle of a dataset: reading data in from files or databases, cleaning and reshaping it, running calculations, and then producing charts for visual exploration.

On the data side, Tablesaw can import from CSV, TSV, JSON, HTML, Excel, fixed-width text files, and relational databases, whether the data is stored locally or fetched from the web or from cloud storage such as S3. Once the data is loaded, you can filter rows, sort and group data, add or remove columns, join multiple tables together, and handle missing values. Tables can be exported to CSV, JSON, HTML, or fixed-width formats.

For statistics, the library covers the standard descriptive measures: mean, median, min, max, sum, standard deviation, variance, percentiles, skewness, kurtosis, and geometric mean. These are built in, so no separate stats package is needed.

Visualization is handled through a wrapper around the Plot.ly JavaScript charting library. You can produce scatter plots, histograms, box plots, time series charts, heatmaps, pie charts, bubble charts, and more from within Java code, with the charts rendered in a browser or notebook environment.

Tablesaw also works inside Jupyter notebooks through integrations with BeakerX and IJava, and in Google Colab, which makes it usable for interactive data exploration. It connects with machine learning libraries such as Smile, Tribuo, and DL4J for teams that want to use it as a data preparation step before model training.

Adding it to a Maven project requires a single dependency block. Optional companion packages handle Excel, JSON, HTML, and charting separately.
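The descriptive measures listed above are available as methods on numeric columns. The sketch below assumes `tablesaw-core` is on the classpath; the column name, values, and class name are made up for illustration.

```java
import tech.tablesaw.api.DoubleColumn;

final class ColumnStatsSketch {

    // Hypothetical measurements; any numeric column from a loaded table works the same way.
    static DoubleColumn sample() {
        return DoubleColumn.create("latency_ms", 2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0);
    }

    public static void main(String[] args) {
        DoubleColumn c = sample();
        System.out.println("sum            " + c.sum());
        System.out.println("min            " + c.min());
        System.out.println("max            " + c.max());
        System.out.println("mean           " + c.mean());
        System.out.println("median         " + c.median());
        System.out.println("std deviation  " + c.standardDeviation());
        System.out.println("variance       " + c.variance());
        System.out.println("90th pct       " + c.percentile(90));
        System.out.println("skewness       " + c.skewness());
        System.out.println("kurtosis       " + c.kurtosis());
        System.out.println("geometric mean " + c.geometricMean());
    }
}
```

Each call returns a plain `double`, so the results can feed directly into ordinary Java code.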

Copy-paste prompts

Prompt 1

Show me how to add Tablesaw to a Maven project, load a CSV file, filter rows where a numeric column is above a threshold, and print the mean and standard deviation.

Prompt 2

I want to explore a dataset in a Jupyter notebook using IJava and Tablesaw. Walk me through the setup and show me how to plot a histogram of a numeric column.

Prompt 3

Write a Tablesaw snippet that reads an Excel file, removes rows with missing values in a specific column, groups by a category column, and computes the mean of another column.

Prompt 4

How do I join two Tablesaw tables on a shared column, similar to a SQL inner join, and then export the result to a JSON file?

Frequently asked questions

What is tablesaw?

Tablesaw is a Java library for loading, cleaning, analyzing, and charting tabular data, similar to what pandas does for Python, with built-in statistics and Plot.ly-powered visualizations that run in the browser or a notebook.

What language is tablesaw written in?

Mainly Java. The stack also includes Maven and Plot.ly.

What license does tablesaw use?

No specific license terms were mentioned in the explanation.

How hard is tablesaw to set up?

Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.

Who is tablesaw for?

Mainly people who work with data: Java developers who need to load, clean, analyze, and chart tabular data.

Verify against the repo before relying on details.