explaingit

juicedata/juicefs

13,579 · Go
Audience · ops devops
Complexity · 4/5
License · Apache 2.0
Setup · moderate

TLDR

A distributed file system that lets applications use cloud object storage like Amazon S3 as a regular local disk, without changing any existing application code.

Mindmap

mindmap
  root((JuiceFS))
    What it does
      Cloud storage as local disk
      Distributed file access
    Storage backends
      Amazon S3
      Object storage
      Chunked blocks
    Metadata engines
      Redis
      MySQL
    Access interfaces
      POSIX
      Hadoop Java SDK
      S3 gateway
      Kubernetes driver

Things people build with this

USE CASE 1

Mount an Amazon S3 bucket as a local disk on a Linux machine and use it with standard file commands and existing applications

USE CASE 2

Connect big data tools like Spark or Hadoop to cloud storage via JuiceFS's Hadoop-compatible Java SDK without changing application code

USE CASE 3

Share a single JuiceFS file system across multiple Kubernetes pods simultaneously so all containers see the same data in real time

USE CASE 4

Access cloud object storage through JuiceFS's S3-compatible gateway from tools that only know the S3 API
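Use cases 1 and 3 depend on the mount point behaving like an ordinary directory. As a minimal sketch of what that means for application code, the Python function below appends to a file using plain file APIs and a BSD-style advisory lock. Nothing in it is JuiceFS-specific; the function name `append_record` and the `/mnt/jfs` path in the usage note are hypothetical.

```python
import fcntl
from pathlib import Path


def append_record(directory, name, line):
    """Append one line to directory/name while holding an exclusive flock."""
    path = Path(directory) / name
    with open(path, "a", encoding="utf-8") as f:
        # BSD-style advisory lock; blocks until no other holder remains.
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(line + "\n")
            f.flush()  # push the data out before releasing the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return path
```

Calling `append_record("/mnt/jfs/logs", "events.txt", "started")` would work the same way on a local disk or, assuming the directory is a JuiceFS mount, on the shared file system.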

Tech stack

Go · Redis · MySQL · Amazon S3 · LZ4 · Zstandard · Kubernetes · Java

Getting it running

Difficulty · moderate
Time to first run · 1h+

Requires configuring both a metadata engine (Redis or MySQL) and a cloud object storage backend before the file system can be mounted.
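The two-step setup can be sketched as a pair of command lines: one to create the file system against a metadata engine and a bucket, and one to mount it. The Python helper below only builds the argument lists and runs nothing. The helper name, URLs, volume name, and mount path are hypothetical. The `format` and `mount` subcommands and the `--storage`, `--bucket`, `--compress`, and `--background` flags are believed to match the JuiceFS CLI, but should be checked against `juicefs format --help` and `juicefs mount --help`. Credentials are deliberately left out.

```python
import shlex


def juicefs_setup_commands(meta_url, bucket_url, name, mount_point, compress=None):
    """Return (format_argv, mount_argv) for a Redis + S3 setup; executes nothing."""
    if compress not in (None, "lz4", "zstd"):
        raise ValueError("compress must be None, 'lz4' or 'zstd'")
    # Step 1: create the file system (assumed flags, see lead-in).
    format_cmd = ["juicefs", "format", "--storage", "s3", "--bucket", bucket_url]
    if compress:
        format_cmd += ["--compress", compress]
    format_cmd += [meta_url, name]
    # Step 2: mount it in the background at the chosen directory.
    mount_cmd = ["juicefs", "mount", "--background", meta_url, mount_point]
    return format_cmd, mount_cmd


if __name__ == "__main__":
    fmt, mnt = juicefs_setup_commands(
        "redis://localhost:6379/1",                      # hypothetical Redis URL
        "https://mybucket.s3.us-east-1.amazonaws.com",   # hypothetical bucket
        "myjfs",
        "/mnt/jfs",
        compress="zstd",
    )
    print(shlex.join(fmt))
    print(shlex.join(mnt))
```

Printing the commands instead of running them makes it easy to review them, and to add credentials in whatever way the JuiceFS documentation recommends, before touching a real bucket.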

Apache 2.0: use it freely for any purpose, including commercial products, and modify and distribute it, as long as you include the original license and copyright notice.

In plain English

JuiceFS is a distributed file system that lets applications read and write files using familiar file-system operations, while the actual data is stored in cloud object storage (such as Amazon S3) and metadata is stored in a database like Redis or MySQL. The goal is to let you use massive amounts of cloud storage as if it were a local disk attached to your machine, without changing your existing application code.

The system has three parts: a client that runs on the machine reading and writing files, an object storage backend that holds the actual file data, and a metadata engine that tracks file names, sizes, directory structure, and other attributes. Files are split into chunks and blocks before being stored in object storage, which is why the stored files do not look like your original files when you browse the storage bucket directly.

JuiceFS is compatible with POSIX, which means most Linux and macOS programs that work with files can use it without any code changes. It also provides a Hadoop-compatible Java SDK for use with big data tools, an S3-compatible gateway for tools that expect an S3 API, and a Kubernetes driver for container workloads. Multiple clients can mount the same file system at once, and changes made by one client are immediately visible to others.

Key technical properties include low-latency access, data compression using LZ4 or Zstandard, encryption in transit and at rest, file locking compatible with standard POSIX and BSD lock conventions, and the ability to scale throughput by adding more object storage capacity.

Getting started requires choosing a metadata engine (Redis is the most common), configuring object storage, and then installing the JuiceFS client. The project is released under the Apache 2.0 license and offers documentation, a quick-start guide, and a Discord community.
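The chunk-and-block splitting mentioned above can be illustrated with a little arithmetic. The Python sketch below assumes 64 MiB chunks and 4 MiB blocks, which are believed to be the documented defaults but should be verified. It is a simplified model: it ignores the intermediate slice layer that the real system uses to record writes, and the function names are hypothetical.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # assumed chunk size (64 MiB)
BLOCK_SIZE = 4 * 1024 * 1024   # assumed block size (4 MiB)


def locate(offset):
    """Map a byte offset in a file to (chunk index, block index, offset in block)."""
    chunk_index, within_chunk = divmod(offset, CHUNK_SIZE)
    block_index, within_block = divmod(within_chunk, BLOCK_SIZE)
    return chunk_index, block_index, within_block


def block_count(file_size):
    """Number of blocks for a file written once, front to back (simplified)."""
    full_chunks, tail = divmod(file_size, CHUNK_SIZE)
    blocks_per_chunk = CHUNK_SIZE // BLOCK_SIZE
    tail_blocks = -(-tail // BLOCK_SIZE)  # ceiling division
    return full_chunks * blocks_per_chunk + tail_blocks
```

Under these assumptions, a 100 MiB file written in one pass would occupy one full chunk of 16 blocks plus a 36 MiB tail of 9 blocks, so `block_count(100 * 1024 * 1024)` returns 25. Each block becomes a separate object, which is why the bucket shows many opaque objects instead of the original file.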

Copy-paste prompts

Prompt 1
Set up JuiceFS with Redis as the metadata engine and Amazon S3 as the storage backend, then mount it on a Linux machine and copy a dataset to the mount point using standard cp commands
Prompt 2
Configure JuiceFS with Zstandard compression enabled, mount it, and run a benchmark comparing read and write throughput against direct S3 access
Prompt 3
Write a Kubernetes PersistentVolumeClaim manifest that uses the JuiceFS CSI driver to mount the same volume into two separate pods and verify a file written by one pod appears in the other
Prompt 4
Set up the JuiceFS S3 gateway and point an existing application that uses the AWS S3 SDK at it to use JuiceFS as a drop-in S3 replacement

Verify against the repo before relying on details.