explaingit

ly-derekx/lerobot-rgb-rgbd-vla-dataset-toolkit

16 Python Audience · researcher Complexity · 4/5 Setup · hard

TLDR

A Python toolkit that builds robot-learning training datasets by merging recording sessions, auditing video quality, cleaning bad episodes, and uploading to Hugging Face, with added support for depth cameras.

Mindmap

mindmap
  root((lerobot-rgb-rgbd))
    Pipeline stages
      Merge sessions
      Audit quality
      Clean dataset
      Upload to HuggingFace
    Camera support
      Standard RGB video
      RGB-D depth channel
      Orbbec Femto Bolt
    Quality checks
      Frozen frames
      Wrong dimensions
      Near-black content
    Requirements
      Python 3.12
      ffmpeg
      LeRobot framework

Things people build with this

USE CASE 1

Merge multiple separate robot demonstration recording sessions into one unified LeRobot dataset.

USE CASE 2

Audit a robot video dataset for frozen frames, bad dimensions, or near-black content before model training.

USE CASE 3

Record robot demonstrations with an Orbbec Femto Bolt depth camera and store RGB-D data alongside standard video.

USE CASE 4

Upload a cleaned robot training dataset to Hugging Face with support for resuming large transfers.

Tech stack

Python · ffmpeg · LeRobot · Hugging Face

Getting it running

Difficulty · hard Time to first run · 1 day+

Requires Python 3.12 or newer, ffmpeg, an Orbbec Femto Bolt depth camera for RGB-D capture, and a configured LeRobot environment.

No license information was provided in the repository description.

In plain English

This toolkit helps researchers and roboticists build training datasets for robot learning models, specifically the kind that learn from video and robot movement data together. The project sits on top of LeRobot, an open-source framework from Hugging Face for collecting and storing robot demonstration data. A VLA model (Vision-Language-Action) is a type of AI that watches video, reads instructions, and decides how to move a robot arm. To train one, you need large amounts of carefully organized recordings.

The toolkit handles the full pipeline from raw recordings to a published dataset. It supports two camera types: standard RGB color video and RGB-D, which adds a depth channel so the system can also record how far away objects are. The depth support is built around the Orbbec Femto Bolt camera and stores depth as lossless image files alongside the standard video. Both types work within the same workflow.

The main steps the toolkit covers are merging, auditing, cleaning, and uploading. Merging takes multiple separate recording sessions (each stored as its own folder) and combines them into one unified dataset, rewriting all the internal indices and metadata so nothing conflicts. Auditing runs quality checks on the resulting dataset: it scans each video for empty files, wrong dimensions, frozen frames, nearly black or white content, and similar problems, then sorts episodes into keep, review, and drop lists. Cleaning builds a fresh copy of the dataset that excludes the dropped episodes. Uploading pushes the final result to Hugging Face with support for resuming large transfers.

There is also a capture overlay that slots into the official LeRobot repository so you can record Orbbec depth data directly through LeRobot's existing recording scripts. You copy the overlay files into your LeRobot checkout and install the extra dependencies, then use a config file to point it at your camera and robot ports.
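
As a rough illustration of what "rewriting the internal indices" means in the merge step, the sketch below renumbers episodes from several sessions into one global sequence and tracks a running frame offset. It is not the toolkit's code: the `Episode` and `MergedEpisode` records, their field names, and the `merge_sessions` function are hypothetical simplifications, and a real merge would also have to rewrite the dataset files and metadata on disk.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Simplified stand-in for one recorded episode (hypothetical schema)."""
    session: str        # name of the source session folder
    episode_index: int  # index local to that session
    num_frames: int     # number of frames in the episode


@dataclass
class MergedEpisode:
    """Episode record after merging (hypothetical schema)."""
    source_session: str
    source_episode_index: int
    episode_index: int  # new, globally unique index
    first_frame: int    # global index of the episode's first frame
    num_frames: int


def merge_sessions(sessions: list[list[Episode]]) -> list[MergedEpisode]:
    """Concatenate sessions, renumbering episodes and frame offsets."""
    merged: list[MergedEpisode] = []
    next_frame = 0
    for session in sessions:
        # Preserve each session's own episode order.
        for ep in sorted(session, key=lambda e: e.episode_index):
            merged.append(MergedEpisode(
                source_session=ep.session,
                source_episode_index=ep.episode_index,
                episode_index=len(merged),
                first_frame=next_frame,
                num_frames=ep.num_frames,
            ))
            next_frame += ep.num_frames
    return merged


# Two sessions that both start at local episode index 0.
session_a = [Episode("session_a", 0, 120), Episode("session_a", 1, 95)]
session_b = [Episode("session_b", 0, 200)]

merged = merge_sessions([session_a, session_b])
for ep in merged:
    print(ep.episode_index, ep.source_session, ep.first_frame, ep.num_frames)
```

Both sessions contain a local episode 0, so copying the folders together without renumbering would produce a collision. Assigning the new index from the running count, and keeping the source session and original index on each record, avoids the clash while leaving every merged episode traceable to its origin.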
The project requires Python 3.12 or newer and depends on ffmpeg for video processing. A sample merged RGB-D dataset collected with this toolkit is already published on Hugging Face under the name lerobot_derek_depth.
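
To give a feel for the audit and clean steps, here is a minimal sketch that sorts episodes into keep, review, and drop, then builds the list of episodes a cleaned copy would retain. Everything in it is an assumption made for illustration: the thresholds, the `audit_episode` function, and the representation of a frame as a flat list of grayscale values are hypothetical, and the toolkit's actual checks run on real video files and may use different rules.

```python
from statistics import fmean

# Hypothetical thresholds, chosen for illustration only.
DARK_MEAN = 10.0       # mean brightness (0-255) below this counts as near-black
FROZEN_DIFF = 0.5      # mean absolute change below this counts as a repeated frame
DROP_FRACTION = 0.5    # drop when at least this share of frames is bad
REVIEW_FRACTION = 0.1  # flag for review when at least this share is bad


def frame_diff(a, b):
    """Mean absolute per-pixel difference between two same-sized frames."""
    return fmean(abs(x - y) for x, y in zip(a, b))


def audit_episode(frames, expected_pixels):
    """Return 'keep', 'review', or 'drop' for a list of flattened grayscale frames."""
    if not frames:
        return "drop"  # empty video
    if any(len(f) != expected_pixels for f in frames):
        return "drop"  # wrong dimensions
    dark = sum(1 for f in frames if fmean(f) < DARK_MEAN)
    frozen = sum(
        1 for prev, cur in zip(frames, frames[1:])
        if frame_diff(prev, cur) < FROZEN_DIFF
    )
    bad_fraction = max(dark, frozen) / len(frames)
    if bad_fraction >= DROP_FRACTION:
        return "drop"
    if bad_fraction >= REVIEW_FRACTION:
        return "review"
    return "keep"


# Tiny synthetic "videos": 10 frames of 4 pixels each.
moving = [[100 + 5 * i] * 4 for i in range(10)]
episodes = {
    "ep_moving": moving,
    "ep_black": [[0] * 4 for _ in range(10)],
    "ep_partly_frozen": moving[:8] + [moving[7]] * 2,
    "ep_wrong_size": [[100] * 3 for _ in range(10)],
}

results = {name: audit_episode(frames, expected_pixels=4)
           for name, frames in episodes.items()}

# Cleaning, in this sketch, simply excludes everything marked 'drop'.
kept = [name for name, verdict in results.items() if verdict != "drop"]
print(results)
print(kept)
```

Separating a review bucket from the drop bucket keeps borderline episodes, such as one with a short frozen stretch, out of automatic deletion so a person can decide.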

Copy-paste prompts

Prompt 1
I have multiple LeRobot recording sessions in separate folders and want to merge them into one dataset using this toolkit. Show me the merge command and how it rewrites the internal indices.
Prompt 2
Run an audit on my LeRobot dataset to find frozen frames or bad-quality episodes, then show me how to generate a clean copy with those episodes removed.
Prompt 3
Set up the Orbbec depth camera capture overlay in my LeRobot checkout so I can record RGB-D robot demonstrations. Walk me through copying the overlay files and configuring the camera and robot ports.
Prompt 4
Upload my cleaned robot dataset to Hugging Face using this toolkit, including how to resume a large transfer that was interrupted.

Verify against the repo before relying on details.