explaingit

danieldoradotalaveron-rb/yolosegment-2d-to-3d-rebotarm_pick_and_place

Analysis updated 2026-05-18

Stars · 9 | Language · Python | Audience · researcher | Complexity · 5/5 | Setup · hard

TLDR

A robotic vision pipeline that uses an Intel RealSense camera and YOLO object detection to locate objects in 3D and direct a reBot arm to pick and place them using ROS 2 and MoveIt.

Mindmap

mindmap
  root((YoloSegment Pick and Place))
    Pipeline Stages
      RGB-D camera capture
      YOLO 2D segmentation
      2D to 3D projection
      MoveIt motion planning
    Hardware
      Intel RealSense camera
      reBot 601-DM arm
    Object Detection
      Pretrained YOLO weights
      Custom class training
      Temporal pose tracking
    Tech
      Python
      ROS 2
      MoveIt
      YOLO ultralytics

What do people build with it?

USE CASE 1

Run a pick-and-place demo with corks and pompoms using the pretrained YOLO model and a reBot arm

USE CASE 2

Adapt the pipeline to new object classes by capturing your own RGB-D images and training a custom YOLO-seg model

USE CASE 3

Study how to convert 2D image segmentation masks into 3D bounding boxes using aligned depth data

USE CASE 4

Integrate computer vision object detections into a ROS 2 MoveIt manipulation workflow
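
Use case 3 above, lifting a 2D segmentation mask to a 3D box with aligned depth, can be illustrated with the standard pinhole camera model. This is a minimal sketch and not the repository's implementation: the function names `lift_mask_to_3d` and `axis_aligned_box`, the toy 3x3 image, and the intrinsics (`fx`, `fy`, `cx`, `cy`) are all assumptions made for illustration. Plain Python lists keep it dependency-free; a real pipeline would more likely use NumPy arrays. The result is in the camera frame, not the robot frame.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def lift_mask_to_3d(
    mask: Sequence[Sequence[int]],
    depth_m: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Deproject every masked pixel that has valid depth into a camera-frame point."""
    points: List[Point3D] = []
    for v, (mask_row, depth_row) in enumerate(zip(mask, depth_m)):
        for u, (inside, z) in enumerate(zip(mask_row, depth_row)):
            if inside and z > 0.0:  # a depth of 0 is treated as "no measurement"
                x = (u - cx) * z / fx  # pinhole model, horizontal axis
                y = (v - cy) * z / fy  # pinhole model, vertical axis
                points.append((x, y, z))
    return points


def axis_aligned_box(points: Sequence[Point3D]) -> Tuple[Point3D, Point3D]:
    """Return the (min corner, max corner) of the box enclosing the points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Toy 3x3 example with made-up intrinsics (assumed values, not from the repo).
mask = [[0, 1, 0],
        [0, 1, 1],
        [0, 0, 0]]
depth_m = [[0.5, 0.5, 0.5],
           [0.5, 0.0, 0.5],   # the centre pixel has no valid depth
           [0.5, 0.5, 0.5]]

points = lift_mask_to_3d(mask, depth_m, fx=100.0, fy=100.0, cx=1.0, cy=1.0)
box_min, box_max = axis_aligned_box(points)
print(len(points), box_min, box_max)
```

Only two of the three masked pixels contribute here, because the pixel with zero depth is skipped. Filtering invalid depth before taking the box extents matters in practice, since a single bad reading would otherwise stretch the box.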

What is it built with?

Python, YOLO, ROS 2, MoveIt, Intel RealSense, PyTorch

How does it compare?

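| Repo | Stars | Language | Setup difficulty | Complexity | Audience |
| --- | --- | --- | --- | --- | --- |
| danieldoradotalaveron-rb/yolosegment-2d-to-3d-rebotarm_pick_and_place | 9 | Python | hard | 5/5 | researcher |
| ewreaslan/jwttx | 9 | Python | easy | 3/5 | developer |
| hygenie1228/tehor_release | 9 | Python | hard | 5/5 | researcher |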

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1 day+

Requires a reBot 601-DM arm, Intel RealSense D405 camera, a configured ROS 2 environment with MoveIt, and hand-eye calibration before any grasping can run.
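
Hand-eye calibration matters because detections arrive in the camera frame, while the arm plans in its own base frame. As a rough sketch for an eye-in-hand camera, a camera-frame point can be mapped to the base frame by chaining two homogeneous transforms: the end-effector pose in the base frame and the calibrated camera pose in the end-effector frame. The helper names `matmul4` and `transform_point` and all numeric values below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from typing import List, Sequence, Tuple

Matrix4 = List[List[float]]


def matmul4(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> Matrix4:
    """Multiply two 4x4 homogeneous transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(t: Sequence[Sequence[float]],
                    p: Tuple[float, float, float]) -> Tuple[float, float, float]:
    """Apply a 4x4 homogeneous transform to a 3D point."""
    x, y, z = p
    return tuple(t[i][0] * x + t[i][1] * y + t[i][2] * z + t[i][3]
                 for i in range(3))


# End effector 0.3 m forward and 0.4 m up, tool pointing straight down
# (a 180 degree rotation about x). Made-up pose, as if from forward kinematics.
T_base_ee = [[1.0, 0.0, 0.0, 0.3],
             [0.0, -1.0, 0.0, 0.0],
             [0.0, 0.0, -1.0, 0.4],
             [0.0, 0.0, 0.0, 1.0]]

# Camera mounted 5 cm along the end effector's y axis. Made-up value standing
# in for the result of a hand-eye calibration.
T_ee_cam = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.05],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

T_base_cam = matmul4(T_base_ee, T_ee_cam)

p_cam = (0.01, 0.02, 0.25)          # object centre seen 25 cm in front of the camera
p_base = transform_point(T_base_cam, p_cam)
print(p_base)                        # approximately (0.31, -0.07, 0.15)
```

An error in the calibrated transform shifts every grasp target by the same amount, which is why calibration has to happen before any grasping is attempted.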

The repository's own code is MIT licensed; the bundled 3D detection tooling is adapted from another project, and the ROS 2 workspace retains its upstream Apache-2.0 license.

In plain English

This project is a robotic vision system that allows a robot arm to see objects on a table and pick them up. It combines computer vision and robot control: a camera identifies what objects are present and where they are, then the robot arm moves to grasp and place them.

The pipeline has four stages. First, an Intel RealSense camera captures a color image along with depth information, meaning it can measure how far away each pixel is. Second, a YOLO object detection model identifies and outlines individual objects in the color image. Third, those outlines are combined with the depth data to determine where each object sits in three-dimensional space relative to the robot. Fourth, a ROS 2 motion planning system called MoveIt takes those 3D positions and plans a sequence of movements to pick up the object and put it down somewhere else.

The example in this repository uses three object types: corks, lighters, and pompoms, with the pick-and-place demonstration set up for corks and pompoms. Pre-trained model weights and a labeled dataset are available on Hugging Face, so you can run the pipeline with the included objects without training from scratch. If you want to use different objects, the repository includes tools for capturing your own images, labeling them, and training a new detection model.

One notable aspect is how the system handles different object shapes. Symmetrical objects like corks and pompoms are always grasped straight down from above, since their rotation does not matter. Non-symmetrical objects can be grasped at a specific angle. The 3D position tracking also smooths detections over time to avoid jitter from single noisy frames.

The project requires a reBot 601-DM arm, an Intel RealSense camera, and a ROS 2 environment with MoveIt. The repository's own code is under the MIT License. Parts of the underlying 3D detection tooling are adapted from another open-source project under separate licensing terms.
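
The smoothing and grasp-orientation behaviour described above can be sketched in a few lines. The summary does not say which filter the repository uses, so the exponential moving average here is an assumption, as are the names `PositionSmoother`, `SYMMETRIC_CLASSES`, and `grasp_yaw`. Treating the lighter as a non-symmetrical object is also an assumption made for the example.

```python
from typing import Optional, Tuple

Point3D = Tuple[float, float, float]

# Classes grasped straight down regardless of rotation (per the description above).
SYMMETRIC_CLASSES = {"cork", "pompom"}


class PositionSmoother:
    """Exponential moving average over successive 3D detections (assumed filter)."""

    def __init__(self, alpha: float = 0.3) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state: Optional[Point3D] = None

    def update(self, measurement: Point3D) -> Point3D:
        if self.state is None:
            # The first detection initialises the track.
            self.state = tuple(measurement)
        else:
            a = self.alpha
            self.state = tuple(a * m + (1.0 - a) * s
                               for m, s in zip(measurement, self.state))
        return self.state


def grasp_yaw(class_name: str, object_yaw_rad: float) -> float:
    """Ignore object rotation for symmetric classes, otherwise align with it."""
    return 0.0 if class_name in SYMMETRIC_CLASSES else object_yaw_rad


smoother = PositionSmoother(alpha=0.5)
frames = [(0.10, 0.0, 0.02), (0.12, 0.0, 0.02), (0.30, 0.0, 0.02)]  # last frame is noisy
for frame in frames:
    smoothed = smoother.update(frame)

print(smoothed)                     # x is pulled only part of the way toward 0.30
print(grasp_yaw("cork", 1.2))       # symmetric class: yaw ignored
print(grasp_yaw("lighter", 1.2))    # assumed non-symmetric: yaw kept
```

A smaller `alpha` suppresses more jitter but makes the estimate lag behind an object that has genuinely moved, so the value is a trade-off rather than a constant to copy.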

Copy-paste prompts

Prompt 1
How do I run the pretrained pompom/cork pick-and-place demo on a reBot 601-DM with an Intel RealSense camera using this repository?
Prompt 2
How do I capture my own RGB-D dataset and train a new YOLO-seg model for a different object class using the justfile recipes?
Prompt 3
How does the 2D-to-3D lifting step work in this pipeline, and how does RANSAC table-plane fitting help with pose estimation?
Prompt 4
How do I perform the ChArUco hand-eye calibration for the eye-in-hand RealSense camera in this ROS 2 setup?
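
For context on Prompt 3, RANSAC plane fitting in general repeatedly samples three points, fits a plane through them, and keeps the plane that the most points agree with. The sketch below shows that generic idea on synthetic data. It is not the repository's code, and the function names, threshold, iteration count, and point cloud are all assumptions.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]
Plane = Tuple[Point3D, float]  # unit normal n and offset d, with n . p + d = 0


def plane_from_points(p1: Point3D, p2: Point3D, p3: Point3D) -> Optional[Plane]:
    """Plane through three points, or None if they are (nearly) collinear."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:
        return None
    unit = (n[0] / norm, n[1] / norm, n[2] / norm)
    d = -sum(unit[i] * p1[i] for i in range(3))
    return unit, d


def distance(plane: Plane, p: Point3D) -> float:
    """Unsigned distance from a point to the plane."""
    n, d = plane
    return abs(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d)


def ransac_plane(points: Sequence[Point3D], threshold: float = 0.005,
                 iterations: int = 200, seed: int = 0
                 ) -> Tuple[Optional[Plane], List[Point3D]]:
    """Return the sampled plane with the most inliers, plus those inliers."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    best_plane: Optional[Plane] = None
    best_inliers: List[Point3D] = []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(list(points), 3))
        if plane is None:
            continue
        inliers = [p for p in points if distance(plane, p) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Synthetic cloud: a 5x5 table patch at z = 0.5 m and a small object 5 cm closer.
table = [(i * 0.05, j * 0.05, 0.5) for i in range(5) for j in range(5)]
obj = [(0.10, 0.10, 0.45), (0.15, 0.10, 0.45), (0.10, 0.15, 0.45), (0.15, 0.15, 0.45)]
plane, inliers = ransac_plane(table + obj)
heights = [distance(plane, p) for p in obj]
print(len(inliers), plane, heights)
```

Once the table plane is known, the distance of each object point from it gives the object's height above the table, which is one way a plane fit can help constrain a pose estimate.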

Frequently asked questions

What is yolosegment-2d-to-3d-rebotarm_pick_and_place?

A robotic vision pipeline that uses an Intel RealSense camera and YOLO object detection to locate objects in 3D and direct a reBot arm to pick and place them using ROS 2 and MoveIt.

What language is yolosegment-2d-to-3d-rebotarm_pick_and_place written in?

Mainly Python. The stack also includes YOLO, ROS 2, MoveIt, Intel RealSense, and PyTorch.

What license does yolosegment-2d-to-3d-rebotarm_pick_and_place use?

The repository's own code is MIT licensed; the bundled 3D detection tooling is adapted from another project, and the ROS 2 workspace retains its upstream Apache-2.0 license.

How hard is yolosegment-2d-to-3d-rebotarm_pick_and_place to set up?

Setup difficulty is rated hard, with roughly a day or more to a first successful run.

Who is yolosegment-2d-to-3d-rebotarm_pick_and_place for?

Mainly researchers.

Verify against the repo before relying on details.