Analysis updated 2026-05-18
Run a pick-and-place demo with corks and pompoms using the pretrained YOLO model and a reBot arm
Adapt the pipeline to new object classes by capturing your own RGB-D images and training a custom YOLO-seg model
Study how to convert 2D image segmentation masks into 3D bounding boxes using aligned depth data
Integrate computer vision object detections into a ROS 2 MoveIt manipulation workflow
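The mask-to-3D step mentioned above can be sketched with the standard pinhole camera model. This is a minimal illustration, not the repository's code: the function names `deproject_pixel` and `mask_to_bbox_3d`, the nested-list image layout, and the intrinsics (`fx`, `fy`, `cx`, `cy`) are all assumptions made for the example. A real pipeline would likely vectorize this with NumPy and read the intrinsics from the camera.

```python
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]


def deproject_pixel(u: int, v: int, depth_m: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Back-project pixel (u, v) at depth_m metres into the camera frame (pinhole model)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def mask_to_bbox_3d(mask: List[List[int]], depth: List[List[float]],
                    fx: float, fy: float, cx: float, cy: float
                    ) -> Optional[Tuple[Point3D, Point3D]]:
    """Return (min_corner, max_corner) of the axis-aligned 3D box around the masked pixels.

    mask and depth are row-major images of equal size; depth is in metres and
    assumed to be aligned to the color image. Pixels with zero depth are
    treated as missing readings and skipped.
    """
    points = []
    for v, (mask_row, depth_row) in enumerate(zip(mask, depth)):
        for u, (inside, z) in enumerate(zip(mask_row, depth_row)):
            if inside and z > 0.0:
                points.append(deproject_pixel(u, v, z, fx, fy, cx, cy))
    if not points:
        return None  # the mask had no valid depth readings
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


# Tiny synthetic example: a 3x3 image with hypothetical intrinsics.
example_mask = [[0, 1, 1],
                [0, 1, 1],
                [0, 0, 0]]
example_depth = [[0.5, 0.5, 0.5],
                 [0.5, 0.0, 0.5],  # the centre pixel has no depth reading
                 [0.5, 0.5, 0.5]]
example_box = mask_to_bbox_3d(example_mask, example_depth,
                              fx=100.0, fy=100.0, cx=1.0, cy=1.0)
```

The resulting box is expressed in the camera frame. Moving it into the robot's frame needs the camera-to-robot transform, which is what the hand-eye calibration provides.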
| | danieldoradotalaveron-rb/yolosegment-2d-to-3d-rebotarm_pick_and_place | ewreaslan/jwttx | hygenie1228/tehor_release |
|---|---|---|---|
| Stars | 9 | 9 | 9 |
| Language | Python | Python | Python |
| Setup difficulty | hard | easy | hard |
| Complexity | 5/5 | 3/5 | 5/5 |
| Audience | researcher | developer | researcher |
Figures from each repo's GitHub metadata at analysis time.
Requires a reBot 601-DM arm, Intel RealSense D405 camera, a configured ROS 2 environment with MoveIt, and hand-eye calibration before any grasping can run.
This project is a robotic vision system that allows a robot arm to see objects on a table and pick them up. It combines computer vision and robot control: a camera identifies what objects are present and where they are, then the robot arm moves to grasp and place them.

The pipeline has four stages. First, an Intel RealSense camera captures a color image along with depth information, meaning it can measure how far away each pixel is. Second, a YOLO object detection model identifies and outlines individual objects in the color image. Third, those outlines are combined with the depth data to determine where each object sits in three-dimensional space relative to the robot. Fourth, a ROS 2 motion planning system called MoveIt takes those 3D positions and plans a sequence of movements to pick up the object and put it down somewhere else.

The example in this repository uses three object types: corks, lighters, and pompoms, with the pick-and-place demonstration set up for corks and pompoms. Pre-trained model weights and a labeled dataset are available on Hugging Face, so you can run the pipeline with the included objects without training from scratch. If you want to use different objects, the repository includes tools for capturing your own images, labeling them, and training a new detection model.

One notable aspect is how the system handles different object shapes. Symmetrical objects like corks and pompoms are always grasped straight down from above, since their rotation does not matter. Non-symmetrical objects can be grasped at a specific angle. The 3D position tracking also smooths detections over time to avoid jitter from single noisy frames.

The project requires a reBot 601-DM arm, an Intel RealSense camera, and a ROS 2 environment with MoveIt. The repository's own code is under the MIT License. Parts of the underlying 3D detection tooling are adapted from another open-source project under separate licensing terms.
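The smoothing and grasp-angle behavior described above could look roughly like the sketch below. The description does not say which filter the project uses, so the exponential moving average here is an assumption, as are the class-label strings, the `PositionSmoother` and `grasp_yaw` names, and the treatment of a lighter as a non-symmetrical object.

```python
from typing import Optional, Tuple

Point3D = Tuple[float, float, float]

# Hypothetical label strings for the classes described as symmetrical.
SYMMETRIC_CLASSES = {"cork", "pompom"}


class PositionSmoother:
    """Exponential moving average over successive 3D detections (assumed filter)."""

    def __init__(self, alpha: float = 0.3) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha  # weight given to the newest measurement
        self.state: Optional[Point3D] = None

    def update(self, measurement: Point3D) -> Point3D:
        """Blend a new measurement into the running estimate and return it."""
        if self.state is None:
            # The first detection initializes the estimate directly.
            self.state = measurement
        else:
            a = self.alpha
            x, y, z = (a * m + (1.0 - a) * s
                       for m, s in zip(measurement, self.state))
            self.state = (x, y, z)
        return self.state


def grasp_yaw(label: str, object_yaw_rad: float) -> float:
    """Pick the gripper yaw: fixed top-down for symmetrical classes, else follow the object."""
    return 0.0 if label in SYMMETRIC_CLASSES else object_yaw_rad


# Usage: a noisy second frame only moves the estimate part of the way.
smoother = PositionSmoother(alpha=0.5)
first = smoother.update((0.0, 0.0, 0.4))
second = smoother.update((0.2, 0.0, 0.4))
```

A smaller `alpha` suppresses more jitter but makes the estimate lag behind an object that really has moved, so the value would need tuning against the camera's frame rate.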
A robotic vision pipeline that uses an Intel RealSense camera and YOLO object detection to locate objects in 3D and direct a reBot arm to pick and place them using ROS 2 and MoveIt.
Mainly Python. The stack also includes YOLO and ROS 2.
The repository's own code is MIT licensed. The bundled 3D detection tooling is adapted from another project, and the ROS 2 workspace retains its upstream Apache-2.0 license.
Setup difficulty is rated hard, with roughly a day or more to a first successful run.
Mainly researchers.
Verify against the repo before relying on details.