explaingit

opendrivelab/uniad

4,612 · Python · Audience · researcher · Complexity · 5/5 · License · Setup · hard

TLDR

A self-driving car research framework that trains perception, prediction, and planning as one unified model. It won the CVPR 2023 Best Paper award and uses camera input on the nuScenes benchmark.

Mindmap

mindmap
  root((UniAD))
    What it does
      Unified self-driving model
      Perception to planning
      Camera-only input
    Tasks
      Vehicle tracking
      Road mapping
      Motion prediction
      Occupancy forecast
      Path planning
    Training
      Two-stage process
      nuScenes dataset
      End-to-end joint
    Audience
      AV researchers
      Computer vision PhDs
      Academic labs

Things people build with this

USE CASE 1

Reproduce the CVPR 2023 Best Paper results on the nuScenes benchmark using the provided model checkpoints.

USE CASE 2

Use the pre-trained UniAD weights as a starting point for your own autonomous driving research.

USE CASE 3

Study how training perception, prediction, and planning jointly improves each individual task's performance.

USE CASE 4

Evaluate a custom component such as a new motion predictor within the unified end-to-end training pipeline.

Tech stack

Python · PyTorch · nuScenes

Getting it running

Difficulty · hard · Time to first run · 1 day+

Requires a GPU, the nuScenes dataset (~350 GB download), and a two-stage training pipeline. Pre-trained checkpoints are available, so you can run evaluation without training from scratch.

Use freely, including in commercial products, as long as you keep copyright notices and state any changes you make.

In plain English

UniAD is a research framework for self-driving cars that won the Best Paper Award at CVPR 2023, one of the top computer vision conferences. The central idea is that the many separate tasks a self-driving system needs to do work better when trained together in a coordinated way rather than as isolated modules. Those tasks include detecting other vehicles, predicting where they will go, understanding the layout of the road, anticipating what parts of space nearby cars might occupy, and deciding where your car should drive next.

Most autonomous driving systems are built with separate components that pass information to each other in sequence. UniAD treats all of these tasks as part of one unified model where the output of perception tasks flows directly into prediction tasks, and prediction flows into planning. The reasoning is that if the system learns with planning as the final goal from the start, every intermediate step is optimized in service of that goal rather than for its own score in isolation.

The model takes camera images from around the vehicle as input. It processes them into a bird's-eye view representation of the surrounding scene, which the task-specific parts of the network then interpret for tracking, mapping, motion prediction, occupancy forecasting, and path planning. The model is trained in two stages: first the perception parts are trained alone to reach stable starting weights, then all parts are trained together end-to-end.

Pre-trained model checkpoints are provided for download so researchers can reproduce the reported numbers or use the weights as a starting point for their own experiments. The repository includes configuration files for training and evaluation, and documentation for installation and dataset preparation. The dataset used is nuScenes, a public benchmark for autonomous driving research. The code is released under the Apache 2.0 license.
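The task ordering and two-stage schedule described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not code from the UniAD repository. The task labels, the `Stage` class, and the `run_pipeline` and `trainable_tasks` helpers are all hypothetical names chosen for this sketch, and passing every earlier output to each later head is a simplifying assumption.

```python
from dataclasses import dataclass

# Task order as described in the text: perception -> prediction -> planning.
# These names are illustrative labels, NOT identifiers from the UniAD repo.
PERCEPTION = ("tracking", "mapping")
PREDICTION = ("motion_prediction", "occupancy_forecast")
PLANNING = ("path_planning",)
PIPELINE = PERCEPTION + PREDICTION + PLANNING


@dataclass(frozen=True)
class Stage:
    """A training stage and the tasks that are optimized during it."""
    name: str
    trained_tasks: tuple


# Stage 1 trains the perception parts alone; stage 2 trains all parts jointly.
STAGES = (
    Stage("stage1_perception", PERCEPTION),
    Stage("stage2_end_to_end", PIPELINE),
)


def run_pipeline(bev_features, heads):
    """Run each task head in order on the bird's-eye-view features.

    `heads` maps a task name to a callable taking (bev_features, earlier_outputs).
    Each head sees the outputs of all earlier tasks (an assumption made here
    to mirror "perception flows into prediction, prediction into planning").
    """
    outputs = {}
    for task in PIPELINE:
        outputs[task] = heads[task](bev_features, dict(outputs))
    return outputs


def trainable_tasks(stage):
    """List the tasks that receive gradient updates in the given stage."""
    return [task for task in PIPELINE if task in stage.trained_tasks]
```

In a real PyTorch setup, the `trainable_tasks` result would typically decide which sub-modules are optimized and which are left out of the stage; the repository's own configuration files remain the authoritative description of what each stage trains.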

Copy-paste prompts

Prompt 1
I want to reproduce the UniAD CVPR 2023 results. Walk me through downloading the nuScenes dataset, installing dependencies, and running evaluation with the provided checkpoints.
Prompt 2
How does UniAD connect tracking, mapping, motion prediction, occupancy forecasting, and path planning in a single end-to-end model? Explain the data flow.
Prompt 3
Show me how to run the two-stage UniAD training process: first training only the perception components, then fine-tuning the full model end-to-end.
Prompt 4
I want to replace the motion prediction head in UniAD with my own module. Where in the codebase is that component defined and what interface does it need to match?

Verify against the repo before relying on details.