Convert a semantic segmentation map of a city street into a photorealistic driving video using the pretrained model.
Generate a realistic talking-head video from a simple edge outline of a face.
Synthesize a moving person in video from skeleton pose keypoints for animation research.
Training at full 2048x1024 resolution requires 8 NVIDIA GPUs with at least 24 GB of memory each, but pretrained models are available for quick inference tests.
NVIDIA vid2vid is a research project that takes a video made of simplified visual inputs, such as colored region maps, edge outlines, or body pose skeletons, and generates a realistic-looking video that matches them. For example, you can take a map of a city street where each region is colored by category (road, sidewalk, building, sky) and produce a photorealistic video that looks like you are actually driving through that street. Other examples include generating a talking face from a simple edge outline of the face, or generating a person moving from a skeleton of their joints.

The project was published at NeurIPS 2018 by researchers from NVIDIA and MIT. It builds on earlier NVIDIA image translation work and focuses specifically on making the output look consistent and smooth across video frames, not just frame by frame.

To use it, you need Linux or macOS, Python 3, and an NVIDIA graphics card with CUDA support. Training at the highest resolution (2048 by 1024 pixels) requires 8 GPUs with at least 24 GB of memory each, so this is aimed at researchers and teams with significant hardware. Pre-trained models are available for the street and face examples, so you can test the system without training from scratch. The training process works at increasing resolutions in stages, starting small and working up to the full output size.

The README includes detailed instructions for downloading datasets and pre-trained models, running tests, and training your own models on city street, face, and human pose data. This is a research code release tied to the published paper. It is not a finished product and is intended for academic exploration rather than production use.
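The staged, small-to-large training mentioned above can be pictured with a short sketch. This is an illustration only: the function name `resolution_schedule`, the choice of three stages, and the halving between stages are assumptions made for the example, not values taken from the repository's training scripts.

```python
def resolution_schedule(full_width=2048, full_height=1024, stages=3):
    """Return one (width, height) pair per training stage.

    Hypothetical helper: the last stage uses the full output size and
    each earlier stage halves both dimensions (an assumed factor).
    """
    sizes = []
    for stage in range(stages):
        # Earlier stages are scaled down by a larger power of two.
        scale = 2 ** (stages - 1 - stage)
        sizes.append((full_width // scale, full_height // scale))
    return sizes


if __name__ == "__main__":
    for number, (width, height) in enumerate(resolution_schedule(), start=1):
        print(f"stage {number}: train at {width}x{height}")
```

Under these assumptions the schedule starts at a quarter of the final width and height and doubles until it reaches 2048 by 1024. The actual stage count and sizes should be read from the training scripts in the repository.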
Verify against the repo before relying on details.