hkust-c4g/domainshuttle

Analysis updated 2026-05-18

★ 156 · Python · Audience · researcher · Complexity · 4/5 · License · Apache 2.0 · Setup · hard

Mindmap

mindmap
  root((DomainShuttle))
    What it does
      Text to video
      Subject driven
      Cross domain style
    How it works
      Decoupled features
      Domain modeling
      14B parameters
    Base model
      Wan2.2
      HuggingFace weights
    Use cases
      Consistent characters
      Style transfer video
      Research baseline
    Setup
      CUDA GPU required
      conda environment
      Apache 2.0 license


What do people build with it?

USE CASE 1

Generate a video of a photo subject performing an action in a completely different visual style, such as animation.

USE CASE 2

Test how well a stylized character maintains its appearance across different scene backgrounds in generated video.

USE CASE 3

Use the subject-driven video pipeline as a baseline for research into cross-domain visual consistency.

USE CASE 4

Run inference on your own reference image by swapping the JSON config and running the provided shell script.
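The config swap in use case 4 can be scripted rather than edited by hand. The sketch below is a minimal example under stated assumptions: the key names `ref_image` and `prompt`, the function `make_custom_case`, and the file names in the usage note are all hypothetical placeholders, not the repository's actual schema. Open one of the bundled sample JSON files first and substitute the keys it really uses.

```python
import json
from pathlib import Path


def make_custom_case(template_path, output_path, image_path, prompt):
    """Copy a test-case JSON file, overriding the reference image and prompt.

    ASSUMPTION: the template is a single JSON object, and "ref_image" and
    "prompt" are hypothetical key names. Replace them with the keys found in
    the repository's own sample config.
    """
    case = json.loads(Path(template_path).read_text(encoding="utf-8"))
    if not isinstance(case, dict):
        raise TypeError("expected the template to contain a single JSON object")

    case["ref_image"] = str(image_path)  # hypothetical key
    case["prompt"] = prompt              # hypothetical key

    # Every other field in the template is carried over unchanged.
    Path(output_path).write_text(json.dumps(case, indent=2), encoding="utf-8")
    return case
```

A call such as `make_custom_case("sample_case.json", "my_case.json", "my_subject.png", "the subject walks through a forest")` would write a new config next to the sample one, which the provided shell script could then be pointed at. Writing a copy instead of editing in place keeps the bundled samples intact for comparison.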

What is it built with?

Python · PyTorch · HuggingFace · CUDA · conda

How does it compare?

	hkust-c4g/domainshuttle	helpmeeadice/bandori-pet-rev	orchestration-agent/agentorchestration
Stars	156	156	155
Language	Python	Python	Python
Setup difficulty	hard	moderate	hard
Complexity	4/5	3/5	4/5
Audience	researcher	general	ops devops

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard · Time to first run · 1h+

Requires a CUDA-enabled GPU with sufficient VRAM for a 14B model. Both the Wan2.2 base and DomainShuttle checkpoints must be downloaded from HuggingFace before inference.
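To put "sufficient VRAM for a 14B model" in rough numbers, the sketch below estimates the memory taken by the weights alone at a few common precisions. It is back-of-the-envelope arithmetic, not a measured requirement: the precision the project actually runs at is an assumption here, and the estimate excludes activations, the text encoder, the VAE, and any savings from offloading. The helper name `weight_memory_gib` is illustrative.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_memory_gib(num_params, dtype="bf16"):
    """Return the memory needed for the weights alone, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # 14e9 parameters, as stated for the Wan2.2 base model above.
    for dtype in ("fp32", "bf16", "int8"):
        print(f"{dtype}: {weight_memory_gib(14e9, dtype):.1f} GiB")
```

At 16-bit precision this comes to about 26 GiB before any working memory is counted, which is why the setup is rated hard for consumer GPUs.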

Use freely for any purpose, including commercial, as long as you comply with the Apache 2.0 license conditions.

In plain English

DomainShuttle is an AI research project from the Hong Kong University of Science and Technology. It generates videos from text descriptions while keeping a specific subject (a person, an object, or a stylized character) looking consistent throughout, even when that subject is placed in a visually different setting or art style.

Existing text-to-video systems struggle when a subject from one visual domain (say, a cartoon character) has to appear in a different domain (say, a photorealistic landscape), or when a subject from a real photo has to appear in an animated-style video. DomainShuttle addresses this by separating the subject's appearance from the domain, meaning the visual style and environment of the video. It learns each independently and combines them during generation.

The system is built on top of Wan2.2, a 14-billion-parameter video generation model. You give it a reference image of your subject and a text prompt describing the scene or action, and it generates a short video in which that subject appears in the described setting.

Setup requires a CUDA-enabled GPU, the conda environment manager, and two large model downloads from HuggingFace. After installing dependencies with a setup script, you run a single shell script to generate videos. Sample test cases are included in the repository, and you can swap in your own reference image by editing a JSON config file.

The model weights and code are licensed under Apache 2.0. A technical report describing the method in full is available on arXiv.
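Because the setup has several prerequisites (conda, an NVIDIA driver, and two downloaded checkpoints), a quick preflight check before launching the inference script can save a failed run. The sketch below uses only the Python standard library. The function name `preflight` and the checkpoint directory names are hypothetical, so substitute the paths you actually downloaded the weights to. Checking for `nvidia-smi` on the PATH only indicates that an NVIDIA driver is installed, not that there is enough VRAM.

```python
import shutil
from pathlib import Path


def preflight(checkpoint_dirs, tools=("conda", "nvidia-smi")):
    """Report which prerequisites are present.

    Returns a dict mapping a label to True or False. A checkpoint directory
    counts as present only if it exists and contains at least one entry.
    """
    report = {}
    for tool in tools:
        # shutil.which returns None when the executable is not on the PATH.
        report[f"tool:{tool}"] = shutil.which(tool) is not None
    for directory in checkpoint_dirs:
        path = Path(directory)
        report[f"dir:{directory}"] = path.is_dir() and any(path.iterdir())
    return report


if __name__ == "__main__":
    # ASSUMPTION: these directory names are placeholders for illustration.
    dirs = ["checkpoints/Wan2.2", "checkpoints/DomainShuttle"]
    for label, ok in preflight(dirs).items():
        print(f"{'ok     ' if ok else 'MISSING'}  {label}")
```

Any line marked MISSING points at a step to revisit before running the generation script.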

Copy-paste prompts

Prompt 1

Walk me through setting up the DomainShuttle conda environment, downloading the Wan2.2 base model and DomainShuttle weights, and running the inference script.

Prompt 2

How do I edit the test_case JSON file in DomainShuttle to use my own reference image for video generation?

Prompt 3

What GPU VRAM is required to run DomainShuttle at 480p vs 720p resolution?

Prompt 4

Explain how DomainShuttle decouples subject features from domain attributes to improve cross-domain consistency in generated video.

Prompt 5

How do I generate a video of a cartoon character placed in a photorealistic outdoor setting using DomainShuttle?

Frequently asked questions

What is domainshuttle?

DomainShuttle generates videos from text that keep a specific subject visually consistent across different art styles, using a subject-driven approach on top of the 14B-parameter Wan2.2 model.

What language is domainshuttle written in?

Mainly Python. The stack also includes PyTorch, HuggingFace, CUDA, and conda.

What license does domainshuttle use?

Apache 2.0. Use freely for any purpose, including commercial, as long as you comply with the license conditions.

How hard is domainshuttle to set up?

Setup difficulty is rated hard, with roughly 1h+ to a first successful run.

Who is domainshuttle for?

Mainly researchers.


Verify against the repo before relying on details.