tsa18/q-arvd

★ 15 | Python | Audience · researcher | Complexity · 5/5 | Setup · hard

Mindmap

mindmap
  root((Q-ARVD))
    What it does
      Model compression
      Video quantization
      Quality preservation
    Technique
      Frame weighting
      Outlier handling
      Adaptive scaling
    Workflow
      Sensitivity measure
      Quantize training
      Sample generation
      Quality evaluation
    Audience
      ML researchers
      AI engineers

Things people build with this

USE CASE 1

Compress an AI video generation model to use less memory so it fits on consumer hardware.

USE CASE 2

Apply frame-sensitivity-aware quantization to preserve quality in the most visually important parts of a video.

USE CASE 3

Run the four-step pipeline to quantize, generate, and compare video samples from the original and compressed models.

USE CASE 4

Evaluate video quality before and after quantization using standard benchmark metrics.
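Use case 2 depends on weighting chunks of frames by how sensitive they are to quantization error. The sketch below illustrates that general idea only; it is not the repository's implementation. The function names and the example scores are hypothetical, and it assumes a per-chunk sensitivity score has already been measured.

```python
def chunk_weights(sensitivities):
    """Normalize per-chunk sensitivity scores so they sum to 1.

    Falls back to uniform weights when the scores carry no signal.
    """
    total = sum(sensitivities)
    if total <= 0:
        return [1.0 / len(sensitivities)] * len(sensitivities)
    return [s / total for s in sensitivities]


def weighted_error(chunk_errors, weights):
    """Combine per-chunk quantization errors using the given weights."""
    return sum(w * e for w, e in zip(weights, chunk_errors))


# Made-up numbers for illustration: four chunks of frames.
sensitivities = [4.0, 2.0, 1.0, 1.0]   # hypothetical measured scores
chunk_errors = [0.2, 0.1, 0.1, 0.1]    # hypothetical per-chunk errors

weights = chunk_weights(sensitivities)
uniform = [1.0 / len(chunk_errors)] * len(chunk_errors)

print("sensitivity-weighted error:", weighted_error(chunk_errors, weights))
print("uniform-weighted error:    ", weighted_error(chunk_errors, uniform))
```

With these numbers the sensitivity-weighted objective penalizes the error in the first chunk more heavily than a uniform average would, which is the behavior a frame-weighting scheme is meant to produce.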

Tech stack

Python · PyTorch · CUDA

Getting it running

Difficulty · hard | Time to first run · 1 day+

Requires a CUDA-capable GPU, compatible video generation model weights (Self-Forcing), and multiple research environment dependencies.
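Before starting a long setup, it can help to confirm that PyTorch is installed and can see a GPU. The following preflight check is a minimal sketch, not part of the repository; the `preflight` function and its report keys are hypothetical names. It uses only `importlib.util.find_spec` from the standard library and `torch.cuda.is_available()` from PyTorch.

```python
import importlib.util


def preflight():
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    report = {"torch_installed": False, "cuda_available": False}

    # find_spec returns None when the package cannot be located.
    if importlib.util.find_spec("torch") is None:
        return report
    report["torch_installed"] = True

    try:
        import torch
        report["cuda_available"] = bool(torch.cuda.is_available())
    except Exception:
        # A broken install or driver mismatch leaves cuda_available as False.
        pass
    return report


if __name__ == "__main__":
    for key, value in preflight().items():
        print(f"{key}: {value}")
```

This does not check for the Self-Forcing weights or the other research dependencies; consult the repository's README for those.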

In plain English

Q-ARVD is the code release for a research paper from the National University of Singapore and Hong Kong Polytechnic University. The paper addresses a specific problem in AI video generation: making the models smaller and faster to run without significantly reducing the quality of the videos they produce.

The technique this project focuses on is called quantization. In broad terms, quantization means storing the numbers that make up an AI model using less precision, which reduces the model's memory footprint and often speeds it up. The challenge is that reducing precision introduces errors, and those errors do not appear evenly across a video. Some frames are more sensitive to these errors than others, and some parts of the model have unusual numerical patterns that standard quantization methods handle poorly.

The project introduces two specific solutions to these problems. The first is a frame-weighting mechanism that measures how much each chunk of video frames matters for final visual quality, then allocates precision accordingly rather than treating all frames equally. The second is a strategy for handling unusual numerical outliers in the model weights, using an adaptive two-scale approach that the authors found works better than treating all weights the same way.

The workflow described in the README has four steps: measuring chunk-wise sensitivity across the model, running the quantization training process and saving the result, generating video samples from both the original and quantized models, and then evaluating them with standard video quality metrics. The code is built on top of several existing open-source projects, including a video generation model called Self-Forcing.
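To make the outlier problem concrete, here is a small self-contained sketch of why a single quantization scale struggles when a few values are much larger than the rest, and how giving outliers their own scale helps. It is a generic illustration under stated assumptions, not the paper's adaptive two-scale method: the fixed magnitude threshold, the 4-bit setting, and all function names are hypothetical.

```python
def scale_for(values, bits):
    """Pick a symmetric scale so the largest magnitude maps to the top integer level."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max((abs(v) for v in values), default=0.0)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, bits):
    """Round each value to a signed integer grid, clamp, and map back to floats."""
    qmax = 2 ** (bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def single_scale(values, bits):
    """Quantize every value with one shared scale."""
    return quantize(values, scale_for(values, bits), bits)


def two_scale(values, bits, threshold):
    """Quantize small and large magnitudes with separate scales (assumed fixed threshold)."""
    small = scale_for([v for v in values if abs(v) <= threshold], bits)
    large = scale_for([v for v in values if abs(v) > threshold], bits)
    return [
        quantize([v], small if abs(v) <= threshold else large, bits)[0]
        for v in values
    ]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Made-up weights: five small values and one outlier.
weights = [0.02, -0.05, 0.01, 0.04, -0.03, 8.0]

err_single = mean_squared_error(weights, single_scale(weights, bits=4))
err_two = mean_squared_error(weights, two_scale(weights, bits=4, threshold=1.0))

print("single-scale MSE:", err_single)
print("two-scale MSE:   ", err_two)
```

With one shared scale, the outlier stretches the grid so far that every small weight rounds to zero. Separating the two groups lets the small weights keep a fine grid, so the two-scale error is lower on this example.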

Copy-paste prompts

Prompt 1

Using Q-ARVD, run the chunk-wise sensitivity measurement on my video model checkpoint and show me which frame ranges are most sensitive to quantization errors.

Prompt 2

Walk me through the four-step Q-ARVD pipeline: sensitivity measurement, quantization training, sample generation, and metric evaluation for my video diffusion model.

Prompt 3

I want to use Q-ARVD's adaptive two-scale outlier strategy. Show me how to configure and run the quantization training step on my model.

Prompt 4

Compare video quality metrics between the original and Q-ARVD-quantized versions of my model using the provided evaluation scripts and interpret the results.

Verify against the repo before relying on details.