jjihwan/liteframe

Analysis updated 2026-06-24

★ 14 | Audience · researcher | Complexity · 1/5 | Setup · hard

Mindmap

mindmap
  root((LiteFrame))
    Inputs
      Long videos
      Many frames
    Outputs
      Paper PDF
      Project page
      BibTeX
    Use Cases
      Cite the paper
      Watch for code drop
    Tech Stack
      Vision Transformer
      Video LLM
    Status
      Code coming soon
      Weights coming soon

What do people build with it?

USE CASE 1

Cite the LiteFrame paper in your own video understanding research

USE CASE 2

Watch the repo for the upcoming code and weight release

USE CASE 3

Read the arXiv preprint to learn how to scale frame counts in Video LLMs

What is it built with?

ViT, Video LLM

How does it compare?

	jjihwan/liteframe	0c33/agentic-ai	0xbebis/hyperpay
Stars	14	14	14
Language	—	Python	TypeScript
Setup difficulty	hard	hard	hard
Complexity	1/5	4/5	5/5
Audience	researcher	developer	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1 day+

No runnable code exists yet; the repo only hosts the paper links and a note that the release is pending.

In plain English

LiteFrame is the official GitHub repository for a research paper titled "LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs". The work comes from a team at Google DeepMind together with Seoul National University, with Jihwan Kim listed as the first author and other authors including Nikhil Parthasarathy, Danfeng Qin, Junhwa Hur, Deqing Sun, Bohyung Han, Ming-Hsuan Yang, and Boqing Gong.

The README's own one-sentence summary calls the project a highly efficient video encoder for Video Large Language Models that aims to unlock scalable, long-form video understanding by addressing inefficiencies in both the language model and the Vision Transformer (ViT). In other words, the paper is about making it cheaper and more practical to feed many frames of a video into a model that combines vision and language, rather than only being able to look at a handful of frames.

It is important to be plain about the current state of the repository. The README contains a clearly marked note that the code and model weights will be released soon. As of the README's news entry dated 2026.05.18, only the paper itself has been posted to arXiv. There is a 1-minute overview video linked in the README and a project page hosted on the first author's site, but no runnable training or inference code is present yet.

Because the README is short and almost entirely about author credits, paper links, and the planned release, there is no install guide, no usage example, no benchmark numbers, and no description of the LiteFrame architecture itself in the text shown here. The repository acts as a placeholder that lets people cite the paper and watch for the upcoming code drop. A BibTeX citation block is included for researchers who want to reference the work in their own papers. The arXiv preprint number is 2605.17260, and the project page is at jjihwan.github.io/projects/LiteFrame.

Anyone interested in actually running LiteFrame will need to wait for the authors to publish the code and weights.
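
To give a rough sense of why frame count matters for cost, here is a minimal back-of-the-envelope sketch. It is not LiteFrame's method, which the README does not describe. It assumes a generic setup in which a plain ViT splits each frame into square patches and all resulting visual tokens attend to each other jointly, as they would inside a language model's context. The function names, the 224-pixel frame size, and the 16-pixel patch size are illustrative assumptions, not values taken from the paper.

```python
def visual_tokens(num_frames: int, image_size: int = 224, patch_size: int = 16) -> int:
    """Tokens a plain ViT would emit if every frame is cut into square patches.

    image_size and patch_size are assumed, illustrative defaults.
    """
    patches_per_side = image_size // patch_size
    return num_frames * patches_per_side * patches_per_side


def relative_attention_cost(num_tokens: int, baseline_tokens: int) -> float:
    """Pairwise self-attention work grows with the square of the token count.

    Returns the cost of attending over num_tokens relative to baseline_tokens.
    """
    return (num_tokens / baseline_tokens) ** 2


if __name__ == "__main__":
    baseline = visual_tokens(8)  # hypothetical "handful of frames" baseline
    for frames in (8, 64, 512):
        tokens = visual_tokens(frames)
        ratio = relative_attention_cost(tokens, baseline)
        print(f"{frames:4d} frames -> {tokens:7d} tokens, ~{ratio:.0f}x attention work")
```

Under these assumptions, token count grows linearly with the number of frames while joint attention work grows quadratically, which is the general pressure that efficient video encoders try to relieve.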

Copy-paste prompts

Prompt 1

Summarise the LiteFrame paper's main idea for efficient video encoding in plain English

Prompt 2

Compare LiteFrame's claimed approach to existing Video LLM frame sampling tricks

Prompt 3

Draft a checklist of what I should test once LiteFrame's code and weights are released

Prompt 4

Explain why feeding many frames into a Video LLM is expensive and how LiteFrame plans to fix it

Frequently asked questions

What is liteframe?

Placeholder repo for the LiteFrame paper, which presents a vision encoder that helps Video LLMs scale to many frames. Code and weights are not released yet.

How hard is liteframe to set up?

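Setup difficulty is rated hard, with roughly 1 day+ to a first successful run. No runnable code has been released yet, so a first run depends on the authors publishing the code and weights.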

Who is liteframe for?

Mainly researchers.

Verify against the repo before relying on details.