Reproduce the paper's agent-driven video editing results once the code is released
Evaluate a new video editing model on the AgentEdit-Bench benchmark
Test object replacement, removal, style transfer, and reference insertion under one model
Study how a VLM agent rewrites vague user requests into a typed edit plan
Code is not yet released; README promises a late May 2026 drop and gives no install instructions or dependencies.
Aurora is a research project that will host the official code for a paper on agent-driven video editing. The README is a placeholder for now: the code has not been published yet, and the authors give an ETA of late May 2026. Links point to an arXiv paper and a project website.

The approach the README describes has two parts working together. The first is a vision-language model (VLM) agent that reads a raw user request and rewrites it into a typed edit plan with four fields: an instruction, a task label, an image-search query, and a mask phrase. The second is a unified video diffusion transformer that takes that plan and produces the edited clip. The agent calls outside tools to fill in gaps: it runs a web image search when the user did not supply a reference picture, and a grounded segmentation model to produce a mask when one is missing.

The editing tasks listed in the README cover four kinds of changes under a single set of model weights. Replacement swaps one object or element for another. Removal deletes an object from the clip. Style transfer changes the visual look of the footage. Reference-driven insertion adds something into the clip based on an example image.

The project also introduces a benchmark called AgentEdit-Bench, which evaluates this style of agent-enhanced video editing when the user request is underspecified, either in words or in supporting images. For example, a user might say 'put a red car here' without explaining what the car looks like or where exactly to place it.

The README is sparse beyond these points. It has no installation guide, no usage example, no mention of a license, and no listed dependencies, because the repository is in a pre-release state. Anyone interested in trying the system will need to wait for the planned code drop or read the paper for technical detail.
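
Because no code is available yet, the following is only an illustrative Python sketch of how the four-field edit plan and the tool fallbacks described above could be modeled. It is not the authors' implementation: the class names, field names, task label strings, and tool names are all hypothetical, chosen to mirror the README's prose.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class EditTask(Enum):
    """The four task kinds named in the README (label strings are hypothetical)."""
    REPLACEMENT = "replacement"
    REMOVAL = "removal"
    STYLE_TRANSFER = "style_transfer"
    REFERENCE_INSERTION = "reference_insertion"


@dataclass
class EditPlan:
    """Hypothetical typed edit plan with the four fields the README lists."""
    instruction: str                          # rewritten, explicit edit instruction
    task: EditTask                            # task label
    image_search_query: Optional[str] = None  # query for a reference image, if needed
    mask_phrase: Optional[str] = None         # phrase to ground into a mask, if needed


def tools_needed(plan: EditPlan, has_reference_image: bool, has_mask: bool) -> List[str]:
    """Return the (hypothetical) external tools required to fill gaps in the inputs."""
    tools: List[str] = []
    # Search the web only when a reference is wanted and the user supplied none.
    if plan.image_search_query and not has_reference_image:
        tools.append("web_image_search")
    # Run grounded segmentation only when a mask is wanted and none was supplied.
    if plan.mask_phrase and not has_mask:
        tools.append("grounded_segmentation")
    return tools


# Example: the underspecified request "put a red car here" with no image and no mask.
plan = EditPlan(
    instruction="Insert a red car onto the road in the foreground",
    task=EditTask.REFERENCE_INSERTION,
    image_search_query="red car",
    mask_phrase="road in the foreground",
)
required = tools_needed(plan, has_reference_image=False, has_mask=False)
```

In this sketch, `required` lists both tools because the user supplied neither a reference image nor a mask. A removal request that already comes with a mask would need no tool calls. The real plan schema and tool interface may differ once the code is released.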
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.