browser-use/video-use

★ 7,474 | Python | Audience · vibe coder | Complexity · 3/5 | Setup · moderate

Mindmap

mindmap
  root((video-use))
    What it does
      Chat-based editing
      Remove filler words
      Color grading
      Subtitle generation
    Tech stack
      Python
      ffmpeg
      ElevenLabs
    Use cases
      Interview cleanup
      Overlay generation
      Post-production
    Audience
      Vibe coders
      Content creators


Things people build with this

USE CASE 1

Remove filler words and dead air from a recorded interview automatically.

USE CASE 2

Apply color grading and burned-in subtitles to video clips by describing the style you want.

USE CASE 3

Generate animated overlays for a video using an AI sub-agent without touching a timeline editor.

USE CASE 4

Automate post-production for short-form content and get a self-evaluated finished file.

Tech stack

Python, ffmpeg, ElevenLabs

Getting it running

Difficulty · moderate | Time to first run · 30 min

Requires an ElevenLabs API key and ffmpeg installed locally before running the agent.
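A quick pre-flight check for the two prerequisites above can be sketched as follows. This is a minimal sketch, not part of video-use: the function name `missing_prerequisites` is hypothetical, and the environment variable name `ELEVENLABS_API_KEY` is an assumption, so check the repo for the name the tool actually reads.

```python
import os
import shutil


def missing_prerequisites(env=None, which=shutil.which):
    """Return the names of prerequisites that are not available.

    `env` and `which` are injectable so the check can be tested without
    touching the real environment. ELEVENLABS_API_KEY is an assumed
    variable name, not one confirmed by the project.
    """
    env = os.environ if env is None else env
    missing = []
    # ffmpeg must be reachable on PATH.
    if which("ffmpeg") is None:
        missing.append("ffmpeg")
    # The API key must be set to a non-empty value.
    if not env.get("ELEVENLABS_API_KEY"):
        missing.append("ELEVENLABS_API_KEY")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    print("Ready to run" if not problems else f"Missing: {', '.join(problems)}")
```

Running the script before starting an agent session shows which of the two pieces still needs to be installed or configured.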

In plain English

video-use is an open source tool that lets an AI coding agent edit raw video footage through a chat interface. You drop your video files into a folder, describe what you want in plain English, and the agent produces a finished edit as a single output file. It works with Claude Code and other agents that have shell access.

The tool handles several common post-production tasks automatically. It removes filler words like "um" and "uh" along with false starts and dead air between takes. It applies color grading to each segment, adds audio fades at every cut to prevent clicks, and burns in subtitles. It can also generate animated overlays using supported animation libraries, with each animation handled by a parallel sub-agent. After rendering, it runs a self-evaluation pass that checks every cut boundary for visual jumps or audio issues before showing you the result.

The AI never watches the video directly. Instead, it reads the video through two layers of structured data. The first is an audio transcript produced by ElevenLabs Scribe, which provides word-level timestamps, speaker identification, and audio event labels for every take. The second is an on-demand visual composite that shows a filmstrip, waveform, and word labels as a PNG image for any specific time range. This approach keeps the token cost low compared to feeding raw video frames to the model.

Setup requires an ElevenLabs API key, ffmpeg, and Python. The agent can handle the installation itself if you paste a provided setup prompt into your agent session. All output files are written to an edit subfolder next to your source footage, and the agent saves session notes to a project file so future sessions can continue from where the last one left off.
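To make the transcript-driven approach concrete, here is a minimal sketch of how word-level timestamps could be turned into a cut list, with a short audio fade per kept segment. It is an illustration under stated assumptions, not the project's implementation: the word format (`text`, `start`, `end` in seconds), the filler list, the 0.5 s gap threshold, the 30 ms fade length, and the function names `keep_ranges` and `fade_filter` are all hypothetical. The only real external interface used is ffmpeg's `afade` filter with its `t`, `st`, and `d` options.

```python
# Assumed filler vocabulary; a real tool would likely use a richer list.
FILLERS = {"um", "uh", "er", "hmm"}


def keep_ranges(words, max_gap=0.5):
    """Return (start, end) ranges to keep from a word-level transcript.

    A new range starts after every filler word and after any silence
    longer than `max_gap` seconds, so both are excluded from the output.
    """
    ranges = []
    cut_pending = True  # force a new range for the first kept word
    for word in words:
        token = word["text"].strip(".,!?").lower()
        if token in FILLERS:
            cut_pending = True
            continue
        if not cut_pending and word["start"] - ranges[-1][1] <= max_gap:
            ranges[-1][1] = word["end"]  # extend the current range
        else:
            ranges.append([word["start"], word["end"]])
        cut_pending = False
    return [(start, end) for start, end in ranges]


def fade_filter(duration, fade=0.03):
    """Build an ffmpeg audio filter string that fades a segment in and out."""
    fade_out_start = max(duration - fade, 0.0)
    return (
        f"afade=t=in:st=0:d={fade:.3f},"
        f"afade=t=out:st={fade_out_start:.3f}:d={fade:.3f}"
    )


if __name__ == "__main__":
    transcript = [
        {"text": "So", "start": 0.0, "end": 0.3},
        {"text": "um,", "start": 0.4, "end": 0.7},
        {"text": "we", "start": 0.8, "end": 1.0},
        {"text": "shipped", "start": 1.1, "end": 1.5},
        {"text": "it.", "start": 3.0, "end": 3.4},
    ]
    for start, end in keep_ranges(transcript):
        print(start, end, fade_filter(end - start))
```

Because the decisions are made on timestamps alone, the model only needs the transcript text to plan cuts, which is consistent with the low token cost described above.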

Copy-paste prompts

Prompt 1

Using video-use with Claude Code, remove all filler words and dead air from my interview recording at /footage/interview.mp4 and export the result to the edit folder.

Prompt 2

With video-use, add warm color grading and burned-in subtitles to my raw video clips in /raw. I will describe the cinematic style I want in plain English.

Prompt 3

Set up video-use to generate animated lower-third overlays for each speaker segment in my podcast recording using a parallel sub-agent.

Prompt 4

Use video-use's self-evaluation pass to check my edited video for visual jumps or audio clicks at cut boundaries and list any issues found.

Prompt 5

Configure video-use to trim a long recording down to highlight clips by describing the key moments I want to keep.


Verify against the repo before relying on details.