explaingit

hanshuaikang/ai-media2doc

3,768 · Vue · Audience · general · Complexity · 3/5 · License · Setup · moderate

TLDR

AI-Media2Doc is a self-hosted web app that uses AI transcription to convert video and audio files into formatted text: blog posts, meeting notes, mind maps, and subtitles. No account is needed, and task records are stored locally on your machine.

Mindmap

mindmap
  root((ai-media2doc))
    What it does
      Transcribes video
      Formats documents
      AI chat on content
    Tech Stack
      Vue frontend
      Python backend
      Docker Compose
      FFmpeg in browser
    Use Cases
      Meeting notes
      Social media posts
      Subtitle generation
    Audience
      Content creators
      Students
      Knowledge workers

Things people build with this

USE CASE 1

Turn a recorded meeting or lecture video into a formatted summary document automatically.

USE CASE 2

Convert a YouTube-style video into a Xiaohongshu or WeChat article for social media publishing.

USE CASE 3

Generate a subtitle file from any audio or video file without installing desktop software.

USE CASE 4

Ask follow-up questions about a video's content using the built-in AI chat mode after transcription.
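To make use case 3 concrete, here is a minimal sketch of how timed transcript segments could be rendered as a subtitle file. It assumes the SRT layout (numbered cues with `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines); the source does not say which subtitle format AI-Media2Doc writes. The `Segment` class and both function names are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span of speech (hypothetical structure)."""
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm, the SRT timestamp layout."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

Calling `to_srt` on a list of segments returns a single string that can be saved with an `.srt` extension and loaded by most video players.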

Tech stack

Vue · Python · Docker · FFmpeg

Getting it running

Difficulty · moderate · Time to first run · 30 min

Requires Docker Compose and API credentials for an AI model service (e.g. OpenAI); no other local software is needed.
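The two requirements above can be checked with a small preflight script before the containers start. This is only a sketch under stated assumptions: the setting names `MODEL_API_KEY` and `MODEL_BASE_URL` are placeholders invented for illustration, so check the project's own configuration file for the names it actually expects. The only real command used is `docker compose up -d`, which starts the services defined in the compose file in the background.

```python
import shutil
import subprocess

# Hypothetical setting names, used here for illustration only.
# The project's compose configuration defines the real ones.
REQUIRED_SETTINGS = ("MODEL_API_KEY", "MODEL_BASE_URL")


def missing_settings(env: dict[str, str], required=REQUIRED_SETTINGS) -> list[str]:
    """Return the required names that are absent or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


def start_stack(env: dict[str, str]) -> None:
    """Start the containers in the background if the preflight passes."""
    missing = missing_settings(env)
    if missing:
        raise SystemExit(f"Set these before starting: {', '.join(missing)}")
    if shutil.which("docker") is None:
        raise SystemExit("Docker is not installed or not on PATH")
    # Must run from the directory that holds the downloaded compose file.
    subprocess.run(["docker", "compose", "up", "-d"], check=True)
```

Running `start_stack(dict(os.environ))` (after `import os`) from the directory containing the compose file either reports what is missing or hands off to Docker Compose.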

Use, modify, and distribute freely for any purpose including commercial use, as long as you keep the copyright notice. MIT license.

In plain English

AI-Media2Doc is a web application that converts video and audio files into text documents using AI. You point it at a video or audio file, and it transcribes and formats the content into written form using whichever output style you select.

The tool offers several output styles for the generated documents. You can produce a post formatted for Xiaohongshu (a popular Chinese social media platform), a WeChat public account article, a knowledge note, a mind map, a content summary, or a plain subtitle file. This makes it useful for content creators, students, and anyone who wants to convert recorded material into readable notes or social media posts.

No account creation is required to use the tool. All task records are stored locally on your machine. The frontend handles audio extraction directly in the browser using a browser-compatible version of FFmpeg, so you do not need to install any additional local software beyond running the Docker container.

The project is designed for self-hosting. Deployment is done with Docker Compose: you download the configuration file, set your API credentials for an AI model service, and start the containers with one command. The frontend is built with Vue and the backend is written in Python, both running in separate containers.

Extra features include an AI chat mode that lets you ask follow-up questions about a video's transcribed content, a smart screenshot function that captures relevant frames from the video and inserts them at corresponding positions in the generated document, and support for writing custom prompts to change how the output reads. The project is open source under an MIT license.

Copy-paste prompts

Prompt 1
I have a 1-hour lecture video and I want AI-Media2Doc to turn it into a structured knowledge note with key points. Walk me through the Docker Compose setup and which output format to choose.
Prompt 2
How do I configure AI-Media2Doc to use my own OpenAI API key and point it at a local video file to produce a mind map output?
Prompt 3
I want to use AI-Media2Doc's smart screenshot feature so that relevant video frames appear inside the generated document. How does that work and how do I enable it?
Prompt 4
Show me how to write a custom prompt in AI-Media2Doc so that the output is formatted as a bullet-point meeting recap instead of the default style.

Verify against the repo before relying on details.