hkuds/rag-anything

Analysis updated 2026-05-18

★ 20,146 | Python | Audience · developer | Complexity · 4/5 | License | Setup · moderate

Mindmap

mindmap
  root((repo))
    What it does
      Ingests mixed-content documents
      Builds multimodal knowledge graphs
      Answers questions across all content
    Document types
      PDFs and Office files
      Images and diagrams
      Tables and charts
    Key features
      Vision-language model routing
      End-to-end processing
      Single query interface
    Use cases
      Academic papers
      Financial reports
      Technical documentation
    Tech stack
      Python
      LightRAG framework
      Vision-language models

What do people build with it?

USE CASE 1

Build a question-answering system for academic papers that extracts answers from text, figures, and equations together.

USE CASE 2

Create a financial document analyzer that answers questions by searching across tables, charts, and narrative text in reports.

USE CASE 3

Develop a technical documentation search tool that understands diagrams, code snippets, and explanatory text as a unified knowledge base.

What is it built with?

Python, LightRAG, Vision-language models, PyPI

How does it compare?

	hkuds/rag-anything	facebook/prophet	openai/gpt-oss
Stars	20,146	20,179	20,095
Language	Python	Python	Python
Setup difficulty	moderate	moderate	hard
Complexity	4/5	3/5	4/5
Audience	developer	data	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate | Time to first run · 30 min

Requires API keys for vision-language models and LightRAG configuration.
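
A pre-flight check for the required credentials might look like the sketch below. The variable names `LLM_API_KEY` and `VLM_API_KEY` are hypothetical placeholders, not settings confirmed by the project; check the repo's README for the names it actually reads.

```python
import os

# Hypothetical variable names for illustration only; the real project
# may read different settings.
REQUIRED_SETTINGS = ("LLM_API_KEY", "VLM_API_KEY")


def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the required names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

Running a check like this before the first ingest surfaces a missing key immediately, rather than partway through document processing.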

Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

RAG-Anything is an all-in-one Python framework for building question-answering systems that work with complex, mixed-content documents rather than plain text alone. RAG stands for Retrieval-Augmented Generation, a technique in which an AI model answers a question by first searching a document collection for relevant information, then using that context to generate the answer. Most RAG systems struggle with documents that contain images, charts, tables, or mathematical equations alongside text; RAG-Anything is designed to handle all of these content types together.

The framework processes documents end to end: it ingests PDFs, Office files, and images; parses them into their component parts (text, tables, figures, equations); builds a multimodal knowledge graph that captures the relationships between these elements; and lets users query across all of them through a single interface. It is built on top of LightRAG, another project from the same research group at the University of Hong Kong. A recent addition is VLM-Enhanced Query mode, which routes visual content through a vision-language model for deeper analysis when images are relevant to a query.

The system is aimed at research and enterprise scenarios where documents contain rich mixed content: academic papers with figures and equations, financial reports with charts and tables, or technical documentation with diagrams. A Python package called raganything is available on PyPI, and the project has an accompanying academic paper on arXiv (2510.12323).
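
To make the retrieve-then-generate flow concrete, here is a toy sketch in plain Python. It is not RAG-Anything's API: the `Element` class, the word-overlap scoring, and the `needs_vlm` check are all hypothetical stand-ins for the parsing, knowledge-graph retrieval, and vision-language routing that the real framework performs.

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Element:
    """One parsed piece of a document (all fields are illustrative)."""
    doc: str
    kind: str      # "text", "table", "figure", or "equation"
    content: str   # body text, or a caption/description for non-text items


def tokens(text):
    """Lowercase word set used for the toy overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, elements, top_k=2):
    """Rank elements of every kind by word overlap with the query.

    A stand-in for real retrieval; elements with no overlap are dropped.
    """
    q = tokens(query)
    scored = [(len(q & tokens(e.content)), e) for e in elements]
    scored = [(score, e) for score, e in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable on ties
    return [e for _, e in scored[:top_k]]


def needs_vlm(hits):
    """Assumed routing rule: involve a vision model if a figure was retrieved."""
    return any(e.kind == "figure" for e in hits)


def build_prompt(query, hits):
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(f"[{e.kind}] {e.content}" for e in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    elements = [
        Element("report.pdf", "text", "Revenue grew in the third quarter."),
        Element("report.pdf", "table", "Quarterly revenue by region in USD millions"),
        Element("report.pdf", "figure", "Bar chart of quarterly revenue by region"),
    ]
    question = "Which region had the highest quarterly revenue?"
    hits = retrieve(question, elements)
    print("Route through a vision model:", needs_vlm(hits))
    print(build_prompt(question, hits))
```

The point of the sketch is the single query path: text, tables, and figures sit in one collection and are ranked together, and the retrieved mix decides whether a vision model is involved.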

Copy-paste prompts

Prompt 1

How do I set up RAG-Anything to ingest a PDF with mixed text and images, then query it for answers?

Prompt 2

Show me how to use the VLM-Enhanced Query mode to route visual content through a vision-language model in RAG-Anything.

Prompt 3

I have a collection of financial reports with tables and charts. How would I build a question-answering system using RAG-Anything?

Prompt 4

What's the difference between RAG-Anything and standard RAG systems, and when should I use it?

Frequently asked questions

What is rag-anything?

A Python framework for building question-answering systems that handle complex documents containing text, images, tables, charts, and equations together.

What language is rag-anything written in?

Mainly Python. The stack also includes LightRAG and vision-language models.

What license does rag-anything use?

Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

How hard is rag-anything to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is rag-anything for?

Mainly developers.

Verify against the repo before relying on details.