Analysis updated 2026-05-18
Build a question-answering system for academic papers that extracts answers from text, figures, and equations together.
Create a financial document analyzer that answers questions by searching across tables, charts, and narrative text in reports.
Develop a technical documentation search tool that understands diagrams, code snippets, and explanatory text as a unified knowledge base.
| | hkuds/rag-anything | facebook/prophet | openai/gpt-oss |
|---|---|---|---|
| Stars | 20,146 | 20,179 | 20,095 |
| Language | Python | Python | Python |
| Setup difficulty | moderate | moderate | hard |
| Complexity | 4/5 | 3/5 | 4/5 |
| Audience | developer | data | developer |
Figures from each repo's GitHub metadata at analysis time.
Requires API keys for vision-language models and LightRAG configuration.
RAG-Anything is an all-in-one Python framework for building question-answering systems that work with complex, mixed-content documents, not just plain text. RAG stands for Retrieval-Augmented Generation, a technique where an AI model answers questions by first searching a document collection for relevant information, then using that context to generate an answer. Most RAG systems struggle with documents that contain images, charts, tables, or mathematical equations alongside text. RAG-Anything is designed specifically to handle all of these content types together. The framework processes documents end-to-end: it ingests PDFs, Office files, and images, parses them into their component parts (text, tables, figures, equations), builds a multimodal knowledge graph that captures relationships between these elements, and then lets users query across all of them through a single interface. It is built on top of LightRAG, another project from the same research group at the University of Hong Kong. A recent addition is VLM-Enhanced Query mode, which routes visual content through a vision-language model for deeper analysis when images are relevant to a query. The system is aimed at research and enterprise scenarios where documents contain rich mixed content: academic papers with figures and equations, financial reports with charts and tables, or technical documentation with diagrams. A Python package called raganything is available on PyPI, and the project has an accompanying academic paper on arXiv (2510.12323).
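The retrieve-then-generate idea described above can be illustrated with a toy sketch. This is not the raganything API: `ContentItem`, `retrieve`, and `build_prompt` are hypothetical names invented for illustration, and simple keyword overlap stands in for the knowledge-graph retrieval the framework actually performs. The sample document content is made up.

```python
import re
from dataclasses import dataclass


@dataclass
class ContentItem:
    """Hypothetical record for one parsed document element."""
    kind: str  # "text", "table", "figure", or "equation"
    body: str  # the text itself, or a textual description of the element


def tokenize(s: str) -> set[str]:
    """Lowercase the string and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def retrieve(query: str, items: list[ContentItem], top_k: int = 2) -> list[ContentItem]:
    """Rank items of any kind by keyword overlap with the query (toy stand-in)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(it.body)), it) for it in items]
    scored = [pair for pair in scored if pair[0] > 0]  # drop items with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort, best first
    return [it for _, it in scored[:top_k]]


def build_prompt(query: str, context: list[ContentItem]) -> str:
    """Assemble the retrieved context and the question into one prompt string."""
    lines = [f"[{it.kind}] {it.body}" for it in context]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {query}"


# Made-up sample elements, as if parsed from a single financial report.
items = [
    ContentItem("text", "Revenue grew in the third quarter due to strong cloud sales."),
    ContentItem("table", "Quarterly revenue table: Q1 10, Q2 12, Q3 15 (USD millions)."),
    ContentItem("figure", "Bar chart of headcount by department."),
    ContentItem("equation", "Growth rate g = (R_t - R_{t-1}) / R_{t-1}."),
]

query = "What was revenue in the third quarter?"
prompt = build_prompt(query, retrieve(query, items))
print(prompt)  # this string would then be passed to a language model
```

The point of the sketch is that text, tables, figures, and equations sit in one pool and are retrieved through a single query path; a real system would replace the overlap score with embedding or graph-based retrieval.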
Python framework for building question-answering systems that handle complex documents with text, images, tables, charts, and equations all together.
Mainly Python. The stack also includes LightRAG and vision-language models.
Use freely for any purpose including commercial, as long as you keep the copyright notice.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
Aimed mainly at developers.
Verify against the repo before relying on details.