Upload company documents and ask natural-language questions across all of them with automatic web search fallback
Build a private per-user knowledge base where each person's uploaded files are kept separate and searchable
Use the hallucination-check and retry loop as a reference for building self-correcting RAG pipelines in your own projects
Requires a running Qdrant vector database via Docker, an API key for DeepSeek or Qwen, and optionally MySQL and a Serper web search API key.
This repository contains a question-answering system built around Retrieval-Augmented Generation (RAG). Instead of relying purely on an AI model's internal knowledge, RAG pulls in relevant information from external sources at the time of each question, then combines that information with the model's reasoning to generate a more accurate answer. This project takes that idea several steps further by adding a multi-stage pipeline with quality checks and automatic retry loops.
When a user asks a question, the system runs it through seven stages. It first rewrites and expands the question to make it more searchable, then decides whether the question needs external information at all or can be answered directly. Complex questions are broken into smaller sub-questions. The system then searches four sources simultaneously: a vector database (which finds content based on meaning rather than exact words), a keyword search index, a relational database for structured data, and a live web search. Results are checked for relevance, and if they do not match well enough, the system retries with a refined query. The answer is generated from whatever was retrieved, and a final quality check then looks for hallucinations, logical errors, and missing information. If the answer fails that check, the whole process starts over from query rewriting.
The system supports multiple users, each with their own private knowledge base. A user can upload PDF, Word, Excel, PowerPoint, or text files, and the system automatically converts them into a searchable format. Conversation history is kept across sessions with a three-tier compression approach: recent messages are stored in full, while older ones are progressively summarized to save space.
The project is written in Python with FastAPI on the back end and React on the front end. The README is written in Chinese.
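The retrieve, check, and retry flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: every function name, the `max_attempts` cap, and the stub retrievers are assumptions made for the example, and the "answer directly" and sub-question stages are left out for brevity.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical type: a retriever takes a query and returns text snippets.
Retriever = Callable[[str], Awaitable[list[str]]]


async def retrieve_all(query: str, retrievers: dict[str, Retriever]) -> list[str]:
    """Query every source concurrently and merge the hits in source order."""
    results = await asyncio.gather(*(r(query) for r in retrievers.values()))
    return [hit for hits in results for hit in hits]


async def answer_with_retries(
    question: str,
    retrievers: dict[str, Retriever],
    rewrite: Callable[[str, int], str],
    is_relevant: Callable[[str, list[str]], bool],
    generate: Callable[[str, list[str]], str],
    passes_check: Callable[[str, list[str]], bool],
    max_attempts: int = 3,  # assumed cap; the source does not state one
) -> tuple[str, int]:
    """Return (answer, attempts_used); restart from rewriting on any failure."""
    answer = ""
    for attempt in range(1, max_attempts + 1):
        query = rewrite(question, attempt)
        docs = await retrieve_all(query, retrievers)
        if not is_relevant(query, docs):
            continue  # weak retrieval: try again with a refined query
        answer = generate(question, docs)
        if passes_check(answer, docs):
            return answer, attempt
    return answer, max_attempts  # best effort after exhausting retries


# Stub sources standing in for the vector and keyword indexes.
async def vector_search(query: str) -> list[str]:
    return [f"vector hit for {query!r}"]


async def keyword_search(query: str) -> list[str]:
    return [f"keyword hit for {query!r}"] if "refined" in query else []


def demo() -> tuple[str, int]:
    return asyncio.run(
        answer_with_retries(
            "What is our refund policy?",
            {"vector": vector_search, "keyword": keyword_search},
            rewrite=lambda q, n: q if n == 1 else f"{q} (refined x{n - 1})",
            is_relevant=lambda q, docs: len(docs) >= 2,
            generate=lambda q, docs: f"answer from {len(docs)} docs",
            passes_check=lambda answer, docs: True,
        )
    )


if __name__ == "__main__":
    print(demo())
```

In this sketch the first attempt fails the relevance check because only one stub source returns a hit, so the loop rewrites the query and succeeds on the second pass. Passing the stages in as callables keeps the control flow separate from any particular model or database client.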
Setup requires a running Qdrant vector database (available via Docker), an API key for a compatible AI model such as DeepSeek or Qwen, and optionally a MySQL database and a Serper web search API key. The README includes detailed setup instructions and a breakdown of every source file.
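As a rough illustration of the required versus optional dependencies listed above, the following sketch checks a set of environment variables before startup. The variable names are hypothetical and chosen only for this example; the repository's README defines the actual configuration keys.

```python
import os
from typing import Mapping

# Hypothetical setting names; consult the README for the real ones.
REQUIRED = ("QDRANT_URL", "LLM_API_KEY")       # vector database and model access
OPTIONAL = ("MYSQL_URL", "SERPER_API_KEY")     # structured data and web search


def check_settings(env: Mapping[str, str]) -> tuple[list[str], list[str]]:
    """Return (missing required settings, unset optional settings)."""
    missing = [name for name in REQUIRED if not env.get(name)]
    disabled = [name for name in OPTIONAL if not env.get(name)]
    return missing, disabled


if __name__ == "__main__":
    missing, disabled = check_settings(os.environ)
    if missing:
        raise SystemExit(f"Missing required settings: {', '.join(missing)}")
    if disabled:
        print(f"Optional features disabled: {', '.join(disabled)}")
```

Failing fast on the required settings while only reporting the optional ones mirrors the split in the setup notes: the system cannot run without Qdrant and a model key, but MySQL and web search are extras.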
Verify against the repo before relying on details.