netease-youdao/qanything

★ 13,979 · Python · Audience · general · Complexity · 3/5 · License · Apache 2.0 · Setup · moderate

Mindmap

mindmap
  root((QAnything))
    File Types
      PDF and Word
      Images and CSV
      Web links
    How It Works
      Two-stage retrieval
      Reranking step
      Qwen language model
    Privacy
      Fully offline mode
      Local processing
    Setup
      Docker install
      Python install
      Offline install

Things people build with this

USE CASE 1

Ask questions in plain English about a stack of PDF reports and get accurate answers drawn from the documents.

USE CASE 2

Build a private internal knowledge base that employees can query without sending company data to external servers.

USE CASE 3

Search across Excel sheets, Word files, and PowerPoint decks using natural-language questions instead of keyword search.

USE CASE 4

Create a bilingual Q&A system that answers in English or Chinese regardless of the language the source documents are in.

Tech stack

Python · Docker · Qwen · BCEmbedding

Getting it running

Difficulty · moderate Time to first run · 1h+

Requires Docker or a Python environment. GPU hardware is recommended for good performance, and an offline install requires downloading the models manually.

Use, modify, and distribute freely under the Apache 2.0 license, including in commercial products.

In plain English

QAnything is a question-answering system that works on top of your own documents. The name stands for Question and Answer based on Anything. You point it at files you have stored locally, ask questions in plain language, and it gives you answers drawn from the content of those files. It is built by NetEase Youdao and is open source under the Apache 2.0 license.

The README lists the file types it can read, including PDF, Word, PowerPoint, Excel, Markdown, email, plain text, images, CSV, and web links, with more planned. A main selling point is privacy: it can be installed and run with the network cable unplugged, so your documents never leave your machine. It also supports asking questions in Chinese or English no matter which language the document is written in, and it can search across several knowledge bases at once.

A technical section explains how it finds relevant text. It uses a two-step process: a first pass that gathers candidate passages, and a second pass, called reranking, that reorders them for accuracy. The authors argue that this two-stage method holds up well as the amount of data grows, whereas a single-step search would get worse. The retrieval relies on their own models, BCEmbedding for the first stage and a matching reranker for the second, and the README includes comparison tables showing how these score against other models. The answering itself is built on the open Qwen language model, fine-tuned on many question-answering datasets.

The rest of the README covers getting started: prerequisites, installation through a pure Python setup or Docker, offline install, an FAQ, and usage with an API. The full README is longer than the portion summarized here.
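The two-stage retrieval described above can be illustrated with a minimal sketch. This is not QAnything's code and does not call BCEmbedding: the functions `embed`, `retrieve`, `rerank_score`, and `two_stage_search` are hypothetical names, and the bag-of-words scoring is a toy stand-in for the real embedding and reranker models. It only shows the shape of the pipeline, a cheap first pass that narrows the pool followed by a second pass that reorders the survivors.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real systems use model-specific tokenizers.
    return text.lower().split()


def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words term-count vector.
    return Counter(tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, passages, top_k):
    # Stage 1: score every passage independently and keep the top_k candidates.
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


def rerank_score(query, passage):
    # Toy stand-in for a reranker: looks at query and passage together and
    # returns the fraction of query terms that the passage covers.
    q_terms = set(tokenize(query))
    p_terms = set(tokenize(passage))
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0


def two_stage_search(query, passages, recall_k=3, final_k=1):
    # Stage 2: reorder only the stage-1 candidates, then keep the best final_k.
    candidates = retrieve(query, passages, recall_k)
    reranked = sorted(
        candidates, key=lambda p: rerank_score(query, p), reverse=True
    )
    return reranked[:final_k]


passages = [
    "the reranker reorders candidate passages for accuracy",
    "docker install needs a gpu for good performance",
    "offline install needs manual model downloads",
    "embedding retrieval gathers candidate passages first",
]
query = "how does the reranker reorder candidate passages"
print(two_stage_search(query, passages))
```

In this sketch the second stage only ever sees `recall_k` passages, which is why a slower, more careful scorer is affordable there even when the full collection is large.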

Copy-paste prompts

Prompt 1

Install QAnything using Docker and set up a knowledge base from a folder of PDF files so I can ask questions about them in my browser.

Prompt 2

How does QAnything's two-stage retrieval work, and how do I tune the reranking step to get more accurate answers from large document collections?

Prompt 3

Show me how to use the QAnything API to send a question and get an answer with source citations from my uploaded documents.

Prompt 4

Set up QAnything in fully offline mode with no internet connection so my confidential company documents never leave the server.

Verify against the repo before relying on details.