Analysis updated 2026-05-18
Build a support chatbot for your product's help documentation that gives exact, verifiable answers without ever calling an AI at runtime.
Create an FAQ bot for a hotel, venue, or service business by feeding policy documents to the preparation script.
Add a deterministic Q&A layer to an internal knowledge base so employees get consistent answers to policy questions.
Replace a costly generative RAG pipeline with a retrieval-only system to eliminate per-query API costs.
| | emilresearch/ragless | captaingrock/krea2trainer | codenamekt/hexus |
|---|---|---|---|
| Stars | 7 | 7 | 7 |
| Language | Python | Python | Python |
| Setup difficulty | moderate | hard | moderate |
| Complexity | 3/5 | 4/5 | 3/5 |
| Audience | developer | designer | developer |
Figures from each repo's GitHub metadata at analysis time.
Requires a free Gemini API key and one LLM-powered ingestion run per document set before the chatbot is usable.
RAGless is a question-answering system for your own documents that works without calling an AI model at query time. Most documentation chatbots send each user question to a language model at runtime, which costs money, adds latency, and sometimes produces incorrect answers. RAGless takes a different approach: an AI model is used only once, during setup, to convert your documents into pre-written question-and-answer pairs, and all subsequent user queries are handled by fast local vector search with no API calls.
The workflow has three steps. First, you run a preparation script that reads your PDF, text, or Markdown files and uses the Gemini API to extract structured question-and-answer blocks from them. Each block contains the answer text, several ways a user might phrase the question, and a source quote. Second, an ingestion script turns those blocks into vector embeddings and loads them into a local Qdrant database stored on your disk. Third, users interact with a command-line chatbot that compares their question to the stored question vectors, sums the scores across the phrasing variants that belong to the same answer, and returns the best-matching pre-written answer verbatim.
Because answers are fixed at ingestion time, the system cannot hallucinate at runtime. For the same reason, it cannot generate novel answers or combine information from multiple sources into a single synthesized response. It works best for support documentation, FAQ content, or policy documents where there is a known, finite set of correct answers. The README includes a detailed comparison table showing where this approach beats classic retrieval-augmented generation and where it falls short.
Setup requires Python 3.10 or newer and a free Gemini API key from Google. There is no Docker requirement: Qdrant runs embedded directly on your machine. The full setup process is to install the Python dependencies, configure the API key in a .env file, place your documents in the source folder, and run the three scripts in sequence. The tool is aimed at developers who need a reliable, low-cost Q&A layer over a stable knowledge base and want deterministic behavior rather than generated responses.
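The query-time step described above (match the question against stored phrasing variants, sum the scores per answer, return the winning answer verbatim) can be illustrated with a small sketch. This is not the repo's code: `AnswerBlock`, `embed`, `cosine`, `best_answer`, and `top_k` are hypothetical names, and a toy bag-of-words vector stands in for the real embedding model and the Qdrant search so that the example runs with the standard library alone.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import sqrt


@dataclass
class AnswerBlock:
    """Hypothetical shape of one pre-written block: answer, phrasings, source quote."""
    answer_id: str
    answer: str
    questions: list[str]
    source_quote: str


def embed(text: str) -> dict[str, float]:
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    counts: dict[str, float] = defaultdict(float)
    for token in text.lower().split():
        counts[token.strip("?.,!")] += 1.0
    return dict(counts)


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_answer(query: str, blocks: list[AnswerBlock], top_k: int = 5):
    """Score every stored phrasing, keep the top_k hits, and sum scores per answer."""
    query_vec = embed(query)
    hits = sorted(
        (
            (cosine(query_vec, embed(variant)), block.answer_id)
            for block in blocks
            for variant in block.questions
        ),
        reverse=True,
    )[:top_k]

    # Several phrasings of the same answer reinforce each other.
    totals: dict[str, float] = defaultdict(float)
    for score, answer_id in hits:
        totals[answer_id] += score

    if not totals or max(totals.values()) <= 0.0:
        return None  # nothing matched; a real bot would show a fallback message
    winner = max(totals, key=totals.get)
    return next(block for block in blocks if block.answer_id == winner)


# Illustrative data only, not taken from the repo.
blocks = [
    AnswerBlock("checkout", "Checkout is at 11:00.",
                ["What time is checkout?", "When do I need to check out?", "Checkout time"],
                "Guests must check out by 11:00."),
    AnswerBlock("parking", "Parking is free for guests.",
                ["Is parking free?", "Where can I park?", "Parking cost"],
                "Complimentary parking is available to all guests."),
]

match = best_answer("what time is checkout", blocks)
print(match.answer if match else "No matching answer found.")
```

Summing over variants means an answer that matches the query through several phrasings outranks one that matches through a single phrasing, and the returned text is always one of the stored answers, never newly generated.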
A document Q&A system that uses AI only during setup to create answer blocks, then answers questions at runtime via fast local vector search with zero AI calls, zero hallucinations, and near-zero cost.
Mainly Python. The stack also includes Qdrant and the Gemini API.
No license information was mentioned in the README.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
Mainly developers.
Verify against the repo before relying on details.