cinnamon/kotaemon

Analysis updated 2026-05-18

★ 25,366 | Python | Audience · developer | Complexity · 3/5 | License | Setup · moderate

Mindmap

mindmap
  root((kotaemon))
    What it does
      Chat with documents
      AI-powered search
      Citation tracking
    How it works
      Hybrid retrieval
      Keyword search
      Vector search
      PDF viewer
    Features
      Multi-user login
      Document sharing
      Image support
      Table extraction
    Tech stack
      Python
      Gradio
      Docker
    AI providers
      OpenAI
      Azure
      Groq
      Ollama local
    Use cases
      Research analysis
      Legal review
      Report extraction


What do people build with it?

USE CASE 1

Upload a collection of research papers and ask questions to extract key findings without reading each one manually.

USE CASE 2

Build a legal document review system where lawyers can query contracts and regulations with cited answers.

USE CASE 3

Create an internal knowledge base where team members ask questions about company reports and get instant answers with source references.

USE CASE 4

Extract information from tables and images in PDFs by asking natural language questions instead of manual data entry.

What is it built with?

Python, Gradio, Docker, RAG, Vector search

How does it compare?

	cinnamon/kotaemon	keon/algorithms	black-forest-labs/flux
Stars	25,366	25,444	25,496
Language	Python	Python	Python
Setup difficulty	moderate	easy	hard
Complexity	3/5	1/5	4/5
Audience	developer	developer	researcher

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate | Time to first run · 30 min

Requires Docker setup and vector database initialization. An API key for an LLM service is likely needed.

Use freely for any purpose, including commercial use. Keep the license notice and state any changes you make; the license also includes a patent grant.

In plain English

kotaemon is an open-source, self-hosted chat interface that lets you have conversations with your own documents using AI. It removes the need to search manually through large collections of PDFs, reports, or other files. Instead, you upload your documents and ask questions in plain language, and the AI finds the relevant passages and answers you.

The technology behind it is called RAG, which stands for Retrieval-Augmented Generation. The AI does not rely on its training knowledge alone: it first searches your uploaded documents for relevant sections, then uses that retrieved content to generate a cited answer. kotaemon uses a hybrid retrieval approach that combines traditional keyword search with semantic (meaning-based) vector search to improve the quality of what it finds. Answers come with citations, and you can see exactly which passages were used, highlighted directly in a built-in PDF viewer.

The tool supports multiple AI providers, including OpenAI, Azure, Groq, and locally run models via Ollama, and it handles images, tables, and complex multi-step questions. It has a multi-user login system, supports private and shared document collections, and is built on Gradio (a Python framework for building web UIs). Running it with Docker is the easiest setup.

You would use kotaemon if you are a researcher, lawyer, analyst, or any other knowledge worker who needs to quickly extract information from large document collections. It is written mainly in Python.
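To make the hybrid retrieval idea concrete, here is a minimal, self-contained sketch in plain Python. It is not kotaemon's code or API. Every name in it (`keyword_scores`, `toy_embed`, `hybrid_search`, `PASSAGES`) is hypothetical, the character-trigram "embedding" is a toy stand-in for a real embedding model, and reciprocal rank fusion is just one common way to merge two rankings, assumed here for illustration.

```python
import math
from collections import Counter


def words(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [w.strip(".,;:!?") for w in text.lower().split()]


def keyword_scores(query, passages):
    """Keyword signal: how often the query's words occur in each passage."""
    terms = set(words(query))
    return [sum(Counter(words(p))[t] for t in terms) for p in passages]


def toy_embed(text):
    """Toy stand-in for an embedding model: character-trigram counts."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_scores(query, passages):
    """Vector signal: similarity between the query and each passage vector."""
    q = toy_embed(query)
    return [cosine(q, toy_embed(p)) for p in passages]


def ranks(scores):
    """Map passage index to its 1-based rank, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return {idx: pos + 1 for pos, idx in enumerate(order)}


def hybrid_search(query, passages, top_k=2, k=60):
    """Merge both rankings with reciprocal rank fusion (an assumed choice)."""
    kw = ranks(keyword_scores(query, passages))
    vec = ranks(vector_scores(query, passages))
    fused = {i: 1 / (k + kw[i]) + 1 / (k + vec[i]) for i in range(len(passages))}
    best = sorted(fused, key=lambda i: -fused[i])[:top_k]
    # Return the passage index with the text so an answer can cite its source.
    return [(i, passages[i]) for i in best]


PASSAGES = [
    "Revenue grew 12 percent in the third quarter.",
    "The contract terminates after thirty days of written notice.",
    "Employees may work remotely two days per week.",
]

results = hybrid_search("contract termination notice", PASSAGES, top_k=1)
for index, text in results:
    print(f"[source {index}] {text}")
```

In a full RAG pipeline, the passages returned here would be handed to the language model together with the question, and the source indices would back the citations shown next to the answer.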

Copy-paste prompts

Prompt 1

How do I set up kotaemon with Docker to chat with my own PDF documents using OpenAI?

Prompt 2

Show me how to configure kotaemon to use Ollama for local AI models instead of cloud providers.

Prompt 3

How can I enable multi-user access and document sharing in kotaemon so my team can collaborate on document analysis?

Prompt 4

What's the difference between keyword search and vector search in kotaemon's hybrid retrieval, and how do I tune it for better results?

Prompt 5

How do I extract tables and images from PDFs using kotaemon's chat interface?

Frequently asked questions

What is kotaemon?

A self-hosted chat interface that lets you ask questions about your own documents using AI, with answers backed by citations from the source material.

What language is kotaemon written in?

Mainly Python. The stack also includes Gradio and Docker.

What license does kotaemon use?

Use freely for any purpose, including commercial use. Keep the license notice and state any changes you make; the license also includes a patent grant.

How hard is kotaemon to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is kotaemon for?

Mainly developers.


Verify against the repo before relying on details.