explaingit

owshq-mec/ws-1-llama-index-rag

11 | Python | Audience · developer | Complexity · 4/5 | Setup · moderate

TLDR

A workshop demo project showing how to build an AI question-answering system that retrieves answers from three databases at once: a SQL database, a vector search store, and a graph database.

Mindmap

mindmap
  root((DataOps Knowledge Hub))
    What it does
      RAG question answering
      Multi-database retrieval
    Databases
      PostgreSQL SQL queries
      Qdrant vector search
      Neo4j graph queries
    Tech Stack
      LlamaIndex orchestration
      FastAPI API layer
      Docker services
      Pydantic validation
    Learning Goal
      Workshop exercise
      RAG patterns demo
      Multi-source AI
    Setup
      Copy env file
      Add OpenAI key
      Docker Compose up

Things people build with this

USE CASE 1

Learn how to build a RAG system that translates natural-language questions into SQL and queries a relational database.

USE CASE 2

Experiment with combining vector search, graph traversal, and SQL in a single AI answer pipeline.

USE CASE 3

Use this as a starting template for a question-answering app that retrieves from multiple storage backends at once.

Tech stack

Python · LlamaIndex · PostgreSQL · Qdrant · Neo4j · FastAPI · Docker

Getting it running

Difficulty · moderate | Time to first run · 30 min

Requires an OpenAI API key, plus Docker to start the PostgreSQL, Qdrant, and Neo4j services together.

In plain English

This repository contains a demonstration project called DataOps Knowledge Hub, a system that answers questions by pulling information from multiple types of databases at once. The technique is called RAG (Retrieval-Augmented Generation): an AI language model is paired with a retrieval layer so it can look up real data before answering, rather than relying only on what it learned during training.

The system connects to three different storage backends. A PostgreSQL relational database handles factual and transactional queries using a text-to-SQL approach, where natural-language questions are translated into database queries automatically. A Qdrant vector database handles semantic search over documents and logs, finding content that is conceptually similar to a question even when exact keywords do not match. A Neo4j graph database handles relationship and lineage queries, which are questions about how entities connect to each other.

The technology stack includes LlamaIndex for the retrieval orchestration, Pydantic for data validation, FastAPI for the API layer, and Docker to run all the services together. Setup involves copying an environment file, adding an OpenAI API key, and running one command to start everything.

The project is labeled as Workshop 1 of a training program called AIDE Brasil Formation, so it is primarily a learning exercise demonstrating how to combine these tools rather than a finished product. The README is brief and does not go into deeper usage details.
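To make the three-backend split concrete, here is a minimal sketch of how a question could be routed to one backend before retrieval. It is not taken from the repository: the keyword lists, the `route` and `answer` functions, and the stub retrievers are all hypothetical, and the real project may route through LlamaIndex or an LLM call instead of keyword matching.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical backend labels; the real project may name or route these differently.
SQL, VECTOR, GRAPH = "postgresql", "qdrant", "neo4j"

# Illustrative keyword hints only (an assumption, not the project's logic).
ROUTE_HINTS: Dict[str, List[str]] = {
    SQL: ["how many", "total", "count", "average", "orders"],
    GRAPH: ["depends on", "lineage", "upstream", "downstream", "connected"],
}


def route(question: str) -> str:
    """Pick a backend by keyword match, falling back to vector search."""
    q = question.lower()
    for backend, hints in ROUTE_HINTS.items():
        if any(hint in q for hint in hints):
            return backend
    return VECTOR


@dataclass
class Retrieved:
    backend: str
    context: str


def answer(question: str, retrievers: Dict[str, Callable[[str], str]]) -> Retrieved:
    """Route the question, then call the matching retriever for context."""
    backend = route(question)
    return Retrieved(backend, retrievers[backend](question))


# Stub retrievers standing in for real database clients.
stub_retrievers: Dict[str, Callable[[str], str]] = {
    SQL: lambda q: "rows from a generated SQL query",
    VECTOR: lambda q: "documents nearest to the question embedding",
    GRAPH: lambda q: "nodes and edges from a graph traversal",
}
```

Calling `answer("Show lineage for table X", stub_retrievers)` selects the graph stub, while a question with no matching hint falls through to the vector stub. In a full RAG pipeline the returned context would then be passed to the language model along with the question.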

Copy-paste prompts

Prompt 1
Using this LlamaIndex RAG project, write a query that asks which customers ordered product X and routes it to PostgreSQL with text-to-SQL.
Prompt 2
Show me how to add a new document to the Qdrant vector store in this project so the AI can answer semantic questions about it.
Prompt 3
Walk me through how the DataOps Knowledge Hub decides which of the three databases to query for a given user question.
Prompt 4
Help me extend this project to add a fourth retrieval source, a simple in-memory FAQ list alongside the three existing databases.

Verify against the repo before relying on details.