explaingit

liuhuanyong/qasystemonmedicalkg

7,267 · Python · Audience: researcher · Complexity: 4/5 · Setup: hard

TLDR

A Python project that builds a Chinese medical knowledge graph of 44,000 entities and 300,000 relationships in Neo4j, then uses it to answer natural-language health questions without a large AI model.

Mindmap

mindmap
  root((Medical KG QA))
    Knowledge graph
      44000 entities
      300000 relationships
      Neo4j storage
    Entity types
      Diseases
      Drugs
      Symptoms
      Foods
    QA system
      Question classification
      Cypher query generation
      Natural language input
    Audience
      Researchers
      NLP students

Things people build with this

USE CASE 1

Build a question-answering system that looks up which drugs treat a disease or which symptoms might indicate a condition.

USE CASE 2

Study how to connect a Neo4j graph database to a natural-language question classifier for structured medical queries.

USE CASE 3

Use the knowledge graph as a dataset for Chinese medical NLP research or as a foundation for a medical chatbot.

Tech stack

Python, Neo4j

Getting it running

Difficulty · hard
Time to first run · 1 day+

Requires a locally running Neo4j instance and a multi-hour data loading step before the chatbot is ready to use.
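The loading step amounts to writing every entity and every relationship into Neo4j. The sketch below shows one way such write statements could be generated as parameterized Cypher. It is not the project's own loader: the labels `Disease` and `Drug`, the relationship type `TREATED_BY`, the helper names, and the sample values are all assumptions made for illustration, and the code only builds query strings without connecting to a database.

```python
def _check_identifier(value):
    """Labels and relationship types cannot be passed as Cypher parameters,
    so they are validated before being placed into the query text."""
    if not value.isidentifier():
        raise ValueError(f"unsafe Cypher identifier: {value!r}")
    return value


def node_statement(label, name):
    """Build a MERGE statement that creates one entity node if it is missing."""
    label = _check_identifier(label)
    return f"MERGE (n:{label} {{name: $name}})", {"name": name}


def relationship_statement(start_label, end_label, rel_type, start_name, end_name):
    """Build a statement that links two existing nodes with one relationship."""
    start_label = _check_identifier(start_label)
    end_label = _check_identifier(end_label)
    rel_type = _check_identifier(rel_type)
    query = (
        f"MATCH (a:{start_label} {{name: $start}}), (b:{end_label} {{name: $end}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )
    return query, {"start": start_name, "end": end_name}


# Hypothetical sample data, not taken from the project's dataset.
statements = [
    node_statement("Disease", "diabetes"),
    node_statement("Drug", "metformin"),
    relationship_statement("Disease", "Drug", "TREATED_BY", "diabetes", "metformin"),
]
```

Each `(query, parameters)` pair could then be handed to whichever Neo4j client is in use. Using `MERGE` rather than `CREATE` keeps a rerun from duplicating nodes, which matters when a long load is interrupted partway through.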

In plain English

This project builds a medical knowledge graph focused on diseases, then uses that graph to power a question-and-answer chatbot. The entire pipeline, from collecting raw data to answering user questions, is built from scratch in Python. The source data comes from Chinese medical websites, and the project is documented primarily in Chinese.

The knowledge graph contains roughly 44,000 medical entities organized into seven categories: diseases, symptoms, drugs, foods, diagnostic checks, hospital departments, and commercially available drug products. These entities are connected by about 300,000 relationships, capturing things like which drugs are commonly prescribed for a given disease, which foods are recommended or should be avoided, which tests are needed for diagnosis, which diseases frequently appear together, and which department a disease belongs to. All of this is stored in Neo4j, a graph database designed specifically for representing connections between things.

The question-answering system sits on top of the graph. When a user types a question in natural language, the system classifies what type of question it is, then translates it into a graph database query to look up the answer. The supported question types cover most practical medical lookup needs: what are the symptoms of a disease, what might cause a given symptom, what should a patient eat or avoid, what drugs treat a condition, what tests diagnose a condition, how long does treatment typically take, and what is the cure rate.

To run the project, you need a Neo4j database running locally and the appropriate Python dependencies installed. You then run the graph-building script to load all the data, which takes several hours due to the volume. After that, a chat script starts the question-answering interface.

This is a tutorial and research project, not a production medical service. It demonstrates how knowledge graphs can be combined with natural-language question classification to answer structured queries without requiring a large language model.
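The classify-then-query flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: the English keyword lists, the question-type names, the Cypher templates, the relationship types (`HAS_SYMPTOM`, `TREATED_BY`, `NEEDS_CHECK`), and the tiny disease vocabulary are all hypothetical, and the real project works with Chinese-language questions.

```python
# Hypothetical cue words for each question type (checked in this order).
QUESTION_KEYWORDS = {
    "disease_symptom": ["symptom", "sign"],
    "disease_drug": ["drug", "medicine", "medication"],
    "disease_check": ["test", "check", "diagnose"],
}

# Hypothetical Cypher templates, one per question type.
CYPHER_TEMPLATES = {
    "disease_symptom": (
        "MATCH (d:Disease {name: $name})-[:HAS_SYMPTOM]->(s:Symptom) RETURN s.name"
    ),
    "disease_drug": (
        "MATCH (d:Disease {name: $name})-[:TREATED_BY]->(m:Drug) RETURN m.name"
    ),
    "disease_check": (
        "MATCH (d:Disease {name: $name})-[:NEEDS_CHECK]->(c:Check) RETURN c.name"
    ),
}

# Hypothetical entity vocabulary; a real system would draw this from the graph's entities.
KNOWN_DISEASES = {"diabetes", "asthma"}


def classify(question):
    """Return (question_type, disease) or None if the question is not understood."""
    text = question.lower()
    disease = next((d for d in sorted(KNOWN_DISEASES) if d in text), None)
    if disease is None:
        return None
    for question_type, keywords in QUESTION_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return question_type, disease
    return None


def build_query(question):
    """Translate a question into (cypher, parameters), or None if unsupported."""
    result = classify(question)
    if result is None:
        return None
    question_type, disease = result
    return CYPHER_TEMPLATES[question_type], {"name": disease}
```

For example, `build_query("What are the symptoms of diabetes?")` returns the `disease_symptom` template together with `{"name": "diabetes"}`, while a question that mentions no known disease returns `None`. Passing the disease name as a query parameter instead of formatting it into the string keeps user input out of the Cypher text.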

Copy-paste prompts

Prompt 1
How does this medical knowledge graph QA system classify a user's natural-language question into a query type before searching Neo4j?
Prompt 2
I want to add a new relationship type to the medical knowledge graph, such as drug interactions. Which files and data structures do I need to modify?
Prompt 3
Show me how the system translates a question like 'what are the symptoms of diabetes' into a Cypher query for Neo4j.
Prompt 4
What Python dependencies do I need to install to run this project, and what is the order of steps to load the graph data and start the chatbot?

Verify against the repo before relying on details.