explaingit

apurvthapa/natural-language-to-mysql-query-fastapi

0PythonAudience · developerComplexity · 3/5ActiveSetup · moderate

TLDR

FastAPI service that turns plain-English questions about a shipping database into validated read-only MySQL queries via an LLM and returns results as JSON.

Mindmap

mindmap
  root((nl2sql-shipping))
    Inputs
      English questions
      MySQL schema
      OpenAI API key
    Outputs
      Generated SQL
      JSON result rows
    Use Cases
      Ask a shipping DB in English
      Block unsafe AI-written SQL
      Demo LLM SQL pipelines
    Tech Stack
      FastAPI
      Python
      MySQL
      LangChain
      OpenAI

Things people build with this

USE CASE 1

Let non-SQL business users query a shipments database in plain English

USE CASE 2

Study a safe natural-language-to-SQL pipeline with intent routing and validation

USE CASE 3

Build a read-only analytics API over a MySQL schema you already have

USE CASE 4

Demo SQL injection and prompt injection filters on top of an LLM SQL generator

Tech stack

FastAPIPythonMySQLLangChainOpenAISQLAlchemyPandas

Getting it running

Difficulty · moderate Time to first run · 30min

Needs a running MySQL with the three shipping tables loaded and an OpenAI API key before any query works.

In plain English

This project is a small web service that lets a person ask questions about a shipping database in plain English, and have the answer come back as structured data. The README frames the problem this way: business users often need to look up facts in a database but do not know SQL, the language databases speak. The service translates a question like "top delayed routes" or "most active destination ports" into the matching SQL query, runs it against a MySQL database of shipments and ports, and returns the results. The pipeline behind the scenes has several steps. When a request comes in, an intent classifier first identifies which tables in the database are likely relevant. The schema for those tables is then loaded as context. A large language model (such as one from OpenAI) generates a SQL query based on the question and the schema. The query goes through a validation layer that checks it is safe to run, then it is executed against a read-only connection to the database, and the result is returned as JSON. The database is built around three connected tables. The Shipments table is the central record of movement, with route, cargo type, status, delay, and departure and arrival timestamps. The Ports table stores port names, countries, and regions, and is used to group results by location. The Ships table holds vessel names, types, operator companies (with examples like Maersk, MSC, and COSCO), origin country, and capacity measured in TEU (a standard container unit). The three tables are linked by foreign keys so the API can join them automatically. The README puts emphasis on safety, because letting an AI write database queries is a risk. Only SELECT queries (read-only ones) are allowed, common SQL injection patterns are blocked, prompt injection attempts are filtered, and SQL comments and multi-statement queries are rejected. The database connection itself is read-only as a second line of defence. The tech stack is FastAPI for the web server, Python as the language, MySQL as the database, LangChain to orchestrate the language model calls, the OpenAI API as the model, and SQLAlchemy with Pandas for data handling. The backend is deployed on Render, a hosting platform. The README lists planned future work, including query caching, a visualisation dashboard, SQL explanation support, and streaming responses.

Copy-paste prompts

Prompt 1
Walk me through wiring this FastAPI service to my own MySQL schema with three joined tables
Prompt 2
Explain the intent classifier step and show how it picks which tables get loaded as context
Prompt 3
Show me the SQL validation layer that blocks injection and rejects multi-statement queries
Prompt 4
Add a query cache so repeated questions reuse the last generated SQL and result set
Prompt 5
Swap the OpenAI provider for a local model through LangChain and keep the safety filters intact
Open on GitHub → Explain another repo

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.