stanfordnlp/dspy

Analysis updated 2026-06-20

★ 34,238 | Python | Audience · developer | Complexity · 3/5 | Setup · moderate

Mindmap

mindmap
  root((DSPy))
    Core Concepts
      Signatures
      Modules
      Optimizers
    What it does
      Auto-tunes prompts
      Chains LLM steps
      Fine-tunes weights
    Supported Models
      OpenAI GPT-4
      Claude
      Open-source models
    Use Cases
      RAG systems
      Multi-step reasoning
      QA pipelines
      Classifiers

What do people build with it?

USE CASE 1

Build a question-answering pipeline where prompts are automatically tuned to maximize accuracy instead of being hand-crafted.

USE CASE 2

Create a retrieval-augmented generation system where each step (retrieval, reasoning, answer) is a composable Python module.

USE CASE 3

Replace fragile hand-written prompts in a multi-step LLM pipeline with automatically optimized ones that adapt to model changes.

What is it built with?

Python, OpenAI, Anthropic

How does it compare?

	stanfordnlp/dspy	zhulinsen/daily_stock_analysis	python-poetry/poetry
Stars	34,238	34,255	34,273
Language	Python	Python	Python
Setup difficulty	moderate	moderate	easy
Complexity	3/5	3/5	2/5
Audience	developer	data	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate | Time to first run · 1h+

Requires an LLM API key and labeled examples for the optimizer to tune prompts effectively.

In plain English

DSPy is a Python framework from Stanford NLP that changes how you build applications powered by large language models (LLMs) such as GPT-4, Claude, or open-source models. The central idea is to replace hand-written prompts with structured Python code, then let DSPy automatically optimize those prompts, or even fine-tune model weights, to maximize performance on your specific task.

The conventional approach to LLM development relies on prompts that are fragile and labor-intensive. A carefully crafted prompt that works well on one task may break with a different model, a slightly different question, or a new version of the same model. Developers end up spending enormous effort tweaking prompt wording rather than building better systems. DSPy instead treats the prompt as a hyperparameter: something to be tuned automatically rather than crafted by hand.

It works through "signatures" and "modules." A signature is a short, declarative description of what an LLM call should do (for example, "given a question and context, produce an answer"). Modules are composable building blocks you chain together in Python code to build multi-step pipelines, such as a retrieval step followed by a reasoning step followed by an answer-generation step. Once your pipeline is written, DSPy's optimizers analyze examples of correct outputs and iteratively refine the prompts and example demonstrations that each module uses, essentially teaching the model what good outputs look like for your specific task.

You would use DSPy when building complex LLM systems, such as question-answering pipelines, retrieval-augmented generation (RAG) systems, multi-step reasoning agents, or classifiers, where prompt quality significantly affects results and you want a principled, automated way to improve them.

The tech stack is Python, installable via pip. It supports multiple LLM backends, including OpenAI, Anthropic, and local models.
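To make the signature, module, and optimizer ideas concrete, here is a dependency-free toy sketch. It does not use DSPy's API: `ToySignature`, `ToyPredict`, `fake_lm`, and `optimize_demos` are hypothetical names invented for this illustration, `fake_lm` is a deterministic stub standing in for a real model, and the brute-force search over demonstration subsets is only a stand-in for whatever strategy DSPy's real optimizers use. Treat it as a rough picture of "the prompt is a tunable parameter", not as how the library is implemented.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import combinations
from typing import Callable


@dataclass(frozen=True)
class ToySignature:
    """Declarative description of one LLM call: named inputs -> one named output."""
    instructions: str
    inputs: tuple[str, ...]
    output: str


@dataclass
class ToyPredict:
    """A module: renders a prompt from its signature and demos, then calls the model."""
    signature: ToySignature
    lm: Callable[[str], str]
    demos: list[dict] = field(default_factory=list)

    def render(self, **kwargs: str) -> str:
        sig = self.signature
        blocks = [sig.instructions]
        for demo in self.demos:  # each demo becomes one worked example in the prompt
            rows = [f"{name}: {demo[name]}" for name in sig.inputs]
            rows.append(f"{sig.output}: {demo[sig.output]}")
            blocks.append("\n".join(rows))
        query = [f"{name}: {kwargs[name]}" for name in sig.inputs]
        query.append(f"{sig.output}:")  # left blank for the model to complete
        blocks.append("\n".join(query))
        return "\n\n".join(blocks)

    def __call__(self, **kwargs: str) -> str:
        return self.lm(self.render(**kwargs))


def _parse(block: str) -> dict:
    """Turn 'name: value' lines back into a dict (lines without a value are skipped)."""
    return dict(line.split(": ", 1) for line in block.splitlines() if ": " in line)


def fake_lm(prompt: str) -> str:
    """Stub model (assumption): copies the label of the demo sharing most words with the query."""
    *demo_blocks, query_block = prompt.split("\n\n")[1:]  # drop the instructions block
    query_words = set(_parse(query_block)["text"].lower().split())
    best_label, best_overlap = "unknown", 0
    for block in demo_blocks:
        demo = _parse(block)
        overlap = len(query_words & set(demo["text"].lower().split()))
        if overlap > best_overlap:
            best_label, best_overlap = demo["label"], overlap
    return best_label


def accuracy(module: ToyPredict, examples: list[dict]) -> float:
    """Fraction of examples where the module's output matches the labeled output."""
    sig = module.signature
    hits = sum(
        module(**{name: ex[name] for name in sig.inputs}) == ex[sig.output]
        for ex in examples
    )
    return hits / len(examples)


def optimize_demos(module: ToyPredict, trainset: list[dict], devset: list[dict], k: int = 2) -> float:
    """Try every k-subset of trainset as demos; keep the first subset with the best dev score."""
    best_demos, best_score = list(module.demos), accuracy(module, devset)
    for candidate in combinations(trainset, k):
        module.demos = list(candidate)
        score = accuracy(module, devset)
        if score > best_score:
            best_demos, best_score = list(candidate), score
    module.demos = best_demos
    return best_score


SENTIMENT = ToySignature("Classify the sentiment of the text.", ("text",), "label")
TRAINSET = [
    {"text": "great battery life", "label": "positive"},
    {"text": "screen cracked quickly", "label": "negative"},
    {"text": "great screen", "label": "positive"},
    {"text": "battery died quickly", "label": "negative"},
]
DEVSET = [
    {"text": "battery drained quickly", "label": "negative"},
    {"text": "great camera", "label": "positive"},
    {"text": "cracked quickly", "label": "negative"},
]

if __name__ == "__main__":
    classify = ToyPredict(SENTIMENT, fake_lm)
    print("accuracy before tuning:", accuracy(classify, DEVSET))
    print("accuracy after tuning:", optimize_demos(classify, TRAINSET, DEVSET))
    print(classify.render(text="great keyboard"))
```

The pipeline code never mentions prompt wording: the signature says what goes in and out, the module owns the rendering, and the optimizer changes only the demonstrations, guided by labeled examples and a metric. Swapping `fake_lm` for a different callable leaves the rest untouched, which mirrors the claim above that the backend can change without rewriting the pipeline.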

Copy-paste prompts

Prompt 1

Write a DSPy Signature and module for a QA task: given a question and retrieved context, produce a 1-sentence answer. Then show how to compile it using 50 labeled examples with BootstrapFewShot.

Prompt 2

Build a DSPy pipeline that retrieves documents from a vector store, reasons over them with chain-of-thought, then generates a final answer. Show the full Python code with all three modules.

Prompt 3

Show me how to use DSPy's BootstrapFewShot optimizer to automatically select few-shot demonstrations for each of 3 chained modules using labeled training examples.

Prompt 4

Convert my existing OpenAI prompt template for a text classification task into a DSPy Signature and module, then switch the backend to a local model without changing the pipeline.

Frequently asked questions

What is dspy?

DSPy is a Python framework from Stanford NLP that lets you build LLM-powered apps using structured Python code instead of hand-crafted prompts, then automatically optimizes those prompts to maximize performance on your specific task.

What language is dspy written in?

Mainly Python. The stack also includes OpenAI and Anthropic.

How hard is dspy to set up?

Setup difficulty is rated moderate, with an hour or more to a first successful run.

Who is dspy for?

Mainly developers.

Verify against the repo before relying on details.