explaingit

edualvarado/obstero

Analysis updated 2026-05-18

Stars · 2 | Language · Python | Audience · researcher | Complexity · 3/5 | License · MIT | Setup · moderate

TLDR

A three-stage Python pipeline that classifies your Zotero papers with Claude, generates structured research summaries from PDFs, and syncs them as interlinked Markdown notes into your Obsidian vault.

Mindmap

mindmap
  root((Obstero))
    Pipeline stages
      Classify papers
      Summarize PDFs
      Sync to Obsidian
    What Claude does
      Assigns folders and tags
      Writes research notes
    Output
      Obsidian Markdown notes
      Wiki-links between concepts
      Zotero tracking tags
    Setup
      Zotero API key
      Anthropic API key
      Config JSON files

What do people build with it?

USE CASE 1

Automatically classify and tag hundreds of unread Zotero papers into the right folders without reviewing each one manually.

USE CASE 2

Generate AI-written research summaries for every PDF in your Zotero library and save them as structured Obsidian notes.

USE CASE 3

Build a connected Obsidian knowledge graph from your Zotero library where concepts auto-link across papers.

USE CASE 4

Keep your Obsidian vault in sync with new Zotero additions by running the full pipeline with a single command.
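The auto-linking mentioned in use case 3 can be pictured with a short sketch. This is not taken from the Obstero source: the helper name `link_concepts` and the matching rule (whole-word, case-insensitive, skipping text already inside `[[...]]`) are assumptions made for illustration.

```python
import re


def link_concepts(text: str, concepts: list[str]) -> str:
    """Wrap whole-word mentions of each concept in Obsidian [[wiki-links]].

    Hypothetical helper, not Obstero's actual implementation.
    """
    if not concepts:
        return text
    # Try longer concepts first so a multi-word concept wins over a shorter one.
    ordered = sorted(concepts, key=len, reverse=True)
    alternatives = "|".join(re.escape(c) for c in ordered)
    # First alternative consumes existing wiki-links so they are left alone;
    # second alternative captures a whole-word concept mention.
    pattern = re.compile(
        r"\[\[[^\]]*\]\]|\b(" + alternatives + r")\b", re.IGNORECASE
    )

    def replace(match: re.Match) -> str:
        concept = match.group(1)
        if concept is None:  # matched an existing [[link]]
            return match.group(0)
        return f"[[{concept}]]"

    return pattern.sub(replace, text)
```

Calling `link_concepts("Transformers use attention.", ["attention"])` wraps the single mention, while a sentence that already contains `[[attention]]` comes back unchanged, so running the function twice on the same note does not nest links.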

What is it built with?

Python · Zotero API · Claude API · Obsidian

How does it compare?

Repo | edualvarado/obstero | 0-bingwu-0/live-interpreter | 0xkaz/llm-governance-dashboard
Stars | 2 | 2 | 2
Language | Python | Python | Python
Setup difficulty | moderate | moderate | hard
Complexity | 3/5 | 2/5 | 4/5
Audience | researcher | general | ops devops

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate | Time to first run · 1h+

Requires a Zotero API key, an Anthropic API key, and generating two personal config JSON files before the first run.
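A small preflight check along these lines can catch a missing prerequisite before the first run. It is only a sketch under stated assumptions: `ANTHROPIC_API_KEY` is the variable the Anthropic Python SDK reads by default, `ZOTERO_API_KEY` is a guessed name, and the two config files are assumed to be the `collections.json` and `primitives.json` mentioned elsewhere on this page, sitting in one directory. Check the repo for the real names and locations.

```python
from pathlib import Path

# Assumed names; the repo's own .env template is the authority.
REQUIRED_ENV = ("ZOTERO_API_KEY", "ANTHROPIC_API_KEY")
REQUIRED_FILES = ("collections.json", "primitives.json")


def missing_prerequisites(env, config_dir) -> list[str]:
    """Return the names of required env vars and config files that are absent.

    `env` is any mapping of environment variables (for example os.environ);
    `config_dir` is the directory expected to hold the config JSON files.
    """
    # An unset or empty variable counts as missing.
    missing = [name for name in REQUIRED_ENV if not env.get(name)]
    # A config file counts as missing unless it exists as a regular file.
    missing += [
        name for name in REQUIRED_FILES
        if not (Path(config_dir) / name).is_file()
    ]
    return missing
```

For example, `missing_prerequisites(os.environ, ".")` returns an empty list when both keys are set and both files exist in the current directory, and otherwise lists what still needs to be created.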

Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

Obstero is a set of Python scripts that connects two research tools: Zotero, a reference manager that stores papers and PDFs, and Obsidian, a note-taking app that links notes together in a graph. The pipeline takes what is in your Zotero library and turns it into a structured, interlinked collection of Obsidian notes, with Claude doing the AI work in the middle.

The process runs in three stages. The first stage classifies unclassified Zotero items: it sends each paper's metadata to Claude, which decides what folder the item belongs in and assigns three to five tags. The second stage extracts text from each PDF and asks Claude to write a structured research note covering the paper's core contribution, technical approach, limitations, and potential research or startup angles. That note gets attached to the item in Zotero. The third stage converts the summarized Zotero items into Markdown files in your Obsidian vault, mirroring the folder structure from Zotero and automatically inserting wiki-style links to concepts you have defined.

Safety is built into the default behavior: every stage runs in dry-run mode unless you explicitly pass a --live flag. Each stage also tracks which items it has already processed using internal Zotero tags, so running a stage multiple times never reprocesses the same paper.

Setup requires a Zotero account with an API key, an Anthropic API key for Claude, and an existing Obsidian vault folder. You also create two personal configuration files: one that maps your Zotero folder names to their internal IDs, and one that lists the concepts you want auto-linked in Obsidian. Both can be generated automatically by helper scripts included in the project.

You can run each stage separately or use an orchestrator script that runs all three in order. If you use Claude Code, the project also includes Claude Code skills that wrap each stage as a named command you can invoke directly from your coding assistant. The project is MIT-licensed.
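The dry-run default and the tag-based tracking described above can be sketched in a few lines. Everything here is illustrative and not copied from the repo: the `Item` class, the `run_stage` function, and the tag name `obstero/summarized` are hypothetical, and an in-memory list stands in for the Zotero library. Only the `--live` flag comes from the description.

```python
import argparse
from dataclasses import dataclass, field

PROCESSED_TAG = "obstero/summarized"  # hypothetical tracking tag name


@dataclass
class Item:
    """Stand-in for a Zotero item: a title plus its set of tags."""
    title: str
    tags: set[str] = field(default_factory=set)


def run_stage(items: list[Item], live: bool) -> list[str]:
    """Return titles this run handled, or would handle in dry-run mode."""
    handled = []
    for item in items:
        if PROCESSED_TAG in item.tags:
            continue  # already processed on an earlier live run
        handled.append(item.title)
        if live:
            # A real stage would call Claude and write back to Zotero here;
            # the sketch only records the tracking tag.
            item.tags.add(PROCESSED_TAG)
    return handled


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line; without --live the stage stays in dry-run."""
    parser = argparse.ArgumentParser(description="Dry-run unless --live is given.")
    parser.add_argument("--live", action="store_true", help="apply changes")
    return parser.parse_args(argv)
```

With a library of `[Item("Paper A"), Item("Paper B", {PROCESSED_TAG})]`, a dry run reports only "Paper A" and changes nothing, so it can be repeated freely. A live run tags "Paper A", and a second live run then finds nothing left to do.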

Copy-paste prompts

Prompt 1
Set up Obstero: help me fill in my .env file, generate collections.json from my Zotero account, and run Stage 1 in dry-run mode to preview classification.
Prompt 2
I have 50 new papers in Zotero's Unclassified folder. Run Obstero Stage 1 --live to classify them, then Stage 2 --live to summarize up to 50 papers.
Prompt 3
Run discover_primitives.py to find my most-used Obsidian wiki-links and add them to primitives.json so Stage 3 can auto-link them.
Prompt 4
I want to customize the Claude prompt in src/llm_api.py to also extract funding sources from each paper. Show me where to edit the summarize_paper function.
Prompt 5
Run the full Obstero pipeline on my Zotero library in dry-run mode first so I can review what it will change before going live.

Frequently asked questions

What is obstero?

A three-stage Python pipeline that classifies your Zotero papers with Claude, generates structured research summaries from PDFs, and syncs them as interlinked Markdown notes into your Obsidian vault.

What language is obstero written in?

Mainly Python. The stack also includes the Zotero API, the Claude API, and Obsidian.

What license does obstero use?

The MIT license: use freely for any purpose, including commercial use, as long as you keep the copyright notice.

How hard is obstero to set up?

Setup difficulty is rated moderate, with an hour or more to a first successful run.

Who is obstero for?

Mainly researchers.

Verify against the repo before relying on details.