Generate a multi-report literature synthesis for any PubMed-indexed condition
Run an arbiter-reconciled extraction over thousands of abstracts in about an hour
Build a Supabase-backed corpus with quote-level provenance for medical claims
Produce Vancouver-cited PDF briefs for pharma due diligence
Needs Anthropic API keys plus a Supabase project with pgvector, and a single run costs roughly 85 to 100 USD in model calls.
This project is an automated pipeline that reads large numbers of medical research papers and produces three written reports: one for researchers, one for pharmaceutical companies evaluating drug targets, and one short summary for non-specialists. It is described as disease-agnostic, meaning it works for any condition that can be searched in PubMed Central. The main demonstration runs on Long COVID, with about 4,666 papers processed in roughly one hour, and the README notes it has also been tested on Narcolepsy and Prostatic Neoplasms.
The pipeline collects papers from PubMed Central and medRxiv, then uses Anthropic's Claude Haiku model to read each abstract and extract the study design, sample size, headline finding, and a confidence score into a structured form. It then ranks papers by a weighted score, downloads the full text of the top-ranked ones from PubMed Central's open-access set, and runs a deeper extraction using Claude Sonnet. The deep extraction grades each paper on several standard scales used in evidence-based medicine, records effect sizes, and stores at least five literal quotes from the paper as evidence for any claim it makes.
The headline change in version 3.0 is that two separate model runs read each paper at different temperature settings, and a third model run reconciles them. The README says this removes a bias where the model anchors on its first reading. Other additions include splitting papers by their XML section tags so that the discussion and limitations sections are not cut off, linking extracted concepts to the UMLS and MeSH medical vocabularies, computing Cohen's Kappa agreement against human ratings, and labelling each section of the final report as model inference, deterministic calculation, or arbiter consensus.
Results are stored in Supabase tables for papers, extractions, provenance quotes, and contradictions, with pgvector embeddings.
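The Cohen's Kappa check mentioned above compares the model's labels with human ratings while correcting for agreement expected by chance. The following is a minimal sketch of that standard statistic, not the repository's code: the function name `cohens_kappa`, the label values, and the handling of the degenerate single-label case are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(model_labels, human_labels):
    """Cohen's Kappa for two raters labelling the same items.

    Hypothetical helper: the name and inputs are illustrative and are
    not taken from the project.
    """
    if len(model_labels) != len(human_labels) or not model_labels:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(model_labels)

    # Observed agreement: fraction of items where both raters agree.
    observed = sum(m == h for m, h in zip(model_labels, human_labels)) / n

    # Chance agreement: for each category, the product of the two raters'
    # marginal proportions, summed over all categories.
    model_counts = Counter(model_labels)
    human_counts = Counter(human_labels)
    categories = model_counts.keys() | human_counts.keys()
    expected = sum(model_counts[c] * human_counts[c] for c in categories) / (n * n)

    # Assumption: when both raters use one identical label for every item,
    # kappa is undefined (0/0); this sketch reports perfect agreement.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Example with made-up study-design labels for four papers.
model = ["RCT", "RCT", "cohort", "cohort"]
human = ["RCT", "RCT", "cohort", "RCT"]
kappa = cohens_kappa(model, human)
```

In this example the raters agree on three of four papers (0.75 observed) against 0.5 expected by chance, so kappa is 0.5.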
The system then pools effect sizes across studies, runs sensitivity checks, and produces Markdown reports that are converted to HTML and PDF with Vancouver-style citations. The README also describes a Flask user interface that runs a separate worker process so the page stays responsive during long runs, and notes a per-run cost of around 85 to 100 US dollars.
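The summary does not say which pooling model or sensitivity checks the project uses. As an illustration of what pooling effect sizes and running a sensitivity check can look like, the sketch below assumes an inverse-variance fixed-effect model and a leave-one-out check; both function names and the input numbers are hypothetical.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling (an assumed method, not
    confirmed by the source). Returns (pooled_effect, pooled_std_error)."""
    if not effects or len(effects) != len(std_errors):
        raise ValueError("need one standard error per effect size")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    # Each study is weighted by the inverse of its variance, so more
    # precise studies contribute more to the pooled estimate.
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    return pooled, math.sqrt(1.0 / total)


def leave_one_out(effects, std_errors):
    """Re-pool with each study removed in turn to see how much any
    single study drives the result. Needs at least two studies."""
    pooled_without = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_errors = std_errors[:i] + std_errors[i + 1:]
        pooled_without.append(pool_fixed_effect(rest_effects, rest_errors)[0])
    return pooled_without


# Made-up inputs: two studies, the second four times as heavily weighted.
effects = [1.0, 2.0]
std_errors = [1.0, 0.5]
pooled, pooled_se = pool_fixed_effect(effects, std_errors)
sensitivity = leave_one_out(effects, std_errors)
```

With these inputs the weights are 1 and 4, so the pooled effect is 1.8. A random-effects model would be the usual alternative when studies are heterogeneous; whether the project uses one is not stated here.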
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.