Extract named entities from medical or legal text and receive structured Pydantic objects instead of raw strings
Classify customer feedback into custom categories using any LLM by changing one model name string
Define a custom Pydantic schema for any extraction task and let Promptify handle prompting and response parsing
Benchmark multiple LLMs or prompt strategies on your specific NER or classification task using built-in evaluation metrics
Requires Python 3.9+ and an API key for your chosen LLM provider (OpenAI, Anthropic, or a local Ollama instance).
Promptify is a Python library that makes it straightforward to use large language models for common text processing tasks without writing custom prompts or parsing code. You pick a task, point it at a language model, pass in some text, and get back a structured Python object instead of a raw string. The README describes it as something like scikit-learn for language model powered text work.

The built-in tasks cover a range of standard text analysis needs: named entity recognition (identifying names, conditions, dates, and similar items in text), text classification into categories you define, question answering given a passage, summarization, relation extraction, table extraction, SQL generation from natural language, and topic modeling. Each task returns a typed Pydantic object, so the result has predictable fields with type checking rather than freeform text that needs further parsing.

The library connects to language models through a backend called LiteLLM, which means you can swap between providers by changing a single model name string. The same NER class works with OpenAI models, Anthropic models, or locally running models through Ollama. Batch processing and async calls are both supported for handling multiple inputs efficiently.

For custom work beyond the built-in tasks, you define a Pydantic schema describing the shape of output you want and pass it to a generic Task class. The library handles turning that schema into a prompt and parsing the model's response back into your defined structure.

An evaluation framework is included that measures precision, recall, F1, accuracy, and other metrics against labeled test data, making it possible to compare models or prompt strategies on a specific task. Cost tracking is also built in to monitor token usage across calls.

The project is released under the Apache 2.0 license and requires Python 3.9 or newer.
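To make the schema-driven workflow concrete, here is a minimal sketch of the general idea: describe the output with a Pydantic model, embed its JSON schema in a prompt, and validate the model's reply back into that model. It does not use Promptify's own classes, whose exact signatures should be checked in the repo. It assumes Pydantic v2, and the names `ClinicalEntity`, `ExtractionResult`, `build_prompt`, and `parse_response` are hypothetical, as is the canned reply standing in for a real model call.

```python
import json
from typing import List, Literal

from pydantic import BaseModel, ValidationError


class ClinicalEntity(BaseModel):
    """One extracted span and its label (hypothetical schema)."""
    text: str
    label: Literal["CONDITION", "MEDICATION", "DATE"]


class ExtractionResult(BaseModel):
    """Top-level shape we want back from the model."""
    entities: List[ClinicalEntity]


def build_prompt(text: str) -> str:
    """Embed the JSON schema in the instructions sent to the model."""
    schema = json.dumps(ExtractionResult.model_json_schema(), indent=2)
    return (
        "Extract entities from the text below. "
        "Reply only with JSON matching this schema:\n"
        f"{schema}\n\nText: {text}"
    )


def parse_response(raw: str) -> ExtractionResult:
    """Validate the raw reply; raises ValidationError on a bad shape."""
    return ExtractionResult.model_validate_json(raw)


prompt = build_prompt("Patient has hypertension.")

# A canned reply stands in for a real LLM call (assumption for this sketch).
fake_reply = '{"entities": [{"text": "hypertension", "label": "CONDITION"}]}'
result = parse_response(fake_reply)
print(result.entities[0].text, result.entities[0].label)

# A label outside the allowed set is rejected instead of passing through.
try:
    parse_response('{"entities": [{"text": "x", "label": "PERSON"}]}')
    rejected = False
except ValidationError:
    rejected = True
print("invalid label rejected:", rejected)
```

Validating against a typed model is what turns a malformed or off-schema reply into an explicit error rather than a silently wrong string.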
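The evaluation metrics mentioned above can be illustrated with a short, standard-library-only sketch of exact-match precision, recall, and F1 over extracted entities. This is not Promptify's evaluation code. The function name `score_entities` and the sample data are hypothetical, and real NER scoring often also considers span offsets or partial matches.

```python
from typing import Dict, Set, Tuple

# An entity is compared as an exact (text, label) pair.
EntityPair = Tuple[str, str]


def score_entities(predicted: Set[EntityPair], gold: Set[EntityPair]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 for one document."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical labeled data and model output.
gold = {
    ("hypertension", "CONDITION"),
    ("lisinopril", "MEDICATION"),
    ("2021", "DATE"),
}
predicted = {
    ("hypertension", "CONDITION"),   # correct
    ("lisinopril", "CONDITION"),     # right text, wrong label: counts as a miss
}

scores = score_entities(predicted, gold)
print(scores)
```

Running the same scoring function over the outputs of two models, or two prompt variants, on the same labeled set gives a like-for-like comparison.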
Verify against the repo before relying on details.