Build chatbots that always respond in a specific format (JSON, structured fields, multiple choice).
Create multi-step AI workflows where each step validates its output before moving to the next.
Generate form responses or API payloads that must match exact schemas without manual cleanup.
Run complex reasoning tasks locally with guaranteed output structure using open-source models.
Running models locally requires installing the Transformers or llama.cpp dependencies and downloading a model file.
Guidance is a Python library for controlling what a large language model produces. With ordinary prompting, you send a prompt and hope the model returns something in the shape you wanted; Guidance lets you describe that shape directly in code, so the output is guaranteed to fit. The README pitches this as a way to get higher-quality, structured output while reducing latency and cost compared with conventional prompting or fine-tuning. You write a program in regular Python that builds up a conversation with the model using context managers for system, user, and assistant turns, and the library streams generation back into the program, where you can capture it by name.

The key part is the constraint system. A generation call can be limited by a regular expression, so for instance you can force the model's reply to be only digits when you want a number. There is a select function that forces the answer to be one of a list of choices you supply in advance, which is useful for multiple-choice questions or any time the valid answers are known. More generally, the constraints can describe any context-free grammar, so you can compose small generation functions into a larger grammar. A Mock model lets you validate a grammar locally without hitting any real model API.

The library is installed from PyPI and supports several backends, including Transformers, llama.cpp, and OpenAI. You would reach for Guidance when you are building something on top of a language model and need its output to be reliably parseable or to follow a strict format: for example, structured data extraction, picking from a fixed set of options, or workflows that interleave control flow with generation.
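To make the idea of forcing an answer to be one of a fixed set of choices more concrete, here is a minimal, standard-library-only sketch of choice-constrained decoding. It is a toy illustration of the general technique, not Guidance's API or implementation: `constrained_select`, `toy_scorer`, and the character-level "model" are hypothetical names invented for this example, and a real library would work on model tokens and logits rather than on single characters.

```python
# Toy illustration of choice-constrained decoding. This is NOT Guidance's
# API or implementation; every name here is hypothetical.
from typing import Callable, List

Scorer = Callable[[str, str], float]  # (text so far, candidate char) -> score


def constrained_select(score: Scorer, options: List[str]) -> str:
    """Greedily build one of `options`, one character at a time."""
    if not options:
        raise ValueError("options must not be empty")
    out = ""
    while out not in options:
        # Only characters that keep `out` a prefix of some option are allowed.
        allowed = sorted({o[len(out)] for o in options
                          if o.startswith(out) and len(o) > len(out)})
        # The "model" ranks the allowed characters; everything else is masked.
        out += max(allowed, key=lambda ch: score(out, ch))
    return out


def toy_scorer(preferred: str) -> Scorer:
    """Stand-in for a model that wants to emit `preferred` verbatim."""
    def score(prefix: str, ch: str) -> float:
        i = len(prefix)
        return 1.0 if i < len(preferred) and preferred[i] == ch else 0.0
    return score


if __name__ == "__main__":
    # The stand-in model would like to say "nope", which is not a valid option.
    print(constrained_select(toy_scorer("nope"), ["yes", "no"]))
```

Because disallowed characters are never candidates, the result is always one of the supplied options, whatever the stand-in model prefers. The sketch stops as soon as the text equals an option, so it assumes no option is a prefix of another.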
Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.