explaingit

2arons/llm-cli

11 · Python · Audience · developer · Complexity · 2/5 · Active · License · Setup · easy

TLDR

A small Python CLI for sending prompts to LLMs from the terminal, with model picking, streaming, stdin piping, and OpenAI, Anthropic, or Ollama backends.

Mindmap

mindmap
  root((llm-cli))
    Inputs
      Prompt string
      Stdin pipe
      System prompt
    Outputs
      Streaming text
      Final answer
    Use Cases
      Quick shell prompts
      Pipe files to LLMs
      Test local models
    Tech Stack
      Python
      pip
      OpenAI
      Anthropic
      Ollama

Things people build with this

USE CASE 1

Send a one-shot prompt to GPT-4 from your shell

USE CASE 2

Pipe a file into the CLI to summarize or transform it

USE CASE 3

Stream Claude responses token by token in the terminal

USE CASE 4

Talk to local llama2 or mistral via Ollama without leaving the prompt
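
The piping use case above depends on the tool accepting a prompt either as an argument or from standard input. The following is a minimal sketch of that pattern, not llm-cli's actual code: `read_prompt` is a hypothetical helper, and the rule that arguments take priority over stdin is an assumption.

```python
import sys


def read_prompt(argv, stdin):
    """Return the prompt from CLI arguments, else from standard input.

    Hypothetical helper for illustration; llm-cli's real logic may differ.
    """
    if argv:
        # Assumption: words given on the command line form the prompt.
        return " ".join(argv)
    # read() returns at end of input: the end of a piped file,
    # or Ctrl D when the text is typed interactively.
    return stdin.read().strip()


if __name__ == "__main__":
    print(read_prompt(sys.argv[1:], sys.stdin))
```

Passing the argument list and the input stream as parameters keeps the function easy to test without touching the real terminal.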

Tech stack

Python · pip · OpenAI · Anthropic · Ollama

Getting it running

Difficulty · easy · Time to first run · 5 min

You still need to provide an API key for hosted models, or run Ollama locally.

MIT: a very permissive license that lets you reuse the code for almost anything as long as you keep the copyright notice.

In plain English

This project is a small command-line tool, written in Python, that lets you send prompts to large language models without leaving your terminal. You install it with pip install llm-cli, then call it as llm-cli followed by a quoted prompt. The README shows examples such as asking it to explain recursion in Python, write a haiku about debugging, generate test data, or simply answer what 2 plus 2 is.

The tool offers a set of flags for changing how the request is made. You can pick a model with the model flag, adjust randomness with temperature on a 0.0 to 2.0 scale, cap reply length with max tokens, stream the answer token by token, read the prompt from standard input by piping a file in, or use a multi-line input mode that ends on Ctrl D. A system flag sets a system prompt, and the API key can be passed as a flag or read from the LLM_API_KEY environment variable.

Defaults can be set in two ways. Three environment variables, LLM_API_KEY, LLM_DEFAULT_MODEL, and LLM_API_BASE, control credentials, the default model, and the base URL of the API. The same values, along with temperature and max tokens, can also live in a JSON config file at ~/.llm-cli/config.json.

The README lists three model families it claims to support: OpenAI models gpt-4 and gpt-3.5-turbo, Anthropic models claude-3-opus and claude-3-sonnet, and local models served through Ollama, namely llama2 and mistral. The project is released under the MIT license, and the README says pull requests are welcome but asks contributors to open an issue first for big changes.
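
To make the two sources of defaults concrete, here is a minimal sketch of how environment variables and the JSON config file could be merged. It is not taken from the llm-cli source. The JSON key names (`api_key`, `default_model`, `api_base`) and the function `load_settings` are hypothetical, and the precedence (config file wins over environment variables) is an assumption, so check the repo for the real schema and order.

```python
import json
import os
from pathlib import Path

# Hypothetical mapping from the documented environment variables to
# config keys; the real key names are not shown in this summary.
ENV_TO_KEY = {
    "LLM_API_KEY": "api_key",
    "LLM_DEFAULT_MODEL": "default_model",
    "LLM_API_BASE": "api_base",
}


def load_settings(env=None, config_path=None):
    """Merge defaults from environment variables and a JSON config file."""
    env = os.environ if env is None else env
    if config_path is None:
        config_path = Path.home() / ".llm-cli" / "config.json"
    path = Path(config_path)

    # Start from whichever environment variables are set and non-empty.
    settings = {key: env[var] for var, key in ENV_TO_KEY.items() if env.get(var)}

    # Assumption: values in the config file override environment variables.
    if path.is_file():
        settings.update(json.loads(path.read_text(encoding="utf-8")))
    return settings
```

Calling `load_settings()` with no arguments reads the real environment and `~/.llm-cli/config.json`; passing a dictionary and a path lets you try the merge without touching either.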

Copy-paste prompts

Prompt 1
Show me how to install llm-cli and run my first prompt against gpt-3.5-turbo
Prompt 2
Walk me through setting LLM_API_KEY, LLM_DEFAULT_MODEL, and LLM_API_BASE
Prompt 3
Help me build a shell function that pipes file contents into llm-cli with a system prompt
Prompt 4
Explain how the JSON config at ~/.llm-cli/config.json overrides env vars
Prompt 5
Generate a multi-line prompt session example using the Ctrl D input mode

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.