explaingit

saif658/llmstats

Analysis updated 2026-05-18

Stars · 1 | Language · Python | Audience · developer | Complexity · 2/5 | License · MIT | Setup · easy

TLDR

A zero-infrastructure tool that benchmarks 47 free AI models from OpenRouter, Groq, and Mistral every 3 hours via GitHub Actions and publishes a live comparison dashboard to GitHub Pages.

Mindmap

mindmap
  root((LLMstats))
    What It Does
      Benchmarks 47 free models
      Runs every 3 hours
    Providers
      OpenRouter 23 models
      Groq 10 models
      Mistral 14 models
    Dashboard Views
      Overview leaderboard
      Explorer compare
      Timeline runs
    Setup
      Fork add API keys
      Enable GitHub Pages
    Tech
      Python runner
      GitHub Actions Pages

What do people build with it?

USE CASE 1

Fork the repo to track how free AI model speeds and reliability change over time without managing any servers.

USE CASE 2

Use the leaderboard to pick the fastest free model for your current prototype before committing to a paid API.

USE CASE 3

Compare two specific models head-to-head to decide which one to use in a side project.

What is it built with?

Python, GitHub Actions, GitHub Pages, HTML, JavaScript

How does it compare?

                   saif658/llmstats   a-bissell/unleash-lite   abhiinnovates/whatsapp-hr-assistant
Stars              1                  1                        1
Language           Python             Python                   Python
Setup difficulty   easy               hard                     hard
Complexity         2/5                4/5                      3/5
Audience           developer          researcher               developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy | Time to first run · 5 min

Requires three free API keys (OpenRouter, Groq, Mistral) added as GitHub repository secrets before the first run.
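
Since a missing secret is a likely reason for an empty first run, a small pre-flight check can help. The following is a minimal sketch, not code from the repository: the environment variable names and the `missing_keys` helper are assumptions chosen for illustration, and the repo's workflow may expose the secrets under different names.

```python
import os

# Hypothetical secret names; check the repo's workflow file for the real ones.
REQUIRED_KEYS = ("OPENROUTER_API_KEY", "GROQ_API_KEY", "MISTRAL_API_KEY")


def missing_keys(env=None):
    """Return the names of required keys that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

Calling `missing_keys()` with no argument inspects the current process environment, which is where GitHub Actions places secrets that a workflow step maps into `env`. An empty list means all three keys are present.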

MIT license, use freely for any purpose, including commercial, as long as you keep the copyright notice.

In plain English

LLMstats is a tool that automatically tests and compares 47 AI language models from three free-tier providers, then publishes the results to a live public dashboard that refreshes roughly every three hours. The providers covered are OpenRouter (which routes to models from OpenAI, Meta, NVIDIA, Qwen, Google, and others), Groq, and Mistral.

The entire system runs on GitHub infrastructure at no cost. A scheduled job fires every three hours, sends test requests to each of the 47 models, records how fast they respond and whether they succeed, then saves the results. A companion job builds a static website from those results and publishes it to GitHub Pages. There is no server to manage or database to maintain.

The live dashboard has five views. The Overview shows summary cards and success-rate trend charts. The Leaderboard ranks models by a composite score and lets you sort by speed, throughput, or reliability, with a chip showing which provider each model comes from. The Explorer lets you drill into a single model to see its response-time history and error breakdown. The Timeline shows the history of each three-hour run. The Compare view lets you pick two models and see them side by side.

Anyone can fork the repository and run the same dashboard for their own data. Setup takes about five minutes: fork the repo, add three free API keys as GitHub repository secrets, enable GitHub Pages, and trigger the first benchmark manually. From that point the cron job keeps the data current automatically.

The project uses Python for the benchmark runner and plain HTML plus JavaScript for the dashboard. The architecture was originally inspired by a similar project called NIMStats, rebuilt to cover multiple providers side by side.
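
The benchmark step described above (send a test request, record latency and success, then aggregate) can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's runner: `Result`, `run_once`, `summarise`, and the injected `send` callable are hypothetical names, and the ranking here uses plain median latency, not the dashboard's composite score, whose formula this page does not describe.

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    model: str
    ok: bool
    latency_s: Optional[float]  # None when the request failed


def run_once(model: str, send: Callable[[str], object]) -> Result:
    """Time one test request; `send` stands in for a real provider call."""
    start = time.perf_counter()
    try:
        send(model)
    except Exception:
        return Result(model, ok=False, latency_s=None)
    return Result(model, ok=True, latency_s=time.perf_counter() - start)


def summarise(results: list[Result]) -> list[dict]:
    """Aggregate per-model success rate and median latency, fastest first."""
    by_model: dict[str, list[Result]] = {}
    for r in results:
        by_model.setdefault(r.model, []).append(r)
    rows = []
    for model, runs in by_model.items():
        latencies = [r.latency_s for r in runs if r.ok]
        rows.append({
            "model": model,
            "success_rate": sum(r.ok for r in runs) / len(runs),
            "median_latency_s": statistics.median(latencies) if latencies else None,
        })
    # Models with no successful run sort last.
    rows.sort(key=lambda row: (row["median_latency_s"] is None,
                               row["median_latency_s"] or 0.0))
    return rows
```

Passing the provider call in as `send` keeps the timing logic independent of any particular API client, so the same loop could cover all three providers. The rows returned by `summarise` are plain dictionaries, which makes them easy to write out as JSON for a static dashboard to read.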

Copy-paste prompts

Prompt 1
I forked LLMstats and added my API keys but the dashboard is showing no data after the first run. Walk me through debugging the GitHub Actions workflow step by step.
Prompt 2
I want to add a new model from Groq to the LLMstats benchmark. Show me exactly which files to edit and what to change.
Prompt 3
Explain what the Composite Score in the LLMstats leaderboard measures and how I should use it to choose a model for my app.
Prompt 4
I want to fork LLMstats and run it only for Mistral models. How do I remove the OpenRouter and Groq jobs from the workflow?

Frequently asked questions

What is llmstats?

A zero-infrastructure tool that benchmarks 47 free AI models from OpenRouter, Groq, and Mistral every 3 hours via GitHub Actions and publishes a live comparison dashboard to GitHub Pages.

What language is llmstats written in?

Mainly Python. The stack also includes GitHub Actions, GitHub Pages, HTML, and JavaScript.

What license does llmstats use?

MIT license, use freely for any purpose, including commercial, as long as you keep the copyright notice.

How hard is llmstats to set up?

Setup difficulty is rated easy, with roughly 5 minutes to a first successful run.

Who is llmstats for?

Mainly developers.

Verify against the repo before relying on details.