explaingit

eugeneyan/open-llms

12,765 · Audience · developer · Complexity · 1/5 · Setup · easy

TLDR

A curated reference list of large language models licensed for commercial use, covering model size, context length, download links, and license type for each entry.

Mindmap

mindmap
  root((repo))
    What it does
      List open LLMs
      Track licenses
    License types
      Apache 2.0
      MIT
      OpenRAIL-M
    Model details
      Parameter count
      Context length
      Download links
    Audience
      Developers
      Businesses

Things people build with this

USE CASE 1

Find which open language models are licensed for commercial use before building a product

USE CASE 2

Compare model sizes and context lengths to pick the right model for your compute budget

USE CASE 3

Access direct download links for pretrained model weights hosted on Hugging Face

USE CASE 4

Track the history of publicly available language model releases from 2019 through 2023

Tech stack

Markdown

Getting it running

Difficulty · easy · Time to first run · 5 min
This is a freely shared reference document. Individual model licenses vary per entry, so check each model's listed license before using it commercially.

In plain English

open-llms is a curated reference list of large language models (LLMs) that are licensed for commercial use. Each entry covers one model: when it was released, where to download the trained weights (called checkpoints), links to the research paper or announcement blog post, how many parameters the model has, how much text it can process at once (context length), and which license it uses.

Every model on the list has a license that permits commercial use, such as Apache 2.0, MIT, or OpenRAIL-M. This makes the list practical for developers and businesses who want to build products with AI language capabilities but need to verify intellectual property terms first. The list spans releases from 2019 through 2023 and ranges from small, efficient models under one billion parameters to large ones such as BLOOM at 176 billion parameters.

This is not software you run or a framework you install. It is a reference document maintained as a GitHub repository, so anyone can contribute additions or corrections via pull requests. Each row in the main table links out to the actual model weights (hosted on sites like Hugging Face) and to the papers explaining how each model was built.

The repository is primarily useful for teams evaluating which open, commercially licensed model to use in a product, and for researchers tracking what is publicly available.
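The fields described above map naturally onto a small record type. The following is a minimal sketch of how one might represent and filter entries after copying them out of the list by hand. The class name, field names, and sample rows are all hypothetical placeholders, not data or structure taken from the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """One row of the list. Field names are assumptions for illustration."""
    name: str
    release_date: str        # e.g. "2023/05"
    params_billions: float   # parameter count, in billions
    context_length: int      # maximum tokens processed at once
    license: str
    checkpoint_url: str


def filter_entries(entries, license_name=None, min_params=0.0, min_context=0):
    """Return entries that match the license (if given) and meet both minimums."""
    return [
        e for e in entries
        if (license_name is None or e.license == license_name)
        and e.params_billions >= min_params
        and e.context_length >= min_context
    ]


# Placeholder rows for illustration only; these are NOT real entries from the list.
SAMPLE = [
    ModelEntry("example-small", "2023/01", 0.5, 2048, "MIT", "https://example.com/small"),
    ModelEntry("example-mid", "2023/05", 7.0, 4096, "Apache 2.0", "https://example.com/mid"),
    ModelEntry("example-large", "2023/06", 40.0, 2048, "Apache 2.0", "https://example.com/large"),
]

matches = filter_entries(SAMPLE, license_name="Apache 2.0", min_params=7, min_context=4096)
print([e.name for e in matches])
```

Matching on the exact license string keeps the sketch simple. Real license labels may be spelled inconsistently, so a production version would likely normalize them first.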

Copy-paste prompts

Prompt 1
Using the open-llms list, find me all models with an Apache 2.0 license that have at least 7 billion parameters and a context length of 4096 tokens or more.
Prompt 2
I want to run an open LLM locally on a MacBook with 16GB of RAM. Based on the open-llms list, which models are small enough to fit in memory and commercially licensed?
Prompt 3
From the open-llms reference list, give me a timeline of when each major model family was released and which organization published it.
Prompt 4
I need a commercially licensed LLM for a text classification task. Based on the open-llms list, recommend the best model under 13B parameters and explain the tradeoff versus a larger model.
Prompt 5
Using the open-llms list as a starting point, write a Python script that downloads the model card metadata for each listed Hugging Face model and outputs a CSV with name, license, and parameter count.
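As a starting point for the CSV step in Prompt 5, here is a minimal standard-library sketch that parses a pipe-delimited Markdown table and writes selected columns as CSV. It does not download anything from Hugging Face. The column headers and rows in the sample are hypothetical, so check the actual README table before reusing the names.

```python
import csv
import io


def parse_markdown_table(text):
    """Parse a pipe-delimited Markdown table into a list of dicts keyed by header."""
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip anything that is not a table row
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator row
    return [dict(zip(header, cells)) for cells in body]


def to_csv(records, columns):
    """Serialize the chosen columns of each record to a CSV string."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Hypothetical table for illustration; headers and rows are not from the repo.
SAMPLE_TABLE = """
| Model | Params (B) | License |
|-------|------------|---------|
| example-small | 0.5 | MIT |
| example-mid | 7 | Apache 2.0 |
"""

records = parse_markdown_table(SAMPLE_TABLE)
csv_text = to_csv(records, ["Model", "License", "Params (B)"])
print(csv_text)
```

This simple splitter assumes no cell contains a literal `|` character. Cells holding Markdown links would come through as raw link text and need further cleanup.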

Verify against the repo before relying on details.