explaingit

mooler0410/llmspracticalguide

10,181
Audience · researcher
Complexity · 1/5
Setup · easy

TLDR

A curated reference guide for large language models: a family tree showing how models like GPT-4, BERT, and LLaMA relate historically, plus organized paper lists and guidance on picking the right model for tasks like summarization or code generation.

Mindmap

mindmap
  root((LLM Guide))
    Model families
      BERT-style models
      GPT-style models
    Timeline
      Early models
      Recent models
    Tasks
      Summarization
      Code generation
      Translation
      Question answering
    Licensing
      Commercial use
      Research only

Things people build with this

USE CASE 1

Look up which type of language model, BERT-style or GPT-style, works best for a specific task like classification or text generation.

USE CASE 2

Check which LLMs allow commercial use before building a product that depends on one.

USE CASE 3

Trace the historical lineage of AI language models to understand how the field has evolved over time.

USE CASE 4

Find the original research paper for any major language model released in recent years.

Getting it running

Difficulty · easy
Time to first run · 5min

In plain English

This repository is a curated collection of resources about large language models (LLMs), built as a companion to a research survey paper. It functions as a reference guide that maps out the major AI language models and how to apply them in practice.

The centerpiece is a family tree diagram showing how different language models relate to each other historically, tracing a lineage from early models like BERT and GPT through to more recent ones like GPT-4 and LLaMA. The tree shows which systems descended from or were influenced by earlier work, giving readers a timeline of how the field developed over several years.

The resource catalog is organized into three main areas. For models, it separates "BERT-style" systems, which are generally better at understanding and classifying text, from "GPT-style" systems, which are generally better at generating new text. For each model, the repository links to the original research paper. For data, it covers guidance on pretraining data, fine-tuning data, and test data. For specific use cases, it lists guidance on tasks like summarization, question answering, translation, and code generation, noting which types of models tend to perform well on which tasks.

The guide also includes a section on usage restrictions, documenting which models allow commercial use and which are limited to research purposes. AI model licensing varies considerably, and this section helps practitioners understand what they can and cannot do with a given model before building on it.

This is a reference resource, not a software tool. It contains no runnable code. Its value is as an organized index of papers and context for choosing an LLM for a specific task. The full README is longer than what was shown.

Copy-paste prompts

Prompt 1
Based on this guide's breakdown of BERT-style vs GPT-style models, which type should I use to classify customer support emails into categories, and why?
Prompt 2
I'm building a commercial product and need an open-source LLM. Using the license section of this guide, list the models I can use for commercial purposes.
Prompt 3
Using the LLM family tree from this guide, explain in plain English how GPT-4 descends from and differs from GPT-2 and BERT.
Prompt 4
I want to add question-answering to my app. Which models does this guide recommend for that task, and what are their tradeoffs?

Verify against the repo before relying on details.