explaingit

meta-llama/llama

59,426 · Python · Audience: developer · Complexity: 3/5 · Stale · Setup: hard

TLDR

Deprecated repository that originally provided Meta's Llama 2 inference code and access to the model weights. Its README now redirects to newer, maintained repositories.

Mindmap

mindmap
  root((repo))
    What it was
      Llama 2 inference
      Model weights
      PyTorch code
    Model sizes
      7 billion params
      13 billion params
      70 billion params
    Hardware needs
      Single GPU small
      Multiple GPUs large
      CUDA required
    Current status
      Deprecated
      No longer maintained
      Redirects elsewhere
    Where to go now
      llama-models repo
      PurpleLlama safety
      llama-cookbook examples

Things people build with this

USE CASE 1

Run Llama 2 language model locally on your own hardware without API costs.

USE CASE 2

Adapt Llama 2 for custom tasks, using the provided inference code as a starting point.

USE CASE 3

Research and experiment with open-weights large language models at different scales.

Tech stack

Python · PyTorch · CUDA · torchrun

Getting it running

Difficulty: hard · Time to first run: 1 day+

Repository is deprecated and redirects elsewhere; obtaining Llama 2 weights requires Meta access approval and significant CUDA/PyTorch setup.
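To give a rough sense of why the hardware setup is significant, the sketch below estimates the memory needed just to hold the weights for each model size. It assumes 16-bit weights (2 bytes per parameter) and uses the nominal parameter counts of 7, 13, and 70 billion; real usage is higher once activations and the key-value cache are included, so treat the figures as a lower bound rather than a requirement stated by the repository.

```python
# Back-of-envelope estimate of the memory needed to hold model weights only.
# Assumptions (not from the repository): 2 bytes per parameter (16-bit weights)
# and nominal, rounded parameter counts.
BYTES_PER_PARAM_16BIT = 2

NOMINAL_PARAMS = {
    "7B": 7_000_000_000,
    "13B": 13_000_000_000,
    "70B": 70_000_000_000,
}


def weight_memory_gb(size: str, bytes_per_param: int = BYTES_PER_PARAM_16BIT) -> float:
    """Return approximate weight memory in gigabytes (1 GB = 10**9 bytes)."""
    return NOMINAL_PARAMS[size] * bytes_per_param / 1e9


if __name__ == "__main__":
    for size in NOMINAL_PARAMS:
        print(f"{size}: ~{weight_memory_gb(size):.0f} GB of weights")
```

Dividing each estimate by the memory of a single GPU gives a quick indication of why the larger sizes need to be split across several devices.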

License could not be detected automatically. Check the repository's LICENSE file before use.

In plain English

This repository was the original home of Meta's Llama 2 inference code, but it is now deprecated. The README itself explains that Meta has consolidated its model repositories and that this one is no longer maintained. Its original purpose was to provide access to the model weights and the minimal Python code needed to load and run Llama 2, Meta's open-weights large language model, which ranges from 7 billion to 70 billion parameters. A large language model, or LLM, is an AI system trained on vast amounts of text that can generate coherent, contextually appropriate responses to prompts and questions.

When this repository was active, you would download the model weights from Meta's website after accepting a license agreement, then use a command called torchrun to launch the model and send it text prompts to complete or answer. The inference code used PyTorch as the deep learning framework and required CUDA-capable GPUs. Each model size needed a different number of GPUs: the 7-billion-parameter version ran on a single GPU, the 13-billion-parameter version on two, and the 70-billion-parameter version on eight.

The project's main value was giving researchers and developers a capable open-weights model they could run locally and adapt without API costs. The README now directs users to newer, actively maintained repositories, including llama-models, PurpleLlama for safety tooling, and llama-cookbook for practical usage examples. You would only encounter this repository when following older tutorials or tracing the history of the Llama model family.
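The torchrun launch described above can be sketched as a small helper that picks the process count from the model size. This is an illustration and not code from the repository: the helper name `build_torchrun_command` is hypothetical, the 13B value of 2 and the script name and flags (`example_text_completion.py`, `--ckpt_dir`, `--tokenizer_path`, `--max_seq_len`, `--max_batch_size`) are recalled from the archived README and should be verified there, and the checkpoint paths are placeholders.

```python
import shlex

# GPUs (one process each) per model size. The 7B and 70B values are stated in
# the text above; the 13B value is an assumption taken from the archived README.
MODEL_PARALLEL = {"7B": 1, "13B": 2, "70B": 8}


def build_torchrun_command(
    size: str,
    ckpt_dir: str,
    tokenizer_path: str,
    script: str = "example_text_completion.py",  # assumed script name
    max_seq_len: int = 128,
    max_batch_size: int = 4,
) -> list[str]:
    """Build the argument list for launching inference with torchrun."""
    nproc = MODEL_PARALLEL[size]  # raises KeyError for unknown sizes
    return [
        "torchrun",
        "--nproc_per_node", str(nproc),
        script,
        "--ckpt_dir", ckpt_dir,
        "--tokenizer_path", tokenizer_path,
        "--max_seq_len", str(max_seq_len),
        "--max_batch_size", str(max_batch_size),
    ]


if __name__ == "__main__":
    # Placeholder paths; point these at your downloaded weights and tokenizer.
    cmd = build_torchrun_command("7B", "llama-2-7b/", "tokenizer.model")
    print(shlex.join(cmd))
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.run` without shell quoting concerns, while `shlex.join` gives a copy-pasteable form for the terminal.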

Copy-paste prompts

Prompt 1
How do I set up the original Llama 2 inference code from this deprecated repository?
Prompt 2
What are the GPU requirements for running different sizes of Llama 2 models?
Prompt 3
Where should I go now that this Llama repository is no longer maintained?
Prompt 4
How do I load Llama 2 model weights and run inference with PyTorch and torchrun?

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.