Run a local AI code completion model that fills in gaps in existing code without sending data to a third-party API
Self-host a 7B coding model for fast autocomplete in a private development environment
Use Code Llama Instruct in a conversational style to ask coding questions and get detailed answers
Build a custom code review or generation pipeline using the 34B or 70B model for higher accuracy
Requires a CUDA GPU with enough memory to hold the weights (about 12.55 GB for the smallest, 7B model) and Meta download approval before you can start.
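As a rough sanity check on the hardware requirement, the size of the weights can be estimated as parameter count times bytes per parameter. The sketch below assumes 16-bit weights (2 bytes each) and uses the nominal parameter counts from the model names; the function names are made up for this example, and the estimate covers the weights only, not activations or other runtime memory.

```python
GIB = 1024 ** 3  # bytes per gibibyte

# Nominal parameter counts read off the model names (assumed, not exact).
NOMINAL_SIZES = {"7B": 7e9, "13B": 13e9, "34B": 34e9, "70B": 70e9}


def estimate_weight_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the raw weights in GiB.

    Assumes every parameter is stored at the same precision
    (2 bytes = 16-bit by default). Weights only: no activations or cache.
    """
    return n_params * bytes_per_param / GIB


def weights_fit(n_params: float, vram_gib: float, bytes_per_param: int = 2) -> bool:
    """Return True if the estimated weights alone fit in the given GPU memory."""
    return estimate_weight_gib(n_params, bytes_per_param) <= vram_gib


if __name__ == "__main__":
    for name, n_params in NOMINAL_SIZES.items():
        size = estimate_weight_gib(n_params)
        print(f"{name}: roughly {size:.1f} GiB of weights at 16-bit precision")
```

Because the real parameter counts differ slightly from the nominal ones, treat the result as a ballpark and leave headroom beyond it when choosing a GPU.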
Code Llama is a family of large language models (AI systems trained on vast amounts of text and code) released by Meta and specialized for understanding and generating code. This repository contains the Python inference code: the scripts needed to load Code Llama model weights and run them locally to get predictions. The family comes in three flavors: base models (Code Llama) for code completion, Python-specialized models (Code Llama - Python) tuned further on Python code, and instruction-following models (Code Llama - Instruct) that you can prompt conversationally to ask coding questions. Each flavor is available in sizes of 7 billion, 13 billion, 34 billion, and 70 billion parameters; larger models are generally more capable but require more memory and hardware. The 7B model requires about 12.55 GB of storage, while the 70B model requires about 131 GB.

A notable feature is code infilling: the 7B and 13B base and instruct models can fill in a gap in existing code based on the surrounding context, which is useful for autocomplete-style features. The models are trained on sequences of 16,000 tokens and show improvements on inputs of up to 100,000 tokens, so they can take large amounts of existing code into account when generating.

To use the models, you request download access on Meta's website, download the weights, and run inference locally using PyTorch with CUDA (NVIDIA's GPU computing platform). This is for developers who want to run Code Llama on their own infrastructure rather than calling a hosted API.
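Infilling, as described above, comes down to giving the model the code before and after a gap and splicing its completion back in. The sketch below shows only that bookkeeping: the `<FILL>` marker, the helper names, and the hard-coded completion are assumptions made for illustration, and no model is called. Check the repository for the actual infilling entry point and prompt format.

```python
from typing import Tuple

FILL_MARKER = "<FILL>"  # hypothetical gap placeholder chosen for this sketch


def split_at_gap(source: str, marker: str = FILL_MARKER) -> Tuple[str, str]:
    """Split source code into (prefix, suffix) around exactly one gap marker."""
    if source.count(marker) != 1:
        raise ValueError(f"expected exactly one {marker!r} in the source")
    prefix, suffix = source.split(marker)
    return prefix, suffix


def splice(prefix: str, completion: str, suffix: str) -> str:
    """Rebuild the full source once a completion for the gap is available."""
    return prefix + completion + suffix


if __name__ == "__main__":
    snippet = "def add(a, b):\n    return <FILL>\n"
    prefix, suffix = split_at_gap(snippet)

    # Stand-in for the model's output; a real run would obtain this from
    # an infilling-capable model given the prefix and suffix.
    completion = "a + b"

    print(splice(prefix, completion, suffix))
```

Keeping the split and splice steps separate from the model call makes the editor-side logic easy to test without a GPU or downloaded weights.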
Verify against the repo before relying on details.