explaingit

google-deepmind/alphafold

Analysis updated 2026-06-24 · repo last pushed 2026-04-22

14,586PythonAudience · researcherComplexity · 5/5MaintainedSetup · hard

TLDR

DeepMind's open-source inference code for AlphaFold 2, which predicts the 3D structure of a protein from its amino-acid sequence using deep learning.

Mindmap

mindmap
  root((alphafold))
    Inputs
      FASTA sequence
      Sequence databases
    Outputs
      Predicted 3D structure
      Confidence scores
    Use Cases
      Protein structure prediction
      Drug discovery research
      Multi-chain complex modeling
    Tech Stack
      Python
      JAX
      Docker
      CUDA
Click or tap to explore — scroll the page freely

Code map

Detail Auto

An interactive map of this repo's files and how they connect — its source is parsed live in your browser. Click Visualize to build it.

filefunction / class

What do people build with it?

USE CASE 1

Predict the 3D structure of a single protein from its amino-acid sequence in a research lab.

USE CASE 2

Model a multi-chain protein complex with AlphaFold-Multimer for structural biology studies.

USE CASE 3

Reproduce the CASP15 baseline predictions shipped with the repo.

USE CASE 4

Build a pipeline that screens many proteins for predicted structure and confidence scores.

What is it built with?

PythonJAXDockerCUDATensorFlow

How does it compare?

google-deepmind/alphafoldpyodide/pyodidewifiphisher/wifiphisher
Stars14,58614,58814,593
LanguagePythonPythonPython
Last pushed2026-04-222025-02-04
MaintenanceMaintainedStale
Setup difficultyhardmoderatehard
Complexity5/54/54/5
Audienceresearcherdeveloperops devops

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard Time to first run · 1day+

Needs a Linux box with a modern NVIDIA GPU, Docker, NVIDIA Container Toolkit, and up to 3TB of disk for the sequence databases.

In plain English

AlphaFold is the open source code released by Google DeepMind for AlphaFold version 2, the system that predicts the three-dimensional shape of a protein from its amino-acid sequence. This repository contains the inference pipeline, meaning the part that takes a sequence and runs the trained model to produce a predicted structure. Model weights are downloaded separately. The package also includes AlphaFold-Multimer, an extension for predicting complexes made of more than one protein chain. The README notes that the multimer variant is a work in progress and is not expected to be as stable as the single-chain version. There is a technical note for an updated AlphaFold v2.3.0 and a CASP15 baseline set of predictions included with the repo. Running AlphaFold has heavy system requirements. You need a Linux machine, a modern NVIDIA GPU, and up to about 3 TB of disk space to store the genetic sequence databases it needs as input. The typical workflow is to install Docker and the NVIDIA Container Toolkit, clone the repo, run a download script that pulls roughly 556 GB of databases, build a Docker image, and then run a Python script pointing at a FASTA file containing the protein sequence you want to predict. The model relies on several public sequence databases such as BFD, MGnify, UniRef90, UniRef30, and the PDB, plus extra ones like UniProt and PDB seqres when running AlphaFold-Multimer. A reduced-database preset is offered for users who cannot store the full set. The repo points users to the AlphaFold paper for the scientific method and asks that work using the code cite that paper. The full README is longer than what was shown.

Copy-paste prompts

Prompt 1
Walk me through setting up AlphaFold on a Linux server with a single A100 GPU, including the Docker build and database download steps.
Prompt 2
Explain the difference between AlphaFold 2 single-chain mode and AlphaFold-Multimer with a small example of when to use each.
Prompt 3
Show me how to run AlphaFold with the reduced-database preset when I only have 500GB of disk space, with the exact command.
Prompt 4
Write a wrapper script that submits a folder of FASTA files to AlphaFold sequentially and collects the PDB outputs into one directory.
Prompt 5
Help me read AlphaFold's pLDDT and PAE confidence scores from the output and decide which predictions to trust.

Frequently asked questions

What is alphafold?

DeepMind's open-source inference code for AlphaFold 2, which predicts the 3D structure of a protein from its amino-acid sequence using deep learning.

What language is alphafold written in?

Mainly Python. The stack also includes Python, JAX, Docker.

Is alphafold actively maintained?

Maintained — commit in last 6 months (last push 2026-04-22).

How hard is alphafold to set up?

Setup difficulty is rated hard, with roughly 1day+ to a first successful run.

Who is alphafold for?

Mainly researcher.

Open on GitHub → Explain another repo

This repo across BitVibe Labs

Verify against the repo before relying on details.