explaingit

dakfjalka/aler-distill

14
Python
Audience · researcher
Complexity · 4/5
Setup · hard

TLDR

Official ICML 2026 research code for AlerDistill, a method that stops AI language models from forgetting old knowledge when fine-tuned on new data, using risky embedding detection and knowledge distillation repair.

Mindmap

mindmap
  root((AlerDistill))
    Problem
      Catastrophic forgetting
      Fine-tuning causes loss
    Method
      High-risk embedding search
      Knowledge distillation repair
      Frozen reference model
    Training
      Hydra config system
      Qwen3-4B model
      Chemistry QA data
    Evaluation
      HumanEval coding
      MMLU knowledge test
      Inference server

Things people build with this

USE CASE 1

Reproduce the AlerDistill ICML 2026 results on chemistry QA fine-tuning without losing scores on coding and knowledge benchmarks.

USE CASE 2

Apply the latent embedding search module to identify high-risk representations in your own model before a fine-tuning run.

USE CASE 3

Use the knowledge distillation repair step to pull an updated model back toward its original capabilities during training.

Tech stack

Python · Hydra · Qwen3-4B · reinforcement learning

Getting it running

Difficulty · hard
Time to first run · 1 day+

Requires GPU for training and a separate inference server process to run evaluation benchmarks during the training loop.
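
Because the project is configured through Hydra, a run is normally customized with dotted key=value overrides on the command line rather than by editing files. The sketch below is a toy, pure-Python illustration of how such overrides can merge into a nested config. It is not Hydra's implementation and does not use the repository's real config keys: `apply_overrides`, `parse_value`, and keys such as `repair.weight` and `data.path` are hypothetical names chosen for illustration.

```python
import copy


def parse_value(raw):
    """Best-effort conversion of a command-line string to int, float, bool, or str."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    return raw


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted key=value overrides applied."""
    result = copy.deepcopy(config)  # leave the defaults untouched
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Walk down the nested dicts, creating levels that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw_value)
    return result


# Hypothetical defaults; the real config layout lives in the repository.
defaults = {
    "model": {"name": "Qwen3-4B-Instruct"},
    "repair": {"enabled": True, "weight": 0.5},
}

config = apply_overrides(defaults, ["repair.weight=0.1", "data.path=my_data.jsonl"])
print(config)
```

Real Hydra is stricter than this toy (for example, it distinguishes overriding an existing key from adding a new one), so check the repository's config folder for the actual key names before launching a run.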

In plain English

This repository contains the official code for AlerDistill, a research method introduced in an ICML 2026 paper. The problem it addresses is a common issue in AI development: when you train a large language model on new information, it tends to forget things it previously learned. This is sometimes called catastrophic forgetting, and it makes it hard to keep updating a model over time without losing old capabilities.

AlerDistill tackles this through two steps. First, it searches for "high-risk" internal representations inside the model, specifically prompt embeddings that are likely to cause forgetting when the model gets updated on new data. Second, it repairs the updated model by comparing it against a frozen copy of the original model, pulling it back toward the original behavior where necessary. This repair step uses a technique called knowledge distillation, where one model learns from another.

The code is written in Python and uses a configuration system called Hydra, which lets you adjust training settings from the command line without editing files directly. The default setup trains a model called Qwen3-4B-Instruct on chemistry question data. During training, the code can spin up a separate inference server to evaluate the model as it learns, testing it on benchmarks like HumanEval (a coding task dataset) and MMLU (a broad knowledge test).

The repository is structured around a main training script, a trainer that handles both the standard fine-tuning and the latent repair logic, a search module that finds the risky embeddings, and an evaluation suite. Configuration files live in a separate folder and cover the model, data, and repair settings. Outputs from each run are saved with timestamps, and checkpoints are stored so training can be resumed.

This is a research release aimed at other AI researchers who want to reproduce the paper's results or build on the method. It is not a general-purpose tool for non-researchers, and the README does not describe a consumer-facing product.
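
The repair step described above can be pictured with a generic knowledge distillation loss. The sketch below is not the repository's implementation, and the paper's exact objective is not given here. It assumes the common formulation in which the updated model is penalized by the KL divergence between the frozen reference model's output distribution and its own, added to the task loss with a weight. The function names, the `repair_weight` parameter, and the example logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, with optional temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(reference_logits, updated_logits, temperature=1.0):
    """KL(reference || updated): how far the updated model drifted from the frozen one."""
    p = softmax(reference_logits, temperature)  # frozen reference (teacher)
    q = softmax(updated_logits, temperature)    # model being fine-tuned (student)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)


def combined_loss(task_loss, reference_logits, updated_logits, repair_weight=0.5):
    """Task loss on the new data plus a weighted pull back toward the reference."""
    return task_loss + repair_weight * distillation_loss(reference_logits, updated_logits)


# Hypothetical next-token logits over a three-token vocabulary.
reference = [2.0, 1.0, 0.1]
unchanged = [2.0, 1.0, 0.1]
drifted = [0.1, 1.0, 2.0]

print(distillation_loss(reference, unchanged))  # no drift, so no penalty
print(distillation_loss(reference, drifted))    # positive penalty for drifting
```

The penalty is zero when the two models agree and grows as the updated model's predictions move away from the reference, which is the sense in which distillation pulls the model back toward its original behavior. In AlerDistill this comparison is described as being driven by the high-risk prompt embeddings found in the search step; how those embeddings enter the loss is not specified here.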

Copy-paste prompts

Prompt 1
I want to reproduce the AlerDistill results using dakfjalka/aler-distill. Walk me through the hardware requirements and how to launch the default Qwen3-4B chemistry training run with Hydra.
Prompt 2
Explain how the search module in AlerDistill finds high-risk prompt embeddings. What signal does it use and how does the latent repair step use those embeddings to counteract forgetting during training?
Prompt 3
I want to apply AlerDistill to a different domain than chemistry. Which config files and data paths do I need to change to point the trainer at my own fine-tuning dataset?
Prompt 4
How does the evaluation suite in AlerDistill work? How do I start the separate inference server during training so it runs HumanEval and MMLU benchmarks as the model trains?

Verify against the repo before relying on details.