Learn LoRA fine-tuning by writing each module from a blank TODO template
Fine-tune Qwen2.5-0.5B on your own small Q&A dataset using this as a starter
Run a rank vs dataset-size ablation on a 6 GB GPU to feel where the limits are
Read a clean reference implementation of loss masking with -100 tokens for chat fine-tuning
Needs an NVIDIA GPU with at least 6 GB of VRAM plus separate downloads for the Qwen2.5-0.5B base model and the medical Q&A dataset; the repo only ships code.
MiniLoRA is a teaching project that walks Python and PyTorch users through fine-tuning a small language model on a medical question-and-answer dataset. The base model is Qwen2.5-0.5B-Instruct, a roughly 0.5-billion-parameter open model from Alibaba's Qwen family. The technique is LoRA, which trains a small set of extra weights while the main model stays frozen. The data comes from a public Chinese medical Q&A collection: the project samples 800 pairs, split into 640 train and 160 validation examples, plus 200 test examples.

The repository is structured as seven modules, each tied to a Python script in the scripts folder. The flow runs from data cleaning and splitting, to formatting prompts into a chat-style messages structure and masking the loss so the model learns only from the assistant's reply, to configuring LoRA with parameters such as rank and alpha, then training with Hugging Face's Trainer, generating answers with the trained adapter loaded on top of the base model, batch evaluation, and finally ablation experiments that vary rank and dataset size.

The repo follows a learn-by-doing pattern. Each module has both a finished reference script and a blank my_ template with TODO markers, so a student can read the reference, close it, and write their own version. The README also includes short notes on how the messages format is constructed, how the loss mask sets the labels for the prompt portion to -100, and the underlying LoRA equation, h = Wx + (alpha / r) BAx, where A and B are the two small trained matrices and r is the rank.

To run the code you need Python 3.10 or newer, PyTorch 2.1 or newer, an NVIDIA GPU with at least 6 GB of VRAM, and about 3 GB of disk space. The repo contains only code, so users download the model and the dataset separately. A note for users in mainland China points to a Hugging Face mirror for faster downloads.
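The loss masking mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repo's reference script: the helper name build_labels and the token IDs are made up, and a real pipeline would get the IDs from the Qwen tokenizer's chat template. The only fixed detail is the value -100, which is the default ignore_index of PyTorch's cross-entropy loss, so positions labeled -100 contribute nothing to the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def build_labels(prompt_ids, reply_ids):
    """Concatenate prompt and reply; mask every prompt position in the labels.

    Hypothetical helper for illustration only.
    """
    input_ids = list(prompt_ids) + list(reply_ids)
    # Prompt tokens get -100 so only the assistant reply is scored.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(reply_ids)
    return input_ids, labels


# Made-up token IDs, purely for illustration.
prompt_ids = [101, 7, 8, 9]  # system + user turns, already tokenized
reply_ids = [42, 43, 2]      # assistant reply plus an end-of-turn token

input_ids, labels = build_labels(prompt_ids, reply_ids)
print(input_ids)
print(labels)
```

The two lists stay the same length, so they can be padded and batched together; padding positions in the labels would normally be set to -100 as well.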
The README also shares an ablation result: ranks 4, 8, and 16 produced almost identical training loss, which suggests that for this dataset the limiting factor is the amount of data rather than the size of the LoRA adapter.
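To make the rank comparison concrete, the sketch below shows how the adapter grows with rank and how the LoRA update h = Wx + (alpha / r) BAx is applied. It is an assumption-laden illustration in plain Python, not the repo's code: the function names are hypothetical, and the 896 x 896 layer shape is assumed to stand in for one square projection in the base model (check the model config for the real shapes and for which layers the project actually adapts).

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    return r * d_in + d_out * r  # A is r x d_in, B is d_out x r


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, alpha, r):
    """Compute h = W x + (alpha / r) * B (A x) with W kept frozen."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]


# Adapter size per layer for the three ranks in the ablation
# (896 is an assumed layer width, see the note above).
counts = {r: lora_param_count(896, 896, r) for r in (4, 8, 16)}
print(counts)

# Tiny worked example with rank 1 and made-up weights.
W = [[1, 2], [3, 4]]
A = [[1, 1]]          # r x d_in
B = [[1], [2]]        # d_out x r
h = lora_forward(W, A, B, [1, 1], alpha=2, r=1)
print(h)
```

The parameter count is linear in the rank, so rank 16 adds four times as many trainable weights per layer as rank 4. If the extra capacity does not lower the loss, that is consistent with the README's reading that data volume is the bottleneck.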
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.