Fine-tune ChatGLM-6B on a custom instruction-following dataset using LoRA to create a domain-specific AI assistant without training from scratch.
Run the included Jupyter notebook on Google Colab to try ChatGLM fine-tuning without setting up a local GPU environment.
Load the author's pre-trained LoRA weights from Hugging Face to run inference on a customized ChatGLM-6B model immediately.
Adapt the data conversion scripts to convert your own Q&A or instruction dataset into the line-by-line JSON format needed for training.
Requires a CUDA-capable GPU with at least 16 GB VRAM (24 GB recommended); CPU-only training is not practical.
ChatGLM-Tuning is a Python project for fine-tuning ChatGLM-6B, an open-source bilingual (Chinese and English) language model developed by Tsinghua University, using a technique called LoRA. LoRA (Low-Rank Adaptation) lets you customize a large language model at a fraction of the usual cost by training only a small set of extra parameters rather than updating the entire model. The goal is to produce a more affordable alternative to building a ChatGPT-style assistant from scratch. The README is written primarily in Chinese.

The project provides a Jupyter notebook so that people with access to Google Colab can try it without setting up a local environment. Running the fine-tuning locally requires a GPU with at least 16 GB of video memory (24 GB or more is recommended), Python 3.8 or above, and a working CUDA environment.

The training data comes from the Alpaca dataset, a collection of instruction-following examples created by Stanford researchers. Two preparation steps come before training: first the Alpaca data is converted into a line-by-line JSON format (one JSON object per line), then it is tokenized with ChatGLM's own tokenizer. The main training script accepts options for batch size, learning rate, number of steps, and the checkpoint output location.

For those who would rather not run the fine-tuning themselves, the author has published two sets of trained LoRA weights on Hugging Face: one trained on the English Alpaca dataset and one trained on a combined Chinese and English version. A second notebook demonstrates how to load these weights for inference.

The project also includes placeholder sections for a reward model and a reinforcement learning step. Together these make up RLHF, the part of the pipeline that aligns language model outputs with human preferences, but those stages were not yet implemented at the time of writing.
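
The data conversion step described above, turning Alpaca-style records into line-by-line JSON, can be sketched roughly as follows. This is a minimal illustration and not the repository's own script. The input field names (`instruction`, `input`, `output`) follow the published Alpaca dataset. The output keys `context` and `target`, the prompt template, and the helper names `format_example`, `to_jsonl`, and `convert_file` are assumptions made for this example, so check the repo's conversion script for the exact format it expects.

```python
import json


def format_example(example: dict) -> dict:
    """Flatten one Alpaca-style record into a prompt/answer pair.

    The "context"/"target" keys and the prompt template are assumptions
    made for illustration; the repo's own script may differ.
    """
    context = f"Instruction: {example['instruction']}\n"
    # Alpaca records may carry an empty "input"; skip it in that case.
    if example.get("input"):
        context += f"Input: {example['input']}\n"
    context += "Answer: "
    return {"context": context, "target": example["output"]}


def to_jsonl(records: list) -> str:
    """Serialize records as line-by-line JSON: one object per line."""
    # ensure_ascii=False keeps Chinese text readable instead of \u escapes.
    return "".join(
        json.dumps(format_example(r), ensure_ascii=False) + "\n" for r in records
    )


def convert_file(src_path: str, dst_path: str) -> int:
    """Read a JSON array of records and write them out as JSON Lines."""
    with open(src_path, encoding="utf-8") as src:
        records = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(to_jsonl(records))
    return len(records)
```

To adapt this to your own Q&A data, map your question field to `instruction` and your answer field to `output` before calling `to_jsonl`. Each output line is a standalone JSON object, so the file can be streamed one example at a time during tokenization.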
Verify against the repo before relying on details.