Fine-tune a small open-source coding model with QLoRA on a custom QA set
Convert a fine-tuned adapter into GGUF and package it for Ollama
Benchmark Ollama model variants on shared coding prompts for latency and throughput
Use the Rust rem-cli as a beginner helper for HTML, CSS, and shell commands
True QLoRA training calls for an NVIDIA GPU, which the project lists as recommended. Python 3.10 or newer is also required, along with a working Ollama install for the export and evaluation steps.
This repository is a personal training pipeline for building a coding assistant model the author calls rem-coder. The work is organized into seven steps: pick an objective and hardware plan, prepare and validate training data, run a baseline evaluation, train a small QLoRA adapter using a library called Unsloth, merge that adapter back into the base model, export the result to GGUF format and package it for Ollama, then run a post-training evaluation and compare reports. Scripts for each phase live in the scripts folder, and one orchestrator script can run the whole flow end to end.
In plain terms, the project starts from an existing open-source coding model (the README uses deepseek-coder 1.3b as the example) and tries to nudge it toward better coding answers by fine-tuning it on a small custom dataset of coding questions and answers. QLoRA is a memory-light way to teach a model new behavior without rewriting all of its weights. After training, the new model is converted into the GGUF file format so it can be run locally through Ollama, a tool for running language models on your own machine.
A second piece, in the rem-cli folder, is a Rust command-line tool aimed at beginners. The README says it covers basic HTML and CSS coding help, safer guidance for terminal commands, and a patch preview workflow that shows file context before applying changes.
The evaluation step scores each model response on whether it returned anything, whether it looks like code, whether the code parses or has balanced brackets for Python, JavaScript, TypeScript, or SQL, and how much it overlaps with a reference answer. A separate benchmarking script compares several Ollama model variants on shared prompts for latency and throughput.
The project lists Python 3.10 or newer and Ollama as prerequisites, with an NVIDIA GPU recommended for true QLoRA training. At the time of this snapshot the repository has no description, no topics, and zero stars.
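The four evaluation checks described above (non-empty response, looks like code, parses or has balanced brackets, overlap with a reference) can be illustrated with a minimal sketch. This is not the repository's evaluation script: every function name, the keyword heuristic, and the choice of token-level Jaccard similarity as the overlap measure are assumptions made for illustration. It uses only the Python standard library.

```python
import ast
import re

# Closing bracket -> expected opening bracket (illustrative helper table).
PAIRS = {")": "(", "]": "[", "}": "{"}


def brackets_balanced(text: str) -> bool:
    """Naive nesting check for (), [] and {}.

    It does not skip brackets inside strings or comments.
    """
    stack = []
    for ch in text:
        if ch in "([{":
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack


def looks_like_code(text: str) -> bool:
    """Crude heuristic (an assumption): a code fence, common keyword, or symbol."""
    pattern = r"```|\b(def|class|function|const|let|return|import|SELECT)\b|[{};]"
    return bool(re.search(pattern, text))


def syntax_ok(code: str, language: str) -> bool:
    """Parse Python with ast; fall back to bracket balance for other languages."""
    if language == "python":
        try:
            ast.parse(code)
            return True
        except (SyntaxError, ValueError):
            return False
    return brackets_balanced(code)


def token_overlap(response: str, reference: str) -> float:
    """Jaccard similarity over lowercase word tokens (one possible overlap metric)."""
    a = set(re.findall(r"\w+", response.lower()))
    b = set(re.findall(r"\w+", reference.lower()))
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def score_response(response: str, reference: str, language: str) -> dict:
    """Combine the four checks into one report entry."""
    non_empty = bool(response.strip())
    return {
        "non_empty": non_empty,
        "looks_like_code": non_empty and looks_like_code(response),
        "syntax_ok": non_empty and syntax_ok(response, language),
        "overlap": token_overlap(response, reference),
    }


if __name__ == "__main__":
    answer = "def add(a, b):\n    return a + b\n"
    print(score_response(answer, answer, "python"))
```

Running the same scorer on baseline and post-training responses for each prompt would give the kind of side-by-side report the pipeline's final step compares, though the actual report format in the repository may differ.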
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.