Download WizardCoder to run a local AI coding assistant that generates and explains code without sending data to a cloud service.
Use WizardMath to build a math tutoring tool that solves grade-school and competition problems step by step.
Run the Evol-Instruct training scripts to reproduce the method and apply it to your own custom dataset.
Running the 33B or 70B parameter models requires one or more high-memory GPUs, and even the smaller sizes need significant VRAM.
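To make the hardware caveat concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone occupy at different numeric precisions. The function name and the precision table are illustrative, not part of the repository. The estimate counts only parameter storage (parameters times bytes per parameter) and ignores activations, the KV cache, and framework overhead, so real usage will be higher.

```python
# Rough weight-only memory estimate. This is an illustrative sketch, not a tool
# from the WizardLM repository. It ignores activations, KV cache, and overhead.

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Return the approximate gigabytes (10^9 bytes) needed to hold the weights."""
    # (params_billions * 1e9 params) * (bytes per param) / (1e9 bytes per GB)
    return params_billions * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    # Sizes taken from the range the summary mentions (1B up to 70B).
    for size in (1, 33, 70):
        row = ", ".join(
            f"{name}: {weight_memory_gb(size, name):.1f} GB" for name in BYTES_PER_PARAM
        )
        print(f"{size}B -> {row}")
```

Under these assumptions, a 70B model in 16-bit precision needs about 140 GB for weights alone, while a 33B model quantized to 4 bits needs about 16.5 GB.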
WizardLM is a research project from Microsoft that produced a family of AI language models trained to follow complex instructions more reliably than earlier models of similar size. The project contains three distinct model families: WizardLM for general conversation and instruction following, WizardCoder for writing and understanding code, and WizardMath for solving math problems. All three are built using a method the team calls Evol-Instruct, in which a simpler set of training examples is automatically expanded into a larger, more varied, and more challenging set by having an AI generate progressively harder versions of each example.
WizardCoder is the most prominent part of the repository in terms of benchmark results. As of early 2024, the team reported that the 33-billion-parameter version scored competitively with or above GPT-3.5-Turbo and Gemini Pro on standard coding benchmarks. WizardMath focuses on grade-school and competition-style math problems, with the 70-billion-parameter version outperforming GPT-3.5 on one benchmark (GSM8K) at the time of release. WizardLM itself targets general complex instructions, and its paper was accepted at ICLR 2024.
All three model families are available for download from Hugging Face. The models come in several sizes, ranging from 1 billion to 70 billion parameters, so users with different hardware can choose a version that fits their available memory and compute. Depending on the version, the underlying base model is Llama, Mistral, or DeepSeek-Coder.
The code in the repository covers training scripts for reproducing the Evol-Instruct process and evaluation scripts for running the benchmarks. It requires Python 3.9 or later. Data produced by the project is licensed under Creative Commons BY-NC 4.0, meaning it can be used for research and other non-commercial purposes. The code itself is licensed under Apache 2.0. The project has a Discord community and a homepage with additional details.
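The Evol-Instruct idea described above (expanding a seed set by asking a model for progressively harder versions of each example) can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the project's training code: the function `evolve_dataset`, the prompt templates, and the `rewrite` callable (a stand-in for whatever language model does the rewriting) are all hypothetical.

```python
from typing import Callable, List

# Hypothetical rewriting prompts; the project's actual prompts may differ.
EVOLVE_TEMPLATES = [
    "Rewrite the following instruction so it is harder, by adding one extra constraint:\n{instruction}",
    "Rewrite the following instruction so it requires more reasoning steps:\n{instruction}",
]


def evolve_dataset(
    seeds: List[str], rewrite: Callable[[str], str], rounds: int = 2
) -> List[str]:
    """Grow a seed set by repeatedly asking `rewrite` for harder variants.

    `rewrite` is an assumed callable that sends a prompt to a language model
    and returns its text response.
    """
    pool = list(seeds)       # everything collected so far, seeds included
    frontier = list(seeds)   # examples to evolve in the current round
    for round_index in range(rounds):
        template = EVOLVE_TEMPLATES[round_index % len(EVOLVE_TEMPLATES)]
        next_frontier = []
        for instruction in frontier:
            evolved = rewrite(template.format(instruction=instruction)).strip()
            # Drop empty or unchanged outputs (a simple stand-in for quality filtering).
            if evolved and evolved != instruction:
                next_frontier.append(evolved)
        pool.extend(next_frontier)
        frontier = next_frontier  # the next round evolves the newest variants
    return pool
```

Each round evolves only the previous round's outputs, so difficulty compounds from one round to the next while the earlier, easier versions stay in the pool. Passing the model as a plain callable keeps the loop testable with a stub before wiring in a real model.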
Development appears to have been most active between 2023 and early 2024, corresponding to the period when these benchmarks were published.
Verify against the repo before relying on details.