Blend two fine-tuned language models into one model that inherits capabilities from both.
Build a Frankenmerge model by taking specific layers from different base models in a custom order.
Use evolutionary merging to automatically search for the best combination of model weights for a target task.
Transplant the tokenizer and vocabulary from one language model onto another using the TokenSurgeon tool.
Requires substantial disk space and RAM to hold model weights; a GPU is optional but speeds up the merge process.
Large language models, the kind that power AI chat tools, are trained at great expense to develop particular strengths. One model might excel at following instructions, another at coding, another at creative writing. Normally, combining those strengths would require either running multiple models at once (expensive) or doing additional training that requires the original training data.
Model merging is a different approach: you take the internal numerical weights of two or more models and mathematically blend them to produce a single new model that can inherit capabilities from all of them. The resulting model runs at the same speed and cost as a single model.
mergekit is a Python toolkit that automates this process. You write a short configuration file in YAML format describing which models to combine, how much weight to give each, which merging method to apply, and other options. The tool then handles loading the models, performing the merge operation, and writing the result to a new folder. From there you can test it locally or upload it to the Hugging Face Hub, a popular platform for sharing AI models, using commands the README provides.
The toolkit supports several merging methods, including simple weighted averaging of model weights, layer-by-layer construction called Frankenmerging that takes specific layers from different models, and evolutionary approaches that automatically search for the best combination of merging parameters by testing different options and evaluating the results. It also includes a tool called TokenSurgeon for transplanting the vocabulary and tokenizer from one model onto another.
The tool is designed to work under constrained hardware. Merges can run on a regular CPU without any dedicated graphics card, or with as little as 8 gigabytes of GPU memory, by loading only the parts of each model it needs at a given moment rather than keeping everything in memory at once.
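The simplest of the merging methods mentioned above, weighted averaging, can be illustrated with a toy sketch. This is not mergekit's code or API: the `linear_merge` function, the `StateDict` alias, and the parameter name `layer0.weight` are hypothetical, and plain Python lists stand in for the tensors a real model would hold. It only shows the arithmetic of blending two sets of same-shaped weights.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint: parameter name -> flattened weights.
StateDict = Dict[str, List[float]]


def linear_merge(models: List[StateDict], weights: List[float]) -> StateDict:
    """Blend same-shaped parameter sets with a normalized weighted average."""
    if not models or len(models) != len(weights):
        raise ValueError("need at least one model and exactly one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    merged: StateDict = {}
    for name, reference in models[0].items():
        # Every model must provide this parameter with the same number of elements.
        for model in models:
            if name not in model or len(model[name]) != len(reference):
                raise ValueError(f"parameter {name!r} does not match across models")
        # Element-wise weighted sum, divided by the total so the weights act as proportions.
        merged[name] = [
            sum(w * model[name][i] for model, w in zip(models, weights)) / total
            for i in range(len(reference))
        ]
    return merged


# Two toy "models", each with a single two-element parameter.
instruct = {"layer0.weight": [1.0, 2.0]}
coder = {"layer0.weight": [3.0, 6.0]}

merged = linear_merge([instruct, coder], weights=[0.75, 0.25])
print(merged)  # {'layer0.weight': [1.5, 3.0]}
```

The merged result has exactly the same parameter names and sizes as each input, which is why a merged model costs no more to run than a single one. The sketch also assumes the inputs share an architecture; the shape check would reject models whose parameters do not line up.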
A hosted web version called FrankensteinAI is available for users who do not want to set up the toolkit locally. The project is licensed under the GNU LGPL v3.
Verify against the repo before relying on details.