xming521/weclone

★ 17,885 | Python | Audience · general | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((repo))
    Pipeline steps
      Export chat data
      Privacy scrubbing
      Fine-tuning with LoRA
      Bot deployment
    Data sources
      Telegram
      WeChat
      Other platforms
    Model options
      Qwen2.5-VL default
      Any LLaMA Factory model
      VRAM size guide
    Deploy targets
      Telegram
      Discord
      Slack


Things people build with this

USE CASE 1

Export your Telegram history, fine-tune a local model on it, and deploy a bot that replies to new messages in your exact phrasing and style.

USE CASE 2

Create a private digital twin trained entirely on your own hardware so your conversations never leave your machine.

USE CASE 3

Build a Slack or Discord bot that impersonates your communication style for fun, archival, or personal-assistant purposes.

Tech stack

Python, LLaMA Factory, LoRA, Qwen2.5-VL, CUDA, Microsoft Presidio, uv

Getting it running

Difficulty · hard | Time to first run · 1 day+

Requires a CUDA-capable GPU (CUDA 12.6 or newer) with enough VRAM for the chosen model size. Larger models give noticeably better output quality.
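
As a rough way to reason about "enough VRAM", the sketch below estimates the memory taken by model weights alone from parameter count and numeric precision. It is a back-of-the-envelope calculation, not WeClone's own sizing logic: the function name `weight_memory_gib` is hypothetical, and actual fine-tuning needs additional memory for activations, LoRA adapter weights, and optimizer state. The VRAM table in the README remains the figure to trust.

```python
def weight_memory_gib(params_billion: float, bits_per_param: int) -> float:
    """Estimate memory for model weights only, in GiB.

    Hypothetical helper for illustration. It ignores activations,
    LoRA adapters, optimizer state, and framework overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # 7B and 70B are the two ends of the range the README's table covers.
    for size in (7, 70):
        for bits in (16, 8, 4):
            gib = weight_memory_gib(size, bits)
            print(f"{size}B parameters at {bits}-bit: about {gib:.1f} GiB for weights")
```

The estimate is a lower bound on what training needs, so a card whose VRAM barely exceeds the weights figure is unlikely to be enough in practice.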

In plain English

WeClone is an end-to-end toolkit for turning your own chat history into a chatbot that talks the way you do. You feed it an export of your past messages from a platform like Telegram, the tool cleans the data and fine-tunes a large language model on it, and you then plug the resulting model back into a chat service as a bot that mimics your phrasing, vocabulary, and reply style. The project calls this your digital avatar.

The pipeline covers every step: exporting chat data, preprocessing it, fine-tuning a model, and deploying the result. Preprocessing includes stripping out personal information such as phone numbers, email addresses, credit card numbers, IP addresses, locations, and bank or wallet addresses, using Microsoft Presidio plus a user-defined blocklist. By default it uses the Qwen2.5-VL-7B-Instruct multimodal model with LoRA for supervised fine-tuning, but you can swap in any other model supported by LLaMA Factory, and the README provides a table of VRAM requirements from 7B up to 70B. After training, the bot can be deployed to Telegram, Discord, Slack, or personal WeChat accounts; WhatsApp support is under construction.

You would reach for WeClone if you want a private, locally trained digital twin of yourself for fun, archival, or assistant purposes, and you would rather keep your conversations on your own hardware than send them through a cloud service. The project is written in Python, currently supports Telegram as the main data source, uses uv as its environment manager, and expects CUDA 12.6 or newer for GPU training. The README notes that larger models trained on more data give noticeably better results. This summary is based on an excerpt of the README; the full README is longer.
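
To make the privacy-scrubbing idea concrete, here is a minimal sketch of pattern-based redaction combined with a user-defined blocklist. It is a simplified stand-in that uses only Python's standard library; it is not Microsoft Presidio and not WeClone's actual code. The `PATTERNS` table, the `scrub` function, and the placeholder labels are all hypothetical, and the regular expressions are deliberately loose. Presidio's recognizers cover more entity types and are considerably more careful.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# Order matters: IP addresses are replaced before the looser phone pattern,
# which would otherwise also match dotted digit sequences.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def scrub(message: str, blocklist: tuple[str, ...] = ()) -> str:
    """Replace matched personal data and blocklisted phrases with placeholders."""
    for label, pattern in PATTERNS.items():
        message = pattern.sub(f"<{label}>", message)
    # Blocklist entries are matched literally and case-insensitively.
    for phrase in blocklist:
        message = re.sub(re.escape(phrase), "<BLOCKED>", message, flags=re.IGNORECASE)
    return message


if __name__ == "__main__":
    raw = "Mail alice@example.com or call +1 415 555 0100 from 192.168.0.1"
    print(scrub(raw))
    print(scrub("Working on project falcon today", blocklist=("Project Falcon",)))
```

Replacing matches with typed placeholders rather than deleting them keeps each message's sentence structure intact, which matters when the cleaned text is later used as fine-tuning data.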

Copy-paste prompts

Prompt 1

I have a Telegram export file. Walk me through the full WeClone pipeline: clean the data with Presidio, fine-tune Qwen2.5-VL-7B with LoRA, then deploy the result as a Telegram bot.

Prompt 2

My GPU has 16 GB VRAM. Which WeClone model size should I choose, and what config settings do I need to stay within that memory limit?

Prompt 3

I want WeClone to strip all personal info from my chat data before training. Show me how to configure Microsoft Presidio and add my own custom blocklist words.

Prompt 4

How do I evaluate whether my WeClone-trained model actually sounds like me? What questions should I ask it and what quality signals should I look for?


Verify against the repo before relying on details.