explaingit

blinkdl/rwkv-lm

Analysis updated 2026-06-24 · repo last pushed 2026-05-08

Stars · 14,524 | Python | Audience · researcher | Complexity · 5/5 | Maintained | Setup · hard

TLDR

Training code and reference implementation for RWKV, a recurrent neural network language model that aims to match transformer quality with constant memory and linear-time inference.

Mindmap

mindmap
  root((RWKV-LM))
    Inputs
      Token streams
      Training data
      MiniPile dataset
    Outputs
      RWKV weights
      Trained checkpoints
      Demo runs
    Use Cases
      Train an RNN LLM
      Run efficient inference
      Fine-tune with LoRA
    Tech Stack
      Python
      PyTorch
      DeepSpeed
      CUDA

What do people build with it?

USE CASE 1

Train a small RWKV-7 model from scratch on a single GPU with 7GB of VRAM.

USE CASE 2

Fine-tune a pre-trained RWKV checkpoint on a custom dataset using DeepSpeed.

USE CASE 3

Compare RWKV inference speed against a same-size transformer for long-context workloads.

USE CASE 4

Convert RWKV weights to GGUF and run them in a local chat UI.

What is it built with?

Python, PyTorch, PyTorch Lightning, DeepSpeed, CUDA

How does it compare?

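| | blinkdl/rwkv-lm | weifeng2333/videocaptioner | swivid/f5-tts |
|---|---|---|---|
| Stars | 14,524 | 14,530 | 14,508 |
| Language | Python | Python | Python |
| Last pushed | 2026-05-08 | | |
| Maintenance | Maintained | | |
| Setup difficulty | hard | easy | hard |
| Complexity | 5/5 | 2/5 | 4/5 |
| Audience | researcher | general | developer |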

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1 day+

Needs CUDA, PyTorch Lightning pinned to version 1.9.5, DeepSpeed, and at least 7 GB of GPU VRAM for the smallest training script.
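A quick pre-flight check can catch a wrong Lightning version or an undersized GPU before a long install. The sketch below is illustrative only: the 1.9.5 pin and the 7 GB floor come from the summary above, while the PyPI distribution names (`torch`, `pytorch-lightning`, `deepspeed`) and the helper names are assumptions, not something the README is known to provide.

```python
from importlib import metadata

# The Lightning pin and the VRAM floor come from the text above.
# The distribution names below are assumed PyPI names.
REQUIRED_LIGHTNING = "1.9.5"
MIN_VRAM_GB = 7.0
DISTRIBUTIONS = ("torch", "pytorch-lightning", "deepspeed")


def installed_versions(names=DISTRIBUTIONS):
    """Return {distribution: version or None} without importing the packages."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def check_environment(versions, vram_gb):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name in DISTRIBUTIONS:
        if versions.get(name) is None:
            problems.append(f"{name} is not installed")
    lightning = versions.get("pytorch-lightning")
    if lightning is not None and lightning != REQUIRED_LIGHTNING:
        problems.append(
            f"pytorch-lightning is {lightning}, expected {REQUIRED_LIGHTNING}"
        )
    if vram_gb < MIN_VRAM_GB:
        problems.append(
            f"GPU has {vram_gb:.1f} GB VRAM, need at least {MIN_VRAM_GB:.0f} GB"
        )
    return problems
```

On a machine where PyTorch and CUDA are already present, the VRAM figure could be read with `torch.cuda.get_device_properties(0).total_memory / 1024**3` and passed in as `vram_gb`, for example `check_environment(installed_versions(), vram_gb)`.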

The license is not stated in the available content, though the README notes that the project is under the Linux Foundation AI umbrella with freely available weights.

In plain English

RWKV (pronounced "RwaKuv") is a research project that designs a new kind of large language model. Most modern chat-style models, like GPT and similar systems, use an architecture called the transformer. RWKV takes a different route: it is built as a recurrent neural network (RNN), which means it reads text one token at a time and keeps a small running "state" instead of looking back at the whole conversation. The claim of this repository is that RWKV can match transformer-level quality while keeping the speed and memory advantages of an RNN.
The README is centered on RWKV-7, nicknamed "Goose", which the author calls the strongest linear-time, constant-space, attention-free, fully RNN architecture available at the time of writing. Linear-time means the work grows in proportion to the length of the input, and constant-space means it does not need a growing key-value cache the way a transformer does. The project is hosted under the Linux Foundation AI umbrella so the code and weights are free to use, and the README notes that RWKV is already shipped inside Windows and Office.
The repository is mostly training code and reference implementations. There are demo scripts for RWKV-7 in GPT-like mode, in pure RNN mode, and in a faster combined mode, with similar files for RWKV-6 and RWKV-5. A simplified training script in RWKV-v7/train_temp can be run on a single GPU with about 7 GB of VRAM, and a fuller script trains a model on the MiniPile dataset using PyTorch, PyTorch Lightning 1.9.5, DeepSpeed, and CUDA.
The README is firm about a few training details, like using PreLN LayerNorm, applying weight decay only to large projection matrices, and following the supplied initialization.
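The "weight decay only on large projection matrices" rule can be pictured as splitting parameters into two groups by shape. The sketch below is a guess at what such a split might look like, not the repository's actual selection logic: the criterion (2-D with both dimensions at least `min_size`), the threshold value, and the parameter names are all hypothetical.

```python
def split_for_weight_decay(named_shapes, min_size=256):
    """Split parameter names into (decay, no_decay) lists by shape.

    Assumed rule (not taken from the repo): only 2-D matrices whose
    dimensions are both >= min_size receive weight decay.
    """
    decay, no_decay = [], []
    for name, shape in named_shapes:
        is_large_matrix = len(shape) == 2 and min(shape) >= min_size
        (decay if is_large_matrix else no_decay).append(name)
    return decay, no_decay


# Hypothetical parameter names and shapes, for illustration only.
example_params = [
    ("blocks.0.ln1.weight", (768,)),
    ("blocks.0.ln1.bias", (768,)),
    ("blocks.0.att.key.weight", (768, 768)),
    ("blocks.0.att.mix", (1, 1, 768)),
]

decay, no_decay = split_for_weight_decay(example_params)
print("decay:   ", decay)
print("no decay:", no_decay)
```

In PyTorch, two lists like these would typically be turned into optimizer parameter groups with different `weight_decay` values, since optimizers such as `torch.optim.AdamW` accept per-group options. The supplied training scripts remain the authority on which tensors are actually decayed.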
The README also lists a wide surrounding ecosystem: pre-trained weights and GGUF conversions on Hugging Face, a pip package called rwkv, Gradio and WebGPU chat demos, a graphical runner, an inference server called Ai00, a PEFT and LoRA tuning project, an RLHF project, fast CUDA kernels, and a mobile inference library. A successor architecture, RWKV-8 "ROSA", is mentioned at the end. The full README is longer than what was shown.
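To make the "constant-space" idea described above concrete, the toy sketch below contrasts a fixed-size recurrent state with a cache that grows by one entry per token. The decayed-sum recurrence is a stand-in chosen for simplicity and is not RWKV-7's update rule; all function names are hypothetical.

```python
def rnn_step(state, x, decay=0.9):
    """Toy linear recurrence: fold one input vector into a fixed-size state."""
    return [decay * s + xi for s, xi in zip(state, x)]


def run_recurrent(tokens, dim):
    """Process tokens one at a time; memory stays at `dim` numbers."""
    state = [0.0] * dim
    for x in tokens:
        state = rnn_step(state, x)
    return state


def run_with_cache(tokens):
    """Keep every past token around, the way a key-value cache grows."""
    cache = []
    for x in tokens:
        cache.append(x)
    return cache


dim = 4
for n in (10, 100, 1000):
    tokens = [[1.0] * dim for _ in range(n)]
    state = run_recurrent(tokens, dim)
    cache = run_with_cache(tokens)
    print(n, "tokens -> state size", len(state), "| cache entries", len(cache))
```

The state size printed on each line stays at `dim` regardless of input length, while the number of cache entries tracks the token count. That difference is what the constant-space claim refers to.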

Copy-paste prompts

Prompt 1
Set up the RWKV-v7 train_temp script on a single RTX 4090 with 24GB VRAM. List exact CUDA, PyTorch, and DeepSpeed versions plus the run command.
Prompt 2
Convert a pre-trained RWKV-7 checkpoint from Hugging Face into GGUF and run it locally with the rwkv pip package. Show every command.
Prompt 3
Write a minimal inference script that loads RWKV-7 in pure RNN mode and streams tokens one at a time given a prompt.
Prompt 4
Explain the difference between RWKV-7 GPT-mode and RNN-mode using the demo scripts in this repo, and tell me when to pick each.
Prompt 5
Fine-tune RWKV-6 on a 100MB plain-text dataset with LoRA. Reference the PEFT project linked in the README and produce a config file.

Frequently asked questions

What is rwkv-lm?

Training code and reference implementation for RWKV, a recurrent neural network language model that aims to match transformer quality with constant memory and linear-time inference.

What language is rwkv-lm written in?

Mainly Python. The stack also includes PyTorch, PyTorch Lightning, DeepSpeed, and CUDA.

Is rwkv-lm actively maintained?

Maintained: there has been a commit in the last 6 months (last push 2026-05-08).

What license does rwkv-lm use?

The license is not stated in the available content, though the README notes that the project is under the Linux Foundation AI umbrella with freely available weights.

How hard is rwkv-lm to set up?

Setup difficulty is rated hard, with roughly 1 day or more to a first successful run.

Who is rwkv-lm for?

Mainly researchers.

Verify against the repo before relying on details.