Download the ClawGym-Task and ClawGym-Trajectory datasets from Hugging Face to study agent task data
Run supervised fine-tuning on the ClawGym base models using the SFT folder
Train a claw agent with reinforcement learning using the RL folder
Reproduce the ClawGym paper results with the released 4B, 8B, and 30A3 checkpoints
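The first use case above, downloading the two datasets, could be sketched as follows. This is a minimal sketch and is not taken from the repository. The Hugging Face organization name `RUC-AIBOX` and the repo IDs built from it are assumptions that should be checked against the links in the README, and `dataset_repo_id` and `fetch_dataset` are hypothetical helper names. The only library call used is `snapshot_download` from the `huggingface_hub` package.

```python
# Assumption: the datasets live under this Hugging Face organization.
# Verify the real repo IDs against the links in the README before use.
ASSUMED_ORG = "RUC-AIBOX"
DATASET_NAMES = ("ClawGym-Task", "ClawGym-Trajectory")


def dataset_repo_id(name: str, org: str = ASSUMED_ORG) -> str:
    """Build a Hugging Face repo ID of the form '<org>/<name>'."""
    if name not in DATASET_NAMES:
        raise ValueError(f"unknown dataset name: {name!r}")
    return f"{org}/{name}"


def fetch_dataset(name: str, org: str = ASSUMED_ORG) -> str:
    """Download every file of one dataset repo and return the local directory."""
    # Imported lazily so dataset_repo_id works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=dataset_repo_id(name, org), repo_type="dataset")
```

Under these assumptions, `fetch_dataset("ClawGym-Task")` would return the path of a local cache directory holding the dataset files. Inspect the downloaded files before writing any parsing code, since the README does not describe the task format.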
The README is mostly links to Hugging Face and the paper, so training requires GPU infrastructure and a direct read of the SFT and RL folders.
ClawGym-Agents is the public-facing piece of a research project from a group called RUC-AIBOX. The README itself is very short, and most of it is a set of links rather than a long explanation. The repository pairs with a research paper titled "ClawGym: A Scalable Framework for Building Effective Claw Agents", listed as a 2026 arXiv preprint with Bai, Song, Sun, and several other authors.

The README points to two datasets that the team has published on Hugging Face. The first is called ClawGym-Task and contains around 13,500 tasks. The second is called ClawGym-Trajectory and contains around 24,500 trajectories. The word trajectory in this kind of work usually means a recorded sequence of actions an agent took while attempting a task, so the two datasets line up: one set of problems to solve, one set of recorded attempts.

The README also lists three trained models, all hosted on Hugging Face. ClawGym-4B and ClawGym-8B are named after their size, with four billion and eight billion parameters respectively. ClawGym-30A3 is a third variant whose naming the README does not explain. The repository is set up so that anyone can download the data and the models from Hugging Face by following the links.

The training code for the models is split into two folders inside this repository. One folder is named SFT, which is short for supervised fine-tuning, and the other is named RL, which is short for reinforcement learning. The README only points at these folders without describing the contents.

Beyond the dataset table, the model table, the training code pointer, and the BibTeX citation block, the README does not say anything about what a claw agent actually does, how the data was collected, what task format is used, or how the models compare. Anyone who wants more detail will need to read the linked paper or open the SFT and RL folders directly.
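The pairing between tasks and trajectories described above can be pictured with a small data-structure sketch. This is an illustration only: the README does not document the dataset schema, so the field names `task_id`, `instruction`, and `actions`, the helper `group_by_task`, and the toy records are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str      # hypothetical field name
    instruction: str  # hypothetical field name


@dataclass
class Trajectory:
    task_id: str                                      # links the attempt to its task
    actions: list[str] = field(default_factory=list)  # recorded steps, in order


def group_by_task(trajectories):
    """Map each task_id to the list of trajectories recorded for it."""
    grouped = defaultdict(list)
    for trajectory in trajectories:
        grouped[trajectory.task_id].append(trajectory)
    return dict(grouped)


# Toy records invented for this example; they are not from the real datasets.
tasks = [Task("t1", "rename a file"), Task("t2", "summarize a log")]
trajectories = [
    Trajectory("t1", ["list_dir", "move_file"]),
    Trajectory("t1", ["move_file"]),
    Trajectory("t2", ["read_file", "write_summary"]),
]

attempts = group_by_task(trajectories)
for task in tasks:
    print(task.task_id, len(attempts.get(task.task_id, [])))
```

Grouping by a shared identifier is one plausible way to join the two datasets. Whether the real files carry such an identifier has to be confirmed by inspecting them.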
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.