Analysis updated 2026-06-24
Reproduce the SEAL paper results on the BFCL function-calling benchmark
Train your own tool-using agent with category-aware GRPO reward reweighting
Extend the diagnostic taxonomy to cover new failure modes in agent rollouts
Plug a custom task adapter into the SEAL training loop
| | yihaohu0118/seal | power-codes/scanner-ip-cdns | tg12/phantomstars |
|---|---|---|---|
| Stars | 38 | 38 | 38 |
| Language | Python | Python | Python |
| Setup difficulty | hard | easy | easy |
| Complexity | 5/5 | 2/5 | 3/5 |
| Audience | researcher | ops devops | ops devops |
Figures from each repo's GitHub metadata at analysis time.
Requires two separate conda environments plus a running BFCL benchmark service and GPU compute to actually train.
SEAL stands for Synergistic Co-Evolution of Agents and Learning Environments. It is a research project that comes with a paper, a poster, and a project homepage, written by authors from Ant Group, Westlake University, the University of Michigan, and the University of Science and Technology of China. The license is Apache 2.0.
The project targets tool-using AI agents: agents that call functions, query APIs, or run commands to finish a task. The core idea is that the agent and the training environment improve together in a closed loop. The agent runs through tasks, the system records which steps fail, and the failures are sorted into categories such as invalid tool calls, wrong arguments, missed tool calls, failed recovery attempts, and responses that do not match what was expected. These labels then feed back into both the training interface and the model itself.
Training uses GRPO (Group Relative Policy Optimization), a reinforcement learning method, with the diagnostic categories reweighting the rewards given during training. The README says the tool definitions, task labels, and verifier stay the same during evaluation, so the comparison to other methods remains fair. The training environment is built on BFCL, a public benchmark for function-calling agents.
To run it, you clone the repo, create a Python 3.10 conda environment called seal, install the requirements, and then set up a second conda environment for the BFCL benchmark using its setup script. After both are ready, you launch the BFCL service and then start the training run with python launcher.py pointing at exp/SEAL.yaml.
The repository is organized into folders for the experiment config, the BFCL environment service, modules for diagnostic state and reward reweighting, task adapters, and the released data splits.
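The category-aware reweighting described above can be illustrated with a minimal sketch. This is not SEAL's code: the `FailureCategory` names mirror the failure types listed in the summary, but the identifiers, the weighting rule (`1 + alpha * share of diagnosed failures`), and the choice to scale advantages rather than raw rewards are all assumptions made for illustration. Only the group-relative normalization (reward minus group mean, divided by group standard deviation) follows the usual GRPO formulation.

```python
from collections import Counter
from enum import Enum
from statistics import mean, pstdev


class FailureCategory(Enum):
    """Failure labels from the summary; the identifiers are hypothetical."""
    INVALID_TOOL_CALL = "invalid_tool_call"
    WRONG_ARGUMENTS = "wrong_arguments"
    MISSED_TOOL_CALL = "missed_tool_call"
    FAILED_RECOVERY = "failed_recovery"
    RESPONSE_MISMATCH = "response_mismatch"


def category_weights(diagnoses, alpha=1.0):
    """Assumed rule: weight = 1 + alpha * (category's share of diagnosed failures).

    `diagnoses` holds one FailureCategory per rollout, or None when no
    failure was diagnosed. Categories that never occur get no entry.
    """
    counts = Counter(d for d in diagnoses if d is not None)
    total = sum(counts.values())
    return {cat: 1.0 + alpha * n / total for cat, n in counts.items()}


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style normalization within one group of rollouts for the same task."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def reweighted_advantages(rewards, diagnoses, alpha=1.0):
    """Scale each rollout's advantage by the weight of its diagnosed category.

    Rollouts without a diagnosed failure keep weight 1.0 (an assumption).
    """
    weights = category_weights(diagnoses, alpha)
    advantages = group_relative_advantages(rewards)
    return [a * weights.get(d, 1.0) for a, d in zip(advantages, diagnoses)]


if __name__ == "__main__":
    # Four rollouts of one task: two succeed, two fail with wrong arguments.
    rewards = [1.0, 0.0, 0.0, 1.0]
    diagnoses = [None, FailureCategory.WRONG_ARGUMENTS,
                 FailureCategory.WRONG_ARGUMENTS, None]
    print(reweighted_advantages(rewards, diagnoses))
```

In this sketch, the more often a failure category shows up in a batch, the more strongly rollouts exhibiting it are pushed away from. The actual rule in the repository's reward-reweighting module may differ and should be checked there.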
Research code for training tool-using AI agents with a closed-loop reinforcement learning method that categorizes failures and reweights GRPO rewards on the BFCL benchmark.
Mainly Python. The stack also includes Conda and GRPO.
Apache 2.0 lets you use, modify, and distribute commercially with attribution and a patent grant.
Setup difficulty is rated hard, with roughly a day or more to a first successful run.
Mainly researchers.
Verify against the repo before relying on details.