MetaFine is a software framework for testing robot manipulation policies, the programs that decide how a robot arm should move to grasp, slide, insert, or otherwise handle objects. The README frames it as a diagnostic tool rather than a leaderboard. Where most existing benchmarks check only whether the task ultimately succeeded, MetaFine splits a policy's performance into three separate scores: understanding, perception, and behaviour. The idea is that two policies with the same overall success rate may fail in very different ways, and the three-score breakdown makes those differences visible.
Understanding is measured by breaking a task into stages and reporting success on each stage, so you can see exactly where the chain breaks: engagement, manipulation, or release. Perception is measured by running domain-randomisation sweeps over lighting, camera pose, and camera rotation, and reporting the area under the success curve as a single 0-to-1 score per axis. Behaviour is measured by how smooth the action trajectory is, using metrics such as jerk RMS, velocity variance, and path length, which can expose jerky or hesitant policies that still happen to succeed.
The platform is built from small reusable pieces. There are 21 atomic skills such as grasp, rotate, slide, and insert, each declared with a @register_skill decorator and matched to objects through a closed set of 11 affordance types. The asset library currently has more than 40 part-annotated articulated objects, each shipping a URDF file and a generated capabilities.json that declares what the object can do. Tasks are described as compositional task graphs in YAML, and the README says that adding a long-horizon task takes roughly 30 lines of YAML rather than a new environment class.
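To make the three diagnostic scores described above concrete, here is a minimal Python sketch of how such numbers could be computed. It is not MetaFine's implementation: the function names, the episode format (one dict of stage flags per episode), the trapezoidal normalisation of the success curve, and the choice of end-effector positions sampled at a fixed time step are all assumptions made for illustration. The exact definitions MetaFine uses may differ and should be checked against the repository.

```python
from math import sqrt

# Stage names come from the text; everything else here is hypothetical.
STAGES = ("engagement", "manipulation", "release")


def stage_success_rates(episodes, stages=STAGES):
    """Understanding: fraction of episodes in which each stage succeeded.

    `episodes` is a list of dicts mapping stage name -> bool (assumed format).
    """
    if not episodes:
        raise ValueError("need at least one episode")
    return {s: sum(bool(ep.get(s, False)) for ep in episodes) / len(episodes)
            for s in stages}


def normalised_auc(levels, success_rates):
    """Perception: trapezoidal area under success vs. perturbation level.

    Dividing by the swept range maps the result to [0, 1] whenever every
    success rate lies in [0, 1]. `levels` must be strictly increasing.
    """
    if len(levels) != len(success_rates) or len(levels) < 2:
        raise ValueError("need two or more matching (level, rate) points")
    area = 0.0
    for i in range(1, len(levels)):
        dx = levels[i] - levels[i - 1]
        if dx <= 0:
            raise ValueError("levels must be strictly increasing")
        area += 0.5 * (success_rates[i] + success_rates[i - 1]) * dx
    return area / (levels[-1] - levels[0])


def _norm(v):
    return sqrt(sum(c * c for c in v))


def _finite_diff(seq, dt):
    """Forward differences of a sequence of equal-length tuples."""
    return [tuple((b - a) / dt for a, b in zip(p, q))
            for p, q in zip(seq, seq[1:])]


def smoothness_metrics(positions, dt):
    """Behaviour: path length, speed variance, and jerk RMS.

    `positions` is a list of (x, y, z) tuples sampled every `dt` seconds;
    at least four samples are needed for a third-order difference.
    """
    if len(positions) < 4 or dt <= 0:
        raise ValueError("need at least four samples and a positive dt")
    vel = _finite_diff(positions, dt)
    jerk = _finite_diff(_finite_diff(vel, dt), dt)
    speeds = [_norm(v) for v in vel]
    mean_speed = sum(speeds) / len(speeds)
    return {
        "path_length": sum(_norm(tuple(b - a for a, b in zip(p, q)))
                           for p, q in zip(positions, positions[1:])),
        "velocity_variance": sum((s - mean_speed) ** 2 for s in speeds) / len(speeds),
        "jerk_rms": sqrt(sum(_norm(j) ** 2 for j in jerk) / len(jerk)),
    }


if __name__ == "__main__":
    episodes = [
        {"engagement": True, "manipulation": True, "release": False},
        {"engagement": True, "manipulation": False, "release": False},
    ]
    print(stage_success_rates(episodes))
    print(normalised_auc([0.0, 1.0, 2.0], [1.0, 0.5, 0.0]))
    print(smoothness_metrics([(float(i), 0.0, 0.0) for i in range(5)], dt=0.5))
```

Under these assumptions, two policies with the same final success rate can still differ: one may lose most episodes at the release stage, another may show a low area under the lighting sweep, and a third may succeed with a much higher jerk RMS.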
MetaFine runs on the SAPIEN simulator and the ManiSkill robotics environment, and supports a real-sim hybrid mode the authors call PPI: scan an object with a phone, process it, import it, and reproduce it in simulation under the same diagnostic protocol. The repository vendors seven vision-language-action policy backbones (ACT, DP3, OpenVLA, OpenVLA-OFT, pi-0, pi-0.5, and StarVLA), although the README notes that training has been verified only through the LeRobot and StarVLA paths, and closed-loop inference only for pi-0.5.
The project also ships two Claude Code slash commands. /metafine_help answers questions about the platform by routing to the right section of the user guide, and is strictly read-only. /metafine_add walks the user through designing either a new atomic skill or a new task graph YAML file, validates affordances and predicates, and writes the file only after confirmation.
Installation is an editable pip install (pip install -e .), with per-policy stacks installed separately. The license is MIT and the project is marked alpha.
Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.