cs.RO | Jun 11, 2026 | classified

Mana: Dexterous Manipulation of Articulated Tools

Zhao-Heng Yin, Guanya Shi, Pieter Abbeel, C. Karen Liu

cs.RO, cs.AI, cs.CV, cs.LG

Paper Guide Brief

Reading Brief

Mana presents a sim-to-real framework for dexterous manipulation of articulated tools using a coarse-to-fine pipeline that combines procedural grasp keyframe generation, motion planning, and reinforcement learning, achieving zero-shot transfer on tongs, pliers, clothespins, and syringes.

Central Claim

Mana is a general sim-to-real data generation and policy learning framework for articulated tool manipulation that reinterprets dexterous manipulation as an animation problem. It requires minimal human input and achieves zero-shot transfer across four diverse articulated tools.

Contribution

Why It Matters

If true, this work is the first to achieve zero-shot sim-to-real transfer for both grasping and in-hand manipulation of thin articulated tools such as tongs, pliers, and syringes, using a coarse-to-fine pipeline inspired by computer animation.

Prerequisites

coarse-to-fine pipeline, procedural grasp keyframes, motion planning, reinforcement learning, diffusion policy

Atlas Placement

Robot Manipulation (subfield)

Read If

You care about coarse-to-fine pipelines, procedural grasp keyframes, or motion planning.

Skip If

You only care about zero-shot sim-to-real transfer or teleoperation baselines.

Methods

coarse-to-fine pipeline, procedural grasp keyframes, motion planning, reinforcement learning, diffusion policy, point cloud perception, sim-to-real transfer

Tasks

articulated tool manipulation, dexterous grasping, in-hand manipulation, tool actuation

Datasets

Mana generated trajectories

Benchmarks

zero-shot sim-to-real transfer, teleoperation baseline, open-loop baseline

Noosaga Placements

Learning-Based Manipulation (framework, 95%)
The paper explicitly uses learning-based manipulation, specifically RL and diffusion policies, to generate and execute manipulation trajectories.
"Mana employs a coarse-to-fine pipeline...through motion planning and reinforcement learning"; "We train a point-cloud-conditioned transformer diffusion policy"

Robot Manipulation (subfield, 95%)
The paper directly addresses dexterous manipulation of articulated tools, which is a core problem in robot manipulation.
"Articulated tool manipulation remains a major challenge in dexterous robotics"; "We present Mana (Manipulation Animator), a general sim-to-real framework for dexterous manipulation"

Robot Learning (subfield, 90%)
The framework uses reinforcement learning and diffusion policies to generate and learn manipulation trajectories.
"motion planning and reinforcement learning to generate manipulation trajectories"; "We train a point-cloud-conditioned transformer diffusion policy"

Deep Reinforcement Learning (framework, 85%)
The framework uses reinforcement learning for short, contact-rich manipulation phases, which is a core part of the deep RL framework.
"reinforcement learning is used only for the short, contact-rich phases where delicate position-force coordination is required"; "We use the same RL formulation described in the next section for challenging cases"

Robotics (subfield, 70%)
The work bridges robotics and AI through learning-based methods and sim-to-real transfer.
"sim-to-real reinforcement learning approach is also insufficient on its own"; "train a point-cloud-conditioned transformer diffusion policy"

Optimization-Based Motion Planning (framework, 70%)
Motion planning is used for the pre-grasping phase to generate collision-free trajectories.
"We instead implement a GPU-accelerated RRTConnect algorithm with path smoothing to plan trajectories"
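
For readers unfamiliar with the planner named in the last quote, the following is a minimal CPU sketch of RRT-Connect with shortcut-based path smoothing. It is not the authors' GPU implementation. The 2D unit-square workspace, circular obstacles, step size `eps`, and all function names (`segment_free`, `rrt_connect`, `shortcut`) are assumptions made for illustration, and shortcutting is just one common smoothing choice that the source does not specify.

```python
import math
import random

def segment_free(a, b, obstacles, step=0.02):
    """Sampled collision check: True if segment a->b misses every circle (center, radius)."""
    n = max(1, int(math.dist(a, b) / step))
    for i in range(n + 1):
        t = i / n
        p = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        if any(math.dist(p, c) <= r for c, r in obstacles):
            return False
    return True

def steer(a, b, eps):
    """Move from a toward b by at most eps; returns b itself when within reach."""
    d = math.dist(a, b)
    if d <= eps:
        return b
    t = eps / d
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))

def extend(tree, target, obstacles, eps):
    """Grow the tree (dict: node -> parent) one step toward target; None if blocked."""
    near = min(tree, key=lambda q: math.dist(q, target))
    new = steer(near, target, eps)
    if new in tree or not segment_free(near, new, obstacles):
        return None
    tree[new] = near
    return new

def connect(tree, target, obstacles, eps):
    """Keep extending toward target; return target if reached, None if blocked."""
    while True:
        new = extend(tree, target, obstacles, eps)
        if new is None or new == target:
            return new

def trace(tree, node):
    """Walk parent pointers from node back to the tree root."""
    path = []
    while node is not None:
        path.append(node)
        node = tree[node]
    return path

def rrt_connect(start, goal, obstacles, eps=0.1, iters=5000, rng=None):
    """Bidirectional RRT-Connect in the unit square; returns a start->goal path or None."""
    rng = rng or random.Random(0)
    ta, tb = {start: None}, {goal: None}
    a_is_start = True
    for _ in range(iters):
        sample = (rng.random(), rng.random())
        new = extend(ta, sample, obstacles, eps)
        if new is not None and connect(tb, new, obstacles, eps) == new:
            path = trace(ta, new)[::-1] + trace(tb, new)[1:]
            return path if a_is_start else path[::-1]
        ta, tb = tb, ta  # alternate which tree is extended toward the sample
        a_is_start = not a_is_start
    return None

def shortcut(path, obstacles, tries=200, rng=None):
    """Smoothing: replace a sub-path by a straight segment whenever it is collision-free."""
    rng = rng or random.Random(0)
    path = list(path)
    for _ in range(tries):
        if len(path) < 3:
            break
        i, j = sorted(rng.sample(range(len(path)), 2))
        if j - i > 1 and segment_free(path[i], path[j], obstacles):
            path = path[:i + 1] + path[j:]
    return path
```

A call such as `shortcut(rrt_connect((0.1, 0.1), (0.9, 0.9), [((0.5, 0.5), 0.2)]), [((0.5, 0.5), 0.2)])` would plan around a single disc and then prune redundant waypoints. In a real arm-hand system the 2D points would be joint configurations and the collision check would query the simulator.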

Abstract

Articulated tool manipulation remains a major challenge in dexterous robotics due to the need to coordinate internal degrees of freedom and contact-rich interactions. While prior work has largely focused on rigid objects, articulated tool use remains underexplored because of its physical complexity and the difficulty of learning functional grasping and manipulation policies. We present Mana (Manipulation Animator), a general sim-to-real framework that reinterprets dexterous manipulation as an animation problem. Inspired by computer animation, Mana employs a coarse-to-fine pipeline that transforms procedurally-generated grasp keyframes into manipulation trajectories through motion planning and reinforcement learning. The data generation process is largely automatic, requiring only a few mouse clicks to specify functional affordances (<1 minute per tool). Across four articulated tools spanning different scales and joint types, Mana achieves zero-shot sim-to-real transfer for both grasping and in-hand manipulation, demonstrating a scalable approach to dexterous articulated tool use.

Paper Context

Source Context: Whole paper

Budget: 100,000 tokens

Coverage: 52,865 chars

Classified from the full extracted paper text (52,865 characters). The Paper Guide brief above is the user-facing synthesis; raw context is kept out of the page.
