All videos are 1× speed (60 Hz control).

What Makes Dexterous Assembly Challenging?

Precise assembly is sparse-reward and contact-rich: the robot must learn grasping, in-hand reorientation, alignment, and tight insertion.

Play2Perfect Assembly Tasks (Sound On 🔊)

Dexterous play pretraining enables precise assembly across tight insertion, multi-part assembly, and screwing.


Recovery Behavior (Sound On 🔊)

Even after an initial failure, the policy keeps acting closed-loop, retrying until it completes the task.


Abstract

Multi-fingered robots promise the speed and dexterity of human hands, yet challenging problems such as precise assembly have remained out of reach. These tasks are contact-rich, making data collection for imitation learning difficult, and sparse-reward, making direct exploration with reinforcement learning (RL) intractable. Consequently, prior work has made progress by structuring the problem with specialized grippers, tool attachments, and environment fixtures. In this work, we argue that before a robot can perfect precise assembly, it must first learn to play. We further ask: what factors in the process of learning to play matter for precise assembly? We propose Play2Perfect, an RL framework for task-agnostic pretraining through play on diverse objects and goals, which is then perfected on precise assembly. The goal of play is to acquire reusable manipulation priors, such as grasping, in-hand reorientation, and pose reaching. Finetuning then adapts this general prior to assembly, focusing exploration on the final contact-rich, high-precision interactions needed for success. We systematically study key design choices in play pretraining, including object diversity, training objective, trajectory diversity, and goal precision. We show that our prior is 33× more sample-efficient than RL training from scratch, even when provided with dense, multi-stage rewards. We demonstrate zero-shot sim-to-real transfer, achieving 60% success on tight insertions with only 0.5 mm contact clearance, and over 50% success on long-horizon multi-part assembly and screwing.

Assembly Tasks

Method

1. Play Pretraining

We train a goal-conditioned RL policy on diverse objects and goals to learn reusable manipulation priors such as grasping, in-hand reorientation, and object-pose control.
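
To make "goal-conditioned" concrete, here is a minimal sketch of how the distance between an object's current pose and a goal pose could be measured and thresholded. It is an illustration only, not the reward or success check used in Play2Perfect: the function names, the `(w, x, y, z)` quaternion convention, and the tolerance values are all assumptions.

```python
import math


def pose_error(pos, quat, goal_pos, goal_quat):
    """Return (translation error, rotation error in radians).

    Hypothetical helper: quaternions are assumed to be (w, x, y, z)
    and need not be normalized.
    """
    pos_err = math.dist(pos, goal_pos)
    q_norm = math.sqrt(sum(c * c for c in quat))
    g_norm = math.sqrt(sum(c * c for c in goal_quat))
    # The absolute value handles the double cover: q and -q are the same rotation.
    dot = abs(sum(a * b for a, b in zip(quat, goal_quat))) / (q_norm * g_norm)
    rot_err = 2.0 * math.acos(min(1.0, dot))
    return pos_err, rot_err


def goal_reached(pos, quat, goal_pos, goal_quat, pos_tol=0.01, rot_tol=0.1):
    """True when both errors fall under the (illustrative) tolerances."""
    pos_err, rot_err = pose_error(pos, quat, goal_pos, goal_quat)
    return pos_err <= pos_tol and rot_err <= rot_tol
```

Tightening `pos_tol` and `rot_tol` is one way to think about the "goal precision" design choice named in the abstract, although the actual mechanism may differ.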

2. Assembly Finetuning

We adapt the play prior to sparse-reward precise assembly tasks, focusing on contact-rich interactions.
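
The sketch below shows what a sparse reward for an insertion task can look like: the agent receives nothing until the part is actually seated. It assumes a vertical hole whose axis points along -z from its opening; the function name, parameters, and thresholds are hypothetical and are not taken from the paper.

```python
import math


def sparse_insertion_reward(peg_tip, hole_top, hole_depth,
                            lateral_tol=0.0005, depth_frac=0.9):
    """Return 1.0 only when the peg is seated in the hole, else 0.0.

    Illustrative assumptions: positions are (x, y, z) in metres, the hole
    axis runs along -z from `hole_top`, and the tolerances are placeholders.
    """
    lateral = math.hypot(peg_tip[0] - hole_top[0], peg_tip[1] - hole_top[1])
    depth = hole_top[2] - peg_tip[2]  # positive once the tip is below the opening
    seated = lateral <= lateral_tol and depth >= depth_frac * hole_depth
    return 1.0 if seated else 0.0
```

Every state before seating returns the same zero reward, so a policy trained from scratch gets no signal about whether it is grasping or aligning well. This is the gap the play prior is meant to fill.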

3. Zero-shot Sim-to-Real

We deploy the simulation-trained policy zero-shot in the real world on precise assembly tasks.
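
For readers unfamiliar with closed-loop deployment, the following is a rough sketch of a fixed-rate control loop at the 60 Hz control rate noted at the top of the page. The callables `policy`, `read_observation`, `send_action`, and `is_done` are hypothetical stand-ins for the real robot interface, which this page does not describe.

```python
import time


def run_closed_loop(policy, read_observation, send_action, is_done,
                    rate_hz=60.0, max_steps=600):
    """Query the policy on a fresh observation every tick until done.

    All four callables are placeholders supplied by the caller.
    Returns the number of actions sent.
    """
    period = 1.0 / rate_hz
    next_tick = time.perf_counter()
    for step in range(max_steps):
        obs = read_observation()
        if is_done(obs):
            return step
        # Re-planning from the latest observation on every tick is what
        # lets the policy retry after a failed attempt.
        send_action(policy(obs))
        next_tick += period
        delay = next_tick - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
    return max_steps
```

Scheduling against an absolute `next_tick` rather than sleeping a fixed period keeps the loop from drifting when a single policy call runs long.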