cs.RO

How one robot controller handles joints, hands, and human poses

Zuxing Lu, Ziang Zheng, Yao Lyu, Jingyu Liu, Feihong Zhang, Song Lu, Xin Yuan, Changyin Sun, Xingxing Zuo, Shengbo Eben Li

June 3, 2026

Humanoid robots struggle when tasks demand different kinds of motion reference, such as joint angles for walking and hand positions for grasping. M3imic unifies these mismatched input types with separate encoders that feed a shared latent space, then trains a single policy with reinforcement learning that transfers directly to real Unitree G1 robots. The approach handles joint angles, human poses, and end-effector targets without task-specific retuning; it achieves a 98% success rate in simulation and has been validated on hardware.
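To make the "separate encoders, shared latent space, one policy" idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the input sizes, latent size, action size, single linear layers, and the names `MultimodalController`, `INPUT_DIMS`, and `LATENT_DIM` are all illustrative assumptions, the weights are random rather than trained, and the reinforcement learning loop is omitted.

```python
import math
import random

LATENT_DIM = 8  # assumed size of the shared latent space (illustrative only)

# Assumed input sizes for each reference type; placeholders, not from the paper.
INPUT_DIMS = {
    "joint_angles": 29,
    "human_pose": 72,
    "end_effector": 12,
}


def make_linear(in_dim, out_dim, rng):
    """Return a random weight matrix with out_dim rows and in_dim columns."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.uniform(-scale, scale) for _ in range(in_dim)]
            for _ in range(out_dim)]


def apply_linear(weights, x):
    """Multiply a weight matrix by an input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class MultimodalController:
    """One encoder per reference type, one policy head shared by all of them."""

    def __init__(self, action_dim=29, seed=0):
        rng = random.Random(seed)
        # Each modality gets its own encoder, but all map to LATENT_DIM.
        self.encoders = {
            name: make_linear(dim, LATENT_DIM, rng)
            for name, dim in INPUT_DIMS.items()
        }
        # The policy only ever sees the latent, never the raw reference.
        self.policy = make_linear(LATENT_DIM, action_dim, rng)

    def encode(self, modality, reference):
        """Map a modality-specific reference vector into the shared latent."""
        if modality not in self.encoders:
            raise ValueError(f"unknown modality: {modality}")
        if len(reference) != INPUT_DIMS[modality]:
            raise ValueError(f"{modality} expects {INPUT_DIMS[modality]} values")
        return [math.tanh(z)
                for z in apply_linear(self.encoders[modality], reference)]

    def act(self, modality, reference):
        """Produce an action from any supported reference type."""
        return apply_linear(self.policy, self.encode(modality, reference))


if __name__ == "__main__":
    controller = MultimodalController()
    for name, dim in INPUT_DIMS.items():
        action = controller.act(name, [0.1] * dim)
        print(name, "->", len(action), "action values")
```

Because every encoder produces a latent of the same size, the policy head does not need to know which reference type produced it. In this sketch that is why one controller can serve all three input types; a real controller would presumably also condition on the robot's own state, which is left out here.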
Published as "M3imic: Learning a Versatile Whole-Body Controller for Multimodal Motion Mimicking", arXiv:2606.04829.