Meta’s trying to teach robots manual skills


Facebook’s parent company, Meta, has released HOT3D, a novel dataset and benchmark for teaching artificial intelligence how humans use their hands to manipulate tools.

Tool use is arguably among the key traits that distinguish humans from other species. With HOT3D, Mark Zuckerberg’s Meta wants robots to have the same hand dexterity as we do.

The new dataset and benchmark are supposed to “unlock new opportunities” in areas such as “transferring manual skills from experts to less experienced users or robots,” Meta says.

While teaching robots to make an omelet sounds enticing, HOT3D is expected to enable more down-to-earth applications first, like “helping an AI assistant to understand user's actions, or enabling new input capabilities for AR/VR users.”

The dataset offers researchers one million multi-view frames of hand-object interactions, with over 800 minutes of first-person (egocentric) recordings of humans interacting with 33 hand-held objects.

The key element of the dataset is its tracking data: small optical markers were attached to participants' hands and to the objects they handled, and a motion-capture system recorded the markers' positions.

The data was captured using Meta’s Project Aria glasses and Quest 3 VR headsets. According to the company, HOT3D also includes 3D object models with PBR (Physically Based Rendering) materials, 2D bounding boxes, gaze signals, and 3D scene point clouds from SLAM.

“We use our hands to communicate with others, interact with objects, and handle tools. Yet reliable understanding of how people use their hands to manipulate objects remains a key challenge for computer vision research,” Meta said.