Egocentric data: After the troll farms, here come the arm farms.
- benjamin. brl
- 3 days ago
- 2 min read

[Image: egocentric footage example]
How egocentric data is becoming the fuel of world models, starting now.
I wrote about ImageNet: 14 million images, hand-labeled, that became the foundation of modern visual AI.
This isn't new. To label those 14 million images, Fei-Fei Li needed humans at scale to perform micro-tasks: thousands of workers, clicking through images, one label at a time. At MyDataMachine, we've been doing exactly this for years: connecting human intelligence to AI training pipelines.
The same thing is happening right now. Except this time, it’s not images. It’s arms.
Thousands of people around the world are strapping cameras to their wrists and filming themselves doing chores: folding laundry, washing dishes, repotting plants. Not for social media. To train humanoid robots.
The technical term for this type of data is egocentric data: footage captured from the agent's own point of view.
The industry has a name for the places where this happens at scale: arm farms.
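To make "egocentric data" concrete, here is a minimal sketch of what a single record in a robot-training pipeline might look like. Everything in it is an illustrative assumption: the field names, the quality gate, and the file paths are hypothetical, not Sunain's, DoorDash's, or anyone's actual schema.

```python
from dataclasses import dataclass

# Illustrative sketch only: field names and structure are assumptions,
# not any real company's schema for egocentric training data.
@dataclass
class EgocentricClip:
    contributor_id: str   # anonymized worker who wore the camera
    task_label: str       # e.g. "folding_laundry", "loading_dishwasher"
    video_path: str       # raw first-person footage from the wrist camera
    duration_s: float     # clip length in seconds
    environment: str      # "home_kitchen", "home_laundry_room", ...
    interruptions: int = 0  # real-world context switches during the task

def is_trainable(clip: EgocentricClip, min_duration_s: float = 5.0) -> bool:
    """Toy quality gate: keep clips long enough to contain a full task."""
    return clip.duration_s >= min_duration_s and bool(clip.task_label)

# Example: one record from a hypothetical "arm farm" batch
clip = EgocentricClip(
    contributor_id="c-0412",
    task_label="folding_laundry",
    video_path="batch_07/c-0412_0001.mp4",
    duration_s=142.0,
    environment="home_laundry_room",
    interruptions=2,
)
print(is_trainable(clip))  # True
```

The point of the sketch: the footage alone is not the product. The labels and context around it (task, environment, interruptions) are what make a mundane clip usable as training fuel.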
Why does this matter? Because robots cannot learn from text or flat images. They need to see humans move: in real homes, with real interruptions, real mess, real context switching. There is no YouTube for robot behaviors. Not yet.
Goldman Sachs, Morgan Stanley and Bain have all said the same thing: data (not hardware, not compute) is the limiting factor for the humanoid robot industry in 2025 and 2026.
A few numbers

Two signals worth watching.
Signal 1: Sunain
A startup shipping wrist cameras by mail to 25,000 contributors in 30 countries. They film natural household tasks at home: cooking, cleaning, watering plants. $80 for 2 hours. The CEO's pitch: "This will be one of the biggest gig economies that will ever exist in the world."
Signal 2: DoorDash
In March 2026, DoorDash launched a “Tasks” app for its 8 million US couriers. Between deliveries: film yourself folding laundry, loading a dishwasher, repotting plants. Variable pay per task. The same model Uber tested quietly in 2024.
MyDataMachine connects human intelligence to AI training pipelines. Arm farms connect human movement to robot training pipelines. Same insight, two decades apart: the most valuable AI fuel is human behavior, captured at scale.
The pattern is the same as with ImageNet: the data that seems mundane, a hand washing a dish, an arm folding a sock, is the most valuable fuel for the next generation of AI.
PS: Did the police accidentally get ahead of this market? Body cams on officers worldwide have been capturing human movement in real environments for years. Nobody called it a training dataset. But here we are.
The question for business leaders
Every industry will need robots that understand its specific environment. A hospital is not a warehouse. A restaurant kitchen is not a factory floor. The data to train those robots doesn't exist yet. Whoever builds it first wins.
Your prompt for this week
"Which robot will understand my industry first, and what data will it need to get there?"