I'd go for something in the manipulation of ropes or wires.
State of the art seems to be that they can untangle a loosely knotted cord.
Untying a short rope with a tightly pulled overhand knot in the middle seems like it's decades away. You have to be able to grip it well enough, then twist the rope and push (even though every physicist says pushing a rope is impossible).
Interesting. Futurism is super hard, but "decades" seems too far away to me. I think with strong two-finger grippers this is probably close to state of the art, especially with a wrist force sensor, like the TRI setup.
Standard evening family home tidy/reset - toys, books, clothes, shoes away in their places. All over the house.
Oh, and load/unload dishwasher. Same with laundry machines. Along with folding laundry, these are the domestic robot equivalents of 'de-mining' and 'search and rescue': the classic motivating use cases for mobile autonomous robots.
You need to manipulate a large sheet, and you probably need to move around, bend down and lean over to reach all the corners. Bonus points for neat hospital corners on a flat sheet.
Putting pillows in pillowcases is another fun one. Usually pretty easy, probably a bronze medal.
Gold medal: put a UK super king size duvet inside a duvet cover. It's huge and awkward, there are buttons, and it's almost but not quite square (why??) so there's a good chance you'll get it round the wrong way and have to rotate it 90 degrees.
I think this is my favorite suggestion. All of them are super hard, but
- Bronze: pillow in pillow case
- Silver: Make a bed with fitted sheet, flat sheet and blanket
- Gold: put a duvet in a duvet cover
Feels like it would be a good option for a round 2...
Consider examples using building tools, like screwing in a drywall screw, hammering a nail, using a paint roller, caulking a sink, or minor plumbing repair with a torch and solder. These differ enough in terms of forces, state changes, and combined dexterity/acuity (two-handed proprioception) from the Windex, sandwich and key examples.
How about something with unpacking items from a shopping bag? I suspect the difference in bags (standard plastic, reusable, etc.) and certain items can really crank up the difficulty.
It can also make for a good story: open the door to get the grocery delivery, unpack the delivery, etc.
Thanks for putting this together. This IMO is 1000x better than any other AI challenges to date. ARC-AGI is bullshit and has nothing to do with reality.
I’ll likely use some of your tests for our robotics (not just humanoid) testing in the future at least as some baselines.
Also I really liked you dressing up as a robot - that’s very fun and really reflects the point of robotics: replace human action for all tasks.
My suggestion: Identify a collapsed person in the home and render first aid (I need this because I have epilepsy and live alone)
Bronze: ID collapse and call emergency services
Silver: Bronze tasks + manipulate person into appropriate recovery position
Gold: Communicate details to emergency personnel and playback previous hours of interactions
Tie your shoelaces? Pet a cat. Open one of those pairs of scissors sold in fused plastic packaging that needs scissors to open. Open a packet of tissues (wet ones for extra challenge). Cook rice. Throw out the trash.
Maybe careful application of large amounts of force? Opening a jar, peeling garlic, splitting a squash, opening a soda can. This category seems like a good test of "grip" strength + force feedback + sense of touch.
I love your list and it makes me think we are so far away from these things ever being feasible/cost effective compared to just hiring a poor person to do it. And the world is making a lot of poor people right now.
something requiring navigating stairs while holding something full like a laundry basket. bronze - straight stairs, silver - one 90deg turn. gold - spiral.
something requiring co-ordination between 2 robots. think relay race which the olympics has. So say, moving a couch together.
btw love the idea and the silver body suit. good stuff.
Ooh. I like full body manipulation. Humans use hips & elbows to move laundry baskets. Two robot collaboration is good too. I wonder who I can convince to wear another silver suit.... :)
(HN link on Substack points at empty page instead of this one, at least before I made this comment.)
What I think is missing is marathon events. Biathlons and triathlons.
We all know LLMs have a rather limited context window. So seeing robots complete longer chains of events would be a more convincing demonstration of what they're capable of than a possibly rigged demo.
Something like: move a stack of boxes from one room to another. The boxes at the end also need to be stacked up. Or how about: pick up a box, go up some stairs, open a door, and put the box on a shelf on the other side.
Also, the real world is sloppy and messy and dirty and, to be real, kinda janky sometimes. Gold for unlocking a door with a key at a well-maintained office complex (and opening it, and walking through it) is one thing: facilities is going to replace the lock before it gets old, and we can assume the door fits in the frame properly, so it doesn't need to be shoved or lifted up or yanked in order to be opened. But in the real world you gotta jiggle the key in just the right way in order to get it to work.
Closing the door (assuming the robots weren't raised in a robot barn) is also harder than it looks if the door is shitty and needs a proper slam in order to be fully closed. Also, the robot locking the door behind itself after it comes in.
Scanning a key card and opening a door, but the first try fails.
We're a long way from a general robot that can drive in a simple screw, like you would to assemble Ikea furniture.
Object recognition.
Gather only the dishes from a messy coffee table and put them in the dish bin.
Pick up only the clothes from a messy floor and bed, and put them in the hamper.
Dump a hamper of clothes onto a table, and sort out stuff that doesn't want to go into the washing machine.
Terrain traversal.
Just walk 500 ft, but there are increasing levels of obstacles in the way.
We all saw the Boston Dynamics robot parkour videos, but what I want to see is a robot make it from the front door of the Simpsons' house to the kitchen in the back. It's got to go through the living room, but it's hella messy, with Maggie and Bart and Lisa’s crap strewn all over, and Homer’s got some beer bottles, some empty, some full, all over the floor and on the table. All the robot has to do is walk from one side of the room to the far side without stepping on anything or knocking anything over. (Simpsons merely being a home layout that's familiar to most people. Doesn't need to actually be them.)
Ducking under a low ceiling. Climbing over barriers of varying shapes and sizes.
Other locomotion. How much weight can it carry in its arms in front of it? Can it hold a 5-lb briefcase with one hand while walking? Can it carry something on its back? What's the limit? Can it give piggyback rides?
A category for simulation. Let companies show off their robot's kinematics control systems: use something on the level of CoppeliaSim, where the motors and the gears and the actuators are themselves simulated, versus a simple 3D video game where they are not. Plug their model into the simulated robot and see how well it just walks. If we remember QWOP, it's harder than it looks!
Obviously it's not going to be totally 100% accurate to the real world. The benefit is that it lets people compete from all over the world without having to replicate a very specific setup in the physical world or fly to your facility to test. That opens things up to a whole new set of contestants who can now afford to compete.
At the end of the day, the most important challenge is, can it pick up a battery from the shelf, swap it with one of the two in its chassis, and put the dead one it just pulled out onto the charger?