Photo credit: Lottie animation by Centre Robotics (LottieFiles Free, used with credit).
I've been covering tech long enough to recognize a pattern, and here it is again: engineers building sophisticated systems while forgetting to ask the people who'll use them what they actually need. Call me old-fashioned, but I thought we learned this lesson during the smartphone wars.
A new dataset from researchers studying explainable robotics (that's the field concerned with making robots explain themselves, for those keeping score at home) reveals something that should've been obvious from the start. When you put household robots in front of regular people and ask what questions they'd want answered, you get responses that don't match what the robotics community has been focused on.
Most academic work in explainable robotics, according to the paper posted to arXiv, obsesses over "why-questions." Why did the robot do that? Why did it choose this path over that one? The assumption is that users want to understand the robot's reasoning process.
But when researchers actually surveyed 100 participants and collected 1,893 questions about household robots, the picture looked different. The most common category? Questions about task execution details, at 21.4%. Followed by questions about the robot's capabilities (12.6%) and performance assessments (10.7%). People want to know what the robot did, what it can do, and whether it did a good job. Not necessarily why it made the choices it made.
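For a sense of scale, here is a back-of-the-envelope sketch that turns those percentages into rough question counts. It assumes the percentages are shares of all 1,893 collected questions, which the summary above implies but does not state outright; the category labels and function names are mine, for illustration only.

```python
# Rough conversion of the reported shares into approximate question counts.
# Assumption: each percentage is a share of all 1,893 collected questions.
TOTAL_QUESTIONS = 1893

# Labels are paraphrased from the article, not the paper's official names.
shares_pct = {
    "task execution details": 21.4,
    "robot capabilities": 12.6,
    "performance assessment": 10.7,
}


def approx_counts(total, shares):
    """Return the approximate number of questions per category."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


def remaining_share(shares):
    """Return the percentage of questions outside the listed categories."""
    return round(100 - sum(shares.values()), 1)


counts = approx_counts(TOTAL_QUESTIONS, shares_pct)
for name, n in counts.items():
    print(f"{name}: about {n} questions")
print(f"all other categories: {remaining_share(shares_pct)}%")
```

Under that assumption, the top three categories together cover a little under half of everything people asked, so more than half of the questions fall outside them.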
Here's where it gets interesting, though. When the questions were ranked by how important users said they were, rather than by how often they came up, questions about how robots would handle difficult scenarios and ensure correct behavior shot to the top. People ask simple questions most often, but they care most about the hard stuff.
The research found something else that veteran tech watchers will recognize: novices and experienced users ask fundamentally different questions. Novices (self-identified, which, who knows how accurate that is) tend to ask about simple facts. What did the robot do? What's the current state of the environment? More experienced users skip past the basics.
This is the self-driving car hype cycle all over again, in a way. Remember when Tesla first rolled out Autopilot and experienced tech users immediately started testing edge cases while regular drivers were still asking "wait, do I keep my hands on the wheel?" The gap between what engineers assume users understand and what users actually understand has been a problem in every major tech transition I've covered since the '90s.
The researchers created 15 video stimuli and 7 text stimuli showing robots doing various household tasks, then asked Prolific participants what they'd want to ask. It's not a perfect methodology (Prolific users skew younger and more online than the general population), but it's more rigorous than what most robotics labs do, which is basically to ask their grad students and call it user research.
Separate research from Virginia Tech, also posted to arXiv, tackles a related problem from the opposite direction. Their work on "Language Movement Primitives" tries to bridge the gap between what large language models can reason about and what robots can actually do.
The problem, as the researchers frame it: VLMs (vision-language models, as the kids call them) are great at understanding scenes and breaking tasks into logical steps. But they're terrible at translating those steps into actual robot motion. Robotics foundation models can output action commands, but they need fine-tuning before they can handle novel tasks. There's a disconnect between "understand what to do" and "actually do it."
Their solution uses Dynamic Movement Primitives, which provide a small number of interpretable parameters that VLMs can set to specify trajectories. Basically, give the language model a limited vocabulary of motion building blocks rather than asking it to control joints directly.
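To make "a small number of interpretable parameters" concrete, here is a minimal sketch of a one-dimensional Dynamic Movement Primitive in its common textbook form: a damped spring pulls the position toward a goal, and a weighted set of basis functions bends the path along the way. This is not the Virginia Tech code. The function name `rollout_dmp`, the gain values, and the basis-width heuristic are all assumptions chosen for illustration, and the paper's actual parameterization may differ.

```python
import math


def rollout_dmp(y0, goal, weights, tau=1.0, dt=0.001, duration=1.0,
                alpha_z=25.0, alpha_x=4.0):
    """Integrate a 1-D DMP with Euler steps and return the positions.

    The knobs a planner would set are `goal` (where to end up),
    `weights` (how the path bends on the way; at least one entry),
    and `tau` (how fast to move). Gains are illustrative defaults.
    """
    beta_z = alpha_z / 4.0  # critical damping of the spring-damper
    n = len(weights)
    # Basis centers spaced along the decaying phase variable x (1 -> ~0).
    centers = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    if n > 1:
        gaps = [centers[i] - centers[i + 1] for i in range(n - 1)]
        gaps.append(gaps[-1])
    else:
        gaps = [1.0]
    widths = [1.0 / (g * g) for g in gaps]  # simple width heuristic

    y, z, x = y0, 0.0, 1.0
    path = [y]
    for _ in range(int(round(duration / dt))):
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centers)]
        total = sum(psi)
        # Forcing term: fades out as x decays, so the goal is still reached.
        if total > 1e-12:
            f = sum(p * w for p, w in zip(psi, weights)) / total * x * (goal - y0)
        else:
            f = 0.0
        dz = (alpha_z * (beta_z * (goal - y) - z) + f) / tau
        dy = z / tau
        dx = -alpha_x * x / tau
        z += dz * dt
        y += dy * dt
        x += dx * dt
        path.append(y)
    return path


# Same start and goal, two different weight vectors: the endpoint stays
# roughly the same while the route in between changes.
plain = rollout_dmp(0.0, 1.0, [0.0] * 5)
shaped = rollout_dmp(0.0, 1.0, [50.0] * 5)
```

The appeal is easy to see in the sketch: a model only has to pick a goal, a speed, and a handful of weights, instead of commanding every joint at every timestep.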
Across 31 real-world manipulation tasks, their approach achieved 65% task success compared to 35% for the best baseline. That's... better? I guess? It's still a 35% failure rate on tabletop manipulation, which doesn't exactly inspire confidence for household deployment. But what do I know.
Put these two papers together and you get a glimpse of where household robotics actually stands, which is somewhere between "promising research" and "nowhere near ready for your kitchen."
On one hand, we're learning that users have specific, practical questions they want robots to answer, and those questions don't necessarily align with what roboticists have been building toward. On the other hand, we're still struggling to get robots to reliably complete basic manipulation tasks from natural language instructions, even in controlled lab settings.
The optimistic read: at least people are asking the right questions now. The dataset of user questions gives roboticists something concrete to work toward. The Language Movement Primitives work shows progress on the motion control side. Progress is being made!
The skeptical read (and regular readers know which one I lean toward): we've been "making progress" on household robots for decades. The fundamental challenges, making robots that can handle the messy unpredictability of real homes, with real people asking real questions, remain largely unsolved.
I've seen this movie before. The hype cycle builds, investment floods in, demos look impressive, and then actual deployment reveals all the edge cases nobody planned for. Maybe this time is different. The language model revolution has genuinely changed what's possible in human-robot interaction. The question is whether it's changed enough.
Neither paper addresses what happens when these systems fail, and they will fail. The user questions dataset doesn't include "why did you just knock over my grandmother's vase" or "how do I get you to stop doing that thing you keep doing." The Language Movement Primitives work achieves 65% success, but what do the failures in the other 35% look like? These are the questions that matter for actual deployment, and we don't have good answers yet.
The researchers behind the questions dataset note that as robots enter environments shared with humans and language becomes central to interaction, understanding user expectations is crucial. They're right about that. But understanding expectations and meeting them are very different things.
For now, I'll keep watching the papers roll in and the demos get more impressive. But I'll also keep my email open for readers who want to tell me I'm being too pessimistic about the robot future. Maybe the young founders will prove me wrong this time. It's happened before, occasionally.