VLAs Don't Know What They Don't Know. Two New Papers Are Trying to Fix That.

A pair of robotics papers tackle two of the most practical blockers standing between vision-language-action models and real-world deployment: overconfidence and computational bloat.

18 June 2026 · 7 min read

Here's a thing that should unsettle anyone excited about deploying humanoids in the real world: the most capable robot manipulation models currently have no reliable way to tell you when they're about to fail.

That's not a minor footnote. That's a fundamental problem. And it's the one that two new papers from the robotics research community are, in their different ways, trying to solve.

Both papers focus on vision-language-action models, or VLAs. If you've been following the humanoid space at all, you'll have heard the term. VLAs combine vision and language understanding with the ability to output robot actions directly. They've become something of a north star for embodied AI research, and companies like Physical Intelligence, Google DeepMind, and a growing list of startups are betting heavily on them. The empirical results have been genuinely impressive.

But impressive benchmark numbers and safe real-world deployment are two very different things. These papers are about the gap between them.

The Overconfidence Problem

The first paper, out of TU Munich and posted to arXiv, tackles something called epistemic uncertainty. I'll be honest: when I first encountered this framing, I had to sit with it for a bit. Epistemic uncertainty is, basically, uncertainty that comes from gaps in a model's knowledge, as opposed to noise that's inherent to the task itself (usually called aleatoric uncertainty). The distinction matters because epistemic uncertainty is, in principle, reducible. If you know the model doesn't know something, you can do something about it.
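To make the distinction concrete, here is a minimal sketch of one textbook way to separate the two kinds of uncertainty: take several models (an ensemble), compare their predictions, and treat disagreement between them as the epistemic part. This is not the TU Munich paper's method, which isn't described here. The function names, the two-action setup, and the probabilities are all made up for illustration, and it assumes the policy outputs a probability over a small set of discrete actions.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0)


def decompose_uncertainty(member_probs):
    """Split an ensemble's predictive uncertainty into two parts.

    member_probs: one probability vector per ensemble member
                  (hypothetical discrete action distributions).
    Returns (total, aleatoric, epistemic), all in nats.
    """
    n = len(member_probs)
    k = len(member_probs[0])
    # Average prediction across the ensemble.
    mean = [sum(p[i] for p in member_probs) / n for i in range(k)]
    total = entropy(mean)                                  # entropy of the average
    aleatoric = sum(entropy(p) for p in member_probs) / n  # average entropy
    epistemic = total - aleatoric                          # disagreement between members
    return total, aleatoric, epistemic


# Illustrative numbers only.
# Every member says "coin flip": the task itself looks noisy.
agree = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
# Every member is confident, but they contradict each other.
disagree = [[0.99, 0.01], [0.01, 0.99], [0.99, 0.01]]

print(decompose_uncertainty(agree))
print(decompose_uncertainty(disagree))
```

In the first case the epistemic term comes out at essentially zero: the models agree that the outcome is a toss-up, so more data wouldn't help. In the second, almost all of the uncertainty is epistemic, which is exactly the situation where a robot should flag that it is out of its depth.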

