Two Papers Quietly Solve Problems Most Robotics Labs Pretend Don't Exist
New research on curriculum learning reveals why your favorite humanoid demo probably won't scale to the real world.
Photo credit: Lottie animation by Centre Robotics (LottieFiles Free, used with credit).
Most robotics research papers announce breakthroughs. These two acknowledge failures, and that's exactly why they matter.
A pair of arXiv preprints dropped this week that, on the surface, seem unrelated. One tackles a wheel-legged robot balancing multiple spheres on its back. The other studies quadruped locomotion across varied physical conditions. But read them together and a pattern emerges: both are wrestling with the same fundamental problem that haunts sim-to-real transfer, and both arrive at surprisingly similar conclusions about why standard approaches break down.
The question they're asking isn't glamorous. It's not about making robots do backflips or fold laundry. It's about why reinforcement learning policies that work perfectly in simulation often plateau or collapse when you try to scale them up. From my time in hardware, I've seen this movie before. A demo works. You try to generalize it. Everything falls apart. These papers actually explain why.
The sphere-balancing problem sounds like a party trick, but the arXiv paper uses it to expose a subtle failure mode in how we train robots to handle multiple objects. Here's the setup: a wheel-legged quadruped has to transport free-rolling spheres on its back without any fences or grippers. One sphere is manageable. Two spheres, things get interesting. Five spheres, and most standard architectures simply give up.
The researchers found that conventional approaches plateau at or below the two-sphere stage within the same training budget. That's not a minor limitation. It suggests something is fundamentally wrong with how these systems represent multiple identical objects.
The culprit, they argue, is what they call "per-frame permutation symmetry." When you have multiple identical spheres, their ordering can change independently at each moment in time. Standard neural network architectures don't handle this well. They impose the wrong kind of symmetry over the full history, which creates a concrete failure mode during curriculum-based training.
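To make the distinction concrete, here is a minimal NumPy sketch of the two kinds of symmetry. It is not the paper's architecture: the function names, the random weights, and the toy shapes (4 frames, 3 spheres, 2 features per sphere) are all assumptions chosen for illustration. One encoder pools over spheres inside each frame; the other treats each sphere "slot" as a single trajectory over the full history and pools over slots.

```python
import numpy as np

def permute_frames(history, perms):
    """Reorder the spheres in each frame; perms[t] is the index order for frame t."""
    return np.stack([frame[p] for frame, p in zip(history, perms)])

def per_frame_pooling(history, w):
    """Embed each sphere, then sum over spheres inside every frame separately."""
    feats = np.tanh(history @ w)        # shape (T, N, H)
    return feats.sum(axis=1).ravel()    # shape (T, H), flattened

def per_slot_pooling(history, w_traj):
    """Embed each slot's whole trajectory, then sum over slots."""
    T, N, D = history.shape
    traj = history.transpose(1, 0, 2).reshape(N, T * D)  # one row per slot
    return np.tanh(traj @ w_traj).sum(axis=0)            # shape (H,)

# Hypothetical toy data: T frames, N spheres, D features, H hidden units.
rng = np.random.default_rng(0)
T, N, D, H = 4, 3, 2, 8
history = rng.normal(size=(T, N, D))
w = rng.normal(size=(D, H))
w_traj = rng.normal(size=(T * D, H))

# One reordering reused for the whole history vs. a different one per frame.
same_order = [np.array([2, 0, 1])] * T
fresh_order = [np.roll(np.arange(N), t) for t in range(T)]

shuffled = permute_frames(history, fresh_order)
print(np.allclose(per_frame_pooling(history, w), per_frame_pooling(shuffled, w)))
print(np.allclose(per_slot_pooling(history, w_traj), per_slot_pooling(shuffled, w_traj)))
```

In this toy setup, the per-frame encoder does not care how the spheres are ordered in any individual frame. The per-slot encoder only tolerates a reordering that is applied identically across the whole history (`same_order`); once the ordering changes from frame to frame, the trajectories it sees are scrambled mixtures of different spheres. That is one plausible reading of "the wrong kind of symmetry over the full history," offered here as an illustration rather than a reproduction of the authors' method.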