Two Papers, One Week: Humanoid Locomotion Research Is Converging on a Fundamental Tradeoff

New work from separate teams tackles the same problem from opposite directions, and the results reveal something important about where humanoid control is actually headed.

9 June 2026 · 8 min read

Roughly an order of magnitude. That's how much one research team claims to have reduced upper-body style error in humanoid walking while maintaining the same fall-recovery rate as baseline reinforcement learning. In the same week, a separate group reports an improvement of over 30% in task success for whole-body loco-manipulation. Two papers, two approaches, and read together they give a clearer picture of where humanoid control research actually stands.

I want to be precise here: these aren't competing solutions to the same problem. They're complementary attacks on what I'd argue is the central tension in humanoid robotics right now. How do you make a robot move naturally without sacrificing its ability to recover when things go wrong?

What's the actual problem these papers are solving?

Reinforcement learning has, at this point, become the default approach to humanoid locomotion. Policies trained in simulation transfer to real hardware with reasonable reliability, and they handle disturbances well. This is genuinely settled science, or at least settled engineering.

The problem is that task-only rewards (walk forward, don't fall, reach the goal) tend to produce what the first paper's authors call "stiff, asymmetric gaits." The robot accomplishes the task, but it looks like it's fighting its own body to do so. Anyone who's watched videos of early Boston Dynamics robots versus their recent work knows exactly what this looks like.

The obvious solution is motion imitation: train the robot to match reference motions from human demonstrations or motion capture. This works, sort of. The robot looks better. But here's the catch, and it's a well-documented tradeoff in the literature: motion imitation methods become more sensitive to external disturbances. The reference signals can actively oppose the transient poses the robot needs to regain balance after a push or stumble.
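To make the tradeoff concrete, here is a minimal sketch of how a combined reward might be shaped. It is not taken from either paper: the function names, the exponential form of each term, the weights, and the three-joint poses are all illustrative assumptions. The point is only that a pose-tracking term scores a legitimate recovery step as bad style, even when the task itself is going fine.

```python
import math

# Hypothetical reward shaping, NOT the formulation from either paper.
# All names, weights, and poses below are illustrative assumptions.

def task_reward(forward_velocity, target_velocity, fallen):
    """Task-only term: track a target speed, heavily penalize falling."""
    if fallen:
        return -10.0
    return math.exp(-(forward_velocity - target_velocity) ** 2)

def imitation_reward(joint_angles, reference_angles, scale=2.0):
    """Style term: decays with squared distance from the reference pose."""
    err = sum((q - r) ** 2 for q, r in zip(joint_angles, reference_angles))
    return math.exp(-scale * err)

def total_reward(forward_velocity, target_velocity, fallen,
                 joint_angles, reference_angles, style_weight=0.5):
    """Weighted sum of the task term and the style term."""
    return (task_reward(forward_velocity, target_velocity, fallen)
            + style_weight * imitation_reward(joint_angles, reference_angles))

# Nominal stride: the pose matches the reference clip exactly.
reference = [0.0, 0.3, -0.6]
nominal = total_reward(1.0, 1.0, False, reference, reference)

# Recovery after a push: the robot stays upright and on speed, but takes a
# wide corrective step, so its pose departs from the reference.
recovery_pose = [0.5, 0.9, -1.2]
recovery = total_reward(1.0, 1.0, False, recovery_pose, reference)

print(f"nominal stride reward: {nominal:.3f}")
print(f"recovery step reward:  {recovery:.3f}")
```

In this toy setup the recovery step earns almost none of the style reward despite full task reward, so a policy optimizing the sum is nudged away from exactly the poses it needs after a disturbance. Raising `style_weight` makes the gait look better and the pressure against recovery stronger, which is the tension described above.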

