Teaching a Humanoid Arm to Move More Efficiently Is Harder Than It Sounds
Two new papers tackle the energy problem in humanoid robots from opposite ends, and together they point at something the field has been quietly ignoring.
2 days ago · 6 min read
Humanoid robots burn through battery charge faster than most people realise. Not because of locomotion, not because of compute, but because of something as mundane as how an arm reaches for an object.
Two papers just dropped on arXiv that tackle this problem directly, and honestly, I think they deserve more attention than they're going to get. One builds a detailed physics-based model of exactly how much electrical power the Unitree G1's arm consumes during motion. The other uses that model to train a reinforcement learning policy that actually moves the arm more efficiently. Together, they form something close to a complete pipeline, from measurement to action.
I initially thought this was a fairly niche problem, the kind of thing that matters to engineers but not to anyone watching humanoids from the outside. Then I started reading and changed my mind pretty quickly.
Here's the issue. Battery-powered humanoid robots doing real-world tasks (agricultural picking, warehouse work, anything in the field) are fundamentally constrained by how many movements they can make per charge. If your robot can only harvest apples for 40 minutes before it needs to go back to a charging station, that's not a product. That's a prototype.
The first paper, from researchers working with the Unitree G1, focuses entirely on building an accurate model of electrical power consumption for the robot's seven-degree-of-freedom left arm. They ran 897 trajectories covering single-joint and coordinated arm motions at multiple speed levels, and used onboard power measurements as the regression target. The resulting model achieves an R² of 0.933 with an RMSE of 1.07 watts. When validated on 46 trajectories at previously unseen speeds, that R² actually improves to 0.965.
That generalisation result is the part I keep coming back to. The model wasn't just fitting to training data. It held up on motion it had never seen before. That matters a lot if you want to use this in the real world, where robots don't move in neat pre-planned patterns.
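For readers who don't work with regression metrics every day, here is a minimal sketch of how R² and RMSE are computed from measured and predicted power. The numbers are made up for illustration and are not from the paper; the function names are mine.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual variance / total variance."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the target (watts here)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative values only (watts), NOT data from the paper.
measured = np.array([10.0, 12.0, 14.0, 16.0])
predicted = np.array([11.0, 12.0, 13.0, 16.0])

print(r2_score(measured, predicted), rmse(measured, predicted))
```

The held-out check in the paper amounts to evaluating these same two numbers on trajectories, here at unseen speeds, that played no part in fitting the model.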
What the model actually captures
One thing the paper does that I found genuinely useful is break down which physical loss mechanisms dominate at each joint. Viscous friction dominates the shoulder pitch and all three wrist joints. Copper losses dominate the shoulder yaw and elbow. And shoulder roll, interestingly, is uniquely dominated by Coulomb friction.
You might be wondering why that breakdown matters. It matters because it tells you where the inefficiencies actually live, and therefore where you'd focus if you wanted to reduce them. A one-size-fits-all energy model would miss this entirely. The fact that different joints have fundamentally different loss characteristics means any serious energy optimisation has to work at the joint level, not just globally.
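To make the idea of a "dominant loss mechanism" concrete, here is a rough sketch using the textbook forms of the three losses: copper loss scaling with torque squared (assuming motor current is proportional to torque), viscous friction with velocity squared, and Coulomb friction with the absolute velocity. These functional forms and every coefficient below are my assumptions for illustration; the paper's exact parameterisation may differ.

```python
from dataclasses import dataclass

@dataclass
class JointLossCoeffs:
    """Hypothetical per-joint coefficients, not values identified in the paper."""
    copper: float   # W per (N*m)^2, assumes current proportional to torque
    viscous: float  # W per (rad/s)^2
    coulomb: float  # W per (rad/s)

def loss_breakdown(c, torque, velocity):
    """Split one joint's dissipated power into the three mechanisms."""
    return {
        "copper": c.copper * torque ** 2,
        "viscous": c.viscous * velocity ** 2,
        "coulomb": c.coulomb * abs(velocity),
    }

def dominant_loss(c, torque, velocity):
    """Name of the mechanism contributing the most power at this operating point."""
    parts = loss_breakdown(c, torque, velocity)
    return max(parts, key=parts.get)

# Made-up joint: which term wins depends on the coefficients and the motion.
joint = JointLossCoeffs(copper=0.5, viscous=2.0, coulomb=0.3)
print(dominant_loss(joint, torque=1.0, velocity=1.0))  # fast, lightly loaded
print(dominant_loss(joint, torque=3.0, velocity=0.5))  # slow, heavily loaded
```

Even in this toy version the same joint can flip from friction-dominated to copper-dominated depending on load and speed, which is part of why a per-joint breakdown is more informative than a single global number.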
The paper also introduces pairwise interaction terms to model power coupling during simultaneous multi-joint motion. This is the part that, honestly, I'm not sure I fully grasp at the maths level, but the intuition seems to be that when two joints move at the same time, their power consumption isn't just additive. There's coupling. Ignoring that leads to prediction errors. Including it improves accuracy.
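Here is how I picture the pairwise idea, as a sketch rather than the paper's actual formulation. I assume the interaction features are products of absolute joint velocities and fit everything with ordinary least squares on synthetic data; the feature choice, the weights, and the three-joint setup are all hypothetical.

```python
import itertools
import numpy as np

def build_features(velocities):
    """Per-joint squared speeds plus one product term for every joint pair."""
    v = np.abs(np.asarray(velocities, dtype=float))  # shape (n_samples, n_joints)
    single = v ** 2
    pairs = [v[:, i] * v[:, j]
             for i, j in itertools.combinations(range(v.shape[1]), 2)]
    return np.column_stack([single] + pairs)

# Synthetic 3-joint data with known (made-up) weights, including coupling.
rng = np.random.default_rng(0)
vel = rng.uniform(-1.0, 1.0, size=(200, 3))
X = build_features(vel)                      # 3 single + 3 pairwise columns
true_w = np.array([2.0, 1.5, 1.0, 0.4, 0.0, -0.3])
power = X @ true_w

# Fit with and without the pairwise columns.
full_w, *_ = np.linalg.lstsq(X, power, rcond=None)
add_w, *_ = np.linalg.lstsq(X[:, :3], power, rcond=None)

full_rmse = float(np.sqrt(np.mean((X @ full_w - power) ** 2)))
additive_rmse = float(np.sqrt(np.mean((X[:, :3] @ add_w - power) ** 2)))
print(full_rmse, additive_rmse)
```

On this toy data the additive-only fit leaves a residual that the pairwise columns remove. For a seven-joint arm the same construction gives 21 pair terms on top of the 7 single-joint ones.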
The second paper takes that power model and plugs it into a reinforcement learning framework. The goal is to train a policy that reaches target positions with the arm while also minimising energy use.
They use a Soft Actor-Critic algorithm, trained in a Pinocchio-based rigid-body dynamics simulator, with what they call a Hybrid Constellation Reward. The reward combines end-effector position accuracy (measured as a four-point constellation distance) with a torque-norm energy proxy. After 5 million training steps, the policy reaches a 69.9% success rate over 1,000 random targets in kinematic simulation, at a mean energy of 98.16 joules per successful episode.
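A reward of this shape could look roughly like the sketch below. I'm assuming the four-point constellation is a set of points rigidly attached to the end-effector frame, compared between the current and target poses, and that the energy proxy is the Euclidean norm of the joint torques. The point layout, the weight, and the function names are hypothetical and not taken from the paper.

```python
import numpy as np

# Hypothetical constellation: frame origin plus one point along each axis (metres).
CONSTELLATION = np.array([
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1],
])

def constellation_distance(R_cur, p_cur, R_tgt, p_tgt):
    """Mean distance between the four points under the current and target poses."""
    cur = CONSTELLATION @ R_cur.T + p_cur
    tgt = CONSTELLATION @ R_tgt.T + p_tgt
    return float(np.mean(np.linalg.norm(cur - tgt, axis=1)))

def hybrid_reward(R_cur, p_cur, R_tgt, p_tgt, torques, energy_weight=0.01):
    """Negative tracking error minus a weighted torque-norm energy proxy."""
    tracking = constellation_distance(R_cur, p_cur, R_tgt, p_tgt)
    energy = float(np.linalg.norm(torques))
    return -tracking - energy_weight * energy

# Example: end effector 3 cm off in x, same orientation, some joint torque applied.
I3 = np.eye(3)
target = np.array([0.3, 0.0, 0.2])
current = target + np.array([0.03, 0.0, 0.0])
print(hybrid_reward(I3, current, I3, target, torques=np.array([3.0, 4.0])))
```

One appeal of using several offset points, at least in this reading, is that a single distance penalises both position and orientation error: rotating the hand in place moves the off-origin points even when the origin stays put.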
Then they test it on the actual physical robot.
On the physical Unitree G1, across three independent 10-target batches, the policy achieves a mean energy of 71.5 ± 48.3 joules, an end-effector position error of 2.64 ± 1.04 cm, and an orientation error of 6.92 ± 1.33 degrees. Both error figures fall within the 4 cm and 8.6 degree training tolerances.
The energy number is interesting. The physical robot actually used less energy on average than the simulation predicted. The researchers suggest this might be because the real robot's joint friction and motor characteristics differ from the simulation, and the policy ended up being conservative in ways that happened to be efficient. It's a reasonable interpretation, but to be honest this is based on a fairly small validation set (three batches of ten targets each), so I'd want to see this replicated at scale before drawing strong conclusions.
The variance is also worth flagging. That ± 48.3 joules on a mean of 71.5 joules is a wide spread. Some reaching motions used much more energy than others. The paper describes this as a first step, which is fair, but it also means the practical utility in a real deployment is still unclear.
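A quick back-of-the-envelope check using only the figures quoted above puts the gap and the spread side by side. This assumes the simulated and measured energies are comparable quantities, which the summary numbers alone can't confirm.

```python
# Figures as quoted in this article.
sim_mean_j = 98.16   # mean energy per successful episode in simulation
real_mean_j = 71.5   # mean energy on the physical robot
real_std_j = 48.3    # spread reported for the physical robot

# How much lower the hardware mean is, relative to simulation.
relative_gap = (sim_mean_j - real_mean_j) / sim_mean_j

# Spread relative to the mean (coefficient of variation).
coeff_of_variation = real_std_j / real_mean_j

# Sim-to-real gap expressed in units of the hardware spread.
gap_in_stds = (sim_mean_j - real_mean_j) / real_std_j

print(f"gap: {relative_gap:.1%}, CV: {coeff_of_variation:.1%}, "
      f"gap in stds: {gap_in_stds:.2f}")
```

The hardware mean comes out roughly 27% below the simulated one, but the spread is about 68% of the mean, and the whole gap is only around 0.55 of one standard deviation. That's why I'd treat the "real robot used less energy" result as suggestive rather than settled.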
I've been covering humanoids long enough to notice a pattern. Most of the attention goes to locomotion, dexterity, and increasingly, whole-body control. Energy efficiency in manipulation gets treated as a secondary concern, something to optimise later once the robot can actually do the task.
These papers push back on that framing, and I think they're right to.
If you're building a robot for in-field agricultural work, or any task where you can't guarantee frequent charging, energy efficiency isn't a nice-to-have. It's a design constraint from day one. The number of reaching motions per battery charge directly determines whether the economics work.
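The economics can be sketched in a few lines. The 50 Wh arm budget below is a hypothetical number I picked for illustration, not a G1 specification, and the calculation ignores standby draw, locomotion, and compute, so treat it as an optimistic upper bound.

```python
JOULES_PER_WATT_HOUR = 3600.0

def reaches_per_charge(arm_budget_wh, energy_per_reach_j):
    """Upper bound on reaching motions from a given energy budget."""
    return int(arm_budget_wh * JOULES_PER_WATT_HOUR / energy_per_reach_j)

# Hypothetical 50 Wh allotted to the arm; 71.5 J is the hardware mean quoted above.
baseline = reaches_per_charge(50.0, 71.5)

# Same budget if a better policy shaved 10% off the energy per reach.
improved = reaches_per_charge(50.0, 71.5 * 0.9)

print(baseline, improved)
```

Under those assumptions a 10% cut in energy per reach buys a few hundred extra motions per charge, which is the kind of margin that can decide whether a shift finishes before the robot heads back to the charger.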
The approach here, identifying a physics-based model from real hardware data and then incorporating it into RL training, seems like the right direction. It's grounded in measurement rather than assumption. The model parameters were identified from the actual Unitree G1, not derived theoretically, which means they capture the real friction and loss characteristics of that specific hardware.
What remains unclear is how well this transfers to other humanoid platforms, or even to the right arm of the same robot. The papers focus exclusively on the left arm of the G1. Whether the same methodology would yield similarly accurate models for different robot morphologies, or for full-body motion planning, is an open question. The researchers themselves describe this as a first step, and I think that's the right framing.
I should also note that the RL policy currently handles reaching but not grasping or manipulation. Getting an arm to a target position efficiently is one problem. What happens once it gets there (the contact forces, the object interaction, the reactive adjustments) is a different and harder problem, and energy-aware planning for that is still very much unsolved.
Still. The combination of a validated power model with an RL policy that demonstrably reduces energy use on physical hardware is meaningful progress. The field needs more of this kind of careful, measurement-grounded work. It's less flashy than a new whole-body controller video, but it's the kind of thing that actually determines whether humanoids become useful tools or stay expensive demos.
This raises several questions about how the broader humanoid ecosystem will prioritise this kind of research: who funds it, whether hardware manufacturers will share the data needed to build accurate models, and whether energy efficiency will become a standard benchmark metric the way task success rate already is.
For now, two papers and one robot arm. But it's a start.