Why reinforcement learning is making a quiet comeback in robotics

For five years, imitation learning has dominated practical robotics research. New results suggest reinforcement learning is back, with better tooling.

By Isaac Mendez

18 May 2026 · 3 min read

Photo credit: Andrea De Santis on Unsplash

For five years, imitation learning has been the practical method of choice for getting robots to do things in the real world. Reinforcement learning was treated, fairly or not, as the technique with great theoretical promise and disappointing engineering practice. A new round of results suggests the balance is shifting.

A DeepMind blog post lays out the case: recent improvements in offline reinforcement learning methods are making RL competitive with imitation learning for practical robotics work. An arXiv paper from a different group reports parallel results.

What changed

The historical complaint about RL in robotics had three parts: the training loop required real interaction with the environment, that interaction was expensive to run safely, and the resulting models were brittle in deployment. Imitation learning sidestepped these issues by training entirely on demonstration data, with no need for trial and error in the real world.

Offline RL changes the calculation. The training loop now runs on previously collected data, including the same demonstration data that imitation learning uses, but with an RL objective. The model learns not just to imitate but to optimise for outcomes it can infer from the data.

The result is a model that, in early reports, matches or exceeds imitation-learning performance on standard benchmarks, with the added flexibility of being optimisable for objectives that the original demonstrators did not pursue.
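
To make the contrast between imitating and optimising concrete, here is a minimal sketch on a toy problem. It is not the method from the DeepMind post or the arXiv paper. The chain-shaped task, the logged dataset, and the function names are all hypothetical, and plain tabular Q-learning stands in for the more sophisticated offline RL methods used in practice. Both learners see exactly the same fixed log and never interact with an environment.

```python
from collections import Counter, defaultdict

# Hypothetical logged transitions: (state, action, reward, next_state, done).
# States 0..2 lie on a chain; reaching state 3 ends the episode with reward 1.
LEFT, RIGHT = 0, 1
dataset = (
    [(0, RIGHT, 0.0, 1, False)] * 3
    + [(0, LEFT, 0.0, 0, False)] * 1
    + [(1, LEFT, 0.0, 0, False)] * 3   # the demonstrator usually backs off here
    + [(1, RIGHT, 0.0, 2, False)] * 1
    + [(2, RIGHT, 1.0, 3, True)] * 2
)


def behavior_cloning(data):
    """Imitation baseline: copy the demonstrator's most frequent action per state."""
    counts = defaultdict(Counter)
    for s, a, _r, _s2, _done in data:
        counts[s][a] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def offline_q_learning(data, gamma=0.9, alpha=0.5, sweeps=200):
    """Tabular Q-learning run only on the fixed log, with no new interaction."""
    q = defaultdict(lambda: [0.0, 0.0])  # actions never logged keep value 0
    for _ in range(sweeps):
        for s, a, r, s2, done in data:
            # Bootstrap from the best known value of the next state.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
    states = {s for s, *_ in data}
    # Act greedily with respect to the learned values.
    return {s: max((LEFT, RIGHT), key=lambda a: q[s][a]) for s in states}


if __name__ == "__main__":
    print("cloned policy:    ", behavior_cloning(dataset))
    print("offline RL policy:", offline_q_learning(dataset))
```

On this toy log, cloning reproduces the demonstrator's habit of moving left in state 1, because that is what the data shows most often. The Q-learning policy moves right in every state, because the logged rewards show that moving right leads to the goal. Real robotics data is continuous and far larger, and practical offline RL methods add safeguards against overvaluing actions the dataset never covers, which this sketch does not attempt.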

