The world model versus the policy: a debate that is finally getting resolved

A long-running theoretical disagreement inside robotics research is starting to resolve in favour of one side. The implications are bigger than they sound.

By Lena Park

23 May 2026 · 3 min read

Image credit: Photo by Conny Schneider on Unsplash

One of the longest-running theoretical disagreements inside robotics research goes like this. Should a robot have an explicit model of the world inside its head, against which it plans? Or should it learn an end-to-end policy that maps perception directly to action and never bother with the model?

The debate has been productive precisely because both camps could point to wins. That stalemate appears to be ending.

The DeepMind argument

A new DeepMind blog post is the clearest articulation of the world-model case in years. It argues that explicit world models continue to outperform end-to-end policies on tasks requiring physical reasoning: predicting where an object will land, whether a stack will tip, what a partner agent intends.

The post is careful. It does not argue that world models win everywhere. It argues that the kinds of tasks they win on are exactly the tasks roboticists actually want robots to do.

The OpenAI response

OpenAI's own post, published two weeks later, makes a sharper counter-argument: end-to-end policies trained on sufficient demonstration data approach world-model performance, and they do it without the inference cost of running a separate prediction engine in the loop.

The debate is empirical, not philosophical. — OpenAI blog

