Image credit: Photo by Conny Schneider on Unsplash
One of the longest-running theoretical disagreements in robotics research goes like this. Should a robot carry an explicit model of the world in its head and plan against it? Or should it learn an end-to-end policy that maps perception directly to action and skip the model entirely?
The debate has been productive precisely because both camps could point to wins. That stalemate appears to be ending.
The DeepMind argument
A new DeepMind blog post is the clearest articulation of the world-model case in years. It argues that explicit world models continue to outperform end-to-end policies on tasks requiring physical reasoning: predicting where an object will land, whether a stack will tip, what a partner agent intends.
The post is careful. It does not argue that world models win everywhere. It argues that the kinds of tasks they win on are exactly the tasks roboticists actually want robots to do.
The OpenAI response
OpenAI's own post, published two weeks later, makes a sharper counter-argument: end-to-end policies trained on sufficient demonstration data approach world-model performance, and they do it without the inference cost of running a separate prediction engine in the loop.
The debate is empirical, not philosophical. — OpenAI blog