Big AI Labs Are Writing Their Own Safety Rulebooks. That Should Worry You.
Google DeepMind and OpenAI released updated 'frontier safety frameworks' this month. I've seen this self-regulation playbook before, and it rarely ends well.
Zero. That's the number of legally binding safety requirements that apply specifically to frontier AI models in the United States right now. And yet Google DeepMind and OpenAI would like you to know they're taking safety very, very seriously.
Both companies released updated versions of their internal safety frameworks in recent weeks. Google DeepMind calls theirs the Frontier Safety Framework; OpenAI has its Preparedness Framework. The documents are dense, full of phrases like "critical capability levels" and "severe risk thresholds." They describe internal processes for evaluating whether an AI model might, say, help someone build a bioweapon or autonomously replicate itself across the internet. Scary stuff! And the companies assure us they have protocols in place.
Call me old-fashioned, but I've seen this movie before.
Back in the early 2000s, I covered the financial sector when derivatives were getting complicated enough that regulators couldn't keep up. The big banks said don't worry, we have internal risk models, we know what we're doing. Then 2008 happened. In the 2010s, I watched social media companies promise they could self-police content moderation, that their community standards and trust and safety teams had it handled. We all know how that turned out. Now I'm watching AI labs publish voluntary frameworks and expect applause.
The pattern is always the same. A new technology emerges that regulators don't understand. The companies building it are the only ones with the technical expertise to evaluate the risks. So they write their own rules, publish them with great fanfare, and point to them whenever anyone asks uncomfortable questions. It works beautifully, right up until it doesn't.
To be fair to DeepMind and OpenAI, their frameworks aren't nothing. They describe real processes, actual evaluations that happen before models get deployed. Google's updated framework talks about "early warning evaluations" and commits to not deploying models that cross certain capability thresholds without appropriate mitigations. OpenAI's document discusses how they assess models for potential misuse across categories like cybersecurity, biological threats, and what they call "model autonomy." These are genuine concerns and it's better that someone is thinking about them than no one.
But here's what bugs me, and I've been chewing on this for a couple of weeks now. These frameworks are entirely voluntary. The companies wrote them, the companies enforce them, and the companies can change them whenever they want. OpenAI's document is explicit about this, noting that the framework will evolve as they learn more. Which sounds reasonable until you remember that "evolving" can mean weakening just as easily as strengthening.
The Frontier Model Forum is another piece of this puzzle. It's an industry body that OpenAI, Google, Microsoft, and Anthropic formed back in 2023 to "promote safe and responsible development" of frontier AI. They talk about advancing safety research, identifying best practices, sharing information with policymakers. All good things in theory. But industry groups have a funny way of becoming lobbying organizations that protect member interests while providing a veneer of responsibility. The chemical industry had Responsible Care. Big Tech had the Global Network Initiative. Neither exactly covered itself in glory.
What's missing from all these frameworks and forums? External accountability. Independent auditing with teeth. Legal consequences for failures. The frameworks describe what the companies will do, but what happens if they don't do it? What happens if a model gets deployed that shouldn't have been, and something goes wrong? The documents are silent on this, as far as I can tell.
Now, some of the young founders and AI researchers I talk to (and yes, I still prefer email for these conversations, my inbox is right there on the about page) argue that self-regulation is actually the best we can do right now. They say regulators don't have the technical chops to evaluate these systems, that moving too fast with bad regulation could be worse than no regulation. There's something to this. I've watched well-intentioned but poorly designed rules create perverse incentives in other industries. Nobody wants that.
But the argument that we should trust companies to regulate themselves because regulation is hard, well, that's a bit convenient coming from the companies that would be regulated, isn't it? It's also not entirely true that government can't develop expertise. The FDA manages to evaluate drugs. The FAA certifies aircraft. It takes time and resources and political will, but it's doable.
The question is whether we have time. These models are advancing quickly, the labs are racing each other, and the gap between what AI can do and what regulators understand is only growing. Google and OpenAI are essentially asking us to trust that their internal processes will catch problems before they become catastrophic. Maybe they will! I genuinely don't know. But the track record of industry self-regulation in other sectors is, let's say, not encouraging.
What would actually give me confidence? A few things. Independent red teams with real authority, not just consultants the companies hire and can ignore. Mandatory disclosure of capability evaluations to some regulatory body, even if the details stay confidential. Legal liability that makes safety failures expensive. And frankly, some mechanism to slow down deployment when there's genuine uncertainty about risks, not just internal guidelines that can be overridden when competitive pressure gets intense enough.
I'm not holding my breath for any of this. The AI companies have enormous lobbying resources, Congress barely understands what a large language model is, and the EU's AI Act (which does have some teeth) won't fully apply to frontier models for years. In the meantime, we get frameworks and forums and promises.
Look, I don't think the people at DeepMind and OpenAI are villains. Most of them probably genuinely care about safety and are doing their best in a weird situation where they're building technology that might be transformative or might be dangerous or might be both. The frameworks they've published reflect real thought about hard problems. But good intentions and internal processes are not the same as accountability. We learned that with banks, we learned it with social media, and I suspect we're going to learn it again with AI.
Maybe I'm wrong. Maybe this time the industry will police itself effectively and we'll look back and say wow, those voluntary frameworks really worked. But what do I know, I've only been covering tech for three decades and I've never once seen that happen.