OpenAI's Privacy Paradox: Fighting for User Data While Building an Ad Machine
The company is battling the New York Times over 20 million ChatGPT conversations while simultaneously launching an advertising platform that needs user data to function.
Photo credit: Lottie animation by Centre Robotics (LottieFiles Free, used with credit).
20 million. That's the number of private ChatGPT conversations the New York Times is demanding OpenAI hand over in ongoing litigation. It's also, coincidentally, roughly the number of questions I have about how OpenAI plans to reconcile its aggressive new advertising push with its equally aggressive privacy rhetoric.
Let me be clear: I'm not a privacy lawyer. But I've seen enough corporate pivots to recognize when a company is trying to have it both ways.
OpenAI is fighting a court order that would require indefinite retention of consumer ChatGPT and API user data. The company frames this as a principled stand for user privacy, and to be fair, the demand is extraordinary. According to OpenAI's blog, the Times and other plaintiffs want access to conversations that users reasonably expected to remain private.
The company says it's accelerating new security and privacy protections in response. That's the kind of language you'd expect from a company positioning itself as a privacy champion.
While OpenAI's legal team battles over user data retention, another part of the company is building out a full-scale advertising infrastructure. The timeline is striking:
Initial ad testing announced for ChatGPT's free and Go tiers
A beta self-serve Ads Manager now available
CPC (cost-per-click) bidding introduced
Enhanced measurement tools for advertisers
OpenAI insists the ads will have "clear labeling, answer independence, strong privacy protections, and user control." The company says conversations remain separate from ad targeting.
That's an ambitious claim. The real test is whether it holds up at scale.
This is where I get skeptical. OpenAI hasn't disclosed how many users will see ads, what data signals inform ad placement, or how "separate" the conversation data truly remains from the advertising stack.
From my time building hardware, I learned that the spec sheet tells you what a system can do, but the implementation tells you what it actually does. OpenAI's privacy promises are spec sheet material. We don't have visibility into the implementation.
Consider the basic mechanics of a self-serve ad platform with CPC bidding and "enhanced measurement tools." Advertisers need to know:
Who clicked their ad
What context led to the click
Whether the user converted
How do you provide that measurement without some form of user tracking? OpenAI hasn't explained this in detail. Maybe they've built something genuinely novel. Maybe they're using aggregated, anonymized signals. Or maybe "privacy protections" means something different to an ad sales team than it does to a legal team fighting data retention orders.
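To make the "aggregated, anonymized signals" option concrete, here is a minimal Python sketch of threshold-based aggregate reporting, a common privacy technique in which an advertiser sees only per-bucket totals and any bucket with too few distinct users is withheld. It is an illustration of the general idea, not a description of OpenAI's system: the event fields, the `MIN_USERS` threshold, and the `aggregate_report` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical threshold: buckets with fewer distinct users are withheld.
MIN_USERS = 3

def aggregate_report(events, min_users=MIN_USERS):
    """Return per-(campaign, topic) click and conversion totals.

    `events` is an iterable of dicts with hypothetical keys:
    user_id, campaign, topic, converted. User IDs are used only to
    count distinct users and never appear in the output.
    """
    users = defaultdict(set)
    clicks = defaultdict(int)
    conversions = defaultdict(int)
    for e in events:
        key = (e["campaign"], e["topic"])
        users[key].add(e["user_id"])
        clicks[key] += 1
        conversions[key] += int(e["converted"])

    report = {}
    for key, seen in users.items():
        if len(seen) < min_users:
            continue  # too few users: suppress the bucket entirely
        report[key] = {"clicks": clicks[key], "conversions": conversions[key]}
    return report

# Made-up sample events for illustration only.
events = [
    {"user_id": "u1", "campaign": "shoes", "topic": "running", "converted": True},
    {"user_id": "u2", "campaign": "shoes", "topic": "running", "converted": False},
    {"user_id": "u3", "campaign": "shoes", "topic": "running", "converted": False},
    {"user_id": "u4", "campaign": "shoes", "topic": "knee pain", "converted": True},
]
print(aggregate_report(events))
```

With this sample data, the three-user "running" bucket is reported and the single-user "knee pain" bucket is dropped. Even so, the sketch answers only part of the question: the raw per-user events still have to exist somewhere before they are aggregated, and that is exactly the implementation detail that isn't public.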
In a separate announcement, OpenAI made ChatGPT for Clinicians free for verified U.S. physicians, nurse practitioners, and pharmacists. The tool is designed to support clinical care, documentation, and research.
This is good. Healthcare workers need better tools, and free access removes a barrier.
But it also raises the stakes on privacy. If clinicians are using ChatGPT for patient-related queries (even in anonymized form), and the same platform is now running ads with "enhanced measurement," the separation between those systems becomes critical.
OpenAI hasn't specified whether the clinician tier is completely ad-free or just prioritized differently. Some argue that verified professional accounts would obviously be exempt. Others counter that the company's public statements don't make this explicit.
I'm not naive about why OpenAI is doing this. Running inference at scale is expensive. GPT-4-class models cost real money per query. The company reportedly needs to find paths to profitability beyond subscription revenue, and advertising is the proven model for monetizing free users.
The company's stated goal is to "expand affordable access to AI worldwide." That's a reasonable justification. Ads subsidize free access. This is how Google, Meta, and basically every consumer internet company operates.
But those companies also faced years of scrutiny over how their advertising businesses actually handled user data. OpenAI is essentially asking us to trust that they've learned from those mistakes and built something better. We don't have independent verification of that yet.
Several questions don't have good answers based on available information:
Data separation architecture: How exactly are conversation contents kept separate from ad targeting? Is this a technical separation (different databases, different access controls) or a policy separation (same data, rules about who can query it)?
Retention policies: If OpenAI is fighting indefinite data retention for legal discovery, what are its actual retention policies for advertising purposes? The company hasn't disclosed specific retention periods.
Third-party access: Do advertising partners get any user-level data, even in hashed or aggregated form?
Opt-out mechanics: Users can apparently control their ad experience, but what does that mean practically? Can you use the free tier of ChatGPT without any ad-related data collection?
These aren't gotcha questions. They're basic due diligence that any company running an ad platform should be able to answer clearly.
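The "hashed" case in the third-party access question is worth a concrete illustration. Below is a minimal Python sketch, assuming a hypothetical scheme in which a platform shares an unsalted SHA-256 hash of a normalized email address. Nothing here reflects what OpenAI actually does; `hash_email` and the sample addresses are made up for the example.

```python
import hashlib

def hash_email(email: str) -> str:
    """Hypothetical identifier: SHA-256 of the trimmed, lowercased email."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# What a platform might share with a partner: a hash, not the address.
shared_id = hash_email("Alice@Example.com ")

# What the partner already holds: its own customer list.
partner_customers = ["bob@example.com", "alice@example.com"]
lookup = {hash_email(addr): addr for addr in partner_customers}

# The same input always yields the same hash, so the partner can match it.
print(lookup.get(shared_id))  # alice@example.com
```

A hash of this kind is a stable pseudonym rather than anonymized data, which is why it matters what partners receive even when no raw identifiers change hands.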
I keep coming back to the timing. OpenAI is simultaneously:
Positioning itself as a privacy defender against overreaching legal demands
Building advertising infrastructure that inherently requires some understanding of who its users are
Expanding into sensitive verticals like healthcare
Each of these makes sense individually. Together, they create a tension that OpenAI hasn't fully addressed.
Maybe they will. The company has shown it can move quickly on product development. Perhaps equally detailed explanations of their privacy architecture are coming. But based on what's public today, we're being asked to trust the spec sheet without seeing the implementation.
From my time in hardware, I learned to be skeptical of that ask. The companies that had nothing to hide were usually eager to show you exactly how things worked. The ones that deflected to marketing language often had reasons.
I'm not saying OpenAI is hiding something. I'm saying the burden of proof is on them, and they haven't met it yet.
The NYT litigation will proceed. OpenAI's advertising platform will scale. At some point, these two tracks will intersect in ways that force more disclosure.
Until then, users should understand that the company fighting for their privacy in court is the same company building systems to monetize their attention. Both things can be true. But the details of how they coexist matter enormously, and those details remain unclear.