OpenAI recently published what it calls "the largest study of ChatGPT use" to date, presenting data on how people interact with the tool across personal and professional contexts. The company argues this research demonstrates ChatGPT's economic value and shows adoption broadening beyond early technology enthusiasts.
At the same time, OpenAI highlighted CyberAgent, a Japanese advertising and media conglomerate, as a case study in enterprise AI deployment. CyberAgent has rolled out ChatGPT Enterprise and Codex across its advertising, media production, and gaming divisions.
Taken together, the two announcements paint a picture of AI adoption maturing from novelty to infrastructure. The evidence behind that narrative, however, deserves closer examination than the press materials suggest.
According to OpenAI's blog post, the study found that ChatGPT usage is "closing gaps" between early adopters and mainstream users, with the tool becoming "a part of everyday life" for many people. The company claims this demonstrates economic value creation through both personal and professional applications.
As of this writing, OpenAI has not published the full methodology, sample sizes, or statistical analyses underlying these claims. The blog post presents conclusions without the supporting data that would allow independent verification. This is not unusual for corporate research announcements, but it does limit what we can actually conclude from the findings.
The CyberAgent case study offers somewhat more concrete detail. The company reportedly uses ChatGPT Enterprise for "quality improvement" and "accelerated decision-making" across its advertising operations. Codex, OpenAI's code generation tool, has been integrated into their development workflows. The framing emphasizes security and scale (CyberAgent apparently needed enterprise-grade data protection before expanding AI use internally).
What remains unclear is the magnitude of these improvements. How much faster are decisions being made? By what metric is quality being measured? The announcement doesn't say, which makes it difficult to assess whether we're looking at transformative change or incremental efficiency gains. I know I'm being picky here, but these distinctions matter when evaluating claims about AI's economic impact.
Let me be direct about a concern: OpenAI studying ChatGPT adoption is a bit like a pharmaceutical company running its own drug trials without external oversight. This doesn't mean the research is wrong, but it does mean we should apply appropriate skepticism.
As for what the research actually shows, we don't fully know. The claim that this is "the largest study of ChatGPT use" is presented without comparison to prior work. Largest by what measure? Number of participants? Duration of observation? Breadth of use cases examined? The blog post doesn't specify.
There's also the question of selection bias. Users who voluntarily participate in ChatGPT usage studies may differ systematically from those who don't. Heavy users are more likely to respond to surveys about their usage. People who had negative experiences may have already churned out of the user base entirely. These are standard methodological concerns that any academic reviewer would flag, and the public materials don't address them.
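To make the selection-bias concern concrete, here is a minimal sketch of how unequal survey response rates can inflate a usage estimate. Every number and name in it (the two segments, their sizes, weekly hours, and response rates) is a hypothetical chosen for illustration; none of it comes from OpenAI's study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One slice of a hypothetical user base (all values are made up)."""
    name: str
    users: int             # people in the segment
    hours_per_week: float  # average weekly usage in the segment
    response_rate: float   # share of the segment that answers a usage survey


def population_mean(segments):
    """True average weekly usage across everyone, respondents or not."""
    total_users = sum(s.users for s in segments)
    return sum(s.users * s.hours_per_week for s in segments) / total_users


def respondent_mean(segments):
    """Average weekly usage among survey respondents only."""
    respondents = sum(s.users * s.response_rate for s in segments)
    weighted_hours = sum(
        s.users * s.response_rate * s.hours_per_week for s in segments
    )
    return weighted_hours / respondents


# Hypothetical user base: a few heavy users who answer surveys often,
# and many light users who mostly ignore them.
SEGMENTS = [
    Segment("heavy", users=200, hours_per_week=10.0, response_rate=0.5),
    Segment("light", users=800, hours_per_week=1.0, response_rate=0.125),
]

if __name__ == "__main__":
    print(f"population mean: {population_mean(SEGMENTS):.1f} h/week")
    print(f"respondent mean: {respondent_mean(SEGMENTS):.1f} h/week")
```

With these invented inputs the true average is 2.8 hours per week, while the survey respondents average 5.5, nearly double. The size of the gap depends entirely on the assumed response rates, which is why a study's sampling and weighting details matter.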
The economic value claims are particularly difficult to evaluate. "Creating economic value" can mean many things, from saving time on routine tasks to enabling entirely new capabilities. The research apparently doesn't distinguish between these categories, or if it does, those distinctions aren't reflected in the summary.
For comparison, independent academic research on large language model productivity effects (such as the MIT study by Noy and Zhang, or the Stanford/MIT call center study by Brynjolfsson et al.) typically includes specific productivity metrics, control groups, and confidence intervals. OpenAI's announcement includes none of these elements in its public presentation.
The CyberAgent deployment is, in some ways, more informative than the broader usage research, precisely because it's more specific. A major Japanese corporation with operations across advertising, media, and gaming decided that ChatGPT Enterprise met their security and capability requirements. That's a data point.
CyberAgent's decision to adopt Codex alongside ChatGPT Enterprise suggests they see value in AI-assisted code generation for their development teams. This is consistent with other enterprise adoption patterns we've seen (GitHub Copilot's growth, Amazon CodeWhisperer's deployment, etc.). The code generation use case has, arguably, the clearest productivity evidence behind it, with several peer-reviewed studies showing measurable time savings for certain programming tasks.
The advertising and media applications are harder to evaluate. "Quality improvement" in advertising could mean anything from better copywriting to more effective targeting to faster iteration cycles. Without specifics, we're left inferring from context.
One genuinely useful signal: CyberAgent apparently required enterprise-grade security before scaling their AI adoption. This suggests that data protection concerns were a real barrier to broader internal deployment, and that ChatGPT Enterprise's security features addressed those concerns sufficiently for a large, publicly-traded company. That's meaningful information about enterprise AI adoption dynamics, even if the productivity claims remain vague.
OpenAI's claim that adoption is "broadening beyond early users" and "closing gaps" is interesting but underspecified. Closing what gaps, exactly? The digital divide between technology workers and others? The gap between large enterprises and small businesses? Between different age demographics?
If we take the claim at face value, it suggests ChatGPT is moving from the "early adopter" phase into "early majority" territory (to use the diffusion of innovations framework). This would be consistent with the general trajectory of consumer technology adoption, though the speed of that transition remains debated.
It's too early to say whether this represents a fundamental shift in how knowledge work gets done or a more modest integration of AI tools into existing workflows. The honest answer is that we don't have enough longitudinal data yet to distinguish between these scenarios. Early productivity gains often look different from sustained productivity changes, and we're still in the early phase.
The research also doesn't appear to address the question of task displacement versus task augmentation. Are people using ChatGPT to do things they previously couldn't do, or to do existing things faster? These have very different implications for economic value creation and labor market effects.
If OpenAI wants its adoption research to be taken seriously by the academic and policy communities, several things would help:
First, publish the full methodology. Sample selection, survey instruments, statistical methods, confidence intervals. This is standard practice for credible research, and the absence of these details invites skepticism.
Second, enable independent replication. Share anonymized datasets or at least provide enough detail that external researchers could design comparable studies. The most credible productivity research on AI tools has come from independent academics with no financial stake in the outcomes.
Third, be specific about limitations. Every study has them. Acknowledging selection bias, measurement challenges, and generalizability constraints would actually increase credibility, not decrease it.
For the enterprise case studies, more granular metrics would be valuable. Not just "quality improved" but "error rates decreased by X%" or "revision cycles reduced from Y to Z." CyberAgent presumably has internal metrics justifying their continued investment; sharing some of those would strengthen the case considerably.
None of this is to say that ChatGPT isn't being widely adopted or that it doesn't provide value to users. Anecdotally, the evidence for broad adoption is overwhelming. But anecdotes aren't data, and corporate research announcements aren't peer-reviewed studies.
The robotics and AI research community has learned, sometimes painfully, to be skeptical of dramatic capability claims that aren't backed by rigorous evaluation. The same skepticism should apply to adoption and economic impact claims.
OpenAI has every incentive to present ChatGPT adoption in the most favorable light possible. That's not nefarious; it's how companies operate. But it means external observers should weight this research accordingly, as one data point among many rather than as definitive evidence of AI's economic transformation.
The CyberAgent case study, while thin on specifics, at least represents a real deployment decision by a real company with real money at stake. That carries more evidential weight than survey research, however large the sample size. Enterprises rarely adopt expensive tools purely for PR reasons; they adopt them because they expect returns.
What we're left with is a picture that's... basically consistent with what we already knew. AI tools are being adopted. Some enterprises find them valuable enough to deploy at scale. Usage is growing. The magnitude and sustainability of the economic effects remain genuinely uncertain, and this research doesn't resolve that uncertainty as much as the framing suggests.
The honest conclusion is that we're still in the early chapters of understanding how large language models will reshape work and productivity. OpenAI's research contributes a data point, but the full picture requires independent verification, longitudinal study, and the kind of methodological rigor that corporate announcements rarely provide. We'll know more in a year. For now, cautious optimism seems appropriate, but so does continued scrutiny.