OpenAI is facing pointed questions about its safety protocols after a Financial Times report published today revealed that the company, now valued at $300bn, has dramatically shortened evaluation periods for its newest AI models.
Citing eight sources familiar with the company’s operations, the report states that testing timelines, which previously spanned months, have been compressed to mere days. This acceleration comes as OpenAI prepares for an imminent launch, possibly next week, of new models including the reasoning-focused o3, leaving some third-party and internal testers less than a week for crucial safety assessments.
The hurried schedule is reportedly driven by intense competitive pressures within the AI field, as OpenAI races against giants like Google and Meta, alongside startups like Elon Musk’s xAI. However, the speed has raised alarms among those tasked with evaluating the models. “We had more thorough safety testing when [the technology] was less important,” one individual currently assessing the upcoming o3 model told the Financial Times.
They warned that more capable AI brings greater "potential weaponisation" and characterized the current approach as reckless: "But because there is more demand for it, they want it out faster. I hope it is not a catastrophic mis-step, but it is reckless. This is a recipe for disaster."
Another tester, who worked on the six-month evaluation of GPT-4 in 2023, recalled that dangerous flaws only emerged well into that longer process and said of the current situation: "They are just not prioritising public safety at all." Daniel Kokotajlo, a former OpenAI researcher, highlighted the environment enabling this rush: "There's no regulation saying [companies] have to keep the public informed about all the scary capabilities . . . and also they're under lots of pressure to race each other so they're not going to stop making them more capable."
This safety debate coincides with a significant shift in OpenAI’s product strategy. CEO Sam Altman confirmed a “Change of plans” on April 4, stating the company would release the o3 and o4-mini reasoning models “probably in a couple of weeks,” pushing the highly anticipated GPT-5 launch back by “a few months.”
This reversed an earlier plan from February to consolidate capabilities into GPT-5. Altman explained the decision was partly to “decouple reasoning models and chat/completion models,” adding via X that “we are excited about the performance we’re seeing from o3 internally” and that the delay would allow GPT-5 to be “much better than we originally though[t].”
Further evidence of the imminent launch emerged April 10, when engineer Tibor Blaho spotted code references to `o3`, `o4-mini`, and `o4-mini-high` in a ChatGPT web update. Concurrently, reports suggest an updated multimodal model, tentatively named GPT-4.1, is also nearing release.
Lingering Questions About Testing Practices
Beyond the compressed schedule, specific concerns about the depth of OpenAI's testing have surfaced. Critics question the company's commitment to assessing misuse potential, such as aiding bioweapon creation, through fine-tuning: training a model on specialized datasets (virology literature, for example) to see whether it develops dangerous capabilities.
Yet according to former OpenAI safety researcher Steven Adler and others cited by the FT, this detailed testing has been limited, conducted primarily on older models like GPT-4o, with no published results for newer, more capable models such as o1 or o3-mini. Adler, whose views were detailed in a blog post, argues that the absence of reporting on newer models' fine-tuned capabilities leaves the public with little insight into potential misuse.
He told the Financial Times, “Not doing such tests could mean OpenAI and the other AI companies are underestimating the worst risks of their models.” Another critique involves testing earlier model versions, or “checkpoints,” rather than the final code released to the public. “It is bad practice to release a model which is different from the one you evaluated,” a former OpenAI technical staff member told the FT.
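To make the fine-tuning critique concrete: the kind of evaluation Adler describes measures "uplift", how much more capable a model becomes on a risky domain after being fine-tuned on domain-specific data. The sketch below is a minimal illustration only, using a small open model, a toy corpus, and a loss-based probe as stand-ins; none of it reflects OpenAI's actual protocol.

```python
# Minimal sketch of a fine-tuning "uplift" evaluation: fine-tune a base model on a
# narrow domain corpus, then compare a capability score before and after.
# The open model, toy corpus, and loss-based probe are illustrative assumptions.
import torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

MODEL = "gpt2"  # small open model standing in for a frontier model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL)

# Toy "specialized" corpus; a real evaluation would use a curated domain dataset.
corpus = Dataset.from_dict({"text": [
    "Domain text sample one.",
    "Domain text sample two.",
]})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"],
)

def probe_score(m, probes):
    """Mean language-model loss on held-out domain probes (lower = more capable).
    A crude proxy for a real dangerous-capability benchmark."""
    m.eval()
    total = 0.0
    with torch.no_grad():
        for text in probes:
            enc = tokenizer(text, return_tensors="pt")
            total += m(**enc, labels=enc["input_ids"]).loss.item()
    return total / len(probes)

probes = ["Held-out domain probe question."]
baseline = probe_score(model, probes)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft_out", num_train_epochs=1,
                           per_device_train_batch_size=2, report_to="none"),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()

print(f"uplift (loss drop) after fine-tuning: {baseline - probe_score(model, probes):.3f}")
```

The critics' point is that this kind of before-and-after comparison has, reportedly, only been published for older models.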
OpenAI defends its practices, citing efficiencies gained through automation and expressing confidence in its methods. The company stated that checkpoints were “basically identical” to final releases and that models are thoroughly tested, especially for catastrophic risks. Johannes Heidecke, OpenAI’s head of safety systems, asserted, “We have a good balance of how fast we move and how thorough we are.”
The company also recently launched its OpenAI Pioneers Program on April 9, focusing on collaborating with startups on "domain-specific" evaluations and model optimization using Reinforcement Fine-Tuning (RFT), a technique for creating specialized "expert models" for narrow tasks. This initiative, however, appears distinct from the foundational, pre-release safety evaluations reportedly being shortened.
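The "grader" at the centre of reinforcement fine-tuning is easier to picture in code. The toy example below is a conceptual sketch only and does not use OpenAI's RFT API; the `Example` class, `grade` function, and echo "model" are hypothetical stand-ins.

```python
# Conceptual sketch of the grading step behind reinforcement fine-tuning: sampled
# answers are scored against expert references, and the scalar rewards are what the
# reinforcement update optimizes. Toy illustration only, not OpenAI's RFT API.
from dataclasses import dataclass

@dataclass
class Example:
    prompt: str
    reference: str  # expert-labelled answer for the narrow domain task

def grade(answer: str, reference: str) -> float:
    """Toy exact-match grader; real graders award partial credit, check structure, etc."""
    return 1.0 if answer.strip().lower() == reference.strip().lower() else 0.0

def rft_grading_pass(model_sample, examples):
    """Sample an answer per prompt and grade it; the resulting rewards would feed
    the reinforcement update (handled elsewhere by the training service)."""
    return [grade(model_sample(ex.prompt), ex.reference) for ex in examples]

# Usage with a stand-in "model" that returns a canned answer:
examples = [Example("Which planet is known as the Red Planet?", "Mars")]
print(rft_grading_pass(lambda prompt: "mars", examples))  # -> [1.0]
```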
A History of Internal Safety Debates
The tension between product velocity and safety protocols at OpenAI is not new. In May 2024, Jan Leike, then co-lead of the company's Superalignment team focused on long-term AI risks, resigned, stating publicly that over recent years "safety culture and processes have taken a backseat to shiny products." His departure, and his subsequent move to Anthropic, signaled deep disagreements over resources and priorities in long-term AI safety research. Notably, OpenAI announced the formation of a board-led Safety and Security Committee around the same time, giving it 90 days to evaluate and develop safety processes and make recommendations.
Industry Rivals Emphasize Transparency and Governance
OpenAI’s reported acceleration contrasts with recent public stances from key competitors. On March 28, Anthropic detailed its interpretability framework, an “AI microscope” using dictionary learning to dissect its Claude model’s reasoning and identify risks. Dictionary learning attempts to reverse-engineer the model’s internal calculations, mapping them to understandable concepts. Anthropic framed this as essential for trust. Similarly, Google DeepMind proposed a global AGI safety framework on April 3, advocating for international oversight and treating advanced AI risks as immediate. This proposal followed the formation of DeepMind’s own AI Safety and Alignment organization earlier in 2024.
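Dictionary learning is concrete enough to sketch: train a sparse autoencoder on a model's internal activations so that each learned dictionary direction ideally corresponds to an interpretable concept. The example below uses synthetic activations and arbitrary sizes; it illustrates the general technique, not Anthropic's actual setup.

```python
# Minimal dictionary-learning sketch: a sparse autoencoder trained on activation
# vectors. Sizes, data, and hyperparameters are assumed, purely for illustration.
import torch
import torch.nn as nn

D_MODEL, D_DICT = 512, 4096  # activation width and dictionary size (assumed)

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model, bias=False)

    def forward(self, x):
        codes = torch.relu(self.encoder(x))   # sparse feature activations
        return self.decoder(codes), codes     # reconstruction + codes

sae = SparseAutoencoder(D_MODEL, D_DICT)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
l1_coeff = 1e-3  # sparsity pressure (assumed value)

# Random data standing in for activations captured from a real model's layers.
activations = torch.randn(4096, D_MODEL)

for step in range(200):
    recon, codes = sae(activations)
    loss = ((recon - activations) ** 2).mean() + l1_coeff * codes.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Each decoder column is one dictionary feature: a direction in activation space
# that the method hopes maps onto a human-interpretable concept.
feature_directions = sae.decoder.weight.detach().T  # shape (D_DICT, D_MODEL)
print(feature_directions.shape)
```

In published interpretability work, the learned features are then inspected and labelled by looking at the inputs that activate them most strongly.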
Regulatory Landscape and Ongoing Risks
The broader industry landscape has contradictions of its own. Anthropic, while pushing for stronger government AI rules in early March, also quietly removed some of its own prior voluntary safety commitments made under a 2023 White House initiative, illustrating the tension between public positioning and operational pressures. OpenAI itself is party to voluntary commitments with the UK and US governments regarding external safety testing access, as mentioned in the FT report.
Meanwhile, regulatory frameworks are tightening: the EU's AI Act, now in effect, mandates stricter transparency and risk mitigation for high-risk systems, though global standards for pre-release safety testing remain undefined. The need for robust testing is underscored by ongoing vulnerability discoveries, such as the "delayed tool invocation" exploit found in Google Gemini's memory feature in February and persistent jailbreaking techniques affecting multiple leading models. OpenAI's rapid development continues despite Altman acknowledging potential capacity challenges earlier this month, which could affect timelines and service stability.