OpenAI Changes Its Strategy Again, Delaying GPT-5 in Favor of o3 and o4-mini

OpenAI has restructured its roadmap to decouple reasoning from chat models, starting with the release of o3 and o4-mini in the coming weeks.

OpenAI has reversed its February decision to cancel the o3 model, announcing instead that it will launch o3 and o4-mini in the coming weeks—putting GPT-5 on hold until later this year. The update was confirmed by CEO Sam Altman in a post on X, where he said the company will now keep reasoning-focused models separate from its general-purpose language models.

“Change of plans: we are going to release o3 and o4-mini after all, probably in a couple of weeks, and then do GPT-5 in a few months. There are a bunch of reasons for this, but the most exciting one is that we are going to decouple reasoning models and chat/completion models,” Altman wrote. He added, “we are excited about the performance we’re seeing from o3 internally.”

The reversal comes just weeks after OpenAI announced it would consolidate its offerings by integrating o3’s capabilities directly into GPT-5. That strategy was aimed at reducing user confusion and streamlining product complexity. OpenAI had said it wanted to create a single system that could serve all functions without requiring a “model picker.”

o3 and o4-mini: Structured Reasoning, Scaled Compute

o3’s planned rollout is tied to its performance on reasoning benchmarks, which OpenAI previewed in December 2024. The model scored 87.5% on ARC-AGI in low-efficiency settings and 91.5% in high-efficiency mode. On the AIME 2024 mathematics benchmark, it reached 96.7%. It also performed well on GPQA Diamond, a test of PhD-level science reasoning, with an 87.7% score. These numbers place o3 ahead of GPT-4.5 and o3-mini on structured technical tasks such as mathematics and scientific reasoning.

ARC Prize researchers noted that “this represents the first time we have observed a model solving novel tasks through internal step-wise adaptation”, a reference to o3’s use of private chain-of-thought reasoning—a mechanism by which the model performs internal logic before delivering an output. This allows it to tackle complex multi-step problems more effectively than earlier models.

Another key feature is o3’s ability to scale compute with task complexity. Developers can increase the model’s reasoning depth, but the tradeoff is steep compute usage: according to the ARC Prize benchmark blog, the high-compute configuration used roughly 172 times the compute of the low-compute setting, which raises questions about feasibility at scale.
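
OpenAI already exposes a similar knob for its shipped o-series models through a `reasoning_effort` parameter in the Chat Completions API. The sketch below assumes o3 will inherit the same control when it launches; the `"o3"` model identifier and the prompt are placeholders rather than confirmed details.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical sketch: "o3" is an assumed identifier (the model is not yet
# released); reasoning_effort ("low", "medium", "high") is the control OpenAI
# already offers for o-series models such as o3-mini.
completion = client.chat.completions.create(
    model="o3",
    reasoning_effort="high",  # trade more compute for deeper internal reasoning
    messages=[
        {"role": "user", "content": "Plan a proof that the sum of two odd integers is even."}
    ],
)

print(completion.choices[0].message.content)
```

Dialing the effort down to "low" is the corresponding lever for keeping latency and cost in check on simpler tasks.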

o4-mini, although unreleased, is expected to be a smaller sibling to o3. Based on OpenAI’s naming conventions, it likely offers reduced compute requirements with many of the same reasoning benefits. The company has not shared public benchmarks or specifications yet.

Enterprise Models and Monetization Tiers

OpenAI’s reasoning offerings are increasingly segmented. On March 20, the company launched o1-Pro via API access, targeting enterprise use cases such as legal tech and agent pipelines. The model supports up to 100,000 output tokens and requires $5 in prior API spend to unlock access. Pricing starts at $150 per million input tokens and $600 per million output tokens.

Described by OpenAI as a model that uses more compute to think harder and provide consistently better answers, o1-Pro was introduced via the company’s new Responses API, which is intended for structured applications with high prompt complexity.
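
For developers, a call against the Responses API in the official Python SDK looks roughly like the minimal sketch below; the `"o1-pro"` model identifier, the prompt, and the token cap are assumptions for illustration rather than details confirmed in OpenAI’s announcement.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Minimal Responses API call; "o1-pro" is the assumed model identifier and the
# prompt is a placeholder.
response = client.responses.create(
    model="o1-pro",
    input="Review the following contract clause for indemnification risks: ...",
    max_output_tokens=4096,  # stay well under the 100,000-token output ceiling
)

print(response.output_text)  # SDK helper that joins the text portions of the output
```

Given the per-token pricing above, capping `max_output_tokens` is the most direct way to keep costs predictable.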

At the other end of the spectrum, Microsoft has integrated OpenAI’s o3-mini-High model into Copilot at no cost, as reported on March 7. This shows a divergence in business models: Microsoft bundles OpenAI’s reasoning tools into broader software, while OpenAI monetizes reasoning as a premium feature.

Waiting for GPT-5, Learning from GPT-4.5

So, GPT-5 is still in the pipeline. Altman says the model will arrive “in a few months,” though no specific date has been given. When it launches, GPT-5 is expected to integrate reasoning and generation in one system. That goal had originally been cited in the now-abandoned February plan to fold o3 into GPT-5.

In the interim, GPT-4.5 remains OpenAI’s most capable general-purpose model. Released in late February, GPT-4.5 expanded the context window to 200,000 tokens and introduced improvements in multilingual understanding. However, it still trails o3-mini on math and science reasoning benchmarks.

Altman described GPT-4.5 as “the first model that feels like talking to a thoughtful person”, but conceded that the model was not designed as a reasoning powerhouse. OpenAI admitted that models like o3-mini still surpassed GPT-4.5 in high-rigor domains such as coding and advanced problem-solving.

Meanwhile, Google has released Gemini 2.5 Pro, which is now topping several benchmarks for AI reasoning models.

Funding, Infrastructure, and Strategic Pressures

OpenAI’s latest roadmap shift is unfolding alongside substantial financial and infrastructure moves. On April 1, the company announced a $40 billion tender deal led by SoftBank, pushing its valuation to $300 billion. The structure of the deal—primarily secondary share sales—allowed early employees and investors to cash out while increasing pressure on OpenAI to ship high-value products.

To support its long-term roadmap, OpenAI is investing in its own compute stack. In March, it signed an $11.9 billion compute agreement with CoreWeave and took a $350 million equity stake in the company. It also continues development of custom AI chips in partnership with Broadcom and TSMC, with early designs expected later this year.

OpenAI is also part of the U.S.-backed Stargate Project, a multi-phase infrastructure initiative focused on building data centers and AI research capacity in the United States. Together, these moves point to OpenAI’s goal of reducing its reliance on Microsoft Azure and building out its own compute capacity.

Open-Weight Model and Transparency Push

Just before the o3 reversal, Altman also announced that OpenAI is preparing the release of its first open-weight language model since GPT-2. He described the model as “pretty capable” and asked developers and researchers to provide feedback on how to improve its utility. The company clarified that the model will include pre-trained weights but not training data or code.

“[W]e are excited to make this a very, very good model!” Altman wrote. The move comes amid growing developer demand for transparency and in response to competition from open-source models released by Meta, Mistral, and DeepSeek.

OpenAI has also made changes to increase interpretability. In February, the company began revealing internal reasoning traces from o3-mini, helping developers and researchers understand how models arrive at answers. This decision reflects a broader shift toward explainability across the company’s reasoning model family.

A Modular Future for OpenAI’s Model Roadmap

OpenAI’s pivot toward releasing o3 and o4-mini ahead of GPT-5 illustrates its increasingly modular approach to model deployment. Instead of waiting to bundle capabilities into one all-encompassing frontier model, OpenAI is now releasing specialized tools as they become production-ready.

The shift comes with trade-offs. Users now face more model types, but each one is better optimized for specific tasks. Enterprises can adopt high-end reasoning models like o1-Pro, while developers gain access to intermediate systems like o4-mini—or to open-weight versions for more transparent experimentation.

While this may complicate product selection in the short term, it allows OpenAI to push updates faster and respond to user needs without holding back for monolithic releases. Whether GPT-5 will eventually consolidate these offerings—or be just another branch in the company’s expanding model tree—remains to be seen.

Markus Kasanmascheff
Markus has been covering the tech industry for more than 15 years. He holds a Master’s degree in International Economics and is the founder and managing editor of Winbuzzer.com.