DeepSeek R1 AI Model Update Boosts Reasoning, Catching Up With OpenAI o3 and Gemini 2.5 Pro

DeepSeek AI has launched DeepSeek-R1-0528, a significant upgrade to its R1 model, boasting enhanced reasoning, math, and coding capabilities, reduced hallucinations, and performance nearing top global AI systems, further intensifying AI competition.

Chinese AI startup DeepSeek has launched a notable update to its R1 artificial intelligence model, designated DeepSeek-R1-0528, significantly enhancing its capabilities and positioning it closer to leading global systems. Despite the substantial performance gains, the company itself characterizes the release as a “minor update.”

The company says the new version delivers substantially improved performance in reasoning, mathematics, and programming while generating less incorrect information, or “hallucinations.” The release underscores the rapid progress of Chinese AI firms, promises more powerful and reliable AI tools for users, and further intensifies global AI competition.

DeepSeek asserts that the upgraded model’s overall performance is now “approaching that of leading models, such as O3 and Gemini 2.5 Pro.” This improvement, according to DeepSeek AI, is a result of increased computational resources and new algorithmic optimization mechanisms applied during its post-training phase.

For users, this should translate into more accurate and contextually relevant outputs, especially on complex tasks. In its announcement, the company also emphasized that the R1-0528 version offers enhanced support for function calling, enabling better interaction with external tools, and an improved experience for “vibe coding,” suggesting more intuitive code generation.

The latest iteration of DeepSeek R1 continues to use a Mixture-of-Experts (MoE) architecture: it has approximately 670–685 billion total parameters but activates only about 37 billion per token during inference.
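In an MoE design, a lightweight router sends each token to a small subset of expert networks, which is how a model with hundreds of billions of total parameters can activate only a fraction of them per token. Below is a minimal conceptual sketch of top-k expert routing in PyTorch; the layer sizes, expert count, and gating details are illustrative only and do not reflect DeepSeek’s actual implementation.

```python
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    """Illustrative Mixture-of-Experts layer: a router picks the top-k
    experts per token, so only a fraction of total parameters is active."""

    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                             # x: (tokens, d_model)
        scores = self.router(x)                       # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)    # keep only top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                    # route each token to its experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```

With n_experts=8 and k=2, only a quarter of the expert parameters touch any given token; sparse activation of this kind is the same principle that lets R1’s per-token compute stay far below its total parameter count.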

Enhanced Capabilities and Deeper Thinking

DeepSeek quantifies the performance leap with specific benchmark results. Notably, in the AIME 2025 test, a challenging mathematics competition, the model’s accuracy reportedly surged from 70% to 87.5%.

This is attributed to a greater depth of reasoning: the new model averaged 23,000 tokens—units of text processed—per question in this test, nearly double the roughly 12,000 tokens used by the previous version. On the LiveCodeBench leaderboard, maintained by researchers from UC Berkeley, MIT, and Cornell, the new DeepSeek R1-0528 also outperformed xAI’s Grok-3-mini and Alibaba’s Qwen-3.

A key improvement highlighted by DeepSeek is a “reduced hallucination rate,” a critical step forward, as plausible-sounding but false output remains a persistent challenge for AI models. Developers testing the model have also observed that R1-0528 engages in longer thinking sessions, reportedly spending 30 to 60 minutes on single tasks when needed, indicating a shift toward more thorough responses. The maximum generation length is a substantial 64,000 tokens.

Figure: Official DeepSeek benchmark chart comparing DeepSeek-R1-0528 against OpenAI o3, Gemini 2.5 Pro, and Qwen3.

Availability, Open Source, and Community Reception

Users can interact with the enhanced model via DeepSeek’s official chat website by enabling the “DeepThink” option. For developers, an OpenAI-compatible API is available through the DeepSeek Platform, and the model is also accessible through a free API on OpenRouter. The model and its tokenizer can be run locally, with details available in the DeepSeek-R1 GitHub repository.
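Because the API is OpenAI-compatible, existing OpenAI client code can be pointed at DeepSeek’s endpoint with minimal changes. Here is a hedged sketch using the official openai Python SDK; the base URL and the “deepseek-reasoner” model identifier follow DeepSeek’s platform documentation, but verify both against the current docs before relying on them.

```python
# Sketch: calling the R1-series model through DeepSeek's OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # issued by the DeepSeek Platform
    base_url="https://api.deepseek.com",  # OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",            # R1-series reasoning model per DeepSeek docs
    messages=[{"role": "user",
               "content": "Prove that the sum of two even integers is even."}],
    max_tokens=8192,                      # well under the 64K generation cap
)

print(response.choices[0].message.content)
```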

Reinforcing its commitment to the open-source community, DeepSeek has also released DeepSeek-R1-0528-Qwen3-8B. This model, created by distilling the chain-of-thought from the upgraded R1 onto the Qwen3 8B base model from Alibaba, is claimed to achieve state-of-the-art performance among open-source models on the AIME 2024 benchmark.
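DeepSeek has not published the full distillation pipeline, but fine-tuning a small “student” model on chain-of-thought traces generated by a stronger “teacher” is a standard recipe. The sketch below illustrates that general pattern with a single toy trace; the training example and hyperparameters are hypothetical, and loading Qwen3-8B requires a recent transformers release and substantial memory.

```python
# Conceptual sketch of chain-of-thought distillation: supervised fine-tuning
# of a student model on reasoning traces produced by a larger teacher.
# This mirrors the general technique, not DeepSeek's exact pipeline.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)

# Each record pairs a prompt with the teacher's full chain-of-thought answer.
teacher_traces = [
    {"prompt": "Q: What is 17 * 24? Think step by step.",
     "completion": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408."},
]

for record in teacher_traces:
    text = record["prompt"] + "\n" + record["completion"]
    batch = tokenizer(text, return_tensors="pt")
    # A standard causal-LM loss on the teacher's trace distills its reasoning.
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```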

The DeepSeek-R1 series, including this latest update, is licensed under the MIT License, allowing commercial use and distillation. The community has responded quickly: Unsloth AI announced in a blog post that it quantized the 671B-parameter R1 model from 720GB down to 185GB, a roughly 75% reduction, making it more accessible for local use while maintaining strong functionality.
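For reference, 185GB out of 720GB works out to a reduction of about 74%, in line with the quoted figure. Quantized GGUF builds of this kind can typically be run locally with llama-cpp-python, as in the hedged sketch below; the file name and settings here are purely illustrative, and Unsloth’s blog post documents the actual shards and recommended parameters.

```python
# Sketch: running a quantized GGUF build locally with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./DeepSeek-R1-0528-quantized.gguf",  # hypothetical local filename
    n_ctx=8192,        # context window; raise if memory allows
    n_gpu_layers=-1,   # offload all layers to GPU if one is present
)

out = llm("Explain what a Mixture-of-Experts model is in two sentences.",
          max_tokens=256)
print(out["choices"][0]["text"])
```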

Navigating Competition and Geopolitical Realities

While DeepSeek’s Hugging Face post details a significant upgrade, some early reports characterized the release as a “minor trial upgrade.” This framing was echoed by The Express Tribune, which mentioned a DeepSeek representative describing it similarly in a private WeChat group. The iterative approach comes as DeepSeek prepares its next-generation R2 reasoning model, whose launch was reportedly accelerated to better compete with global AI labs.

The original DeepSeek R1 made a significant impact earlier in the year by outperforming OpenAI’s o1 on several reasoning benchmarks. Its open-source nature has spurred third-party adaptations, such as Perplexity AI’s R1 1776, a censorship-free variant. DeepSeek has consistently contributed to open-source AI, releasing tools like the FlashMLA decoding kernel and the DeepSeek-Prover-V2-671B model for mathematical theorem proving.

These advancements unfold amidst considerable geopolitical pressures. In April, a US House Select Committee on the CCP labeled DeepSeek a national security risk. Committee Chairman John Moolenaar asserted the report showed DeepSeek was not just another AI app but “a weapon in the Chinese Communist Party’s arsenal, designed to spy on Americans, steal our technology, and subvert U.S. law.”

In response to such pressures and US export controls limiting access to top-tier Nvidia GPUs, DeepSeek has strategically focused on computational efficiency. This focus was seemingly validated when Tencent confirmed leveraging DeepSeek models in late 2024.

Markus Kasanmascheff
Markus has been covering the tech industry for more than 15 years. He holds a Master’s degree in International Economics and is the founder and managing editor of Winbuzzer.com.
