Tencent says it’s reducing its reliance on NVIDIA GPUs by deploying AI models from DeepSeek that deliver higher efficiency with fewer chips, a shift the company describes as a long-term infrastructure strategy aimed at cutting hardware dependency and improving scalability.
The update came during its Q4 2024 earnings call, where the company emphasized performance gains from using DeepSeek’s architecture-optimized models.
According to the company’s chief strategy officer, Tencent has been able to minimize GPU consumption while maintaining output. “[T]he industry and we, within the industry, are getting much higher productivity on a large language model training from existing GPUs without needing to add additional GPUs at the pace previously expected,” the executive said. He explained further:
“Chinese companies are generally prioritizing efficiency and utilization — efficient utilization of the GPU servers. And that doesn’t necessarily impair the ultimate effectiveness of the technology that’s being developed. And I think DeepSeek’s success really sort of symbolize and solidify — demonstrated that — that reality. […] If suddenly sort of there’s a search demand, then we can definitely sort of increase our order for GPUs. So, I think we’ll be sort of very flexible and dynamic in responding to the market dynamics.”
The company’s evolving approach marks a departure from traditional scale-up strategies—favoring compute efficiency, model specialization, and local sourcing.
Efficiency Over Scale: Why Tencent Is Backing DeepSeek
DeepSeek’s upcoming R2 model is now being fast-tracked for release ahead of its originally planned May 2025 launch.
Reports also indicate that R2 will include multimodal capabilities, enhancing its usefulness across enterprise use cases.
Newer models like those from DeepSeek aren’t just lighter; they’re also optimized for Tencent’s computing stack, making them more efficient during both training and inference.
While the company has also invested in in-house development, such as its Hunyuan Turbo S model, DeepSeek is being used for more complex workloads requiring advanced reasoning and cross-lingual performance.
Tencent’s investment in DeepSeek doesn’t preclude continued GPU spending. In fact, the company has reportedly made large orders of NVIDIA’s China-specific H20 chips to support DeepSeek integration across apps like WeChat, as detailed in a TrendForce report on its H20 chip purchases. However, the infrastructure play is clearly shifting from raw expansion to architectural optimization.
Scaling Smarter: Contrasting DeepSeek with OpenAI’s GPT-4.5
The shift comes as other tech companies double down on scale. In February, OpenAI introduced GPT-4.5, calling it its largest and most capable model to date.
Yet CEO Sam Altman downplayed expectations, admitting on X: “Bad news: it is a giant, expensive model.”
The model improved performance in multilingual and multimodal tasks but failed to outperform smaller reasoning-specific models like o3-mini in structured domains like math and scientific logic.
Industry reaction was mixed, with analysts questioning whether OpenAI’s continued scaling strategy delivers enough in return. Against that backdrop, Tencent’s approach, focused on inference efficiency and locally tailored models, may stand out as the more sustainable strategy.
Industry Turns Toward Inference Optimization
This trend isn’t unique to Tencent. A recent research paper proposed a method called “Sample, Scrutinize and Scale”, which improves reasoning through inference-time self-verification.
Models generate multiple outputs per query and select the most accurate using internal scoring mechanisms. While this increases computational overhead at runtime, it avoids the ballooning costs of pre-training massive models and is seen as a more targeted approach to improving reasoning tasks.
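The sample-then-verify loop described above can be sketched in a few lines. This is a hypothetical toy illustration, not the paper’s actual method or any real LLM API: the "model" is a deterministic stand-in that answers arithmetic queries with occasional off-by-one noise, and the "verifier" scores candidates by re-deriving the answer.

```python
def sample_scrutinize_scale(query, model, verifier, n_samples=4):
    """Draw several candidate answers, score each with a verifier,
    and return the highest-scoring one."""
    candidates = [model(query) for _ in range(n_samples)]   # Sample
    scored = [(verifier(query, c), c) for c in candidates]  # Scrutinize
    scored.sort(key=lambda sc: sc[0], reverse=True)         # best first
    return scored[0][1]                                     # Scale: more samples, better odds

def make_noisy_adder():
    """Toy 'model': answers a+b queries, sometimes off by one."""
    offsets = iter([1, 0, -1, 0])  # deterministic stand-in for sampling noise
    def model(query):
        a, b = map(int, query.split("+"))
        return a + b + next(offsets)
    return model

def checker(query, answer):
    """Toy verifier: re-derive the result and score agreement."""
    a, b = map(int, query.split("+"))
    return 1.0 if answer == a + b else 0.0

print(sample_scrutinize_scale("17+25", make_noisy_adder(), checker))  # 42
```

Even with half the candidates wrong, the verifier’s scoring recovers the correct answer, which is the trade the technique makes: extra compute at inference time instead of a larger pre-trained model.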
While DeepSeek has not confirmed whether it uses such techniques, its performance in reasoning-heavy benchmarks and low GPU training footprint suggest it may be adopting similar strategies. For Tencent, this offers a route to scale AI services without overcommitting to limited or restricted GPU inventories.
Microsoft’s CoreWeave Exit Highlights a Broader Shift
Tencent isn’t alone in reevaluating its infrastructure. Microsoft recently declined an option for nearly $12 billion in GPU cloud capacity from CoreWeave, which OpenAI took up instead. The $11.9 billion deal included a $350 million equity stake by OpenAI ahead of CoreWeave’s IPO, and it allows OpenAI to diversify its compute sources beyond Microsoft Azure.
Microsoft, meanwhile, is doubling down on its in-house chips, such as the Azure Maia and Cobalt accelerators. The company is also scaling back physical expansion. Microsoft canceled multiple AI data center leases, including a $3.3 billion facility in Wisconsin, after internal demand forecasts were revised. TD Cowen analysts noted that updated OpenAI usage projections played a major role in the decision.
This divergence in strategies—OpenAI racing for more external compute, while Microsoft and other companies like Tencent build more streamlined internal pipelines—reflects broader industry discomfort with the current cost and availability of high-performance AI chips.
China’s Open-Source Strategy After DeepSeek’s “Sputnik” Moment
DeepSeek’s rise also fits neatly into China’s broader AI strategy. In response to U.S. export restrictions on advanced chips, many Chinese tech firms are turning to open-sourcing their models.
This approach enables faster iteration, encourages global adoption, and reduces the cost of training. For Tencent, adopting DeepSeek’s more open and cost-effective models aligns with these national and operational priorities.
The Guardian described its emergence as a “Sputnik moment” for the U.S. AI industry, with $1 trillion briefly wiped off global tech stock valuations following its debut.
While the long-term viability of DeepSeek’s models will depend on real-world performance, scale, and regulatory scrutiny, the trend is unmistakable. Chinese developers are building highly capable AI models using fewer chips and less capital—challenging the traditional model of success based solely on scale and compute.
There are caveats. Tencent has had to purchase large quantities of NVIDIA’s H20 chips to maintain service delivery, despite overall reductions in GPU usage. These models still require robust backend hardware, and China’s ability to sustain supply remains uncertain under ongoing U.S. restrictions.
Even so, Tencent’s strategy sends a clear signal. Efficiency, not expansion, is becoming the new standard. As global tech companies assess the cost of scaling and the fragility of supply chains, models like DeepSeek R2 offer a different blueprint: smaller, smarter, and possibly more sustainable.