- Demand Outlook: AI server demand is expected to stay strong through 2027, raising supply pressure for cloud operators and hardware makers.
- Capacity Pressure: Chip-on-Wafer-on-Substrate packaging, components, power systems, and two-year fab lead times could slow server availability.
- Forecast Caveat: Nomura and Mizuho figures are analyst estimates, not confirmed market outcomes or investment advice.
- Product Drivers: Nvidia’s Vera Rubin AI platform and Amazon Web Services’ Trainium3 AI chip keep demand tied to deployable infrastructure.
Demand for servers built for artificial intelligence training, inference, and high-performance computing is expected to stay strong through 2027. Supply-chain risk could keep server availability tight enough to affect delivery planning. Earlier AI infrastructure demand pressure and CPU supply constraints make processor availability part of the same procurement problem as accelerator supply.
Cloud operators such as Amazon, Google, and Microsoft, plus hardware makers and data-center customers, face higher prices, longer lead times, and fresh bottleneck risk. Analyst estimates suggest global AI server revenue could grow 78% in 2026 and 76% in 2027; the earlier 2026 estimate was 43%. Mizuho’s TSMC CoWoS capacity forecasts could add a separate packaging constraint on top of the stronger outlook for server CPU demand.
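To put the two growth forecasts on one scale, the short sketch below compounds the 78% and 76% estimates into a single multiple of 2025 revenue. It is illustrative arithmetic only: the function and variable names are hypothetical, and it assumes both percentages are year-over-year rates applied to the same revenue series.

```python
def compound_multiple(growth_rates):
    """Multiply successive year-over-year growth rates into one multiple."""
    multiple = 1.0
    for rate in growth_rates:
        multiple *= 1.0 + rate
    return multiple


# Forecast growth rates quoted in the article (analyst estimates, not outcomes).
forecast_growth = [0.78, 0.76]  # 2026, then 2027

multiple_vs_2025 = compound_multiple(forecast_growth)
print(f"2027 revenue as a multiple of 2025: {multiple_vs_2025:.2f}x")
```

Under those assumptions, the forecasts imply 2027 revenue of roughly 3.1 times the 2025 level, which is why lead times and supplier depth matter as much as price.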
Aaron Jeng, head of Taiwan equity research at Nomura, warns that the demand outlook still carries unresolved shortage risk: “The market hasn’t even dealt with some of the risks and shortages to come,” he said. For customers, the AI infrastructure boom becomes a planning problem around delivery dates, price protection, and supplier depth.
Where Capacity Starts to Pinch
Those constraints start with the parts of the semiconductor chain that scale slowly. Relief may lag because new fabrication plants take about two years to build, keeping new output beyond the reach of a single spending cycle. Server deployment schedules also depend on synchronized capacity across fabs, packaging lines, power systems, and data-center construction. Power, networking, and construction teams can become schedule blockers even when chip supply improves, because each step has to be reserved before servers can be installed.
On the packaging side, Chip-on-Wafer-on-Substrate (CoWoS) is the advanced packaging process used to connect high-performance chips and memory inside AI systems. TSMC’s CoWoS yield has improved, but demand forecasts still point to a tight allocation window. TSMC’s monthly CoWoS capacity could reach 140,000 units by 2026 and 190,000 to 200,000 units by 2027, while forecast Nvidia CoWoS demand could stand at 630,000 units in 2026, rising to 1,005,000 units in 2027.
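Because capacity is quoted per month and demand per year, the figures are easier to compare once annualized. The sketch below does that under two loud assumptions that the article does not confirm: that capacity and demand are counted in the same units, and that the quoted monthly capacity holds for all twelve months of the year. The second assumption is an upper bound, since capacity reached "by" a given year may only be available for part of it. All names are hypothetical.

```python
MONTHS_PER_YEAR = 12

# Forecast figures quoted in the article: (low, high) monthly capacity per year.
monthly_capacity = {2026: (140_000, 140_000), 2027: (190_000, 200_000)}
# Forecast yearly Nvidia CoWoS demand, assumed to use the same units as capacity.
nvidia_demand = {2026: 630_000, 2027: 1_005_000}


def demand_share(year):
    """Return (low, high) Nvidia share of annualized capacity for a year."""
    cap_low, cap_high = monthly_capacity[year]
    demand = nvidia_demand[year]
    # Higher capacity means a smaller share, so cap_high yields the low bound.
    return (
        demand / (cap_high * MONTHS_PER_YEAR),
        demand / (cap_low * MONTHS_PER_YEAR),
    )


for year in (2026, 2027):
    low, high = demand_share(year)
    print(f"{year}: Nvidia share of annualized capacity {low:.1%} to {high:.1%}")
```

On those assumptions, Nvidia alone would take about 37.5% of annualized capacity in 2026 and roughly 42% to 44% in 2027. If capacity ramps during the year rather than holding at the quoted level, the real share would be higher.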
Beyond packaging, CPU availability is becoming a separate constraint rather than a footnote to GPU supply. Expected server CPU demand growth tops 50% year over year in 2027 across x86 and ARM-based chips, spanning Nvidia, Intel, AMD, and cloud-service-provider chips. Customers have to watch packaging allocations, component-shortage risk, and separate memory supply shortages that could persist into 2027.
GPU allocation alone does not guarantee deployable servers if CPUs, memory, substrates, or power equipment arrive late.
The bottleneck is not a single component: server vendors need substrates, high-end capacitors, power-management chips, and optical parts, along with enough packaging slots, to line up in sequence.
Why the Demand Looks Sticky
Product timing gives the demand outlook a concrete server roadmap. Nvidia’s 2025 AI superchips milestone and Vera CPU shipments for agentic AI data centers carry the newer platform into 2026, and Nvidia’s Rubin-based products are expected to be available from partners in the second half of 2026. Multi-step AI systems need low-latency, high-bandwidth processors to coordinate tools, memory, and model calls, which could increase server CPU demand alongside GPU and accelerator demand.
Partner timing turns product availability into a capacity test for cloud providers, server builders, and customers waiting for usable instances.
Cloud-provider demand adds a second product lane. AWS, Google Cloud, Microsoft, OCI, CoreWeave, Lambda, Nebius, and Nscale are among the first cloud providers and partners expected to deploy Vera Rubin-based instances in 2026. Amazon EC2 Trn3 UltraServers currently use Trainium3, AWS’s fourth-generation AI chip and its first built on a 3nm process. They scale to 144 Trainium3 chips and up to 362 MXFP8 petaflops, and deliver up to 4.4 times higher performance than Trn2 UltraServers. A shortfall in the 2026 CoWoS ramp would leave 2027 server rollouts queued behind substrate and packaging slots before accelerator orders become deployed capacity.


