NVIDIA CEO Jensen Huang introduced the company’s latest AI infrastructure product, the Blackwell Ultra AI Factory Platform, designed to significantly accelerate AI reasoning, agentic AI, and physical AI workloads. The announcement was made at NVIDIA’s annual GPU Technology Conference (GTC) 2025.
Technical Specifications and AI Performance
The Blackwell Ultra platform introduces the GB300 NVL72 rack-scale solution, integrating 72 NVIDIA Blackwell Ultra GPUs with 36 Arm-based NVIDIA Grace CPUs. According to NVIDIA, this configuration delivers 1,400 petaFLOPS of FP4 AI performance, representing a 1.5× increase in dense FP4 computing capability over the previous Blackwell B200 generation.
A single Blackwell Ultra GPU delivers the same 20 petaFLOPS of AI compute as the standard Blackwell GPU but increases high-bandwidth memory capacity from 192GB to 288GB of HBM3e per GPU.
Consequently, a full-scale DGX GB300 Superpod cluster provides the same processing power (11.5 exaFLOPS of FP4 compute) as earlier Blackwell-based clusters but now offers 300TB of total memory, up from 240TB, reflecting NVIDIA’s strategic focus on handling larger AI models and improving reasoning performance.
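The quoted figures scale up consistently from the per-GPU numbers, as a quick back-of-envelope check shows. The eight-rack Superpod size below is an assumption inferred from the arithmetic (11.5 exaFLOPS ÷ 1.44 exaFLOPS per rack), not a figure from the announcement:

```python
# Back-of-envelope check of NVIDIA's published Blackwell Ultra figures.
GPUS_PER_RACK = 72            # GB300 NVL72: 72 Blackwell Ultra GPUs per rack
RACKS_PER_SUPERPOD = 8        # assumed, inferred from the exaFLOPS arithmetic
FP4_PER_GPU_PFLOPS = 20       # dense FP4 AI compute per Blackwell Ultra GPU
HBM_PER_GPU_GB = 288          # HBM3e capacity per GPU

gpus = GPUS_PER_RACK * RACKS_PER_SUPERPOD             # 576 GPUs per Superpod
rack_pflops = GPUS_PER_RACK * FP4_PER_GPU_PFLOPS      # 1,440 PFLOPS, near the quoted 1,400
superpod_eflops = gpus * FP4_PER_GPU_PFLOPS / 1000    # 11.52 exaFLOPS, matching the quoted 11.5
gpu_memory_tb = gpus * HBM_PER_GPU_GB / 1000          # ~166 TB of HBM3e alone

# Note: HBM3e accounts for only part of the quoted 300TB total; the remainder
# presumably comes from the Grace CPUs' attached memory.
print(gpus, rack_pflops, superpod_eflops, gpu_memory_tb)
```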
Alongside hardware improvements, NVIDIA also announced Dynamo, an open-source inference serving software framework, designed to optimize throughput, reduce latency, and scale AI reasoning services, further enhancing the capabilities of Blackwell Ultra deployments.
Comparison to Previous NVIDIA Architectures
Rather than directly comparing Blackwell Ultra with its immediate predecessor, NVIDIA emphasized performance advantages over its widely adopted 2022-era H100 chips, highlighting Blackwell Ultra’s superior inference speeds and its 1.5 times faster FP4 inference performance.
Additionally, NVIDIA demonstrated substantial AI reasoning speedups; an NVL72 cluster running the DeepSeek-R1 671B model now generates responses in just 10 seconds, significantly outperforming the H100’s typical 90-second response time.
This performance gain stems from a tenfold improvement in token processing speed, reaching 1,000 tokens per second compared to the H100’s 100 tokens per second.
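The quoted latency and throughput figures are mutually consistent, assuming response time is roughly the number of generated tokens divided by the token rate, as this sanity check illustrates:

```python
# Sanity check: implied response length at each chip's quoted token rate.
H100_TOKENS_PER_S = 100
ULTRA_TOKENS_PER_S = 1_000

h100_tokens = 90 * H100_TOKENS_PER_S     # 9,000 tokens generated in 90 s
ultra_tokens = 10 * ULTRA_TOKENS_PER_S   # 10,000 tokens generated in 10 s

# Both imply a response on the order of 10,000 tokens, so the 10x gain in
# token rate accounts for the ~9x reduction in response time.
print(h100_tokens, ultra_tokens)
```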
“The Blackwell Ultra NVL72 dramatically accelerates AI reasoning workloads, enabling near-instantaneous responses even on the largest models,” said NVIDIA CEO Jensen Huang during the GTC keynote.
NVIDIA’s AI Roadmap: Vera Rubin Superchips in 2026
Following Blackwell Ultra, NVIDIA’s ambitious roadmap includes the Vera Rubin superchip, scheduled for release in late 2026. Named in honor of astronomer Vera Rubin, this next-gen chip will pair NVIDIA’s custom-designed Vera CPU, based on the new Olympus architecture, with the Rubin GPU. The Vera CPU aims to deliver double the performance of the current Grace CPU, and the platform features up to 288GB of high-bandwidth memory per GPU.
The Vera Rubin architecture will support dual-GPU designs on a single die, providing 50 petaFLOPS of FP4 inference performance per chip. The Vera CPU itself comprises 88 Arm cores supporting simultaneous multithreading for 176 threads per socket, complemented by a high-speed 1.8TB/s NVLink core-to-core interface for enhanced CPU-GPU communication.
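Taking the quoted numbers at face value, the generational jump and the Vera CPU’s thread count work out as follows (a simple summary of figures stated above, nothing beyond them):

```python
# Comparing quoted per-chip figures: Vera Rubin vs. Blackwell Ultra.
RUBIN_FP4_PFLOPS = 50    # per Rubin chip (dual-GPU die), FP4 inference
ULTRA_FP4_PFLOPS = 20    # per Blackwell Ultra GPU
VERA_CORES = 88          # Arm cores per Vera CPU
THREADS_PER_CORE = 2     # simultaneous multithreading

speedup = RUBIN_FP4_PFLOPS / ULTRA_FP4_PFLOPS   # 2.5x per-chip FP4 inference
threads = VERA_CORES * THREADS_PER_CORE         # 176 threads per socket
print(speedup, threads)
```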
These planned innovations underscore NVIDIA’s commitment to maintaining its leadership position in AI hardware through continuous, annual advances in chip architecture and performance.
Industry Adoption: Companies Announcing Products Featuring NVIDIA Blackwell Ultra
Dell Technologies
Dell Technologies announced its support for NVIDIA’s Blackwell Ultra GPUs in the latest generation of PowerEdge Pro Max AI servers. These systems target enterprises and research institutions demanding robust AI infrastructure, promising significant gains in AI workload efficiency and large-model processing.
The new PowerEdge Pro Max AI servers will incorporate NVIDIA’s Blackwell Ultra GPUs, offering up to 288GB of HBM3e memory per GPU. Multi-GPU configurations will provide the large memory capacities needed for large-scale AI reasoning and inference, supporting more complex models and higher data throughput.
Giga Computing
Giga Computing is integrating NVIDIA’s Blackwell Ultra technology into its latest rack-scale server solutions. The products aim to deliver scalable performance for demanding AI, machine learning, and data-intensive applications, enhancing compute capabilities across cloud and data center environments.
Technical highlights include compatibility with NVIDIA’s GB300 NVL72 systems, integrating 72 Blackwell Ultra GPUs and 36 Grace CPUs per rack. These configurations deliver higher performance, lower inference latency, and greater scalability for enterprises running advanced AI deployments.
Hewlett Packard Enterprise (HPE)
Hewlett Packard Enterprise introduced new enterprise-focused AI systems built around NVIDIA’s Blackwell Ultra GPUs, targeting accelerated deployment of generative, agentic, and physical AI applications. These solutions enable businesses to rapidly scale their AI capabilities, reducing deployment complexity and enhancing performance.
HPE’s Blackwell Ultra-powered offerings include the HPE Cray XD6500 supercomputing platform, which features the NVIDIA GB300 NVL72 configuration.
Inventec
Inventec has unveiled new server systems built on NVIDIA’s Blackwell Ultra GPUs, designed for large-scale AI workloads and data center deployments. The company aims to provide enterprises with powerful AI computing platforms that simplify the adoption of sophisticated AI models and workloads.
Pegatron
Pegatron introduced the SVR series servers incorporating NVIDIA Blackwell Ultra GPUs, designed for data centers needing powerful, efficient computing solutions for advanced AI and machine learning applications.
Wiwynn
Wiwynn showcased its latest liquid-cooled AI server solutions featuring NVIDIA’s Blackwell Ultra GPUs at GTC 2025. Designed for data centers prioritizing efficiency and high-performance computing, Wiwynn’s products facilitate energy-efficient AI computing.