Recent analyses suggest that Nvidia is making a profit of nearly 1,000% on every H100 Tensor Core GPU it sells. Raymond James, a financial services firm, estimated in figures shared by Barron's that one such GPU costs around $3,320 to produce. Nvidia, by contrast, sells these GPUs for between $25,000 and $30,000, depending on order volume.
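As a rough check on that figure, the arithmetic can be sketched from the two numbers quoted above alone. The function names below are illustrative, and the calculation assumes the $3,320 estimate covers the entire unit cost, which leaves out R&D, software, and other overhead. Depending on the definition used, the result differs: markup over cost comes out at roughly 650% to 800%, while the selling price is roughly 7.5 to 9 times the cost, which is closer to the "nearly 1,000%" headline.

```python
# Figures quoted in the article (Raymond James estimate, shared by Barron's).
UNIT_COST_USD = 3_320
PRICE_RANGE_USD = (25_000, 30_000)


def markup_pct(price: float, cost: float) -> float:
    """Gross profit expressed as a percentage of unit cost."""
    return (price - cost) / cost * 100


def gross_margin_pct(price: float, cost: float) -> float:
    """Gross profit expressed as a percentage of selling price."""
    return (price - cost) / price * 100


def price_multiple(price: float, cost: float) -> float:
    """Selling price expressed as a multiple of unit cost."""
    return price / cost


if __name__ == "__main__":
    for price in PRICE_RANGE_USD:
        print(
            f"price ${price:,}: "
            f"markup {markup_pct(price, UNIT_COST_USD):.0f}%, "
            f"gross margin {gross_margin_pct(price, UNIT_COST_USD):.1f}%, "
            f"price/cost {price_multiple(price, UNIT_COST_USD):.2f}x"
        )
```

Gross margin, the measure usually reported in financial statements, can never exceed 100%, so a figure like 1,000% only makes sense as a markup or a price-to-cost ratio.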
Access to Nvidia H100 GPU compute is essentially sold out until 2024. AI startups are starting to worry they won't be able to acquire the needed GPU capacity to serve their models https://t.co/qyQNk34Mkw pic.twitter.com/1CWndW6Xbq
— tae kim (@firstadopter) August 10, 2023
The H100 AI Accelerator is a Gold Mine
The H100 Tensor Core GPU, built on Nvidia's Hopper architecture and often referred to as the “Hopper” AI accelerator, is a GPU designed for high-performance computing, particularly artificial intelligence workloads. GPUs, or graphics processing units, were originally built to render graphics, and their highly parallel design also makes them well suited to AI computations. The H100's capabilities in this area have made it one of the most sought-after components in the tech industry.
Comparing Margins Across Products
Set against other products, the H100's margins look even more striking. For instance, EVGA, a former Nvidia partner, disclosed that its profit margins for power supply units (PSUs) surpassed its GPU margins by 300%. This suggests that Nvidia's profit margins on its consumer-facing GeForce GPUs may be considerably slimmer than those on its server-oriented GPUs.
Costs Beneath the Surface
While these profit margins might seem exorbitant at first glance, they do not account for the full cost of bringing a product like the H100 to market. Such a sophisticated GPU requires extensive research and development, often spanning years. Nvidia compensates its engineers generously, with average salaries for electronics hardware engineers hovering around $202,000 annually. Developing a GPU like the H100 involves thousands of specialized workers, each contributing a large number of hours. These costs, combined with up-front expenses such as wafer capacity and supply chain payments, can significantly offset the apparent profit margins.
The AI GPU Market is on Fire
The tech industry's demand for Nvidia's H100 GPUs shows no sign of slowing. TSMC, the Taiwanese semiconductor giant, is projected to ship H100 chips valued at $22 billion in 2023 alone. This demand is driven by the ongoing AI boom, with numerous AI firms competing to buy H100 GPUs in large quantities.
With the AI accelerator market expected to be worth approximately $150 billion by 2027, Nvidia's position in this sector appears robust. Current market trends suggest that Nvidia's AI-centric products, including the H100, will remain in short supply, with new orders not expected to be filled until 2024.
Microsoft is just one major client that uses the H100 and A100 data center GPUs from Nvidia to power its AI virtual machines on its Azure cloud infrastructure. Last November, both companies started a long-term collaboration with the plan to develop “one of the most powerful AI supercomputers in the world”. In March, Microsoft launched the most powerful AI virtual machine series it has ever had on Azure. Known as the ND H100 v5 VM, this is an on-demand AI VM available in different sizes from just eight to thousands of connected GPUs.
Palo Alto-based AI startup Inflection AI, in collaboration with CoreWeave and NVIDIA, is constructing what it describes as the largest AI cluster in the world, comprising 22,000 NVIDIA H100 Tensor Core GPUs. The deployment will support the training and deployment of a new generation of large-scale AI models. The cluster is estimated to deliver 22 exaFLOPS in 16-bit precision mode, and even more if lower precision is used.
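The cluster figures above imply a per-GPU throughput that is easy to back out. The sketch below is only an illustration: the function name is hypothetical, and it assumes the 22 exaFLOPS estimate is a simple sum of per-GPU peak throughput with no interconnect or utilization losses.

```python
# Figures quoted in the article for the Inflection AI cluster.
GPU_COUNT = 22_000
CLUSTER_EXAFLOPS_16BIT = 22.0

PETAFLOPS_PER_EXAFLOP = 1_000


def per_gpu_petaflops(cluster_exaflops: float, gpu_count: int) -> float:
    """Average peak throughput per GPU, assuming perfectly linear scaling."""
    return cluster_exaflops * PETAFLOPS_PER_EXAFLOP / gpu_count


if __name__ == "__main__":
    per_gpu = per_gpu_petaflops(CLUSTER_EXAFLOPS_16BIT, GPU_COUNT)
    print(f"Implied 16-bit peak per GPU: {per_gpu:.2f} petaFLOPS")
```

Under that assumption, the estimate works out to about 1 petaFLOPS of 16-bit compute per H100. Sustained throughput on real training workloads would be lower than this peak figure.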
AI Supercomputers on the Rise
AI supercomputers are the backbone of AI research and development, enabling the creation of sophisticated models and algorithms. Here are some of the systems currently in use or under development.
- Summit: Developed by IBM and Oak Ridge National Laboratory, Summit is one of the most prominent AI supercomputers. It has a peak performance of 200 petaflops and can handle 3.3 exabytes of data. Summit is used for various scientific and medical research projects, such as simulating climate change, discovering new drugs, and analyzing genomic data.
- Sunway TaihuLight: Built by China's National Research Center of Parallel Computer Engineering and Technology, Sunway TaihuLight has a peak performance of 125 petaflops and can process 10.65 petabytes of data. It is mainly used for industrial and engineering applications, such as weather forecasting, oil exploration, and aerospace design.
- Selene: Developed by NVIDIA and hosted by the New Mexico Consortium, Selene has a peak performance of 63 petaflops and can store 1.6 petabytes of data. Selene is designed to support NVIDIA's research and development in AI, such as natural language processing, computer vision, and recommender systems.
- Andromeda: Built by Cerebras, Andromeda is a unique AI supercomputer with 13.5 million cores capable of speeds over an exaflop. It is designed specifically for AI and has demonstrated near-perfect linear scaling of AI workloads for large language models.
- IBM Vela: IBM's first AI-optimized, cloud-native supercomputer, Vela, is designed exclusively for large-scale AI. It is housed within IBM Cloud and is currently used by the IBM Research community. Vela's design offers flexibility to scale up at will and readily deploy similar infrastructure into any IBM Cloud data center across the globe.
- DGX SuperPOD: NVIDIA's DGX SuperPOD is a powerful AI supercomputer designed for enterprise-scale AI infrastructure. It is powered by NVIDIA A100 Tensor Core GPUs and delivers 700 petaflops of AI performance.
- Meta's AI Research SuperCluster (RSC): Meta's RSC is a supercomputer designed to accelerate AI research. It is one of the fastest AI supercomputers in the world and is used to train large AI models, including natural language processing and computer vision models.