NVIDIA's DGX Cloud platform, a cloud-hosted artificial intelligence supercomputing service, is now broadly available, offering enterprises instant access to the infrastructure and software needed to train advanced models for generative AI and other applications. The platform was first unveiled at NVIDIA's GTC conference in March and is now accessible on Oracle Cloud Infrastructure, as well as NVIDIA infrastructure located in the U.S. and U.K.
DGX Cloud: A Game Changer for AI Infrastructure
Each DGX Cloud instance provides access to eight of NVIDIA's 80-gigabyte Tensor Core GPUs, for a total of 640 gigabytes of GPU memory per node. A high-performance, low-latency networking fabric lets workloads scale across clusters of interconnected systems, so that multiple DGX Cloud instances can act as one massive GPU. The platform is paired with NVIDIA's AI Enterprise software, which offers customers access to more than 100 AI frameworks and pre-trained models.
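The memory figure follows directly from the GPU count and the per-GPU capacity. A minimal Python sketch of that arithmetic is shown here; the function names and the four-node example are illustrative assumptions, and the summed figure is nominal capacity only, not a claim about how memory is actually pooled across instances.

```python
# Figures taken from the article: eight GPUs per node, 80 GB each.
GPUS_PER_NODE = 8
GPU_MEMORY_GB = 80


def node_memory_gb(gpus: int = GPUS_PER_NODE, memory_gb: int = GPU_MEMORY_GB) -> int:
    """Nominal GPU memory of a single node, in gigabytes."""
    return gpus * memory_gb


def cluster_memory_gb(nodes: int) -> int:
    """Nominal GPU memory summed over several nodes (hypothetical helper)."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * node_memory_gb()


if __name__ == "__main__":
    print(node_memory_gb())      # 640
    print(cluster_memory_gb(4))  # 2560
```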
Pricing and Availability
The DGX Cloud platform is available for rent on a monthly basis, starting at $36,999 per instance. It is launching first on Oracle Corp.'s cloud infrastructure and will later be offered on Microsoft Corp.'s Azure platform, followed by Google Cloud. NVIDIA has not yet said whether cloud industry leader Amazon Web Services Inc. will also host DGX Cloud.
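To put the starting price in perspective, the rough Python sketch below estimates the cost of a multi-instance rental and the implied price per GPU-hour. The 730-hour month and the helper names are assumptions made for illustration, and because $36,999 is a starting price, actual bills may be higher.

```python
# Starting monthly price per instance, from the article.
MONTHLY_PRICE_USD = 36_999
# Eight GPUs per instance, from the article.
GPUS_PER_INSTANCE = 8
# Assumption: an average month of 365 * 24 / 12 = 730 hours.
HOURS_PER_MONTH = 730


def rental_cost_usd(instances: int, months: int = 1) -> int:
    """Minimum rental cost for a number of instances over a number of months."""
    return instances * months * MONTHLY_PRICE_USD


def price_per_gpu_hour_usd() -> float:
    """Implied starting price per GPU-hour under the 730-hour assumption."""
    return MONTHLY_PRICE_USD / GPUS_PER_INSTANCE / HOURS_PER_MONTH


if __name__ == "__main__":
    print(rental_cost_usd(4, months=3))         # 443988
    print(round(price_per_gpu_hour_usd(), 2))   # 6.34
```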
Impact on Various Industries
Generative AI, which could add more than $4 trillion to the economy annually according to McKinsey, is being adopted by leading companies in every industry. Early adopters of the DGX Cloud platform are already putting it to work in several sectors. Healthcare firms have been using DGX Cloud to train protein models and accelerate drug discovery and clinical reporting. Financial services firms are using the platform to optimize portfolios, forecast trends, build recommendation engines, and develop intelligent chatbots. Insurance companies are also using DGX Cloud to build models that can automate much of the claims process.
The Future of AI Supercomputing
NVIDIA's DGX Cloud platform is seen as a strategic move to meet the high demand for GPUs and the growing need for AI supercomputing resources. As generative AI becomes more common, organizations are changing how they use it, moving from publicly trained models such as OpenAI's GPT-4 to private instances in which they can use their own data and develop their own proprietary use cases. The way they access the heavy-duty computing power this requires will change accordingly.
Other Powerful AI Supercomputers
AI supercomputers are the backbone of modern artificial intelligence research and development. These powerful machines are capable of processing vast amounts of data and running complex algorithms, enabling the creation and training of sophisticated AI models. The following list provides an overview of some of the most powerful AI supercomputers in the world, some already operating and others still in development:
Condor Galaxy 1 (CG-1): Developed by Cerebras Systems and G42, CG-1 is part of the Condor Galaxy network of interconnected supercomputers. It boasts a staggering 4 exaFLOPs of AI training capacity and 54 million cores. The supercomputer is offered as a cloud service, allowing customers to leverage its power without having to manage physical systems. Cerebras and G42 plan to deploy more supercomputers in the network, aiming for a total compute power of 36 exaFLOPs by 2024.
Summit: Developed by IBM and Oak Ridge National Laboratory, Summit has a peak performance of 200 petaflops and can reach up to 3.3 exaops in mixed-precision calculations. Summit is used for various scientific and medical research projects, such as simulating climate change, discovering new drugs, and analyzing genomic data.
DGX SuperPOD: NVIDIA's DGX SuperPOD is a powerful AI supercomputer designed for enterprise-scale AI infrastructure. It is powered by NVIDIA A100 Tensor Core GPUs and delivers 700 petaflops of AI performance.
Sunway TaihuLight: Built by China's National Research Center of Parallel Computer Engineering and Technology, Sunway TaihuLight has a peak performance of 125 petaflops and about 10.65 million processor cores. It is mainly used for industrial and engineering applications, such as weather forecasting, oil exploration, and aerospace design.
Selene: Developed by NVIDIA and hosted by the New Mexico Consortium, Selene has a peak performance of 63 petaflops and can store 1.6 petabytes of data. Selene is designed to support NVIDIA's research and development in AI, such as natural language processing, computer vision, and recommender systems.
Andromeda: Built by Cerebras, Andromeda is a unique AI supercomputer with 13.5 million cores capable of speeds over an exaflop. It is designed specifically for AI and has demonstrated near-perfect linear scaling of AI workloads for large language models.
IBM Vela: IBM's first AI-optimized, cloud-native supercomputer, Vela, is designed exclusively for large-scale AI. It is housed within IBM Cloud and is currently used by the IBM Research community. Vela's design offers flexibility to scale up at will and readily deploy similar infrastructure into any IBM Cloud data center across the globe.
METI's AI Supercomputer: Japan's Ministry of Economy, Trade and Industry (METI) plans to introduce a new supercomputer through its affiliated laboratory, the National Institute of Advanced Industrial Science and Technology (AIST), as early as 2024. The machine will have 2.5 times the computing power of AIST's current system and will be available via a cloud service to Japanese companies developing generative AI. The Japanese government also plans to invest in a new supercomputer in Hokkaido that will specialize in large language model training and start operating in 2024.
Meta's AI Research SuperCluster (RSC): Meta's RSC is a supercomputer designed to accelerate AI research. It is one of the fastest AI supercomputers in the world and is used to train large AI models, including natural language processing and computer vision models.