The Chan Zuckerberg Initiative (CZI), founded by Mark Zuckerberg and Priscilla Chan, is taking a significant leap in medical research by funding a massive new computing system. The system will comprise more than 1,000 of Nvidia's top-tier H100 GPUs, housed in servers built for AI workloads.
A Vision for the Future of Medical Research
The primary objective behind this ambitious project is to provide researchers with access to generative AI, enabling them to study both healthy and diseased cells. By utilizing predictive models of human cells, scientists can gain a deeper understanding of how the human body reacts to diseases and potential new medications.
This approach can be visualized as running a “virtual cell” through various simulations to predict outcomes. As CZI co-founders Chan and Zuckerberg wrote in an essay for MIT Technology Review, “AI models could predict how an immune cell responds to an infection, what happens at the cellular level when a child is born with a rare disease, or even how a patient's body will respond to a new medication.”
Bridging the Gap for Researchers
One of the challenges in the scientific community is the high cost associated with advanced tools, making them inaccessible to many researchers. The Chan Zuckerberg Initiative aims to change this narrative. The upcoming GPU cluster is designed to power “openly available” models of human cells, thereby accelerating medical research and fostering collaboration among scientists.
This computing system, once operational, is anticipated to be among the largest AI clusters dedicated to nonprofit research. However, it's worth noting that it won't surpass the size of similar systems used for developing commercial products in the private sector.
Collaborative Efforts and Future Prospects
The CZI-funded computing system will be trained using existing datasets, including data from a CZI software tool that has already indexed approximately 50 million unique cells. The initiative's Biohub Network will be responsible for purchasing the GPUs.
This network collaborates with various tech and science institutions, focusing on grand scientific challenges spanning a 10- to 15-year timeframe. A dedicated team at the San Francisco Biohub is entrusted with setting up the new computing system.
Leveraging the H100 GPU
The Nvidia H100 Tensor Core GPU, built on the company's “Hopper” architecture, is a state-of-the-art graphics processing unit designed for high-performance computing (HPC) and artificial intelligence (AI). GPUs are essential components in modern computers, enabling them to perform complex tasks, especially those related to graphics and AI computations. The H100 stands out from other GPUs due to its advanced capabilities, including:
- Faster AI performance: The H100 features fourth-generation Tensor Cores and a Transformer Engine with FP8 precision, which Nvidia says delivers up to 4x faster training for large language models such as GPT-3 compared with the previous-generation A100.
- Higher memory bandwidth: The H100 has up to 80GB of HBM3 memory, which provides the bandwidth needed to train and run the largest AI models.
- More versatile computing: The H100 can also be used for a wide range of HPC applications, such as scientific computing, data analytics, and engineering simulation.
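To make the memory figure above concrete, here is a rough back-of-the-envelope sketch (not an official sizing tool; the model sizes and bytes-per-parameter values are illustrative assumptions) of why numeric precision matters when fitting a model's weights into an 80GB H100:

```python
import math

# Illustrative assumptions: 1 byte/parameter for FP8, 2 for FP16, 4 for FP32.
# Real training needs several times more memory for gradients, optimizer
# state, and activations, so these are lower bounds on the weights alone.
H100_MEMORY_GB = 80  # per-GPU HBM3 capacity cited above
BYTES_PER_PARAM = {"fp8": 1, "fp16": 2, "fp32": 4}

def weights_gb(num_params: float, precision: str) -> float:
    """Memory (GB) occupied by the raw model weights alone."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

def gpus_needed(num_params: float, precision: str) -> int:
    """Minimum number of H100s required just to shard the weights."""
    return math.ceil(weights_gb(num_params, precision) / H100_MEMORY_GB)

if __name__ == "__main__":
    for params in (7e9, 70e9, 175e9):  # illustrative model sizes
        for prec in ("fp8", "fp16"):
            print(f"{params / 1e9:.0f}B params @ {prec}: "
                  f"{weights_gb(params, prec):.0f} GB of weights -> "
                  f"{gpus_needed(params, prec)} GPU(s) minimum")
```

For example, a hypothetical 70-billion-parameter model needs about 140 GB for its FP16 weights, exceeding a single H100's 80GB, while the same model at FP8 fits in roughly 70 GB; this is one reason lower-precision formats matter for large-model work.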