Hugging Face has launched an initiative to provide $10 million worth of GPU compute to the public, aiming to alleviate the financial strain on smaller AI development teams. The program, named ZeroGPU, was introduced by CEO Clem Delangue on X.
Addressing Resource Disparities
Delangue emphasized the resource gap between large tech companies and the open-source community, which often lacks the infrastructure needed to train and deploy AI models. This disparity has contributed to the dominance of applications like ChatGPT. ZeroGPU aims to bridge this gap by offering shared infrastructure for independent and academic AI developers to run AI demos on Hugging Face Spaces, thereby reducing their financial burden.
GPU-Poor no more: super excited to officially release ZeroGPU in beta today. Congrats @victormustar & team for the release!
In the past few months, the open-source AI community has been thriving. Not only Meta but also Apple, NVIDIA, Bytedance, Snowflake, Databricks, Microsoft,…
— clem 🤗 (@ClementDelangue) May 16, 2024
Founded in 2016, Hugging Face has become a leading provider of open-source AI models optimized for various hardware platforms, thanks to partnerships with industry giants such as Nvidia, Intel, and AMD. Delangue views open-source as the future of AI innovation and adoption, and the ZeroGPU initiative is a step towards democratizing access to essential resources.
Shared GPU Infrastructure
ZeroGPU will be accessible through Hugging Face's application hosting service and will utilize Nvidia's older A100 accelerators on a shared basis. Unlike traditional cloud providers that often require long-term commitments for cost-effective GPU rentals, Hugging Face's approach allows for more flexible usage, which is beneficial for smaller developers who cannot predict the success of their models in advance. This shared infrastructure is initially limited to AI inferencing rather than training, due to the substantial computational resources required for training even small models.
The support documentation indicates that GPU functions are capped at 120 seconds, which is insufficient for training purposes. A Hugging Face spokesperson has confirmed that the focus is primarily on inferencing, though there are plans to explore other applications in the future.
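The 120-second cap described above can be made concrete with a small sketch. The decorator name, the after-the-fact check, and the placeholder inference function below are all illustrative assumptions, not Hugging Face's actual enforcement mechanism, which would preempt a workload rather than merely flag an overrun:

```python
import time
from functools import wraps

MAX_GPU_SECONDS = 120  # ZeroGPU caps each GPU function at 120 seconds

def gpu_capped(func):
    """Run func, then verify it stayed within the per-call GPU budget.

    A real scheduler would preempt the workload mid-run; this sketch
    only measures elapsed time and flags overruns after the fact.
    """
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        result = func(*args, **kwargs)
        elapsed = time.monotonic() - start
        if elapsed > MAX_GPU_SECONDS:
            raise RuntimeError(f"GPU function exceeded {MAX_GPU_SECONDS}s budget")
        return result
    return wrapper

@gpu_capped
def run_inference(prompt: str) -> str:
    # Placeholder for a model forward pass; real inference would run here.
    return prompt.upper()
```

A two-minute ceiling comfortably covers a single inference request but rules out training runs, which iterate over a dataset for hours or days.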
Technical Implementation
The technical specifics of how Hugging Face shares GPU resources efficiently remain somewhat unclear. Delangue says the system can "efficiently hold and release GPUs as needed," but the exact mechanisms are not detailed. Potential methods include time slicing, which interleaves multiple workloads on a single GPU; Nvidia's Multi-Instance GPU (MIG) technology, which partitions one GPU into isolated instances; and GPU-accelerated containers orchestrated by Kubernetes. These techniques have been used by other cloud providers to make GPU compute more accessible.
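The "hold and release" idea can be sketched generically: a fixed pool of GPU slots is handed out to demos on demand and reclaimed as soon as each finishes. The class and method names below are illustrative, and this mirrors only the general pattern, not Hugging Face's actual scheduler:

```python
import threading
from contextlib import contextmanager

class GpuPool:
    """Illustrative 'hold and release' sharing: a fixed number of GPU
    slots that callers borrow for the duration of one workload."""

    def __init__(self, num_gpus: int):
        # Each semaphore permit represents one available GPU slot.
        self._slots = threading.Semaphore(num_gpus)

    @contextmanager
    def acquire(self, timeout: float = 30.0):
        # Block until a GPU frees up, or give up after the timeout.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no GPU available within timeout")
        try:
            yield  # the caller runs its GPU work while holding the slot
        finally:
            self._slots.release()  # slot immediately returned to the pool

pool = GpuPool(num_gpus=2)
with pool.acquire():
    pass  # inference work would run here
```

Because slots are released the moment a demo's function returns, many short-lived inference workloads can share far fewer physical GPUs than a one-GPU-per-app model would require.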
The A100 GPUs being used have 40GB of memory, enough to serve small and mid-sized models for inference but not the largest ones. ZeroGPU will be available via Hugging Face's Spaces, a hosting platform for publishing apps that already hosts over 300,000 AI demos running on CPU or paid GPU.
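A rough back-of-envelope calculation shows what the 40GB ceiling means in practice. At half precision (fp16), each parameter takes 2 bytes, so weights alone for a 7-billion-parameter model need about 13GB, while a 70-billion-parameter model needs roughly 130GB. The figures below are illustrative estimates that ignore activations and KV-cache overhead:

```python
def fp16_weight_gb(num_params_billion: float) -> float:
    """Approximate weight memory in GiB for an fp16 model (2 bytes/param)."""
    return num_params_billion * 1e9 * 2 / 1024**3

for params in (7, 13, 70):
    gb = fp16_weight_gb(params)
    verdict = "fits" if gb <= 40 else "exceeds"
    print(f"{params}B params ~ {gb:.0f} GB -> {verdict} a 40 GB A100")
```

By this estimate, models in the 7B–13B range fit comfortably on a single 40GB A100, while frontier-scale models would need quantization or multiple GPUs.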
Funding and Vision
Hugging Face's commitment to this initiative is made possible by its financial stability, as the company is “profitable, or close to profitable” and recently raised $235 million in funding, valuing it at $4.5 billion. Delangue expressed concerns about AI startups' ability to compete with tech giants, who keep significant advancements in AI proprietary and have substantial computational resources. Hugging Face aims to make advanced AI technologies accessible to everyone, not just tech giants.
Community Impact
Access to compute remains a major barrier to building large language models, favoring well-funded companies like OpenAI and Anthropic. Andrew Reed, a machine learning engineer at Hugging Face, created an app that visualizes the progress of proprietary and open-source LLMs over time, showing the gap between the two steadily narrowing. Over 35,000 variations of Meta's open-source AI model Llama have been shared on Hugging Face since Meta released its first version a year ago.