Hugging Face Exposed API Tokens Compromise Major Microsoft, Google, and Meta Data

A comprehensive investigation by Lasso Security has uncovered that key API tokens from prominent technology companies like Meta, Microsoft, and Google were publicly exposed on Hugging Face, the popular open-source AI and machine learning platform. Researchers identified more than 1,500 such tokens, potentially granting unauthorized access to 723 organizations' accounts.

In the majority of cases, the compromised tokens provided write permissions, which could allow external parties to alter files within account repositories. Significantly, leaked tokens included access to major AI projects like Meta's Llama, EleutherAI's Pythia, and BigScience Workshop's Bloom.

Hugging Face, often likened to GitHub but for machine learning and AI models, has been at the forefront of the AI community. The platform hosts a plethora of generative AI models, including notable ones like Stable Diffusion from Stability AI and Meta's Llama 2.

The company also develops its own generative AI models, including the SafeCoder AI code assistant that Hugging Face revealed this month. The design of SafeCoder allows for on-premises deployment, giving enterprises ownership of their code. In July, Hugging Face announced Agent.js, a JavaScript library that allows developers to create conversational agents that can run in the browser or on the server.

Potential Impact on AI Supply Chains

The scope of this security breach presents a concern in artificial intelligence and machine learning domains. With the tokens offering both read and write permissions, the opportunity for serious cyberattacks, such as data poisoning, looms large. Researchers estimate more than 1 million users could be affected, with the possibility of attackers stealing datasets, corrupting training data, or absconding with proprietary models.

Lasso Security pointed out that the ramifications of such breaches extend to foundational elements of the digital ecosystem, including Google's anti-spam filters and network traffic management. The researchers were able to demonstrate the ease with which they could modify popular datasets, underscoring the urgency to secure AI supply chains.

Industry Response and Risk Mitigation

Following Lasso Security's findings, Meta, Google, Microsoft, and other tech giants were prompt in revoking the exposed tokens and removing vulnerable code from their repositories. Hugging Face, resembling a GitHub for AI, features over 250,000 datasets and 500,000 AI models. In response to these security lapses, Hugging Face offers tools to alert users of exposed API tokens. Moreover, they have taken specific measure to block deprecated organization API tokens (org_api) possessing read and billing access, which researchers found exploitable.

While the exposed API tokens have been addressed, the incident brings to light the delicate nature of security within AI platforms. The technological community has been reminded of the importance of diligent security practices, including the use of secret scanning tools similar to those provided by GitHub, to protect against such vulnerabilities.

In a digital age where artificial intelligence and machine learning are becoming increasingly integral, the discovery serves as a stark warning of the potential for cyber attacks that could have far-reaching consequences for both organizations and end-users. Industry experts continue to advocate for heightened security measures to safeguard against these risks.

Hugging Face Exposed API Tokens Compromise Major Microsoft, Google, and Meta Data

Potential Impact on AI Supply Chains

Industry Response and Risk Mitigation

Recent News

Reddit Launches Dynamic Product Ads in Global Public Beta

Google Announces Direct Microsoft 365 App Access on ChromeOS