Microsoft's AI research division is under scrutiny after inadvertently exposing 38 terabytes of sensitive internal data. TechCrunch reports that the exposure occurred while the team was publishing open-source AI training data on GitHub. Alongside the AI models it intended to share, the exposed storage included backups of Microsoft employees' personal computers, passwords to various Microsoft services, secret keys, and a vast archive of internal Microsoft Teams messages.
The root cause of the exposure was traced to an Azure “SAS token” that was configured to grant “full control” over the entire storage account rather than the intended “read-only” access.
Misconfigured SAS Tokens
Shared Access Signature (SAS) tokens are a feature of Microsoft's Azure cloud platform that lets users create links granting access to data in an Azure Storage account. When misconfigured, however, these tokens can pose significant security risks. In this case, the Microsoft AI developers included an overly permissive SAS token in the URL they shared, which led to the unintended exposure. Cloud security firm Wiz, which discovered the misconfiguration, emphasized how difficult such tokens are to monitor and revoke. Because the Azure portal offers no centralized way to manage them, the tokens are hard to track. They can also be set to last effectively indefinitely, which makes using them for external sharing a potential security hazard.
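To make the risk concrete, here is a minimal sketch of how a SAS link could be checked before it is shared. It relies on the fact that a SAS URL carries its granted permissions and expiry time in the query string, in the `sp` and `se` fields. The function name `audit_sas_url`, the seven-day lifetime threshold, and the example URL are all hypothetical choices for illustration, not anything Microsoft or Wiz has published.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse

# Permission letters treated as read-only in this sketch: r = read, l = list.
# Any other letter in the `sp` field is flagged.
READ_ONLY = set("rl")


def audit_sas_url(url, now=None, max_lifetime=timedelta(days=7)):
    """Return a list of warnings for a SAS URL (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    params = parse_qs(urlparse(url).query)
    findings = []

    # `sp` holds the signed permissions, one letter per permission.
    permissions = params.get("sp", [""])[0]
    extra = sorted(set(permissions) - READ_ONLY)
    if extra:
        findings.append("grants more than read/list: " + "".join(extra))

    # `se` holds the signed expiry time in ISO 8601 form.
    expiry_raw = params.get("se", [""])[0]
    if not expiry_raw:
        # The expiry may instead live in a stored access policy.
        findings.append("no expiry (se) field in the URL")
    else:
        expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > max_lifetime:
            findings.append(
                f"expires more than {max_lifetime.days} days from now"
            )
    return findings


# Hypothetical URL: the account name, dates, and signature are made up.
example = (
    "https://example.blob.core.windows.net/models"
    "?sv=2021-08-06&sr=c&sp=racwdl&se=2050-01-01T00:00:00Z&sig=REDACTED"
)
for warning in audit_sas_url(example):
    print(warning)
```

A check like this only inspects what the URL itself declares. It cannot tell whether a token has already been revoked, which is part of the tracking problem Wiz described.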
Aftermath and Microsoft's Response
After discovering the exposure, Wiz reported the issue to Microsoft in June 2023. Microsoft revoked the SAS token within two days, blocking external access to the Azure storage account. Following an internal investigation, Microsoft confirmed that no customer data was compromised and that no other internal services were put at risk by the incident. As a preventive measure, Microsoft expanded GitHub's secret scanning service to monitor public open-source code changes for exposed credentials and other secrets, SAS tokens in particular.
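The idea behind scanning code for exposed SAS tokens can be sketched in a few lines. The example below is a rough heuristic written for illustration only and is not how GitHub's secret scanning is implemented: it looks for URLs on Azure Storage hostnames and reports those whose query string contains a `sig` (signature) field. The names `STORAGE_URL` and `find_sas_urls` and the sample text are hypothetical.

```python
import re
from urllib.parse import parse_qs, urlparse

# Heuristic: URLs on Azure Storage endpoints (blob, file, queue, table, dfs).
STORAGE_URL = re.compile(
    r"https://[a-z0-9]{3,24}\.(?:blob|file|queue|table|dfs)"
    r"\.core\.windows\.net(?:[/?][^\s\"'<>]*)?",
    re.IGNORECASE,
)


def find_sas_urls(text):
    """Return (line_number, url) pairs for URLs that carry a `sig` field."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in STORAGE_URL.finditer(line):
            url = match.group(0)
            # A SAS token always includes a signature in the `sig` field.
            if "sig" in parse_qs(urlparse(url).query):
                hits.append((lineno, url))
    return hits


# Made-up source file contents for demonstration.
sample = (
    'MODEL_URL = "https://example.blob.core.windows.net/models/a.ckpt?sp=r&sig=abc"\n'
    'DOCS_URL = "https://example.blob.core.windows.net/public/readme.txt"\n'
)
for lineno, url in find_sas_urls(sample):
    print(f"line {lineno}: {url}")
```

A real scanner would also have to cover commit history, encoded or split strings, and custom domains, none of which this sketch attempts.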