Microsoft customers using the company’s Azure public cloud service faced a serious outage yesterday. The service was down in several regions around the world from 15:42 Pacific to 23:47 Pacific. Microsoft says the issue started with Storage in its datacenter regions and spread to other services that depend on it.
The outage was almost a blanket loss of service across all Azure regions: 26 of 28 datacenter regions were affected by storage issues. Microsoft announced the problem in a status page update. The company worked through the evening to fix the issue and keep users informed.
“Starting at 22:42 UTC on 15 Mar 2017, customers using Storage may receive failure notifications when performing service management operations – such as create, update, delete – for resources hosted in this region,” Microsoft said on its Azure status page.
“Other services that leverage Storage may also be experiencing impact,” the company continued. “Retries may be successful. In addition, a subset of customers in East US may be unable to access their Storage accounts. Engineers have identified a possible fix for the underlying cause, and are applying mitigations [sic]. The next update will be provided in 60 minutes, or as events warrant.”
Microsoft specifically said that customers would likely face issues when building new virtual machines (VMs).
The company sent out updates through the evening. Microsoft initially said at 17:30 Pacific (about two hours after the outage began) that the problem was solved. However, the company returned to say the East US region was still experiencing problems. These issues had spread once again by 17:48 Pacific:
“Starting at 15:19 UTC on 15 Mar 2017, a subset of customer using Storage in East US may experience difficulties accessing their Storage accounts in East US,” Microsoft said in an update.
“Due to a dependency on Storage, customers using the following services may experience failures provisioning or connecting to Azure Search, Azure Service Bus, Azure EventHub and Azure Stream Analytics in East US. The next update will be provided as events warrant.”
It took more than five hours for Microsoft to pin down the problem and solve it. The company posted an update at 23:27 Pacific saying that Azure public services were operating normally.
Needless to say, when a cloud service goes down it is a massive problem. Customers rely on the cloud to provide services and storage that should always be accessible, and they pay for this privilege. Because Azure is marketed to enterprises more than to consumers, organizations lose productivity and likely money when the service goes down.
Of course, Microsoft is not alone in facing such outages. At the end of last month, Amazon Web Services (AWS) suffered a major outage in its popular S3 storage service. Hundreds of major websites and companies were down as a result of the issue.
The Azure outage comes at a bad time for Microsoft. Just a week ago, the company defended itself against claims that Google Cloud suffers less downtime than Azure, and indeed AWS. Speaking at the Google Cloud Next conference, senior vice president Diane Greene showed data highlighting the reliability of Google Cloud compared with its rivals.
However, Microsoft said the figures did not show the full picture. This was backed up by the company that compiled the data.