Massive Office 365 outage blocks email access for 2 days
A massive Office 365 outage last week left many users without access to email and other services for several days. First reported on Jan. 24, the event affected users across the globe and appeared to be one of Microsoft’s worst Office 365 outages in the last few years.
Microsoft acknowledged the outage on Twitter and took steps to restore service. However, users continued to experience issues the following day and throughout the weekend, which means some may have gone without access to their mailboxes for up to 4 days.
What happened?
On Thursday, a number of users reported that they were unable to access their Exchange mailboxes in Office 365 and Microsoft 365. (Microsoft 365 is a bundled service package that includes the Office 365 suite as well as other services, like Windows Enterprise.) The Microsoft 365 Status account on Twitter confirmed that the company was investigating an issue in which “users can’t access their mailboxes through multiple protocols.” While some users were unable to access their mailboxes at all, others reported performance issues with the service. In some cases, sending and receiving email was delayed by more than 3 hours. Others reported emails going missing, as well as receiving duplicates of several emails at once.
Who was impacted?
It’s unclear how many users experienced the Office 365 outage, though a Microsoft statement quoted by ZDNet said the event was impacting “a limited subset of enterprise customers in Europe.” Based on reports, business users in Europe did appear to be hit hardest, though the event was widespread. Outage maps from Downdetector.com showed users were impacted across North America as well.
What caused the Office 365 outage?
Microsoft explained in a series of tweets what was likely causing the service interruption. Around 9 am on the day of the outage (roughly 4 hours after acknowledging the event), the company posted that the issue was caused by problems within its Domain Controller infrastructure.
The company wrote, “We’ve determined that a subset of Domain Controller infrastructure is unresponsive, resulting in user connection time outs. We’re applying steps to mitigate the issue.” The following morning, as some users were still unable to access their mailboxes, Microsoft added, “Our telemetry data is indicating connection time outs within the Exchange authentication infrastructure, resulting in impact.” As of 3:44 am Monday, 4 days after the incident began, the company said the issue was largely resolved: “We’ve deployed some fixes and made some configuration changes throughout the affected infrastructure. Things are looking good so far but we’ll monitor throughout the day to ensure the service remains healthy.”
Has this happened before?
Every SaaS-based platform experiences downtime once in a while, and Office 365 is no exception. Sometimes the downtime is part of a planned outage, and other times it’s unexpected. Last November, another Microsoft outage left users unable to access their Office 365 or Azure accounts. That outage, and a similar one a week earlier, were caused by issues with Microsoft’s multi-factor authentication system. In the grand scheme of things, Office 365 still maintains an impressive uptime rate of 99.98%, which means that service outages are few and far between. But as we’ve seen, this doesn’t mean the services are guaranteed to be available all the time. When unexpected outages inevitably occur, they can be a nightmare for users and a costly disruption for businesses.
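To put the 99.98% uptime figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the percentage is measured over a 365-day year, which the figure itself does not specify, and the function names are purely illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def allowed_downtime_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime implied by an uptime percentage over the period."""
    return period_hours * (1 - uptime_percent / 100)


def implied_uptime_percent(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Uptime percentage a user sees after the given hours of downtime."""
    return 100 * (1 - downtime_hours / period_hours)


if __name__ == "__main__":
    budget = allowed_downtime_hours(99.98)
    print(f"99.98% uptime allows about {budget:.2f} hours of downtime per year")

    # A user locked out for 48 hours (the "at least two days" case).
    print(f"A 48-hour outage implies {implied_uptime_percent(48):.2f}% uptime")
```

Under these assumptions, 99.98% uptime works out to roughly 1.75 hours of downtime per year, while a user locked out for 48 hours would see about 99.45% for the year. A strong average, in other words, says little about how hard a single incident can hit an individual business.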
How can SaaS outages hurt business?
Let’s take last week’s O365 outage for example. In that scenario, some users could not access their Exchange Online mailboxes for at least two days. If you didn’t experience the outage at your organization, consider how that kind of disruption might impact your business.
- What if a sales rep were unable to access her email for important information about a major client meeting occurring that afternoon?
- What if marketing teams were unable to meet a critical deadline because they couldn’t email anyone inside or outside the company for more than 48 hours?
- What if you needed your email messages to process large orders that are critical to your operations, but your mailbox wasn’t accessible for days?
Sure, there are workarounds. But even then, each of these scenarios would likely still cause a measurable disruption to your operations. Every lost minute of productivity is money down the drain. And those losses are compounded by other costs, like lost sales, lost customers, missed deadlines and so on.
A whole lotta hurt
If you followed the reaction to the O365 outage on social media, then you got a small glimpse into these real-life disasters unfolding across the globe. Office 365 users made their frustration clear on Twitter, posting digs at Microsoft with hashtags like #office364.
- “This outage has prevented our business from sending and accessing email for almost an entire day, and many other businesses across Europe. Shocked by suggestions it’s a single point of failure with no failover after 6 hours here.” (@stonkcat)
- “Outage has been all day – and as a single business owner this is critical. We come to you because we trust you – or used to! More updates please – even if just to reassure us you’re on it. Lousy customer management and communication.” (@chapplecartoons)
- “Crazy that such a big company with a system like azure with so many proclaimed redundancy is not able to 1st see what is the problem 2nd solve it nor 3rd inform about it. Shame on you! #office364” (@mrburnshalle)
Did Microsoft lose any user data?
No, there have been no reports of data being permanently lost during the outage. And frankly, that would rarely be the case. While service outages can and will occur from time to time, as with any SaaS provider, Microsoft has dependable safeguards in place to prevent data from being permanently erased.
However, there are two points to consider here: 1) The outage itself was still disruptive enough to impact operations at businesses around the world, even though there was no data loss; and 2) What if it had been worse? What if Microsoft actually had lost user data and the loss was permanent? Both points underscore the importance of implementing a backup system that ensures you’re able to recover lost or inaccessible data when these events occur.
How can you defend against the threat of outages?
Businesses can’t do anything to prevent a service outage. But they can control whether such outages will have an impact on their operations. A SaaS backup tool is perhaps the single greatest layer of defense against these threats. It backs up the data within your SaaS applications, such as Office 365, and stores it in another cloud, independent of the SaaS provider.
Having a backup of your SaaS data ensures that you can still access critical files from your cloud applications, even if those applications are down. For example, if your Exchange mailbox is suddenly inaccessible, you can access the critical messages you need through the backup tool. At a time when companies are increasingly running their businesses on SaaS applications, having a backup tool like this is essential.
Which backup tool is best for Office 365?
There are several O365 backup solutions available, though each one offers different features and functionality. We like Backupify from Datto, because it’s a simple, all-in-one backup, restore and export solution that protects virtually everything within Office 365.
What it backs up:
- Exchange Online emails, including attachments
- OneDrive files and folders, including folder structure
- Calendars, including all event data and attachments
- SharePoint files and data
How often is data backed up?
Backupify can be set to back up your O365 data up to 3 times a day. Backups happen automatically, so you don’t even have to think about them, but you can also perform a manual backup or export at any time (without impacting your automatic backup schedule). The restore capabilities are where Backupify really shines. You can recover individual files and folders, or entire point-in-time backups, with just a few clicks. An advanced search function makes it easy to find the specific data you’re looking for, and you can restore it directly to the user’s account or download it to the admin’s computer.
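To see what “up to 3 times a day” means in practice, the short Python sketch below estimates the longest stretch of work that could fall between two automatic backups. It assumes the backups are spaced evenly across the day, which is an illustrative assumption rather than a documented Backupify behavior, and the function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def backup_interval(backups_per_day):
    """Time between backups, assuming they are evenly spaced over 24 hours."""
    if backups_per_day < 1:
        raise ValueError("backups_per_day must be at least 1")
    return timedelta(hours=24) / backups_per_day


def latest_backup_before(incident, first_backup, backups_per_day):
    """Most recent scheduled backup at or before the incident time."""
    if incident < first_backup:
        raise ValueError("incident occurs before the first backup")
    interval = backup_interval(backups_per_day)
    completed = (incident - first_backup) // interval  # whole intervals elapsed
    return first_backup + completed * interval


if __name__ == "__main__":
    # Hypothetical schedule: first backup at midnight, three backups a day.
    first = datetime(2024, 1, 1, 0, 0)
    incident = datetime(2024, 1, 1, 13, 30)

    last = latest_backup_before(incident, first, backups_per_day=3)
    print("Interval between backups:", backup_interval(3))
    print("Last backup before incident:", last)
    print("Changes not yet backed up span:", incident - last)
```

With three evenly spaced backups, the gap between restore points is 8 hours, so that is the most recent work that could be missing from the latest automatic backup. Running a manual backup before a risky change narrows the gap further.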
Doesn’t O365 have its own backup capabilities?
There are some limited restore capabilities within Office 365 that can help you recover lost data in some scenarios. But when it comes to the most common data-loss events, you’re out of luck unless you have an independent recovery solution. In addition to providing access to your critical data during an outage, Backupify enables a quick recovery in several other scenarios:
- Data loss due to inadvertent or malicious deletion
- Data loss due to accidental file overwrites
- Data loss due to ransomware or other malware infections
- Data loss due to inactive or cancelled O365 licenses
Again, in the case of service interruptions, having a backup tool like Backupify won’t magically restore service, but it is the next best thing. It gives you access to the important data you need during the outage, while also defending against the most common SaaS data-loss events.