What Businesses Can Learn from the Latest Salesforce Outage

Tracy Rock

Director of Marketing @ Invenio IT

Published

If you need proof that downtime can strike any company at any time, look no further than Salesforce. This cloud-based customer relationship management (CRM) software giant has stringent security and carefully tested procedures, but that doesn’t make it immune to outages. As recently as September 2023, the Salesforce team found themselves struggling to recover from a serious outage that affected an enormous swath of customers across virtually all their services.

It’s vital for companies that rely on Salesforce — or a similar CRM provider — to understand why this outage happened and the types of damage it can cause. In this post, we’ll walk you through everything we know about the incident and others Salesforce has experienced in the past. More importantly, we’ll share some insights into how you can protect yourself from these kinds of events going forward.

A Deep-Dive Into the Salesforce Outage

The most recent Salesforce outage wasn’t the result of a ransomware attack, hacking, or some other nefarious act. Like so many other instances of downtime and data loss, it happened because of a single human error. Digging into the details of this particular outage underscores how frighteningly easy it is to lose access to a software-as-a-service (SaaS) company’s services.

Why It Happened

On September 20th, 2023, Salesforce customers suddenly discovered that they couldn’t log in to the site. They raised the alarm about the issue, unaware that Salesforce was already scrambling behind the scenes to correct it.

Although they initially assumed that a third-party provider caused the problem, the Salesforce team quickly realized that they had only themselves to blame. A few minutes prior, they had made a planned update to internal permissions that wreaked havoc on their systems. While they intended to strengthen their security controls, they instead blocked customers from accessing their services.

After discovering the root of the issue, they immediately initiated a rollback to get back online. The company later explained that the error occurred, in part, because the nature of the change prevented them from using their standard change deployment pipeline, which highlights a potentially significant flaw in their processes.

How Long It Lasted

Even an outage that lasts only a few hours, as was the case in September, can feel like days for providers and their customers. This is a breakdown of how it played out:

  • At 9:42 AM EST, Salesforce deployed a change that restricted permissions. The disruption began a few minutes later.
  • The Salesforce team became aware of the problem at around 10:00 AM.
  • Just after 10:30 AM, the team initiated a rollback to resolve the issue. It was complete within ten minutes.
  • At approximately 11:30 AM, the team confirmed that all services had recovered except MuleSoft and Tableau.
  • Those services remained down for another couple of hours, with Salesforce confirming that both had been restored at around 2:00 PM.
  • At 2:15 PM, Salesforce declared that the incident was fully resolved.
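The timeline above can be turned into a few rough duration figures. The sketch below is only an illustration: several of the timestamps are approximate ("around", "just after"), the event names are made up for this example, and the clock starts at the moment of the change rather than the moment the disruption began a few minutes later.

```python
from datetime import datetime

# Timestamps taken from the timeline above. Several are approximate,
# so the results are estimates, not official figures.
DAY = "2023-09-20"
TIMELINE = {
    "change_deployed": "09:42",
    "team_aware": "10:00",
    "rollback_started": "10:30",
    "most_services_recovered": "11:30",
    "mulesoft_tableau_restored": "14:00",
    "incident_resolved": "14:15",
}


def minutes_between(start_key: str, end_key: str) -> int:
    """Return whole minutes elapsed between two timeline events."""
    fmt = "%Y-%m-%d %H:%M"
    start = datetime.strptime(f"{DAY} {TIMELINE[start_key]}", fmt)
    end = datetime.strptime(f"{DAY} {TIMELINE[end_key]}", fmt)
    return int((end - start).total_seconds() // 60)


time_to_detect = minutes_between("change_deployed", "team_aware")
time_to_rollback = minutes_between("change_deployed", "rollback_started")
total_duration = minutes_between("change_deployed", "incident_resolved")

print(f"Change to awareness: {time_to_detect} min")
print(f"Change to rollback start: {time_to_rollback} min")
print(f"Change to full resolution: {total_duration // 60} h {total_duration % 60} min")
```

By this estimate, roughly 18 minutes passed before the team noticed the problem, and about four and a half hours passed before the incident was declared resolved.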

While Salesforce restored their services relatively quickly, some customers expressed concerns about the timing of the change. They wondered why the company chose to implement an update during business hours rather than overnight when an outage would have been far less disruptive.

In addition, customers questioned why Salesforce didn’t provide an initial customer update about the problem until nearly an hour after they discovered it. Company representatives said that they wanted to know exactly what was happening and why before sharing that information with customers. While understandable, the delay left customers in the dark about how long the services they needed to do their jobs would be unavailable.

How Did the Outage Affect Users?

An outage is never a good thing, but it’s especially concerning when it affects services that businesses use for essential functions. When Salesforce goes down, it can create huge obstacles for customers, and that was definitely the case in this instance.

A Disruption Across Multiple Clouds

This incident caused pain for a wide range of Salesforce clients. In their summary of the incident, the company said that customers on these clouds experienced the outage:

  • Commerce Cloud
  • MuleSoft
  • Tableau
  • Salesforce Services
  • Marketing Cloud Account Engagement
  • Marketing Cloud Intelligence
  • Omni Channel
  • ClickSoftware
  • Trailblazer
  • Data Cloud

Salesforce has more than 150,000 customers worldwide, so it’s safe to assume that such a widespread outage had a substantial impact on business operations, even if it lasted only a short time.

What Customers Experienced

During the outage, customers were unable to log into Salesforce or access the affected services because the internal permissions change inadvertently blocked access to legitimate and necessary resources. With Salesforce down, businesses can’t carry out their normal sales, customer service, marketing, and e-commerce activities, so the inability to access their data presents a huge challenge. It creates a ripple effect because it makes it difficult for businesses to maintain their own operations and prevents them from serving their customers.

Are Salesforce Outages Common?

While Salesforce has a solid record of maintaining service uptime and cloud availability, this certainly isn’t the first time an outage has occurred, and it won’t be the last. Let’s take a stroll back in time to see some of their most significant outages.

May 2021

Just over two years before the 2023 outage, Salesforce had a similar event. A configuration change applied to their servers left clients unable to log in to the system for approximately five hours. After the issue was resolved, leadership at Salesforce explained that an engineer had implemented a change too quickly rather than using a staggered deployment. This caused the outage to affect a much larger number of customers than it would have otherwise.
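To make the idea of a staggered deployment concrete, here is a minimal sketch in Python. It is not Salesforce's tooling: the function name, the server names, and the three callbacks (`apply_change`, `is_healthy`, `rollback`) are all hypothetical stand-ins. The point is simply that a change goes out in small batches, and a failed health check halts the rollout before it reaches everyone.

```python
from typing import Callable, Sequence


def staggered_rollout(
    batches: Sequence[Sequence[str]],
    apply_change: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Apply a change batch by batch, halting at the first unhealthy batch.

    Returns True if every batch succeeded. Returns False if a health check
    failed, in which case every server changed so far has been rolled back.
    """
    changed = []
    for batch in batches:
        for server in batch:
            apply_change(server)
            changed.append(server)
        if not all(is_healthy(server) for server in batch):
            # Undo in reverse order so the most recent change is reverted first.
            for server in reversed(changed):
                rollback(server)
            return False
    return True


# Hypothetical demo: "srv-3" fails its health check after the change.
state = {}
ok = staggered_rollout(
    batches=[["srv-1"], ["srv-2", "srv-3"], ["srv-4", "srv-5", "srv-6"]],
    apply_change=lambda s: state.__setitem__(s, "new"),
    is_healthy=lambda s: s != "srv-3",
    rollback=lambda s: state.__setitem__(s, "old"),
)
print(ok, state)
```

In this demo the third and largest batch is never touched, which is the benefit a staggered approach is meant to provide: a bad change affects a small group instead of the whole customer base.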

May 2019

A massive Salesforce outage in May 2019 left thousands of customers unable to access the service for several days. The incident was one of the worst Salesforce downtime events in the company’s history, with as many as 3,200 users temporarily losing access to their SaaS data around the world.

Officials from Salesforce confirmed that the outage was intentional. ZDNet reported that the issue arose from a change that the company made to its production environment within Pardot, Salesforce’s digital marketing tool. A faulty database script deployment inadvertently broke access permission settings, giving some users access to all of their company’s Salesforce data.

Worse yet, those users gained write access to that data, creating a serious security problem.

As a result, Salesforce shut down large segments of its infrastructure, intentionally removing access for current and former Pardot customers. This meant that the service disruption also affected a chunk of Salesforce users who were not actively using Pardot and who weren’t affected by the permission issue.

May 2016

May seems to be a problematic month for Salesforce. In addition to the 2019 and 2021 incidents, the company also experienced a significant outage in 2016. This particular incident left customers without access to their CRM data for around 20 hours.

A bug in the firmware of its storage arrays initially caused the disruption. During the resolution, the company had to move its data to another data center, which led to a massive database failure. Salesforce restored a backup, but many companies permanently lost some of their data in the process.

Forced and Accidental Outages

One of the most noteworthy differences between the 2019 and 2023 outages is that the first was forced while the second was not. Knowing the difference between the two can help clarify why these outages unfolded in such different ways.

Why Salesforce Chose to Shut Down Their Services in 2019

A forced outage occurs when a company chooses to take its services down temporarily. It’s far from an ideal option, and it usually only occurs if the organization feels that they have no other choice.

On the surface, that might not seem to make a whole lot of sense. Why would a company want to force an outage of their services? To answer that question, let’s take another look at the 2019 incident.

In 2019, Salesforce eventually realized that the faulty database script had effectively removed all permission settings for some companies. Their decision basically boiled down to this: allow those users to have access to everything, or remove access for everyone.

Taking the service down temporarily was the only viable decision.

If they had allowed the service to remain up, then users at every affected company would be able to access data that they weren’t supposed to. This could have created a dangerous situation and a liability nightmare for each of their customers.

For example, consider what would happen if an employee who has just been terminated could delete large amounts of critical data before they exited the company. Similarly, imagine a scenario in which a user maliciously copies data to provide it to competitors. Even an accidental deletion of important company data would be a major problem for any organization. In this case, Salesforce had no choice but to take the service down until it could restore those permissions.

How Accidental Outages Occur

The 2023 Salesforce outage was a very different story. It didn’t occur because the company wanted to sever access to their services. Rather, it was the result of a completely unforeseen effect of a seemingly safe change to their access permissions with Amazon Web Services (AWS).

Salesforce has acknowledged that they could have handled things differently to mitigate the risk of this particular outage. For example, they’ve promised customers that they will work on new testing programs for changes like the one that caused the outage.

However, no matter what Salesforce does, they can’t promise that they’ll never experience an outage again. There are simply too many potential causes, including:

  • Hardware failures
  • Power outages
  • Software issues
  • Cyber attacks
  • Flawed third-party integrations
  • Human errors

Keep in mind that these problems aren’t exclusive to Salesforce. Unplanned outages happen all the time to businesses across the globe, which is why it’s so important to have a backup solution in place.

Other Ways to Lose Data in Salesforce

Extended Salesforce outages may not happen all the time, but there are several other ways that users can experience data loss — and these events occur a lot more frequently than you might think. These are just a few examples:

  • Users might inadvertently or maliciously delete data
  • Data migrations may fail
  • The system might overwrite data during third-party app integrations

To be clear, these are user-caused data-loss events, and Salesforce isn’t responsible for them. What’s most worrying is that, statistically, they are far more common than service disruptions, meaning that the protection of your data falls primarily on your shoulders.
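One way to spot these quieter data-loss events is to compare a backup snapshot against the live data. The sketch below assumes both can be exported as simple mappings of record ID to record fields. The function name, record IDs, and field names are invented for illustration and do not reflect any particular backup product or the Salesforce API.

```python
def diff_snapshots(backup: dict, current: dict) -> dict:
    """Compare two {record_id: record} mappings.

    Returns the IDs that exist in the backup but are missing from the
    current data ("deleted") and the IDs whose contents differ ("changed").
    """
    deleted = sorted(rid for rid in backup if rid not in current)
    changed = sorted(
        rid for rid in backup if rid in current and backup[rid] != current[rid]
    )
    return {"deleted": deleted, "changed": changed}


# Hypothetical records: one was deleted and one was overwritten by an integration.
backup = {
    "001A": {"Name": "Acme Corp", "Phone": "555-0100"},
    "001B": {"Name": "Globex", "Phone": "555-0101"},
    "001C": {"Name": "Initech", "Phone": "555-0102"},
}
current = {
    "001A": {"Name": "Acme Corp", "Phone": "555-0100"},
    "001C": {"Name": "Initech", "Phone": ""},
}

report = diff_snapshots(backup, current)
print(report)
```

A report like this only tells you what differs from the last snapshot. Deciding whether a difference is a legitimate edit or a loss that needs restoring is still a human judgment.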

Protecting Your Salesforce Data

SaaS platforms like Salesforce allow companies to run powerful applications in the cloud, rather than installing on-premise software or storing their data on on-site servers. But just because the data is stored in the cloud doesn’t mean it’s protected against data-loss events or service disruptions. The only way to ensure that you’ll have access to your data is by using an independent backup solution.

Why Independent Backups Matter

In the immediate aftermath of the Salesforce outage, customers with backups were able to restore their data while the company worked on executing a full rollback. They saved precious time, and they didn’t have to speculate whether the outage would last for hours, days, or weeks.

Unfortunately, most companies didn’t have this luxury because they didn’t keep independent Salesforce backups. Instead, they had to cross their fingers that Salesforce would act quickly so they could get back to work.

While Salesforce does offer some backup export options, they’re expensive and limited. For greater protection, organizations need a cloud-to-cloud SaaS backup solution that replicates all Salesforce data and stores it independently in other data centers. This ensures that companies can maintain business continuity when disruptions occur.

Choosing the Right Salesforce Backup Solution

Finding the perfect Salesforce backup provider can be challenging, especially if you’ve never used one before. When you search, prioritize reliability, simplicity, ease of use, and speed.

A strong Salesforce backup solution should be easy to deploy, with minimal or no hardware installations. It should also offer automatic scheduling options, so you don’t have to waste time running backups manually.
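As a rough picture of what scheduled backups with retention involve, the following sketch writes a timestamped snapshot and prunes older ones. Everything here is an assumption made for illustration: the `write_snapshot` function, the file naming, and the local directory standing in for independent storage. A commercial backup service handles the export, the scheduling, and the off-platform storage for you.

```python
import json
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path


def write_snapshot(records: list, backup_dir: Path, keep: int = 7, now=None) -> Path:
    """Write records to a timestamped JSON file and keep only the newest `keep` files."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    now = now or datetime.now(timezone.utc)
    backup_dir.mkdir(parents=True, exist_ok=True)
    path = backup_dir / f"snapshot-{now:%Y%m%dT%H%M%SZ}.json"
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    # Timestamped names sort chronologically, so the oldest files come first.
    snapshots = sorted(backup_dir.glob("snapshot-*.json"))
    for old in snapshots[:-keep]:
        old.unlink()
    return path


# Demo: simulate three daily runs with a two-snapshot retention window.
with tempfile.TemporaryDirectory() as tmp:
    start = datetime(2023, 9, 20, tzinfo=timezone.utc)
    for day in range(3):
        write_snapshot(
            [{"Id": "001A", "Name": "Acme Corp"}],
            Path(tmp),
            keep=2,
            now=start + timedelta(days=day),
        )
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(remaining)
```

In practice a scheduler such as cron or a managed service would trigger each run, and the snapshots would be stored somewhere that does not depend on Salesforce being available.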

With an effective system, you can create regular backups of all Salesforce data and protect everything on your network, such as:

  • Servers
  • Virtual machines (VMs)
  • Desktops
  • Laptops
  • Databases
  • Apps

Seamless data restoration for any and all of these sources is also vital. A superior backup solution can restore all your data or individual objects with just a few clicks.

Above all, your backups should be completely separate from the Salesforce platform and stored in an independent, secure cloud. This allows you to access your data within the backup platform’s interface, even if Salesforce is down.

How to Learn More About Salesforce Backups

Salesforce is a powerhouse CRM provider for businesses, but it’s not invincible. As past outages have shown, the system could go down at any time, leaving you without access to your most important data.

No business can predict when the next SaaS service outage will occur. By backing up your SaaS data, you can ensure that your organization can continue operating through the disruption. To learn about backing up your data from Salesforce and other SaaS platforms, contact the team at Invenio IT. Schedule a meeting with a data protection specialist, who can help you find the best solution for your company.
