Invenio IT

The Ultimate Guide to Disaster Recovery for SMBs

Tracy Rock

Director of Marketing @ Invenio IT

Disaster recovery is a component of business continuity planning that is intended to help companies recover quickly from a disruptive event.

Typically focused on information technology (but applicable to all business operations), the planning can encompass a broad range of tools and processes:

  • Data backup and recovery technologies
  • Failover systems
  • Redundant hardware and equipment
  • Secondary business locations
  • Recovery protocols and procedures

Together, these components guide a business through all stages of the disaster management cycle: prevention, preparedness, mitigation and recovery.

Why it’s important

Operational disruptions—and the financial losses they cause—are arguably the single greatest threat to a business.

However, many small businesses are unprepared when these incidents occur. Experts at Insurance Business magazine state that a “major mistake is that many organizations lack an operational business continuity and disaster recovery plan, which leads them to underestimate the potential impacts and the length of potential disruptions.”

Consider some of these alarming disaster recovery statistics:

  • 46% of businesses have no documented disaster recovery plan, according to data from Computing Research. Among organizations that do have a plan, 7% never test any of the protocols or systems documented in the plan.
  • Nearly 40% of small businesses fail to reopen following a disaster, according to FEMA.
  • 90% of smaller companies fail within a year if they can’t resume operations within 5 days after experiencing a disaster.
  • Each hour of downtime can cost businesses anywhere from $10,000 to millions of dollars, depending on the size of the company.

Disaster recovery planning is critical to ensuring that companies are prepared for any threat and that personnel know how to respond when those incidents happen.

The truth about disasters

Disaster recovery is not focused solely on destructive natural disasters, such as tornadoes and hurricanes. While those are indeed serious threats that require proper planning, other types of disasters are far more common.

Examples:

  • Ransomware & malware that disables or destroys data
  • Data loss caused by accidental or malicious deletion
  • Network outages that block internet, communication and server access
  • Hardware failure that destroys or prevents access to data
  • Utility outages that prohibit the business from functioning
  • Fire or flooding that destroys infrastructure and forces relocation

Each of these events—even the loss of a single critical file—can pose enormous challenges for a business. Operational disruptions of any kind can translate into tremendous costs that are difficult for smaller companies to overcome.

Costs of prolonged recoveries

Consider, for example, the impact of a single ransomware infection. With files encrypted, entire computers become unusable and operations are effectively frozen across the organization. This results in lost wages, lost productivity, interrupted revenue streams, costly recovery efforts and a host of other expenses.
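To make these cost categories concrete, here is a rough back-of-the-envelope sketch. The function name and every figure in it are hypothetical placeholders, not data from any real incident; a business would substitute its own estimates.

```python
def estimate_downtime_cost(hours_down, employees_idle, avg_hourly_wage,
                           revenue_per_hour, recovery_cost):
    """Rough total cost of an outage. All inputs are hypothetical estimates."""
    lost_wages = hours_down * employees_idle * avg_hourly_wage
    lost_revenue = hours_down * revenue_per_hour
    return lost_wages + lost_revenue + recovery_cost


# Illustrative scenario: 8 hours down, 25 idle employees at $30/hour,
# $2,000/hour in interrupted revenue, $5,000 in outside recovery services.
total = estimate_downtime_cost(8, 25, 30.0, 2000.0, 5000.0)
print(f"Estimated cost: ${total:,.2f}")
```

Even this simplified model leaves out harder-to-measure losses, such as reputational damage, so real-world costs are likely to be higher.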

In 2017, the NotPetya ransomware attack cost FedEx a staggering $300 million. The same attack caused more than $1.4 billion in losses for pharmaceutical giant Merck, underscoring the importance of comprehensive disaster recovery planning.

More recently, ransomware has forced prominent organizations to permanently close their doors, including:

  • Lincoln College, a 157-year-old liberal arts school in Illinois
  • Wood Ranch Medical, a healthcare provider in Simi Valley, CA
  • The Heritage Company, a 300-employee telemarketing firm in Arkansas

Creating a disaster recovery plan

A disaster recovery plan (DRP) outlines a business’s strategies for dealing with operational disruptions. Much like a business continuity plan, a DRP is a comprehensive document that spells out how the business should respond to various disaster scenarios (and how to avoid them).

A typical disaster recovery plan includes the following sections:

  • Key Contacts: Contact information for recovery personnel or key stakeholders
  • Objectives: Purpose and scope of the plan
  • Review & update schedule: How often the plan should be updated and by whom
  • Activation protocol: Under what circumstances the plan is activated and how
  • Recovery procedures: Detailed processes for recovering from specific incidents
  • Systems: Data backup and other IT systems that support the recovery process
  • Secondary locations & assets: Backup space, equipment and resources for temporary relocation
  • Recommended action steps: Identification of areas that require additional planning
  • Testing: Parameters for testing the various systems and protocols in the plan
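One simple way to keep a plan from going out the door with gaps is to check a draft against the section list above. The sketch below assumes the plan is tracked as a plain dictionary; the key names and the sample draft are illustrative assumptions, not a standard format.

```python
# Section keys mirror the list above; the naming is an assumption for this sketch.
REQUIRED_SECTIONS = [
    "key_contacts", "objectives", "review_schedule", "activation_protocol",
    "recovery_procedures", "systems", "secondary_locations",
    "recommended_action_steps", "testing",
]


def missing_sections(plan: dict) -> list:
    """Return required sections that are absent or empty in the plan."""
    return [name for name in REQUIRED_SECTIONS if not plan.get(name)]


# Hypothetical draft plan with several sections still unwritten.
draft_plan = {
    "key_contacts": ["IT lead", "Office manager"],
    "objectives": "Restore core systems after an outage",
    "recovery_procedures": ["Restore file server from backup"],
    "testing": "",  # present but empty, so it counts as missing
}
gaps = missing_sections(draft_plan)
print("Sections still needed:", ", ".join(gaps))
```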

Creating a disaster recovery plan is the first critical step of the planning process. It requires a business to consider the specific incidents that threaten operations and create detailed recovery protocols for each scenario.

Business continuity and disaster recovery (BC/DR)

The term “business continuity and disaster recovery” is often used to describe a business’s data backup system. Shortened as BC/DR, it is an essential IT deployment that ensures a business can restore data from a backup after a disaster.

  • While many forms of data backup software exist, BC/DR systems typically offer greater protection against a range of data-loss events.
  • Many of today’s disaster recovery solutions deploy a dedicated backup device, combined with intelligent software and cloud storage.
  • High backup frequencies and fast recovery methods are what define the best BC/DR solutions, ensuring that businesses can maintain continuity through any disaster. (Below, we identify some specific features to look for in a disaster recovery system.)

System failover

Beyond data, businesses also need a backup plan for the many other systems that operations depend on. Failover systems create redundancy, enabling businesses to quickly fall back on secondary resources when primary systems become unavailable.

Examples of failover systems include:

  • Backup generators that continue to supply power to the business during electrical outages
  • Network failover systems that enable communication to continue during outages (e.g. via redundant telecommunications lines or wireless failover)
  • Failover servers that are activated during planned maintenance or unplanned outages on primary systems

Failover ensures that critical systems are constantly available, even when primary resources go down.
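The core idea behind failover can be sketched in a few lines: check resources in priority order and use the first one that is healthy. This is a minimal illustration, not how any particular product works. The `Resource` class, the link names and the simulated health checks are all hypothetical; a real system would probe the network or server instead.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Resource:
    name: str
    is_healthy: Callable[[], bool]  # hypothetical health check, e.g. a ping or HTTP probe


def select_active(resources: Sequence[Resource]) -> Optional[Resource]:
    """Return the first healthy resource in priority order, or None if all are down."""
    for resource in resources:
        try:
            if resource.is_healthy():
                return resource
        except Exception:
            # Treat a failing health check the same as an unhealthy resource.
            continue
    return None


# Priority order: primary first, then fallbacks. Health results are simulated.
links = [
    Resource("primary-fiber", lambda: False),  # simulated outage
    Resource("backup-cable", lambda: True),
    Resource("wireless-failover", lambda: True),
]
active = select_active(links)
print(active.name if active else "no link available")
```

Because the list is ordered, the business falls back to a secondary resource only when everything ahead of it is unavailable.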

Understanding risks

Creating effective disaster recovery protocols is impossible without having a deep understanding of the potential risks. A risk assessment is needed to identify the most likely threats and their impact on the business.

A risk assessment is typically included within the disaster recovery plan or business continuity plan. An IT-specific risk assessment is often created separately from the larger, business-wide assessment. This helps to measure the unique impact of disaster on essential technology deployments.

  • Identified risks should be prioritized by their likelihood, as well as their impact.
  • Risk assessments are typically accompanied by an impact analysis, which provides greater insight into the specific consequences of each disruption.
  • Impact is typically measured by the end cost of the disruption as it affects IT and all other business functions (e.g. downtime, revenue disruptions, etc.).
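A common way to combine likelihood and impact is a simple risk matrix, where each risk gets a score equal to likelihood multiplied by impact. The sketch below assumes a 1-to-5 scale for both values; the scale, the sample risks and their ratings are illustrative assumptions rather than recommendations.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain), assumed scale
    impact: int      # 1 (minor) to 5 (severe), assumed scale

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def prioritize(risks):
    """Sort risks from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Hypothetical ratings for a small office.
risks = [
    Risk("Hurricane", likelihood=1, impact=5),
    Risk("Ransomware", likelihood=4, impact=5),
    Risk("Accidental file deletion", likelihood=5, impact=2),
    Risk("Hardware failure", likelihood=3, impact=4),
]
ranked = prioritize(risks)
for r in ranked:
    print(f"{r.score:>2}  {r.name}")
```

With these sample ratings, a rare but severe hurricane ranks below more routine threats, which matches the earlier point that everyday disruptions are far more common than natural disasters.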

Disaster prevention

Steps to prevent a disaster are just as critical as those to recover from one, if not more so. As such, prevention is an important piece of disaster recovery planning: it enables continuity without the need to activate a recovery plan.

Examples of preventative steps:

  • Security solutions to prevent disruptions from malware, cyberattack, data theft, etc.
  • Network/firewall configurations to block dangerous incoming/outgoing traffic
  • Access control/permissions to restrict users from accessing sensitive file directories
  • Load balancing to prevent network slowdown and server crashes
  • Scheduled server maintenance and hardware replacement to prevent unexpected failure

Another critical form of prevention that’s often overlooked is user education. Today’s most destructive events, like ransomware attacks, are often caused by user error. For example, users may inadvertently open a malicious email attachment or fall victim to a phishing scam that steals their login credentials.

Ongoing employee training programs can greatly reduce the risk of these events by educating users on safe web/email practices.

Recovery processes

Every disaster requires its own unique process for recovery. As part of the disaster recovery plan, businesses must carefully outline the steps that personnel should follow to carry out the recovery. This could include steps for restoring a backup, reinstalling a critical application or even moving mission-critical operations to a secondary location.

Tips for effective recovery protocols:

  • Create specific procedures for each scenario identified in the risk assessment and/or impact analysis.
  • Leave nothing to guesswork. Clearly spell out each step with the assumption that it could be carried out by personnel who aren’t deeply familiar with the process.
  • When applicable, incorporate diagrams, flow charts or other visuals to make the process easier to follow.

Recovery procedures should also state who is responsible for carrying them out, including any secondary/substitute personnel for scenarios in which the primary recovery team is unavailable.

Objectives

The speed and timing of the recovery process should be guided by the objectives set in the disaster recovery plan. Within IT, two of the core objectives pertain to how quickly systems should be recovered to prevent the negative consequences of a prolonged outage. Those two objectives are referred to as:

  • Recovery time objective (RTO): The desired maximum amount of time that the recovery process should take. This can be applied toward specific systems or events, such as data loss, network outages, website outages, and so on.
  • Recovery point objective (RPO): The desired maximum age of the most recent backup. This objective sets a limit for the age of backups (as well as goals for backup frequency), helping to minimize the amount of data loss when a backup needs to be restored.
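Both objectives reduce to simple time comparisons, which the following sketch illustrates. The function names, the 15-minute RPO, the one-hour RTO and the timestamps are all hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup: datetime, incident: datetime, rpo: timedelta) -> bool:
    """True if the newest backup is no older than the RPO at the time of the incident."""
    return incident - last_backup <= rpo


def meets_rto(incident: datetime, restored: datetime, rto: timedelta) -> bool:
    """True if systems were restored within the RTO."""
    return restored - incident <= rto


# Hypothetical objectives and incident timeline.
rpo = timedelta(minutes=15)
rto = timedelta(hours=1)
last_backup = datetime(2024, 1, 10, 9, 50)
incident = datetime(2024, 1, 10, 10, 0)
restored = datetime(2024, 1, 10, 11, 30)

rpo_ok = meets_rpo(last_backup, incident, rpo)
rto_ok = meets_rto(incident, restored, rto)
print("RPO met:", rpo_ok)
print("RTO met:", rto_ok)
```

In this scenario the backup was only 10 minutes old, so the RPO is met, but the 90-minute recovery misses the one-hour RTO. A result like that would point to a slow recovery method rather than an infrequent backup schedule.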

Secondary locations and assets

In the event of catastrophic disasters in which physical business locations become inaccessible, organizations must have a plan for restoring critical operations at a secondary location. This means having access not only to the backup location itself but also equipment and resources for that location.

  • If a business does not already have access to a secondary location, it should have a plan for quickly securing one.
  • Backup equipment must be made available to the mission-critical personnel that will use the secondary location. Beyond server and network infrastructure, this can include individual computers, desks, chairs and so on.
  • The disaster recovery plan should prioritize the personnel that should relocate, and the business should further communicate this with all applicable personnel via the emergency communication methods identified in the plan.

Recommended data backup & recovery

As mentioned above, data backup is an important piece of the disaster-recovery puzzle. When data is lost—for whatever reason and no matter how small or large the loss—businesses must be able to restore it quickly in order to prevent an operational disruption.

While there are numerous BC/DR solutions on the market, there are some key capabilities that today’s businesses should look for when comparing options:

  • Dedicated backup devices to process and store the backups
  • Hybrid cloud backups (stored locally and in the cloud)
  • Ability to perform frequent backups (every few minutes or more often)
  • Ability to boot backups as virtual machines for instant access to protected files, apps and operating systems
  • Numerous recovery options: file-level, rapid rollback, bare metal restore, direct restore, etc.
  • Automatic backup integrity checks

In the age of ransomware, cybersecurity experts advise businesses to deploy robust disaster recovery solutions that can quickly recover the entire infrastructure, in addition to individual files and folders.

Solutions like the Datto SIRIS provide a complete infrastructure backup (physical, virtual, cloud) as often as every five minutes, while also enabling near-instant backup virtualization, locally or in the cloud. 

Disaster recovery testing

Ideally, every system and procedure listed in a disaster recovery plan should be tested on a routine basis. Otherwise, how does a business know if its planning will actually work in a real incident? Disaster recovery testing is essential for ensuring the protocols are effective. It also helps to identify any weaknesses that still need to be resolved.

Examples of disaster recovery testing:

  • Testing and validating data backups to ensure that data can be restored without error.
  • Running network penetration tests to identify weak spots, and confirming that network service can be quickly restored after a disruption.
  • Cybersecurity tests and threat assessments that identify vulnerabilities.
  • Mock drills that test the recovery procedures for various disaster scenarios.
  • Pre-planned evacuations and other safety drills that test employee procedures for emergencies.
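As one illustration of validating a backup, a restored file can be compared against the original using a checksum. The sketch below uses SHA-256 from Python's standard library; the helper names and the sample files are hypothetical, and commercial backup products typically run their own integrity checks automatically.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True if the restored file is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)


# Simulated test restore using temporary files (sample contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "ledger.csv"
    good_copy = Path(tmp) / "ledger_restored.csv"
    bad_copy = Path(tmp) / "ledger_corrupt.csv"
    original.write_bytes(b"id,amount\n1,100\n")
    good_copy.write_bytes(b"id,amount\n1,100\n")
    bad_copy.write_bytes(b"id,amount\n1,1\n")
    good_ok = restore_matches(original, good_copy)
    bad_ok = restore_matches(original, bad_copy)

print("Clean restore verified:", good_ok)
print("Corrupt restore verified:", bad_ok)
```

A matching checksum only shows the file was restored intact. It does not prove that an application or operating system will boot from the backup, which is why full recovery drills are still needed.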

Third-party vendor management 

Today’s businesses are increasingly interconnected with other businesses, including vendors, suppliers, distributors and an array of technology services. These partnerships often require providing authorized access to company systems or other integrations that allow a seamless connection between the two companies’ systems.

However, these partnerships must be managed carefully to minimize cybersecurity risks. Consider, for example, that the infamous 2013 Target data breach was caused by a third-party vendor’s network credentials being stolen. This incident underscored the importance of vendor management as part of a company’s disaster recovery planning.

Considerations for third-party vendors:

  • Which vendors require access to company systems, and how does that access expose the company to risk?
  • How can vulnerabilities be eliminated when integrating third-party systems?
  • In a disaster caused by a breach to third-party systems, who is responsible for recovery? What is the communication plan?
  • What is the role of a third-party vendor, such as an IT company or managed-service provider, for disaster recovery?

Frequently asked questions (FAQ) about disaster recovery

1. What is disaster recovery?

Disaster recovery is a form of planning that ensures an organization can recover from operational disruption. Often focused on IT infrastructure, disaster recovery establishes procedures and systems that help to maintain business continuity after a disaster, such as data loss, server failure or electrical outage.

2. What is the purpose of disaster recovery?

The goal of disaster recovery is to mitigate the impact of a disaster on a business’s operations. Disaster recovery equips an organization with the tools, protocols and technologies it needs to anticipate disruptions, rapidly restore affected operations and minimize downtime.

3. What are the 4 phases of disaster recovery?

Disaster recovery is often defined by four phases of planning and response: 1) prevention, 2) preparedness, 3) mitigation and 4) recovery. Together, these phases cover all the strategies a business uses to minimize disruptions. The fundamental goal of each phase is:

  • Prevention: Prevent disasters from occurring in the first place.
  • Preparedness: Ensure the business is adequately prepared if various disasters do occur.
  • Mitigation: Reduce the impact of a disaster with swift response immediately following the incident.
  • Recovery: Fully recover affected systems and operations; resume business as usual.

4. What are examples of disaster recovery?

The most common example of disaster recovery is having data backups that can be restored when files have been lost, deleted or destroyed. Backups are a critical component of disaster recovery, ensuring that a business can quickly restore lost data, applications or operating systems, especially after a server failure or ransomware attack.

5. How does disaster recovery work?

For disaster recovery to work, businesses must develop clear documentation outlining their strategies for preventing and recovering from a disaster. This is known as a disaster recovery plan (DRP). A DRP establishes protocols and systems that work to minimize the risk and impact of various operational disruptions.

Conclusion

No business is immune to disaster. And when it occurs, it can cause a catastrophic disruption that puts the entire company at risk.

Disaster recovery planning ensures that organizations are thoroughly prepared for every possible scenario. The right planning can significantly reduce the risk of certain incidents, in addition to providing clear steps for recovering systems after a disruption has occurred. Without effective disaster recovery, a business increases its risk of operational downtime, financial losses and, at worst, permanent closure.

Request a Free Demo

For more information on protecting your business with the data backup and disaster recovery solutions from Datto, request a free demo or contact our experts at Invenio IT. Call us at (646) 395-1170 or email success@invenioIT.com.
