Protect IT Infrastructure with Server Disaster Recovery Plan Template

Tracy Rock

Director of Marketing @ Invenio IT

Cyberattacks. Natural disasters. Human error. Any of these could take down your IT systems and grind your operations to a halt, without warning. If you’re not prepared, the business may never recover. In our Server Disaster Recovery Plan Template below, we’ve outlined the critical steps for approaching a real-world disaster scenario and getting your systems back up and running—before it’s too late.

You’ll be able to use the template as a basic framework for identifying:

  • What disaster recovery procedures to include as part of your business continuity planning
  • Who is on your recovery teams and what roles they’ll play
  • How to perform a risk assessment to determine which disasters pose a threat to your servers and operations
  • What weaknesses currently exist that require urgent action
  • How often to update your own recovery plan and who will do it

No recovery plan? Game over.

When your business-critical servers go down, it can devastate the business. Every minute of downtime translates into losses in productivity and profits. Idle workers alone can be extremely costly.

Consider a threat like ransomware, which encrypts the data on your servers and demands ransom money in exchange for the key to restore your files. By targeting vulnerable businesses, government agencies and healthcare facilities, ransomware is already extorting over $1 billion a year from its victims.

But that’s not the whole picture.

Experts say the downtime after such an attack is where costs really start to rack up. They estimate that each hour of inactivity costs small businesses an average of $8,581 in lost productivity and expenses. These costs can be crippling for many companies, which is why some businesses ultimately decide to pay the ransomware attackers, rather than risk a painfully slow recovery.
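
To make that arithmetic concrete, here is a minimal Python sketch that multiplies the hourly figure cited above by the length of an outage. The function name and the 12-hour outage are illustrative assumptions, and your own hourly cost may differ considerably from the average.

```python
# Average hourly downtime cost for small businesses, as cited above.
HOURLY_DOWNTIME_COST_USD = 8_581


def estimate_downtime_cost(hours_down, hourly_cost=HOURLY_DOWNTIME_COST_USD):
    """Return a rough downtime cost: hours of outage times cost per hour."""
    if hours_down < 0:
        raise ValueError("hours_down must be non-negative")
    return hours_down * hourly_cost


if __name__ == "__main__":
    # Hypothetical 12-hour outage, chosen only for illustration.
    print(f"${estimate_downtime_cost(12):,.0f}")  # prints $102,972
```

Swapping in your own hourly figure gives a quick, rough number to weigh against the cost of a backup solution.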

But other types of disasters, like a fire or flooding in your server room, don’t come with such an easy out. If your IT infrastructure is destroyed, and you don’t have a backup plan, it’s game over.

40 percent of businesses never reopen after a disaster, according to the Federal Emergency Management Agency (FEMA). Without a sound recovery plan in place, your company could become another statistic.

Let’s look at the fundamental sections you need to include in your continuity planning documents.

What Goes in a Server Disaster Recovery Plan Template?

A server disaster recovery plan template provides a framework for planning around the risks of server outages. It outlines the steps and systems for preventing server disruptions, as well as protocols for responding to outages and recovering lost data.

The following server disaster recovery plan template provides a basic illustration of what to include in the plan and how to structure it. 

1) Plan Objectives

Regardless of whether your document is specific to server continuity, or part of your larger business continuity plan, you should open with a statement of intent. This provides key stakeholders and other recovery team personnel with a clear purpose for the plan, why it’s important and what objectives it should achieve.

If you’re creating your own server disaster recovery plan template from scratch, the “Plan Objectives” section also serves as a useful guide for what the plan needs to accomplish.

Example objectives could include:

  • Develop a company-wide plan to adequately prepare for an unforeseen disaster
  • Help ensure the company can recover rapidly after a disaster that has impacted information systems, thus minimizing impact on business-critical operations
  • Provide instructions, procedures and emergency contact information for recovery personnel to use in a disaster situation
  • Identify processes and technologies for restoring server data and networking configurations after a critical event
  • Identify current risks and recommend action steps for preventing and resolving a network failure

The purpose of this section is to make it clear what the scope of the plan is (and what its limitations are), so that everyone is on the same page. For example, if the plan is narrowly focused on your servers and IT infrastructure, this should be clearly stated in the objectives.

2) Points of Communication

This section identifies key personnel, such as stakeholders, executives and department managers, along with emergency contact information for each, which should include:

  • Name
  • Title
  • Phone(s): (Work, Home and Alternate)
  • Email(s): (Work and Home)

You will also want to include an additional section that lists the personnel on your recovery team. These are the individuals who will be tasked with updating the plan, activating it during a disaster and ultimately overseeing the recovery.

In addition to their contact information, consider adding their specific roles and responsibilities on the team. Examples of role definitions within a server recovery plan could include:

  • Inspects physical on-site servers and infrastructure for abnormalities or damage
  • Oversees server backup management, configuration and schedules
  • Initiates data recovery processes; determines the appropriate recovery point from backups, checks backups for integrity
  • Conducts backup validation and recovery tests
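
As one possible illustration of the “checks backups for integrity” responsibility, the sketch below compares a backup file’s SHA-256 checksum with a checksum recorded when the backup was taken. This assumes your backups are plain files and that you store such a checksum; many backup platforms run their own verification, so treat the function names here as hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(path, expected_sha256):
    """Return True if the file's checksum matches the recorded checksum."""
    return sha256_of(path) == expected_sha256.lower()
```

A matching checksum only shows the file has not changed since the checksum was recorded. It does not replace a full recovery test.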

Some businesses may also want to include a calling tree or communications plan in this section. A calling tree is a flowchart that identifies who should contact whom in an emergency incident. This tree is essential for maintaining effective communications after a major event. It ensures that all personnel are notified of the incident and know what to do.
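
A calling tree can also be kept in a simple machine-readable form. The sketch below is a hypothetical example in Python: the role names are placeholders, and the function walks the tree level by level to list who calls whom.

```python
from collections import deque

# Hypothetical calling tree: each key calls the people listed in its value.
CALLING_TREE = {
    "Plan Manager": ["IT Manager", "Operations Manager"],
    "IT Manager": ["Backup Administrator", "Network Administrator"],
    "Operations Manager": ["Department Managers"],
}


def call_order(tree, root):
    """Return (caller, callee) pairs level by level, starting from root."""
    calls = []
    seen = {root}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for callee in tree.get(caller, []):
            if callee in seen:
                continue  # skip repeats so nobody is called twice
            seen.add(callee)
            calls.append((caller, callee))
            queue.append(callee)
    return calls


if __name__ == "__main__":
    for caller, callee in call_order(CALLING_TREE, "Plan Manager"):
        print(f"{caller} calls {callee}")
```

Keeping the tree as data makes it easier to spot anyone who is never reached and to update it when personnel change.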

In a more comprehensive DR plan, businesses may choose to add the communications plan as its own designated section.

3) Plan Management

Here, you will specify who is responsible for periodically reviewing and updating your disaster recovery plan, and how often. This will likely include a primary plan manager, in addition to others on your recovery team who will need to make sure that various elements of the plan are up to date and fully tested. Also, identify where and how the plan is stored, so that there’s never any confusion on where the documents can be found.

While the “Plan Management” section may not seem as crucial as other components listed below, this section is essential for ensuring that the disaster recovery plan is accurate and up to date.

Here are just a few common ways that server DR plans quickly become outdated:

  • Recovery team members identified in the plan are no longer at the company
  • Information about the server or the O/S no longer applies after system updates
  • Instructions about server backups and recovery procedures are no longer applicable after new backup systems are deployed

Most disaster recovery plans should be reviewed and updated at least once per year. For greater assurance, consider setting a quarterly review schedule.
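
If you track the last review date, checking whether a review is overdue is a simple date comparison. The following Python sketch assumes a 365-day interval for annual reviews and roughly 91 days for quarterly ones. The function name and dates are illustrative only.

```python
from datetime import date, timedelta


def review_is_overdue(last_review, today, interval_days=365):
    """Return True if more than interval_days have passed since last_review."""
    return today - last_review > timedelta(days=interval_days)


if __name__ == "__main__":
    # Hypothetical dates: plan last reviewed on 1 January, checked on 1 June.
    last = date(2024, 1, 1)
    now = date(2024, 6, 1)
    print("Annual review overdue:", review_is_overdue(last, now))
    print("Quarterly review overdue:", review_is_overdue(last, now, interval_days=91))
```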

4) Backup Strategy

For most businesses, data backup is the foundation of disaster recovery planning. So you’ll likely need to devote an entire section of your DR plan to backup strategy, implemented systems and objectives.

Start by including a high-level overview of the backup strategies for various company operations. Here’s a small sample of what this list could look like, specifically for a server disaster recovery plan:

| Operations | Strategy |
|---|---|
| IT Infrastructure | Mirrored recovery site (identify location) |
| Email data | Daily backup, local and cloud |
| Customer data | Hourly backup, local and cloud |
| Tech support | Fully mirrored recovery site |
| Microsoft 365 data | Cloud-to-cloud SaaS backup by Datto |

Following this high-level overview, you’ll want to include more information on each of the backup strategies identified. For example, if there is a backup location for the business, specify where it is and who has access. For data backups, specify how those backups occur and with what systems. If you’re using third-party services or managed service providers, include that information along with points of contact.

Your backup objectives should also be clearly defined. These objectives are typically documented in the form of goals for how quickly data should be recovered from backups (referred to as your recovery time objective or RTO) and how recent those backups should be (recovery point objective or RPO).

Why do these objectives matter?

  • An RTO provides a guideline for the speed of recovery. It helps recovery teams understand what’s at stake and how quickly they need to restore servers to avert a major operational disruption.
  • The RTO thus also helps to inform BCDR investments. If, for example, a currently deployed backup solution cannot meet the recovery objective, then you’ll know it’s time for an upgrade.
  • Similarly, the RPO ensures that you’re configuring your backups with the frequency you need to avoid significant data loss. If a backup must be restored, a recovery point that is only 2 hours old, for example, means that at most about 2 hours’ worth of data will be lost.
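
One way to check objectives against reality is to compare each target with what your current setup actually delivers. The Python sketch below is a minimal illustration under stated assumptions: the field names are hypothetical, and the numbers in the example are placeholders, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class BackupObjective:
    name: str
    rpo_hours: float              # target: maximum tolerable data loss
    rto_hours: float              # target: maximum tolerable time to restore
    backup_interval_hours: float  # actual: how often backups run
    tested_restore_hours: float   # actual: restore time from the last recovery test


def find_gaps(obj):
    """Return the ways the current setup misses its RPO or RTO."""
    gaps = []
    # Worst case, the failure hits just before the next backup runs,
    # so up to one full backup interval of data is lost.
    if obj.backup_interval_hours > obj.rpo_hours:
        gaps.append(
            f"{obj.name}: backups every {obj.backup_interval_hours}h "
            f"cannot meet an RPO of {obj.rpo_hours}h"
        )
    if obj.tested_restore_hours > obj.rto_hours:
        gaps.append(
            f"{obj.name}: tested restore of {obj.tested_restore_hours}h "
            f"cannot meet an RTO of {obj.rto_hours}h"
        )
    return gaps


if __name__ == "__main__":
    # Placeholder values for illustration only.
    email = BackupObjective("Email data", rpo_hours=2, rto_hours=4,
                            backup_interval_hours=24, tested_restore_hours=3)
    for gap in find_gaps(email):
        print(gap)
```

In this example, a daily backup cannot satisfy a 2-hour RPO, which is the kind of gap that signals the backup schedule or solution needs an upgrade.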

5) Risk Assessment

Now, it’s time to identify all the “what ifs” that pose risks for the company. In this section, you’ll include various types of disaster scenarios and their impact on the business.

Chances are your business is more at risk of certain incidents than others, so you’ll want to assess the probability of each. For example, if the company is located right on the coast, it is likely more at risk of flooding than of a tornado. If you store sensitive and valuable data, your risk of a cyberattack may be greater.

Consider using a numerical rating system to identify both the probability and impact on the business, such as:

  • Probability: 5=Very likely, 1=Not likely
  • Business impact: 5=Major disruption, 1=Minor disruption

Let’s consider a single event for illustrative purposes:

| Event | Probability | Impact | Consequences and/or Response |
|---|---|---|---|
| Ransomware attack | 3 | 3 | Up to 12 hours loss of business-critical data; restore most recent backups from cloud |
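
Once each event has both ratings, you may want a single number for prioritizing them. The Python sketch below multiplies probability by impact, which is a common risk-matrix convention but an assumption here rather than part of the template. Only the ransomware ratings come from the example above. The other two events are hypothetical.

```python
# (event, probability 1-5, business impact 1-5)
RISKS = [
    ("Ransomware attack", 3, 3),  # ratings from the example above
    ("Server room fire", 1, 5),   # hypothetical ratings
    ("Power outage", 4, 2),       # hypothetical ratings
]


def rank_risks(risks):
    """Return (event, score) pairs, highest score first."""
    scored = []
    for event, probability, impact in risks:
        if not (1 <= probability <= 5 and 1 <= impact <= 5):
            raise ValueError(f"ratings for {event!r} must be between 1 and 5")
        # Assumption: score = probability x impact.
        scored.append((event, probability * impact))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for event, score in rank_risks(RISKS):
        print(f"{score:>2}  {event}")
```

A low-probability, high-impact event can score below a frequent minor one, so review the ranking with judgment rather than treating the score as final.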

6) Event Definition

Here, you will define what each of those events is and what they would look like. While events like “fire” and “flooding” would be obvious to personnel, a specific form of cyberattack, like ransomware, may not always be so clear. Describing what such an attack looks like is important for both your recovery teams and training of personnel.

Include the level of severity that would warrant activating the recovery plan. In a fire, for example, you need to determine specifically what needs to happen before the emergency protocols should be followed. (How big of a fire? Where? What is the impact on server infrastructure?)

7) Response

Elaborate on the specific procedures that should be used to resolve the incident. For example:

  • Upon detection of ransomware, notify IT manager
  • Define and communicate immediate actions to appropriate teams
  • Isolate infected machines; remove from network and/or power-off
  • Roll back to a healthy data recovery point, ideally no more than 12 hours old
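
The last step above can be sketched in Python as picking the newest healthy backup taken before the infection, then checking it against the 12-hour guideline. This is a minimal illustration: how a backup gets marked as healthy (for example, by a malware scan or a test restore) is outside the sketch, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def choose_recovery_point(backups, infected_at, max_age=timedelta(hours=12)):
    """Pick the newest healthy backup taken before the infection.

    backups is a list of (timestamp, is_healthy) pairs.
    Returns (timestamp, within_guideline), or (None, False) if none qualify.
    """
    candidates = [ts for ts, healthy in backups if healthy and ts < infected_at]
    if not candidates:
        return None, False
    best = max(candidates)
    return best, infected_at - best <= max_age


if __name__ == "__main__":
    # Hypothetical timeline for illustration only.
    infected = datetime(2024, 1, 2, 12, 0)
    history = [
        (datetime(2024, 1, 1, 18, 0), True),
        (datetime(2024, 1, 2, 6, 0), True),
        (datetime(2024, 1, 2, 11, 0), False),  # taken before detection but already compromised
    ]
    point, within = choose_recovery_point(history, infected)
    print(point, "within 12 hours:", within)
```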

Be as specific as possible. Remember to include steps for contacting the appropriate authorities, when applicable. In a natural disaster, for example, your protocols will likely include dozens of steps beyond those that apply to the server recovery. For example, you may want to identify locations for employee safety, evacuation procedures, assembly points and so on. (Some organizations may choose to include these protocols in their larger DRP document, rather than the narrower server disaster recovery plan.)

8) Preventative & Recommended Guidance

Identify the systems, technologies and other tools that are already in place to help mitigate the risks of the event or resolve it. This is where you will include more details on your data backup and recovery systems, anti-malware software, training programs or even things like server-room fire suppression systems.

You should also include any recommended action steps for resolving weaknesses that you’ve discovered during planning. Update the plan again once those systems have been implemented.

9) External Communication

In a major event, your teams may need to contact a wide range of external parties, such as:

  • Third-party recovery providers
  • Media
  • Insurance agencies
  • Financial firms
  • Attorneys

This section should identify the primary points of contact for each of those parties and the scenarios or rules for communicating with them. For media communication, some companies include pre-written press release drafts (with blank areas for the specifics) in their business continuity plans. This helps save precious time after a disaster and also ensures the communication is pre-approved and consistent with company policies.

10) Asset Management

Depending on the scope of your plan (company-wide or IT-specific), you’ll also want to include a list of physical assets. This will probably be similar to versions that you’ve already provided to your insurance company. The list can include everything from the components of your IT infrastructure to your office furniture and valuable paper files.

Frequently Asked Questions

1. What is server disaster recovery?

Server disaster recovery is a form of disaster planning that defines the procedures for recovering a server after an unplanned outage. This planning can include measures to prevent server failure, as well as systems for recovery, such as restoring a data backup.

Protocols for server disaster recovery are typically included in an organization’s larger disaster recovery plan, which is the master document for all disaster planning and response.

2. What is a server disaster?

A server disaster is defined as any adverse event, flaw or system failure that disrupts the proper function of a server. It can refer to a complete server outage or the loss of specific volumes of data stored on the server.

3. How do I back up a server?

To back up a server, you will need backup software and at least one storage device where the backup will be stored. Some backup solutions include a dedicated backup device that stores the protected data locally and also replicates it to a secure off-site data center or cloud.

These systems are also referred to as business continuity and disaster recovery solutions, or BCDR for short.

4. How do I restore a server from a backup?

The steps for restoring a server from a backup depend on a variety of factors, including the specific context of the server failure and the type of backup platform you’re using.

If the server is still booting, backups can usually be restored onto the original drives. If the server is not booting, other methods should be used, such as a bare-metal restore or a hypervisor upload for virtual servers.

5. How often does a server fail?

On average, a server has a 5 percent failure rate in its first year of operation, according to data from Statista. Server failure rates increase as the server ages. After 4 years of operation, the average server failure rate increases to 11 percent.

6. What causes server failure?

Server failure can be caused by a number of issues, including hardware malfunction, malware, power loss/surges, O/S failure, overheating, application errors and human error. The most common cause of server failure is a malfunctioning hard drive.

Conclusion

Keep in mind: no two recovery plans are exactly the same. While this server disaster recovery plan template provides the basic structure for creating your own plan, we strongly suggest customizing your plan according to your business’s specific needs.

Use the template above only as a starting point for identifying your unique risks, emergency protocols and technology solutions. You may find that the nature of your business, and the threats unique to your industry, require a different approach. As long as you’re carefully considering your disaster scenarios and creating the appropriate response plans, you can significantly reduce the risks of a prolonged server outage.

Free Demo: Data Backup for Full Server Recovery

Discover robust data backup designed to keep your business running after a server disaster. Request a free demo of disaster recovery solutions from Datto, or contact our business continuity specialists at Invenio IT by calling (646) 395-1170 or emailing success@invenioIT.com.
