Disaster recovery via the DevOps approach
Organizations can’t always avoid disasters, but do you have a plan for how your infrastructure will recover when one actually happens? We imagine most companies are underprepared. With a disaster recovery plan and preventative measures in place, however, a company can minimize the potential damage, get things back up and running quickly and, most importantly, prevent some disasters from happening in the first place.
There are two parts to this conversation. First, does your company have a disaster recovery process in place that runs on schedule and ensures you have regular backups? Second, how frequently is that process tested?
In this article, let’s review both parts to understand how vulnerable you are in the event of a major outage and how DevOps can improve your disaster recovery (DR) planning.
What is Disaster Recovery?
In simple words, disaster recovery is how a business gets back to functioning as soon as possible after a disaster. It’s a subset of business continuity planning: a set of actions to be taken before, during and after a disaster to help protect the business.
What can these disasters be? They can be anything from simple human mistakes to hardware failures to uncontrolled outages that lead to costly business damage. For instance, your cloud provider goes down, an accidental mistake wipes out your database, a wrong swap empties your storage account or, most common of all, the fat-finger problem (a mistyped command or a misclick).
Backup and Restore vs. Disaster Recovery
People often mistake backup and restore for disaster recovery, when in fact there’s an important distinction between the two. Backup is the process of making copies of your data to protect it, and restore is the process of bringing that data back if you encounter an accidental problem. Disaster recovery, however, refers to the plan and processes for quickly re-establishing access to applications, data and IT resources after an outage. Simply keeping copies of your data might not be enough to keep your business running. To maintain continuity, an organization needs a robust and tested recovery plan.
Challenges in Disaster Recovery Planning
Most companies either do not have a disaster recovery plan in place or, if they have one, hardly ever test it, which leaves them vulnerable in the event of an actual disaster. Even the best-laid disaster plans can go awry if no one bothers to test them. So why aren’t these companies building or testing their DR plans? The reasons vary: inadequate resources, a shortage of staff, not enough time, or a process that is simply too difficult and expensive.
To be honest, disaster recovery is tricky, and testing it on an ongoing basis is another major challenge. Testing requires the IT team to run through the DR plan and confirm that a successful recovery is possible. It can be a time-consuming, expensive and complex process that may also disrupt the production environment, so most organizations avoid it.
DevOps can Help!
Yes! Taking the DevOps approach can enable more effective DR planning because it brings continuous integration (CI) and continuous delivery (CD) of software changes, so that applications can be managed more effectively. The same tools and procedures you use to move applications from development to testing to production, and back to development again, can also be applied to failing over and recovering from disasters and service interruptions. By incorporating DR planning into the DevOps pipeline, the IT team can ensure that the DR plan is managed along with the application. With enough automation, much of a recovery can run in minutes rather than days, depending on how much has to be rebuilt.
Case Study:
Let’s walk through a practical example. Assume a SaaS product that consists of code, configuration and data. In the event of a disaster, the expectation is to bring the whole application back up as soon as possible, with all of its information intact. If a DevOps approach is used to execute the DR plan, the following actions would be taken:
- Provision all required resources such as servers, storage, databases, networking, security and any other infrastructure-level components. This can take anywhere from a few minutes to a few hours, depending on the number of resources, the provisioning method and the custom configurations. Examples of such provisioning tools are Packer, Terraform, AWS CloudFormation, blueprints, etc.
- Install dependencies such as application roles (Nginx, Apache, Node.js, etc.) and third-party integrations, and apply configurations. Examples of configuration management tools are Ansible, Chef, Puppet, Salt, etc.
- Restore data from backups into file storage and databases. This step can also be automated by keeping restore scripts ready and integrating them with the DevOps pipeline.
- Test to ensure endpoints are alive and the basic functionality of the application is restored. This step is automated as well, so that the basic tests have passed before an actual user logs in.
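The four actions above can be sketched as a small, ordered recovery script. This is only an illustrative outline, not a definitive implementation: the choice of Terraform and Ansible, the playbook name site.yml, the script restore_backups.sh and the health-check URL are all assumptions made for the example, and a real pipeline would more likely live in your CI/CD tool of choice.

```python
import subprocess
import time
import urllib.request
from typing import Callable, List, Tuple

# A recovery step is a name plus a function that raises on failure.
Step = Tuple[str, Callable[[], None]]


def run_command(args: List[str]) -> None:
    """Run an external tool and raise if it exits with a non-zero status."""
    subprocess.run(args, check=True)


def endpoint_is_alive(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # covers connection errors, timeouts and HTTP errors
        return False


def run_recovery(steps: List[Step]) -> List[Tuple[str, float]]:
    """Run steps in order and return (name, seconds) for each one.

    Any exception raised by a step aborts the run, so later steps
    never execute against a half-built environment.
    """
    timings = []
    for name, action in steps:
        started = time.monotonic()
        action()
        timings.append((name, time.monotonic() - started))
    return timings


def build_steps(health_url: str) -> List[Step]:
    """Assemble the four recovery steps described in the case study."""

    def smoke_test() -> None:
        if not endpoint_is_alive(health_url):
            raise RuntimeError(f"smoke test failed for {health_url}")

    return [
        # Assumption: infrastructure is defined in Terraform files in this directory.
        ("provision", lambda: run_command(["terraform", "apply", "-auto-approve"])),
        # Assumption: site.yml is a hypothetical Ansible playbook.
        ("configure", lambda: run_command(["ansible-playbook", "site.yml"])),
        # Assumption: restore_backups.sh is a hypothetical restore script.
        ("restore", lambda: run_command(["./restore_backups.sh"])),
        ("smoke-test", smoke_test),
    ]
```

Calling run_recovery(build_steps(url)) with your own health-check URL would execute the steps in sequence and report how long each one took, which is also a simple way to measure recovery time during a scheduled DR test.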
Don’t Wait for an Unforeseen Disaster, Be Prepared!
Disaster recovery is one of the most critical concerns for any application, especially in businesses where software and data make up most of the value of what you do. DevOps is well suited to such sensitive situations because its pipeline automates your disaster recovery process, minimizing human error and reducing restore time as much as possible.
Do you have a disaster recovery process in place? If not, this is one of the pre-packaged services we offer.
Contact us to find out more: info@planetofit.ca