How does the cloud affect an organization’s Disaster Recovery (DR) and Business Continuity Plan (BCP)?
I recently worked on a SaaS platform, and there was a lot of discussion around the idea that, since the product is developed entirely in the cloud, we need not worry about BC or DR plans because they are the responsibility of the cloud service provider (in our case, Amazon Web Services).
Before I share my thoughts on this scenario, a quick clarification: BC and DR CAN’T be used interchangeably, although they often are.
Disaster recovery (DR) refers to the ability to restore the data and applications that run the business (data centers, servers, and other infrastructure), and to how quickly they can be recovered and restored. Business continuity (BC) planning refers to a strategy that lets a business keep operating with minimal or no downtime or service outage.
In a nutshell, BC and DR plans are about how we are going to keep the business operating, including the people and processes, and not just the critical infrastructure. (Personally, I think this is a hard-learned lesson after 9/11.)
BTW what is a cloud?
In the simplest terms – when you’re storing/processing your data on someone else’s computer, you are using a cloud.
But why do we even need BC/DR plans? Can’t I just train everyone and be done with it?
No. These plans are specifically required to be documented, assessed, approved and audited by audit frameworks, regulations and third-party reports such as SOC1/SOC2, ISO 27001, NIST 800-53, HITRUST, HIPAA, PCI, GDPR, etc.
Shared Cloud Security Responsibility
We know security has three primary requirements – Confidentiality, Integrity and Availability. These requirements don’t change when we move from local infrastructure to the cloud; however, the responsibility for meeting them is shared between the cloud provider and the customer.
Even with the shared responsibility model, it is worth noting that while the ‘execution’ can be outsourced to the cloud provider, the ‘accountability’ still lies with the customer, i.e. us.
Availability in the cloud
There’s a notion that since all the data is in the cloud, the customer does not need to test the plan or think it through, that the entire responsibility lies with the cloud provider, and that availability is therefore not at risk.
This is an incorrect notion.
There are many legal and licensing issues that put availability in the cloud at risk (Further reading: Cloud Computing Legal Issues).
Specifics in the cloud
More often than not, the customer forgets about the people part of the plan. For instance, when I was working on the SaaS application, we had a lot of discussion about enabling the firewall on AWS, but little about who is responsible for keeping the rules up to date with the security requirements. Also, backups in the cloud are exposed to the same risks as the original data and need the same level of security. The same RPO/RTO (recovery point objective / recovery time objective) targets need to be assessed for the cloud as well. Lastly, as mentioned earlier, cloud security is a shared responsibility, which should not be overlooked.
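To make the RPO/RTO point concrete, here is a minimal sketch of how the results of a restore drill could be compared against the agreed targets. The service name, the targets and the measured values are all made up for illustration; they are not from the platform I worked on.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjective:
    """Targets agreed with the business for one service (values are hypothetical)."""
    service: str
    rpo: timedelta  # maximum tolerable data loss, measured as age of the last good backup
    rto: timedelta  # maximum tolerable time to restore the service


def assess(objective, last_backup_at, measured_restore_time, now):
    """Return a list of findings; an empty list means both targets are met."""
    findings = []
    backup_age = now - last_backup_at
    if backup_age > objective.rpo:
        findings.append(
            f"{objective.service}: RPO missed, last backup is {backup_age} old "
            f"(target {objective.rpo})"
        )
    if measured_restore_time > objective.rto:
        findings.append(
            f"{objective.service}: RTO missed, restore took {measured_restore_time} "
            f"(target {objective.rto})"
        )
    return findings


# Made-up numbers from a hypothetical restore drill.
now = datetime(2024, 1, 1, 12, 0)
billing = RecoveryObjective("billing-db", rpo=timedelta(hours=1), rto=timedelta(hours=4))
findings = assess(
    billing,
    last_backup_at=now - timedelta(hours=3),
    measured_restore_time=timedelta(hours=2),
    now=now,
)
for finding in findings:
    print(finding)
```

In this example the restore time is within the RTO, but the last backup is older than the RPO allows, so only the RPO finding is reported. The point is that the same assessment applies whether the backup sits in your own data center or in the cloud.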
Service Administration in the Cloud
According to the Cloud Security Alliance, the best way to administer services in the cloud is to assign a single admin to each service.
An example of this control is assigning different services, such as password management and user administration, to different service administrators.
When a different admin is assigned to each service, the risk posed by any one admin is limited to their specific service.
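As a rough sketch of how this control could be reviewed, the snippet below flags services that do not have exactly one admin and admins who hold more than one service. The service names and admin names are hypothetical, and in practice the mapping would come from your cloud provider's access configuration rather than a hard-coded dictionary.

```python
from collections import defaultdict

# Hypothetical assignment of cloud services to administrator accounts.
assignments = {
    "password-management": ["alice"],
    "user-administration": ["bob"],
    "firewall-rules": ["alice", "carol"],
    "backup-management": [],
}


def review_assignments(assignments):
    """Flag services without exactly one admin and admins holding several services."""
    issues = []
    services_by_admin = defaultdict(list)
    for service, admins in assignments.items():
        if len(admins) != 1:
            issues.append(f"{service}: expected exactly 1 admin, found {len(admins)}")
        for admin in admins:
            services_by_admin[admin].append(service)
    for admin, services in sorted(services_by_admin.items()):
        if len(services) > 1:
            issues.append(
                f"{admin}: administers multiple services: {', '.join(sorted(services))}"
            )
    return issues


issues = review_assignments(assignments)
for issue in issues:
    print(issue)
```

A review like this is cheap to run on a schedule, and an empty result is easy evidence to hand to an auditor.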
Best practices
Use Multi-factor Authentication (MFA) – MFA makes it harder for a stolen or shared credential to be used for unauthorized configuration changes, including by a malicious employee, and so reduces the ‘blast radius’. Separate administration accounts further limit the impact of an inappropriate configuration.
Assess applications for cloud architecture – Assess whether the application can be moved from a local architecture to a cloud architecture. Many legacy applications still run on mainframes and can’t be migrated to the cloud easily. I wrote a separate article on this topic: The 48 assessment questions to ask before Cloud Migration
Don’t reinvent the wheel – As much as we love defense-in-depth, it makes little sense to have multiple firewalls with the same rules. If the cloud provider offers security features, make sure to use them.
Manage separation of duties (SoD) between administration accounts – Refer to the ‘Service Administration in the Cloud’ section for more information.
Lessons from previous incidents
Recently, there was an incident in which access to the AWS East region went down. Although the probability of an entire region going down is very low, the services that give you access to the region can go down. This outage took down many services, such as Alexa and Atlassian.
That being said, the backups in the cloud are always susceptible to:
- Unauthorized disclosure
- Accidental deletion
- Disclosure from the reuse of infrastructure
- Malware and ransomware attacks
Ideally, in a multi-region setup, we should always ensure that:
- Metastructure is configured
- Images are not region-specific
- There are no data sovereignty issues
- Clients do not have restrictions
- The plan is ALWAYS tested
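The checklist above can also be tracked as data, so that open items are visible before anyone relies on the failover. This is only a sketch: the item names mirror the list, and the True/False answers are invented for illustration.

```python
# Hypothetical answers collected during a DR review; item names mirror the checklist.
checklist = {
    "metastructure_configured": True,
    "images_not_region_specific": False,
    "no_data_sovereignty_issues": True,
    "no_client_restrictions": True,
    "plan_tested": False,
}


def open_items(checklist):
    """Return the items that are not yet satisfied, in the order they were listed."""
    return [item for item, done in checklist.items() if not done]


def is_ready(checklist):
    """The multi-region plan counts as ready only when every item is satisfied."""
    return not open_items(checklist)


remaining = open_items(checklist)
print("Ready for failover:", is_ready(checklist))
for item in remaining:
    print("Open item:", item)
```

Keeping the untested plan as an explicit open item is deliberate: a plan that has every technical box ticked but has never been exercised should still show up as not ready.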
Next steps
With the limited experience I have with Azure and AWS, I can certainly attest that testing in the cloud is technologically easier. In the cloud, a test need not be full-scale; testing a representative set of critical services can be enough. For companies that are just starting their cloud migration, tabletop exercises are a good start.
Lastly, don’t forget to update the numbers on the contact-sheet while you’re updating the firewall rules. 🙂