If you work with the Cloud even remotely, I am sure you have heard about Availability and Scalability. Even though these are Cloud fundamentals, I have seen many people use the two terms interchangeably. Please be mindful: they are NOT the same. Here is my brief attempt to highlight the difference between them.
What is Scalability?
Scalability means that an application or system can handle greater loads by adapting to demand. (When the Cloud makes this adjustment automatically it is called Auto Scaling, one of the most important features of the Cloud.) We can scale a system in 2 ways:
- Vertical Scalability means increasing the size of the instance.
- Horizontal Scalability means increasing the number of instances.
Horizontal Scalability goes together with Elasticity: the ability to add and remove instances automatically as the load changes.
On the other hand, High Availability (or Availability) means that we’re running the application/system in at least 2 data centers, which in AWS means at least 2 Availability Zones (AZs).
When to use Vertical Scalability?
Use Vertical Scalability when you’re using a non-distributed system such as a database. For this purpose, AWS offers services such as RDS and ElastiCache that can scale vertically.
How to achieve Vertical Scalability?
By changing the size of the instance. E.g., in AWS you Scale Up / Scale Down by moving to a larger or smaller instance type, anywhere from a t2.nano up to a u-12tb1.metal.
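To make the idea concrete, here is a minimal sketch of a Scale Up decision. The helper `pick_instance_size` and the short size list are my own illustration, not an AWS API; the list only covers a few t2 sizes and their memory.

```python
# Illustrative catalogue: a few t2 sizes and their memory in GiB, smallest first.
T2_SIZES = [
    ("t2.nano", 0.5),
    ("t2.micro", 1),
    ("t2.small", 2),
    ("t2.medium", 4),
    ("t2.large", 8),
]


def pick_instance_size(required_memory_gib):
    """Return the smallest listed size with enough memory (hypothetical helper)."""
    for name, memory_gib in T2_SIZES:
        if memory_gib >= required_memory_gib:
            return name
    # Vertical scaling always has a ceiling: the biggest size on the list.
    raise ValueError("no listed size is large enough")


print(pick_instance_size(3))  # t2.medium
```

Note the error branch: Vertical Scalability stops at the largest size available, which is one reason distributed systems scale horizontally instead.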
When to use Horizontal Scalability?
Use Horizontal Scalability when you’re using distributed systems such as web applications. For this purpose, AWS offers EC2.
How to achieve Horizontal Scalability?
By changing the number of instances. E.g., in AWS you Scale Out / Scale In by using an Auto Scaling Group, with a Load Balancer in front to spread the requests across the instances.
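Here is a rough sketch of how a Scale Out / Scale In decision could be made. This is a simplified proportional rule that I am assuming for illustration, not the actual algorithm AWS uses; the function name, the CPU metric, and the default limits are all hypothetical.

```python
import math


def desired_instance_count(current_count, observed_cpu, target_cpu,
                           min_count=2, max_count=10):
    """Scale the fleet in proportion to how far the load is from the target.

    Simplified rule for illustration only; min_count and max_count play the
    role of the lower and upper size limits of a scaling group.
    """
    wanted = math.ceil(current_count * observed_cpu / target_cpu)
    return max(min_count, min(max_count, wanted))


# 4 instances at 90% CPU with a 60% target: Scale Out to 6.
print(desired_instance_count(4, 90, 60))  # 6
# 4 instances at 15% CPU: Scale In, but never below the minimum of 2.
print(desired_instance_count(4, 15, 60))  # 2
```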
What is Availability?
High Availability or Availability usually goes hand-in-hand with Horizontal Scaling. High Availability means running the application in at least 2 Availability Zones (AZs).
The goal of High Availability is to survive a data center loss.
How to achieve Availability?
By using at least 2 Regions or Availability Zones for the system. E.g., back up all the critical data in at least 2 Regions (e.g. one in the East and one in the West, or North Virginia and Ohio) or in at least 2 Availability Zones within a Region (e.g. us-east-1a and us-east-1b).
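A bit of arithmetic shows why the second location matters. The sketch below assumes that locations fail independently of each other, and the 99% figure is a made-up number for illustration, not a published AWS value.

```python
def combined_availability(single_location_availability, locations):
    """Chance that at least one location is up, assuming independent failures."""
    chance_all_down = (1 - single_location_availability) ** locations
    return 1 - chance_all_down


# One location that is up 99% of the time (hypothetical figure) ...
print(round(combined_availability(0.99, 1), 6))  # 0.99
# ... versus the same system running in 2 such locations.
print(round(combined_availability(0.99, 2), 6))  # 0.9999
```

Under these assumptions, the second location cuts the expected downtime from 1% to 0.01%.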
When to use what?
Let’s say I work for HBO and I know a lot of customers will be opening a new account just to watch Game of Thrones. Since many people will be buying an HBO membership and creating accounts for the first time, this is where I will increase the size of the instance (in our case, the database) by using Vertical Scaling.
Once that is done, I know a lot of people will be logging in at the same time to watch the episodes every Sunday at 9:00 PM. This is where I will increase the number of instances and use Load Balancers to ensure that all the users can log in on time and watch the show seamlessly. This is Horizontal Scaling or Elasticity.
And just in case an AWS Region/AZ goes down for some reason, I will back up the episodes at multiple locations so that if the North Virginia Region goes down, the users can still watch the show from the copy stored in the Ohio Region. This is High Availability.
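The failover idea can be sketched in a few lines. The function `pick_location` is hypothetical, and the set of healthy locations is assumed to come from some health check that I am not showing here; us-east-1 and us-east-2 are the Region codes for North Virginia and Ohio.

```python
def pick_location(preferred_order, healthy):
    """Return the first preferred location that is healthy, or None if all are down."""
    for location in preferred_order:
        if location in healthy:
            return location
    return None


# North Virginia first, Ohio as the backup.
locations = ["us-east-1", "us-east-2"]

# Normal day: both locations are healthy, so North Virginia serves the show.
print(pick_location(locations, {"us-east-1", "us-east-2"}))  # us-east-1
# North Virginia is down: viewers are served from Ohio instead.
print(pick_location(locations, {"us-east-2"}))  # us-east-2
```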
Now you ask, wouldn’t it be much better to store the show at multiple geographic locations initially so that the users can access the show without latency? Of course! That is done by Content Delivery Networks (CDNs). How does that work? I’ll save that for another post. 🙂