What happens when the cloud goes down?
Organisations everywhere leverage cloud technology to provide better service to their customers and a better experience for their employees. Cloud computing is so prevalent in business today that it’s normal.
See surprising cloud usage stats in our infographic.
Enterprises trust the cloud to store their data and apps, knowing it’s secure and reliable. And with service level agreements (SLAs) in place between consumers and providers, cloud customers are promised a certain level of uptime, often 99.9% or higher depending on the service.
But the cloud isn’t perfect.
Even the most reliable cloud providers experience service outages from time to time. It’s inevitable. And the longer you're a cloud customer, the more likely it is you will be affected by a service outage at some point. Even the biggest cloud service providers in the world aren’t immune; we’ve already seen outages from them in recent years.
So, what exactly is a cloud service outage? An outage is a period of time during which a cloud service is unavailable to end users. Users might not be able to access some of their apps and data, or all of their cloud-based apps might be unavailable. If the service is performing inadequately or outside the terms of a customer’s SLA, this can also be classed as an outage.
Find out why Azure is the cloud of choice for so many enterprises here.
Cloud outages are caused by lots of different things. Some of them are within the cloud service provider’s control, but many aren’t. And the reasons for an outage are more straightforward than you might think.
One of the most common causes of a cloud service outage is an outage of the power supply to the cloud datacentre. Providing cloud services to thousands of enterprises in multiple locations takes huge amounts of power. This is often supplied by a third-party power plant or the national grid. And, as we all know, sometimes power goes off. No matter how big the energy supplier, it’s a challenge to supply all datacentres everywhere with the power needed to run perfectly, 24 hours a day, 365 days a year. When the power cuts out, the cloud will too.
Technical issues can also bring enterprise-grade datacentres to a halt. Obvious issues can be identified quickly, but minor issues might go unnoticed until end users are affected. How quickly service can be restored depends entirely on the nature of the problem, how complex it is and how quickly it is spotted. Similarly, when a cloud vendor has partnered with a telecommunications provider to deliver the service, connectivity issues can cause a service outage for end users. In this case, it’s the telecommunications provider that needs to resolve the problem quickly.
Like any IT system, cloud platforms need ongoing maintenance to keep them working properly and updates to improve performance. Disruption caused by scheduled upgrades can be planned for and communicated to customers, but unscheduled maintenance might cause unforeseen service disruption while workloads are shifted and bugs are fixed.
Cyberattacks are another cause of cloud outages. Distributed denial-of-service (DDoS) attacks, for example, flood datacentres with traffic, which stops users from accessing the service normally.
Cloud platforms are not immune to human error. Just one wrong move or instruction can cause an entire cloud outage, even though measures are usually in place to try to mitigate this. The AWS S3 outage of February 2017, for example, was traced to a mistyped command entered during routine debugging.
The benefits of cloud computing can’t be overstated. But to deliver these benefits, cloud providers need to operate on a scale and at a level such that occasional outages are inevitable. Customers need to understand that while the cloud is very reliable, there will be occasions when the service is unavailable, and that’s the trade-off.
Watch our Q&A on how businesses can start to adopt and embrace the cloud here.
Cloud outages don’t happen often, but when they do, you can mitigate the impact on your business by having a plan in place.
It’s important to remember that the implications of a service outage will vary from app to app. It might not matter if some of your apps are unavailable for a while, but others might need access restoring immediately. You can choose to run apps with reduced functionality using cached data so that some features are available even in the case of an outage. Many cloud-based apps also have built-in features that support availability and resilience in case of a service disruption. If running with reduced functionality isn’t an option, you can fail over apps and data to an alternate datacentre so that you can still access them while the outage at the original datacentre is fixed.
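To make the failover and cached-data options a little more concrete, here is a minimal Python sketch. It is an illustration only: the `ServiceUnavailable` exception, the `fetch_with_fallback` function and the simulated `primary` and `secondary` fetchers are all hypothetical names invented for this example, not part of any cloud provider's toolkit.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


class ServiceUnavailable(Exception):
    """Raised by a fetcher when its datacentre cannot be reached (hypothetical)."""


def fetch_with_fallback(
    fetchers: Sequence[Tuple[str, Callable[[], Any]]],
    cached: Optional[Any] = None,
) -> Tuple[Any, str]:
    """Try each datacentre in order, then fall back to cached data.

    Returns the data together with the name of the source that supplied it.
    """
    for name, fetch in fetchers:
        try:
            return fetch(), name
        except ServiceUnavailable:
            continue  # this datacentre is down, so try the next one
    if cached is not None:
        return cached, "cache"  # reduced functionality: possibly stale data
    raise ServiceUnavailable("no datacentre reachable and no cached copy")


# Simulated fetchers standing in for real service calls (assumptions).
def primary() -> dict:
    raise ServiceUnavailable("primary datacentre is down")


def secondary() -> dict:
    return {"orders": 42}


data, source = fetch_with_fallback([("primary", primary), ("secondary", secondary)])
print(source, data)
```

The order of the list decides the priority: the app only falls back to cached data once every datacentre in the list has failed, which mirrors the idea of keeping some features available rather than none.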
It’s also a good idea to run regular backups so that if data is affected or corrupted by a cloud outage you have a recent version that can be restored. For example, if you perform an automated backup every hour, and experience an outage where data is lost, the restored data will be one hour old at the most. Automating backups means you won't have to remember to do it, either.
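The hourly backup example can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the `data_loss_window` function, the backup schedule and the outage time are all made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable


def data_loss_window(backup_times: Iterable[datetime], outage_time: datetime) -> timedelta:
    """Return how old the newest usable backup is at the moment of the outage."""
    usable = [t for t in backup_times if t <= outage_time]
    if not usable:
        raise ValueError("no backup was taken before the outage")
    return outage_time - max(usable)


# Hypothetical schedule: one backup on the hour, every hour, for a single day.
start = datetime(2024, 1, 1)
hourly_backups = [start + timedelta(hours=h) for h in range(24)]

# Hypothetical outage at 14:37; the newest backup before it was taken at 14:00.
outage = datetime(2024, 1, 1, 14, 37)
print(data_loss_window(hourly_backups, outage))
```

With backups taken on the hour, the result stays under one hour as long as the schedule is still running. Shortening the interval shrinks the window further, at the cost of more storage and more backup jobs.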
Sometimes a cloud service outage can be “waited out”, but for business-critical tasks affected by the outage an immediate resolution will be needed. When choosing which cloud to use, organisations need to weigh up the impact of a potential cloud outage on their business, and if necessary, invest in a higher-availability SLA to guarantee better uptime.
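When comparing SLAs, it can help to turn an uptime percentage into the downtime it actually allows. The short Python sketch below does that arithmetic. The function name `allowed_downtime_minutes` and the three uptime levels are illustrative assumptions rather than figures from any particular provider, and the calculation ignores leap years.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Maximum downtime per year, in minutes, that an uptime percentage permits."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - uptime_percent) / 100


# Illustrative uptime levels (assumptions, not quotes from a real SLA).
for level in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(level)
    print(f"{level}% uptime allows about {minutes:.1f} minutes of downtime a year")
```

Each extra "nine" cuts the permitted downtime by a factor of ten, from nearly nine hours a year at 99.9% to a little over five minutes at 99.999%. That difference is worth weighing against the extra cost of a higher tier.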
Next steps
If you're beginning your cloud journey, Core can help. Our Cloud Readiness Solution can help you establish your cloud maturity and develop a roadmap for you to migrate to the cloud easily and effectively. Book your assessment today or contact us for more details.