There was a time – over a decade ago, an eternity in the world of tech – when 100% uptime was impossible. No matter how well a server worked, no matter how capable the administrator, it would eventually need to be taken down. Infrastructure failed. Vulnerabilities and glitches needed to be patched out. New features needed to be implemented.
Back then, outages – however small – were simply a fact of life. Servers like the one documented by Ars Technica user Axatax were the exception rather than the rule. The expectation that a server always be available to its users was unreasonable at best.
Much has changed since those days.
In the world of web hosting, this is thanks in no small part to how the Internet itself has changed. These days, nearly everyone in the developed world is online, and nearly everyone uses the Internet for something. Those users almost universally expect whatever services they use to be available at their leisure. Dropping one’s services for even a short time could easily drive frustrated users into the arms of competitors – to say nothing of lost traffic and sales revenue.
Taken together, the costs can be positively astronomical, particularly for larger organizations. Allow me to quantify matters: on August 17, 2013, Google was down for somewhere between one and five minutes. Although the organization (unsurprisingly) emerged unscathed from the incident, it nevertheless lost upwards of $500,000 in revenue. That outage didn’t just cost Google a mint, either – during the incident, overall web traffic dropped by a staggering 40%.
It should be clear from this that unplanned outages don’t just impact an organization’s customers. Depending on what servers are being used for, everything from employees to relationships with strategic partners could be adversely affected. It therefore goes without saying that downtime needs to be minimized as much as humanly possible.
That’s where cloud computing comes in. Thanks to the services offered by modern cloud vendors, 100% uptime is not only possible, but also affordable. This is possible because of the distributed nature of the cloud, which allows computing tasks to be spread across multiple servers. That way, even if one server cluster goes down, a client can simply fail over to another without any interruption in service.
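To see why distribution helps so much, consider the arithmetic of redundancy. The Python sketch below uses an illustrative 99% per-cluster availability figure – my assumption for the example, not a number from any vendor – and shows how the odds of every independent cluster being down at once shrink with each replica added:

```python
# Illustrative availability math: if each independent cluster is up 99%
# of the time, the probability that ALL replicas fail simultaneously
# shrinks geometrically with each replica added.

def combined_availability(per_cluster_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent clusters is up."""
    p_down = 1.0 - per_cluster_availability
    return 1.0 - p_down ** replicas

for n in (1, 2, 3):
    pct = combined_availability(0.99, n) * 100
    print(f"{n} cluster(s): {pct:.4f}% availability")
```

Two independent 99% clusters already yield roughly 99.99% combined availability, and a third pushes the figure to roughly 99.9999% – assuming, crucially, that the clusters fail independently of one another.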
Now, this system isn’t necessarily perfect. As evidenced by the 2012 Christmas Eve outage suffered by Amazon Web Services, sometimes things happen that are beyond a vendor’s control. A small glitch in a provider’s load balancing software, an error on the part of an administrator, or an act of God can all contribute to an unplanned and arguably unmanageable service interruption.
Because of this reality, many have made the erroneous claim that 99.9% uptime is the most an organization can realistically strive for; there will always be outages, the argument goes, and the best thing an organization can do is select a vendor that will bring them into the loop the moment a problem becomes apparent (and offer reimbursement on top of that).
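It helps to put numbers on what “three nines” actually concedes. The conversion below is plain arithmetic, not the terms of any particular SLA:

```python
# Convert an uptime percentage into the downtime it permits per year.

HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year

def downtime_hours_per_year(uptime_pct: float) -> float:
    """Hours of downtime per year allowed by a given uptime percentage."""
    return HOURS_PER_YEAR * (1.0 - uptime_pct / 100.0)

for pct in (99.0, 99.9, 99.99):
    print(f"{pct}% uptime -> {downtime_hours_per_year(pct):.2f} hours/year of downtime")
```

At 99.9%, an organization is accepting roughly 8.76 hours of downtime every year; each additional nine cuts that figure by a factor of ten.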
While it’s certainly true that one should exercise caution when selecting a vendor, it’s also true that a business can take a few steps of its own to maximize uptime.
“Choose an IaaS provider with an architecture designed to limit the impact of outages,” writes Marco Meinardi of Tech Apostle. “If this sounds too theoretical, then think about EBS (AWS Elastic Block Store) which is a centralized macro-component highly dependent on the network.”
Alternatively, he continues, a business could build its own resilient application at the IaaS layer, in addition to selecting a provider with a decent refund policy. That way, if an outage does happen, the business can leverage its own load balancing to minimize the damage done.
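As a rough illustration of what resilience at the application layer can look like, here is a minimal failover sketch in Python. The backends are stand-in callables with hypothetical names; in a real deployment they would be requests to clusters at different providers, and the error handling would be narrower:

```python
# Minimal client-side failover: try each backend in order, falling back
# to the next one when a call fails, so users never see a single-cluster outage.

from typing import Callable, Sequence

def call_with_failover(backends: Sequence[Callable[[], str]]) -> str:
    """Return the first successful backend response; raise only if all fail."""
    last_error = None
    for backend in backends:
        try:
            return backend()
        except Exception as err:  # real code should catch specific errors
            last_error = err      # remember the failure, try the next backend
    raise RuntimeError("all backends are down") from last_error

# Usage with stand-in backends (hypothetical names):
def primary() -> str:
    raise ConnectionError("primary cluster unreachable")

def secondary() -> str:
    return "response from secondary"

print(call_with_failover([primary, secondary]))
```

The point of the sketch is the shape of the logic, not the specifics: the outage of one backend is absorbed silently, and only the failure of every backend surfaces to the caller.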
More than a decade ago, achieving 100% uptime was a pipe dream. No matter how powerful and reliable a server, it would eventually run into problems. Today, thanks to the technology provided by cloud vendors, it’s very much within the realm of possibility – and something every organization should strive for.