Can anyone point me to a cloud hosting provider who will really commit to 99.999% availability? From what I’ve seen, some offer this but only back it up with meaningless compensation if they fail, such as refunding a month’s hosting fees. For any business which really needs 99.999% availability, that refund will be out of all proportion to the damage unscheduled downtime brings. The other catch you often see is 99.999% planned availability, which means that they can schedule as much downtime as they like as long as they plan it and let you know in advance.
I ask this question because, although I’m very much in favour of cloud hosting, I increasingly feel that clients who really need 99.999% availability should look at a hybrid solution. That could mean combining your own hosting with scaling out to the cloud, which has been done successfully; hosting in multiple availability zones with your chosen cloud platform; or ultimately a multi-cloud solution that spreads the hosting over two or more cloud providers. The last of these is the only solution if you really want to engineer out single points of failure.
So while contracts often call for five nines, can the client afford that extra nine? Are they prepared for the complications it brings? And even if they are, is it more of a target than something anyone will actually commit to in application hosting today?
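To put a rough number on why spreading hosting over several zones or providers helps, here is a minimal Python sketch. It assumes that each hosting location fails independently of the others and that failover between them is instant and perfect, which is optimistic in practice. The function name and the 99.9% figures are illustrative only, not taken from any provider's SLA.

```python
def combined_availability(availabilities):
    """Chance that at least one location is up.

    Assumes failures are independent and failover is instant,
    which real deployments only approximate.
    """
    all_down = 1.0
    for availability in availabilities:
        # Probability that this location AND all previous ones are down.
        all_down *= 1.0 - availability
    return 1.0 - all_down


# Two hypothetical providers, each offering 99.9% on its own.
two_providers = combined_availability([0.999, 0.999])
print(f"Combined availability: {two_providers * 100:.4f}%")
```

On paper, two independent 99.9% locations get you to roughly six nines, but shared dependencies such as DNS, your own deployment pipeline or a common software bug break the independence assumption, which is where the extra cost and complication come from.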
Any application with 99.999% availability will need to have been designed from the ground up with that availability level in mind. You can’t take an existing application and easily retrofit this level of high or rather continuous availability. In reality, if you want to upgrade an existing application from 99.99% availability to 99.999% you are going to have to engage in a serious refactoring project.
Update: January 7th 2020
If your business really needs 99.999% availability or indeed beyond this, you may want to look at a new offering from IBM called Continuous Availability (CA). IBM launched this in mid-2019 on their flagship IBM i small mainframe platform, with the intention of rolling out the technology to other platforms at some point in the future. Their solution isn’t a replacement for HA and mirrored systems, but it’s a new ability to keep a server online constantly through database, operating system and software upgrades. Most customers running these platforms will host in private data centres, on premises, or potentially contract cloud provision from IBM.
What does 99.999% availability mean?
This means that the service is available, operational and usable within specified performance criteria for 99.999% of the period. The period and performance criteria should be clearly specified within the contract. Let’s say we are talking about a 7 day rolling period. 7 days * 24 hours * 60 minutes * 60 seconds = 604,800 seconds. The permitted downtime is 0.001% of that: 604,800 * 0.00001 = 6.048, so about 6 seconds of unplanned outage is allowed in every 7 day rolling period.
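The same arithmetic can be checked for other availability levels and periods with a short Python sketch. The function name is my own, and the 7 day period is just the example used above; a real contract would define its own measurement period.

```python
from decimal import Decimal

WEEK_SECONDS = 7 * 24 * 60 * 60  # 604,800 seconds in a 7 day period


def allowed_downtime_seconds(availability_percent, period_seconds):
    """Downtime budget in seconds for a given availability percentage.

    availability_percent is passed as a string (e.g. "99.999") so that
    Decimal can represent it exactly, without floating point rounding.
    """
    unavailable_fraction = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable_fraction * period_seconds


for percent in ("99.9", "99.99", "99.999"):
    seconds = allowed_downtime_seconds(percent, WEEK_SECONDS)
    print(f"{percent}% over 7 days allows {float(seconds):.3f} seconds of downtime")
```

Each extra nine cuts the downtime budget by a factor of ten, which is why the jump from four nines to five is so much harder than it looks on paper.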
Update: April 6th 2022
Welcome to 7 Nines Availability (99.99999%): The latest IBM Z Series Mainframe is designed to deliver 7 Nines availability. This level of availability and performance is increasingly essential for customers running large scale financial institutions and online services. See the ITIC 2021 Global Server Hardware, Server OS Reliability Report.
Interesting article. I was just comparing the Google Apps SLA to the Office 365 version, and I see that Microsoft offer 99.9% uptime with a financially backed SLA, whereas Google use a credit based system that is limited to 15 incidents per month and, as you suggested, you get extra days for free should your business grind to a halt!