
Sun, Apr 24, 2011

21st April 2011. The date the machines rise. The date the “Global Digital Defense Network” known as Skynet becomes sentient and takes over the world. Except Amazon had other ideas. It’s also the date Amazon’s cloud infrastructure went down for an extended period. When the machines rose it was a case of “all dressed up and nowhere to go”.

It also exposed some people to an uncomfortable public airing of their grievances. One company was monitoring patients’ ECG outputs using the cloud and was unable to fulfil its commitments to those patients. It took a bollocking for putting a system on which people’s lives depend in the cloud without any backup or failover strategy, but is that really the case? There are some real howlers in the extended exchange, from which Amazon support is noticeably missing. Gems like:

“mission critical systems should never be ran in the cloud”
“Just because AWS is HIPPA certified doesn’t mean it won’t go down for 48+ hours in a row”
“Well, it is supposed to be reliable”
“premium support its a freaking joke”
“we are really desparate”

I don’t know anything about the application concerned or whether it really is a life-saving service. But if you brought a patient into hospital and plugged them into an ECG monitor that had “99.95%” uptime and was “supposed to be reliable”, while the “support is a … joke”, you’d probably be taken to court. That seems to be the overriding message the company is getting from the “peanut gallery” observers.

But how fair are such comments? The company has three server IDs and Amazon provide three Availability Zones to each account. So if we assume the company has three systems in the cloud, one in each zone, that’s the equivalent of using three different data centres. It’s the equivalent of having two spare ECG monitors in the cupboard. The chance of three data centres going offline together for an extended period should be virtually zero. But Amazon managed to do just that.
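To put a rough number on “virtually zero”, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that the “99.95%” figure quoted above applies to each zone separately and that zones fail independently of one another. Neither assumption comes from Amazon, and the function names are my own invention.

```python
# Back-of-the-envelope sketch. ASSUMPTIONS (not from Amazon):
#   - each zone is up 99.95% of the time
#   - zones fail independently of each other
HOURS_PER_YEAR = 365 * 24


def joint_outage_probability(per_zone_availability: float, zones: int) -> float:
    """Probability that every one of `zones` independent zones is down at once."""
    return (1 - per_zone_availability) ** zones


def expected_downtime_hours(outage_probability: float) -> float:
    """Expected hours per year spent in the outage state."""
    return outage_probability * HOURS_PER_YEAR


if __name__ == "__main__":
    availability = 0.9995  # assumed per-zone figure
    for zones in (1, 2, 3):
        p = joint_outage_probability(availability, zones)
        hours = expected_downtime_hours(p)
        print(f"{zones} zone(s): {hours:.10f} hours of total outage per year")
```

Under those assumptions a single zone is down for a little over four hours a year, while all three being down together works out at well under a second. The catch is the independence assumption: the zones evidently shared enough plumbing to fall over together, and that is precisely what the sums on paper don’t capture.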

What are you meant to do if a respected cloud provider can’t provide reliable cloud services? The company in question might have been better served if they’d shelled out for premium support, although this is unlikely as there was no cloud to support. It had evaporated. Instead they were forced to wash their dirty laundry in public and reap the consequences of trusting statements on bits of paper about reliability and support.

Presumably, if you provide services upon which lives depend, you’d be well served by building redundancy into your cloud provider setup. But doesn’t that defeat the purpose of cloud computing? It’s meant to take the heavy lifting of hardware and software off your hands, and yet it’s not quite reliable enough for such services.

I don’t think this episode will really affect cloud computing. Amazon’s digital world shook and some people just realised they shouldn’t really be there.

As for the machines. Well, they just missed the bus.
