Cloud Reliability Stumbles

Some big names in the cloud world tripped up over the past few weeks, raising questions of how reliable cloud delivery of enterprise software actually is. How big a deal were these outages?

First to stumble was Amazon’s cloud services for business. Problems at a Virginia data center knocked several of Amazon’s cloud customers offline; companies reported problems including “being unable to access data, service interruptions and sites being shut down.” Is this a black eye for Amazon? Maybe not. Amazon offers several levels of service, including backup and remote recovery.

But another issue, Mr. Eastwood said, will be a re-examination of the contracts that cover cloud services — how much to pay for backup and recovery services, including paying extra for data centers in different locations. That is because the companies that were apparently hit hardest by the Amazon interruption were start-ups that, analysts said, are focused on moving fast in pursuit of growth, and less apt to pay for extensive backup and recovery services.

Clearly Amazon passes Data Center Management 101, even if some of its customers did not.

Next to stumble was Microsoft’s Business Productivity Online Services. Problems included periodic e-mail outages lasting several days. Apparently Microsoft did not provide backup and remote recovery, as users needed alternate e-mail accounts elsewhere (perhaps on-premises Outlook) to survive the interruptions.

Yes, Microsoft’s outage last week was annoying and potentially costly to paying customers. If you’re a current or prospective customer of Microsoft’s Business Productivity Online Services (BPOS), you’ll want to look carefully at how the company handled last week’s outages and what their response says about the long-term reliability of BPOS.

Definite shiner on Microsoft’s cloud credibility, and the level of customer service provided does not inspire confidence.

Third, Google crashed and burned during a maintenance upgrade. This is especially alarming because simple, cost-effective upgrades are a key selling point of cloud-based enterprise software featuring multi-tenant architectures. Some Google customers did not take advantage of a system option to manually back up data to another computer (an on-premises PC or Mac). Many customers have not seen their lost data restored even after 6 days.

A Blogger Service Disruption update contains four updates from the last 24 hours, starting with this one:
We have rolled back the maintenance release from last night and as a result, posts and comments from all users made after 7:37 am PDT on May 11, 2011 have been removed. Again, we apologize that this happened and our engineers are working hard to return Blogger to normal and restore your posts and comments.
That’s nearly 48 hours of downtime, and counting. Overnight updates promise “We’re making progress” and “We expect everything to be back to normal soon.”

Google definitely fails Data Center Management 101 and gets a black eye. Unbelievably, Google’s customer support during the outage was not only poor but hostile to customers.

Given these events, one would expect unaffected cloud vendors to issue press releases reassuring their own customers that in the case of a data center problem (local crash, maintenance / upgrade fail, network outage, etc.) the vendor cloud structure features server failover protection, remote hot site availability, full backup, upgrade failover protection, and so on. Aside from Amazon as noted above, the silence is deafening.
Instead we are treated to the hype machine shifting to a higher gear.

But the point I raise in my headline is a wider one. It’s about the capacity of cloud application vendors to constantly extend functionality — for all their customers at once — at a much faster rate than the on-premise vendors, who will always struggle to keep up.

[ed. that all at once thing sure worked for Google.] This conveniently ignores the lesson we should be learning: faster is only better if it is safer. Far too many cloud vendors are “start-ups focused on moving fast in pursuit of growth, and less apt to pay for extensive backup and recovery services.” Enterprises should beware.

At the end of the day, what an enterprise needs is an ERP or CRM solution that solves the business problems, is backed by a vendor that supports the business (not the software), manages data center risks including remote failover and full backup, and is affordable. Such systems are provided by customer-focused software vendors for whatever deployment option (on premises, cloud, or a hybrid) makes the most sense for your enterprise.
