Microsoft admits network outages violated service agreements
- By Kurt Mackie
- Sep 13, 2010
Microsoft this week disclosed that its Business Productivity Online
Services (BPOS) had three service outages that affected BPOS customers
in North America in August and September.
Morgan Cole, a Microsoft employee, admitted in a blog post
that BPOS had service outages on Aug. 23, as well as on Sept. 3 and
Sept. 7. All of the outages were associated with a Microsoft network
upgrade effort, which initially knocked out the service for two hours
on Aug. 23. A fix led to further trouble in September, affecting the
"sign-in service and administrative portals," Cole explained.
On Sept. 7, Microsoft had a problem with BPOS that had "more
widespread customer impact, although the duration was relatively
short," Cole stated, without explaining the nature of the problem.
Microsoft is currently monitoring this situation after isolating
"suspect traffic," according to the blog post.
A comment in that blog post by "Jim Glynn" indicated that Microsoft
has credited some of its customers affected by the Aug. 23 BPOS outage.
However, Glynn noted that customers should contact their BPOS
representative to request the compensation Microsoft offers for not
meeting its service level agreement (SLA).
Uptime is a prime consideration for organizations using
software-as-a-service (SaaS) applications instead of the more
traditional customer premises-installed solutions. Microsoft's "all-in"
organizational move to the cloud, providing services to businesses and
organizations, hangs on meeting its SLAs. However, those agreements
don't change the fact that businesses
using hosted applications will be dependent on external infrastructure
that they do not control.
Compensation for missing the SLA may not cover the costs of lost
business time, but paying it is common practice among service
providers, according to Robert Mahowald, research vice president for
SaaS and cloud services at the IDC research and consulting firm.
"Web applications rely on access to the Internet, which of course
adds another potential weak link in the chain of getting access to
information and functionality," Mahowald explained in a phone interview.
"But it's pretty much common practice for SaaS providers to guarantee
'three nines' of uptime ... which is about 28 hours a year in which they
will not be accessible. Most of that is supposed to be scheduled
downtime. Essentially, it's pretty much common practice for providers
to pay service credits in recompense for the lost opportunity and to
not pay any monetary fine."
The two-hour outage on Aug. 23 appears to have violated Microsoft's
SLA guarantee for BPOS applications. BPOS is expected to be available
99.9 percent of the time per month, or as Microsoft's FAQ specifies:
"Microsoft provides a 99.9 percent uptime Service Level Agreement for
Exchange Online, SharePoint Online, Office Live Meeting and Office
Communications Online."
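To see why a two-hour outage would breach a 99.9 percent monthly guarantee, the arithmetic can be sketched as below. This is an illustration only: the function names are hypothetical, and it assumes the full calendar month counts toward availability and that the whole outage counts as downtime. Microsoft's actual SLA defines its own measurement rules and exclusions, which are not reproduced here.

```python
def allowed_downtime_minutes(uptime_percent: float, days_in_month: int) -> float:
    """Minutes of downtime a monthly uptime target permits (assumes every minute counts)."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


def monthly_uptime_percent(outage_minutes: float, days_in_month: int) -> float:
    """Uptime percentage for a month with the given total outage time."""
    total_minutes = days_in_month * 24 * 60
    return 100 * (1 - outage_minutes / total_minutes)


if __name__ == "__main__":
    # August has 31 days; the Aug. 23 outage lasted about two hours per the article.
    budget = allowed_downtime_minutes(99.9, 31)
    uptime = monthly_uptime_percent(120, 31)
    print(f"Downtime allowed at 99.9 percent: {budget:.1f} minutes")
    print(f"Uptime after a 120-minute outage: {uptime:.2f} percent")
```

Under these assumptions, a 31-day month allows roughly 45 minutes of downtime at 99.9 percent, so a 120-minute outage would leave availability at about 99.7 percent for the month.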
BPOS users are credited based on a calculation of the monthly uptime percentage, according to Microsoft's Exchange Online SLA document.
If the service availability dips below 99.9 percent, then the service
credit is 25 percent of the monthly service fees. If it dips below 99
percent, Microsoft pays out 50 percent of the monthly service fees.
Lastly, Microsoft pays the full monthly service fee if the service
availability dips below 95 percent.
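The credit tiers described above can be expressed as a short lookup. This is a sketch based only on the thresholds as reported in this article; the function name is hypothetical, and the SLA document itself governs how uptime is measured and how credits are claimed.

```python
def service_credit_percent(monthly_uptime_percent: float) -> int:
    """Share of the monthly service fee credited, per the tiers reported in the article."""
    if monthly_uptime_percent < 95:
        return 100  # full monthly fee
    if monthly_uptime_percent < 99:
        return 50
    if monthly_uptime_percent < 99.9:
        return 25
    return 0  # SLA met, no credit


if __name__ == "__main__":
    # Illustrative uptime values, not figures published by Microsoft.
    for uptime in (99.95, 99.7, 98.5, 94.0):
        print(f"{uptime} percent uptime -> {service_credit_percent(uptime)} percent credit")
```

By this reading, a month in which availability lands between 99 and 99.9 percent would fall in the 25 percent tier.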
Microsoft informs its BPOS customers and the public about BPOS uptime problems via a "Microsoft Online Service Notifications" RSS feed.
According to that feed, Microsoft restored services on Sept. 7 for
multiple applications, including Exchange Online, SharePoint Online,
Office Live Meeting, Office Communications Online, plus a few others.
Microsoft had planned to conduct maintenance on some of its BPOS
services on Sept. 11 in its North American data centers. However, the
company has now postponed its network upgrade plans, according to the
RSS feed.
Mahowald wasn't aware of any disaster scenarios for SaaS providers,
but the prospect is "bound to happen," especially for educational
institutions and governments that may have outsourced important
operations by relying on SaaS. In such cases, SLAs will become even more
important.
"It's an incredibly important issue to understand. It's no longer
about simply saying, on a functional basis, 'does your application do
what mine does and what's the price,'" Mahowald said. "I think
understanding the SLA behind it and actually having some teeth in the
SLA is going to become an even more important distinction than it is
right now — perhaps more important than price as you go up the chain
with mission-critical applications."