Microsoft BPOS outages cause customers to question Exchange

The cloud-based suite experienced the issues as a result of 'malformed e-mail traffic,' according to Dave Thompson, corporate vice president of Microsoft Online Services

Microsoft's cloud-based Business Productivity Online Suite (BPOS) network underwent multiple outages, according to complaints from customers using the company's hosted e-mail service.

ZDNet noted hints that something was awry when it started receiving complaints from angry customers on May 12. Customers using Microsoft's BPOS services described the outages in posts to a Microsoft Online Services forum, ZDNet noted.

"Cause of #bpos mail delays mitigated; message queues are draining; watch the dashboard for updates," Microsoft stated in a post on Twitter this afternoon. A Microsoft spokeswoman confirmed the outages, saying the company will issue a formal statement.

The outages appear to have impacted Exchange Online, with some customers reporting message delays as well as lost e-mails. One Microsoft partner said in a phone interview that messages were delayed up to six hours.

"It's been resolved in that there are not currently delays but it's been in and out since [May 10]," the partner said. "I'm hoping that it [the outages] won't come back up again but I don't know for certain that it is permanently resolved."

Users of Exchange Online fumed on the online forum. "We're a worldwide corporation using this. If it doesn't improve, we may have to go back to in-house Exchange," said one poster.

Said another: "I migrated our company to Exchange Online from in-house Exchange 2003 last October, and I'm sorry to say that I regret everything I ever said about how this would be better. It has been far worse in terms of both performance and reliability. I hate to be so harsh, but I am deeply frustrated. ... We're actively looking at migration paths back to in-house e-mail."

The outages come as Microsoft is looking to promote the next generation of Exchange Online in its Office 365 suite, which was released for beta testing last month. According to the Microsoft spokeswoman, the outages "only affected BPOS and did not affect Office 365 whatsoever."

UPDATE: Late on May 12, Dave Thompson, corporate vice president of Microsoft Online Services, posted an explanation for the BPOS outages on the Microsoft Online Services Team Blog. According to Thompson, the intermittent access reported by BPOS users was the result of issues relating to "malformed email traffic" that occurred on May 10 and 12.

Thompson wrote:

"On Tuesday at 9:30am PDT, the BPOS-S Exchange service experienced an issue with one of the hub components due to malformed email traffic on the service. Exchange has the built-in capability to handle such traffic, but encountered an obscure case where that capability did not work correctly. The result was a growing backlog of email. By 12:00am PDT, the malformed traffic was isolated and the mail queues cleared. The delays encountered by customers varied, on the order of 6-9 hours. Short term mitigation was implemented and a fix was under development.

"At 9:10am PDT today, service monitoring again detected malformed email traffic on the service. The problem was resolved at 10:03am, but users experienced up to 45 minute email delays during this time. A second, but related issue was detected via monitoring at 11:35am PDT, resulting in email stuck in some end users' outboxes. The issue was remediated at 12:04pm PDT. During this time, more than 1.5 million messages had queued on the service awaiting delivery. The backlog was 90% clear by 4:12 PM, but because of this large backlog of email, customers may have experienced delays of as long as 3 hours. We are implementing a comprehensive fix to both problems."

Thompson's post also noted that an unrelated DNS issue on May 12 "prevented users from accessing Outlook Web Access hosted in the Americas, and partially impacted some functionality of Microsoft Outlook and Microsoft Exchange ActiveSync devices." That issue was resolved a few hours later, Thompson said.

About the Author

Jeffrey Schwartz is executive editor of Redmond Channel Partner and an editor-at-large at Redmond magazine, affiliate publications of Government Computer News.

Reader Comments

Wed, May 18, 2011 Walt Connery rim of fire

Clouds do not now, and will never, take the place of robust in-house storage solutions. Clouds are *back up*--not a primary working set. Companies hoping to save some steps and allow themselves laziness and complacency by depending on Clouds in exchange for a few dollars in "savings" are short-sighted and shooting themselves in the foot. As a person in IT in one form or another for the past 30 years I find that many younger professionals simply do not know this. To depend on Clouds to manage your most critical, day-to-day stuff--like email--is exactly as efficacious as rubbing a rabbit's foot and hoping for the best. With the momentous changes that have occurred in recent years pertaining to extremely robust, exceptionally reliable, and practically *dirt cheap* mediums of local storage (terabyte hard drives with self-back-up, error-correcting RAID, huge DVD storage platters at pennies per megabyte for nearly permanent storage (*so* much better than microfiche, for instance)), and the fact that all of this technology is priced so that John Doe @ Home can easily afford it--there's *no reason* a business, any business, should ever, ever get caught with its pants down simply because their "cloud" breaks down on one day or another. As opposed to a primary solution, the "cloud" should merely be one layer in a layer of redundancies designed to facilitate the successful *uninterrupted* flow of business on a daily basis. The cloud is there for the day the internal solutions malfunction; the internal solutions exist for the day the cloud ceases to function, etc. ad infinitum. This "cloud" marketing extravaganza reminds me of SUN's vaunted "Network Computer" paradigm decades ago. The idea was that the only thing people needed was a dumb terminal locally, connected to cloud storage via the Internet, where all the computing was done remotely with the data spit back to the originating dumb terminals. Bad idea then, bad idea now.
Nothing wrong with Cloud computing, of course, provided it isn't used as a substitute for local administration and prudence to the effect that if the cloud goes down so does the business.
