The Ups & Downs of a Successful Service

As many of you have noticed, during the last couple of weeks TypePad performance has not been what we aspire to and you pay for. While I am as displeased as you, at the current time I can do nothing more than apologize -- a weak sentiment without action to back it up. Ben has been in the midst of understanding and fixing the problems so I have asked him to write an update on the situation and tell you what we are doing to get back to the great service you have come to expect.

From Ben:

This has been a bad month for TypePad's performance and general availability, and I'd like to talk about a number of the issues we've faced, how frustrated they make us, and what we're doing about them.

For some background: TypePad-hosted blogs are, to say the least, incredibly popular, and their traffic is growing fast. We're currently pushing about 250 Mbps of traffic through our multiple network pipes, and that's growing by 10-20% each month. (If you're more familiar with bandwidth stated in terms of transfer allowances, that's a transfer rate of almost 3 TB (terabytes!) per day.) And because TypePad customers are so invested in their blogs, we see activity on the service, both reading and writing, that equals that of services with 100 times as many users as TypePad.
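For anyone who wants to check the arithmetic behind those figures, here is a minimal sketch. It assumes the 250 figure is a sustained average in megabits per second and uses decimal units (1 TB = 10^12 bytes). The function names and the 12-month projection are illustrative assumptions, not numbers or code from our operations team.

```python
SECONDS_PER_DAY = 86_400


def mbps_to_tb_per_day(mbps: float) -> float:
    """Convert a sustained rate in megabits/second to terabytes/day (decimal units)."""
    bits_per_day = mbps * 1_000_000 * SECONDS_PER_DAY
    bytes_per_day = bits_per_day / 8
    return bytes_per_day / 1e12


def project_monthly_growth(mbps: float, monthly_rate: float, months: int) -> float:
    """Compound a traffic rate forward, assuming a constant monthly growth rate."""
    return mbps * (1 + monthly_rate) ** months


if __name__ == "__main__":
    current = 250.0  # Mbps, the figure quoted above
    print(f"{mbps_to_tb_per_day(current):.2f} TB/day")

    # Hypothetical projection: the quoted 10-20% monthly range held for a year.
    for rate in (0.10, 0.20):
        future = project_monthly_growth(current, rate, 12)
        print(f"{rate:.0%}/month for 12 months -> {future:.0f} Mbps")
```

Growth compounds, so even the low end of that range roughly triples traffic within a year, which is why running out of space and power matters so much.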

Because of the growth of the service, we've been increasing our capacity steadily, but a few months ago the data center we are in ran out of space and power, limiting the amount of equipment we could add. After some shopping, we found a great new data center and have been building it out for over a month. We're currently in the middle of that move, and that's when the trouble started.

While a data center move generally adds some risk to running a service day-to-day, we could never have anticipated anything like the last couple of weeks. We've seen failures in our storage servers of a kind we had never seen before, and a failure in a piece of networking equipment that had never failed before, along with other problems ranging from hardware failures to software failures. After some analysis, we believe that all of these failures are related to the fast growth of our service, which has put heavy load on each box, and until the completion of the move we didn't have the capacity to add more boxes.

These failures and other issues have caused outages both in the reading and publishing of blogs on TypePad. We've also seen some sporadic application performance issues related to statistics, which we're working on solving this week.

Our operations team, along with the rest of the company, has been working day and night to understand and overcome these issues. And yet, as a user of the service myself, I can fully understand why that might not sound good enough. But we're committed to providing an amazing service for our customers, and we have been really, really striving to deliver that over the past month. We're working very hard to make November and beyond a better experience, and to get back to the quality of service that you've always received from TypePad.

Over the next week you should see significant improvement in performance as we get extra equipment online and finish moving data off heavily loaded servers. By the end of the move we will have five times the bandwidth we had before, as well as hundreds of thousands of dollars of new equipment, and room and power to add more equipment as needed.

We apologize for the poor service you've experienced over the past couple of weeks, and also for the lack of official communication on Mena's Corner or Everything TypePad. At the same time, I know that an apology sounds hollow until we've fixed these issues and the service is stable once again.

We're going to do a better job of giving you updates on our status as we work to improve the service. Thank you for your loyalty; we're working very hard to earn back your trust.