Postmortem on the incoming email delay issue on January 24th and 25th, 2018

Last Thursday morning, January 24th, and again on Friday morning, January 25th, incoming emails to all Helpmonks mailboxes were delayed: by two hours on the 24th and by up to four hours on the 25th.

The root cause was a recent update in which the “message id” of an email (each email has a unique message id) and the reply-to-id were not set correctly. This, in turn, caused messages from another Helpmonks mailbox to be accepted, and if that mailbox had an auto-reply, it created an endless loop for some emails.
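To illustrate the kind of check that guards against such a loop, here is a minimal sketch in Python. It is not our production code. The function names `is_auto_generated` and `should_auto_reply` and the in-memory `seen_message_ids` set are hypothetical; the `Auto-Submitted` header comes from RFC 3834, and `Precedence` is a widely used but non-standard hint.

```python
from email import message_from_string
from email.message import Message


def is_auto_generated(msg: Message) -> bool:
    """Return True if the headers mark the message as automatically generated."""
    # RFC 3834: any Auto-Submitted value other than "no" means an automatic message.
    # The value may carry parameters after a semicolon, so only the first token counts.
    auto_submitted = (msg.get("Auto-Submitted") or "no").split(";")[0].strip().lower()
    if auto_submitted != "no":
        return True
    # Non-standard, but commonly set by mailing lists and bulk senders.
    precedence = (msg.get("Precedence") or "").strip().lower()
    return precedence in {"bulk", "junk", "list"}


def should_auto_reply(msg: Message, seen_message_ids: set) -> bool:
    """Decide whether an auto-reply may be sent for this message (hypothetical policy)."""
    message_id = (msg.get("Message-ID") or "").strip()
    if not message_id:
        # Without a message id we cannot detect repeats, so stay silent.
        return False
    if is_auto_generated(msg):
        # Never answer an automatic message with another automatic message.
        return False
    if message_id in seen_message_ids:
        # The same message came back to us: do not reply a second time.
        return False
    seen_message_ids.add(message_id)
    return True


# Example: a normal message gets one reply, an auto-reply gets none.
seen = set()
normal = message_from_string(
    "From: a@example.com\nTo: b@example.com\n"
    "Message-ID: <1@example.com>\nSubject: Hello\n\nHi there\n"
)
automatic = message_from_string(
    "From: b@example.com\nTo: a@example.com\n"
    "Message-ID: <2@example.com>\nAuto-Submitted: auto-replied\n\nOut of office\n"
)
first_decision = should_auto_reply(normal, seen)
repeat_decision = should_auto_reply(normal, seen)
auto_decision = should_auto_reply(automatic, seen)
```

A check like this only works when the message id is set correctly on every outgoing email, which is exactly the part the update got wrong.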

The endless loop created thousands of emails that came back to our mail server. Because these emails grew in size with every round trip, they brought our parsing engine to its knees. In addition, our search server started to index all those massive emails, causing a backlog in searches and unplanned overhead on our database.
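One common way to keep a loop like this from turning into thousands of emails is a simple circuit breaker that limits how many messages a single sender and recipient pair may exchange within a short window. The sketch below is an illustration under assumptions, not a description of our system: the `LoopGuard` class, its default limits, and the in-memory storage are all hypothetical.

```python
import time
from collections import defaultdict, deque


class LoopGuard:
    """Sliding-window limit on messages per (sender, recipient) pair (hypothetical)."""

    def __init__(self, max_messages=5, window_seconds=60.0):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # pair -> timestamps of accepted messages

    def allow(self, sender, recipient, now=None):
        """Return True if the message may be processed, False if the pair is over the limit."""
        now = time.monotonic() if now is None else now
        events = self._events[(sender.lower(), recipient.lower())]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_messages:
            return False
        events.append(now)
        return True


# Example with explicit timestamps: two messages allowed per 10 seconds.
guard = LoopGuard(max_messages=2, window_seconds=10.0)
decisions = [
    guard.allow("a@example.com", "b@example.com", now=0.0),
    guard.allow("a@example.com", "b@example.com", now=1.0),
    guard.allow("a@example.com", "b@example.com", now=2.0),   # over the limit
    guard.allow("a@example.com", "b@example.com", now=11.0),  # window has passed
]
```

Messages rejected by such a guard would be held for review rather than dropped, so a false positive delays an email instead of losing it.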

We had been actively testing and working on this update for several days and felt confident rolling it out to all of our customers. However, we failed to account for how internal emails with auto-replies need to be parsed.

In hindsight, it is clear to us that we could have avoided the problem. It’s bad enough that our customers had to experience a significant delay, but knowing that this could have been prevented is challenging to accept. I cannot express my apologies deeply enough.

Even though we had issues relaying emails to our customers promptly, we did not lose a single email, as all emails are automatically stored on separate incoming servers that are set up independently. That was another major upgrade we carried out in recent months.

We work hard every day to bring you the best team email collaboration platform. We have a weekly uptime record of 99.98%, and in some weeks even 100% uptime. The delay in delivering emails to you does not live up to our standards, and we will review this incident and discuss what we can do better to prevent it from happening again.

I want to thank our support team, who responded to each customer individually, and our customers, who, despite the issue and the inconvenience it caused, showed their understanding and encouragement.

If you have any questions, or if we can help in any way, please reach out to us. Thank you.