Downtime for August 11, 2014

UPDATE August 12, 2014: We have managed to recover most of the missing emails, and will work to restore them over the next couple of days.

On Monday, August 11, 2014, we took ProtonMail offline for upgrades, which we estimated would take just a few minutes. Part of Monday’s upgrade was to optimize our database table structure, which would speed up the site for all users.

Unfortunately, one of our developers typed a wrong command that corrupted the messages table, and we lost the messages from most of Sunday and the early part of Monday (about 20 hours). Needless to say, this was a terrible mistake, and we’re very sorry that it happened.

This seemingly benign update took a terrible turn because our standard operating procedure (SOP) wasn’t followed. Because a backup was not made prior to beginning database operations (as mandated by our SOP), we lost a day’s worth of user data. The ProtonMail team is growing right now, and unfortunately, we did not properly communicate our standard operating procedures to some of the new team members.
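
To illustrate the backup-first rule, here is a minimal sketch, assuming a SQLite database purely for demonstration. It is not meant to represent ProtonMail's actual database engine or tooling, and the function name `migrate_with_backup`, the table, and the file paths are all hypothetical. The idea is that the migration step cannot run unless a backup has been written and has passed a basic integrity check.

```python
import os
import sqlite3
import tempfile


def migrate_with_backup(db_path, backup_path, migration_sql):
    """Snapshot the database to backup_path, verify it, then run the migration."""
    src = sqlite3.connect(db_path)
    try:
        # Step 1: take the backup BEFORE touching the schema.
        dst = sqlite3.connect(backup_path)
        try:
            src.backup(dst)
            # Step 2: check that the copy is a readable, consistent database.
            status = dst.execute("PRAGMA integrity_check").fetchone()[0]
        finally:
            dst.close()
        if status != "ok":
            raise RuntimeError("backup failed integrity check; migration aborted")
        # Step 3: run the scripted (not hand-typed) migration.
        src.executescript(migration_sql)
    finally:
        src.close()


if __name__ == "__main__":
    # Hypothetical demo data in a throwaway directory.
    workdir = tempfile.mkdtemp()
    live = os.path.join(workdir, "live.db")
    backup = os.path.join(workdir, "pre_migration.db")

    conn = sqlite3.connect(live)
    conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO messages (body) VALUES ('hello')")
    conn.commit()
    conn.close()

    migrate_with_backup(
        live, backup, "CREATE INDEX idx_messages_body ON messages (body);"
    )
    print("backup written to", backup)
```

Because the backup and the migration live in one function, skipping the backup would require editing the script rather than simply forgetting a step.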

In the next few weeks we will be rolling out a more robust server architecture and deployment process that will not only speed up the site, but also increase reliability, reduce the need for downtime and prevent these sorts of errors from happening. We will also be increasing oversight of staff and centralizing operations to improve our internal communications and ensure our established procedures are always followed.

These procedures have helped ProtonMail avoid data loss and extended downtime up until now and, had they been followed, could have prevented today’s incident.

The road to bringing privacy back to email is a challenging one, but it’s all been possible thanks to your support. Over the coming weeks, we will be building in multiple safeguards to prevent data loss in the future. Thanks for bearing with us through our beta and understanding that human errors like this do happen.

About the Author

Jason Stockman

Jason is the Co-Founder of ProtonMail. He works on building ProtonMail's webmail interface and front-end encryption. Jason has 10+ years experience building websites and applications.

62 comments on “Downtime for August 11, 2014”

  • What about the emails that I received during the upgrade? Are they lost, or are the new emails I can now see in my inbox those emails? So is there any new/unopened email or not…

      • Hi guys. Not to worry. Keep strong and keep going. This is still beta and we all know it! Thanks for the privacy you are giving us.

    • Jason,
      Really disappointed – but I understand everything that comes with beta testing.

      For me, a very important email may or may not have been lost. I guess I’ll have to check with the sender.

      Could we get a discount on a shirt? Hell, I was an org. contributor! 😀

      *worth an ask!

  • what exactly do you mean “the messages table” and “losing messages?” what was lost exactly or do you just mean the site was non-functioning?

    • We lost around 20 hours’ worth of emails, as we had to restore from an older backup file which didn’t contain the most recent emails.

  • No worries, guys. It’s still early days for Protonmail and you have clearly stated warnings on your site that Protonmail is going to be buggy for a while, during these early stages of development. Gmail had lots of problems too, as I remember. It IS good to back things up, though! :p Godspeed!

  • I haven’t had any missing emails or corrupted messages. I am really glad to see you noticed, advised, fixed and admitted your mistake. Sure goes a long way in my book. Great work!

  • You’ve got to be kidding me. First of all, why are you “upgrading” during peak traffic hours on the US East coast?!

  • I think that everyone here understands that during the Beta stage of this project unexpected occurrences can and will take place. I appreciate your proactive response to this issue, and for being so open about it.

    I have been an exceptionally happy Donor and user of ProtonMail, and I am excited to see it grow and progress so rapidly. THANK YOU very much to every single person who is involved in this project!

    Let’s make it happen! Privacy = freedom. Do not let anyone else tell you otherwise.

  • Thanks to this, an important email was lost. Stranger still, when saving a half-typed reply, sending resulted in it disappearing. Odd, but not catastrophic. I simply did some online research, found the email, brewed some coffee and started typing.

    Thank you for the enjoyment of hot coffee in a relaxing environment. Your bugs are worth waiting out 😀

  • Hi Jason,

    No worries. To err is human. No one is perfect.

    Does it mean that during the 20-hour downtime incoming emails were rejected as “errors” and were not cached somewhere?

    If it was cached is there any way of knowing which emails were rejected during the 20 hour downtime if they don’t now come through gradually?

    Keep up the amazing work you guys are doing. Thank you.

    • Incoming emails were accepted and saved to the database, and they were present and readable. However, around 9:30 Eastern Time this morning, we had to roll back the database to the last backup, which meant around 20 hours of messages vanished. We are now working on recovering those last messages and will restore them if we manage to recover them from the corrupted database.

  • Thanks for the info, guys! S**T happens! We all appreciate the work you are doing to develop this secure solution. Looking forward to the next update.

  • Perhaps I jinxed it — I started using my protonmail for the first time this weekend. Fortunately, nothing important (per the warnings that this was a beta product), just some queries to merchants.

    • Unfortunately, it was a new employee not familiar with our SOP. We are working on communicating these standard practices to everybody on the team.

      • Try the Rickover nuclear power weirding-way. EVERYTHING is done by an approved written procedure, right-wrong-or-indifferent. As experience is gained the procedure is polished but never wrong.

        A particular radiochemical sampling was done three times per day, easily memorized but as easily hurried. An independent reader was/is required and a formalized dialogue of order-acknowledgement-execution.

  • The difference between you guys and the IRS is that you admit your mistake, are transparent about how it went south, take the flak for the obvious misstep and sincerely vow to do better.

    The IRS just invents a cockamamie story and hopes we’ll lose interest.

    Fight on.

  • You were clear that email is in beta and we shouldn’t use your system for emails that we can’t afford to lose. Appreciate you being up front about the problem, the cause and the solution.

    So sorry for whoever did the delete. I’ve brought down production systems before and it really sucks. Experience is the best teacher. Regrettably, experience is something you get right after you really needed it. 🙂

  • Hi Jason,

    In the big scheme of things this is nothing, so I do not understand all the whining that is going on here. It is beta; people should expect a little glitch now and then…
    I think you guys are doing a wonderful job, and it is only getting better, so I take this opportunity to thank you for the important job that you’re doing!

  • I sent a couple of unencrypted messages to outside recipients before the downtime. I cannot see them in my sent mail folder now. Were they actually delivered?

  • Good luck ProtonMail. We will support you. I suggest we send and receive emails on a test basis only for the next week or so!

    Keep up the amazing work you guys are doing. Thank you.

  • Err… what about backup? Before doing anything to a DB, we *have* to back up…

    Especially in Beta. Especially when users don’t have any way to do their own backups… New employee or not: backup, backup, backup. And it’s maybe not a good idea to put a newcomer on this kind of migration -.-.

    Not really good for the future, may I say. Commands should be written down, scripted and tested on DEV db before that. A production migration should never involve self-typed commands.

    Hopefully you’ll learn from this mistake.

  • Transaction logs might be very useful in times like these (between backups). But no doubt you have already considered this.

  • I think I am one of the very few people affected by this issue right after registering. I finished the registration and saw the welcome mail, but today I wasn’t able to log in.
    I did the registration again through the preinvite mail and everything worked smoothly.

  • The last time I checked this was still a beta project, so I appreciate the effort. Looking forward to the day when you guys feel comfortable slapping a $5/month charge (or whatever) on for “premium” service. My hope is that this will turn into a private alternative to paid GMail. (A few more features are needed, but that’s what I mean by “premium”.)

  • Thanks for the info on the ‘down time’. We all understand what ‘Beta’ means and I’m sure we are all grateful for your efforts to bring privacy to our emails. Keep going chaps!!!

  • Hey guys, you probably know this already, but I’m going to say it either way. Normally, database engines keep a log of the changes made to the database. This log exists for crash recovery, so that if the server crashes in the middle of an operation, that operation is concluded when it restarts. This log can also be used when there’s a human error. You take your last backup and the logs with the changes that happened after that backup up until right before the human error, and you start the database engine in recovery mode. The server will take the backup and start applying the changes in the log. That way you don’t lose the transactions that were applied to the database after the backup. I don’t know what database engine you are using, but try to find a configuration that allows continuous archiving of the Write Ahead Log (WAL). Your backup procedure should include them both: a “snapshot” of the database and the log.

  • Not sure how to write this. On one hand I’m interested in the recovery: was it possible? Has anyone seen some old mails yet? Are your routines updated? (Great learning experience) etc… On the other hand, since this is a Beta and all, we really should expect stuff like this and be happy to be part of the learning experience and try to help out as much as possible 🙂

    Keep up the great work!

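
The log-replay recovery that one commenter describes above (restore the last backup, then reapply the logged changes up to just before the mistake) can be illustrated with a toy model. This is only a sketch under stated assumptions: it uses an in-memory dictionary rather than a real database engine, and `LogRecord`, `recover`, and the sequence numbers are hypothetical names invented for the illustration. It says nothing about how ProtonMail actually recovered its messages.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LogRecord:
    seq: int          # position in the log (monotonically increasing)
    op: str           # "put" or "delete"
    key: str
    value: str = ""   # only used by "put"


def recover(snapshot: Dict[str, str], snapshot_seq: int,
            log: List[LogRecord], stop_before_seq: int) -> Dict[str, str]:
    """Rebuild state from a snapshot plus the log records written after it.

    Records with snapshot_seq < seq < stop_before_seq are replayed in order;
    the bad operation at stop_before_seq (and anything after it) is skipped.
    """
    state = dict(snapshot)  # never mutate the backup itself
    for rec in sorted(log, key=lambda r: r.seq):
        if rec.seq <= snapshot_seq:
            continue        # already contained in the snapshot
        if rec.seq >= stop_before_seq:
            break           # stop right before the human error
        if rec.op == "put":
            state[rec.key] = rec.value
        elif rec.op == "delete":
            state.pop(rec.key, None)
        else:
            raise ValueError(f"unknown log operation: {rec.op!r}")
    return state


if __name__ == "__main__":
    backup = {"msg1": "Sunday morning email"}
    log = [
        LogRecord(1, "put", "msg1", "Sunday morning email"),
        LogRecord(2, "put", "msg2", "Sunday evening email"),
        LogRecord(3, "put", "msg3", "Monday morning email"),
        LogRecord(4, "delete", "msg1"),  # stands in for the mistaken command
    ]
    # Replays records 2 and 3 on top of the backup and stops before record 4.
    print(recover(backup, snapshot_seq=1, log=log, stop_before_seq=4))
```

The point of the sketch is that a backup alone restores the state as of the snapshot, while a backup plus an archived log can restore the state as of any point between the snapshot and the error.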