Effective Spam Filtering with Encrypted Email


One of the biggest challenges at ProtonMail is doing effective spam filtering with encrypted emails.

As many of you have noticed, spam filtering performance at ProtonMail has improved dramatically in recent weeks, because we recently deployed a series of updates to our filters. In this three-part series of posts, we will cover the spam challenges ProtonMail faces and explain in detail how to fight spam in the end-to-end encrypted world. ProtonMail’s spam challenges fall broadly into three categories.

  1. Incoming Spam
  2. Outgoing Spam
  3. Internal Spam

Incoming spam is spam sent to ProtonMail from third-party email providers, for example Hotmail. Outgoing spam is spam that spammers send from ProtonMail to third-party email providers. Finally, internal spam is ProtonMail accounts being used to spam other ProtonMail accounts. Each of these poses very different risks and challenges. For an encrypted email service provider with over 1 million users, spam is a continuous battle and one of the toughest challenges to overcome.

In this blog post, we will be discussing incoming spam filtering. In future blog posts, we will cover the challenges of preventing ProtonMail from being used by spammers and the challenges of doing spam filtering with end-to-end encrypted emails which we cannot read.

Incoming Spam

Incoming spam is not dangerous, but it can be a major inconvenience for users. It doesn’t just clog up user inboxes; if not handled efficiently, the sheer volume of spam that sometimes arrives at the ProtonMail servers can also cause performance problems.

Emails that come from third-party email providers obviously cannot be delivered with end-to-end encryption, but once they reach our mail servers, we encrypt them with the recipient’s public key before saving them. All of this is done in memory, so that by the time anything is permanently stored to disk, the email is already unreadable to us. This gives us a very limited window in which to perform spam filtering on incoming messages.

When an incoming message is received, it goes through the following filtering steps. The goal is to use less computationally expensive methods first, rejecting as much spam as possible before more expensive methods are used.

 

1. First, the IP address of the incoming SMTP server is checked against spam blacklists which contain IP addresses of servers we have previously received spam from. If we receive a hit, the message is rejected.

2. Secondly, the message is passed through our customized Bayesian filters, which mark suspicious messages as spam.

3. Next, we generate checksums of incoming messages and check them against a database of known spam messages. If there is a match, we mark the message as spam. The checksums are computed in such a way that they remain effective against mutating spam emails.

4. Afterwards, we also apply a few other anti-spam techniques which we cannot detail here for security reasons (see below).

5. Since email headers can be easily spoofed and abused, we also verify the authenticity (SPF, DKIM, and DMARC) of incoming emails to protect users. An email that fails DMARC is likely spoofed so it will be sent to the spam folder with a warning for our users.

6. Finally, user-specific spam rules are applied. These apply user-specified whitelists and blacklists to avoid false positives or to catch more spam messages.
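
To make the cheap-first ordering concrete, here is a minimal Python sketch of three of the steps above: the IP blacklist lookup, the checksum match, and the user-specific rules. It is an illustration under stated assumptions, not ProtonMail’s actual implementation. All names (`BLOCKED_IPS`, `KNOWN_SPAM_DIGESTS`, `UserRules`, `normalized_digest`, `classify`) are hypothetical, and the digest is a plain SHA-256 over normalized text, a simple stand-in for whatever mutation-resistant checksum is really used. The Bayesian, authentication, and undisclosed steps are left out.

```python
import hashlib
import re
from dataclasses import dataclass, field

# Hypothetical data stores; real blacklists and checksum databases are far larger.
BLOCKED_IPS = {"203.0.113.7"}
KNOWN_SPAM_DIGESTS = set()


@dataclass
class UserRules:
    """Per-user sender lists (hypothetical structure)."""
    whitelist: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)


def normalized_digest(body: str) -> str:
    """Hash the body after dropping case, digits, punctuation and extra spacing,
    so trivially mutated copies of one message share a digest."""
    canonical = re.sub(r"[^a-z]+", " ", body.lower()).strip()
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def classify(sender_ip: str, sender: str, body: str, rules: UserRules) -> str:
    # Step 1 (cheapest): a set lookup on the connecting server's IP address.
    if sender_ip in BLOCKED_IPS:
        return "reject"

    # Step 3 (more expensive): normalize and hash the body, then look it up.
    verdict = "inbox"
    if normalized_digest(body) in KNOWN_SPAM_DIGESTS:
        verdict = "spam"

    # Step 6 (last): user rules can rescue a false positive or catch more spam.
    if sender in rules.whitelist:
        verdict = "inbox"
    elif sender in rules.blacklist:
        verdict = "spam"
    return verdict
```

Because the blacklist lookup returns early, a message from a listed server never reaches the hashing step, which is the point of ordering the checks by cost.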

Over the past few months, we have optimized and improved many of the above components to achieve a 500% improvement in spam detection. In doing this, we learned a few lessons:

Security through Obscurity

Generally speaking, security through obscurity is not recommended. This is why ProtonMail is open source, with all of our front-end code open to inspection by the community. Security through obscurity is the antithesis of open source: it relies on the notion that the security of a system can be improved if attackers do not know how the system works. Generally, this is a bad approach. It is better to have a system so secure that even if attackers know how it works, they cannot bypass it. This is certainly the case with the PGP email encryption that ProtonMail utilizes.

However, one case where security through obscurity DOES work is fighting spammers. This is particularly the case when it comes to fighting outgoing spam, which we will discuss in next week’s installment. Fighting spam is like trying to hit a moving target: it requires constant adjustment and tuning, especially since the distinction between spam and non-spam messages can be unclear at times.

There simply isn’t any foolproof method for defeating spam. Thus, if spammers don’t know how we are blocking their messages, it makes it much more difficult for them to find a workaround. This is why we cannot publish detailed specs of how our spam filters work. It also means we cannot open source our backend server configs which contain our spam filter settings.

In terms of privacy and trust, there is little advantage in open sourcing the server configs, because even if the configs were released, there would be no way to guarantee that they are the configs actually running on our servers. On the other hand, releasing the backend configs would let spammers know exactly what they need to do to bypass our spam filters, which would put the entire ProtonMail community at risk.

Personalized Spam Filtering

Before designing our spam filtering system, we looked through months of spam reports from the community. What we quickly learned is that every user has a different definition of spam. What you consider to be spam won’t be the same as what your neighbor considers to be spam. Thus, it is impossible to define a single ruleset that works for everybody. This pushed us in the direction of personalized rulesets.

Today, every single ProtonMail account comes with its own spam filter settings which are unique to that account. When you mark messages as spam or not-spam, the filter will dynamically adjust to take into account your personal preferences. You can also view and modify your personal spam filter settings. ProtonMail also accounts for whether an email came from one of your contacts or not. If it comes from a contact, it is allowed through the spam filter.
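
As a rough illustration of how a per-account filter might adapt to spam and not-spam feedback, here is a small Python sketch. It assumes a word-count naive Bayes classifier with Laplace smoothing, which is our own choice for the example; the actual personalized filter may work quite differently. The class `PersonalSpamFilter` and its methods are hypothetical names.

```python
import math
from collections import Counter


class PersonalSpamFilter:
    """One instance per account: contacts bypass scoring, feedback updates counts."""

    def __init__(self, contacts=()):
        self.contacts = set(contacts)
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def mark(self, body: str, is_spam: bool) -> None:
        """Record the user's spam / not-spam decision for one message."""
        label = "spam" if is_spam else "ham"
        self.words[label].update(body.lower().split())
        self.messages[label] += 1

    def is_spam(self, sender: str, body: str) -> bool:
        # Mail from a known contact is allowed through without scoring.
        if sender in self.contacts:
            return False
        vocab = len(set(self.words["spam"]) | set(self.words["ham"])) or 1
        total = sum(self.messages.values())
        scores = {}
        for label in ("spam", "ham"):
            # Smoothed log prior plus smoothed log likelihood of each word.
            score = math.log((self.messages[label] + 1) / (total + 2))
            count = sum(self.words[label].values())
            for word in body.lower().split():
                score += math.log((self.words[label][word] + 1) / (count + vocab))
            scores[label] = score
        return scores["spam"] > scores["ham"]
```

Each account keeps its own counts, so two users who mark the same newsletter differently end up with different verdicts for it.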

The Best Spam Filter is You

ProtonMail has a comprehensive multi-tiered protection system to prevent spam from entering your inbox, but actually you are the best protection against spam. The vast majority of spam can be avoided by simply not giving your ProtonMail email address to unscrupulous websites which then resell that information to spammers. To learn more about how to avoid receiving spam in the first place, you can read our guide to avoiding spam here: https://protonmail.com/support/knowledge-base/avoid-spam/

 

 

About the Author

Admin

We are scientists, engineers, and developers drawn together by a shared vision of protecting civil liberties online. Ensuring online privacy and security are core values for the ProtonMail team, and we strive daily to protect your rights online.


17 comments on “Effective Spam Filtering with Encrypted Email”

  • The best weapon against spam would be aliases. If you don’t give your address, you are not spammed.

    Aliases let you detect who sold your e-mail address, disable an address once it is spammed, automate labelling…

    But I mean real aliases, not “+” aliases, which are never accepted and which make it easy to guess the original address.

      • Yes, that’s true.
        From my point of view, these aliases are interesting for professionals who want addresses like « contact@entreprisename.com », with a name which represents their company, or who want to distinguish internal from external communication.

        The limit on the number of aliases makes them less interesting for a standard user who just doesn’t want to give out his real address when he subscribes to a forum, creates an account, accesses content only available to signed-up users, etc.
        This idea https://protonmail.uservoice.com/forums/284483-feedback/suggestions/11627253-aliases would be more interesting for this goal.

        These are not really the same use, I think. In the first case, the alias just has a « representation » goal. In the second case, the goal is to be able to delete the alias if it gets spammed.

      • Why didn’t you add a feature for filtering on the recipient field?

        Like:
        From “spammer@mail.ru”
        To “innocentvictim+protectionagainstspammers@protonmail.com”.

        So, all you need is to mark all messages you receive with this “to-field”. I can’t do that now.

  • The problem with spam is that it comes because someone I once sent an email to gets their computer infected with malware that copies their contacts and then sends spam to all of them.

    I would prefer never to give my proton address to anyone. This could be done like so:

    When I send an email to a new person, my From address becomes 345h3458fd@protonmail.com, so if I ever get spam on that address, I can see who had their computer compromised and revoke that email address.

    With that method there is no need for spam filters.

  • I had an Amazon gift card get sent to my spam folder just today. Once I select “Not Spam”, does the system note the address to keep this from occurring again?

  • The spam filter does not seem to be effective. A few days ago I moved the last of my emails to ProtonMail, and now I receive about 10 – 15 spam emails daily (during working days only, US timezone) in my inbox, and it is slowly starting to get on my nerves. I have moved across about 10 email addresses with 3 custom domain names I have been using for a number of years with Google Apps. No matter how careful you are not to give out your email, over the years it adds up and you will receive spam.

    Thus I can say that Protonmail is no match to google’s spam filter. In a single month my SPAM folder was filled with approx 690 spam email from which about 25 – 25% was moved manually from my inbox.

    The interesting part is that some of the emails pretend to be sent from my own email address, so when they are moved to the spam folder, my own email address gets blacklisted.

    Any ideas what to do to avoid further spam?

    • I concur.
      I have the same experience coming from Gmail.
      The amount of spam under ProtonMail is unbearable. In my case I have to manually move over 50% of the spam e-mails to the spam folder – that is not acceptable!
