Modern anti-spam and E2E crypto

(moderncrypto.org)

398 points · timmclean · 11y ago · 137 comments

108 comments · 28 top-level

ch11y ago· 11 in thread

Couldn't some form of proof-of-work system be used to increase the cost of sending a message without it having much of an economic impact on a casual sender? Was that what he was alluding to with the "burning bitcoin" reference?

diafygi11y ago

Funnily enough, proof-of-work was originally invented to combat spam and denial of service attacks[1][2]. I asked that in a reply, and it seems that the large gap between server compute power and mobile compute power would make a reasonably taxing proof-of-work system too costly for mobile phones[3].

[1] - https://en.wikipedia.org/wiki/Hashcash

[2] - http://hashcash.org/papers/pvp.pdf

[3] - https://moderncrypto.org/mail-archive/messaging/2014/000782....
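For readers who have not seen one, a Hashcash-style proof-of-work can be sketched roughly as below. This is a simplified illustration, not the real Hashcash stamp format: the `resource:counter` layout, the use of SHA-256, and the function names are all assumptions made for the example. The point it shows is the asymmetry: minting takes many hashes on average, verifying takes one.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # For a non-zero byte, bit_length() tells us how many leading zeros it has.
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(resource: str, difficulty_bits: int) -> str:
    """Try counters until the stamp's hash has enough leading zero bits."""
    for counter in count():
        stamp = f"{resource}:{counter}"  # hypothetical stamp layout
        digest = hashlib.sha256(stamp.encode()).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return stamp


def verify(stamp: str, resource: str, difficulty_bits: int) -> bool:
    """Checking a stamp costs one hash, however long minting took."""
    if not stamp.startswith(resource + ":"):
        return False
    digest = hashlib.sha256(stamp.encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits


stamp = mint("alice@example.com", 12)  # roughly 2**12 hashes on average
print(verify(stamp, "alice@example.com", 12))
```

Each extra difficulty bit doubles the expected minting work, which is the knob the phone-versus-server argument in this thread is about.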

drfuchs11y ago

Just to be clear, "Pricing via Processing or Combatting Junk Mail" (Crypto 1992), by Dwork and Naor, invented the idea of Proof Of Work, suggested it for fighting Spam (before the term was even coined!), provided actual functions, and more. This predates HashCash by half a decade. [Edited to de-obfuscate.]

marcell11y ago

I've been working on a system that implements this, called "bitnet".

The anti-spam system is basically what you said: you spend a small amount of bitcoin, and the server grants you some large number of "tokens". These tokens can be used to perform actions which use resources on the server, either storing messages, or getting messages. The exact price is configurable, and currently set low [1].

There is a second mechanism that I use, which is intended to lower costs for the average user, but still prevent spam. Essentially, in each bitcoin transaction, a small "transaction fee" is required, which goes to the miner. It is possible to prove, via a cryptographic signature, that you were the originator of a transaction, and therefore the person who spent that transaction fee. The bitnet system grants tokens to people who can prove they spent money on transaction fees [2]. The idea being, a typical bitcoin user will accumulate transaction fees anyways, but a spammer will have to go out of his way to send bulk messages.

[1] https://github.com/ortutay/bitnet/blob/master/bitnet.go#L50

[2] https://github.com/ortutay/bitnet/blob/master/bitnet.go#L38

ams611011y ago

Heh. Bitnet was an early (1980s) point-to-point network built out of leased lines between educational institutions.

http://en.wikipedia.org/wiki/BITNET

zrm11y ago

> Couldn't some form of proof-of-work system be used to increase the cost of sending a message without it having much of an economic impact on a casual sender? Was that what he was alluding to with the "burning bitcoin" reference?

The idea seems to be that you provably "burn" a small amount of Bitcoin to get an identity. An innocent person can then carry on using that identity forever without doing any more computation. Meanwhile a spammer will ruin that identity's reputation almost immediately and then have to pay again to get another one.

chii11y ago

so the war's front moves to botnets - where the spammer first installs malware/virus to take over a user's machine, in order to send email _from_ that user's identity. Given that a user's machine is more easily compromised, the burned identity wouldn't cost more than current botnet acquisition.

bradleyjg11y ago

The problem is smartphones don't have much processing power, and what they do have runs on batteries. So the proofs of work are either trivial for servers to generate en masse or prohibitive for smartphones.

There's also botnets, whose resources hijackers are happy to exploit.

ch11y ago

Sure but certainly one could still make it costly enough to prohibit mass sending but not too costly to drain a cell battery under casual use. Or am I underestimating the level of complexity needed for a viable proof of work?

pmorici11y ago

https://en.bitcoin.it/wiki/Proof_of_burn

OpenBazaar, a BitTorrent-like p2p version of eBay, uses proof of burn to make bad behavior expensive.

awt11y ago

AKA Bitmessage

ch11y ago

Yeah like bitmessage, but the network overhead (all users see all messages) of bitmessage might be prohibitive for battery powered clients, no?

petercooper11y ago· 10 in thread

It's amazing how little sender reputation can count for with Gmail in the face of other features, however. I have a good reputation as a sender but also send almost a million mails a month and I spend a lot of time investigating oddities in Gmail deliverability.

All of my mails are newsletters containing 10-30 links, and more than once I've found the mere inclusion of a single link to a certain domain can get something into spam versus a version without that link, often with no clear reason why (domains that are particularly new are one marker, though). Or.. how about using a Unicode 'tick' symbol in a mail? That can get a reputable sender into Spam versus a version without the same single character (all double tested against a clean, new Gmail account) :-) Or how about if you have a link title that includes both ALL CAPS words and ! anywhere? Your risk goes up a good bit, but just go with one of them, you're fine..

I now have a playbook based around numerous findings like this, some based on gut feelings looking at the results and some truly proven, and even with my solid reputation as a sender, I'm having to negotiate a lot content-wise each week. But do I like it? Yeah, in a way, because it's also what stops everyone else being a success at it.. Gmail sets the bar high! :-)

(Oh, a bonus one.. include a graphic over a certain size? Your chance of ending up in the Promotions folder just leapt up. Remove it, you're good. It doesn't seem to be swayed much by actual content. So I've stopped using images where at all possible now and open rates stay up because of it.)

higherpurpose11y ago

I think the author has developed too much the kind of thinking he needed to fight spam at regular e-mail companies.

I think there could be multiple, relatively easy methods to avoid encrypted spam.

Someone here already suggested the first email being a "poke". And only if you send a poke back, would that user be allowed to send you an e-mail.

The user could also have some description about him, from his profile, appear when you hover over his profile image or whatever. If you receive an e-mail say from a company you're expecting to receive email from, then you could poke back, so they can send you that email. I mean there should be ways to make it easy for people to know who's a total stranger that could be a spammer, or someone trying to reach out to them for good reasons.

Then you could also have the emails under different labels by default. All the trusted e-mails would come to the regular Inbox, while the rest will go under a different label.

As you said, the email provider could also see the user's reputation over time, and if he's a spammer or not.

And these are just some easy solutions we can come up with almost immediately. I'm sure there can be others with a little bit more thought put into it. I certainly don't see encrypted email as some kind of "doomsday scenario" like the author predicts in the post.

mike_hearn11y ago

Actually the "poke" method would work and I suggested it on a different thread on that mailing list. It's the S/MIME model although these days you'd just stick an ECC key into a header and sign it with DKIM, then upgrade the clients. Doesn't have to be technically complicated.

There are at least three major downsides:

1) You still leak lots of metadata and the full data of the poke including most obviously the subject line.

2) Do users understand that their spangly new "encrypted mail" actually fails to protect a lot of important data? What if they (gasp) came to rely on it? I'd want to see usability studies showing a clear understanding of what is protected and what isn't.

3) You break other features that rely on the server being able to see content, like search, and the ads that pay for all of this.
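The poke flow discussed here can be sketched as recipient-side state. This is only an illustration under assumptions: the `Mailbox` class and its method names are hypothetical, and it leaves out the key exchange and DKIM signing mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    """Hypothetical recipient-side state for a poke-based allowlist."""
    owner: str
    allowed: set = field(default_factory=set)        # senders the owner poked back
    pending_pokes: set = field(default_factory=set)  # pokes awaiting a decision
    inbox: list = field(default_factory=list)

    def receive_poke(self, sender: str) -> None:
        # A poke carries no body; it only asks for permission to write.
        if sender not in self.allowed:
            self.pending_pokes.add(sender)

    def poke_back(self, sender: str) -> bool:
        # The owner approves a pending poke, which allowlists the sender.
        if sender not in self.pending_pokes:
            return False
        self.pending_pokes.discard(sender)
        self.allowed.add(sender)
        return True

    def deliver(self, sender: str, body: str) -> bool:
        # Full messages are accepted only from allowlisted senders.
        if sender not in self.allowed:
            return False
        self.inbox.append((sender, body))
        return True


box = Mailbox(owner="bob@example.com")
box.receive_poke("alice@example.com")
box.poke_back("alice@example.com")
print(box.deliver("alice@example.com", "hello"))
```

Note that the poke itself is still unsolicited, so whatever it carries (sender address, subject line) is exactly the metadata that downside 1 refers to.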

petercooper11y ago

Google does a little of this already although the mechanism is not as direct as your version. E-mail from certain senders gains "importance" based on your interactions with that sender, such as if you'd first sent a mail to that address, if you'd ever replied to that address, if you open a certain amount of mail from that address, etc. Mail from senders considered "important" is then more likely to hit your inbox.

It seems to work reasonably well, although there are some interesting ways you can game it. One I learnt from the Internet marketing world was some list builders (using legit methods, but perhaps promoting things that often get caught by spam filters) hire people or implement techniques to encourage new list signups to reply to mails sent from the same address as the list by asking them questions, etc.

chii11y ago

This mechanism would cause so much phishing that a whole new type of war would begin based on it.

DanBC11y ago

How are pokes different to regular challenge response whitelists, and how does poking avoid the problems of CR?

haroldp11y ago

Gmail deliverability and rendering is the new IE6.

TeMPOraL11y ago

I'd say GMail deliverability is the new SEO - as long as you're honest, kind, and don't try to cheat/abuse people, you'll be fine.

RadioactiveMan11y ago

Can you share that playbook? In particular the truly proven findings?

petercooper11y ago

I need to codify it as it's just notes and numbers scattered across experiments for now, but it's something I plan to do as I want to blog about each example (along with all of the other weird things I've learnt in the e-mail business so far).

I realized I should add a note, however, that everything I've said only applies to bulk e-mail (and sent through systems with a reputation for such) and not transactional or manual e-mail which suffers from fewer oddities for obvious reasons.

baudehlo11y ago

You think spammers aren't going to read it too?

bilalhusain11y ago· 8 in thread

I wish Google provided an API to lookup a sender's reputation so that even a locally deployed spam filter could use the information.

JohnTHaller11y ago

It wouldn't be in Google's best interest to do so. First, it could be used by spammers to lookup their own reputation for a given IP or domain before deciding whether to move on. Second, it would enable competitors to use some of Google's hard work in their products.

serf11y ago

a central authority for email reputation is one of the ideas put out by Mike during his presentation at RIPE, not that his ideas are Google's.

https://ripe64.ripe.net/archives/video/25/

dpweb11y ago

Wait! Maybe not so fast dismissing this.. G already does this, with web search. It ranks reputations and it's publicly available.

Trying to game the system in web search is a huge thing, cause there's a big reward. But G fights it pretty well.

Not such a bad idea for G to publicly rank mail senders.

Someone123411y ago

Too easy for spammers to utilise to game the system. Just make a new domain (reputation 0) then keep on sending different emails to see how the reputation changes. The linked reply kind of covers this (e.g. security through obscurity).

ill0gicity11y ago

I don't believe Google wants to turn GMail into a data broker service, especially one that powerful. Why? Besides the obvious (spammers use it to learn how to beat the system, and other companies use it to compete with GMail) you'll be creating an insane amount of workload in having to deal with all the complaints. The current system of "spam is what our users say it is" leaves very little for debate. As someone who's run large email systems this appeals to my laziness.

rkuykendall-com11y ago

I wonder if Google uses "shadowbanning." If a spam account emails another spam account, is the email filtered?

I know that's probably not why it's private, but it got me thinking.

crazypyro11y ago

They use shadowbanning on newly created accounts at the very least. If you don't have an active token on gmail creation (this is referenced in the OP by the randomized javascript), your account gets tagged and wiped in banwaves.

superuser211y ago

GMail's size and centralization is its competitive advantage. It's not in Google's interests for you to benefit from its data mining efforts without also contributing to them.

loup-vaillant11y ago· 7 in thread

> Botnets appeared as a way to get around RBLs, and in response spam fighters mapped out the internet to create a "policy block list" - ranges of IPs that were assigned to residential connections and thus should not be sending any email at all.

So basically, I can't send email from home? This is… unfortunate. If we want freedom, we need decentralization, and this kills it.
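Mechanically, a policy block list check is just a range lookup on the connecting IP. A minimal sketch, assuming a hypothetical list: the ranges below are reserved documentation networks standing in for real residential allocations, and `is_policy_blocked` is a made-up name.

```python
import ipaddress

# Hypothetical policy block list. These are reserved documentation ranges,
# used here only as stand-ins for real residential allocations.
POLICY_BLOCK_LIST = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_policy_blocked(client_ip: str) -> bool:
    """Return True if the connecting IP falls inside any listed range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in POLICY_BLOCK_LIST)


print(is_policy_blocked("192.0.2.55"))   # inside a listed range
print(is_policy_blocked("203.0.113.9"))  # not listed
```

The check looks only at the source address, not at the message, which is why a legitimate home server is treated the same as a botnet node in the same range.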

Spearchucker11y ago

Hey I've no idea of anything, and don't have a dog in this fight either, but reading everything here I think decentralization is actually the answer. Yes, you'd not get that global reach, and your network of contacts would be severely limited (presumably to those you know). But a decentralised system could do E2E and P2P and run from home. Running that beside traditional, clear-text and consequently spam-proof email strikes me as a sensible balance.

One (open, centralized) system for global comms. Another (closed, decentralized) for secure comms. Maybe even more than one "another", if contexts require different audiences.

loup-vaillant11y ago

I'd rather not have most of my communications (even mundane ones) read by a third party that also reads everything else. Gmail is bad enough, but this is a really Big Brother.

anon411y ago

You can, but it goes in the spam folder. I receive mails from smartmon and mdadm and have just marked my IP as "definitely not spam, you guys".

loup-vaillant11y ago

Many corporate email systems don't have personal spam folders. They just redirect suspect email to /dev/null, or otherwise make it inaccessible. It has been a problem for me in the past, and I don't even send my email from home.

And there's the case of looking for work. I'm somewhat proud of showing my personalized domain name, but if using something other than a huge webmail can cause it to fall into a spam folder… Fortunately that has yet to happen, but this is one of the reasons I hesitate to switch from remote SSH to a physical server at home.

baudehlo11y ago

You can't send unauthenticated (anonymous) email from home machines, at least not directly. It's not really as bad as it sounds.

zokier11y ago

Afaik outgoing port 25 (ie smtp) is blocked in large portion of residential connections. And it is a very good thing in the current state of affairs.

loup-vaillant11y ago

It forces me to trust a remote host. Not good.

Email is supposed to go from the sender's machine to the receiver's machine. That's how it should work by default, that's how TLS connections make the communication vaguely secure, and that's how it makes it difficult for powerful third parties to have a peek at everyone's communications.

As far as I know, Gmail accounts are only a subpoena away from the US government. But a sheeva plug (or R-Pi) hosted at my home? They need a warrant. Even for countries that don't need warrants, wire-tapping everyone is expensive: it must be done one home at a time.

Now maybe the botnet situation is so bad that it is worth sacrificing our ability to send e-mail. Still, this strikes me as the wrong solution. Blocking outgoing 25 by default is fine, but we need to be able to lift the restriction if we want.

beloch11y ago· 6 in thread

I'm not too knowledgeable about this stuff, but would it work if end-to-end encryption was only initiated after the first time somebody replies to an address? e.g. If somebody contacts you for the first time, they lack your public key (and/or a shared secret for authentication) and must send you plaintext. Then, if you reply, you automatically provide them with your public key and/or authentication info to send you encrypted messages in the future. Thus, most spam would be in plain-text, anyone who knows how the system works would avoid discussing sensitive info in the first email they send somebody, and everybody else wouldn't know the difference.

thefreeman11y ago

Not a bad idea.

One issue I could see though is the initial email would essentially devolve into a "poke". Nobody would bother writing anything in it, which would mean the spam filters would have nothing to filter on.

y0ghur7_xxx11y ago

>One issue I could see though is the initial email would essentially devolve into a "poke". Nobody would bother writing anything in it, which would mean the spam filters would have nothing to filter on.

that is a good thing: if the first message contains something else that is not just "poke", it's spam.

superuser211y ago

MITMing the message with the public key attached would be pretty straightforward and impossible to catch without verification over some other secure channel

macrael11y ago

This would be solved by public key encrypting and signing both sides of those messages. Nothing stops people from sharing your public key, so you could develop some kind of one off token for everyone instead, that way you can kill those tokens after a time.

click17011y ago

While what you say is true, it would at least be a step in the right direction towards better privacy.

It would provide protection against passive snooping (NSA/GCHQ) even if it wouldn't prevent active attacks.

Edit: Typo

alexjeffrey11y ago

in my mind, the first email would be encrypted using a public address obtained by asking for the key from the receiver's domain's server, or otherwise leveraging the DNS for the receiver's mailserver.

thaumaturgy11y ago· 6 in thread

Well this is pretty neat.

I've been working on custom software to improve the spam filtering on my mail server for the last year (side project). It currently works by letting hosted users forward spam messages to a flytrap account, and then the daemon runs, reads the forwarded message, tracks down the original in the user's mail directory, does a whois on the origin in the mail headers, consults its logs, and then adds a temporary network-wide blackhole to iptables.

Originally it was intended to work alongside SpamAssassin and SQLGrey and all that, but last night I started considering replacing SpamAssassin altogether. I love SA, but the spammers are beating it regularly now. My TODO notes in the code actually say, "reputation tracking for embedded URLs, domains, ccTLDs and gTLDs, sender addresses, and content keywords." I wrote the first bits of code for reputation tracking this morning.

It's not much of a step for the software really, because it already uses embedded URLs in a message as part of the profile "fingerprint" for finding the original message from a forwarded version.

But I'm a bit chuffed to hear that I'm on the right track, considering how effective Gmail's tactics have been. :-)

Small service providers have it really tough right now. Users don't tolerate any spam at all. A few years ago, the state of the art for small independent services was SpamAssassin + SQLGrey (or other greylisting) plus a few other tricks; that's not sufficient anymore, and most of us smallfry lack the resources to come up with something much better.

After just 6 weeks in production, the software already has 20+million IPs blocked at any given time.
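The "temporary blackhole" part of a pipeline like this can be sketched as a ban list with expiry. This is not the software described above (which is PHP driving iptables); it is an in-memory stand-in with hypothetical names, showing only how time-limited bans might be tracked.

```python
import time
from typing import Optional


class TemporaryBanList:
    """In-memory stand-in for a set of firewall rules that expire."""

    def __init__(self, ban_seconds: float):
        self.ban_seconds = ban_seconds
        self._expiry = {}  # network string -> expiry timestamp

    def ban(self, network: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._expiry[network] = now + self.ban_seconds

    def is_banned(self, network: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        expiry = self._expiry.get(network)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expiry[network]  # lazily drop expired bans
            return False
        return True


bans = TemporaryBanList(ban_seconds=3600)
bans.ban("203.0.113.0/24")
print(bans.is_banned("203.0.113.0/24"))
```

Letting bans lapse on their own keeps the list from growing without bound and gives reassigned address space a way back in.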

baudehlo11y ago

I think SA has suffered from some of the original core developers (myself included) moving on to other projects in a completely different tech area. The good news is that other projects have taken up some of the mantle, like Haraka, check out the karma plugin. It does some amazing blocking of spam and penalizing clients.

Beyond that also one of the things SA doesn't do well is actually rejecting hard on sensible blacklists like the CBL. We worked hard to make everything heuristic based but it wasn't always the right choice. Some things need to be black and white. There's some code on SA for short circuiting now but it's not really the best solution. In my own spam filtering I have a bunch of hard rejects and they work really well.

Anyway, check out Haraka or Qpsmtpd for solid anti spam mail serving solutions. They work really well.

patio1111y ago

Hey wait, I did vaguely recognize your username from somewhere. Thanks for SpamAssassin! I spent more hours spelunking through your guys' source code and community rulesets than you'd want to know, about a decade ago, while working for the anti-spam group in our R&D department at a tech incubator.

xorcist11y ago

Thank you for SA. I do not share the experience that it is outsmarted, I still achieve > 98% accuracy just as I did ten years ago. That is really a testament to all the hard work of its developers.

baudehlo11y ago

This sounds a lot like DCC. Have you considered adding it into your filtering layer?

oalders11y ago

Do you have any plans to release this software? I've got a small hosting business and SpamAssassin seems to be getting beaten on a very regular basis.

thaumaturgy11y ago

Yeah, I've been thinking about that recently.

There are some challenges though. Probably the biggest one is that it's designed for a specific mail server setup: postfix + dovecot (configured for maildir) + fail2ban + php (the code is in php, because it was convenient) + mysql. I don't know yet how portable it will be.

If you'd like to try it anyway, let me know and I'll post what I've got to GitHub in the next few days.

edit: alternatively, I've been more seriously considering making the current network ban list available as an RBL. Since I already have DNS servers, it would be pretty trivial to do.
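For anyone wondering what publishing a ban list as an RBL involves on the client side: a DNS blocklist lookup for an IPv4 address reverses the octets and appends the list's zone, then asks for an A record. A small sketch of building that name; the zone `rbl.example.org` is a placeholder, not a real list.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the name a client looks up to check an IPv4 address in a DNSBL."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


print(dnsbl_query_name("192.0.2.99", "rbl.example.org"))
# prints 99.2.0.192.rbl.example.org
```

By convention an answer in 127.0.0.0/8 means the address is listed and NXDOMAIN means it is not, so the serving side is an ordinary DNS zone.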

runeks11y ago· 5 in thread

> A possibly better approach is to use money to create deposits. There is a protocol that allows bitcoins to be sacrificed to miners fees, letting you prove that you threw money away by signing challenges with the keys that did so.

This wouldn't work, because a miner can easily pay himself any amount of bitcoins that he has saved up in fees, and include this transaction in his own block (not broadcasting it). Thus he can basically create these "deposits" for free, and sell them for a profit.

That's the thing: whatever you try as a counter-measure, you always come back to money: in the above scenario, money would replace "deposits" because "deposits" would just be sold on the open market for money. Proof-of-work becomes money: if something important requires proof-of-work, you can be sure that a web app would surface that performs proof-of-work in exchange for money.

It always comes back to money, because whatever restriction you put on something, whether it be "pay fee to Bitcoin miners", "Solve proof-of-work puzzle", or something else entirely, these things will always end up being sold for money in an efficient market, because of the increased efficiency of division of labor: why should I use my inefficient smartphone to calculate proof-of-work, when I can pay a service with custom ASICs to do the job for me at a fraction of the cost?

As far as I can see, the only alternative that can work besides money is something that cannot be sold for money. And I can't come up with anything that fits this requirement.

thefreeman11y ago

Not sure why you were downvoted. While I may or may not agree with your opinion I think you expressed it in a completely reasonable fashion and made a number of interesting points.

tomjen311y ago

But your smartphone is something you already have. If it takes ten seconds then you already paid for those ten seconds when you bought it.

Of course spammers can buy their services but when the price for normal sending is effectively free for normal users you can jack up the price for spammers to make it too expensive for them.

A nice benefit is that it forces sites to use something other than email, such as RSS, since they can't afford to send newsletters anymore.

comex11y ago

Maybe, but the performance difference between a power-strapped smartphone CPU and an ASIC tailored for the specific task is so massive I doubt you could make it expensive enough for the latter while maintaining a reasonable experience for the former.

alexjeffrey11y ago

having everything come back to money is a good thing though - a user can afford to pay e.g. 1 microdollar per email sent but if a spammer is sending 10 million emails a day, they can't afford that level of operational expense.
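The numbers in this comment are easy to plug in. A tiny sketch, assuming a flat per-message price; the function name and the 50-messages-a-day casual user are made up for the example.

```python
def daily_cost_usd(messages_per_day: int, price_microdollars: int) -> float:
    """Total daily spend at a flat per-message price, in US dollars."""
    return messages_per_day * price_microdollars / 1_000_000


print(daily_cost_usd(50, 1))          # casual user: 5e-05
print(daily_cost_usd(10_000_000, 1))  # bulk sender: 10.0
```

At exactly 1 microdollar the 10-million-a-day sender pays about $10 a day, so whether that is prohibitive depends on the spammer's margins; the cost scales linearly with the per-message price.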

salmonellaeater11y ago

It's sufficient to just destroy the bitcoins by sending them to a non-existent address. Alternatively, they could be donated to a third party such as a charity or an open-source software foundation.

danso11y ago· 5 in thread

Fascinating read, and as amazing as email is, the OP manages to still make me realize how much I take it for granted:

> So I think we need totally new approaches. The first idea people have is to make sending email cost money, but that sucks for several reasons; most obviously - free global communication is IMHO one of humanities greatest achievements, right up there with putting a man on the moon. Someone from rural China can send me a message within seconds, for free, and I can reply, for free! Think about that for a second.

drzaiusapelord11y ago

I remember using dial-up to connect to my college's unix system. I fired up the email client (mail? mailx?) and was hesitant for a moment to send an email to England. I thought back to my days using BBS's and worrying about dialing out to per minute charges. I just couldn't believe I could email anyone in the world for free using smtp email.

Obviously, both of us need computers, email accounts, network access, etc but there's no per region metering or anything. The cost of sending an email to someone sitting 10 feet from me or 10,000 miles from me is exactly the same. Mike's right, this is revolutionary.

_delirium11y ago

In the actual BBS days, I don't recall that being a common point of confusion, oddly enough. I dialed up to local BBSs, and it was obvious where I was dialing because I actually entered the digits, and the modem audibly dialed them. Then sometimes I would correspond with people in other states or countries, through FidoNet echoes or mail. But it was clear that I wasn't dialing them to do so. I transmitted my message to my local BBS, and the BBS relayed my message onwards. I'm not sure at the time I entirely understood what mechanism the BBS used to do so, but I knew that I wasn't myself dialing Norway or Germany to do it, nor paying any kind of destination-based charge. I even played some multi-country multiplayer door games, all for free. So once I got an internet account, it didn't seem too magical!

nsns11y ago

> Someone from rural China can send me a message within seconds, for free, and I can reply, for free!

Yeah, you both "only" need some machine that can go online, an internet connection, electricity, the required technical knowledge, and to (implicitly) agree to your personal data getting harvested... otherwise it's "free".

scrollaway11y ago

I am willing to give you £50 in cash for free as long as you come pick it up at my place in Sweden.

This offer is only available for you alone and in person. Are you interested? Because I'd really like to know how your pedantic definition of "free" works out for you when it's obvious that things that require infrastructure (such as online communications, or traveling to a different country)... require infrastructure.

(TLDR: If you don't find it amazing that with a very small indirect investment we can actually communicate with people anywhere in the world for free, you and I need to have a long chat about being spoiled.)

angersock11y ago

Yep. Pretty amazing, isn't it?

sounds11y ago· 4 in thread

One important concept that seems to be missing from the discussion is Sender Stores.

Email currently uses a Receiver Stores model. SMTP servers can relay messages, but in almost all cases the message is transmitted directly from the originator's network to the recipient's network. The storage of the message only effectively changes _ownership_ once, even if the message headers say it was forwarded many times.

That makes email a Receiver Stores model: the recipient's network is expected to accept the message at any time and then hold it until the recipient comes to look at it.

Some of the bitcoin messaging protocols propose a Sender Stores model. That is, the message may be transmitted any number of times but the recipient's network is not responsible for long-term storage. The sender's network must be able to provide the message at any time up to the point when the recipient actually looks at the message.

There are some obvious restrictions such as requiring that the message be encrypted with a Diffie-Hellman key (negotiated when the message is first transmitted to the receiver's network) to reduce the feasibility of de-duplicating millions of messages. And in order to prevent revealing exactly when the recipient reads the message, the recipient's network doesn't ack the message for a while.

Ultimately all of this is just designed to make bulk email (slightly) more expensive. Spammers run on very, very thin margins. But it doesn't do anything to solve the problem of account termination or blacklisting.
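The difference between the two models can be sketched in a few lines. Everything here is hypothetical (class names, the in-process "network"), and it leaves out the encryption and delayed ack described above; it only shows where the message body lives until it is read.

```python
import secrets


class SenderStore:
    """Hypothetical sender-side server that holds messages until fetched."""

    def __init__(self):
        self._messages = {}  # message id -> body

    def submit(self, body: str) -> str:
        message_id = secrets.token_hex(8)
        self._messages[message_id] = body
        return message_id  # only this id travels in the notification

    def fetch(self, message_id: str):
        return self._messages.get(message_id)

    def acknowledge(self, message_id: str) -> None:
        # The sender may drop the message only after the recipient acks it.
        self._messages.pop(message_id, None)


class RecipientMailbox:
    """Recipient side keeps notifications, not message bodies."""

    def __init__(self):
        self.notifications = []  # (sender store, message id) pairs

    def notify(self, store: SenderStore, message_id: str) -> None:
        self.notifications.append((store, message_id))

    def read_all(self) -> list:
        bodies = []
        for store, message_id in self.notifications:
            body = store.fetch(message_id)
            if body is not None:  # the sender may have vanished in the meantime
                bodies.append(body)
                store.acknowledge(message_id)
        self.notifications.clear()
        return bodies


store, box = SenderStore(), RecipientMailbox()
box.notify(store, store.submit("hello"))
print(box.read_all())
```

The storage burden sits with the sender until `acknowledge` is called, which is the cost the model is trying to shift.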

fanf211y ago

The problem with the sender stores model is that the sender does not need to store anything: they just generate the spam message at retrieval time. So it does not actually increase their costs. Spam moves to the notification mechanism that tells receivers that a message is available: this is just as unsolicited as in current junk mail, and needs to contain enough information for the receiver to know if it is worth retrieving the message. All the current spam and anti-spam techniques will apply fairly exactly to these notification messages.

dredmorbius11y ago

A fair point, but it does require that the sending host persist in its network location (or have continuously updated DNS which reports its present location).

Since early recipients of the spam will likely report it, it's fairly likely that subsequent retrieval attempts will find a downed (or sanitized) host, no longer delivering spam.

This will reduce the amount of spam actually delivered, and the spammer's production / revenue margins.

gioele11y ago

What you call "Sender Stores" is at the basis of djb's IM2000 that is supposed to replace SMTP and email delivery in general.

See http://cr.yp.to/im2000.html and http://en.wikipedia.org/wiki/Internet_Mail_2000

sounds11y ago

Thanks, I hadn't seen that before.

End-to-end encryption is needed to increase the load on a spammer as much as possible. Even if the spammer tries to re-generate the message "at retrieval time" the receiver should request retrieval several times (to obfuscate when the message is actually read) and the message should use multiple iterations of a cipher (and possibly HMAC) after an initial DH negotiation, or any other means to increase the cost for a spammer _and_ tie a message to a unique sender for reputation-tracking purposes.

p4bl011y ago· 4 in thread

The discussion here is already quite long so maybe I missed it, but I don't see anyone asking (or answering) the first question that came to me while reading the linked email:

Why is the cost of end-to-end crypto never taken into account?

I just can't believe that we have reached a point where it is possible to cheaply mass mail the way spammers do if you need to encrypt each email for each recipient. That alone should be dissuasive enough, at least that's always what I thought. If I'm right, all the discussion about the need for clients to extract features from emails and send them to a necessarily trusted centralized third party is useless. But I may be missing something, where am I wrong?

hueving11y ago

Even at a reduced rate, end to end crypto means spammers will have a much higher success rate since they don't have to fight a centralized spam system with global knowledge. This more than makes up for the extra time to encrypt a message.

p4bl011y ago

Mh, okay.

But let's make the crazy assumptions that we are effectively in a world where end-to-end crypto is massively used, virtually by everyone.

I guess it would be stupid to assume that messages are encrypted but not signed.

This means that it would be easy to have a list of identities (i.e., keys) which are sending spam, for instance using a web of trust, without users having to disclose any information other than "I trust these identities, not these ones" (which could be as easy to do as clicking "spam" or "not spam" for the users).

Now that means that the spammers not only have to encrypt every single email for every single recipient but also to generate new key-pairs for almost each encryption.

Of course it is also very easy to mark as spam any email signed with a key that is considered too small (i.e., too quick to generate).

Now if you tell me that still won't do it without a "centralized spam system with global knowledge", I have to seriously rethink a lot of my assumptions about the cost of some computations.
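To make the idea in this comment concrete, here is a minimal sketch of a key-based reputation list. Everything in it is an assumption for illustration: the `KeyReputation` class, the 2048-bit threshold, and the two-report cutoff are hypothetical and not taken from any real mail system. It only shows the two checks described above: reject keys that are too small, and reject keys that enough trusted identities have reported.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

MIN_KEY_BITS = 2048  # hypothetical threshold for "too quick to generate"


@dataclass
class KeyReputation:
    # Identities whose "spam" clicks we are willing to count (our web of trust).
    trusted_reporters: Set[str]
    # sender key -> set of trusted reporters who flagged it
    spam_votes: Dict[str, Set[str]] = field(default_factory=dict)

    def report(self, reporter: str, sender_key: str) -> None:
        """Record a spam report, ignoring reporters we do not trust."""
        if reporter in self.trusted_reporters:
            self.spam_votes.setdefault(sender_key, set()).add(reporter)

    def is_spam(self, sender_key: str, key_bits: int, min_votes: int = 2) -> bool:
        """Flag mail signed by a small key or by a key with enough trusted reports."""
        if key_bits < MIN_KEY_BITS:
            return True
        return len(self.spam_votes.get(sender_key, set())) >= min_votes
```

Note that this only raises the spammer's cost if generating a fresh, acceptable key is expensive, which is the point debated in the replies.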

rando28911y ago

Encryption is pretty darn cheap in cpu cost. And they have botnets to do it for them.

p4bl011y ago

Of course using end-to-end crypto does not only mean "encryption", see my other response.

orf11y ago· 3 in thread

The Gmail spam filter is indeed impressive, but on several occasions I have found 'real' emails triggering it. Those times were just me browsing the spam folder randomly and I hate to think what else it has swallowed.

_delirium11y ago

Yes, I think it's too aggressive, at least for my risk-sensitivity preferences. I've actually never gotten spam in my Gmail inbox, but I've had two serious false-positives that caused me problems, along with a number of less serious false positives, like mailing list subscriptions disappearing. That level of aggressiveness is too much for me, and doesn't seem to be configurable so I can tell it to err more on the side of avoiding false positives.

The first incident was that Gmail flagged an important email from my landlord as spam because it contained a forwarded message written in Danish, which the filter deemed to be a language I don't normally correspond in (it is nice, to be fair, that the filter actually tells me why it flagged the message). True enough. But I do live in Denmark, and in fact a mail containing Danish is a very good signal, for me, that it should not be spam-filtered.

I've moved to hosting my own mail as a result, and it's been going well so far. I use a fairly conservative host-based filtering approach. Just blocking hosts whose DNS doesn't match their rDNS rejects >70% of spam attempts, and adding Spamhaus's DNSBL brings that up to >95%. As far as I can tell from perusing the logs, it's quite conservative, and they're all true positives. And at least it rejects (if it's going to) in the SMTP session, so the sender will get a bounce rather than get silently filed into a spam folder, like Gmail does.

I do still get some spam, almost all of it from legitimate free-mail hosts who I can't feasibly filter by host (mostly Yahoo and Gmail). But it's fairly infrequent.
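For readers who have not set this up, here is a rough sketch of the two host-based checks mentioned in this comment: comparing reverse DNS against forward DNS, and querying a DNSBL. It uses only Python's standard `socket` and `ipaddress` modules. The function names are made up for illustration, the resolvers are passed in as parameters so the logic can be exercised without network access, and this is not the commenter's actual configuration. A DNSBL is queried by reversing the IPv4 octets and appending the list's zone (for Spamhaus, `zen.spamhaus.org`); a name that resolves means the address is listed.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    return ".".join(reversed(str(addr).split("."))) + "." + zone


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the DNSBL has an A record for this address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except OSError:  # lookup failure (e.g. NXDOMAIN) means "not listed"
        return False


def forward_confirmed(ip: str,
                      reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex) -> bool:
    """Check that the rDNS hostname of `ip` resolves back to `ip`."""
    try:
        hostname = reverse(ip)[0]
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

A production filter would also inspect the address a DNSBL returns, since some lists use special return codes to signal errors rather than listings, and it would run these checks during the SMTP session so a rejection produces a bounce.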

raverbashing11y ago

Even worse, I've seen Google-originated emails being flagged in GMail as spam/phishing

(in mailing lists, and yes, I checked, it seemed to be a legitimate email, so I let the sender know, but never heard back)

ams611011y ago

I have had gmail flag a message I sent to myself as spam.

idlewords11y ago· 2 in thread

This is an incredible write-up. Can someone who knows the author plead with him to write up the long history of the Spam Wars that he mentions in this document? I could read this stuff all day.

baudehlo11y ago

Read Spam Kings. It covers a lot of the history.

ireflect11y ago

Agreed. I'd buy the book.

zerr11y ago· 2 in thread

>we had put sufficient pressure on spammers that they were unable to make money using their older techniques

Could anyone comment how spammers make money actually?

skizm11y ago

One way is referral links. If you click on a link in my email and buy something on, say, Amazon, I could potentially make up to 10% of the purchase price (Amazon is actually normally around 5% I think).

aslewofmice11y ago

Direct email marketing

awt11y ago· 2 in thread

No mention of Bitmessage, which provides E2E crypto and anti-spam.

Canada11y ago

Bitmessage is hardly immune from spam. I've seen it there.

And cranking up the proof of work isn't going to do anything to prevent it. The only thing that prevents Bitmessage from becoming a cesspool is its obscurity.

awt11y ago

I would not call the random strings showing up in the general chan spam. I highly doubt it is profitable for whoever sent it. Industrial scale spam seems unlikely on Bitmessage.

joelthelion11y ago· 2 in thread

Can someone explain botguard? I'm not sure I get it.

chii11y ago

My understanding (which may be wrong; I welcome all corrections) is that an obfuscated, somewhat randomized piece of code is generated on the signup page. If you run it, it produces a token, which is submitted as part of the signup.

If you didn't obtain a fresh piece of this script, but instead either reused, or tried to guess the token, then your signup still succeeds, but is marked as bad. Then, in an undetermined amount of time, a wave of bans would occur on all accounts that got marked as bad.

This prevents signups via scripts, but does so via making signup scripts untrustworthy, and so no one would be willing to put money in for they cannot be sure that it actually works.
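A toy model of the behaviour described above may help. This is only a sketch of the comment's description, not of how botguard actually works: the `SignupGuard` class is hypothetical, and `issue_token` stands in for "fetch a fresh script and run it to get a token". The point it illustrates is that a signup with a reused or guessed token still appears to succeed, and the ban arrives later in a batch.

```python
import secrets


class SignupGuard:
    """Hypothetical model: silent flagging now, ban wave later."""

    def __init__(self):
        self.outstanding = set()  # tokens issued and not yet redeemed
        self.accounts = set()     # accounts that currently exist
        self.flagged = set()      # accounts quietly marked as bad

    def issue_token(self) -> str:
        """Stand-in for running a freshly generated script on the signup page."""
        token = secrets.token_hex(16)
        self.outstanding.add(token)
        return token

    def signup(self, account: str, token: str) -> bool:
        """Always report success; only record internally whether the token was fresh."""
        self.accounts.add(account)
        if token in self.outstanding:
            self.outstanding.discard(token)  # each token is single use
        else:
            self.flagged.add(account)        # reused or guessed token
        return True

    def ban_wave(self) -> set:
        """Run at some later, unpredictable time: remove every flagged account."""
        banned = self.flagged & self.accounts
        self.accounts -= banned
        self.flagged.clear()
        return banned
```

Because the script author gets the same "success" response either way, they cannot tell a working signup script from a broken one until the wave hits.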

joelthelion11y ago

But what prevents you from requesting another token from Google?

skrebbel11y ago· 1 in thread

I don't understand the objection against email costing money. I send you a mail? I pay $0.0001 to you. You reply? You pay $0.0001 back.

There is an idea that this somehow blocks access to email for people who have a hard time paying for things on the internet (for whatever reason), but it is misguided: everybody who has access to the internet pays for it. ISPs could easily give every subscriber 10000 free emails every month.

Texting costs money and yet people do it.

What am I missing?
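The arithmetic behind this proposal, as a small sketch. The price and the 10000-message allowance come from the comment; the `monthly_cost` helper and the bulk volume of 100 million messages are assumptions chosen purely for illustration.

```python
from decimal import Decimal

PRICE_PER_MESSAGE = Decimal("0.0001")  # dollars, from the comment
FREE_ALLOWANCE = 10_000                # free messages per month, from the comment


def monthly_cost(messages_sent: int, free_allowance: int = 0) -> Decimal:
    """Dollars owed for one month of sending, after any free allowance."""
    billable = max(0, messages_sent - free_allowance)
    return billable * PRICE_PER_MESSAGE


# The free allowance is worth about one dollar per subscriber per month.
allowance_value = monthly_cost(FREE_ALLOWANCE)

# A hypothetical bulk sender at 100 million messages a month.
bulk_cost = monthly_cost(100_000_000, FREE_ALLOWANCE)
```

Under these numbers the allowance costs an ISP $1 per subscriber, while the hypothetical bulk sender owes just under $10,000 a month, assuming the sender is the one who actually pays.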

mike-cardwell11y ago

"What am I missing?"

The specs and the software for a start.

Then there's the fact that it's not the spammers that will end up paying, but the people running the systems that are compromised and abused to send spam, be they shared hosting servers or home computers.

Then there's the network effect. I'm not going to feel good telling my friends and family that it will now cost them money to email me. Especially when they can just contact me using Facebook instead for free and without having to set anything new up. Especially when the email service they're already using probably won't even support this newfangled paid-email system.

It would be a massive task to add this functionality to email, and it wouldn't stop the spam, so it's not worth it.

anon411y ago· 1 in thread

So why not use one key per source, kind of something like this:

Alice wants to receive mail from Bob. Alice generates a public/private key pair and gives the public half to Bob. When Bob wants to send mail to Alice, Bob uses the public key Alice gave him. If Alice receives spam, she marks the public key it was encrypted with as "fuck it, the spammers got it" and never receives mail with that key again. Then she notifies Bob that the key he had has been compromised and sends him a new one. Alice could then, after Bob has lost her key to spammers one too many times, simply decide not to talk to someone like him.

This would give mailing list operators a large incentive never to share your email with anyone, otherwise you could just block them forever.

On the flip side, if the mailing list is really important to you, the operator could reject your new key and tell you you'll either receive their spam or you won't be part of the mailing list. Though I don't see why someone would do that in favour of just including ads in the mails themselves.
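Here is a minimal sketch of the bookkeeping Alice would need for the scheme in this comment. To stay self-contained it uses random opaque strings in place of real public keys, so it models only the issue/revoke/strike logic, not any encryption. The `Inbox` class, its method names, and the two-strike limit are all hypothetical.

```python
import secrets


class Inbox:
    """Hypothetical per-correspondent key book for Alice."""

    def __init__(self, max_strikes: int = 2):
        self.key_owner = {}   # active key -> correspondent it was given to
        self.strikes = {}     # correspondent -> how many of their keys leaked
        self.blocked = set()  # correspondents Alice no longer talks to
        self.max_strikes = max_strikes

    def issue_key(self, correspondent: str):
        """Give a correspondent a fresh key, unless they are blocked."""
        if correspondent in self.blocked:
            return None
        key = secrets.token_hex(16)  # stand-in for a real public key
        self.key_owner[key] = correspondent
        return key

    def accepts(self, key: str) -> bool:
        """Mail is accepted only if it was encrypted to a still-active key."""
        return key in self.key_owner

    def report_spam(self, key: str):
        """Burn a leaked key; reissue a new one or block after too many strikes."""
        correspondent = self.key_owner.pop(key, None)
        if correspondent is None:
            return None
        self.strikes[correspondent] = self.strikes.get(correspondent, 0) + 1
        if self.strikes[correspondent] >= self.max_strikes:
            self.blocked.add(correspondent)
            return None
        return self.issue_key(correspondent)
```

The sketch also makes the weakness visible: the only lever Alice has is per correspondent, so it cannot separate wanted mail from unwanted mail sent under the same key.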

chii11y ago

Let's suppose Bob was a spammer pretending to be a mailing list operator.

Alice gives her key to Bob, in the expectation that Bob would not be sending her spam. Bob then sends both spam and legit mail that Alice did want. Assuming that Alice does not want to stop receiving the legit mail, but wants to stop the spam, how does she do it in this scenario?

If Alice blacklists the key for Bob, but sends a new one, the situation hasn't improved. If she doesn't send a new one, she stops receiving legit mail (that she wants, and cannot go without).

rwallace11y ago· 1 in thread

> When we started gmails were about $25 per 1000 so we were able to quadruple the price. Going higher than that is hard because all big websites use phone verification to handle false positives and at these price levels it becomes profitable to just buy lots of SIM cards and burn phone numbers.

How does that work? Don't SIM cards cost more than 10 cents?

cjg11y ago

Not all the accounts need verification by phone.

patio1111y ago

Worth reading for confirmation regarding the importance of reputation in deliverability, which is something that is not widely understood by non-experts but which has really toothy consequences for many HNers' businesses.

zokier11y ago

One thing nice about E2E crypto in messaging is that it implies strong identities, which most importantly allow building whitelists with high level of confidence. And of course if we can make those identities costly to acquire/burn, either by proof-of-work or even just with a CA model, that alone should cut spam significantly.
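One way to picture "costly to acquire" identities is a hashcash-style stamp bound to the public key. The sketch below is an assumption-laden illustration, not an existing protocol: the `mint` and `verify` names, the use of SHA-256 over the key bytes plus an 8-byte nonce, and the difficulty values are all invented here. Minting takes on the order of 2^difficulty hash evaluations, while verifying takes one.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def _stamp_digest(identity: bytes, nonce: int) -> bytes:
    """Hash the identity bytes together with a fixed-width nonce."""
    return hashlib.sha256(identity + nonce.to_bytes(8, "big")).digest()


def mint(identity: bytes, difficulty: int) -> int:
    """Search for a nonce whose stamp has at least `difficulty` leading zero bits."""
    for nonce in count():
        if leading_zero_bits(_stamp_digest(identity, nonce)) >= difficulty:
            return nonce


def verify(identity: bytes, nonce: int, difficulty: int) -> bool:
    """Check a stamp with a single hash evaluation."""
    return leading_zero_bits(_stamp_digest(identity, nonce)) >= difficulty
```

As discussed elsewhere in the thread, the difficulty is hard to tune: a value that is tolerable on a phone may be cheap for a server farm or a botnet.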

sgentle11y ago

I wonder if this would be an interesting application for Homomorphic Encryption. True FHE is still wildly inefficient, but there are some interesting applications like CryptDB where sort-of-Homomorphic-Encryption is feasible for certain restricted operations (keyword search being one).

In a system like that, maybe you could send your encrypted message along with some encrypted keywords that you consider to be spammy to some centralised service. That would, at least, avoid some of the client-side-filtering-is-too-hard problem.

As far as reputation, this might be one of the rare times where a Web of Trust seems like a good idea. Generating lots of false positives and negatives would be a lot less powerful if the value of those reports was filtered by how much you trust the account that made them. With email you already have an implicit source of trust, in that anyone you mutually email with is unlikely to be a spammer.

Seems like a really interesting problem space to be involved in.
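The trust-weighted reporting idea above can be sketched in a few lines. The `spam_score` function and the weights are hypothetical; the only thing taken from the comment is the principle that a report counts in proportion to how much you trust the account that made it, so reports from unknown accounts carry no weight.

```python
def spam_score(reports, trust):
    """Trust-weighted verdict in [-1, 1]; positive means "probably spam".

    reports: iterable of (reporter, is_spam) pairs
    trust:   mapping of reporter -> weight in [0, 1]; unknown reporters weigh 0
    """
    reports = list(reports)
    total = sum(trust.get(reporter, 0.0) for reporter, _ in reports)
    if total == 0:
        return 0.0  # nobody we trust has an opinion
    weighted = sum(
        trust.get(reporter, 0.0) * (1 if is_spam else -1)
        for reporter, is_spam in reports
    )
    return weighted / total
```

For example, with `trust = {"friend": 1.0, "acquaintance": 0.5}`, a thousand "not spam" reports from accounts outside the trust map do not move the score at all, which is the property that makes bulk false reporting less useful.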

fdsary11y ago

Btw, this is written by Mike Hearn, whom I'd like to nominate for hacker of the year. Super cool guy, mad respect to him :)

PaulHoule11y ago

I think reputations are part of it but there are other aspects too.

I switched to gmail because my mail with every other provider and client was choked with phishing messages from major banks. So much work has been done on preventing origin spoofing in 2014 that accepting phony mail from chase.com is a sign of gross incompetence.

dochtman11y ago

I submitted this without the ?hn at approximately the same time. Pretty weird that this one gained traction while my submission did not.

https://news.ycombinator.com/item?id=8275787

hendzen11y ago

Mike Hearn is also a core Bitcoin developer, as well as an HN commenter. Hi Mike!

lazylizard11y ago

Could there be an antispam gateway that replies to 'maybe' (as in spam, ham and maybe) mails with a temporary URL that hosts a webform, before they reach the inbox? The webform could even limit message length, prevent attachments, be protected by Akismet and so on. Let the message from the form actually be relayed to the real mail server. And once the recipient replies, automatically whitelist that sender or possibly even the domain?

Zigurd11y ago

Some of my contacts have been using verification gateways/whitelists for email for decades. If spam were to become a problem, I would use one.

Oculus11y ago

Really interesting article until it gets into the Bitcoin talk. I feel like his passion towards Bitcoins seeped a little too much into the article towards the end.
