Spam – foreign emails and friend-of-friends address books

I’m getting lots more spam on my work account, apparently due to the Storm botnet. I now drag 10-20 new mails to the spam_to_learn folder every day, and I check and delete another 50-100 identified spams.

A small but steady percentage of the incoming spam is in a foreign language. These mails use Unicode, and I guess they are Chinese, Japanese or Korean. The thing is, I can’t read these mails! I’ve never been able to read Far Eastern languages. Couldn’t my spam filter flag these as ‘questionable’ automatically?
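As a rough idea of how little is needed, here is a minimal Python sketch that guesses which scripts a mail is written in from the Unicode character names and flags it when most letters are in a script I can't read. The names READABLE_SCRIPTS, script_of and looks_unreadable are made up for illustration, the 0.5 threshold is an arbitrary guess, and taking the first word of the character name is only an approximation of the real Unicode script property.

```python
import unicodedata
from collections import Counter

# Assumption: the scripts this particular reader can cope with.
READABLE_SCRIPTS = {"LATIN"}


def script_of(ch):
    """Rough script label: the first word of the Unicode character name."""
    try:
        name = unicodedata.name(ch)
    except ValueError:
        # Character has no name in the Unicode database.
        return "UNKNOWN"
    return name.split()[0]


def script_counts(text):
    """Count letters per script, ignoring digits, spaces and punctuation."""
    return Counter(script_of(ch) for ch in text if ch.isalpha())


def looks_unreadable(text, threshold=0.5):
    """True if more than `threshold` of the letters are in unreadable scripts."""
    counts = script_counts(text)
    total = sum(counts.values())
    if total == 0:
        return False
    unreadable = sum(n for script, n in counts.items()
                     if script not in READABLE_SCRIPTS)
    return unreadable / total > threshold
```

A filter could call looks_unreadable() on the subject and body and move matches to a ‘questionable’ folder rather than deleting them outright.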

Come to think of it – I can’t read many of the world’s languages. I can read a bit of French, but I’d be stumped by anything from Africa or South America. How come my mail reader can’t filter these away – a dictionary lookup can’t be that hard, surely?
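The dictionary lookup could be sketched like this in Python: count how many words in the mail appear in a small list of very common words for each language, and pick the best match. The word lists below are tiny hand-picked samples, the function names are hypothetical, and a real filter would need far bigger dictionaries, so treat it as an illustration of the idea and nothing more.

```python
import re

# Assumption: tiny hand-picked lists of very common words per language.
STOPWORDS = {
    "english": {"the", "and", "of", "to", "in", "is", "you"},
    "french": {"le", "la", "et", "les", "des", "est", "vous"},
    "russian": {"и", "в", "не", "на", "что", "с"},
}


def guess_language(text):
    """Return the language whose common words appear most often, or None."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return None
    scores = {lang: sum(w in common for w in words)
              for lang, common in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def should_quarantine(text, readable=("english", "french")):
    """Flag mail whose guessed language is not one the reader has listed."""
    lang = guess_language(text)
    return lang is not None and lang not in readable
```

Note that a mail in a language with no dictionary at all comes back as None and is not quarantined here; whether ‘unknown’ should count as ‘unreadable’ is a policy choice.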

The biggie, of course, is the amount of spam I receive from people I don’t know. I’ve never seen these email addresses before, and it is likely that I never will again. Anyone that I talk to frequently on the web won’t know these people either.

What if our email clients spoke (hashed and encrypted) to the address books of authorised friends and asked them ‘do you know this person?’

If the answer is yes then this new email address is probably something to do with my social network. This can be flagged for my attention – useful!

If the answer is ‘no’ then either they’re a foreign person entirely (and probably spam) or they’re from outside my long-developed social network. Either way they can be flagged as ‘unknown’ and filtered to a separate folder.
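To make the idea a bit more concrete, here is a minimal Python sketch of the lookup. It assumes each friend and I share a secret key, and that my client only ever sends a keyed hash (HMAC-SHA256) of the sender's address, never the address itself. FriendAddressBook, token and classify_sender are hypothetical names, and the encrypted transport between clients is not shown.

```python
import hashlib
import hmac


def normalise(address):
    """Lower-case and trim so the same address always hashes the same way."""
    return address.strip().lower()


def token(address, shared_key):
    """Keyed hash of an address; only holders of shared_key can compute it."""
    return hmac.new(shared_key, normalise(address).encode("utf-8"),
                    hashlib.sha256).hexdigest()


class FriendAddressBook:
    """Stand-in for a friend's client, which stores only hashed addresses."""

    def __init__(self, addresses, shared_key):
        self._tokens = {token(a, shared_key) for a in addresses}

    def knows(self, tok):
        # Answers 'do you know this person?' without seeing the address.
        return tok in self._tokens


def classify_sender(sender, my_contacts, friends):
    """friends is a list of (FriendAddressBook, shared_key) pairs."""
    if normalise(sender) in {normalise(a) for a in my_contacts}:
        return "known"
    for book, key in friends:
        if book.knows(token(sender, key)):
            return "friend-of-friend"
    return "unknown"
```

For example, with alice = FriendAddressBook(["bob@example.org"], b"key-with-alice"), calling classify_sender("Bob@Example.org", [], [(alice, b"key-with-alice")]) gives "friend-of-friend", and an address nobody has seen gives "unknown". A keyed hash only hides addresses from people without the key, so a friend could still test guesses against their own book; that seems acceptable between authorised friends, but it is a limit of this sketch.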

A chance to avoid deleting emails from people I might actually know is a Good Thing in my book. I nearly binned an important email just a few days ago. I’d dragged the mail all the way to the spam_to_learn folder before I realised that the ‘financial questions’ title, the name and the ‘attachment’ were signs of a Really Important Mail from someone new (but known to a friend) and not the usual spam that I receive.

Can someone come up with a plug-in for Thunderbird?


  • I got some spam this morning in some strange alphabet. I ran it through Google Translate and it turned out to be Russian. Playing around with it, I found the single-character word 'и', which seems to mean 'and'. My Gmail filter now filters Russian spam :)
  • This is exactly the sort of spam I'm talking about. It wouldn't be rocket science to auto-classify 90% of the text as 'Russian'. A rule can then be applied that says 'Ian hasn't said that he can read Russian, so I'll ask him and then filter it in the future if he can't read it'. The downside is that spam could start to arrive in multiple languages, and then we'll need to kick off research into how humans would understand the email. The Red Queen keeps on running.