Spam giving rise to new-breed A.I.?

It is Christmas and I exercise my right to wave my hands in the air, tell a story and make a bold prediction. You may exercise your right to comment and tell me just what you think of my idea.

During this year, especially whilst growing ShowMeDo, we’ve had to contend with a rise in spam. We get spam on our site (each video has a comments section, like comments on a blog), spam on our blog and spam in our forums. Dealing with spam eats hours, and when you only have hours to grow the site each evening this quickly gets boring.

Sadly, I see this only getting worse as time goes on, at least until technology catches up.

This year we’ve witnessed the rise of a new type of image spam, frequently used for delivering pump ’n’ dump messages (which fool enough people to net the spammers a healthy profit). These images fool OCR-based spam-filtering tools, and this is only likely to get worse.

The two types of spam that frequently get through to my inbox (and to my personal inbox at Yahoo) are text-only mails that make clever use of text phrases, and this new type of image-based spam (with random visuals in the background which fool the OCR software).

Humans are good at understanding both types of spam, but computers are bad at it, so the filtering begins to break down. Spamming is easy and computers are relatively badly defended, so delivering torrents of spam is cheap…yet the rewards are high for the spammer. Economics drives the entire process, and more spam is the likely result.

Add to this new initiatives like the One Laptop Per Child project, which aims to deliver very cheap laptops to the third world. One obvious form of employment for unscrupulous individuals will be to write text- and image-based spam which gets past our current spam-filtering technology. Given that the OLPC is network-enabled (via built-in WiFi), it’ll be relatively easy and cheap to hook up cheap human brains to the spam-generation system.

How can we stop this new breed of better-crafted spam? We can’t employ people to act as human filters; that’ll get too expensive too quickly. Instead we’ll need to improve current A.I. techniques in the fields of Image Processing and Natural Language Processing.

So, here’s my Christmas prediction. By next Christmas we’ll see some big advances in the application of Artificial Intelligence to the problem of spam-detection, in addition to the statistical and rule-based methods currently used. This will have been driven by improved spamming techniques, often centred around the use of cheap human labour to supplement the current algorithm-based approaches.

I’d be curious to know what you think…

6 Comments

  • Ian, you should know better than making big predictions about AI ;) Yeah, spam is evolving and spam-filters are forever playing catch-up but I'm not convinced this game has to last forever. I'm getting a lot of bounced mail from spammers using my domain name for a fraudulent From field. How can they do this? If the mail protocol was revised to demand authenticated users then I think it would be a lot easier to get a handle on the spammers. Of course, the real problem behind this (as with many things) is the pool of stupid people who actually click through and fund spam. Unfortunately, I can't predict a rise in Human Intelligence in the near future. Not sure about the OLPC scheme when there are other ways to spend the money: http://www.wdm.org.uk/deathcounter
  • Well - that's why I said I was waving my hands about in the air :-) You're quite right that the big problem is the fact that people click the damn spam. However - even if the email protocol was revised, people would still click the spam...what happens if spammers get cleverer about using one infiltrated machine to send mail to another person in the address book - then it looks like your friend is sending you an invite for a Nigerian money-shifting scam. The spam is still spam, as people can earn money out of it. Human brain cycles get used and wasted, and all the while AI could be getting smarter about detecting this stuff. Well, here's hoping!
  • Technical solutions are one approach to SPAM. However, it is always a game of catch-up; as the filters improve, so do the bots. Despite the improvements over the years, SPAM has just got worse. I would suggest a commercial approach. If someone SPAMs (I don't mean a few dozen EMAILs, but thousands), then their service provider can be fined. Very quickly, service providers will find ways to recover the costs (e.g. to have an EMAIL account you have to provide a deposit or a credit card). They would pass the fines on to the spammers, and it would no longer be financially worthwhile. I'm sure that people can poke holes in this idea. However, I think that the base idea is sound and if it could be implemented correctly it would work.
  • Bit of AI and spam stuff mentioned here: http://developers.slashdot.org/article.pl?sid=07/01/07/1739217&from=rss
  • Will an Army of Captcha-Solvers Be Unleashed on Web 2.0 One scenario that should scare anyone who loves the Web 2.0 movement, who asks for or submits comments, who believes in the interactivity of the Internet, is an increase in blog spamming. And we're not talking about annoying Nigeria 419 email scams, ...
  • [...] classifiers will have to get more-AI-ish to deal with the visual and language elements that spammers keep bringing to the [...]