Monday, July 9, 2007

Hey, spammer! Here's a list for you!

Last week I started noticing from my log summaries that my mail servers had seen a lot more mail to non-existent users than usual. This usually happens when somebody has picked one of our domains as the home of their made-up return addresses for their spam run. This time, from the looks of it, the spam runs were mainly targeted at Russian and Ukrainian users. At least that's where most of the backscatter appears to have come from.

As I've written before in the PF tutorial and the malware paper (updated version available as this blog post -- that's the end of today's plugging, I promise), I've used the "Unknown user" messages as a valuable data source for my spamtrap list, just quietly adding addresses that looked really unlikely to ever become valid. After a brief airing on the OpenBSD-misc mailing list and running it by my colleagues at Datadok and Dataped, I've decided to take it a bit further.
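For readers who want to try the same kind of harvesting, here is a minimal Python sketch of pulling candidate addresses out of reject lines. It is not the procedure actually used here: the log line format, the `REJECT` pattern and the `harvest_candidates` helper are all assumptions made up for illustration, and your mail server's reject lines will almost certainly look different.

```python
import re

# ASSUMPTION: each reject line carries the recipient in angle brackets,
# followed somewhere later by the phrase "Unknown user". Adjust the
# pattern to whatever your own mail server actually logs.
REJECT = re.compile(r"<([^<>\s@]+@[^<>\s@]+)>.*unknown user", re.IGNORECASE)


def harvest_candidates(log_lines, our_domains):
    """Return sorted, de-duplicated rejected addresses in our own domains."""
    domains = {d.lower() for d in our_domains}
    found = set()
    for line in log_lines:
        match = REJECT.search(line)
        if not match:
            continue
        address = match.group(1).lower()
        # Only addresses in domains we serve are spamtrap candidates.
        if address.rsplit("@", 1)[1] in domains:
            found.add(address)
    return sorted(found)


# Made-up sample lines in the assumed format.
sample = [
    "Jul 13 06:01:02 mx smtpd: reject RCPT <retryingvtt@datadok.no>: Unknown user",
    "Jul 13 06:01:09 mx smtpd: reject RCPT <careersogt2083@datadok.no>: Unknown user",
    "Jul 13 06:02:41 mx smtpd: reject RCPT <retryingvtt@datadok.no>: Unknown user",
    "Jul 13 06:03:10 mx smtpd: reject RCPT <someone@example.com>: Unknown user",
    "Jul 13 06:04:00 mx smtpd: accepted RCPT <peter@datadok.no>",
]

candidates = harvest_candidates(sample, ["datadok.no"])
for address in candidates:
    print(address)
```

The output is only a list of candidates. Each address still needs a human look before it goes on the traplist, since the whole point is to pick addresses that will never become valid.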

Now that I've got a list of addresses which will never receive any legitimate mail, I really want spammers to try to send mail to those addresses. After all, if they send anything to an address which consists of a random string with one of our domains stuck on after the '@', we know it's all spam from there on.

We don't care about the rest, for the next 24 hours. Your SMTP dialogue with us (actually our spamd) will be all a-stutter, receiving answers one byte at a time until you give up. For the record, that usually takes about 400 seconds, with the really imbecile ones taking a lot longer. See the paper or the tutorial for some numbers.

The other possibility is of course that your system is set up in a way which makes it actually receive and try to deliver spam. Some of the spam will be addressed to non-existent users in your domain, so if your users receive spam, you will be trying to send bounce messages back to the purported sender for spam to non-existent users. That's tough, kid. If you're set up that way, your machine will be treated to the tarpit here for the next 24 hours. All a-stutter and all that. Repeat offenders stay there longer.

Now for the spamtrap list, I've checked that my colleagues and associates have never actually wanted to use those addresses for anything, and I made this page which wraps it all in a bit of explanation. For some reason, the list keeps growing each time I look at my log summaries.

When I get around to it and find a visually not-too-horrible way to do it, I'll include links to that page where they fit naturally on our web sites. In the meantime, here's hoping that the spammers' address harvesting robots find this list and put it to good use.

The chapter, it's improving. More later.

UPDATE 12-jul-2007: The softer side of me ponders the possibility of sending email form letters to the various postmaster@s with the URL to this blog post. On the other hand, I'm not sure I'm ready for another round of finding out that postmaster@ is in fact not deliverable at a surprising number of sites around the world.

One other thing I've noticed since I published the traplist is that bounces to addresses like mixt.apex.dp.ua-1184227575-testing@datadok.no have started appearing in the logs. I don't see how messages like these could be useful by themselves, but the addresses are of course obvious traplist material.

13-jul-2007: Oddly enough, there's still a stream of backscatter, and my logs tell me a few new addresses turn up every day. This morning's fresh ones were careersogt2083@datadok.no, phalanxesxb88@datadok.no and retryingvtt@datadok.no. Another few bytes to help weed out the bad ones early, thanks to the robots out there.

5 comments:

  1. Running Bob Beck's greyscanner alongside spamd complements your spam handling effectiveness. In addition to using spam trap addresses like waylay@bonetruck.org, the greyscanner will net hosts with bad/non-existent MX records, invalid sender addresses and anything else you want to bolt on. It also supports checking against a list of valid addresses (you supply the list), and everything else is greytrapped.

    Basically, the greyscanner does deeper inspection on all those "grey" connections that haven't been white listed yet.

  2. Sounds like a great idea! So if I send some emails to you from gmail.com to non-existent addresses, you won't be able to receive any mail from gmail for 5 days? (I'm not sure what retry time gmail uses, but the default for other mail servers is 5 days if the transaction tempfails, which is what tarpitting is classified as.)


    Maybe implementing bounce killing spamassassin-rules together with some sort of "pen pals"-feature would be a better route to take?

  3. This comment has been removed by a blog administrator.

  4. how can i find spamtrap address list?

    1. You're more than welcome to grab mine (referenced in the article, but the raw file is available as http://www.bsdly.net/~peter/sortlist) and do a bit of search and replace so it fits the domains you are serving.

      I've seen quite a few of those user names in logs from other domains too, so I wouldn't be surprised if you get a good match for your greytrapping.

      The other option is to do what I did: check your mail servers' logs for rejects for addresses that you are sure will never exist in your domains and add those as spamtraps. Publishing the list somewhere visible on the web is an optional extra.

