Sunday, December 25, 2022

The Despicable, No Good, Blackmail Campaign Targeting ... Imaginary Friends?

Natalia here speaks to our imaginary friend

In which we confront the pundits' assumption that the embarrasment-based extortion attempts would grow more “sophisticated and credible” over time with real data.

It's a problem that should not exist. 

It's a scam that's so obvious it should not work.

Yet we still see a stream of reports about people who have actually gone out and bought their first bitcoins (or more likely fractions of one) in order to pay off blackmailers who claim to have in their possesion videos that record the vicim while performing some autoerotic activity and the material they were supposedly viewing while performing that activity.

And occasionally one of those messages actually find their way to some pundit's inbox (like yours truly), and at times some of those pundits will say things like that those messages represent a real problem and will evolve to be ever more sophisticated.

Note: This piece is also available, with more basic formatting but with no trackers, here.

I am here to tell you that

  1. That incriminating video does not exist, and
  2. The pundits who predicted that those scams would evolve to become more sophisticated were wrong.

If you stumbled on this article because one of those messages reached you, it's safe to not read any further and please do ignore the extortion attempt.

I wrote a piece in 2019 The 'sextortion' Scams: The Numbers Show That What We Have Is A Failure Of Education, also available without trackers, where the summary is,

Every time I see one of those messages reach a mailbox that is actually read by one or more persons, I also see delivery attempts for near identical messages aimed at a subset of my now more than three hundred thousand spamtraps, also known imaginary friends.

Over the years since the piece was originally written, I have added several updates — generally when some of this nonsense reaches a mailbox I read — and while I have seen the messages in several languages, no real development beyond some variations in wording has happened.

Whenever one of those things does reach an inbox, my sequence of actions is generally to save the message and add it to the archive, see if the sending IP address has already entered the blocklist that is later exported and add it by hand if not. Then check if the number of trapped addesses has swelled recently by checking the log file from the export script

$ tail -n 96 /var/log/traplistcounts

See if there is a sharp increase since the last blocklist export

$ doas spamdb | grep -c TRAPPED

Then check for related activity in the log

$ tail -n 500 -f /var/log/spamd

Check for the full subject in the same log file

$ grep "You are in really big troubles therefore, you much better read" /var/log/spamd

Then check older, archived logs to see how long this campaign has been going on for

$ zgrep "You are in really big troubles therefore, you much better read" /var/log/spamd.0.gz

This time, the campaign had not gone on for long enough to show traces in the older archive, so I go on to extracting the sending IP addresses

$ grep "You are in really big troubles therefore, you much better read" /var/log/spamd | awk '{print $6}' | tr -d ':' | sort -u

Check for activity from one of the extracted addresses

$ grep /var/log/spamd | tee wankstortion/20221123_trapped_183.111.115.4.txt

Extract the sender IP addresses to an environment variable to use in the next oneliner,

$ grep trouble /var/log/spamd | awk '{print $6}' | tr -d ':' | sort -u | grep -vc BLACK | tee -a wankstortion/20221123_campaign_ip_addresses.txt

which will record all activity involving those IP addresses since the last log rotation:

$ for foo in $troubles ; do grep $foo /var/log/spamd | tee -a wankstortion/20221123_campaign_log_extract.txt ; done

You will find all those files, along with some earlier samples, and by the time you read this, possibly even newer samples, in the archive.

When something of the sort inboxes, I probably will go on adding to the archive, and if I have time on my hands, also run similar extraction activities as the ones I just described. But unless something unexpected such as actual development in the senders' methods occurs, I will not bother to write about it.

The subject is simply not worth attention past persuading supposed victims to not bother to get bitcoins or spend any they might have to hand. None of my imaginary friends have, and they are just as fine as they were before somebot tried to scam them.

Good night and good luck.


Friday, December 23, 2022

Can Your Spam-eater Manage to Catch Seventy-one Percent Like This Other Service?

Measuring the effect of what you do is important. Equally important is knowing what is the measure of your actions.

A question turned up on IRC that had me thinking.

Do you have a percentage of the spam traffic you catch on your MXes? The reason I ask is I lust learned that claim they catch 71% of all incoming spam. Also a rate of false positives would be nice to have, but that's likely harder to measure.

My first impulse was that I would consider a seventy-one percent hit rate on the low side of what we are seeing here at and associated domains.

But getting actually useful data would require some thinking. That said, comparing a major mail operator that sells deliverability and promises a 71 percent catch rate for incoming spam and would be like comparing apples and oranges at best. 

While (which is also known under a few other domain names) is my main mail service for my personal use and for a very select number of other people, to the rest of the world it is primarily a honeypot that generates security relevant data that other sites use, and that contributes to IP reputation rankings.

The site has been in operation in those roles for a little more than 15 years, since shortly before the original announcement in the article Hey, spammer! Here's a list for you!. When we started using the greylisting and greytrapping based setup, we saw a sharp drop in undesirable messages actually reaching inboxes, and I observed a marked decrease in load on the mail servers that did the content filtering.

Not long after I had set up our early greylisting setup, a message turned up on the openbsd-misc mailing list that pretty much matched our experience — a 95% reduction in spam in line to be treated to content filtering — so setting up precise measuring became a thing to do when we could get around to it.

Now enough with the background. It is relatively easy to extract at least some data that would give us a rough picture of the relative effectiveness of the greylisting and greytrapping versus the content filtering on receipt. The setup is very similar to the one described in the practically-oriented parts of the Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools and is part of a syncronizing multi-domain setup rougly as described in the earlier article In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe.

Using only tools found in the OpenBSD base system, I went on to collect data.

Whenever spamd(8) closes a connection it logs a message to that effect, so

$ zgrep "Nov  1" /var/log/spamd.6.gz | grep -c disconnected

Supplies the total number of connections closed by spamd(8) during November 1st, fetched from the archived log file.


$ zgrep "Nov  1" /var/log/spamd.6.gz | grep -c BLACK

provides the number of connections during the same 24 hour period initiated by hosts that were already in one of the blocklists used.

The command to get the number of connections that had cleared the first hurdle and entered greylisted status would be

$ zgrep "Nov  1" /var/log/spamd.6.gz | grep -c GREY

And the number of hosts that had been well behaved enough to enter the whitelist and be allowed to talk to the real SMTP service comes out of

$ zgrep "Nov  1" /var/log/spamd.6.gz | grep -c whitelisting

For hosts that have reached this far and did not fail the content filtering we do during receipt, we get the number with

$ doas zgrep 2022-11-02 /var/spool/exim/logs/main.log.6.gz | grep -c Completed

It is however worth noting that our MTA exim reports Completed for apparently message deliveries in both directions, so the number of received messages, or messages that did inbox is likely about thirty percent lower.

The number of messages rejected for one reason or the other, by being addressed to an undeliverable address or by failing content filtering we find with

$ doas zgrep 2022-11-02 /var/spool/exim/logs/main.log.6.gz | grep -c rejected

And finally, a side effect of a frequently run log reading script that adds hosts with certain kinds of characteristics such as not having a correct reverse DNS entry to a blocklist and kills all their connections will at times produce an unexpected disconnection while reading SMTP command message. We find those with

$ doas zgrep 2022-11-02 /var/spool/exim/logs/main.log.6.gz | grep -c unexpected

Those are hosts that somehow got past spamd(8) by behaving enough like a real SMTP server to clear greylisting. However spamd(8) does not have the ability to check for valid reverse, so that part is left in our case to check for by reading the log files at intervals.

The following table has the data for November 2022 —

Date Incoming SMTP
New whitelist
Deliveries Rejected Unexpected
2022-11-01 53303 38951 2580 54 1347 409 384
2022-11-02 55653 40467 2174 121 1297 549 330
2022-11-03 59658 43901 2086 85 1260 865 759
2022-11-04 57462 45674 1683 71 1270 30 0
2022-11-05 44993 43571 2146 105 1182 43 0
2022-11-06 36768 37802 2322 86 1366 184 0
2022-11-07 49464 44213 2398 182 1424 67 0
2022-11-08 52285 45904 2676 113 1513 69 3
2022-11-09 47652 47988 2085 105 1438 154 0
2022-11-10 57850 49875 2614 104 1435 192 2
2022-11-11 60269 56719 2355 99 1420 90 1
2022-11-12 46139 54073 1160 96 1182 29 0
2022-11-13 40497 40221 1777 70 1239 189 0
2022-11-14 59965 59951 2062 63 1382 145 73
2022-11-15 56265 32727 2304 113 1298 351 301
2022-11-16 77252 58029 1925 109 1340 282 33
2022-11-17 43107 30713 786 131 1250 215 17
2022-11-18 49448 48999 1590 96 1327 194 1
2022-11-19 42413 45927 973 92 1182 182 70
2022-11-20 50890 55318 1558 77 1203 358 33
2022-11-21 36601 35070 1707 125 1321 241 146
2022-11-22 37840 35499 2055 99 1359 142 17
2022-11-23 43186 34545 1314 114 1345 103 21
2022-11-24 46802 45765 1856 66 1269 729 52
2022-11-25 70911 52404 1315 89 1326 1488 395
2022-11-26 39780 32226 1500 77 1175 954 379
2022-11-27 67578 41581 1743 85 1231 523 315
2022-11-28 54688 37534 2433 77 1337 321 269
2022-11-29 70893 45917 2502 65 1248 87 39
2022-11-30 50280 35585 2567 67 1324 1293 1113

The table is also available as a comma separated (CSV) file.

As I mentioned earlier, the number of connections to the outer layer spamd(8) is likely higher than what would be expected on sites that are not considered a honeypot and home to in excess of three hundred thousand imaginary friends (see The Things Spammers Believe - A Tale of 300,000 Imaginary Friends or the trackerless version.

That said, I think the data shows that catching the unwanted traffic early, and discarding as much as possible of that traffic before it reaches the resource hungry content filtering is definitely beneficial. 

Even sites that do not actively bait the baddies out there would likely see noticeable energy bill savings by having their mail servers run quiter and cooler, as they definitely will after getting a greylisting, and optionally greytrapping setup in front of them. Those services have a truly low energy consumption profile.

If you found this article interesting, useful or just simply irritating, I would like to hear from you. Please use the comment field, or if you prefer, send email to nix at nxdomain dot no with a subject that at least tries to sound sensible and relevant.

As always, if you are interested in research on items mentioned in this article, I will be able to provide data for study. I will honor reasonable requests.

Friday, December 9, 2022

Harvesting the Noise While it's Fresh, Revisited

A year's worth of logs yields entertaining but unsurprising findings about spammer behavior.
Spam mail, masked but detected, from the archive

Returning readers will be almost painfully aware that here at (also known as we host and maintain a blocklist, which in turn is the product of traffic that hits our mail system with attempts at delivery to one or more of the now more than three hundred thousand known bad addresses, also featured at the blocklist home page.

Note: This piece is also available without trackers but only basic formatting here

When I first set up the greytrapping back in 2007, the initial spamtraps were non-deliverable addresses in our domains that I had extracted from mail server logs. I won't bore you with the details (which are anyway documented at length in earlier articles), but it was clear from those logs that the domains we hosted back then were more or less continously subject to Joe jobs, as in somebody sending messages with a forged From: field with a made up address in our domains.

After a while I started extracting the potential new spamtraps from the greylist — actually dumping data from there once per hour as part of the script that also generated the exported blocklist. The basic process is described in the July 25 2007 article Harvesting the noise while it's still fresh; SPF found potentially useful (also available trackerless but with links to tracked articles).

Then today it struck me that while that method is useful, by extracting only from the greylist we will only ever collect the address from the initial connections. Any addresses attempted after the miscreants enter the blocklist will simply not be recorded there.

This of course lead to the question: What did we miss?

Fortunately I keep my logs around for a while, the most easily accessible log archive for my main spamd spans a lttle over a year. So I set about with some very basic grep and awk, which netted me this raw list of targeted addresses from the spamd logs.

The list weighs in at a total of 269903 entries, as counted by wc -l.

Some of those addresses are valid, and a small, but actually significant, number are in domains we do not actually serve here, and some entries do not look like mail addresses at all. The stranger ones could be strings encoded in a character set that spamd is not equipped to handle, or could be other binary data that might have been intended to trigger bugs in some of the variants of fully equipped SMTP servers that are out there. Or simply noise of any other kind, including a byproduct of the not very intelligent extraction one-liner I used.

The target addresses in foreign domains I take as a sign that at least some spamming operators mistake a reasonably configured spamd for an open relay, just like they did all those years ago when I started running the greytrapping.

Some things apparently stay the same no matter how the rest of the world has found a way to move forward.

While I did a few other tasks and finally started writing this article, the bulk of the processes that would answer the question posed earlier (What did we miss?) could fortunately run unattended in the background, and after some manual massaging we are left with a results file, with 1530 entries that were none of

  • actually useful deliverable addresses in our domains
  • existing spamtraps

This means of course that the collection of imaginary friends expanded by the same number, and now stands at 304154 entries.

Which I suppose means that harvesting the noise even after a period of aging for refinement can be a good thing.

The entries added represent a wide variety of phenomena. Quite a few seem to be truncated versions of earlier spamtrap entries, and a fair number of the new entries look like they may have descended from artifacts of stupidity such as products of SMTP callbacks. Proving mainly that in mail and spam handling, there appears to be a space still for the less intellectually astute.

With all of this said, the natural followup question is, given the modest net result, was this worth the effort?

Well, the raw output that yielded 269903 entries needed some manual operations in order to weed out the obvious noise (exact time used not recorded), followed by another background task that took, according to time(1)

    real        105m24.220s
    user        73m3.280s
    sys	        29m14.930s

which yielded 1577 entries that were pared down to 1530 entries that met the criteria for inclusion in the circle of imaginary friends (also known as spamtraps).

Before this experiment, the spamtraps list numbered 302625, after including the result here, the count stands at 304154, for a gain of less than one percent of the previous total. Again, if you check back at the traplist home page now, the total number is likely to have increased again.

So was it worth the effort? I feel that as an experiment, it was worth doing.

Whether or not it is an experiment that is worth repeating is a question for another day.

If you have opinions on this, I would love to hear from you, in comments, via email or messages on whichever social media brought you the link to this article.

As always, parties interested in studying the data referenced in this article and other pieces I have written are welcome to contact me for arrangements. I can easily dig out more and rawer data than directly referenced here on request.

Stay safe out there.

As a side note, a slightly improved way of extracting useful data about other domains' mail service via SPF records can be found in the November 2018 artice Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting.

That article (naturally) works from the premise that you are running a recent OpenBSD system.