Monday, February 22, 2021

RFC7505 Means Yes, Your Domain Can Refuse to Handle Mail. Please Leave Us a TXT If You Do.

If you do not want a domain to receive any mail, there is a way to be at least somewhat civil about it. There's a different DNS trick for that.

It used to be that if you went to the trouble of registering a domain, one of the duties that came with it was to set up somewhere to receive mail.

A number of networking professionals, myself included, have been known to insist that not only should a valid domain receive mail, at least a significant subset of the identities listed in RFC2142 (dated May 1997) should exist and mail sent there should be read at some reasonable interval.

Then of course we all know that a number of things happened in networking in the years between 1997 and today.

As regular or returning readers of this column will be aware, one of the phenomena that rose to become a prominent irritation and possible risk factor was spam, otherwise known as unsolicited commercial email, and of course some of the unsolicited traffic carried payloads that were part of various kinds of criminal activity.

I have written fairly extensively on how to suppress spam and other malicious traffic and have fun doing so, all the while assuming that if you run a domain you will want at least some mail to have a chance of making it to an inbox that is actually read by a person or perhaps processed by your robotic underlings.

Then there is the other consideration that the proliferation of top level domains means that organizations that own trademarks, and which in the early days would see the need only for a .com or .net domain (the latter was in fact originally intended for organizations involved in networking) or perhaps a country domain such as a .no or .se one, would tend to hoard domains in other top level domains too.

There are of course those who try to exploit trademark protection too, as we have seen in among other things my brush with a certain Chinese registrar or that time when what could only be seen as an extortion attempt a little too forcefully telemarketed landed me an otherwise white-elephant .se domain.

Now with the combination of potentially for most practical purposes redundant domains and the likely burden of handling spam for the same, it is understandable that attitudes started to shift. Finally in June 2015 RFC7505 was issued, with a simple and practical solution, dubbed the NULL MX record. The RFC explains how to set one up, though in language that is not too easy to penetrate.

For any domain that runs a mail service, there should be at least one MX record. Looking up, say, bsdly.net with dig bsdly.net mx yields a response where the answer section gives

;; ANSWER SECTION:
bsdly.net. 300 IN MX 1 skapet.bsdly.net.
bsdly.net. 300 IN MX 5 portal.nuug.no.

In your zone file, you would probably have similar lines, likely with only the MX <priority> hostname part on the actual line, the rest taken care of by the zone file it's all wrapped in.

If you want to make your domain an RFC7505-adherent one, you would remove your current MX records and replace with

MX 0 .

I did that for my little white elephant domain last week, since I did not by then remember when I last received anything sensible via that domain. 

So if you run dig bsdly.se mx now, it will yield

;; ANSWER SECTION:
bsdly.se. 300 IN MX 0 .

Which means nobody will ever see mail you attempt to send to bsdly.se. The delivery will fail immediately and produce a bounce message that likely references the RFC if your mailer is a reasonably recent version.
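
If you want to check for this from a script rather than by eye, something along these lines could work. This is only a sketch: the function name is_null_mx is made up for illustration, and it reads the output of dig +short DOMAIN mx on standard input, so it can be tried on canned data without touching the network.

```sh
#!/bin/sh
# is_null_mx: hypothetical helper, not part of any standard tool.
# Reads MX answers in "dig +short DOMAIN mx" format (one
# "<preference> <exchange>" pair per line) on stdin and succeeds only
# if the sole record present is the RFC7505 null MX, "0 .".
is_null_mx() {
    awk 'NF { n++; if ($1 == "0" && $2 == ".") nullmx++ }
         END { exit !(n == 1 && nullmx == 1) }'
}

# Sample data modelled on the dig answers quoted above.
if printf '0 .\n' | is_null_mx; then
    echo "null MX: this domain does not accept mail"
fi
if ! printf '1 skapet.bsdly.net.\n5 portal.nuug.no.\n' | is_null_mx; then
    echo "regular MX records: this domain accepts mail"
fi
```

Against live DNS the same function would be fed with dig +short bsdly.se mx | is_null_mx. The check insists on the null MX being the only MX record, since the RFC does not allow mixing it with regular ones.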

But while I was doing the change it struck me that it would be useful to let the world know why I did not want that domain to handle mail. Fortunately there is already an appropriate DNS record type for the purpose: the TXT record.

TXT records are used for some specific purposes such as the SPF records used to list allowed outgoing SMTP senders for the domain, and a few other variants tied to specific services. But fundamentally a TXT record is simply a string of characters most applications will not actually attempt to handle. This means you have the option of fitting a message of your own in one. Now, if you do a lookup on that white elephant domain's TXT records, you will get

;; ANSWER SECTION:
bsdly.se. 300 IN TXT "v=spf1 -all"
bsdly.se. 300 IN TXT "This exists only because https://bsdly.blogspot.com/2011/07/sek-1995-for-six-months-worth-of.html happened."
bsdly.se. 300 IN TXT "For actual contact info please check the corresponding net domain."

Note the first TXT record here, which carries the domain's SPF specification that had been in place for a while already. It says essentially in terse if eloquent SPF speak, "This domain does not send mail".
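
The same kind of scripted check is possible for the SPF side. The sketch below assumes the input is in dig +short DOMAIN txt format, where each TXT string is printed in double quotes on its own line. The name sends_no_mail is hypothetical.

```sh
#!/bin/sh
# sends_no_mail: hypothetical helper. Reads TXT answers in
# "dig +short DOMAIN txt" format (one double-quoted string per line) on
# stdin and succeeds if one of them is exactly the policy "v=spf1 -all".
#   -F  fixed string, not a regular expression
#   -x  the whole line must match
#   -q  no output, exit status only
sends_no_mail() {
    grep -qxF -e '"v=spf1 -all"'
}

# Sample data: two of the TXT records quoted above.
printf '%s\n' \
    '"v=spf1 -all"' \
    '"For actual contact info please check the corresponding net domain."' |
    sends_no_mail && echo "SPF says: this domain does not send mail"
```

Matching the whole line is deliberate. An SPF record that lists some permitted senders before the final -all would not count as "does not send mail".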

So wrapping up, with these simple changes, quick to implement if you are in a position to edit your DNS zones, we achieved:

  • Ridding ourselves of an entry point that produced only annoyances
  • Letting the world know (or at least the subset that knows how to operate common DNS tools) what the status of the mail service is and why, plus a small hint on how to make contact in case that is actually required.
A little DNS will sometimes go a long way.

A big Thank You to Security Evangelist Per Thorsheim (yes, that is his actual title) who brought RFC7505 to my attention again with this somewhat shorter blog post in Norwegian (also in English here).

Update 2021-02-23: After gentle prodding in this tweet (via JP Mens), also preserved as a screenshot, I added a DMARC record for the domain too (kind of overkill, but can't hurt I suppose).

Friday, February 28, 2020

The 'sextortion' Scams: The Numbers Show That What We Have Is A Failure Of Education

Subject: Your account was under attack! Change your credentials!
From: Melissa <chenbin@jw-hw.com>
To: adnan@bsdly.net

Hello!

I am a hacker who has access to your operating system.

I also have full access to your account.

I've been watching you for a few months now.

The fact is that you were infected with malware through an adult site that you visited.


Did you receive a message phrased more or less like that, which then went on to say that they have a video of you performing an embarrassing activity while visiting an "adult" site, which they will send to all your contacts unless you buy Bitcoin and send to a specific ID?

The good news is that the video does not exist. I know this, because neither does our friend Adnan here. Despite that fact, whoever operates the account presenting as Melissa appears to believe that Adnan is indeed a person who can be blackmailed. You're probably safe for now. I will provide more detail later in the article, but first a few dos and don'ts:
  • Whatever some tempting web site tells you in a popup, unless you know what you are doing, do not install software on your devices from any other sources than the official ones. You do not need to install a new video viewer for that site or update your existing one, neither do you need to enter your administrator user name and password along with your credit card details into an unfamiliar-looking dialog box or web form.
     
  • Unless you know what you are doing, stay away from Bitcoin or other cryptocurrencies. If that message is the first you've heard of Bitcoin, you do not know what you are doing, leave it alone. As assets go, there is not much difference between financial derivatives, toxic waste and cryptocurrencies like Bitcoin, in that they should be handled with equal care and only from a distance unless you are in fact an expert in the field.
     
  • If you are not sure about either of the two bullet points before this one, please forget any shame over what you may or may not have done, and contact somebody you trust and who knows the subject better. This may be an adult such as a parent, teacher, social worker or other, a tech-savvy friend, or for that matter law enforcement such as your local police.

The important point is that you are or were about to be the victim of what I consider a very obvious scam, and for no good or even nearly valid reason. You should not need to become the next victim.

And this, dear policy makers and tech heads in general, is our problem: A large subset of the general public simply do not know their way around the digital world we created for them to live in. We need to do better.

In that context I find it quite disturbing that people who should know better, such as the Norwegian Center for Information Security, in a recently issued report (also see Digi.no's article (both in Norwegian only, sorry)) predict that the sextortion attacks will become "more sophisticated and credible". Then again at some level they may technically be right, since this kind of activity starts out with a net negative credibility score.

A case in point: Some versions of the scam messages I have been able to study went as far as to claim that the perpetrators had not only taken control of the target's device, they had even sent that very email message from there. That never happened, of course, and it would have been easy for anybody who had learned to interpret Received: headers to verify that the message was in fact sent from the great elsewhere. Unfortunately the skill of reading email headers is rarely, if ever, taught to ordinary users.
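
For those who do want to look, the Received: headers can be pulled out of a saved message with the tools in the base system. The following is a minimal sketch. The function name show_received is made up, and the sample message is invented for the example, using documentation-only host names and addresses.

```sh
#!/bin/sh
# show_received: hypothetical helper. Prints the Received: headers of a
# message read from stdin, including folded continuation lines, and stops
# at the first empty line, which marks the end of the header block.
show_received() {
    awk '/^$/ { exit }
         /^[ \t]/ { if (keep) print; next }
         { keep = ($0 ~ /^[Rr]eceived:/) }
         keep { print }'
}

# Invented sample message; every host and address here is a placeholder.
sample='Received: from mail.example.net (mail.example.net [192.0.2.25])
    by mx.example.org; Sat, 22 Feb 2020 04:04:35 +0100
Received: from localhost (unknown [198.51.100.7])
    by mail.example.net; Sat, 22 Feb 2020 04:04:30 +0100
From: Melissa <melissa@example.net>
Subject: Your account was under attack!

Received: this line is body text and is not printed'

printf '%s\n' "$sample" | show_received
```

Each server that handles a message adds its own Received: header at the top, so the list reads from newest to oldest. The entries written by servers you trust (normally the topmost ones, added by your own mail service) are the ones that tell you where the message actually came from. Anything further down may have been forged by the sender.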

The fact that people do not understand those -- to techies -- obvious facts is a fairly central and burdening problem, and again we need to do better.

Now let me explain. Things get incrementally more technical from here, so if you came here only for the admonitions or practical advice and have no use for the background, feel free to wander off.

I know the message I quoted at the beginning here is a scam because I run my own mail service, and looking at just the logs there just now I see that since the last logs archiving rotation early Saturday morning, more than 3000 attempts at delivery of messages like the one for Adnan happened, aimed at approximately 200 non-existent recipients before my logs tell me they finally tried to deliver one to my primary contact address, never actually landing in any inboxes.

One of the techniques we use to weed out unwanted incoming mail is to maintain and publish a list of known bad and invalid email addresses in our domains. These known bad addresses have then in ways unknown (at least not known to us in any detail) made it into the list of addresses sold to spammers, and we at the receiving end can use the bad addresses as triggers to block traffic from the sending hosts (If you are interested, you can read elsewhere on this blog for details on how we do this, look for tags such as greylisting, greytrapping or antispam).

If it was not clear earlier, those numbers tell us something about the messages at hand. It should be fairly obvious that compromising videos of non-existent users could not, in fact, exist.

Looking back in archived logs from the same system I see that a variant of this message started appearing in late January 2018. The specifics of that message sequence will be interesting to revisit when the full history of sextortion (I still do not like the term, but my preferred alternative is at risk of being filtered out by polite society-serving robots) will be written, but let us rather turn to the more recent data, as in data recorded earlier this week.

Mainly because I found the media coverage of the "sextortion" phenomenon generally uninformed and somewhat annoying, I had been mulling writing an article about it for a while, but I was still looking for a productive angle when on Wednesday evening I noticed a slight swelling in the number of greytrapped hosts. A glance at my spamd log seemed to indicate that at least one of the delivery attempts had a line like

       I am a hacker who has access to your operating system.

Which was actually just what I had been pondering writing about.  

So I set about doing a little research. I grepped (searched) in my yet-unrotated spamd logs for the word hacker, which yielded lots of lines of the type

Feb 22 04:04:35 skapet spamd[8716]: 89.22.104.47: Body: I am a hacker who has access to your operating system.
Feb 22 04:17:04 skapet spamd[8716]: 5.79.23.92: Body: I am a hacker who has access to your operating system.
Feb 22 04:34:03 skapet spamd[8716]: 153.120.146.199: Body: I am a hacker who has access to your operating system.
Feb 22 04:40:30 skapet spamd[8716]: 45.181.93.45: Body: I am a hacker who has access to your operating system.
Feb 22 04:55:04 skapet spamd[8716]: 93.186.247.18: Body: I am a hacker who has access to your operating system.
Feb 22 05:09:39 skapet spamd[8716]: 123.51.190.154: Body: I am a hacker who has access to your operating system.
Feb 22 05:13:22 skapet spamd[8716]: 212.52.131.4: Body: I am a hacker who has access to your operating system.
Feb 22 05:38:02 skapet spamd[8716]: 5.79.23.92: Body: I am a hacker who has access to your operating system.
Feb 22 05:44:39 skapet spamd[8716]: 123.51.190.154: Body: I am a hacker who has access to your operating system.
Feb 22 06:00:30 skapet spamd[8716]: 45.181.93.45: Body: I am a hacker who has access to your operating system.

(the full result has been preserved here). Extracting the source addresses gave a list of 198 IP addresses (preserved here).

Extracting the To: addresses from the fuller listing yielded 192 unique email addresses (preserved here). Looking at the extracted target email addresses yielded some interesting insights:

1) The target email addresses were not exclusively in the domains my system actually serves, and

2) Some ways down the list of target email addresses, my own primary address turns up.

Of course 2) made me look a little closer, and only one IP address in the extract had tried delivery to my email address.

A further grep on that IP address turned up this result.

There are really no surprises to be had here, at least to a large subset of my supposed readers. The sender had first tried to deliver one of the sextortion video messages to one of the by now more than a quarter million spamtraps, and its IP address was still blacklisted by the time it finally tried delivery to a potentially deliverable address.

Doing a few spot checks on the sender IP addresses in recent and less recent logs, it looks like only two things could be mildly exciting about those messages. One is the degree to which the content was intended to be embarrassing to the recipient. The other is a possible indicator of the campaign's success: Looking back through the logs for the approximate year of known activity, it even looks like the campaign became multilingual, while retaining the word "hacker" in most if (possibly) not all language versions.

Other than that it is almost depressing how normal the sextortion campaign is: It uses the same spam sending infrastructure and the same low quality target address lists (the ones containing some subset of my spamtrap addresses) as the regular and likely not too successful spammers of every stripe. Nothing else stands out.

And as returning readers will notice, the logs indicate that the spambots are naive enough in their SMTP code that they frequently mistake spamd's delaying tactics for a slow, but functional open SMTP relay.

Now to recap the main points:
  • Regular users: The sextortion messages are scams, the videos do not exist. If this quasi-random sample is representative, the scammers are seen to send to 200 non-existing, invalid addresses before lucking on a real one. This alone strongly indicates that no videos exist. There is no reason to send money, bitcoin or otherwise. Look instead to learning how your devices and the networks and services they connect to actually work.
  • Competent mail admins: The tools to stop the flow of sextortion messages or at least slow to a manageable trickle are available today. You simply need to keep your antispam game up to speed with best practices and best of breed tools. If you are a user or someone who manages mail admins, check what your mail service does.
  • Competent authorities: Please step up to the task of educating the public. Sane, fact based approaches to IT security work. While it is easy to get distracted by the potential presence of porn and users' feelings of shame over accessing that kind of material, assigning much weight to that side of the matter is counterproductive. Work to educate the public and please focus on real threats, not imagined ones like the present topic.
Whatever evolves next out of these rather hamfisted attempts at blackmail is unlikely to ever achieve any level of sophistication worthy of the name.

We would all be much better served by focusing on real threats such as, but not limited to, credential harvesting via deceptive content delivered over advertising networks, which themselves are a major headache security- and privacy-wise, or even harvesting via phishing email.

Both of the latter have been known to lead to successful compromise with data exfiltration and identity theft as possible-to-probable results.

To a large extent the damage could have been significantly limited had the general public been taught sensible security practices such as using multi-factor authentication or at least actually good passwords combined with securely coded password management applications, and insisting that services encourage such practices.

Yes, I know you have been dying to ask: What is the thing about Adnan? According to my activity log, the address adnan@bsdly.net was added as a spamtrap on July 8th, 2017 after somebot had tried to log on as the user adnan, a user name not seen before at bsdly.net,

Jul  8 09:40:34 skapet sshd[34794]: Failed password for invalid user adnan from 118.217.181.8 port 41091 ssh2

apparently from a network in South Korea.

As always, there is more log material available to competent practitioners and researchers with a valid research agenda. Please contact me if you are such a person who could use the collected data productively.


Update 2020-02-29: For completeness and because I felt that an unsophisticated attack like the present one deserves a thorough if unsophisticated analysis, I decided to take a look at the log data for the entire 7 day period, post-rotation.

So here comes some armchair analysis, using only the tools you will find in the base system of your OpenBSD machine or any other running a sensibly stocked unix-like operating system. We start with finding the total number of delivery attempts logged where we have the body text 'am a hacker' (this would show up only after a sender has been blacklisted, so the gross number of actual delivery attempts will likely be a tad higher), with the command

zgrep "am a hacker" /var/log/spamd.0.gz | awk '{print $6}' | wc -l

which tells us the number is 3372.

Next up we use a variation of the same command to extract the source IP addresses of the log entries that contain the string 'am a hacker', sort the result while also removing duplicates and store the end result in an environment variable called lastweek:

 export lastweek=`zgrep "am a hacker" /var/log/spamd.0.gz | awk '{print $6}' | tr -d ':' | sort -u `

With our list of IP addresses tucked away in the environment variable, we go on to the next step: for each IP address in our lastweek set, extract all log entries and store the result (still in crude sort order by IP address) in the file 2020-02-29_i_am_hacker.raw.txt:

 for foo in $lastweek ; do zgrep $foo /var/log/spamd.0.gz | tee -a 2020-02-29_i_am_hacker.raw.txt ; done

For reference I kept the list of unique IP addresses (now totalling 231) around too.
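
A related question is how the attempts are distributed over the sending hosts. A minimal sketch, reusing the same field position as the awk commands above, could look like this. The function name attempts_per_source is made up, and the sample lines are three of the log entries quoted earlier in the article.

```sh
#!/bin/sh
# attempts_per_source: hypothetical helper. Reads spamd log lines on stdin
# and prints the number of 'am a hacker' body lines per source address,
# busiest source first. Field 6 is the "ADDRESS:" column, as used above.
attempts_per_source() {
    grep "am a hacker" | awk '{print $6}' | tr -d ':' |
        sort | uniq -c | sort -rn
}

# Three of the log lines quoted earlier in the article.
sample='Feb 22 04:04:35 skapet spamd[8716]: 89.22.104.47: Body: I am a hacker who has access to your operating system.
Feb 22 04:17:04 skapet spamd[8716]: 5.79.23.92: Body: I am a hacker who has access to your operating system.
Feb 22 05:38:02 skapet spamd[8716]: 5.79.23.92: Body: I am a hacker who has access to your operating system.'

printf '%s\n' "$sample" | attempts_per_source
```

On the rotated log the equivalent would be zcat /var/log/spamd.0.gz | attempts_per_source. With this sample, 5.79.23.92 comes out on top with two attempts.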

Next, we are interested in extracting the target email addresses, so the command

grep "To:" 2020-02-29_i_am_hacker.raw.txt | awk '{print substr($0,index($0,$8))}' | sort -u

finds the lines in our original extract containing "To:", and gives us the list of target addresses the sources in our data set tried to deliver mail to.

The result is preserved as 2020-02-29_i_am_hacker.raw_targets.txt, a total of 236 addresses, mostly but not all in domains we actually host here. One surprise was that among the target addresses one actually invalid address turned up that was not at that time yet a spamtrap. See the end of the activity log for details (it also turned out to be the last SMTP entry in that log for 2020-02-29).

This little round of armchair analysis on the static data set confirms the conclusions from the original article: Apart from the possibly titillating aspects of the "adult" web site mentions and the attempt at playing on the target's potential shamefulness over specific actions, as spam campaigns go, this one is ordinary to the point of being a bit boring.

There may well be other actors preying on higher-value targets through their online clumsiness and known peculiarities of taste in an actually targeted fashion, but this is not it.

A final note on tools: In this article, like all previous entries, I have exclusively used the tools you will find in the OpenBSD (or other sensibly put together unixlike operating system) base system or at a stretch as an easily available package.

For the simpler, preliminary investigations and poking around like we have done here, the basic tools in the base system are fine. But if you will be performing log analysis at scale or with any regularity for purposes that influence your career path, I would encourage you to look into setting up a proper, purpose-built log analysis system.

Several good options, open source and otherwise, are available. I will not recommend or endorse any specific one, but when you find one that fits your needs and working style you will find that after the initial setup and learning period it will save you significant time.

As per my practice, only material directly relevant to the article itself has been published via the links. If you are a professional practitioner or researcher who can state a valid reason to need access to unpublished material, please let me know and we will discuss your project.

Update 2020-03-02: I knew I had some early samples of messages that did make it to an inbox near me squirreled away somewhere, and after a bit of rummaging I found them, stored here (note the directory name, it seemed so obvious and transparent even back then). It appears that the oldest intact messages I have are from December 2018. I am sure earlier examples can be found if we look a little harder.

Update 2020-03-17: A fresh example turned up this morning, addressed to (of all things) the postmaster account of one of our associated .no domains, written in Norwegian (and apparently generated with Microsoft Office software). The preserved message can be downloaded here

Update 2020-05-10: While rummaging about (aka 'researching') for something else I noticed that spamd logs were showing delivery attempts for messages with the subject "High level of danger. Your account was under attack."  So out of idle curiosity on an early Sunday afternoon, I did the following:

$ export muggles=`grep " High level of danger." /var/log/spamd | awk '{print $6}' | tr -d ':' | sort -u`
$ for foo in $muggles; do grep $foo /var/log/spamd >>20200510-muggles ; done


and the result is preserved for your entertainment and/or enlightenment here. Not much to see, really, other than that they sent the message in two language varieties, and to a small subset of our imaginary friends.

Update 2020-08-13: Here is another snapshot of activity from August 12 and 13: this file preserves the activity of 19 different hosts, and as we can see, since they targeted our imaginary friends first, it is unlikely they reached any inboxes here. Some of these campaigns may have managed to reach users elsewhere, though.

Update 2020-09-06: Occasionally these messages manage to hit a mailbox here. Apparently enough Norwegians fall for these scams that Norwegian language versions (not terribly well worded) get aimed at users here. This example, aimed at what has only ever been an email alias made it here, slipping through by a stroke of luck during a time that IP address was briefly not in the spamd-greytrap list here, as can be seen from this log excerpt. It is also worth noting that an identically phrased message was sent from another IP address to mailer-daemon@ for one of the domains we run here.

Update 2021-01-06: For some reason, a new variant turned up here today (with a second message a few minutes later and then a third), addressed to a generic contact address here. A very quick check of logs here turned up only this indication of anything similar (based on a search for the variant spelling PRONOGRAPHIC), but feel free to check your own logs based on these samples if you like.

Update 2021-01-16: One more round, this for my Swedish alter ego. Apparently sent from a poorly secured Vietnamese system.

Update 2021-01-18: A Norwegian version has surfaced, with delivery attempted to approximately 115 addresses in .no domains we handle. Fortunately the majority of the addresses targeted were in fact spamtraps, as this log extract shows.

Update 2021-03-03: After a few quiet weeks, another campaign started swelling our greytrapped hosts collection, as this hourly count of IP addresses in the traplist at dump to file time shows:

Tue Mar  2 21:10:01 CET 2021 : 2425
Tue Mar  2 22:10:01 CET 2021 : 4014
Tue Mar  2 23:10:01 CET 2021 : 4685
Wed Mar  3 00:10:01 CET 2021 : 4847
Wed Mar  3 01:10:01 CET 2021 : 5759
Wed Mar  3 02:10:01 CET 2021 : 6560
Wed Mar  3 03:10:01 CET 2021 : 6774
Wed Mar  3 04:10:01 CET 2021 : 7997
Wed Mar  3 05:10:01 CET 2021 : 8231
Wed Mar  3 06:10:01 CET 2021 : 8499
Wed Mar  3 07:10:01 CET 2021 : 9910
Wed Mar  3 08:10:01 CET 2021 : 10240
Wed Mar  3 09:10:01 CET 2021 : 11872
Wed Mar  3 10:10:01 CET 2021 : 12255
Wed Mar  3 11:10:01 CET 2021 : 13689 
Wed Mar  3 12:10:01 CET 2021 : 14181
Wed Mar  3 13:10:01 CET 2021 : 15259
Wed Mar  3 14:10:01 CET 2021 : 15881
Wed Mar  3 15:10:02 CET 2021 : 17061
Wed Mar  3 16:10:01 CET 2021 : 17625
Wed Mar  3 17:10:01 CET 2021 : 18758
Wed Mar  3 18:10:01 CET 2021 : 19170
Wed Mar  3 19:10:01 CET 2021 : 20028
Wed Mar  3 20:10:01 CET 2021 : 20578
Wed Mar  3 21:10:01 CET 2021 : 20997

and they attempted to get to mailer-daemon@, as can be seen from this preserved message as well as this one (both of which actually did inbox due to aliases).

Stay safe out there.

Update 2021-04-17: A new variant, somewhat crudely worded, inboxed today. Preserved here, here and here.

Saturday, December 28, 2019

The Year 2019 in Review: This Was, Once Again, Weirder Than the Last One

The year is 2019. By now Blade Runner is a movie about the past, but there are still bots out there trying to guess our passwords. It gets worse from here while the dictionaries expand.

The year is coming to an end, and events during that year, as they happened, somehow led to me leaving writing mainly to one side and blog posting only until I saw a bigger picture.

Now with only a couple of days left to go, we see that this year began much like the previous, with a not too bright set of bots endlessly trying to guess passwords. But on January 2, a new development caught my eye:


It was fairly obvious that some bot operator had the columns in their database mixed up, and I found the episode so laughable myself that I did not even bother to include it as the local part of a spamtrap. But as we will see later, it was an early sign of things to come. As you have probably suspected, the ssh password guessing activities have continued at pace, yielding this year so far

[Sat Dec 28 17:01:28] peter@skapet:~/website$ grep 2019 spamtraps-dateadded.txt | grep -c SSH
51233


that is, the local part of more than fifty thousand spamtraps.

In addition the early part of the year saw several campaigns of the email scams trying to extort various Bitcoin amounts in return for not publishing supposedly embarrassing videos, one of which I tweeted about:

You should be able to find further absurdities of a similar kind by looking for the hashtags #blooper_reel from that tweet as well as #turbators. With those hashtags you will notice that there is at least anecdotal evidence that messages of the same kind have been directed at a significant subset of our spamtraps here (which for obvious reasons would not have been used in connection with any actual user login anywhere), evidenced by the spamd(8) log snippet preserved in this tweet:

And as noted in the followup tweet, other weirdness was already happening:

More specifically, in the overnight haul on the morning of January 30th, I noticed via my scriptery that reports on such things that a large number of apparent bounce message deliveries to addresses made up of "Western-firstname.Chinese-lastname@mydomain.tld", such as aaron.pu@bsdly.net or abby.na@bsdly.net, had turned up, in addition to a few other varieties with no dot in the middle, possibly indicating separate sources.

That initial overnight batch had only a couple of hundred new potential spamtraps in it (as evidenced by the spamtraps added log), but even at that point the greylist data seemed to indicate that the bounces were produced by a relatively small set of IP addresses in Chinese networks. We see such bursts at times, but they rarely last long, so at first I did not think much of it before simply adding those addresses as spamtraps.

This was one round that kind of exceeded expectations, in that what we can only conclude was the noise generated by one or more phishing campaigns targeting Chinese users lasted well into April of this year and ended up yielding more than 120,000 "imaginary friends", or spamtraps as others would say. It is likely that each of those fake addresses was used more than once, and in this context we only count new ones, so the actual number of messages and users targeted was probably a lot larger than the number of faked email addresses found in our logs here.

The delivery attempt from this tweet may well have been a product of the same campaigns:


By this time whoever was behind the campaign may have achieved their goals and moved on, or we could hope that they had been shut down by competent authorities.

But back to the password guessers, sometimes referred to as The Hail Mary Cloud. We have seen amazing feats of incompetence on their part before, but I seriously thought we had reached peak when some bot tried to log on a system in my care as the user "*" (yes, asterisk):


It should be noted of course that this confused my very much grep(1)-based script that among other things turns up new candidates for spamtraps. But again it was an early indication that by their incompetence at least some of the bot herders had exposed their methods. Weird things turn up on occasion, but it took until October before it dawned on me that at least some of the password guessing bots could be running with their username and password fields switched around:


A few days later I stopped trying to write a witty article about the phenomenon:

but I kept harvesting new entries for the local parts of spamtraps, while noting that my still grep(1)-centric script for detecting candidates would relatively frequently fail while trying to interpret what looked like regular expressions, with messages such as

grep: repetition-operator operand invalid
-bash: [: ==: unary operator expected


turning up instead.
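
Messages like these are typical of what happens when a string full of regular expression metacharacters is handed to grep as a pattern, and when an unquoted, possibly empty variable ends up inside a [ ] test. The script itself is not shown here, so the following is only a sketch of one defensive pattern. The function name is_known_trap and the temporary list file are made up for the example.

```sh
#!/bin/sh
# is_known_trap: hypothetical helper. Succeeds if the candidate local part
# ($1) already appears as a whole line in the list file ($2).
#   -F  treat the candidate as a fixed string, not a regular expression
#   -x  match whole lines only
#   -q  no output, exit status only
#   -e  keeps a candidate starting with "-" from being read as an option
is_known_trap() {
    grep -qxF -e "$1" "$2"
}

list=$(mktemp)
printf '%s\n' 'adnan' '*' > "$list"

for candidate in '*' '!@#$%^&*()dianlut+_' ''; do
    # Quoting "$candidate" keeps the test well-formed even when the value
    # is empty, and "=" is the portable string comparison for [ ].
    if [ "$candidate" = "" ]; then
        echo "skipping empty candidate"
    elif is_known_trap "$candidate" "$list"; then
        echo "already listed: $candidate"
    else
        echo "new candidate: $candidate"
    fi
done
rm -f "$list"
```

With fixed-string matching, even a user name consisting of a lone asterisk is just another line to compare against.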

Some of these entries (this month's worth so far* can be found in this file) were weird enough (would you actually have created a user called !@#$%^&*()dianlut+_ ?) that they had me thinking that the operators of those bots were actually trying to be smart by working from stolen and published collections of password hashes.

The wrinkle to this that could turn to our advantage is that some of these operators managed to get the order of their fields wrong and are throwing either raw password hashes or decoded ones at our systems instead of matching user names. By reversing the process we might be able to see which collections are used, or other weird and creative things.

If you are interested in doing further research on this, please contact me by email or the comments. I consider the traplist data, the dates added log and the other material mentioned in this piece and links therein to be public and the data should be available to anyone. However, some data exists only in some more detailed logs that are preserved here only to be seen by competent eyes and used for valid purposes. If you consider yourself such a person (aka a professional), please feel free to contact me.

All the while, we see the dictionaries of user names and passwords expanding, and I for one am more than willing to help out in the effort. It helps us all identify the ne'er-do-wells as early as possible in the game.

The raw numbers for our contributions to the hopefully confusing dictionary as they stand right now are (they will be different when you read this):

We have a total of 242778 spamtraps, with the numbers added according to the dates added log this year at 131195 from SMTP traffic, 51233 from failed SSH login attempts and 11 innovations from POP3 logon attempts.

This means our list of spamtraps did not reach a full quarter million this year.

But I already sense that somebody, somewhere is about to say "Hold my beer".



* Update 2020-01-02: This file now has the complete, or as complete as can be with the current scriptery, list of usernames tried during the whole month of December 2019.

Update 2020-01-07: You would probably not notice from looking at the raw listing of attempted usernames so far this month, but the theme so far in 2020 among the new arrivals seems to be, of all things, three letter user names (take a peek at 2020 part so far at the end of the spamtraps added log). Go figure.


Sunday, November 4, 2018

Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting

SMTP email is not going away any time soon. If you run a mail service, when and to whom you present the code signifying a temporary local problem is well worth your attention.

SMTP email is everywhere and is used by everyone.

If you are a returning reader, there is a higher probability that you run a mail service yourself than in the general population.

This in turn means that you will be aware that one of the rather annoying oversights of the original and still-current specifications of the SMTP based mail system is that while it's straightforward to announce which systems are supposed to receive mail for a domain, specifying which hosts would be valid email senders was not part of the original specification at all.

Any functioning domain MUST have at least one MX (mail exchanger) record published via the domain name system, and registrars will generally not even let you register a domain unless you have set up somewhere to receive mail for the domain.

But email worked most of the time anyway, and while you would occasionally hear about valid mail not getting delivered, it was a rarer occurrence than you might think.

Then a few years along, the Internet grew out of the pure research arena and became commercial, and spam started happening. Even in the early days of spam it seems that a significant subset of the messages, possibly even the majority, was sent with faked sender addresses in domains not connected to the actual senders.

Over time people have tried a number of approaches to the problems involved in getting rid of unwanted commercial and/or malware carrying email. If you are interested in a deeper dive into the subject, you could jump over to my earlier piece Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools.

Two very different methods of reducing spam traffic were originally formulated at roughly the same time, and each method's adherents are still duking it out over which approach is the better one.

One method consists simply of implementing a strict interpretation of a requirement that was already formulated in the SMTP RFC at the time.

The other is a complicated extension of the SMTP-relevant data that is published via DNS, and full implementation would require reconfiguration of every SMTP email system in the world.

As you might have guessed, the first is what is commonly referred to as greylisting, where we point to the RFC's requirement that on encountering a temporary error, the sender MUST (RFC language does not get stronger than this) retry delivery at a later time and keep trying for a reasonable amount of time.

Spammers generally did not retry as per the RFC specifications, and even early greylisting adopters saw a huge drop in the volume of spam that actually made it to mailboxes.
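
To make the mechanism concrete, here is a toy sketch of the greylisting decision. It is not a description of how spamd(8) works internally; the state file layout, the function name and the 25 minute default wait are all assumptions chosen for the example.

```shell
# Hypothetical greylisting decision, for illustration only.
# Usage: greylist_check STATEFILE NOW IP SENDER RECIPIENT
# Prints 451 (temporary failure, try again later) or 250 (accept).
greylist_check() {
    state=$1
    now=$2
    key="$3|$4|$5"
    passtime=${PASSTIME:-1500}   # assumed minimum wait in seconds

    # Find when this (IP, sender, recipient) tuple was first seen, if ever.
    first=$(KEY="$key" awk -F '\t' '$1 == ENVIRON["KEY"] { print $2; exit }' "$state" 2>/dev/null)

    if [ -z "$first" ]; then
        # First contact: remember the tuple and ask the sender to retry.
        printf '%s\t%s\n' "$key" "$now" >> "$state"
        echo 451
    elif [ $((now - first)) -lt "$passtime" ]; then
        # Retried too soon: still a temporary failure.
        echo 451
    else
        # A proper retry after the wait: let the message through.
        echo 250
    fi
}
```

A sender that never retries never gets past the first 451, which is the whole trick.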

On the other hand, end users would sometimes wonder why their messages were delayed, and some mail administrators did not take well to seeing the volume of data sitting in the mail spool directories grow measurably, if not usually uncontrollably, while successive retries after waiting were in progress.

In what could almost appear as a separate, unconnected universe, other network engineers set out to fix the now glaringly obvious omission in the existing RFCs.

A way to announce valid senders was needed, and the specification that was to be known as the Sender Policy Framework (SPF for short) was offered to the world. SPF offered a way to specify which IP addresses valid mail from a domain was supposed to come from, and even included ways to specify how strictly the limitations it presented should be enforced at the receiving end.

The downsides were that all mail handling would need to be upgraded with code that supported the specification, and as it turned out, traditional forwarding such as performed by common mailing list software would not easily be made compatible with SPF.

The flame wars over both methods. You either remember them or should be able to imagine how they played out.

And while the flames grew less frequent and generally less fierce over time, mail volumes grew to the level where operators would have a large number of servers for outgoing mail, and while the site would honor the requirement to retry delivery, the retries would not be guaranteed to come from the same IP address as the original attempt.

It was becoming clear to greylisting practitioners that interpreting published SPF data as known good senders was the most workable way forward. Several of us already had started maintaining nospamd tables (see eg this slide and this), and using the output of

$ host -ttxt domain.tld

(sometimes many times over because some domains use include statements), we generally made do. I even made a habit of publishing my nospamd file.
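
For illustration, the manual digging amounted to something like the following sketch, which picks the network entries and the include targets out of TXT record text fed to it on standard input. The function names are invented for this example, and only the plain ip4:, ip6: and include: forms are handled; each include target would need another lookup of its own.

```shell
# Hypothetical filters for SPF TXT record text, such as the output of
# host -ttxt. Each token is put on its own line, surrounding quotes are
# stripped, and only tokens with the wanted prefix are printed.

# Prints the networks listed as ip4: or ip6: mechanisms.
spf_networks() {
    tr ' ' '\n' | sed -n -e 's/^"//' -e 's/"$//' -e 's/^ip[46]://p'
}

# Prints the domains named in include: mechanisms.
spf_includes() {
    tr ' ' '\n' | sed -n -e 's/^"//' -e 's/"$//' -e 's/^include://p'
}
```

A typical use would be host -ttxt domain.tld | spf_networks, repeated for every domain that spf_includes turns up.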

As hinted in this slide, smtpctl (part of the OpenSMTPd system and in your OpenBSD base system) has since OpenBSD 6.3 been able to retrieve the entire contents of the published SPF information for any domain you feed it.

Looking over my old nospamd file during the last week or so I found enough sedimentary artifacts there, including IP addresses for which there was no explanation and that lacked a reverse lookup, that I turned instead to deciphering which domains had been problematic and wrote a tiny script to generate a fresh nospamd on demand, based on fresh SPF lookups on those domains. The list of domains fed to the script is available here, but please do edit to suit your local needs.

For those wary of clicking links to scripts, it reads like this:

#!/bin/sh
domains=`cat thedomains.txt`
outfile=nospamd
generatedate=`date`
operator="Peter Hansteen <peter@bsdly.net>"
locals=local-additions

echo "##############################################################################################">$outfile;
echo "# This is the `hostname` nospamd generated from domains at $generatedate. ">>$outfile;
echo "# See https://bsdly.blogspot.com/2018/11/goodness-enumerated-by-robots-or.html for some">>$outfile;
echo "# background and on why you should generate your own and not use this one.">>$outfile;
echo "# Any questions should be directed to $operator. ">>$outfile;
echo "##############################################################################################">>$outfile;
echo >>$outfile;

for dom in $domains; do 
 echo "processing $dom";
 echo "# $dom starts #########">>$outfile;
 echo >>$outfile;
 echo $dom | doas smtpctl spf walk >>$outfile;
 echo "# $dom ends ###########">>$outfile;
 echo >>$outfile;
done

echo "##############################################################################################">>$outfile;
echo "# processing done at `date`.">>$outfile; 
echo "##############################################################################################">>$outfile;

echo "adding local additions from $locals";
echo "# local additions below here ----" >>$outfile;
cat $locals >> $outfile;

If you have been in the habit of fetching my nospamd, you have been fetching the output of this script for the last day or so.

What it does is simply read a prepared list of domains, run them through smtpctl spf walk and slap the results in a file which you would then load into the pf configuration on your spamd machine. You can even tack on a few local additions that for whatever reason do not come naturally from the domains list.
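
Loading the result could look something like this sketch. The table name nospamd, the function name and the PFCTL override are assumptions made for the example, so adjust them to match your own pf.conf.

```shell
# Hypothetical wrapper around pfctl(8) for loading a generated nospamd file.
load_nospamd() {
    file=$1
    # Refuse to replace the table with a file that has no entries at all,
    # that is, nothing but comments and blank lines (or no file).
    if ! grep -qv -e '^#' -e '^[[:space:]]*$' "$file"; then
        echo "refusing to load $file: no addresses found" >&2
        return 1
    fi
    # Left unquoted on purpose so that "doas pfctl" splits into two words.
    # Set PFCTL to something else for a dry run.
    ${PFCTL:-doas pfctl} -t nospamd -T replace -f "$file"
}
```

The emptiness check is there because a failed generate run would otherwise quietly wipe the table.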

But I would actually recommend you do not fetch my generated data, and rather use this script or a close relative of it (it's a truly trivial script and you probably can create a better version) and your own list of domains to generate a nospamd tailored to your local environment.

The specific list of domains is derived from more than a decade of maintaining my setup and the specific requests for whitelisting I have received from my users or quick fixes to observed problems in that period. It is conceivable that some domains that were problematic in the past no longer are, and unless we actually live in the same area, some of the domains in my list are probably not relevant to your users. There is even the possibility that some of the larger operators publish different SPF information in specific parts of the world, so the answers I get may not even match yours in all cases.

So go ahead, script and generate! This is your chance to help the robots generate some goodness, for the benefit of your users.

In related news, a request from my new colleagues gave me an opportunity to update the sometimes-repeated OpenBSD and you presentation so it now has at least some information on OpenBSD 6.4. You could call the presentation a bunch of links in a thin wrapper of advocacy and you would not be very wrong.

If you have comments or questions on any of the issues raised in this article, please let me know, preferably via the (moderated) comments field, but I have also been known to respond to email and via various social media message services.

Update 2018-11-11: A few days after I had posted this article, an incident happened that showed the importance of keeping track of both goodness and badness for your services. This tweet is my reaction to a few quick glances at the bsdly.net mail server log:

A little later I'm clearly pondering what to do, including doing another detailed writeup.
Fortunately I had had some interaction with this operator earlier, so I knew roughly how to approach them. I wrote a couple of quick messages to their abuse contacts and made sure to include links to both my spamtrap resources and a fresh log excerpt that indicated clearly that someone or someones in their network was indeed progressing from top to bottom of the spamtraps list.
As the last tweet says, delivery attempts stopped after progressing to somewhere into the Cs. The moral might be that a list of spamtraps like the one I publish might be useful for other sites for filtering their outgoing mail. Any activity involving the known-bad addresses would be a strong indication that somebody made a very unwise purchasing decision involving address lists.
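
A minimal sketch of such an outgoing check, assuming both the trap list and the recipient list are plain files with one full address per line (the function name is made up for the example):

```shell
# Hypothetical check: prints every recipient that is a known spamtrap.
# Usage: trapped_recipients TRAPLIST RECIPIENTS
# -F matches fixed strings, -x whole lines only, -i ignores case,
# -f reads the strings to look for from the trap list.
trapped_recipients() {
    grep -Fxi -f "$1" -- "$2"
}
```

Any output at all would be a good reason to have a word with whoever supplied the recipient list.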

Update 2019-08-07: Gmail seems to be stuck on considering bsdly.net mail spam these days. If you are using a Google-attached mail service and have not received mail you were expecting from me, please check your spam folder and if you find anything, please use the "Report as not spam" feature.

Update 2019-08-07: Updated script and generated file comment with encouragement to generate your own nospamd based on local needs, included link to the list used for the last generate-nospamd run.

Monday, August 13, 2018

Badness, Enumerated by Robots

A condensed summary of the blacklist data generated from traffic hitting bsdly.net and cooperating sites.

After my runbsd.info entry (previously bsdjobs.com) was posted, there has been an uptick in interest in the security related data generated at the bsdly.net site. I have written quite extensively about these issues earlier so I'll keep this piece short. If you want to go deeper, the field note-like articles I reference and links therein will offer some further insights.

There are three separate sets of downloadable data, all automatically generated and with only very occasional manual intervention.


Known spam sources during the last 24 hours

This is the list directly referenced in the BSDjobs.com piece.

This is a greytrapping based list, where the conditions for inclusion are simple: Attempts at delivery to known-bad addresses (download link here) in domains we handle mail for have happened within the last 24 hours.

In addition there will occasionally be some addresses added by cron jobs I run that pick the IP addresses of hosts that sent mail that made it through greylisting performed by our spamd(8) but did not pass the subsequent spamassassin or clamav treatment. The bsdly.net system is part of the bgp-spamd cooperation.

The traplist has a home page and at one point was furnished with a set of guidelines.

A partial history (the log starts 2017-05-20) of when spamtraps were added and from which sources can be found in this log (or at this alternate location). Read on for a bit of information on the alternate sources.

Misc other bots: SSH Password bruteforcing, malicious web activity, POP3 Password Bruteforcing.

The bruteforcers list is really a combination of several things, delivered as one file but with minimal scripting ability you should be able to dig out the distinct elements, described in this piece.

The (usually) largest chunk is a list of hosts that hit the rate limit for SSH connections described in the article or that were caught trying to log on as a non-existent user or other undesirable activity aimed at my sshd(8) service. Some as yet unpublished scriptery helps me feed the miscreants that the automatic processes do not catch into the table after a manual quality check. For a more thorough treatment of ssh bruteforcers, see The Hail Mary Cloud and the Lessons Learned overview article which links to several other articles in the sequence.

The second part is a list of IP addresses that tried to access our web service in undesirable ways, including trying for specific URLs or files that will never be found at any world-facing part of our site.

After years of advocating short lifetimes (typically 24 hours) for blacklist entries only to see my logs fill up with attempts made at slightly slower speeds, I set the lifetime for entries in this data set to 28 days, or 2419200 seconds (since expanded to six weeks). The background including some war stories of monitoring SSH password groping can be found in this piece, while the more recent piece here covers some of the weeding out of bad web activity.
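
Since pfctl(8) expects table expiry ages in seconds, a tiny helper keeps the arithmetic honest. The function name and the table name in the usage comment are assumptions for this sketch.

```shell
# Hypothetical helper: converts a whole number of days to seconds.
days_to_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

# 28 days (four weeks) comes to 2419200 seconds,
# 42 days (six weeks) comes to 3628800 seconds.
#
# Possible use, with a made-up table name:
#   doas pfctl -t bruteforce -T expire "$(days_to_seconds 42)"
```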

The POP3 gropers list comes in two variations. Again these are lists of IP addresses caught trying to access a service. Most of those accesses are to non-existent user names, with an almost perfect overlap with the spamtraps list, local-part only (the part before the @ sign).

The big list is a complete corpus of IP addresses that have tried these kinds of accesses since I started recording and trapping them (see this piece for some early experience and this one for the start of the big collection).

There is also a smaller set, produced from the longterm table described in this piece. For much the same reason I did not stick to 24-hour expiry for the SSH list, this one has six-week expiry. With some minimal scriptery I run by hand one or two times per day, any invalid POP3 accesses to valid accounts get their IP addresses added to the longterm table and the exported list.

If you're wondering about the title, the term "enumerating badness" stems from Marcus Ranum's classic piece The Six Dumbest Ideas in Computer Security. Please do read that one.

Here are a few references beyond those in the paragraphs above that you might find useful:

The Book of PF, 3rd edition
Hey, spammer! Here's a list for you! which contains the announcement of the bsdly.net traplist.
Effective Spam and Malware Countermeasures, a more complete treatment of those keywords

If you're interested in further information on any of this, the most useful contact information is in the comment blocks in the exported lists.

Update 2020-07-29: I added a direct link to the complete list of spamtraps, since the web page seemed a bit crowded to at least one visitor. Direct link again here for your convenience.

Update 2021-01-15: Note that at some point after the article was written I cranked up expiry for the bruteforce tables to six weeks (sorry, I forgot to note the exact date).

Update 2021-03-11: In light of recent Microsoft Exchange exploits it might interest some that any request to bsdly.net for "GET /owa/" lands the source in the webtrash table, exported as part of the bruteforcers list.

Sunday, April 1, 2018

ed(1) mastery is a must for a real Unix person

ed(1) is the standard editor. Now there's a book out to help you master this fundamental Unix tool.

In some circles on the Internet, your choice of text editor is a serious matter.

We've all seen the threads on mailing lists, USENET news groups and web forums about the relative merits of Emacs vs vi, including endless iterations of flame wars, and sometimes even involving lesser known or non-portable editing environments.

And then of course, from the Linux newbies we have seen an endless stream of tweeted graphical 'memes' about the editor vim (aka 'vi Improved') versus the various apparently friendlier-to-some options such as GNU nano. Apparently even the 'improved' version of the classical and ubiquitous vi(1) editor is a challenge even to exit for a significant subset of the younger generation.

Yes, your choice of text editor or editing environment is a serious matter. Mainly because text processing is so fundamental to our interactions with computers.

But for those of us who keep our systems on a real Unix (such as OpenBSD or FreeBSD), there is no real contest. The OpenBSD base system contains several text editors including vi(1) and the almost-emacs mg(1), but ed(1) remains the standard editor.

Now Michael Lucas has written a book to guide the as yet uninitiated to the fundamentals of the original Unix text editor. It is worth keeping in mind that much of Unix and its original standard text editor were written back when the standard output and default user interface was more likely than not a printing terminal.

To some of us, reading and following the narrative of Ed Mastery is a trip down memory lane. To others, following along the text will illustrate the horror of the world of pre-graphic computer interfaces. For others again, the fact that ed(1) doesn't use your terminal settings much at all offers hope of fixing things when something or somebody screwed up your system so you don't have a working terminal for that visual editor.

ed(1) is a line editor. And while you may have heard mutters that 'vi is just a line editor in drag', vi(1) does offer a distinctly visual interface that only became possible with the advent of the video terminal, affectionately known as the glass teletype. ed(1) offers no such luxury, but as the book demonstrates, even ed(1) is able to display any part of a file's content for when you are unsure what your file looks like.

The book Ed Mastery starts by walking the reader through a series of editing sessions using the classical ed(1) line editing interface. To some readers the thought of editing text while not actually seeing at least a few lines at a time onscreen probably sounds scary. This book shows how it is done and while the author never explicitly mentions it, the text aptly demonstrates how the ed(1) command set is in fact the precursor of how things are done in many Unix text processing programs.

As one might expect, the walkthrough of ed(1) text editing functionality is followed up by a sequence on searching and replacing which ultimately leads to a very readable introduction to regular expressions, which of course are part of the ed(1) package too. If you know your ed(1) command set, you are quite far along in the direction of mastering the stream editor sed(1), as well as a number of other systems where regular expressions play a crucial role.

After the basic editing functionality and some minor text processing magic has been dealt with, the book then proceeds to demonstrate ed(1) as a valuable tool in your Unix scripting environment. And once again, if you can do something with ed, you can probably transfer that knowledge pretty much intact to use with other Unix tools.
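
To give a flavor of what scripted ed(1) use can look like, here is a small sketch of my own (not an example taken from the book). The function name is made up, and the pattern and replacement must not contain a slash since that is used as the delimiter.

```shell
# Hypothetical helper: replace every occurrence of a pattern in a file,
# in place, by feeding ed(1) its commands on standard input.
# Usage: ed_replace FILE PATTERN REPLACEMENT
ed_replace() {
    # ",s/old/new/g" substitutes on all lines, "w" writes the file,
    # "q" quits. The -s flag suppresses the byte counts ed normally prints.
    printf '%s\n' ",s/$2/$3/g" w q | ed -s "$1"
}
```

Note that ed treats a substitution that matches nothing as an error, so a script like this stops without writing if the pattern is absent.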

The eighty-some text pages of Ed Mastery are a source of solid information on the ed(1) tool itself with a good helping of historical context that will make it clearer to newcomers why certain design choices were made back when the Unix world was new. A number of these choices influence how we interact with the modern descendants of the Unix systems we had back then.

Your choice of text editor is a serious matter. With this book, you get a better foundation for choosing the proper tool for your text editing and text processing needs. I'm not saying that you have to switch to the standard editor, but after reading Ed Mastery, your choice of text editing and processing tools will be a much better informed one.

Ed Mastery is available now directly from Michael W. Lucas' books site at https://www.michaelwlucas.com/tools/ed, and will most likely appear in other booksellers' catalogs as soon as their systems are able to digest the new data.

Do read the book, try out the standard editor and have fun!

Saturday, February 17, 2018

A Life Lesson in Mishandling SMTP Sender Verification

An attempt to report spam to a mail service provider's abuse address reveals how incompetence is sometimes indistinguishable from malice.

It all started with one of those rare spam mails that got through.

This one was hawking address lists, much like the ones I occasionally receive to addresses that I can not turn into spamtraps. The message was addressed to, of all things, root@skapet.bsdly.net. (The message with full headers has been preserved here for reference).

Yes, that's right, they sent their spam to root@. And a quick peek at the headers revealed that like most of those attempts at hawking address lists for spamming that actually make it to a mailbox here, this one had been sent by an outlook.com customer.

The problem with spam delivered via outlook.com is that you can't usefully blacklist the sending server, since the largish chunk of the world that uses some sort of Microsoft hosted email solution (Office365 and its ilk) have their usually legitimate mail delivered via the very same infrastructure.

And since outlook.com is one of the mail providers that doesn't play well with greylisting (it spreads its retries across no less than 81 subnets; the output of 'echo outlook.com | doas smtpctl spf walk' is preserved here), it's fairly common practice to just whitelist all those networks and avoid the hassle of lost or delayed mail to and from Microsoft customers.

I was going to just ignore this message too, but we've seen an increasing number of spammy outfits taking advantage of outlook.com's seeming right of way to innocent third parties' mail boxes.

So I decided to try both to do my best at demoralizing this particular sender and alert outlook.com to their problem. I wrote a message (preserved here) with a Cc: to abuse@outlook.com where the meat is,

Ms Farell,

The address root@skapet.bsdly.net has never been subscribed to any mailing list, for obvious reasons. Whoever sold you an address list with that address on it are criminals and you should at least demand your money back.

Whoever handles abuse@outlook.com will appreciate the attachment, which is a copy of the message as it arrived here with all headers intact.

Yours sincerely,
Peter N. M. Hansteen

What happened next is quite amazing.

If my analysis is correct, it may not be possible for senders who are not themselves outlook.com customers to actually reach the outlook.com abuse team.

Almost immediately after I sent the message to Ms Farell with a Cc: to abuse@outlook.com, two apparently identical messages from staff@hotmail.com, addressed to postmaster@bsdly.net appeared (preserved here and here), with the main content of both stating

This is an email abuse report for an email message received from IP 216.32.180.51 on Sat, 17 Feb 2018 01:59:21 -0800.
The message below did not meet the sending domain's authentication policy.
For more information about this format please see http://www.ietf.org/rfc/rfc5965.txt.

In order to understand what happened here, it is necessary to look at the mail server log for a time interval of a few seconds (preserved here).

The first few lines describe the processing of my outgoing message:

2018-02-17 10:59:14 1emzGs-0009wb-94 <= peter@bsdly.net H=(greyhame.bsdly.net) [192.168.103.164] P=esmtps X=TLSv1.2:ECDHE-RSA-AES128-GCM-SHA256:128 CV=no S=34977 id=31b4ffcf-bf87-de33-b53a-0ebff4349b94@bsdly.net

My server receives the message from my laptop, and we can see that the connection was properly TLS encrypted.

2018-02-17 10:59:15 1emzGs-0009wb-94 => peter <root@skapet.bsdly.net> R=localuser T=local_delivery

I had for some reason kept the original recipient among the To: addresses. Actually useless but also harmless.

2018-02-17 10:59:16 1emzGs-0009wb-94 [104.47.40.33] SSL verify error: certificate name mismatch: DN="/C=US/ST=WA/L=Redmond/O=Microsoft Corporation/OU=Microsoft Corporation/CN=mail.protection.outlook.com" H="outlook-com.olc.protection.outlook.com"
2018-02-17 10:59:18 1emzGs-0009wb-94 SMTP error from remote mail server after end of data: 451 4.4.0 Message failed to be made redundant due to A shadow copy was required but failed to be made with an AckStatus of Fail [CO1NAM03HT002.eop-NAM03.prod.protection.outlook.com] [CO1NAM03FT002.eop-NAM03.prod.protection.outlook.com]
2018-02-17 10:59:19 1emzGs-0009wb-94 [104.47.42.33] SSL verify error: certificate name mismatch: DN="/C=US/ST=WA/L=Redmond/O=Microsoft Corporation/OU=Microsoft Corporation/CN=mail.protection.outlook.com" H="outlook-com.olc.protection.outlook.com"


What we see here is that even a huge corporation like Microsoft does not always handle certificates properly. The certificate they present for setting up the encrypted connection is not actually valid for the host name that the outlook.com server presents.

There is also what I interpret as a file system related message which I assume is meaningful to someone well versed in Microsoft products, but we see that

2018-02-17 10:59:20 1emzGs-0009wb-94 => janet@prospectingsales.net R=dnslookup T=remote_smtp H=prospectingsales-net.mail.protection.outlook.com [23.103.140.138] X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=yes K C="250 2.6.0 <31b4ffcf-bf87-de33-b53a-0ebff4349b94@bsdly.net> [InternalId=40926743365667, Hostname=BMXPR01MB0934.INDPRD01.PROD.OUTLOOK.COM] 44350 bytes in 0.868, 49.851 KB/sec Queued mail for delivery"

even though the certificate fails the verification part, the connection sets up with TLSv1.2 anyway, and the message is accepted with a "Queued mail for delivery" message.

The message is also delivered to the Cc: recipient:

2018-02-17 10:59:21 1emzGs-0009wb-94 => abuse@outlook.com R=dnslookup T=remote_smtp H=outlook-com.olc.protection.outlook.com [104.47.42.33] X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=no K C="250 2.6.0 <31b4ffcf-bf87-de33-b53a-0ebff4349b94@bsdly.net> [InternalId=3491808500196, Hostname=BY2NAM03HT071.eop-NAM03.prod.protection.outlook.com] 42526 bytes in 0.125, 332.215 KB/sec Queued mail for delivery"
2018-02-17 10:59:21 1emzGs-0009wb-94 Completed


And the transactions involving my message would normally have been completed.

But ten seconds later this happens:

2018-02-17 10:59:31 1emzHG-0004w8-0l <= staff@hotmail.com H=bay004-omc1s10.hotmail.com [65.54.190.21] P=esmtps X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=no K S=43968 id=BAY0-XMR-100m4KrfmH000a51d4@bay0-xmr-100.phx.gbl
2018-02-17 10:59:31 1emzHG-0004w8-0l => peter <postmaster@bsdly.net> R=localuser T=local_delivery
2018-02-17 10:59:31 1emzHG-0004w8-0l => peter <postmaster@bsdly.net> R=localuser T=local_delivery


That's the first message to my domain's postmaster@ address, followed two seconds later by

2018-02-17 10:59:33 1emzHI-0004w8-Fy <= staff@hotmail.com H=bay004-omc1s10.hotmail.com [65.54.190.21] P=esmtps X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=no K S=43963 id=BAY0-XMR-100Q2wN0I8000a51d3@bay0-xmr-100.phx.gbl
2018-02-17 10:59:33 1emzHI-0004w8-Fy => peter <postmaster@bsdly.net> R=localuser T=local_delivery
2018-02-17 10:59:33 1emzHI-0004w8-Fy Completed


a second, apparently identical message.

Both of those messages state that the message I sent to abuse@outlook.com had failed SPF verification, because the check happened on connections from NAM03-BY2-obe.outbound.protection.outlook.com (216.32.180.51) by whatever handles incoming mail to the staff@hotmail.com address, which apparently is where the system forwards abuse@outlook.com's mail.

Reading Microsoft Exchange's variant SMTP headers has never been my forte, and I won't try decoding the exact chain of events here since that would probably also require you to have fairly intimate knowledge of Microsoft's internal mail delivery infrastructure.

But even a quick glance at the messages reveals that the message passed SPF and other checks on incoming to the outlook.com infrastructure, but may have ended up not getting delivered after all since a second SPF test happened on a connection from a host that is not in the sender domain's SPF record.

In fact, that second test would only succeed for domains that have

include:spf.protection.outlook.com

in their SPF record, and those would presumably be Outlook.com customers.

Any student or practitioner of SMTP mail delivery should know that SPF checks should only happen on ingress, that is at the point where the mail traffic enters your infrastructure and the sender IP address is the original one. Leave the check for later when the message may have been forwarded, and you do not have sufficient data to perform the check.
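
The point can be illustrated with a simplified sketch of the address comparison at the heart of an SPF check. It handles IPv4 only, ignores everything else in the SPF specification, and the function names are invented for the example; the addresses in the usage note are documentation ranges, not anybody's real SPF data.

```shell
# Hypothetical helpers for a toy SPF-style address check (IPv4 only).

# Converts a dotted quad to a single integer.
ip4_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds if the address (first argument) is inside the network given
# in CIDR notation (second argument).
ip4_in_cidr() {
    ip=$(ip4_to_int "$1")
    net=$(ip4_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Prints "pass" if the connecting address is in any of the listed networks,
# "fail" otherwise. Usage: spf_verdict CLIENT_IP NETWORK...
spf_verdict() {
    client=$1
    shift
    for allowed in "$@"; do
        if ip4_in_cidr "$client" "$allowed"; then
            echo pass
            return 0
        fi
    done
    echo fail
}
```

If a domain publishes 192.0.2.0/24, a connection from 192.0.2.77 at ingress passes. Run the same check again after an internal hop, when the connection comes from the forwarding host at, say, 198.51.100.5, and the very same message fails.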

Whenever I encounter incredibly stupid and functionally destructive configuration errors like this I tend to believe they're down to simple incompetence and not malice.

But this one has me wondering. If you essentially require incoming mail to include the contents of spf.protection.outlook.com (currently no less than 81 subnets) as valid senders for the domain, you are essentially saying that only outlook.com customers are allowed to communicate.

If that restriction is a result of a deliberate choice rather than a simple configuration error, the problem moves out of the technical sphere and could conceivably become a legal matter, depending on what outlook.com have specified in their contracts that they are selling to their customers.

But let us assume that this is indeed a matter of simple bad luck or incompetence and that the solution is indeed technical.

I would have liked to report this to whoever does technical things at that domain via email, but unfortunately there are indications that being their customer is a precondition for using that channel of communication to them.

I hope they fix that, and soon. And then move on to terminating their spamming customers' contracts.

The main lesson to be learned from this is that when you shop around for email service, please do yourself a favor and make an effort to ensure that your prospective providers understand how the modern-ish SMTP addons SPF, DKIM and DMARC actually work.

Otherwise you may end up receiving more of the mail you don't want than of the mail you do want, and your own mail may end up not being delivered as intended.

Update 2018-02-19: Just as I was going to get ready for bed (it's late here in CET) another message from Ms Farell arrived, this time to an alias I set up in order to make it easier to filter PF tutorial related messages into a separate mailbox.

I wrote another response, and as the mail server log will show, despite the fact that a friend with an Office365 contract contacted them quoting this article, outlook.com have still not fixed the problem. Two more messages (preserved here and here) shot back here immediately.

Update 2018-02-20: A response from Microsoft, with pointers to potentially useful information.

A message from somebody identifying as working for Microsoft Online Safety arrived, apparently responding to my message dated 2018-02-19, where the main material was,

Hi,

Based on the information you provided, it appears to have originated from an Office 365 or Exchange Online tenant account.

To report junk mail from Office 365 tenants, send an email to junk@office365.microsoft.com and include the junk mail as an attachment.

This link provides further junk mail education https://technet.microsoft.com/en-us/library/jj200769(v=exchg.150).aspx.

Kindly,
I have asked for clarification of some points, but no response has arrived by this getting close to bedtime in CET.

However, I did take the advice to forward the offending messages as attachments to the junk@ address, and put the outlook.com abuse address in the Cc: on that message. My logs indicate that the certificate error had not gone away, but no SPF-generated bounces appeared either.

If Microsoft responds with further clarifications, I will publish a useful condensate here.

Update 2019-07-16: If you were wondering how I make the output of smtpctl spf walk useful as mentioned in this article, please see the article Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting for some specifics.


In other news, there will be a PF tutorial at the 2018 AsiaBSDCon in Tokyo. Follow the links for the most up to date information.