Monday, August 13, 2018

Badness, Enumerated by Robots

A condensed summary of the blacklist data generated from traffic hitting bsdly.net and cooperating sites.

Since my BSDjobs.com entry was posted, there has been an uptick in interest in the security-related data generated at the bsdly.net site. I have written quite extensively about these issues earlier, so I'll keep this piece short. If you want to go deeper, the field note-like articles I reference and the links therein will offer some further insights.

There are three separate sets of downloadable data, all automatically generated and with only very occasional manual intervention.


Known spam sources during the last 24 hours

This is the list directly referenced in the BSDjobs.com piece.

This is a greytrapping based list, where the conditions for inclusion are simple: Attempts at delivery to known-bad addresses in domains we handle mail for have happened within the last 24 hours.

In addition there will occasionally be some addresses added by cron jobs I run that pick the IP addresses of hosts that sent mail that made it through greylisting performed by our spamd(8) but did not pass the subsequent spamassassin or clamav treatment. The bsdly.net system is part of the bgp-spamd cooperation.

The traplist has a home page and at one point was furnished with a set of guidelines.
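If you want to feed the list to your own spamd(8), a spamd.conf(5) entry along these lines is all it takes. Take this as a sketch only; the URL and the message text below are typed from memory, so check the traplist home page and the guidelines for the authoritative values before putting anything like this in production:

# excerpt from /etc/mail/spamd.conf -- illustrative only, verify the URL
all:\
        :bsdly:

bsdly:\
        :black:\
        :msg="Your address %A has sent mail to a spamtrap here within the last 24 hours":\
        :method=http:\
        :file=www.bsdly.net/~peter/bsdly.net.traplist

With that in place, spamd-setup(8), typically run from cron, fetches the list and loads it into the blacklist.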

Misc other bots: SSH password bruteforcing, malicious web activity, POP3 password bruteforcing.

The bruteforcers list is really a combination of several things, delivered as one file, but with minimal scripting ability you should be able to dig out the distinct elements, which are described in this piece.
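Exactly how you split it depends on how the file looks when you fetch it, but assuming the sections are set off by identifying comment lines, a sed range plus a grep to strip comments does the job, roughly like this (the marker texts and the file name here are placeholders, so look at the actual file for the real ones):

$ sed -n '/SSH bruteforcers/,/POP3 gropers/p' bruteforcers.txt | grep -v '^#' > ssh-bruteforcers.txt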

The (usually) largest chunk is a list of hosts that hit the rate limit for SSH connections described in the article, or that were caught trying to log on as a non-existent user or engaging in other undesirable activity aimed at my sshd(8) service. Some as yet unpublished scriptery helps me feed the miscreants that the automatic processes do not catch into the table after a manual quality check.

The second part is a list of IP addresses that tried to access our web service in undesirable ways, including trying for specific URLs or files that will never be found at any world-facing part of our site.

After years of advocating short lifetimes (typically 24 hours) for blacklist entries only to see my logs fill up with attempts made at slightly slower speeds, I set the lifetime for entries in this data set to 28 days. The background, including some war stories of monitoring SSH password groping, can be found in this piece, while the more recent piece here covers some of the weeding out of bad web activity.
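If you load the data into a table on your own PF setup and want the same lifetime, the pruning is a one-liner you can run from cron; here with the 28 days expressed in seconds and a table name of your own choosing:

# drop entries that have not been referenced for 28 days
$ doas pfctl -t bruteforce -T expire 2419200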

The POP3 gropers list comes in two variations. Again these are lists of IP addresses caught trying to access a service; most of those accesses are to non-existent user names that overlap almost perfectly with the local parts (the part before the @ sign) of the spamtraps list.

The big list is a complete corpus of IP addresses that have tried these kinds of accesses since I started recording and trapping them (see this piece for some early experience and this one for the start of the big collection).

There is also a smaller set, produced from the longterm table described in this piece. For much the same reason I did not stick to 24-hour expiry for the SSH list, this one has a six-week expiry. With some minimal scriptery I run by hand once or twice per day, any invalid POP3 accesses to valid accounts get their IP addresses added to the longterm table and the exported list.
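The by-hand part is no big deal either; once an offender has been identified, feeding it to the table boils down to something like the following (the address here is a placeholder from the documentation range):

$ doas pfctl -t longterm -T add 192.0.2.45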

If you're wondering about the title, the term "enumerating badness" stems from Marcus Ranum's classic piece The Six Dumbest Ideas in Computer Security. Please do read that one.

Here are a few references, beyond those mentioned in the paragraphs above, that you might find useful:

The Book of PF, 3rd edition
Hey, spammer! Here's a list for you! which contains the announcement of the bsdly.net traplist.
Effective Spam and Malware Countermeasures, a more complete treatment of those keywords

If you're interested in further information on any of this, the most useful contact information is in the comment blocks in the exported lists.

Sunday, April 1, 2018

ed(1) mastery is a must for a real Unix person

ed(1) is the standard editor. Now there's a book out to help you master this fundamental Unix tool.

In some circles on the Internet, your choice of text editor is a serious matter.

We've all seen the threads on mailing lists, USENET news groups and web forums about the relative merits of Emacs vs vi, including endless iterations of flame wars, sometimes even involving lesser known or non-portable editing environments.

And then of course, from the Linux newbies we have seen an endless stream of tweeted graphical 'memes' about the editor vim (aka 'vi Improved') versus the various apparently friendlier-to-some options such as GNU nano. Apparently even the 'improved' version of the classical and ubiquitous vi(1) editor is a challenge simply to exit for a significant subset of the younger generation.

Yes, your choice of text editor or editing environment is a serious matter. Mainly because text processing is so fundamental to our interactions with computers.

But for those of us who keep our systems on a real Unix (such as OpenBSD or FreeBSD), there is no real contest. The OpenBSD base system contains several text editors including vi(1) and the almost-emacs mg(1), but ed(1) remains the standard editor.

Now Michael Lucas has written a book to guide the as yet uninitiated to the fundamentals of the original Unix text editor. It is worth keeping in mind that much of Unix and its original standard text editor were written back when the standard output and default user interface was more likely than not a printing terminal.

To some of us, reading and following the narrative of Ed Mastery is a trip down memory lane. To others, following along the text will illustrate the horror of the world of pre-graphic computer interfaces. For others again, the fact that ed(1) doesn't use your terminal settings much at all offers hope of fixing things when something or somebody screwed up your system so you don't have a working terminal for that visual editor.

ed(1) is a line editor. And while you may have heard mutters that 'vi is just a line editor in drag', vi(1) does offer a distinctly visual interface that only became possible with the advent of the video terminal, affectionately known as the glass teletype. ed(1) offers no such luxury, but as the book demonstrates, even ed(1) is able to display any part of a file's content for when you are unsure what your file looks like.

The book Ed Mastery starts by walking the reader through a series of editing sessions using the classical ed(1) line editing interface. To some readers the thought of editing text while not actually seeing at least a few lines at a time onscreen probably sounds scary. This book shows how it is done, and while the author never explicitly mentions it, the text aptly demonstrates how the ed(1) command set is in fact the precursor of how things are done in many Unix text processing programs.

As one might expect, the walkthrough of ed(1) text editing functionality is followed up by a sequence on searching and replacing which ultimately leads to a very readable introduction to regular expressions, which of course are part of the ed(1) package too. If you know your ed(1) command set, you are quite far along in the direction of mastering the stream editor sed(1), as well as a number of other systems where regular expressions play a crucial role.

After the basic editing functionality and some minor text processing magic has been dealt with, the book then proceeds to demonstrate ed(1) as a valuable tool in your Unix scripting environment. And once again, if you can do something with ed, you can probably transfer that knowledge pretty much intact to use with other Unix tools.
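If you have never seen ed(1) driven from a script, the basic pattern is simply to feed it commands on standard input, for example via a here-document. A minimal sketch (the file name and the strings are of course made up for the occasion):

# replace a string throughout a config file and write the result back
ed -s /etc/some.conf <<'EOF'
,s/oldvalue/newvalue/g
w
q
EOF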

The eighty-some text pages of Ed Mastery are a source of solid information on the ed(1) tool itself with a good helping of historical context that will make it clearer to newcomers why certain design choices were made back when the Unix world was new. A number of these choices influence how we interact with the modern descendants of the Unix systems we had back then.

Your choice of text editor is a serious matter. With this book, you get a better foundation for choosing the proper tool for your text editing and text processing needs. I'm not saying that you have to switch to the standard editor, but after reading Ed Mastery, your choice of text editing and processing tools will be a much better informed one.

Ed Mastery is available now directly from Michael W. Lucas' books site at https://www.michaelwlucas.com/tools/ed, and will most likely appear in other booksellers' catalogs as soon as their systems are able to digest the new data.

Do read the book, try out the standard editor and have fun!

Saturday, February 17, 2018

A Life Lesson in Mishandling SMTP Sender Verification

An attempt to report spam to a mail service provider's abuse address reveals how incompetence is sometimes indistinguishable from malice.

It all started with one of those rare spam mails that got through.

This one was hawking address lists, much like the ones I occasionally receive to addresses that I can not turn into spamtraps. The message was addressed to, of all things, root@skapet.bsdly.net. (The message with full headers has been preserved here for reference).

Yes, that's right, they sent their spam to root@. And a quick peek at the headers revealed that like most of those attempts at hawking address lists for spamming that actually make it to a mailbox here, this one had been sent by an outlook.com customer.

The problem with spam delivered via outlook.com is that you can't usefully blacklist the sending server, since the largish chunk of the world that uses some sort of Microsoft hosted email solution (Office365 and its ilk) have their usually legitimate mail delivered via the very same infrastructure.

And since outlook.com is one of the mail providers that doesn't play well with greylisting (it spreads its retries across no less than 81 subnets; the output of 'echo outlook.com | doas smtpctl spf walk' is preserved here), it's fairly common practice to just whitelist all those networks and avoid the hassle of lost or delayed mail to and from Microsoft customers.
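In practice that whitelisting usually amounts to dumping the output of that very command to a file, loading the file into a table and letting traffic from those networks bypass the spamd(8) rules. A rough sketch, with file and table name following the common nospamd convention from the spamd documentation:

# typically run from root's crontab to keep the list reasonably fresh
echo outlook.com | smtpctl spf walk > /etc/mail/nospamd

and then in /etc/pf.conf, after the rules that send SMTP traffic to spamd:

table <nospamd> persist file "/etc/mail/nospamd"
pass in on egress proto tcp from <nospamd> to any port smtp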

I was going to just ignore this message too, but we've seen an increasing number of spammy outfits taking advantage of outlook.com's seeming right of way to innocent third parties' mail boxes.

So I decided to try both to do my best at demoralizing this particular sender and to alert outlook.com to their problem. I wrote a message (preserved here) with a Cc: to abuse@outlook.com where the meat is,

Ms Farell,

The address root@skapet.bsdly.net has never been subscribed to any mailing list, for obvious reasons. Whoever sold you an address list with that address on it are criminals and you should at least demand your money back.

Whoever handles abuse@outlook.com will appreciate the attachment, which is a copy of the message as it arrived here with all headers intact.

Yours sincerely,
Peter N. M. Hansteen

What happened next is quite amazing.

If my analysis is correct, it may not be possible for senders who are not themselves outlook.com customers to actually reach the outlook.com abuse team.

Almost immediately after I sent the message to Ms Farell with a Cc: to abuse@outlook.com, two apparently identical messages from staff@hotmail.com, addressed to postmaster@bsdly.net appeared (preserved here and here), with the main content of both stating

This is an email abuse report for an email message received from IP 216.32.180.51 on Sat, 17 Feb 2018 01:59:21 -0800.
The message below did not meet the sending domain's authentication policy.
For more information about this format please see http://www.ietf.org/rfc/rfc5965.txt.

In order to understand what happened here, it is necessary to look at the mail server log for a time interval of a few seconds (preserved here).

The first few lines describe the processing of my outgoing message:

2018-02-17 10:59:14 1emzGs-0009wb-94 <= peter@bsdly.net H=(greyhame.bsdly.net) [192.168.103.164] P=esmtps X=TLSv1.2:ECDHE-RSA-AES128-GCM-SHA256:128 CV=no S=34977 id=31b4ffcf-bf87-de33-b53a-0ebff4349b94@bsdly.net

My server receives the message from my laptop, and we can see that the connection was properly TLS encrypted.

2018-02-17 10:59:15 1emzGs-0009wb-94 => peter <root@skapet.bsdly.net> R=localuser T=local_delivery

I had for some reason kept the original recipient among the To: addresses. Actually useless but also harmless.

2018-02-17 10:59:16 1emzGs-0009wb-94 [104.47.40.33] SSL verify error: certificate name mismatch: DN="/C=US/ST=WA/L=Redmond/O=Microsoft Corporation/OU=Microsoft Corporation/CN=mail.protection.outlook.com" H="outlook-com.olc.protection.outlook.com"
2018-02-17 10:59:18 1emzGs-0009wb-94 SMTP error from remote mail server after end of data: 451 4.4.0 Message failed to be made redundant due to A shadow copy was required but failed to be made with an AckStatus of Fail [CO1NAM03HT002.eop-NAM03.prod.protection.outlook.com] [CO1NAM03FT002.eop-NAM03.prod.protection.outlook.com]
2018-02-17 10:59:19 1emzGs-0009wb-94 [104.47.42.33] SSL verify error: certificate name mismatch: DN="/C=US/ST=WA/L=Redmond/O=Microsoft Corporation/OU=Microsoft Corporation/CN=mail.protection.outlook.com" H="outlook-com.olc.protection.outlook.com"


What we see here is that even a huge corporation like Microsoft does not always handle certificates properly. The certificate they present for setting up the encrypted connection is not actually valid for the host name that the outlook.com server presents.

There is also what I interpret as a file system related message which I assume is meaningful to someone well versed in Microsoft products, but we see that

2018-02-17 10:59:20 1emzGs-0009wb-94 => janet@prospectingsales.net R=dnslookup T=remote_smtp H=prospectingsales-net.mail.protection.outlook.com [23.103.140.138] X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=yes K C="250 2.6.0 <31b4ffcf-bf87-de33-b53a-0ebff4349b94@bsdly.net> [InternalId=40926743365667, Hostname=BMXPR01MB0934.INDPRD01.PROD.OUTLOOK.COM] 44350 bytes in 0.868, 49.851 KB/sec Queued mail for delivery"

even though the certificate fails the verification part, the connection sets up with TLSv1.2 anyway, and the message is accepted with a "Queued mail for delivery" message.

The message is also delivered to the Cc: recipient:

2018-02-17 10:59:21 1emzGs-0009wb-94 => abuse@outlook.com R=dnslookup T=remote_smtp H=outlook-com.olc.protection.outlook.com [104.47.42.33] X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=no K C="250 2.6.0 <31b4ffcf-bf87-de33-b53a-0ebff4349b94@bsdly.net> [InternalId=3491808500196, Hostname=BY2NAM03HT071.eop-NAM03.prod.protection.outlook.com] 42526 bytes in 0.125, 332.215 KB/sec Queued mail for delivery"
2018-02-17 10:59:21 1emzGs-0009wb-94 Completed


And the transactions involving my message would normally have been completed.

But ten seconds later this happens:

2018-02-17 10:59:31 1emzHG-0004w8-0l <= staff@hotmail.com H=bay004-omc1s10.hotmail.com [65.54.190.21] P=esmtps X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=no K S=43968 id=BAY0-XMR-100m4KrfmH000a51d4@bay0-xmr-100.phx.gbl
2018-02-17 10:59:31 1emzHG-0004w8-0l => peter <postmaster@bsdly.net> R=localuser T=local_delivery
2018-02-17 10:59:31 1emzHG-0004w8-0l => peter <postmaster@bsdly.net> R=localuser T=local_delivery


That's the first message to my domain's postmaster@ address, followed two seconds later by

2018-02-17 10:59:33 1emzHI-0004w8-Fy <= staff@hotmail.com H=bay004-omc1s10.hotmail.com [65.54.190.21] P=esmtps X=TLSv1.2:ECDHE-RSA-AES256-SHA384:256 CV=no K S=43963 id=BAY0-XMR-100Q2wN0I8000a51d3@bay0-xmr-100.phx.gbl
2018-02-17 10:59:33 1emzHI-0004w8-Fy => peter <postmaster@bsdly.net> R=localuser T=local_delivery
2018-02-17 10:59:33 1emzHI-0004w8-Fy Completed


a second, apparently identical message.

Both of those messages state that the message I sent to abuse@outlook.com had failed SPF verification, because the check happened on connections from NAM03-BY2-obe.outbound.protection.outlook.com (216.32.180.51) by whatever handles incoming mail to the staff@hotmail.com address, which apparently is where the system forwards abuse@outlook.com's mail.

Reading Microsoft Exchange's variant SMTP headers has never been my forte, and I won't try decoding the exact chain of events here since that would probably also require you to have fairly intimate knowledge of Microsoft's internal mail delivery infrastructure.

But even a quick glance at the messages reveals that the message passed SPF and other checks on entry into the outlook.com infrastructure, but may have ended up not getting delivered after all since a second SPF test happened on a connection from a host that is not in the sender domain's SPF record.

In fact, that second test would only succeed for domains that have

include:spf.protection.outlook.com

in their SPF record, and those would presumably be Outlook.com customers.

Any student or practitioner of SMTP mail delivery should know that SPF checks should only happen on ingress, that is, at the point where the mail traffic enters your infrastructure and the sender IP address is still the original one. Leave the check for later, when the message may have been forwarded, and you no longer have sufficient data to perform the check.
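For those who have not looked at SPF lately: the published record simply enumerates the hosts or networks that are allowed to deliver mail claiming to be from the domain, as in this made-up example,

example.com.   IN TXT   "v=spf1 ip4:192.0.2.25 mx -all"

and that comparison is only meaningful at the point where you still see the original sender's IP address. A forwarding host further down the chain will by definition not be in that list.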

Whenever I encounter incredibly stupid and functionally destructive configuration errors like this I tend to believe they're down to simple incompetence and not malice.

But this one has me wondering. If you essentially require that senders of incoming mail list the contents of spf.protection.outlook.com (currently no less than 81 subnets) as valid senders in their domain's SPF record, you are effectively saying that only outlook.com customers are allowed to communicate.

If that restriction is a result of a deliberate choice rather than a simple configuration error, the problem moves out of the technical sphere and could conceivably become a legal matter, depending on what outlook.com have specified in their contracts that they are selling to their customers.

But let us assume that this is indeed a matter of simple bad luck or incompetence and that the solution is indeed technical.

I would have liked to report this to whoever does technical things at that domain via email, but unfortunately there are indications that being their customer is a precondition for using that channel of communication to them.

I hope they fix that, and soon. And then move on to terminating their spamming customers' contracts.

The main lesson to be learned from this is that when you shop around for email service, please do yourself a favor and make an effort to ensure that your prospective providers actually understand how the modern-ish SMTP addons SPF, DKIM and DMARC actually work.

Otherwise you may end up receiving more of the mail you don't want than what you do want, and your own mail may end up not being delivered as intended.

Update 2018-02-19: Just as I was going to get ready for bed (it's late here in CET) another message from Ms Farell arrived, this time to an alias I set up in order to make it easier to filter PF tutorial related messages into a separate mailbox.

I wrote another response, and as the mail server log will show, despite the fact that a friend with an Office365 contract contacted them quoting this article, outlook.com have still not fixed the problem. Two more messages (preserved here and here) shot back here immediately.

Update 2018-02-20: A response from Microsoft, with pointers to potentially useful information.

A message from somebody identifying as working for Microsoft Online Safety arrived, apparently responding to my message dated 2018-02-19, where the main material was,

Hi,

Based on the information you provided, it appears to have originated from an Office 365 or Exchange Online tenant account.

To report junk mail from Office 365 tenants, send an email to junk@office365.microsoft.com and include the junk mail as an attachment.

This link provides further junk mail education https://technet.microsoft.com/en-us/library/jj200769(v=exchg.150).aspx.

Kindly,

I have asked for clarification of some points, but no response had arrived by the time of writing this update, which is getting close to bedtime in CET.

However, I did take the advice to forward the offending messages as attachments to the junk@ address, and put the outlook.com abuse address in the Cc: on that message. My logs indicate that the certificate error had not gone away, but no SPF-generated bounces appeared either.

If Microsoft responds with further clarifications, I will publish a useful condensate here.



In other news, there will be PF tutorial at the 2018 AsiaBSDCon in Tokyo. Follow the links for the most up to date information.

Sunday, August 27, 2017

Twenty-plus years on, SMTP callbacks are still pointless and need to die

A rarely used legacy misfeature of the main Internet email protocol creeps back from irrelevance as a minor annoyance. You should ask your mail and antispam provider about their approach to 'SMTP callbacks'. Be wary of any assertion that is not backed by evidence.

Even if you are an IT professional and run an email system, you could be forgiven for not being immediately aware that there is such a thing as SMTP callbacks, also referred to as callback verification. As you will see from the Wikipedia article, the feature was never widely adopted, and for all too understandable reasons.

If you do run a mail system, you have probably heard about that feature's predecessor, the still-required but rarely used SMTP VRFY and EXPN commands. Those commands offer a way to verify whether an address is valid and to show the component addresses that a mailing list resolves to, respectively.

Back when all things inter-networking were considered experimental and it was generally thought that information should flow freely in and between those experimental networks, it was quite common for mail servers to offer VRFY and EXPN service to all comers.

I'm old enough to remember using VRFY by hand, telnet-ing to port 25 on a mail server and running VRFY $user@$domain.$tld commands to check whether an email address was indeed valid. I've forgotten which domains and persons were involved, but I imagine the reason why was that I wanted to contact somebody who had said something interesting in a post to a USENET news group.

But networkers trying to make contact with each other were not the only ones who discovered the VRFY and EXPN commands.  Soon spammers were using those commands to actively harvest actually! valid! deliverable! addresses, and by 1999 the RFC2505 best practices document recommended disabling the features altogether. After all, there would usually be some other way available to find somebody's email address (there was even a FAQ, a longish Frequently Asked Questions document with apparent USENET origins written and maintained on the subject, a copy of which can be found here).

In roughly the same time frame, somebody came up with the idea of SMTP callbacks. The idea was that all domains on the Internet need to publish the address of their mail exchangers via DNS MX (mail exchanger) records. The logical next step is then that when a piece of mail arrives over SMTP, the receiving end should be able to contact the sender domain's known mail exchanger to check that the sender address is indeed valid. If you by now hear the echoes of VRFY and EXPN, you're right. There are indications that some early implementations did in fact use VRFY for that purpose.

But then the world changed, and you could not rely on VRFY being available in the post-RFC2505 world.

In the post-RFC2505 world, the other side would most likely not offer up any useful information in response to VRFY commands, and you would most likely be limited to the short interchange that the Wikipedia entry quotes,
HELO <verifier host name>
MAIL FROM:<>
RCPT TO:<the address to be tested>
QUIT
which a perceptive reader would identify as only verifying in a very limited sense that the domain's mail exchanger was indeed equipped with a functional SMTP service.

It is worth noting, as many have over the years, that the MX records only specify where a domain expects to receive mail, not where valid mail from the domain is supposed to originate. Several mechanisms to help identify valid mail senders for a domain have been devised in the intervening years, but none existed at the time SMTP callbacks were considered even remotely useful. 

For reasons that are not entirely clear, some developers kept working on SMTP callback code and several mail server implementations available today (2017) still contain code that looks like it was intended to support information-rich callbacks, if the system was configured to enable the feature at all. The default configurations in general do not enable the SMTP callback feature, and mail admins rarely bother to even learn much about the largely disused and (in my opinion at least) not too well thought out feature.

This all happened back in the 1990s, but quite recently an incident occurred that indicates that in some pockets of the Internet, SMTP callbacks are still in use, and in at least some cases data from the callbacks are used to generate blacklists and block mail delivery. The last part should raise a few eyebrows at least.

Jumping forward from the distant 1990s to the present day, regular readers of this column will be aware that bsdly.net and cooperating domains run an SMTP service with OpenBSD spamd(8) doing greylisting, and that the spamd(8) setup produces a greytrapping-based blacklist which is available for download, dumped to a file (available here and here) once per hour.

Maintaining the mail system and the blacklist also involves keeping an eye on mail-related activities, and invalid addresses in our domains that turn up in the greylist are usually added to the list of spamtrap addresses within not too many hours after they first appear. The process of actually adding spamtrap addresses is a manual one, but based on the output of pathetically simple shell scripts that run as cron jobs.

The list of spamtraps has grown over the years to more than 38 000 entries. Most of the entries have local parts that are pure generated gibberish, some entries are probably degraded versions of earlier spamtrap addresses and some again seem to conform with specific patterns, including but not limited to SMTP or NNTP message IDs.

On August 19th and 20th 2017 I noticed a different, but yet familiar pattern in some of the new entries.

The entry that caught my eye had the RCPT TO: part as

mx42.antispamcloud.com-1503146097-testing@bsdly.com

The local part pattern was somewhat familiar, and breaks down to

    $localhostname-$epochtime-testing

with @targetdomain.$tld (in our case, bsdly.com) appended. I had at this point totally forgotten about SMTP callbacks, but I decided to check the logs for any traces of activity involving that host. The only trace I could find in the logs was at the spamd-serving firewall in front of the bsdly.com domain's secondary mail exchanger:

Aug 19 14:35:27 delilah spamd[26915]: 207.244.64.181: connected (25/24)
Aug 19 14:35:38 delilah spamd[26915]: (GREY) 207.244.64.181: <> -> <mx42.antispamcloud.com-1503146097-testing@bsdly.com>
Aug 19 14:35:38 delilah spamd[15291]: new entry 207.244.64.181 from <> to <mx42.antispamcloud.com-1503146097-testing@bsdly.com>, helo mx18-12.smtp.antispamcloud.com
Aug 19 14:35:38 delilah spamd[26915]: 207.244.64.181: disconnected after 11 seconds.

Essentially a normal first contact: spamd at our end answers slowly, one byte per second, but the greylist entry is created in the expectation that any caller with a valid message to deliver would try again within a reasonable time. The spamd synchronization between the hosts in our group of greylisting hosts would see to it that an entry matching this sequence appeared in the greylist on all participating hosts.

But the retry never happened, and even if it had, that particular local-part would anyway have produced an "Unknown user" bounce. But at that point I decided to do a bit of investigation and dug out what seemed to be a reasonable point of contact for the antispamcloud.com domain and sent an email with a question about the activity.

That message bounced, with the following explanation in the bounce message body:

  DOMAINS@ANTISPAMCLOUD.COM
    host filter10.antispamcloud.com [31.204.155.103]
    SMTP error from remote mail server after end of data:
    550 The sending IP (213.187.179.198) is listed on https://spamrl.com as a source of dictionary attacks.

As you have probably guessed, 213.187.179.198 is the IPv4 address of the primary mail exchanger for bsdly.net, bsdly.com and a few other domains under my care.

If you go to the URL quoted in the bounce, you will notice that the only point of contact is via an email address in an unrelated domain.

I did fire off a message to that address from an alternate site, but before the answer to that one arrived, I had managed to contact another of their customers and got confirmation that they were indeed running with an exim setup that used SMTP callbacks.

The spamrl.com web site states clearly that they will not supply any evidence in support of their decision to blacklist. Somebody claiming to represent spamrl.com did respond to my message, but as could be expected from their published policy was not willing to supply any evidence to support the claim stated in the bounce.

In my last message to spamrl.com before starting to write this piece, I advised

I remain unconvinced that the description of that problem is accurate, but investigation at this end can not proceed without at least some supporting evidence such as times of incidents, addresses or even networks affected.
If there is a problem at this end, it will be fixed. But that will not happen as a result of handwaving and insults. Actual evidence to support further investigation is needed.
Until verifiable evidence of some sort materializes, I will assume that your end is misinterpreting normal greylisting behavior or acting on unfounded or low-quality reports from less than competent sources.

The domain bsdly.com was one I registered some years back mainly to fend off somebody who offered to help the owner of the bsdly.net domain acquire the very similar bsdly.com domain at the price of a mere few hundred dollars.

My response was to spend something like ten dollars (or was it twenty?) to register the domain via my regular registrar. I may even have sent back a reply about trying to sell me what I already owned, but I have not bothered to dig that far back into my archives.

The domain does receive mail, but is otherwise not actively used. However, as the list of spamtraps can attest (the full list does not display in a regular browser, since some of the traps are interpreted as html tags, if you want to see it all, fetch the text file instead), others have at times tried to pass off something or other with from addresses in that domain.

But with the knowledge that this outfit's customers are believers in SMTP callbacks as a way to identify spam, here is my hypothesis on what actually happened:

On August 19th 2017, my greylist scanner identified the following new entries referencing the bsdly.com domain:
anecowuutp@bsdly.com
pkgreewaa@bsdly.com
eemioiyv@bsdly.com
keerheior@bsdly.com
mx42.antispamcloud.com-1503146097-testing@bsdly.com
vbehmonmin@bsdly.com
euiosvob@bsdly.com
otjllo@bsdly.com
akuolsymwt@bsdly.com

I'll go out on a limb and guess that mx42.antispamcloud.com was contacted by one of the roughly 5000 hosts blacklisted at bsdly.net at the time, with an attempt to deliver a message with a MAIL FROM: of either anecowuutp@bsdly.com, pkgreewaa@bsdly.com, eemioiyv@bsdly.com or perhaps most likely keerheior@bsdly.com, which appears as a bounce-to address in the same hourly greylist dump where mx42.antispamcloud.com-1503146097-testing@bsdly.com first appears as a To: address.

The first seen time in epoch notation for keerheior@bsdly.com is
1503143365
which translates via date -r to
Sat Aug 19 13:49:25 CEST 2017
while mx42.antispamcloud.com-1503146097-testing@bsdly.com is first seen here at epoch 1503146138, which translates to Sat Aug 19 14:35:38 CEST 2017.

The data indicate that this initial (and only) attempt to contact was aimed at the bsdly.com domain's secondary mail exchanger, and was intercepted by the greylisting spamd that sits in the incoming signal path to there. The other epoch-tagged callbacks follow the same pattern, as can be seen from the data preserved here.

Whatever action or address triggered the callback, the callback appears to have followed the familiar script:
  1. register attempt to deliver mail
  2. look up the domain stated in the MAIL FROM: or perhaps even the HELO or EHLO
  3. contact the domain's mail exchangers with the rump SMTP dialog quoted earlier
  4. with no confirmation of anything other than the fact that the domain's mail exchangers do listen on the expected port, proceed to whatever the next step is.
The known facts at this point are:
  1. a mail system that is set up for SMTP callbacks received a request to deliver mail from keerheior@bsdly.com
  2. the primary mail exchanger for bsdly.com has the IPv4 address 213.187.179.198
Both of these are likely true. The second we know for sure, and the first is quite likely. What is missing here is any consideration of where the request to deliver came from.

From the data we have here, we do not have any indication of what host contacted the system that initiated the callback. In a modern configuration, it is reasonable to expect that a receiving system checks for sender validity via any SPF, DKIM or DMARC records available, or for that matter, greylist and wait for the next attempt (in fact, greylisting before performing any other checks - as an OpenBSD spamd(8) setup would do by default - is likely to be the least resource intensive approach).

We have no indication that the system performing the SMTP callout used any such mechanism to find an indication as to whether the communication partner was in fact in any way connected to the domain it was trying to deliver mail for.

My hypothesis is that whatever code is running on the SMTP callback adherents' systems does not check the actual sending IP address, but assumes that any message claiming to be from a domain must in fact involve the primary mail exchanger of that domain and since the code likely predates the SPF, DKIM and DMARC specifications by at least a decade, it will not even try to check those types of information. Given the context it is a little odd but perhaps within reason that in all cases we see here, the callback is attempted not to the domain's primary mail exchanger, but the secondary. 

With somebody or perhaps even several somebodies generating nonsense addresses in the bsdly.com domain at an appreciable rate (see the record of new spamtraps starting May 20th, 2017, alternate location here) and trying to deliver using those fake From: addresses to somewhere doing SMTP callback, it's not much of a stretch to assume that the code was naive enough to conclude that the purported sender domain's primary mail exchanger was indeed performing a dictionary attack.

The most useful lesson to take home from this sorry affair is likely to be that you need to kill SMTP callback setups in any system where you may find them. In today's environment, SMTP callbacks do not in fact provide useful information that is not available from other public sources, and naive use of results from those lookups is likely to harm unsuspecting third parties.

So,
  • If you are involved in selling or operating a system that behaves like the one described here and are in fact generating blacklists based on those very naive assumptions, you need to stop doing so right away.

    Your mistaken assumptions help produce bad data which could lead to hard to debug problems for innocent third parties.

    Or as we say in the trade, you are part of the problem.

  • If you are operating a system that does SMTP callbacks but doesn't do much else, you are part of a small problem and likely only inconveniencing yourself and your users.

    The fossil record (aka the accumulated collection of spamtrap addresses at bsdly.net) indicates that the callback variant that includes epoch times is rare enough (approximately 100 unique hosts registered over a decade) that callback activity in total volume probably does not rise above the level of random background noise.

    There may of course be callback variants that have other characteristics, and if you see a way to identify those from the data we have, I would very much like to hear from you.

  • If you are a customer of somebody selling antispam products, you have reason to demand an answer to just how, if at all, your antispam supplier utilizes SMTP callbacks. If they think it's a fine and current feature, you have probably been buying snake oil for years.

  • If you are the developer or maintainer of mail server code that contains the SMTP callbacks feature, please remove the code. Leaving it disabled by default is not sufficient. Removing the code is the only way to make sure the misfeature will never again be a source of confusing problems.
For some hints on what I consider a reasonable and necessary level of transparency in blacklist maintenance, please see my April 2013 piece Maintaining A Publicly Available Blacklist - Mechanisms And Principles.

The data this article is based on still exists and will be available for further study as long as the request comes with a reasonable justification. I welcome comments in the comment field or via email (do factor in any possible greylist delay, though).

Any corrections or updates that I find necessary based on your responses will be appended to the article.



Update 2017-09-05: Since the article was originally published, we've seen a handful of further SMTP callback incidents. The last few we've handled by sending the following to the addresses that could be gleaned from whois on the domain name and source IP address (with mx.nxdomain.nx and 192.0.2.74 inserted as placeholders here to protect the ignorant):


Hi,

I see from my greylist dumps that the host identifying as 

mx.nxdomain.nx, IP address 192.0.2.74

is performing what looks like SMTP callbacks, with the (non-existent of course) address

mx.nxdomain.nx-1504629949-testing@bsdly.com

as the RCPT TO: address.

It is likely that this activity has been triggered by spam campaigns using made up addresses in one of our little-used domains as from: addresses.

A series of recent incidents here following the same pattern are summarized in the article

http://bsdly.blogspot.com/2017/08/twenty-plus-years-on-smtp-callbacks-are.html

Briefly, the callbacks do not work as you expect. Please read the article and then disable that misfeature. Otherwise you will be complicit in generating false positives for your SMTP blacklist.

If you have any questions or concerns, please let me know.

Yours sincerely,
Peter N. M. Hansteen

If you've received a similarly-worded notice recently, you know why and may be closer to having a sanely run mail service.



Update 2017-11-02: It looks like the spamrl.com service still has customers that believe in the same claim made in the bounce messages quoted earlier in this article and perform SMTP callbacks exactly like they did when this article was first written. If you have any verifiable information on that outfit and their activities, I would very much like to hear from you.

Monday, July 10, 2017

OpenBSD and the modern laptop

Did you think that OpenBSD is suitable only for firewalls and high-security servers? Think again. Here are my steps to transform a modern mid to high range laptop into a useful Unix workstation with OpenBSD.

One thing that never ceases to amaze me is that whenever I'm out and about with my primary laptop at conferences and other places where geeks gather, a significant subset of the people I meet have a hard time believing that my laptop runs OpenBSD, and that it's the only system installed.

A typical exchange runs something like,
"So what system do you run on that laptop there?"
"It's OpenBSD. xfce is the window manager, and on this primary workstation I tend to just upgrade from snapshot to snapshot."
"Really? But ..."
and then it takes a bit of demonstrating that yes, the graphics runs with the best available resolution the hardware can offer, the wireless network is functional, suspend and resume does work, and so forth. And of course, yes, I do use that system when writing books and articles too. Apparently heavy users of other free operating systems do not always run them on their primary workstations.

I'm not sure at what time I permanently converted my then-primary workstation to run OpenBSD exclusively, but I do remember that when I took delivery of the ThinkPad R60 (mentioned in this piece) in 2006, the only way forward was to install the most recent OpenBSD snapshot. By mid-2014 the ThinkPad SL500 started falling to pieces, and its replacement was a Multicom Ultrabook W840, manufactured by Clevo. The Clevo Ultrabook has weathered my daily abuse and being dragged to various corners of the world for conferences well, but on the trek to BSDCan 2017 cracks started appearing in the glass on the display and the situation worsened on the return trip.

So the time came to shop around for a replacement. After a bit of comparing I came back to Multicom, a small computer and parts supplier outfit in rural Åmli in southern Norway, the same place I had sourced the previous one.

One of the things that attracted me to that particular shop and their own-branded offerings is that they will let you buy those computers with no operating system installed. That is of course what you want to do when you source your operating system separately, as we OpenBSD users tend to do.

The last time around I had gone for a "Thin and lightweight" 14 inch model (Thickness 20mm, weight 2.0kg) with 16GB RAM, 240GB SSD for system disk and 1TB HD for /home (since swapped out for a same-size SSD, as the dmesg will show).

Three years later, the rough equivalent with some added oomph for me to stay comfortable for some years to come left me with a 13.3 inch model, 18mm thick and advertised as 1.3kg (but actually weighing in at 1.5kg, possibly due to extra components), 32GB RAM, 512GB SSD and 2TB harddisk. For now the specification can be viewed online here (the site language is Norwegian, but product names and units of measure are not in fact different).

That system arrived today, in a slender box:



Here are the two machines, the old (2014-vintage) and the new side by side:



The OpenBSD installer is a wonder of straightforward, no-nonsense simplicity that simply gets the job done. Even so, if you are not yet familiar with OpenBSD, it is worth spending some time reading the OpenBSD FAQ's installation guidelines and the INSTALL.$platform file (in our case, INSTALL.amd64) to familiarize yourself with the procedure. If you're following this article to the letter and will be installing a snapshot, it is worth reading the notes on following -current too.

The main hurdle back when I was installing the 2014-vintage 14" model was getting the system to consider the SSD, which showed up as sd1, the automatic choice for booting (I solved that by removing the MBR, setting the size of the MBR partition on the hard drive that showed up as sd0 to 0 and enlarging the OpenBSD part to fill the entire drive).

Let's see how the new one is configured, then. I tried running with the default UEFI "Secure boot" option enabled, and it worked.

Here we see the last part of the messages that scroll across the screen when the new laptop boots from the USB thumbdrive that has had the most recent OpenBSD/amd64 install61.fs dd'ed onto it:



And as the kernel messages showed us during boot (yes, that scrolled off the top before I got around to taking the picture), the SSD came up as sd1 while the hard drive registered as sd0. Keep that in mind for later.



After the initial greeting from the installer, the first menu asks what we want to do. This is a new system, so only (A)utoinstall and (I)nstall would have any chance of working. I had not set up for automatic install this time around, so choosing (I)nstall was the obvious thing to do.

The next item the installer wants to know is which keyboard layout to use and to set as the default on the installed system. I'm used to using Norwegian keyboards, so no is the obvious choice for me here. If you want to see the list of available options, you press ? and then choose the one you find the most suitable.

Once you've chosen the keyboard layout, the installer prompts you for the system's host name. This is only the host part; the domain part comes later. I'm sure your site or organization has some kind of policy in place for choice of host names. Make sure you stay inside any local norms; the name illustrated here conforms with what we use at this site.

Next up the installer asks which network interfaces to configure. A modern laptop such as this one comes with at least two network interfaces: a wireless interface, in this case an Intel part that is supported in OpenBSD with the iwm(4) driver, and a wired gigabit ethernet interface which the installer kernel recognized as re0.

Quite a few pieces of the hardware in a typical modern laptop require the operating system to load firmware onto the device before it can start interacting meaningfully with the kernel. The Intel wireless network parts supported by the iwm(4) driver and the earlier iwn(4) all have that requirement. However, for some reason the OpenBSD project has not been granted permission to distribute the Intel firmware files, so with only the OpenBSD installer it is not possible to use iwm(4) devices during an initial install. So in this initial round I only configure the re0 interface. During the initial post-install boot the rc.firsttime script will run the fw_update(1) command, which will identify devices that require firmware files and download them from the most convenient OpenBSD firmware mirror site.
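If the automatic run does not complete for some reason, or you are simply impatient, running the command by hand over the wired connection works just as well, either for everything the system detects or for a specific driver:

$ doas fw_update
$ doas fw_update iwm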

My network here has a DHCP server in place, so I simply choose the default dhcp for IPv4 address assignment and autoconf for IPv6.

With the IPv4 and IPv6 addresses set, the installer prompts for the domain name. Once again, the choice was not terribly hard in my case.



On OpenBSD, root is a real user, and you need to set that user's password even if you will rarely if ever log in directly as root. You will need to type that password twice, and as the install documentation states, the installer will only check that the passwords match. It's up to you to set a usefully strong password, and this too is one of the things organizations are likely to have specific guidelines for.

Once root's password is set, the installer asks whether you want to start sshd(8) by default. The default is the sensible yes, but if you answer no here, the installed system will not have any services listening on the system's reachable interfaces.

The next question is whether the machine will run the X Windows system. This is a laptop with a "Full HD" display and well supported hardware to drive it, so the obvious choice here is yes.

I've gotten used to running with xenodm(1) display manager and xfce as the windowing environment, so the question about xenodm is a clear yes too, in my case.

The next question is whether to create at least one regular user during the install. Creating a user for your system administrator during install has one important advantage: the user you create at this point will be a member of the wheel group, which makes it slightly easier to move to other privilege levels via doas(1) or similar.

Here I create a user for myself, and it is added, behind the scenes, to the wheel group.

With a user in place, it is time to decide whether root will be able to log in via ssh. The sensible default is no, which means you too should just press enter here.

The installer guessed correctly for my time zone, so it's another Enter to move forward.

Next up is the part that people have traditionally found the most scary in OpenBSD installing: Disk setup.

If the machine had come with only one storage device, this would have been a no-brainer. But I have a fast SSD that I want to use as the system disk, and a slightly slower and roomier rotating rust device aka hard disk that I want primarily as the /home partition.

I noted during the bsd.rd boot that the SSD came up as sd1 and the hard drive came up as sd0, so we turn to the SSD (sd1) first.

Since the system successfully booted with the "Secure boot" options in place, I go for the Whole disk GPT option and move on to setting partition sizes.

The default suggestion for disk layout makes a lot of sense and will set sensible mount options, but I will be storing /home on a separate device, so I choose the (E)dit auto layout option and use the R for Resize option to redistribute the space left over to the other partitions.

Here is also where you decide the size of the swap space, traditionally on the boot device's b partition. Both crashdumps and suspend to disk use swap space for their storage needs, so if you care about any of these, you will need to allocate at least as much space as the amount of physical RAM installed in the system. Because I could, I allocated the double of that, or 64GB.

For sd0, I once again choose the Whole disk GPT option and make one honking big /home partition for myself.

The installer then goes on to create the file systems, and returns with the prompt to specify where to find install sets.

The USB drive that I dd'ed the install61.fs image to is the system's third sd device (sd2), so choosing disk and specifying sd2 with the subdirectory 6.1/amd64 makes sense here. On the other hand, if your network and the path to the nearest mirror is fast enough, you may actually save time choosing a http network install over installing from a relatively slow USB drive.

Anyway, the sets install proceeds and trundles through what is likely the longest period of forced inactivity that you will have during an OpenBSD install.

The installer verifies the signed sets and installs them.



Once the sets install is done, you get the offer of specifying more sets -- your site could have site-specific items in an install set -- but I don't have any of those handy, so I just press enter to accept the default done.

If you get the option to correct system time here, accept it and have ntpd(8) set your system clock to a sane setting gleaned from well known NTP servers.

With everything else in place, the installer links the kernel with a unique layout, in what is right now a -current-only feature, but one that will most likely be one of the more talked-about items in the OpenBSD 6.2 release some time in the not too distant future.

With all items on the installer's agenda done, the installer exits and leaves you at a root shell prompt where the only useful action is to type reboot and press enter. Unless, of course, you have specific items you know will need to be edited into the configuration before the reboot.

After completing the reboot, the system did unfortunately not, as expected, immediately present the xenodm login screen, but rather the text login prompt.

Looking at the /var/log/Xorg.0.log file pointed to driver problems, but after a little web searching on the obvious keywords, I found this gist note from notable OpenBSD developer Reyk Flöter that gave me the things to paste into my /etc/xorg.conf to yield a usable graphics display for now.

Update 2017-09-27: Kaby Lake support is now available. I installed the 2017-09-27 snapshot, and I am now running the machine with no xorg.conf. I preserved updated dmesg(8) output and xdpyinfo(1) output. It is worth noting that what is in that snapshot is likely very close to what will be in OpenBSD 6.2.

My task for this evening is to move my working environment to new hardware, so after install there are really only two items remaining, in no particular order:
  • move my (too large) accumulation of /home/ data to the new system, and
  • install on the new system the same selection of packages I had on the old machine.
The first item will take longer, so I shut down all the stuff I normally have running on the laptop such as web browsers, editors and various other client programs, and use pkg_info(1) to create the list of installed packages on the 'from' system:

$ pkg_info -mz >installed_packages

then I transfer the installed_packages file to the fresh system, but not before recording the df -h status of the pristine fresh install:

$ df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/sd1a     1005M   76.4M    878M     8%    /
/dev/sd0d      1.8T    552K    1.7T     0%    /home
/dev/sd1d     31.5G   12.0K   29.9G     0%    /tmp
/dev/sd1f     98.4G    629M   92.9G     1%    /usr
/dev/sd1g      9.8G    177M    9.2G     2%    /usr/X11R6
/dev/sd1h      108G    218K    103G     0%    /usr/local
/dev/sd1k      9.8G    2.0K    9.3G     0%    /usr/obj
/dev/sd1j     49.2G    2.0K   46.7G     0%    /usr/src
/dev/sd1e     98.4G    5.6M   93.5G     0%    /var

Not directly visible here is the amount of swap configured in the sd1b partition. As I mentioned earlier, crashdumps and suspend to disk both use swap space for their storage needs, so if you care about any of these, you will need to allocate at least as much space as the amount of physical RAM installed in the system. Because I could, I allocated the double of that, or 64GB.

I also take a peek at the old system's /etc/doas.conf and enter the same content on the new system to get the same path to higher privilege that I'm used to having. With those in hand, recreating the set of installed packages on the fresh system is then a matter of a single command:

$ doas pkg_add -l installed_packages

and pkg_add(1) proceeds to fetch and install the same packages I had on the old system.
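As for the doas.conf(5) content I mentioned, it does not need to be anything fancy. A generic single line like the following (a sketch, not necessarily the exact content of my file) gives members of the wheel group the familiar path to root:

# /etc/doas.conf -- let wheel members run commands as root, with password caching
permit persist :wheel as root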

Then there is the matter of transferring the I-refuse-to-admit-the-actual-number-of gigabytes that make up the content of my home directory. In many environments it would make sense to just restore from the most recent backup, but in my case, where the source and destination sit side by side, I chose to go with a simple rsync transfer:

$ rsync  -rcpPCavu 192.168.103.69:/home/peter . | tee -a 20170710-transferlog.txt

(Yes, I'm aware that I could have done something similar with nc and tar, which are both in the base system. But rsync wins by being more easily resumable.)
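For completeness, the base-system-only variant would look roughly like this, with the listener started on the receiving machine first (host name and port number are placeholders):

# on the new machine, in the target directory
$ nc -l 2345 | tar xzpf -

# on the old machine
$ tar czpf - /home/peter | nc newmachine 2345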

While the data transfers, there is ample time to check for parts of the old system's configuration that should be transferred to the new one. Setting up the hostname.iwm0 file to hold the config for the wireless networks (see the hostname.if man page) by essentially copying across the previous one is an obvious thing, and this is the time when you discover tweaks you love that were not part of that package's default configuration.
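For a typical WPA2 protected home network, the hostname.iwm0 file does not need to hold more than something like the following two lines, with the network name and key as placeholders here (see hostname.if(5) and ifconfig(8) for the full range of options):

nwid mynetwork wpakey mypassphrase
dhcp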

Some time went by while the content transferred, and I can now announce that I'm typing away on the machine that is at the same time both the most lightweight and the most powerful machine I have ever owned.

I am slowly checking and finding that the stuff I care about just works, though I haven't bothered to check whether the webcam works yet. I know you've been dying to see the dmesg, which can be found here. I'm sure I'll get to the bottom of the 'not configured' items (yes, there are some) fairly soon. Look for updates that will be added to the end of this column.

And after all these years, I finally have a machine that matches my beard color:



If you have any questions on running OpenBSD as a primary working environment, I'm generally happy to answer but in almost all cases I would prefer that you use the mailing lists such as misc@openbsd.org or the OpenBSD Facebook group so the question and hopefully useful answers become available to the general public. Browsing the slides for my recent OpenBSD and you user group talk might be beneficial if you're not yet familiar with the system. And of course, comments on this article are welcome.


Update 2017-07-18: One useful thing to do once you have your system up and running is to submit your dmesg to the NYCBUG dmesg database. The one for the system described here is up as http://dmesgd.nycbug.org/index.cgi?do=view&id=3227.

Update 2017-08-18: Ars Technica reviews the same model, in a skinnier configuration, with a focus on Linux, in Review: System76’s Galago Pro solves “just works” Linux’s Goldilocks problem.

Update 2017-08-24: After questions about how the OpenBSD installer handles UEFI and the 'Secure boot' options, I recorded the output of fdisk -v in this file, which I hope clears up any ambiguity left by the original article.

Update 2017-09-27: Kaby Lake support is now available. I installed the 2017-09-27 snapshot, and I am now running the machine with no xorg.conf. I preserved updated dmesg(8) output and xdpyinfo(1) output. It is worth noting that what is in that snapshot is likely very close to what will be in OpenBSD 6.2.

Wednesday, April 19, 2017

Forcing the password gropers through a smaller hole with OpenBSD's PF queues

While preparing material for the upcoming BSDCan PF and networking tutorial, I realized that the pop3 gropers were actually not much fun to watch anymore. So I used the traffic shaping features of my OpenBSD firewall to let the miscreants inflict some pain on themselves. Watching logs became fun again.

Yes, in between a number of other things I am currently in the process of creating material for a new and hopefully better PF and networking session.

I've been fishing for suggestions for topics to include in the tutorials on relevant mailing lists, and one suggestion that keeps coming up (even though it's actually covered in the existing slides as well as The Book of PF) is using traffic shaping features to punish undesirable activity, such as the website scenario Dan Langille suggested.


What Dan had in mind here may very well end up in the new slides, but in the meantime I will show you how to punish abusers of essentially any service with the tools at hand in your OpenBSD firewall.

Regular readers will know that I'm responsible for maintaining a set of mail services including a pop3 service, and that our site sees pretty much round-the-clock attempts at logging on to that service with user names that come mainly from the local part of the spamtrap addresses that are part of the system that produces our hourly list of greytrapped IP addresses.

But do not let yourself be distracted by this bizarre collection of items that I've maintained and described in earlier columns. The actual useful parts of this article follow - take this as a walkthrough of how to mitigate a wide range of threats and annoyances.

First, analyze the behavior that you want to defend against. In our case that's fairly obvious: We have a service that's getting a volume of unwanted traffic, and looking at our logs the attempts come in fairly quickly, with a number of repeated attempts from each source address. This is similar enough both to the traditional ssh bruteforce attacks and, for that matter, to Dan's website scenario that we can reuse some of the same techniques in all of the configurations.

I've written about the rapid-fire ssh bruteforce attacks and their mitigation before (and of course it's in The Book of PF) as well as the slower kind where those techniques actually come up short. The traditional approach to ssh bruteforcers has been to simply block their traffic, and the state-tracking features of PF let you set up overload criteria that add the source addresses to the table that holds the addresses you want to block.

I have rules much like the ones in the example in place wherever I have an SSH service running, and those bruteforce tables are never totally empty.
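
If you have not seen that example, the traditional blocking setup is, in rough outline, something like the following, where the table name and the specific limits are a matter of local taste:

table <bruteforce> persist
block quick from <bruteforce>
pass in on egress proto tcp to port ssh flags S/SA keep state \
    (max-src-conn 100, max-src-conn-rate 15/5, overload <bruteforce> flush global)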

For the system that runs our pop3 service, we also have a PF ruleset in place with queues for traffic shaping. For some odd reason that ruleset is fairly close to the HFSC traffic shaper example in The Book of PF, and it contains a queue that I set up mainly as an experiment to annoy spammers (as in, the ones that are already for one reason or the other blacklisted by our spamd).

The queue is defined like this:

   queue spamd parent rootq bandwidth 1K min 0K max 1K qlimit 300

yes, that's right. A queue with a maximum throughput of 1 kilobit per second. I have been warned that this is small enough that the code may be unable to strictly enforce that limit due to the timer resolution in the HFSC code. But that didn't keep me from trying.

And now that I had another group of hosts that I wanted to just be a little evil to, why not let the password gropers and the spammers share the same small patch of bandwidth?

Now a few small additions to the ruleset are needed to let the good put the evil to work. We start with a table to hold the addresses we want to mess with. Actually, I'll add two, for reasons that will become clear later:

table <longterm> persist counters
table <popflooders> persist counters 

 
The rules that use those tables are:

block drop log (all) quick from <longterm> 


pass in quick log (all) on egress proto tcp from <popflooders> to port pop3 flags S/SA keep state \ 
(max-src-conn 1, max-src-conn-rate 1/1, overload <longterm> flush global, pflow) set queue spamd 

pass in log (all) on egress proto tcp to port pop3 flags S/SA keep state \ 
(max-src-conn 5, max-src-conn-rate 6/3, overload <popflooders> flush global, pflow) 
 
The last one lets anybody connect to the pop3 service, but any one source address can have at most five simultaneous connections open, established at a rate of no more than six over three seconds.

Any source that trips up one of these restrictions is overloaded into the popflooders table, the flush global part means any existing connections that source has are terminated, and when they get to try again, they will instead match the quick rule that assigns the new traffic to the 1 kilobit queue.

The quick rule here has even stricter limits on the number of allowed simultaneous connections, and this time any breach will lead to membership of the longterm table and the block drop treatment.

For the longterm table I already had a four week expiry in place (see man pfctl for details on how to do that), and I haven't gotten around to deciding what, if any, expiry I will set up for the popflooders.
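
In case you do not feel like reaching for the man page right away: that sort of expiry is the kind of thing you would handle with a cron(8) job running something like

$ doas pfctl -t longterm -T expire 2419200

where 2419200 is four weeks expressed in seconds.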

The results were immediately visible. Monitoring the queues using pfctl -vvsq shows the tiny queue works as expected:

 queue spamd parent rootq bandwidth 1K, max 1K qlimit 300
  [ pkts:     196136  bytes:   12157940  dropped pkts: 398350 bytes: 24692564 ]
  [ qlength: 300/300 ]
  [ measured:     2.0 packets/s, 999.13 b/s ]


and looking at the pop3 daemon's log entries, a typical encounter looks like this:

Apr 19 22:39:33 skapet spop3d[44875]: connect from 111.181.52.216
Apr 19 22:39:33 skapet spop3d[75112]: connect from 111.181.52.216
Apr 19 22:39:34 skapet spop3d[57116]: connect from 111.181.52.216
Apr 19 22:39:34 skapet spop3d[65982]: connect from 111.181.52.216
Apr 19 22:39:34 skapet spop3d[58964]: connect from 111.181.52.216
Apr 19 22:40:34 skapet spop3d[12410]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[63573]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[76113]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[23524]: autologout time elapsed - 111.181.52.216
Apr 19 22:40:34 skapet spop3d[16916]: autologout time elapsed - 111.181.52.216


here the miscreant comes in way too fast and only manages to get five connections going before they're shunted to the tiny queue to fight it out with known spammers for a share of bandwidth.

I've been running with this particular setup since Monday evening around 20:00 CEST, and by late Wednesday evening the number of entries in the popflooders table had reached approximately 300.

I will decide on an expiry policy at some point, I promise. In fact, I welcome your input on what the expiry period should be.

One important takeaway from this, and possibly the most important point of this article, is that it does not take a lot of imagination to retool this setup to watch for and protect against undesirable activity directed at essentially any network service.

You pick the service and the ports it uses, then figure out which parameters determine what counts as acceptable behavior. Once you have those parameters defined, you can choose to assign offenders to a minimal queue like in this example, block them outright, redirect them to something unpleasant or even pass their traffic with a low probability.
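
To illustrate, a purely hypothetical adaptation for a web service could look something like this, with the table name, ports and limits picked for illustration only:

table <webflooders> persist counters
pass in quick log on egress proto tcp from <webflooders> to port { www https } \
    flags S/SA keep state set queue spamd
pass in log on egress proto tcp to port { www https } flags S/SA keep state \
    (max-src-conn 100, max-src-conn-rate 100/10, overload <webflooders> flush global)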

All of those possibilities are part of the normal pf.conf toolset on your OpenBSD system. If you want, you can supplement these mechanisms with a bit of log file parsing that produces output suitable for feeding to pfctl to add to the table of miscreants. The only limits are, as always, the limits of your imagination (and possibly your programming abilities). If you're wondering why I like OpenBSD so much, you can find at least a partial answer in my OpenBSD and you presentation.
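
As a sketch only, assuming a log format where the offending address is the last field of lines matching a known failure pattern (both the pattern and the log file here are stand-ins you would adapt to your own environment), such a feeder could be as simple as

$ grep 'LOGIN FAILED' /var/log/maillog | awk '{ print $NF }' | sort -u | \
    xargs doas pfctl -t popflooders -T add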

FreeBSD users will be pleased to know that something similar is possible on their systems too, only substituting the legacy ALTQ traffic shaping with its somewhat arcane syntax for the modern queues rules in this article.

Will you be attending our PF and networking session in Ottawa, or will you want to attend one elsewhere later? Please let us know at the email address in the tutorial description.



Update 2017-04-23: A truly unexpiring table, and downloadable datasets made available

Soon after publishing this article I realized that what I had written could easily be taken as a promise to keep a collection of POP3 gropers' IP addresses around indefinitely, in a table where the entries never expire.

Table entries do not expire unless you use a pfctl(8) command like the ones mentioned in the book and other resources I referenced earlier in the article, but on the other hand table entries will not survive a reboot either unless you arrange to have table contents stored to somewhere more permanent and restored from there. Fortunately our favorite toolset has a feature that implements at least the restoring part.

Changing the table definition quoted earlier to read

 table <popflooders> persist counters file "/var/tmp/popflooders"

takes care of the restoring part, and the backing up is a matter of setting up a cron(8) job to dump the current contents of the table to the file that will be loaded into the table at ruleset load.
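
A crontab(5) entry along these lines, where the schedule is an arbitrary choice, would take care of the dumping, with the path matching the file directive above:

# dump the popflooders table to its backing file once an hour
5 * * * * /sbin/pfctl -t popflooders -T show > /var/tmp/popflooders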

Then today I made another tiny change and made the data available for download. The popflooders table is dumped at five past every full hour to pop3gropers.txt, a file designed to be read by anything that takes a list of IP addresses and ignores lines starting with the # comment character. I am sure you can think of suitable applications.

In addition, the same script does a verbose dump, including table statistics for each entry, to pop3gropers_full.txt for readers who are interested in such things as when an entry was created and how much traffic those hosts produced, keeping in mind that those hosts are not actually blocked here, only subjected to a tiny slice of bandwidth.

As it says in the comment at the top of both files, you may use the data as you please for your own purposes; for any re-publishing or integration into other data sets, please contact me via the means listed in the bsdly.net whois record.

As usual I will answer any reasonable requests for further data such as log files, but do not expect prompt service and keep in mind that I am usually in the Central European time zone (CEST at the moment).

I suppose we should see this as a tiny, incremental evolution of the "Cybercrime Robot Torture As A Service" (CRTAAS) concept.

Update 2017-04-29: While the world was not looking, I supplemented the IP address dumps with a version that has geoiplocation data added, as well as a per country summary based on that geoiplocation data.

Spending a few minutes with an IP address dump like the one described here and whois data is a useful exercise for anyone investigating incidents of this type. This .csv file is based on the 2017-04-29T1105 dump (preserved for reference), and reveals that not only do the majority of attempts come from one country, but also that a very limited number of organizations within that country are responsible for the most active networks.

The spammer blacklist (see this post for background) was of course ripe for the same treatment, so now in addition to the familiar blacklist, that too comes with a geoiplocation annotated version and a per country summary.

Note that all of those files except the .csv file with whois data are products of automatic processes. Please contact me (the email address in the files works) if you have any questions or concerns.

Update 2017-05-17: After running with the autofilling tables for approximately a month, and, I must confess, extracting bad login attempts that didn't actually trigger the overload at semi-random but roughly daily intervals, I thought I'd check a few things about the catch. I already knew roughly how many hosts there were in total, but how many were contacting us via IPv6? Let's see:

[Wed May 17 19:38:02] peter@skapet:~$ doas pfctl -t popflooders -T show | wc -l
    5239
[Wed May 17 19:38:42] peter@skapet:~$ doas pfctl -t popflooders -T show | grep -c \:
77

Meaning that of a total of 5239 miscreants trapped, only 77, or just short of 1.5 per cent, tried contacting us via IPv6. The cybercriminals, or at least the literal bottom feeders like the pop3 password gropers, are still behind the times in a number of ways.

Update 2017-06-13: BSDCan 2017 is past, and the PF and networking tutorial with OpenBSD session had 19 people signed up for it. We made the slides available on the net here during the presentation and announced them on Twitter and elsewhere just after the session concluded. The revised tutorial was fairly well received, and it is likely that we will be offering roughly equivalent but not identical sessions at future BSD events or other occasions as demand dictates.

Update 2017-07-05: Updated the overload criteria for the longterm table to what I've had running for a while: max-src-conn 1, max-src-conn-rate 1/1.

Update 2017-10-07: I've decided to publish a bit more of the SSH bruteforcer data. The origin is from two gateways in my care, both with these entries in their pf.conf:

table <bruteforce> persist counters file "/var/tmp/bruteforce"
block drop log (all) quick from <bruteforce>

supplemented with cron jobs that dump the current data to the file so the data survives a reboot. After years of advocating 24-hour expiry on blacklists, I recently changed my mind, so both hosts now run with 28-day expiry. For further severity on my part, the hosts also exchange updates to their bruteforce tables via cron jobs that dump table contents to file, fetch the partner's data and load it into their own local table. In addition, a manual process extracts (at quasi-random but approximately daily intervals) the addresses of failures that do not reach the limits and adds those to the tables as well.
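
A minimal sketch of such an exchange, where the partner host name and the file paths are placeholders, might look like this when run from cron on each gateway:

#!/bin/sh
# dump the local table so it survives reboots and can be fetched by the partner
pfctl -t bruteforce -T show > /var/tmp/bruteforce
# fetch the partner gateway's latest dump
scp -q gateway-partner.example.com:/var/tmp/bruteforce /var/tmp/bruteforce.peer
# merge the partner's entries into the local table
pfctl -t bruteforce -T add -f /var/tmp/bruteforce.peer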

The data comes in three varieties: a raw address list (with a #-prepended comment at the start), suitable for importing into such things as a PF table you block traffic from; the address list with the country code for each entry appended; and finally a summary of list entries per country code. All varieties are generated twice per hour.

Update 2018-05-10: It appears that my spamtraps have entered a canon of sorts. On our freshly configured imapd, this happened. A few dozen login attempts to spamtrap IDs earned them, of course, only a place in the permanent blocks along with the popflooders.