That grumpy BSD guy: FreeBSD

Sunday, August 10, 2025

Eighteen Years of Greytrapping - Is the Weirdness Finally Paying Off?

With the imaginary friends, also known as spamtraps, now more numerous than the inhabitants of their virtual landlord's home country, a greytrapping retrospective is in order.

Friends, it finally happened. On August 7th, 2025, the number of spamtraps intended to woo the unwary spammer rolled past the number of inhabitants in my home country of Norway, as tallied by the official statistics compiled by Statistisk Sentralbyrå, also known as Statistics Norway.

After the morning run that day, the number of spamtraps (imaginary friends) stood at 5620384, inching past the country's total population of 5601049. And yes, the first number is likely to have increased when you read this. Under normal circumstances, the second will likely move a bit in the near future too. To mark the occasion, I present to you the retrospective that some correspondents have been asking for in response to some recent mail related articles of mine.

The Experiment Started in 2007

Greytrapping at nxdomain.no, also known as bsdly.net and a few other domain names, has been a long running experiment. I had been running a mail service for my own and my colleagues' benefit for some years already when I converted that setup stepwise from a Debian Linux setup to one involving OpenBSD hosts as the outer line of defense and a mix of FreeBSD, OpenBSD and other hosts in an environment not unlike the rather basic configurations described early on in the PF tutorial and later The Book of PF.

Soon after converting the outer defense at that site to an OpenBSD one running a basic PF ruleset, I introduced the then blocklist-importing and greylisting only spamd, and experienced (as described elsewhere) that the fan noise coming from the mail server, obviously burdened by performing content filtering, just stopped immediately, only occasionally to rise to just a quiet murmur for the rest of that server's service life.

Note: This piece is also available without trackers but classic formatting only here.

I did not retain records of when I did that conversion, but my original PF presentation slides from January 2005 describe a spamd setup with greylisting as well as imported lists from spews and spamhaus, which is a strong indication that I had had that running for a while at that point.

Greytrapping was only introduced a little later, but when the feature became available I was ready and eager to put it into production as soon as at all possible. I went on to initiate the greytrapping experiment some time in 2007 and announced it to the world in the article Hey, spammer! Here's a list for you! (also here) on July 9, 2007.

Unfortunately, or some would say fortunately, we have not been able to preserve all logs and records, but enough survives that we can sense the general thread and trends until we can get into the details of what we do have available from the last handful of years.

In Retrospect, What Changed Over the Years?

Looking back to the mid-noughties, the most significant change I see is that back then, people did this sort of thing.

Even for small organizations like the company I was attached to then, it was entirely normal to set up their own, in-house mail service as soon as they had some sort of Internet connectivity available.

In the years since then, the Internet in general, and SMTP email in particular, has been centralized to a degree we would not have considered even imaginable back in the mid-noughties.

We call it The Cloud, but as we all know it's really about running your stuff on other people's computers, and in the email case, the centralization is even more extreme.

In some of the field notes and articles linked at the end of this piece you will find mention of the major players in the hosted or cloud email field and the fallout from their policies. Those policies and the companies' actions hint strongly that they really think that unless you are them, you have no business running a mail service.

So if it is not clear already, this is a piece that is written for people who either run their own mail service or are considering setting up one, as well as people in their immediate surroundings.

If your perspective on email is "how can I do $THING in Outlook?" or similar, this is really not for you, but you are of course welcome to read on for entertainment and/or enlightenment value, if such is to be found.

If you are considering setting up your own mail service, my main recommendation to you, after you have skimmed this piece and a selection of the linked resources, is to get Michael W. Lucas' 2024 book Run Your Own Mail Server, read it from cover to cover, and do what the man says. That really is the best book on the subject currently available, and it is recent enough to not yet be outdated.

What I saw as the main attraction of the greylisting and greytrapping combo back in the day and even still do, was and is that a set of actually quite simple network-level tricks and a tending-towards-pedantic interpretation of the SMTP protocol specification could have such a dramatic effect on the amount of work involved in running a sane mail service.

With a greytrapping spamd and a mail service that would utilize the content filtering setup du jour, my colleagues in the various organizations where we had these setups in place never saw the need to even consider listening to sales pitches for other offerings.

The early field notes and articles very much reflect that situation. We were quite enthusiastic about what we had running. What we had was cheap and reliable, and when there was a need to debug something, we would either point to the other party's configuration fumble or do such things as slowly come to the realization that not all senders play well with greylisting (also here).

I Hear You Say It's Good, But You're Weird Anyway

Over the years of advocating OpenBSD or FreeBSD as systems to use in general, and implementing a greylisting and greytrapping spamd specifically, the attitude I would need to try turning around would more often than not be along the lines of I hear you say it's all good, but you're weird anyway.

In retrospect some of that may have come from me generally using various versions of the somewhat lengthy Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools (also here), sometimes supplemented with In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe (also here) more or less as promotional material. Both texts have to my mind stood up well over the years and are potentially useful for the right audience, but may not have been quite appropriate in a sales context.

There would be some update here and there, and questions I got during tutorial sessions and via various online channels indicate that people were setting up similar setups to what I have described there, and the various exported blocklists (see eg Badness, Enumerated by Robots (also here)) are quite popular downloads both at the primary and the mirror site.

Over the years there would be some odd episodes, sometimes involving the big players, with a piece such as Does Your Email Provider Know What A "Joejob" Is? (also here) a prime example of behavior I personally do not appreciate experiencing from anyone. On the other hand, in A Life Lesson in Mishandling SMTP Sender Verification (also here) we see an example of a different big player actually contributing well to resolving a puzzling situation.

In addition to the big players, we have of course at times also run into less pleasant encounters with not-exactly-captains-of-industry. An early example was that in 2008, the notion that a challenge-response setup could be an effective antispam mechanism was apparently cultivated by some. In the field note I challenge your response, backscatterer (tracked only, sorry) we see how that went.

If you skim the field notes and articles linked at the end of this piece, you will find that there is, in fact, no end of weirdness in the email business. But one case involving what we must assume is pretty much a bit player had me write up Twenty-plus years on, SMTP callbacks are still pointless and need to die (also here). The TL;DR of that one is that what could have seemed like a bright idea way back when turned out not to be, but in some corners of the internet there are still true believers who can simply not be persuaded to change course even a little.

After a while, I found that though odd episodes did occur, it became harder to make the writeups interesting and fun to read. A case in point is the year 2019, where at the very end of the year I finally forced myself to write my only article of that year, The Year 2019 in Review: This Was, Once Again, Weirder Than the Last One (also here). That year had had its share of oddities, including a totally bizarre amount of backscatter from what must have been one or more phishing campaigns aimed at Chinese users. I found that episode hilarious myself, and while it prompted me to automate the spamtrap harvesting a bit, I tried and failed over and over to write what I thought would be a readable and enjoyable article about it.

Actually Running the Thing, and Finding Imaginary Friends

The day to day operation of the greytrapping is quite unremarkable, really. The script that dumps the trapped IP addresses at ten past every hour also presents me with a list of candidate spamtraps -- addresses in our domains currently in the greylist that do not match any existing valid address or spamtrap, and I add those when I have the time at quasi-random points during the day.
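For readers who want to picture the candidate selection step, here is a minimal sketch in Python. It is not the script actually in use at this site; the function name, the input lists and the example.org addresses are all assumptions made for illustration. The only logic taken from the description above is the filter itself: keep greylisted recipients in our own domains that are neither valid addresses nor existing spamtraps.

```python
def candidate_spamtraps(greylisted_rcpts, valid_addresses, existing_traps, our_domains):
    """Return greylisted recipients in our domains that are neither valid nor already traps.

    All four arguments are plain iterables of strings (hypothetical inputs;
    how they are extracted from the greylist is outside this sketch).
    """
    known = {a.lower() for a in valid_addresses} | {a.lower() for a in existing_traps}
    domains = {d.lower() for d in our_domains}
    candidates = set()
    for rcpt in greylisted_rcpts:
        # Normalize "<User@Example.org>" to "user@example.org".
        addr = rcpt.strip().strip("<>").lower()
        local, sep, domain = addr.rpartition("@")
        if not sep or not local:
            continue  # not an address at all, skip it
        if domain in domains and addr not in known:
            candidates.add(addr)
    return sorted(candidates)


if __name__ == "__main__":
    print(candidate_spamtraps(
        ["<Foo@example.org>", "real@example.org", "x@elsewhere.example"],
        valid_addresses=["real@example.org"],
        existing_traps=["old@example.org"],
        our_domains=["example.org"],
    ))
```

The result is only a list of candidates. As described above, a human still decides which of them actually go into the spamtrap list.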

The dump of trapped IP addresses is totally automated, and expiry is 24 hours. In 2013 I wrote a piece called Maintaining A Publicly Available Blacklist - Mechanisms And Principles (also here) that lays out the process in hopefully understandable terms. There is of course also the short version available on the website.

Over time we went from simply collecting from the greylist to also fishing out local parts from the logs of failed logon attempts to services such as ssh and (the obsolete, horrible) pop3.

A little while later it occurred to me that it would perhaps be useful to make a record of when each spamtrap entry was added. History starts 2017-05-20, whatever spamtraps can not be found in this data set are assumed to have been added before that date, and reconstructing earlier history of the data would take more time and effort than I have any motivation to expend on the task.
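As an illustration of how a date-added record can be turned into per-month summaries by source, here is a small Python sketch. The record layout used here (one line per trap: date, source tag, address) is an assumption made for the example and not necessarily the layout of the actual data set; the function name is likewise hypothetical.

```python
from collections import Counter, defaultdict


def monthly_tally(lines):
    """Count new traps per month and per source tag.

    Assumes hypothetical records of the form "YYYY-MM-DD SOURCE address";
    lines that do not have exactly three fields are ignored.
    """
    tally = defaultdict(Counter)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue
        date, source, _addr = parts
        tally[date[:7]][source.upper()] += 1  # key on "YYYY-MM"
    # Add a per-month total next to the per-source counts.
    return {month: dict(counts, Total=sum(counts.values()))
            for month, counts in sorted(tally.items())}


if __name__ == "__main__":
    sample = [
        "2017-05-20 SSH a@example.org",
        "2017-05-21 SMTP b@example.org",
        "2017-06-01 SSH c@example.org",
    ]
    for month, counts in monthly_tally(sample).items():
        print(month, counts)
```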

The first partial year's data are, summarized:

New traps per month, 2017
Month	Total	SMTP	SSH	POP3	Other
May	159	49	110	0	0
June	275	48	213	14	0
July	811	144	667	0	0
August	486	447	38	1	0
September	-	-	-	-	-
October	886	513	367	6	0
November	825	57	768	0	0
December	299	91	208	0	0

From that year, the first article A New Year, a New Round of pop3 Gropers from China (also here) (January 9, 2017) was written before the date added data started, while the episode described in Twenty-plus years on, SMTP callbacks are still pointless and need to die (also here) (August 27, 2017) more likely than not produced more spamtraps around the time the article was written.

For 2018, we have the first in the series of a full year's data on traps added:

New traps per month, 2018
Month	Total	SMTP	SSH	POP3	Other
January	304	172	132	0	0
February	228	72	148	2	0
March	160	73	87	0	0
April	102	84	18	0	0
May	12206	811	11370¹⁾	22	3²⁾
June	146	26	59	61	0
July	358	248	26	84	0
August	359	125	69	165	0
September	-	-	-	-	-
October	671	241	413	17	0
November	311	297	12	0	2³⁾
December	1038	116	922	0	0

¹⁾ From the Hail Mary Cloud data set
²⁾ IMAP
³⁾ JOKE (see the data)

From June 2018 onwards, we have hourly data on the number of hosts trapped in our spamd-greytrap, in a form that is relatively easy to graph:

The data that went into producing the graph is available as 2018-traplistcounts.txt.

The articles from 2018 include A Life Lesson in Mishandling SMTP Sender Verification (also here) (February 17, 2018) with that life lesson, while the next two show that I felt a need to explain exactly what that blocklist producing thing was about, first with Badness, Enumerated by Robots (also here) (August 13, 2018) and the followup Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting (also here) (November 4, 2018) which really only goes to show that I was starting to contemplate converting my setup to use OpenBSD's own OpenSMTPD -- part of the base system -- rather than trusty old exim.

The 2019 spamtraps added data shows, again, just how weird that year was -- see The Year 2019 in Review: This Was, Once Again, Weirder Than the Last One (also here) (December 28, 2019):

New traps per month, 2019
Month	Total	SMTP	SSH	POP3	Other
January	1829	192	1636	0	1⁴⁾
February	19644	18782	860	0	2⁵⁾
March	58005	57186	819	0	0
April	53856	52563	1290	3	0
May	2315	350	1964	1	0
June	3164	312	2852	0	0
July	1058	434	618	6	0
August	1229	331	898	0	0
September	-	-	-	-	-
October	11016	630	10385	1	0
November	11119	222	10897	0	0
December	19304	208	19096	0	0

⁴⁾ ARTICLE (see the data)
⁵⁾ JOKE (see the data)

The year 2019 is the oldest preserved data set of number of hosts in our spamd-greytrap that covers an entire year, which in turn gives us this diagram of the year:

The data that went into producing the graph is available as 2019-traplistcounts.txt.

The lockdown year 2020 again did not see much article activity, but after seeing the N!th wankstortion campaign aimed at a large subset of our imaginary friends, I wrote a rant-ish article about it: The 'sextortion' Scams: The Numbers Show That What We Have Is A Failure Of Education (also here) (February 28, 2020).

New traps per month, 2020
Month	Total	SMTP	SSH	POP3	Other
January	5085	171	4914	0	0
February	8941	150	8786	5	0
March	1363	258	1103	2	0
April	596	139	456	1	0
May	1406	108	1298	0	0
June	649	133	516	0	0
July	2405	98	2306	1	0
August	134	123	11	0	0
September	-	-	-	-	-
October	591	185	403	3	0
November	2843	1318	1525	0	0
December	1571	169	1402	0	0

Again for 2020 we have complete data on the number of hosts in our spamd-greytrap, which in turn gives us this diagram of the year:

The data that went into producing the graph is available as 2020-traplistcounts.txt.

In 2021, still mostly a lockdown year, RFC7505 Means Yes, Your Domain Can Refuse to Handle Mail. Please Leave Us a TXT If You Do. (also here) (February 22, 2021) indicates a small but potentially significant change in mail server configuration. It has been a while since I last saw anything heading for that .se domain.

New traps per month, 2021
Month	Total	SMTP	SSH	POP3	Other
January	179	129	49	1	0
February	172	97	75	0	0
March	112	95	17	0	0
April	150	88	62	0	0
May	1360	90	1270	0	0
June	307	41	266	0	0
July	68	58	8	2	0
August	144	61	82	1	0
September	-	-	-	-	-
October	1035	160	875	0	0
November	166	94	72	0	0
December	304	192	112	0	0

The 2021 data of hosts in our spamd-greytrap produces this graph for the year:

The data that went into producing the graph is available as 2021-traplistcounts.txt.

By 2022, we were back out of lockdowns and I produced several relevant articles -- Spammers in the Public Cloud, Protected by SPF; Intensified Password Groping Still Ongoing; Spamware Hawked to Spamtraps (also here) (April 3, 2022) showed that our imaginary friends or at least a significant subset are indeed in common spamto: lists out there.

The Things Spammers Believe - A Tale of 300,000 Imaginary Friends (also here) (September 7, 2022) -- in which I had somehow not gotten around to celebrating the day when the number of spamtraps went past the number of inhabitants of my home town of Bergen, Norway and decided that a nice round number would serve just as well.

Harvesting the Noise While it's Fresh, Revisited (also here) (December 9, 2022) -- I realized that spammers with freshly generated spamto addresses may try more variants after the first one that gets them trapped, so I turned to some further digging into logs for new data. The numbers swelled slightly as a result.

Can Your Spam-eater Manage to Catch Seventy-one Percent Like This Other Service? (also here) (December 23, 2022) -- yet another piece to explain what greylisting and greytrapping is good for and why it is good for you.

The Despicable, No Good, Blackmail Campaign Targeting ... Imaginary Friends? (also here) (December 25, 2022) -- the first "they're sending wankstortion mail to my imaginary friends" article had not gotten much attention so I tried again.

New traps per month, 2022
Month	Total	SMTP	SSH	POP3	Other
January	143	129	14	0	0
February	333	79	253	0	1⁶⁾
March	915	179	736	0	0
April	20451	91	20360	0	0
May	254	139	114	1	0
June	3898	54	3844	0	0
July	700	86	611	3	0
August	979	514	461	4	0
September	-	-	-	-	-
October	2111	597	1514	0	0
November	470	73	396	1	0
December	2030	1714	303	13	0

⁶⁾ fatfinger (see the data)

The 2022 data of hosts in our spamd-greytrap produces this graph for the year:

The data that went into producing the graph is available as 2022-traplistcounts.txt.

In 2023, we kept adding spamtraps as they came in and generating data, but I wrote no mail-themed articles at all.

New traps per month, 2023
Month	Total	SMTP	SSH	POP3	Other
January	642	175	465	2	0
February	429	301	128	0	0
March	8838	5296	3542	0	0
April	1557	1243	314	0	0
May	104	39	65	0	0
June	2273	2234	38	1	0
July	182	76	106	0	0
August	2436	2285	151	0	0
September	-	-	-	-	-
October	4008	3752	256	0	0
November	1912	96	1813	0	3⁷⁾
December	1165	52	1113	0	0

⁷⁾ HTTP (see the data)

The 2023 data of hosts in our spamd-greytrap produces this graph for the year:

The data that went into producing the graph is available as 2023-traplistcounts.txt.

The year 2024 saw little innovation and no new episodes I found a reason to write about. However, that year saw the launch of Michael Lucas' much anticipated Run Your Own Mail Server, and events somewhat related to that had me write A Simpler Life: Trapping Spambots Based on Target Domain Only (also here) (January 24, 2024) and its followup Three Minimalist spamd Configurations for Your Spam Fighting Needs (With Bonus Points at the End) (also here) (January 25, 2024).

If you have been reading carefully up to this point, you may have noticed what I only noticed myself when I started massaging my spamtraps added data into tables: That during the logged years 2017 through 2023, no new spamtraps were added during the month of September.

As time went by I had noticed that there were periods of up to several weeks when no new spamtrap candidates appeared, but it did not occur to me that every year up to that point, that period had actually been the entire month of September. It is possible or even likely that the change to a more aggressive method of searching for candidates in the logs is what filled up September from 2024 on.

During late November of 2024, I decided that the time had come to ditch the quasi-empiricism of passively collecting the actual to: addresses and start making an effort to fill spammers' spamto: lists with as much junk as possible. So I started extracting local parts from the from: and hostname or host ID fields in my verbose spamd logs, splicing together a larger than ever number of fake @bsdly.net addresses for the spamtraps list. I also started digging back into archived spamd logs and extracting data from there. For obvious reasons, this means that from that point on, the overwhelming majority of the items tagged SMTP in the date added logs are of the synthetic kind. More of that later.
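To make the splicing idea concrete, here is a hedged Python sketch. It is not the tooling actually used here: the function name, the decision to use only the leftmost label of a host ID, the local part pattern and the example.org domain are all assumptions for the sake of the example. The inputs are assumed to have been pulled out of the logs already.

```python
import re

# Conservative local-part pattern (an assumption, not spamdb's actual rule).
LOCAL_PART_OK = re.compile(r"^[a-z0-9][a-z0-9._+-]{0,63}$")


def synthetic_traps(from_addresses, host_ids, trap_domain, exclude=()):
    """Splice local parts seen in the logs onto our own domain.

    from_addresses: envelope senders seen in the logs
    host_ids:       hostnames or host IDs presented by the senders
    exclude:        addresses that must never become traps (valid users etc.)
    """
    excluded = {e.lower() for e in exclude}
    local_parts = set()
    for sender in from_addresses:
        local, sep, _domain = sender.strip().strip("<>").rpartition("@")
        if sep:
            local_parts.add(local.lower())
    for host in host_ids:
        # Assumption: use the leftmost label of the host ID as a local part.
        local_parts.add(host.strip().split(".")[0].lower())
    traps = set()
    for local in local_parts:
        addr = f"{local}@{trap_domain}"
        if LOCAL_PART_OK.match(local) and addr not in excluded:
            traps.add(addr)
    return sorted(traps)


if __name__ == "__main__":
    print(synthetic_traps(["<alice@spam.example>"], ["mail7.bulk.example"],
                          "example.org", exclude=["alice@example.org"]))
```

The exclude list matters: any synthetic address that collides with a real, deliverable one has to be weeded out before anything is fed to the spamtraps list.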

New traps per month, 2024
Month	Total	SMTP	SSH	POP3	Other
January	3122	92	3028	2	0
February	6442	202	6238	2	0
March	2150	198	1951	1	0
April	10028	5010	5018	0	0
May	633	413	219	1	0
June	680	72	608	0	0
July	177	151	25	1	0
August	561	433	125	3	0
September	3770	3675	95	0	0
October	10517	8631	1884	2	0
November	22899	18083	4815	1	0
December	167037	166605	428	4	0

The 2024 data of hosts in our spamd-greytrap produces this graph for the year:

The data that went into producing the graph is available as 2024-traplistcounts.txt.

We continued adding synthetic spamtraps from the from and host fields in both new and archived spamd logs into the new year 2025. This and a few related items are described in A Suitably Bizarre Start of the Year 2025 (also here) (January 1, 2025). In June I found I needed to clarify some things about the exported IP address lists, specifically that one should be considered a historical artifact only, and wrote Should I Stop Caring and Let IP Address Reputation Sort Them Out? (also here) (June 8, 2025).

Seeing that the number of spamtraps now had run into the millions, I decided to speed up the process of filling spamto lists with garbage a bit more, by generating a few thousand extra items from short snippets of /dev/random output, base64 encoded and stripped of certain characters that would possibly lead to spamdb not accepting the result as valid. An example one-liner would be (vary to taste)

The contents of the file rawbar would then be subject to the same checks (eliminating actually valid local addresses as well as the more commonly used of the RFC2142 mandated set) as any other before being fed to spamdb to swell the imaginary friends population. I was sometimes surprised how many of the items output looked like they could conceivably have been part of something at least vaguely resembling human speech. Anyway, on to the data:
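For those who prefer to see the generation step spelled out, the following Python sketch does roughly what is described above: take a few random bytes, base64 encode them, and strip characters that might cause trouble. It is an illustration only and not the shell one-liner referred to in the text. The function name, the choice of six random bytes per item, the lowercasing and the exact set of characters stripped (everything except letters and digits) are assumptions.

```python
import base64
import os
import re


def random_local_parts(count, raw_bytes=6):
    """Generate `count` distinct junk local parts from random bytes.

    Each item is base64-encoded random data with every character that is
    not a letter or digit removed (so '+', '/' and '=' never survive),
    then lowercased. Both choices are assumptions made for this sketch.
    """
    parts = set()
    while len(parts) < count:
        encoded = base64.b64encode(os.urandom(raw_bytes)).decode("ascii")
        cleaned = re.sub(r"[^A-Za-z0-9]", "", encoded).lower()
        if cleaned:
            parts.add(cleaned)
    return sorted(parts)


if __name__ == "__main__":
    # One candidate per line, suitable for redirecting to a file for later checks.
    for local in random_local_parts(10):
        print(local)
```

The output is raw material only. It still needs the validity checks mentioned above before any of it is turned into spamtrap addresses.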

New traps per month, 2025
Month	Total	SMTP	SSH	POP3	Other
January	1400109	1399950	139	23	0
February	1261530	1260708	823	0	0
March	1142404	1141980	423	2	0
April	333442	333332	110	0	0
May	220072	218045	2027	0	0
June	180348	180271	75	2	0
July	242346	240771	1573	2	0
August	245893	245891	2	0	0

The 2025 data up to the publication date of hosts in our spamd-greytrap produces this graph:

The data that went into producing the graph is available as 2025-traplistcounts.txt.

Where to Next, What Is Missing or Needed?

What happens next is not necessarily much different from what we have seen during all of those long years. Looking at the graphed data of number of trapped hosts, it is quite clear that the number of trapped hosts or IP addresses is on a declining trend, but with bursts or spikes when one or more campaigns are active and aimed at our domains. That general trend is possibly a consequence of the trend towards centralization of Internet services in general.

While I have not done any thorough analysis of the data, it appears that there is not a similar decline in delivery attempts, and some quasi-random sampling seems to indicate that traffic from a single trapped IP address presents with a number of different hostnames or host IDs. This could be an indication that the senders sit in a cloud somewhere, or possibly are old-style compromised personal systems tucked away behind NAT.
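The kind of quasi-random sampling mentioned here could be made a bit more systematic with something like the Python sketch below, which counts how many distinct hostnames or host IDs each trapped IP address has presented. The input is assumed to be a sequence of (IP address, host ID) pairs already extracted from the logs; the function name and the sample data (documentation-range addresses) are hypothetical.

```python
from collections import defaultdict


def host_ids_per_ip(events):
    """Map each IP address to the number of distinct host IDs it has used.

    events: iterable of (ip, host_id) pairs, assumed to be pre-extracted
    from the logs. Host IDs are compared case-insensitively.
    """
    seen = defaultdict(set)
    for ip, host_id in events:
        seen[ip].add(host_id.lower())
    return {ip: len(names) for ip, names in seen.items()}


if __name__ == "__main__":
    sample = [
        ("192.0.2.10", "mail1.example.net"),
        ("192.0.2.10", "MAIL1.example.net"),
        ("192.0.2.10", "smtp-out.example.com"),
        ("198.51.100.7", "localhost"),
    ]
    print(host_ids_per_ip(sample))
```

An address that shows up with many different host IDs would fit either of the explanations suggested above; the count alone does not tell the two apart.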

That said, in my experience greylisting and greytrapping are useful techniques that work well within their limitations.

The limitation that irks me the most is that spamd is IPv4-only. While the migration to IPv6 has been slow, it is happening, and the portion of mail that is delivered over the modern protocol is increasing year by year. Around 2015 there was some work in the OpenBSD project on possibly extending spamd and supporting tools to support IPv6, but if I remember correctly the project was abandoned, at least partly because achieving both parts of "rough consensus and working code" was not possible. Reaching consensus on how greylisting should work in the IPv6 world proved hard, to the point of turning out to be impossible.

I would personally hope that we can make progress towards IPv6 support at some point in the future, but until that happens, we can rest assured that a large part of the spammers have stayed on IPv4, and our tools work well to stop them in their tracks on the legacy protocol.

When I started working on this article, I had only a vague idea of how much I had actually written on the subject. I was a bit surprised at the number of pieces that had accumulated. I have included the list of links in the next, final section.

If you found this article useful, irritating, provoking, thought provoking, or simply would like to comment or contact me personally on the subject, please do.

Previous spamd(8) Themed Articles and Field Notes

Hey, spammer! Here's a list for you! (also here) (July 9, 2007)

Spam is a solved problem (also here) (July 13, 2007)

The noise, we ignore it (tracked) (July 22, 2007)

Harvesting the noise while it's still fresh; SPF found potentially useful (also here) (July 25, 2007)

On the business end of a blacklist. Oh the hilarity. (tracked) (August 1, 2007)

We see your every move, spammer (tracked) (August 4, 2007)

A Lady in Distress; or Then Again, Maybe Not (tracked) (August 19, 2007)

Wanna help science? Study your greylists innards! (tracked) (September 8, 2007)

Always a pleasure to be wasting your time, guv (tracked) (September 29, 2007)

Of Course, It Had To Be A Webshield (tracked) (October 28, 2007)

I Must Be Living in a Parallel Universe, Then (also here) (November 25, 2007)

Fake Address Round Trip Time: 13 days (tracked) (May 21, 2008)

I challenge your response, backscatterer (tracked) (May 25, 2008)

Yes, we can! Make a difference, that is (tracked) (June 25, 2008)

Now that we have their addresses, do we name and shame? (tracked) (August 7, 2008)

Is one of your machines secretly a spambot? (tracked) (August 9, 2008)

“Name and Shame”, or socially responsible use of your log data (tracked) (September 22, 2008)

IETF failed to account for greylisting (also here) (October 20, 2008)

Oh yes, you signed up for this. You did. Honest. (also here) (March 21, 2009)

The Problem Isn't Email, It's Microsoft Exchange (also here) (February 27, 2011)

My First IPv6 Spam (also here) (June 8, 2011)

In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe (also here) (May 28, 2012)

Maintaining A Publicly Available Blacklist - Mechanisms And Principles (also here) (April 14, 2013)

Keep smiling, waste spammers' time (also here) (May 4, 2013)

The Hail Mary Cloud And The Lessons Learned (also here) (October 5, 2013)

Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools (also here) (February 2, 2014)

Password Gropers Take the Spamtrap Bait (also here) (August 12, 2014)

Does Your Email Provider Know What A "Joejob" Is? (also here) (April 23, 2016)

The Voicemail Scammers Never Got Past Our OpenBSD Greylisting (also here) (August 29, 2016)

Is SPF Simply Too Hard For Application Developers? (also here) (October 20, 2016)

So somebody is throwing HTML at your sshd. What to do? (also here) (December 22, 2016)

A New Year, a New Round of pop3 Gropers from China (also here) (January 9, 2017)

Twenty-plus years on, SMTP callbacks are still pointless and need to die (also here) (August 27, 2017)

A Life Lesson in Mishandling SMTP Sender Verification (also here) (February 17, 2018)

Badness, Enumerated by Robots (also here) (August 13, 2018)

Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting (also here) (November 4, 2018)

The Year 2019 in Review: This Was, Once Again, Weirder Than the Last One (also here) (December 28, 2019)

The 'sextortion' Scams: The Numbers Show That What We Have Is A Failure Of Education (also here) (February 28, 2020)

RFC7505 Means Yes, Your Domain Can Refuse to Handle Mail. Please Leave Us a TXT If You Do. (also here) (February 22, 2021)

Spammers in the Public Cloud, Protected by SPF; Intensified Password Groping Still Ongoing; Spamware Hawked to Spamtraps (also here) (April 3, 2022)

The Things Spammers Believe - A Tale of 300,000 Imaginary Friends (also here) (September 7, 2022)

Harvesting the Noise While it's Fresh, Revisited (also here) (December 9, 2022)

Can Your Spam-eater Manage to Catch Seventy-one Percent Like This Other Service? (also here) (December 23, 2022)

The Despicable, No Good, Blackmail Campaign Targeting ... Imaginary Friends? (also here) (December 25, 2022)

A Simpler Life: Trapping Spambots Based on Target Domain Only (also here) (January 24, 2024)

Three Minimalist spamd Configurations for Your Spam Fighting Needs (With Bonus Points at the End) (also here) (January 25, 2024)

A Suitably Bizarre Start of the Year 2025 (also here) (January 1, 2025)

Should I Stop Caring and Let IP Address Reputation Sort Them Out? (also here) (June 8, 2025)

You might also be interested in reading selected pieces via That Grumpy BSD Guy: A Short Reading List (also here).

At EuroBSDcon 2025, there will be a Network Management with the OpenBSD Packet Filter Toolset session, a full day tutorial starting at 2025-09-25 10:30 CET. You can register for the conference and tutorial by following the links from the conference Registration and Prices page.

Separately, pre-orders of The Book of PF, 4th edition are now open. For a little background, see the blog post Yes, The Book of PF, 4th Edition Is Coming Soon (also here). We are hoping to have physical copies of the book available in time for the conference, and hopefully you will be able to find it in good book stores by then.

Friday, July 11, 2025

Yes, The Book of PF, 4th Edition Is Coming Soon

Long rumored and eagerly anticipated by some, the fourth edition of The Book of PF is now available for preorder

This week it was finally time to announce, to the fediverse and to mailing lists, that there is a new edition of The Book of PF in the works, and preordering is now enabled.

Note: This piece is also available without trackers but classic formatting only here.

A few questions immediately pop into readers' minds on hearing this news. The ones I get most often are,

Why now? What took you so long?

which quite frequently combines with

What changed? Are previous editions now useless?

I'll address both after repeating what I said in the email announcements:

The fourth edition was written to bring the text into sync with the realities of the modern Internet, seen from the perspective of someone working with OpenBSD 7.8 or FreeBSD 14-STABLE.

The structure and chapter titles will be recognizable to readers of the previous edition, with the content updated to reflect the realities of the modern Internet.

What happened was, for quite some time after the third edition was finished, there were essentially no user visible changes such as syntax changes in the configuration for OpenBSD PF.

The code was definitely being worked on, developers fixed bugs, introduced optimizations such as network stack wide improvements in multicore support. But user-visible changes other than likely performance improvements did not appear, so I saw no urgent need to make updates to the book.

During the years following the late 2014 publication of the third edition, I went on giving talks and tutorials, and at some point I welcomed input and help from my present co-presenters of the Network Management with the OpenBSD Packet Filter Toolset tutorials, Max Stucchi and Tom Smyth.

Over a few revisions, the tutorial sessions became ever more OpenBSD centered, possibly because we were all focusing more on that system than the others. And of course, over time we made tweaks to the material we present at the tutorials in response to our own real world experiences and feedback from attendees and others.

This went on for some years, with the still moderately popular conference sessions each yielding incremental changes to the material. Then, COVID-19 put all physical conference activities on hold for the years 2020 and 2021. EuroBSDCon 2022 in Vienna (originally planned as the 2020 conference), was our first post-lockdown presentation and a well attended one at that.

We went on to do more sessions at BSDCan 2023 and EuroBSDCon 2023. During the post-lockdown period, one question started popping up ever more frequently in email, social media (direct messages, even) and in connection with the sessions themselves,

Are you working on a fourth edition?

more often than not accompanied by

I'd love to get this in a FreeBSD version, can you do that too?

My answers would be roughly, "No tech book writer will ever reveal what their current project is until they have a specific publication date set", and "Sure, I will look into what I can do about that for some time later".

On my way back from EuroBSDCon 2023 in Coimbra, where these questions had of course come up again, even from the quite small group of attendees at our session, I decided that after eight years it might be worth at least looking into what, if anything beyond incrementing version numbers, would make sense to do to produce a new edition.

So I set out to assess. The book had seen some light touch updates for its second printing that were not in my pre-production .odt files, so I set out with the freshest .pdf and started making annotations. After a little while the volume of annotations had grown enough that I found it more useful to transfer those annotations to a normal text file. That file was becoming something like an outline of what a fourth edition would look like.

So testing my own 8 year old work against modern OpenBSD and FreeBSD, and poking around for PF material in general, I noticed several things.

Since the third edition was written, NetBSD, prominently featured in that edition, had developed their own NPF packet filter subsystem and deprecated their PF port. While DragonFly BSD still had PF in their tree, it looked like their version was seriously out of date (as far as I could tell equal to roughly OpenBSD 3.6, released in 2004).

So concentrating on the two free systems I was anyway in daily contact with -- OpenBSD and FreeBSD -- made sense.

My notes of things that needed to be done swelled over the next few weeks. The revision notes work became my main activity on evenings and weekends for a while, and by late November I sent off that file as an attachment to a mail message to Bill at No Starch that started with,

Dear Bill,
I think a fourth edition of The Book of PF might have a reason to exist soonish.

I went on to explain that while we had not had major announcements in the packet filtering space during the past few years, quite a few incremental and larger changes had indeed happened. A lot more had happened in user visible PF matters on the FreeBSD side than on the OpenBSD side, but incremental changes had happened there too.

And as you could reasonably expect, the world around us had changed enough that in addition to introducing some new features, existing examples and the way we present the issues to the reader needed a refresh in order to be relevant to anyone working in or starting out with modern TCP/IP networks.

It took some weeks before the "yes, we're on board for a fourth edition" message came back. It is entirely possible that making an important business pitch just before Thanksgiving weekend is not an optimal thing to do, timing-wise.

But when the go-ahead came, I asked Henning Brauer, who had been very much involved with technical editing for the previous versions, and Kristof Provost, who does the major PF things on the FreeBSD side, to be my tech reviewers. Both accepted immediately.

Over the next few months intense editing and revising followed -- yes, I do make mistakes, and Henning and Kristof proved to be very good at catching them.

Now we are very close to having the final result. The fourth edition of The Book of PF focuses on PF on modern versions of OpenBSD and FreeBSD, with only minor mention of other platforms. The ports to Apple systems and Oracle Solaris are mentioned, but I decided early on to focus on the free systems for all examples. The FreeBSD parts have received significantly more attention than in previous versions, to the extent that we jokingly referred to the fourth edition as the "be nice to FreeBSD" edition.

The editing process has taken longer than I had anticipated, but we are on track now to have copies in readers' hands some time in the second half of 2025. I hope I will be able to bring physical copies of the fourth edition to EuroBSDCon 2025 in Zagreb in September. There will be a revised and updated version of the Network Management with the OpenBSD Packet Filter Toolset session there, a full day tutorial starting at 2025-09-25 10:30 CET. You can register for the conference and tutorial by following the links from the conference Registration and Prices page.

So summing up,

The fourth edition was written to bring the text into sync with the realities of the modern Internet, seen from the perspective of someone working with OpenBSD 7.8 or FreeBSD 14-STABLE.

The structure and chapter titles will be recognizable to readers of the previous edition, with the content updated to reflect the realities of the modern Internet.

If you have actually read this far, there is a good chance you might be interested in the book, the tutorial, or both. I welcome your comments and input.

If you found this piece to be useful, informative, annoying or would for some other reason like to contact me or comment, please do.

You might also be interested in reading selected pieces via That Grumpy BSD Guy: A Short Reading List (also here).

Thursday, January 25, 2024

Three Minimalist spamd Configurations for Your Spam Fighting Needs (With Bonus Points at the End)

Peter N. M. Hansteen

Making life harder for spammers does not necessarily require a lot of effort, if done correctly. Here are a few suggestions for how to use your spamd(8) on an OpenBSD or FreeBSD system that require minimal input but can yield noticeable gains.

Doing your bit to protect your own users and others against scams, phishing or other undesirable mail activity is good netizenship, but unfortunately there is a tendency to think that contributing in any way takes a lot of effort in addition to deep insight into all matters technical and social.

This piece is intended to give you, an aspiring or experienced OpenBSD or FreeBSD user who does not necessarily run a mail service yourself, a taste of some of the options available to you even if you do not want to expend too much effort.

Note: This piece is also available without trackers but with only basic formatting here.

If your system runs OpenBSD, you only need to enable spamd (overriding the NO defaults from /etc/rc.conf) by adding the following lines to your /etc/rc.conf.local:

spamd_flags=""
spamdlogd_flags=""

Then add the required lines to your pf.conf, cut-and-pasteable from the man page, and reload your ruleset. You may want to look into filling in actual flags later if your setup requires it.

If your system runs FreeBSD, you need to enable PF, install the spamd package, then run through the steps outlined in the package message which is displayed at the end of the package installation.

With those preliminaries out of the way, we can go on to the specifics of each of the low effort scenarios.

Classic imported blacklist-only

When spamd(8) was first introduced, it did only one thing: slow down incoming SMTP traffic from known bad sources. The known bad addresses were the ones fetched from address lists generated locally or elsewhere, as specified in spamd.conf.

The pure blacklisting mode is still available. If you have one or more sources of blocklists that you consider reliable, you can use those. To enable this mode on OpenBSD, add the line

spamd_black=YES

to /etc/rc.conf.local or add the -b flag to any options in the spamd_flags= variable, edit in any lists to fetch in your spamd.conf, restart spamd and add a crontab entry to run spamd-setup at reasonable intervals.

On FreeBSD, the procedure is basically the same, but adding the -b flag to the spamd_flags= variable is the only way to enable the feature.

Once you have the -b mode enabled, any SMTP traffic from the known bad hosts will be stuttered at -- answers arriving at a rate of one byte per second until they give up, and spamd-setup will refresh your lists at the intervals you have specified.
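To get a feel for what one byte per second means in practice, here is a back-of-the-envelope sketch in Python. The banner text and the function name are made up for illustration; only the one-byte-per-second rate comes from the description above.

```python
def stutter_seconds(reply: bytes, seconds_per_byte: float = 1.0) -> float:
    """Time needed to dribble out one reply at the given per-byte delay."""
    return len(reply) * seconds_per_byte


# Hypothetical banner, not the text any real spamd sends.
banner = b"220 mx.example.com ESMTP ready\r\n"
print(f"{len(banner)} bytes take about {stutter_seconds(banner):.0f} seconds")
```

Multiply that by every reply in an SMTP dialogue and it becomes clear why the sending side tends to give up first.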

You can then sit back and enjoy the feeling of getting to waste spammers' (or at least spambots') time.

Checking your system logs for spamd log entries occasionally will likely lead to giggles.

Classic greylisting without imported lists

The original version of spamd(8) did not know how to do greylisting, but since the version that shipped with OpenBSD 4.1, greylisting mode is the default mode.

If you simply enable spamd without touching any other options, you will have greylisting enabled.

This means that any SMTP traffic from hosts that have not previously contacted your spamd will be stuttered at (one byte at a time, remember) for ten seconds at first.

If they come back within a reasonable time, they will be added to the allowable list. If you have a real mail server in the back somewhere, the traffic will eventually be let through.
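For readers who like to see the logic spelled out, the following is a toy model of the greylisting decision in Python. It is a sketch of the general idea only, not how spamd(8) is implemented. The class and method names are invented, and the 25 minute and 4 hour thresholds are my recollection of the man page defaults, so treat them as assumptions and check spamd(8) for the real values.

```python
from dataclasses import dataclass, field

PASSTIME = 25 * 60   # assumed: seconds a sender must wait before a retry counts
GREYEXP = 4 * 3600   # assumed: seconds before an unretried entry is forgotten


@dataclass
class ToyGreylist:
    first_seen: dict = field(default_factory=dict)  # tuple -> time first seen
    passed: set = field(default_factory=set)        # IPs that retried properly

    def attempt(self, ip, helo, sender, rcpt, now):
        """Return "pass" or "grey" for a delivery attempt at time `now` (seconds)."""
        if ip in self.passed:
            return "pass"
        key = (ip, helo, sender, rcpt)
        first = self.first_seen.get(key)
        if first is None or now - first > GREYEXP:
            # New (or expired) tuple: remember it and ask the sender to retry.
            self.first_seen[key] = now
            return "grey"
        if now - first >= PASSTIME:
            # A patient retry of the same tuple: this host behaves like a real MTA.
            self.passed.add(ip)
            return "pass"
        return "grey"  # retried too eagerly, keep waiting


gl = ToyGreylist()
msg = ("192.0.2.1", "mx.example.org", "<a@example.org>", "<b@example.com>")
print(gl.attempt(*msg, now=0))        # first contact
print(gl.attempt(*msg, now=30 * 60))  # same tuple, half an hour later
```

The point of the model is that the sender only has to behave like a standards compliant mail server, which retries after a delay. Most spambots never do.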

Once set up, this mode is also extremely low maintenance.

After a while, your system logs may offer some occasional entertainment.

Allowed domains only

If you're still reading this article, you more likely than not have at least heard about the greytrapping concept. I have written about the concept and practice at length (see the reading materials at the end), and it is one of the topics that I sense is generally perceived as being complicated and labor intensive.

I am here to tell you that there is in fact an easy, low maintenance way in to greytrapping, by making allowed domains be the only criterion for trapping and blocking. This is the method I described in more detail in the previous article A Simpler Life: Trapping Spambots Based on Target Domain Only (or with nicer formatting and Big G's trackers here).

Simply put, if you are running your spamd in the default greylisting mode, with or without imported blocklists, you can tiptoe into greytrapping by adding the domains you want to receive mail for to your spamd.alloweddomains file. If you want to disallow subdomains of otherwise wanted domains, you add an entry with the otherwise wanted domain with an @ (at sign) prepended.

Make the configuration changes specified in the article. Do read the man pages and other relevant references, the article has quite a few links.

Once you have input the wanted domains in your spamd.alloweddomains file and reloaded your spamd service, any attempt at delivery to any domain that is not specified in your configuration will lead to blocklisting and subsequent stuttering until the sender gives up.
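If you want to reason about which recipients would trip the trap before you deploy, a small Python sketch of the matching rule can help. This assumes, as I read the man page, that each entry is compared as a case-insensitive suffix of the recipient address. The function names and the example.com entry are hypothetical, and spamd itself remains the authority on what actually gets trapped.

```python
def load_allowed(text: str) -> list[str]:
    """Parse spamd.alloweddomains-style text, skipping blanks and # comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        entries.append(line.lower())
    return entries


def would_trap(rcpt: str, allowed: list[str]) -> bool:
    """True if no allowed entry is a suffix of the recipient address."""
    addr = rcpt.strip("<>").lower()
    return not any(addr.endswith(entry) for entry in allowed)


allowed = load_allowed("""
# domains we want mail for
bsdly.net
@example.com
""")

for rcpt in ("<test@bsdly.net>", "<someone@sub.example.com>",
             "<captainjohnwhite3@gmail.com>"):
    print(rcpt, "trap" if would_trap(rcpt, allowed) else "ok")
```

Note how the entry with the @ prepended accepts the domain itself but not its subdomains, while the bare entry accepts both.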

With this minimal trapping configuration in place, your logs will soon offer some excellent entertainment. Such as this, which demonstrates that I do not own that domain and do not want to receive or relay mail from elsewhere to it:

Jan 25 16:29:14 skapet spamd[84681]: (GREY) 185.196.10.236: <htg@dataped.no> -> <captainjohnwhite3@gmail.com>
Jan 25 16:29:14 skapet spamd[4259]: Trapping 185.196.10.236 for tuple 185.196.10.236 tTzhEgT <htg@dataped.no> <captainjohnwhite3@gmail.com>
Jan 25 16:29:14 skapet spamd[4259]: new greytrap entry 185.196.10.236 from <htg@dataped.no> to <captainjohnwhite3@gmail.com>, helo tTzhEgT
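Should you want more than giggles out of those log entries, here is a minimal Python sketch that pulls the interesting fields out of the "new greytrap entry" lines. The regular expression is modelled only on the sample shown above, so it is an assumption that other spamd versions log in exactly this shape, and the function name is my own.

```python
import re

# Modelled on the sample log line above; adjust if your logs differ.
TRAP_RE = re.compile(
    r"spamd\[\d+\]: new greytrap entry (?P<ip>\S+) "
    r"from <(?P<sender>[^>]*)> to <(?P<rcpt>[^>]*)>, helo (?P<helo>\S+)"
)


def trapped_hosts(lines):
    """Yield a dict of ip, sender, rcpt and helo for each greytrap log line."""
    for line in lines:
        match = TRAP_RE.search(line)
        if match:
            yield match.groupdict()


SAMPLE = (
    "Jan 25 16:29:14 skapet spamd[84681]: (GREY) 185.196.10.236: "
    "<htg@dataped.no> -> <captainjohnwhite3@gmail.com>\n"
    "Jan 25 16:29:14 skapet spamd[4259]: new greytrap entry 185.196.10.236 "
    "from <htg@dataped.no> to <captainjohnwhite3@gmail.com>, helo tTzhEgT\n"
)

for hit in trapped_hosts(SAMPLE.splitlines()):
    print(hit["ip"], "tried to reach", hit["rcpt"])
```

Feeding it your actual log file instead of the sample is a matter of passing an open file object, since iterating over a file yields lines.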

Bonus tracks: The MX-less merry prankster, and more

All of the things mentioned here will work equally well each on their own or in combination. Should you choose to go on to set up a mail service, they will considerably ease the load on the parts of your setup that do the heavier duty computing involved in mail delivery: the content filtering, whether matching against known bad code (aka antivirus or antimalware) patterns or against text patterns known to be part of scammy spam.

But one fun fact that one of my correspondents pointed out to me some years back is that you can run a spamd service with no real mail service available.

This correspondent reported that sure, they had an OpenBSD machine in an internet facing position, but did not run a mail service.

They set up a combination of the methods outlined earlier, but their mail was handled elsewhere. Anything that finally cleared the barriers of their spamd config would have nowhere to go.

The fact that they did not run an actual mail service did not stop spam senders from trying, and the setup proved ideal for testing how well spamd(8)'s -S and -s options worked.

Please check out the man page to see what they do.

And yes, the effect of -s seemed to be quite linear according to my correspondent's data.

If you want to go further, here is some reading material for you

I hope you find the previous entries informative and possibly even useful.

As you have seen, you can contribute to spam protection efforts even if you do not run an actual mail service. If any of the things suggested earlier suit your needs, enjoy!

However, if you are entertaining the idea of running your own mail service, I have some further reading that I suggest and recommend you spend some time digesting.

First, if you want to run a mail service, do yourself a favor and not only read the relevant man pages, but also sign up for the mailop mailing list, read the Mailop FAQ and the Best Practices for Servers document.

Please also do yourself the favor of lurking, or listening in a bit to get some idea of what kind of discussions are expected there, before posting yourself. Also, familiarize yourself with the mailing list archives. Your question may very well have been answered extensively and well in the past.

If you want to dig deeper in matters related to spam, greytrapping and the OpenBSD spamd(8) program in general, here are a few resources for you:

In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes (also with trackers)

Badness, enumerated by robots (also with trackers)

Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting (also with trackers)

Maintaining A Publicly Available Blacklist (tracked only, sorry)

Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools (also tracked only, sorry)

The Book of PF, 3rd edition (now again available as physical copies).

Thanks to Michael Lucas, who wrote a message on the mailop mailing list that spurred me to write both this article and the previous A Simpler Life: Trapping Spambots Based on Target Domain Only (or with nicer formatting and Big G's trackers here).

Wednesday, January 24, 2024

A Simpler Life: Trapping Spambots Based on Target Domain Only

If you want to hurt spammers, you can get away with maintaining a list of domains you want to receive mail for in your spamd.alloweddomains.

I have at times written at length about spam countermeasures, and I must take responsibility for sometimes going into too much detail about options and nuances that are on offer if you enjoy fighting back at the spammers and watching them fail.

So it was a bit refreshing to be reminded that you can, in fact, make good use of the OpenBSD spam deferral daemon spamd(8) without maintaining lengthy lists of anything or even pulling in externally generated data, unless you want to.

The key to the simplest version of spam fighting life with spamd(8) is to put a list of the domains you do want to receive mail for in a file called spamd.alloweddomains, in /etc/mail/ if your system runs OpenBSD, and in /usr/local/etc/spamd/ if you are setting up on a FreeBSD system. Make sure the file is readable for the user that runs the spamd(8) process, and restart or reload your spamd.

The result will be that any host that tries to deliver mail to addresses in domains that are not listed in spamd.alloweddomains will be greytrapped and added to your spamd-greytrap. The host will be stuttered at until it gives up.

If you have no use for external blocklists or allowlists, you can even empty spamd.conf if you want (or comment out any content with # (hash) characters). The spamd process will run fine without one.

Here is an example lifted from my nxdomain.no server recently:

Jan 23 15:18:27 skapet spamd[84681]: (GREY) 193.222.96.180: <test@bsdly.net> -> <director_ericmoore@hotmail.com>
Jan 23 15:18:27 skapet spamd[4259]: Trapping 193.222.96.180 for tuple 193.222.96.180 win-4tti4dh7sgh.domain <test@bsdly.net> <director_ericmoore@hotmail.com>
Jan 23 15:18:27 skapet spamd[4259]: new greytrap entry 193.222.96.180 from <test@bsdly.net> to <director_ericmoore@hotmail.com>, helo win-4tti4dh7sgh.domain

Needless to say I am not Microsoft, so hotmail.com is not in nxdomain.no's /etc/mail/spamd.alloweddomains.

If you want to pull in external blocklists or pass lists, you can pull in a spamd.conf with content. One useful starting point is the default version, or if you want you can start with mine, which pulls in some other resources.

Finally, if you want to run a mail service, do yourself a favor and not only read the relevant man pages, but also sign up for the mailop mailing list, read the Mailop FAQ and the Best Practices for Servers document.

Thanks to Michael Lucas, who wrote a message on the mailop mailing list that spurred me to write this article.

If you want to dig deeper in matters related to spam, greytrapping and the OpenBSD spamd(8) program in general, here are a few resources for you:

In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes (also with trackers)

Badness, enumerated by robots (also with trackers)

Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting (also with trackers)

Three Minimalist spamd Configurations for Your Spam Fighting Needs (With Bonus Points at the End) (also with trackers)

Maintaining A Publicly Available Blacklist (tracked only, sorry)

Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools (also tracked only, sorry)

The Book of PF, 3rd edition (now again available as physical copies)

Wednesday, September 14, 2022

Open Source in Enterprise Environments - Where Are We Now and What Is Our Way Forward?

We have been used to hearing that free and open source software and enterprise environments in Big Business are fundamentally opposed and do not mix well. Is that actually the case, or should we rather explore how business and free software can both benefit going forward?

Puffy, the OpenBSD mascot, shiny version

Free and Open Source vs Enterprise and Business: The Bad Old Days

Open source, free software and enterprise IT environments have all been around for quite a while. I'm old enough to remember when the general perception was that the free exchange of source code was merely a game for amateurs, or at best an academic exercise. In contrast, the proper business way of doing things was to perhaps learn general principles and ideas from the academics, but real products for business use would be built to be sold as binary only, with any source code to be kept locked away and secret.

Note: This piece is also available without trackers but with more basic formatting here.

If you're a little younger you may remember a time when "Windows NT is the future" was essentially gospel and all the business pundits were saying we would be seeing the last of Unix and mainframes both within only a handful of years.

Thinking back to the late 1980s and early 1990s it is hard to imagine now how clear the consensus seemed to be on the issue at that point. The PC architecture and a few other proprietary technologies were the way of business and the way forward.

No discussion or dissent seemed possible.

Then, The Internet Happened

Then the Internet happened. What few people outside some inner circles were aware of was that what actually made the Net work was code that came directly out of the Berkeley Software Distribution. BSD Unix, or simply BSD for short, was a freely licensed operating system that was the result of a rather informal cooperation of researchers in academia and business alike, originally derived from Unix source code.

When the United States Department of Defense wanted work done on resilient, device independent, distributed and autoconfiguring networks, the task of supplying the reference implementation for the TCP/IP stack, based on a stream of specifications dubbed Requests for Comments or RFCs, fell to the international group of developers coordinated by the Computer Systems Research Group at the University of California's Berkeley campus. In short, the Internet came from BSD, which thanks to a decision made by the Regents of the University of California, was freely licensed.

The BSD sourced TCP/IP stack was part of all Internet capable systems until around the turn of the century, when Linux developers and later Microsoft started working on their own independent implementations. By that time it had been forcefully demonstrated to the developer community at least that open source code was indeed capable of scaling to industrial scale and beyond.

Due to a handful of accidents of history, mainly involving imperfect communications between groups of developers and combined with a somewhat misguided lawsuit involving the BSD code, it was Linux that became the general household term for free software in general and the re-emergence of Unix-like systems in the Internet connected server market space. Linux distributions came with a largely GNU userland as well as generous helpings of BSD code.

At roughly the same time Linux emerged, the BSD code became generally available via the FreeBSD and NetBSD projects, and soon after the OpenBSD project, which forked from the NetBSD code base in the mid 1990s. For a more detailed history of these developments, see the three part series on the APNIC blog starting with this piece. If that piqued your interest, you may enjoy this piece about some incremental improvements over time in OpenBSD.

The War on Linux and the Proliferation of Open Source Tools

During the 1990s and early 2000s the Internet and services of all kinds that ran on top of it expanded in all directions. That expansion had the effect of advancing the free unixlike systems such as Linux and the BSDs, which would run quite comfortably on commonly available hardware, along with an ever expanding number of development tools and software of all kinds to new categories of users.

The success of the open source software led to what would be dubbed The War on Linux, a rather vicious defamation campaign executed in both PR campaigns and lawsuits, and driven mainly by the then-dominant desktop software vendor's ambition to dominate server space as well. One of the more bizarre sequences of Linux-targeting lawsuits was run by proxy, and is extensively documented at groklaw.net (Note: http-only site). It is worth noting that the process eventually led to bankruptcy for the litigant.

Over the years it became clear to essentially everyone in the industry that open source tools were essential to development, and several practical aspects of developer life led to ever increasing open source use. During the time of The War on Linux, the likes of Apple, Cisco, Netscaler (later acquired by Citrix) and Sun Microsystems (later acquired by Oracle) either incorporated open source code in their products and workflows, open sourced large parts of their own code or forked freely available code to base proprietary systems on. It may be worth discussing each of these approaches in detail later.

On to the Present: We All Use...

Fast forward to the present day, and I recently had colleagues sum up that in the enterprise environments we move in,

Software is developed on Macs,
deployed on a cloud somewhere,
which more likely than not runs on Linux.

And the software itself is likely built with open source tools and pulls in dependencies from open source projects, possibly hosted on GitHub or other public sites.

Your software in all probability uses some open source. And even if you are not a developer, you most likely use open source tools that are integrated in your operating system or common application software or web services.

On the client side of things, an ever increasing part of the volume comes from smartphones, tablets and the like, where the market share for open source based systems (Android and iOS) exceeds 90 percent. In a document we will come back to later, the Norwegian National Security Authority (NSM) estimates that approximately 90 to 98 percent of all software in use to some extent has dependencies on open source software. Other relevant statistics can be found here, here and here. Or, if you're in a bit of a hurry: It is estimated that some 3.1 billion Linux-based Android phones are currently in use. In addition, there is Apple, which we know has a significant amount of BSD code in their software.

It is of course worth noting that by now even the old open source arch-enemy Microsoft ships their offerings with what amounts to an almost complete Linux distribution as a subsystem. The same company regularly lobs cash over the wall to the likes of The OpenBSD Foundation and regularly contributes to other open source projects. Not to mention that much of what runs in their Azure cloud is one way or the other Linux based.

Security: QA Your Supply Chain, Exercise the Right to Repair

Back in the days of The War on Linux, and to some extent still, we have often been faced with claims that open source software could either never be as secure as proprietary software or that open source software was inherently more secure than the closed source kind, because "given enough eyes, all bugs are shallow".

Both assertions fail because even without access to source code, it is possible to probe running software for vulnerabilities, and on the other hand the shallowness of bugs depends critically on the eyes looking being attached to people with sufficient competence in the field.

The public reaction to a couple of security incidents during recent years, which generated a flurry of largely uninformed punditry, is worth revisiting for the lessons that can be learned.

The Solarwinds supply chain incident aka SUNBURST (2020) - One of the most widely publicized yet mostly quite poorly understood security incidents in recent years emerged when it was revealed that adversaries unknown had been able to compromise the build computers where the binaries for Solarwinds' widely used network management software were built for distribution.

The SANS institute has produced a fairly thorough writeup of the incident, which breaks down as follows: The first stage of a multi-stage compromise kit was included in binary distribution packages, complete with authentic signatures from the build system, that were largely put directly into production environments by network admins everywhere. The malware then went on to explore the networks it landed in, and through a process that made heavy use of crafted DNS queries and other non-obvious techniques, the miscreants were able to compromise several high security government and enterprise networks.

Several open source component supply chain incidents (2020 onwards) - Soon after the SUNBURST incident several incidents occurred where popular open source components that other systems pulled in as dependencies started malfunctioning or were suddenly unavailable, causing complete malfunctions or loss of functionality such as a web service suddenly refusing to interact with specific networks.

The sudden breakage in open source components caused quite a bit of uproar, and predictably the chattering subset of the consulting class set about churning out dire warnings about the risk of using open source of any kind.

Watching from the sidelines it struck many open source oriented professionals, myself included, that the combination of these incidents carries an important lesson. It is obvious that in a modern environment we suck in upgrades automatically and frequently, and that no untested code should ever be deployed directly to production.

Blind trust versus the right to read (and educate yourself) and the right to repair - In the case of proprietary, binary-only software, you have no choice but to trust your supplier and that they will address any defects in a timely manner. The upshot is that with proprietary, binary-only software you do not have access to two important features of open source software: the right to read and study the code, and the right to repair any defects you find, which can save you from service shutdowns or workarounds while the secret parts of your system get fixed elsewhere.

The lesson to be learned is that you need to run quality assurance on your supply chain. You may choose to trust, but you still need to verify. That goes for open source and proprietary software both.

This Norwegian felt slightly elated when reading that the Norwegian National Security Authority (NSM) provides essentially the same assessments in their published recommendations.

Contributing - Cooperating on Maintenance

As with any product it is entirely possible to be a relatively passive consumer, just install and use, and build whatever you need on top, interacting with the community only via downloading as needed from the mirror sites. Communicating via online forums, mailing lists or other channels is entirely optional.

If you are a developer or integrator with an ambition to make one or more open source products central to your business either by using and contributing to an existing project or starting a new one, several approaches are possible.

Let's take a look at the strategies some big names adopted on open source in their products:

Grab and fork, sell hardware: The Netscaler load balancer and application delivery products were based on a fork of FreeBSD.

They appear to have rewritten large parts of the network stack and devised a multifunctional network product on top, which among other things features a slick web GUI for most if not all admin tasks.

If you look closely, Netscaler (since acquired and rebranded by Citrix) appear to cultivate a menagerie of open source projects to interface with their products.

However, they appear not to be in particularly close contact with their main upstream. (It is worth noting that the BSD license does not require publishing changes to the code base.) When dropping to a shell on a Netscaler unit, last time I looked the output of uname -a seemed to indicate that their kernel was still based on FreeBSD 8.4, which the FreeBSD web site lists as End of Life by August 1, 2015.

Grab and fork, sell hardware, keep sync with your upstream: Starting with the initial release of macOS, Apple have maintained the software that drives their various devices, from phones to desktop computers and related services with generous helpings of open source code, along with what appears to be a general willingness to publish code and interact with upstream projects such as the FreeBSD project. Apple maintains the Open Source at Apple site for easy access to the open source components of their offerings.

This mode of open source interaction seems to be rather common, especially among network oriented suppliers of various specialty gear.

Open source everything, sell support: Despite early scepticism from business circles, several companies have built successful businesses on the model of participating in or even driving the development of open source systems or components, making support contracts (which may include early or privileged access to updates) as well as consulting services the main or sole source of company revenue.

Decide what code is both good enough to publish and useful elsewhere: Finally, for those of us in the services or consulting business who will occasionally write code that is not necessarily business specific, the reasonable middle ground is just that. Identify code that meets the following criteria:

Was developed by yourself and cleared by your organization and other stakeholders such as your customer as such
Is high enough quality that you dare show it to others
Does not reveal core aspects of your clients' business
Is likely to be useful elsewhere too
Would be nice to have exposed to other sets of eyes in order to identify bugs and fix them

If you have code under your care in your organization that meets those criteria, you should in my opinion be seriously considering making that code open source.

Your next adventure will then be to pick an appropriate license.

Now for Policies and Processes - Do You Have Them?

If you have followed along this far, you probably caught on to the notion that it is wise to set up clear policies and procedures for handling code, open source or otherwise.

Keep in mind that

A license is an assertion of authority. A license is a creator's message to the world that states the conditions others must abide by when using or, if the creator allows it, changing and further developing the code.

Without a license the default regime is that only the person or persons who originated the code have the right to make changes or for that matter make further copies for redistribution.

For that reason it is important to ensure that every element of your project has a known copyright and license.
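One way to check this in practice is a small audit script. The following is a minimal sketch, not a definitive tool. It assumes your project marks each source file with an SPDX short-form identifier near the top (a convention this article does not otherwise cover), and the names find_license and audit, the 20-line limit and the list of file suffixes are all made up for illustration.

```python
import re
import tempfile
from pathlib import Path

# SPDX short-form tag, e.g. "SPDX-License-Identifier: BSD-2-Clause".
# The optional "*/" lets the tag sit inside a one-line C comment.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(\S.*?)\s*(?:\*/)?\s*$")


def find_license(path, max_lines=20):
    """Return the SPDX expression found near the top of a file, or None."""
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            # Only the first max_lines lines are inspected (assumed limit).
            for _, line in zip(range(max_lines), handle):
                match = SPDX_TAG.search(line)
                if match:
                    return match.group(1)
    except OSError:
        return None
    return None


def audit(root, suffixes=(".py", ".c", ".h", ".sh")):
    """Map each source file under root to its license tag (None if missing)."""
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            report[path.relative_to(root).as_posix()] = find_license(path)
    return report


if __name__ == "__main__":
    # Demonstration on a throwaway tree with one tagged and one untagged file.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "tagged.py").write_text(
            "# SPDX-License-Identifier: BSD-2-Clause\n", encoding="utf-8"
        )
        Path(tmp, "untagged.sh").write_text("#!/bin/sh\n", encoding="utf-8")
        for name, tag in audit(tmp).items():
            print(f"{name}: {tag or 'NO LICENSE TAG FOUND'}")
```

A script like this only tells you which files carry a tag. Whether the tagged license is the right one, and who holds the copyright, are still questions for the people involved.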

There have been quite a few instances of free software projects writing functionally equivalent, or hopefully better, versions of whole subsystems because of unacceptable or unclear licenses (see the OpenBSD articles in the Resources section for some examples).

Procedures and policies, you need them. A self employed developer working on their own project is usually free to choose whatever license they please. In a corporate environment, any code developed is likely tied to a contract of some sort, which may or may not set the parameters of who holds the copyright or what licenses may be acceptable. The exact parameters of what can be decided by contract and what follows from copyright law may vary according to what jurisdiction you are in. When considering whether to publish your own code under an open source license, make sure all stakeholders (and certainly any parties to any relevant contract) agree on the policies and procedures.

Keep it simple, for your own sake. There are supposedly several hundred licenses in existence that the Open Source Initiative considers to be open source. In the interest of making life easier for anyone who would be interested in working on your code, please consider adopting one of those well-known licenses.

They range from the simplest, BSD or MIT style ones that run a handful of sentences and can be condensed to "you can do whatever you like with this material except to claim that you made it all yourself", to elaborate documents (the GNU GPL v3 comes to mind) which set out detailed terms and conditions, may require republication of any changes under the same terms, and could set up a specific regime with respect to patent disputes.

It is also important to consider that components you use in your project may have specific license requirements and that different licenses may contain terms that make the licenses incompatible in practice.

My general advice here is, make it as simple as possible, but no simpler.

Or to rephrase slightly: The general advice for dealing with licenses echoes that of dealing with crypto code: Do not set out writing your own unless you know exactly what you are doing. Avoid that path if at all possible.

When in need, call in Legal (but make sure they understand the issues). Lawyers endure a lengthy education in order to pass the bar and turn to practicing law, but there is no guarantee that a person well versed in other business legalese has any competence at all when it comes to matters of copyright law. When you do turn to Legal for help, be very exacting and stern in insisting that they demonstrate a command of copyright basics and if at all possible have a reasonable real world understanding of how software is built.

As in, you really do not want to spend an entire afternoon or more explaining the difference between static and dynamic linking and why this matters in the face of a certain license, or that specific terms of different licenses deemed open source by the Open Source Initiative may in fact be incompatible in practice.

It is important to keep in mind that doing open source is about making our lives more productive and enjoyable by exchanging ideas between quality professionals, perhaps sharing the load of maintenance and leaving us all more resources to develop our competence and products further.

The Way Forward - The Work Goes On

So this is where we are today. Modern software development and indeed a goodly chunk of business and society in general depend critically on open source software.

If you enjoyed this piece (or became annoyed by any part of it) I would like to hear from you. I especially welcome comments from colleagues who have experience with open source use and/or development in enterprise settings. Of course if you are just curious about open source software in these settings, you are welcome to drop me a line too. I am most easily reachable via email nix at nxdomain dot no.

I want to extend thanks to Malin Bruland and Knut Yrvin for excellent comments and proofreading.

Resources

All things open source (including an almost encyclopedic collection of licenses) at The Open Source Initiative

Wikipedia: Berkeley Software Distribution about where the Internet came from

The GNU Operating System, supported by The Free Software Foundation

The FreeBSD operating system project

Open Source at Apple

Peter Hansteen: What every IT person needs to know about OpenBSD Part 1: How it all started,
What every IT person needs to know about OpenBSD Part 2: Why use OpenBSD?,
What every IT person needs to know about OpenBSD Part 3: That packet filter
(or the whole shebang in the raw at bsdly.blogspot.com)

Bradford Morgan White: The Berkeley Software Distribution

Nasjonal Sikkerhetsmyndighet (NSM): Åpen kildekode i den digitale leverandørkjeden (Norwegian only)

Business of Apps: Android Statistics (2023)

Bank My Cell: How Many Android Users Are There? Global and US Statistics (2023) (Source: https://www.bankmycell.com/blog/how-many-android-users-are-there)

Statista: Market share held by Apple iOS operating system of smartphone shipments from 1st quarter 2011 to 4th quarter 2022

Appendix: License Complexity Measured by Word Count

While presenting on free and open source software in enterprise environments, the topic of license complexity and how to handle licensing matters usually generates questions of the type,

"Does doing open source mean we need to staff an Open Source Program Office?
Does this not add a considerable measure of complexity to the development organization?
Do the open source licenses mean we have to hire even more lawyers?"

So I set out to do a little research. I figured that the number of words in a text is a useful, if not perfect, indicator of complexity, so we could use that measure as an easy to obtain proxy for how complex the licenses we are likely to encounter are in practice.

I headed over to the Open Source Initiative website and their excellent collection of open source licenses. I then picked out the more common open source licenses, and for each license I pasted the text into the word counter at wordcounter.net, which in addition to the word count provides an indication of likely target audience "reading level" and estimated reading time as well as a few other measures of the text characteristics.

The results are in the following table:

License complexity by wordcount
License	Word count	Reading Level	Reading time
1-clause BSD License	160	College Graduate	35s
2-clause BSD License	191	College Graduate	42s
3-clause BSD License	220	College Graduate	48s
GNU GPL v2.0	2964	College Graduate	10m47s
GNU GPL v3.0	5608	College Graduate	20m30s
Apache License v2.0	1677	College Graduate	5m44s
Microsoft 365 Developer program license	4803	College Graduate	17m28s
Microsoft Windows 11 OS license terms	5766	College Graduate	20m58s
Oracle End User License Agreement	2554	College Graduate	9m17s
Adobe End-User License Agreement	450	College Graduate	1m38s
Apple Licensed Application End User License Agreement	1524	College Graduate	5m32s

Once again, strict word count is not a perfect indicator of complexity. Other measures, such as sentence length, logical structure and interdependencies, are likely to matter in real life scenarios.
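If you would rather not paste license texts into a web page, the word count and reading time columns can be approximated locally. What follows is a rough sketch under stated assumptions, not a reimplementation of wordcounter.net, whose exact method I do not know. The reading speed of 275 words per minute is my own guess, chosen because it comes close to most of the reading times in the table. The function names word_count and reading_time are made up for illustration, and the reading level column is not attempted at all.

```python
import re

# Assumed reading speed. This is a guess that roughly matches the table,
# not a documented parameter of any particular tool.
WORDS_PER_MINUTE = 275


def word_count(text):
    """Count whitespace-separated tokens that contain a letter or digit."""
    return sum(1 for token in text.split() if re.search(r"\w", token))


def reading_time(words, wpm=WORDS_PER_MINUTE):
    """Return the estimated reading time as a string like '35s' or '10m47s'."""
    total_seconds = round(words * 60 / wpm)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m{seconds:02d}s" if minutes else f"{seconds}s"


if __name__ == "__main__":
    # A short stand-in text; in real use, read the full license from a file.
    sample = "Redistribution and use in source and binary forms"
    count = word_count(sample)
    print(f"{count} words, about {reading_time(count)} to read")
```

Different tools split words differently (hyphens, section numbers, URLs), so expect counts from a script like this to differ slightly from the numbers in the table.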
