Thursday, September 18, 2025

EU CRA: It's Later Than You Think, Time to Engineer Up!

© 2025 Peter N. M. Hansteen

On December 12 2027, it's already too late. The day before, the European Union Cyber Resilience Act (CRA) will have fully entered into force.

On December 11 2027, the Cyber Resilience Act is fully in force in the European Union member states and associated countries and territories.

From that date onward, suppliers of any "product with digital elements" are required to present those products along with a full overview of, and insight into, all components and dependencies that went into making that product.

Unless, of course, you are a supplier that is fine with being considered at best second rate, or even being ineligible for lucrative contracts. Selling product that has not qualified for the CE mark for its product category will simply not do.

The European timeline for phased implementation of the CRA is outlined here, among other places.

Even if you are on the other side of the pond, you're not out of the woods. But more on that later.


Note: This piece is also available without trackers but classic formatting only here.

Upping Your Engineering Game

For individual developers, the question becomes something more along the lines of "Do you know what your code does?", or even "Do you know everything your code does?".

To put it bluntly, whether your answer to either of these is a clear yes or no determines whether you are just a coder or an engineer who codes.

The purpose of this session is to help you move towards becoming the latter. To start you upping your engineering game.

To set the stage for what real engineers (should) do and to keep focus on the importance of doing things right, the anecdote of the Canadian engineers' steel ring is a useful reference.

This all sounds a bit harsh, I know. So we will go a little softer at first, much like I did in my earlier article No Project Is an Island: Why You Need SBOMs and Dependency Management.

And yes, some of this will sound familiar if you have taken in that piece or participated in the live sessions based on the text.

Dear Developer, do you know what your code does?

So let's ask the question,

Dear developer, do you know what your code does?

Your answer is likely to be along the lines of

Sure, I wrote it all. I know what it does.

Unless you vibe coded the thing, that is. But let's leave that particular set of circumstances for another time.

The answer I wrote it all. I know what it does is, however, unlikely to be totally accurate. Unless you are doing extremely low level stuff and your code speaks directly to the hardware, your code more likely than not also pulls in and utilizes dependencies such as system calls and library functions that provide the foundation of functionality that makes the code you wrote work.

Knowing your dependencies and what role they play in making your code work is a significant part of delivering proper quality. More on that later. First, we turn to a little history of software.

Just a Bit of Typing

Software is a relatively recent phenomenon. For a long time, you could credibly say most of its existence, software was poorly understood by society and industry at large.

There was a time -- and I am old enough to remember that time -- when software was considered a minor, somewhat irritating but necessary, component in IT deliveries.

On the more extreme end of things, you would occasionally hear that software was not at all important, literally just a bit of typing.

All the while it was ever more clear to developers and practitioners that the software was what made all that expensive hardware useful. But software was all ephemeral to most, in almost all cases the source code was secret, and the customer was expected to just accept whatever came your way as-is.

That perception changed over time, and during recent decades it is no longer in doubt that the software industry is just that, an industry in its own right.

But Then Suddenly Software Turned Important

Then, as some of us still remember, the Internet happened.

Few people realized it at the time, but this was the point in history when two important things happened at roughly the same time.

For one, it became obvious to developers at least that the infrastructure we all have come to rely upon owes its strength and resilience to the fact that it consists mainly of software that was built on standards built on rough consensus and working code, code that was open source.

The other thing was that software faced the full force of the entire world banging away at their keyboards.

Some of those keyboards were operated by people who intended to do bad things.

And eventually, bad things started happening.

Over the years, eventually enough episodes piled up that software security, sometimes discussed under other labels, started becoming an issue.

During the twenty-tens and -teens, we had several incidents where software bugs were tickled enough to lead to costly and embarrassing episodes. Some of these episodes were grave enough that the powers that be (the kind wearing suits) discovered that software was indeed something they needed to care about.

These episodes spurred several things, one being memes like

(XKCD #2347, please also read the explainer), which led to the common belief that supply chain management and its subtopic dependency management are mainly a problem that concerns open source software.

This assertion is simply not true, in that no project is an island.

Whether you let others see the code you wrote or not, the software does not exist in isolation.

The XKCD comic struck a chord with open source developers, who at the time were a lot more in tune with the world of software dependencies than most other people.

Dependencies Became A Thing

There were several high profile and scary security incidents from the twenty-tens into the early twenty-twenties. Some were due to exploitable and exploited bugs in open source code and dependencies, such as the Log4Shell incident involving a very popular logging library. This incident served to make it clear to C-level executives that dependencies were indeed a thing, and that their infrastructure was to a large extent made up of open source software.

At roughly the same time, the SUNBURST supply chain incident, which involved a popular piece of proprietary network management software that had been backdoored, demonstrated that even when the source code is kept secret, that is not sufficient protection against skilled adversaries.

These and other grave incidents made supply chain security an important new addition to our software security vocabulary.

No Project Is an Island

As I mentioned earlier, no project is an island.

Whether you let others see the code you wrote or not, the software does not exist in isolation.

Summing up so far,

  • We write software
  • Which depends on other software
  • Which interacts with other software
  • Which again interacts with other components (hardware, humans)
  • To run important stuff
  • Nothing exists in actual isolation – No project is an island

So what we do is important. What do we do about that?

Learn From Those Who Build Important Things

One way to handle the situation is to look at what other people who build important things do.

In other fields, the term Bill of Materials, or BOM for short, is a familiar term. The Bill of Materials is a document or set of documents that lists all component parts of a delivery.

This is the kind of document that becomes crucial in contexts where the procuring organization is geared toward accounting for everything and auditing when the supplier least expects it.

One such context could be when your organization has landed a contract to supply a backhoe, an armored personnel carrier or even a ship, and the contract requires you to specify component materials used, down to the nuts and bolts level.

For an example of the scale of things we are talking about, consider this ultra high level view of an item that was delivered to the UK Royal Navy, one aircraft carrier HMS Queen Elizabeth:

Aircraft carrier HMS Queen Elizabeth, exploded view

Your delivery would not be considered complete without the Bill of Materials or Manifest, even for a thing this size.

In practice, the BOM for the HMS QE and similar-sized projects would be a collection of BOMs with specifications for each of the multitude of component deliveries that make up the whole. Each supplier would be required to come up with a Bill of Materials for their delivery.

For physical deliveries to organizations of some stature, a Bill of Materials has been a standard part of the process across industries as an important part of quality assurance and a fundamental part of maintenance processes.

Software, on the other hand, has traditionally not been subject to that kind of scrutiny.

What Do Engineers Do?

In other fields of engineering, the process runs roughly like this:

You design your product, make detailed plans and descriptions of how to build the thing.

While planning and building, you keep track of all parts and components.

A Bill of Materials (BOM) for a pump that could well be a part of the HMS Queen Elizabeth could look like

Screenshot of a Bill of Materials (BOM) for a boat pump, possibly part of a larger delivery

Your plans and design documents will likely undergo changes during product development and assembly.

For each delivery, you create a Bill of Materials that is a required and essential part of the delivery.

The Bill of Materials (BOM) lists all component parts, to the detail level required for running maintenance.

The BOM typically also references and serves as reference for maintenance documentation.

As an aside, it is likely worth noting that the US Department of Defense's need for structured text markup in processing inventory information such as bills of materials was one of the more important drivers, albeit not the only one, behind the creation of SGML, the direct precursor to HTML and XML.

Again, for a long time, this kind of engineering practice was not seen as a requirement for software deliveries.

Libre Software Has Package Management Already

Handling dependencies in software is not a new thing. You probably poke around for dependencies yourself when you start looking into a new project.

You will start by looking into the source code files in your project; any libraries or tools needed to build the thing would be nice-to-knows. Once you have the thing built, it becomes interesting to know what other things -- libraries, suites of utilities, services that are required to be running or other software frameworks of any kind -- are required in order to have the thing run.

So basically, any item your code would need comes out as a dependency, and you will find that your code has both build time and run time dependencies.

Those terms will be quite familiar to users and the developers of the package manager systems for the various open source operating systems. The very same items you would recognize from a listing of package dependencies in a package management tool will turn up in our Software Bill of Materials too. Depending on the specific tool and options you use, the SBOM could contain additional information that may not be entirely relevant in a package manager context.

In any case, with package systems in place, and even vulnerability scanners available to scan for insecure code at rest or while running, the free and open source software communities were in fact well positioned for the legal requirements when they hit. Even more, the lessons learned from package management came in quite useful in meeting and satisfying the updated requirements.

Every free operating system, and in fact most modern-ish programming languages, come with a package system to install software and to track and handle the web of dependencies. You are supposed to use the corresponding package manager for the bulk of maintenance tasks.

So when the security relevant incidents hit, the open source world was fairly well stocked with code that did almost all the things that were needed for producing what became known as Software Bill of Materials, or SBOM for short.

Introducing: A Software Bill of Materials (SBOM)

So what would a Software Bill of Materials even look like?

Obviously nuts and bolts would not be involved, but items such as the source code files in your project, any libraries or tools needed to build the thing would be nice-to-knows. And once you have the thing built, it becomes interesting to know what other things -- libraries, suites of utilities, services that are required to be running or other software frameworks of any kind -- are required in order to have the thing run.

The information is there in our code, and with development tools and code scanners a developer is well placed to poke around.

The next challenge is to take that information and present it in a way that conforms with the legal specification and is usable for stakeholders that are not developers.

In addition to module or package names and versions, the expected SBOM product will typically include information on any identified security problems such as CVEs and a specification of the licenses that apply to each of the identified dependencies.
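To make the CVE part concrete, here is a minimal Python sketch that tallies reported problems by severity. It assumes a CycloneDX-style JSON document with a top-level vulnerabilities array; not every generator fills that array in, and a separate scanner may report the same information in its own format. The function name and the sample data are illustrative only, and the second sample entry is made up.

```python
from collections import Counter

# Hypothetical CycloneDX-style fragment; the second entry is invented
# for illustration and does not refer to a real advisory.
SAMPLE = {
    "vulnerabilities": [
        {
            "id": "CVE-2021-44228",
            "ratings": [{"severity": "critical"}],
            "affects": [
                {"ref": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"}
            ],
        },
        {
            "id": "EXAMPLE-0001",
            "ratings": [],
            "affects": [{"ref": "pkg:generic/some-lib@1.0"}],
        },
    ]
}


def severity_counts(sbom):
    """Count vulnerabilities per severity, using the first rating if present."""
    counts = Counter()
    for vuln in sbom.get("vulnerabilities", []):
        ratings = vuln.get("ratings") or []
        severity = ratings[0].get("severity", "unknown") if ratings else "unknown"
        counts[severity] += 1
    return dict(counts)


print(severity_counts(SAMPLE))
```

Taking the first rating is a simplification; an entry can carry several ratings from different sources, and a stricter report would pick the most severe one.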

Thanks in large measure to the open source heritage of the specifications and tools, both of the commonly used SBOM specifications (SPDX and CycloneDX) treat information on the licenses used in a file or project as items worth tagging and tracking.

The tools we describe have some measure of support for tracking and reporting on licenses in use. This can be useful for flagging licenses that may be mutually incompatible or even incompatible with your organization's business goals.
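As a rough illustration of license flagging, the following Python sketch groups components by declared license and picks out the ones on a list you supply. It assumes CycloneDX JSON, where each entry in a component's licenses list carries either a license object (with an id or a name) or an expression string. The component names, the function names and the choice of unwanted licenses are all hypothetical.

```python
from collections import defaultdict

# Hypothetical components covering the shapes a license entry can take.
SAMPLE = {
    "components": [
        {"name": "liba", "licenses": [{"license": {"id": "MIT"}}]},
        {"name": "libb", "licenses": [{"license": {"name": "Custom vendor license"}}]},
        {"name": "libc", "licenses": [{"expression": "MIT OR Apache-2.0"}]},
        {"name": "libd"},
    ]
}


def components_by_license(sbom):
    """Map each license label to the sorted names of components declaring it."""
    by_license = defaultdict(list)
    for comp in sbom.get("components", []):
        name = comp.get("name", "?")
        entries = comp.get("licenses") or []
        if not entries:
            by_license["(none declared)"].append(name)
        for entry in entries:
            lic = entry.get("license", {})
            label = (
                lic.get("id")
                or lic.get("name")
                or entry.get("expression")
                or "(unrecognized)"
            )
            by_license[label].append(name)
    return {label: sorted(names) for label, names in by_license.items()}


def flagged(sbom, unwanted):
    """Names of components whose license label is on the caller's unwanted list."""
    grouped = components_by_license(sbom)
    return sorted({name for label in unwanted for name in grouped.get(label, [])})


print(components_by_license(SAMPLE))
print(flagged(SAMPLE, ["MIT"]))
```

Note that the sketch compares labels literally and does not parse license expressions, so "MIT OR Apache-2.0" is treated as its own label rather than as two alternatives.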

Several pieces of legislation emerged from the at times panic flavored fallout from the security incidents. Which ones are more relevant to you will become clear as we move on.

Depending on what parts of the world you care more about, the emphasis will either be on

So that's our backdrop for now.

Mainly (I think) due to coordinated lobbying by major players, both have roughly the same time frames for becoming formal requirements. The EU Cyber Resilience Act (CRA) enters fully into force with a CE mark scheme for digital products due to be in place by the same deadline.

The name of the SBOM game is compliance with those legal requirements, and to not only generate the information -- that's the relatively easy part -- but also to present the information in a way that is understandable and actionable to stakeholders who are not themselves software developers.

We're Real Engineers Now, Sparky! We Have Tools

As I hinted at earlier, there are tools available for all of this. If you want to go on and explore for yourself, I would recommend going to the awesome-sbom site, which offers a curated collection of SBOM resources and tools hosted as a GitHub repo.

There are a large number of tools available, with varying feature sets. In addition to the free tools you find via that collection, several tool suites exist that are either exclusively commercial or offered as free trial or reduced feature set versions, with the full feature set available only to paying customers.

The tool set I found the most accessible for my poking around was the combination of syft for generating SBOMs and bomber for display and presentation. The home pages for both are linked from the awesome-sbom collection.

As you can see from that page, there are several SBOM formats around, and to some extent standardization and interoperability efforts are under way. But enough of that, let's look at the actual tools in use.

Tools and How To Use Them

As a first step, it is instructive to point syft at the base directory of your project and see if it can tell you something you did not know already. syft supports a number of output formats, so if XML is the more readable format to you,

$ syft . -s all-layers -o cyclonedx-xml | xq

will give you pretty-printed XML output of what syft found (assuming you have xq installed). Do explore the various command line options for extracting various information about your project.

If you prefer JSON over XML, something like

$ syft . -s all-layers -o cyclonedx-json | jq

will give you readable JSON of the same information. Again, there are a number of options to explore.
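Once the JSON is in hand, a few lines of scripting turn it into a quick inventory. The following is a minimal Python sketch, assuming CycloneDX JSON with a top-level components array. To keep it runnable on its own it uses a tiny inline sample rather than real syft output; in practice you would load a saved file with json.load() instead. The function name is illustrative.

```python
import json

# Tiny inline sample in CycloneDX JSON shape, so the sketch runs on its own.
SAMPLE = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "requests", "version": "2.31.0"},
    {"type": "library", "name": "urllib3", "version": "2.0.7"},
    {"type": "file", "name": "setup.cfg"}
  ]
}
""")


def list_components(sbom):
    """Return sorted (name, version) pairs; the version field is optional."""
    return sorted(
        (comp.get("name", "?"), comp.get("version", "-"))
        for comp in sbom.get("components", [])
    )


for name, version in list_components(SAMPLE):
    print(f"{name}\t{version}")
```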

Your SBOM, The Build Artifact

When you have explored a bit, you may want to look into how you incorporate these tools in your project and make the SBOM a build artifact.

The bomber documentation has this example suggestion for inclusion in a CI/CD pipeline:

# Make sure you include the - character at the end of the command.
# This triggers bomber to read from STDIN
syft packages . -o cyclonedx-json | bomber scan --provider ossindex --output json -

For your own projects you will tweak to taste, of course.
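If the SBOM is to be published as a build artifact, it can be worth adding a cheap sanity check before the pipeline stores it. The sketch below is one possible shape for such a gate, written in Python and assuming CycloneDX JSON. Requiring a non-empty components list is a policy choice made for this example, not something the format itself demands, and the function name is hypothetical.

```python
import json

# Keys this example insists on; "components" is our own policy choice.
REQUIRED_KEYS = ("bomFormat", "specVersion", "components")


def check_sbom(text):
    """Return a list of problems found; an empty list means the SBOM looks usable."""
    try:
        sbom = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(sbom, dict):
        return ["top level is not a JSON object"]
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in sbom]
    if sbom.get("bomFormat") not in (None, "CycloneDX"):
        problems.append(f"unexpected bomFormat: {sbom['bomFormat']}")
    if "components" in sbom and not sbom["components"]:
        problems.append("components list is empty")
    return problems


good = '{"bomFormat": "CycloneDX", "specVersion": "1.5", "components": [{"name": "demo"}]}'
print(check_sbom(good) or "SBOM looks usable")
print(check_sbom("{}"))
```

A build step could call this on the generated file and fail the job whenever the returned list is non-empty.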

Your Tools May Already Have (Some of) This

More SBOM-savvy co-stakeholders in your project may even be capable of processing your JSON or XML formatted SBOMs themselves, using tools of their choice.

Your project and customer may already have chosen a different toolset, or you may find that some other SBOM generating and presentation tool set is a better match for your requirements.

It is in fact conceivable that you have SBOM-capable tools within reach in your environment already. The fairly popular images-and-sundry repository system Harbor supports automatic SBOM generation on image push by hooking in trivy for image scanning duty, should you choose to enable that feature for your Harbor hosted projects.

Track Your Dependencies On The Fly

In a real world scenario, I could imagine that non-developers would appreciate it if you supplement that bomber line with one using the --output=html option. The HTML output provides a report that lists licenses involved before listing known vulnerabilities by severity and assigned CVE.

While I was writing this article, a colleague who had been reviewing it told me of an episode that shows that even extremely basic use of the SBOM tools can be useful. A customer had called, saying they needed a complete list of tools and dependencies involved in a project, and right away. As a first step, my colleague cd'ed in to the main directory of one of the subprojects for that customer, and issued the command

$ cdxgen .

and was rewarded with a bom.json file that listed somewhere in excess of three hundred dependencies for that relatively minor subproject alone. The customer was suitably impressed and granted my colleague a more realistic and less immediate time frame for submitting the full dependency tree.

There Is More

If you want to explore further, please dive into the resource references at the end here.

For the more Bill of Materials savvy developers who want to explore even more, it may be of interest that the OWASP and SPDX teams are working on more specialized BOM variants, including

  • OBOM (Operating system Bill of Materials)
  • SaaSBOM (Software as a Service Bill of Materials)
  • CBOM (Cryptography Bill of Materials)
  • AISBOM (Artificial Intelligence Bill of Materials)
and several more. Again, see the referenced resources at the end here and follow the breadcrumbs.

Now It's Your Turn: Get The Tools

Now it's your turn to go exploring. The first item is to get the tools installed.

If you haven't already, go to the home pages of each:

And follow the instructions on how to install for your environment.

The exact steps to install depend, of course, on your platform.

If you are running a recent Linux distribution, you more likely than not have them within reach via your package system. Failing that, or if you happen to be on macOS or a supported Linux, the command

$ brew install $toolname

where the value of toolname expands to the name of the tool you want will get you there.
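Before moving on, it may be useful to confirm that the tools actually ended up on your PATH. A small Python sketch using only the standard library could look like this; the tool names are the ones used in this article, and the function name is illustrative.

```python
import shutil

# Tool names as used in this article; adjust to whatever you settled on.
TOOLS = ("syft", "bomber", "cdxgen")


def missing_tools(names=TOOLS):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


absent = missing_tools()
print("all tools found" if not absent else "not on PATH: " + ", ".join(absent))
```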

There are even instructions for Microsoft systems at some of the tools' home pages.

If none of these methods work, do a git clone of the tool source code (you were going to do that anyway, right?) and follow the build instructions.

If necessary, tweak to get the thing to work. If you find you need to do something non-trivial to make the tool build and run on your system, consider submitting a pull request to the project.

Tools in Hand, Dig Into a New Project

Now that you have the tools installed, it is time to put them and your own skills to work on some actual source code.

If you have the source for a project you are already familiar with available, or a project you are interested in exploring, choose that. Otherwise, find something you're interested in on GitHub or somewhere else public.

Once you have a local copy of the codebase, go to that directory.

Once there, start with

$ cdxgen .

then watch the output (it may be useful to run commands like these in a script(1) session so you can look up what happened in the script file later), and act upon it.

Be prepared for the possibility that there are issues in the code that need fixing, or some dependency that you were not aware of.

Then look up the cdxgen, syft and bomber documentation to find out the following about your chosen code base:

  • What is the number of dependencies for this code base? How many direct dependencies? How many indirect ones (dependencies of dependencies)?
  • Does the code base itself have any known problems, reported as CVEs? How many for the dependencies?
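For the first question, the dependency graph in the SBOM does most of the work. Here is a minimal Python sketch, assuming a CycloneDX JSON document in which the generator has filled in both metadata.component (the root of the project) and the dependencies array with its dependsOn lists. Not every tool or project type yields a complete graph, so treat the numbers as a starting point. The sample refs and the function name are hypothetical.

```python
# Hypothetical graph: app depends on web and log; web depends on http and log.
SAMPLE = {
    "metadata": {"component": {"bom-ref": "app", "name": "app"}},
    "dependencies": [
        {"ref": "app", "dependsOn": ["web", "log"]},
        {"ref": "web", "dependsOn": ["http", "log"]},
        {"ref": "http", "dependsOn": []},
        {"ref": "log"},
    ],
}


def split_dependencies(sbom):
    """Return (direct, indirect) sets of bom-refs reachable from the root component."""
    root = sbom.get("metadata", {}).get("component", {}).get("bom-ref")
    graph = {
        dep["ref"]: dep.get("dependsOn", [])
        for dep in sbom.get("dependencies", [])
    }
    direct = set(graph.get(root, []))
    seen, stack = set(), list(direct)
    while stack:  # depth-first walk; the seen set guards against cycles
        ref = stack.pop()
        if ref in seen:
            continue
        seen.add(ref)
        stack.extend(graph.get(ref, []))
    return direct, seen - direct - {root}


direct, indirect = split_dependencies(SAMPLE)
print(f"{len(direct)} direct, {len(indirect)} indirect")
```

A component that is both a direct dependency and a dependency of a dependency, like log in the sample, is counted as direct only.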

If you are feeling a bit more ambitious, you could try checking out the tools themselves, and run the tools on those codebases:

Fetching cdxgen source code is as easy as

$ git clone git@github.com:CycloneDX/cdxgen.git

There may be some challenges ahead. If the result of your first session looks like this (an actual script session of cdxgen run on its own source), please do not let that discourage you. Those are problems to be fixed, and you are developer enough to do that, right?

For syft, the command is

$ git clone git@github.com:anchore/syft.git

and for bomber,

$ git clone git@github.com:devops-kung-fu/bomber.git

You may find other tools, via awesome-sbom or elsewhere, that fit your tastes or your projects better than those.

This is when the fun part starts.

Resources for Further Reading

Linux Foundation Training:
Automating Supply Chain Security: SBOMs and Signatures (LFEL1007) a short but information- and reference-filled introduction (free, requires registration, gives you a badge at the end)
Understanding the EU Cyber Resilience Act (CRA) (LFEL1001) Focused on the EU CRA, gives an overview with lots of useful references, nominally a 1 hour course worth taking

The Software Bill of Materials home page at NTIA is the mother ship of SBOM documentation

Browse OWASP CycloneDX for all things about the CycloneDX specification and related tools, also their CycloneDX tool center

Browse the System Package Data Exchange specification (SPDX) for all things SPDX (supported by the Linux Foundation), including copious linked reference material

awesome-sbom is a curated list of SBOM tools and resources

EU residents will want to poke around the Cyber Resilience Act site for reference

Brewing Transparency: How OWASP's TEA Is Revolutionizing Software Supply Chains is a summary of recent work on OWASP Transparency Exchange API (TEA)

SBOM buyer’s guide: 8 top software bill of materials tools to consider is a readable overview of (some) SBOM tools

Olle Johansson's FOSDEM presentations are among several good SBOM talks at that conference (search the site for more)

Peter N. M. Hansteen: Open Source in Enterprise Environments - Where Are We Now and What Is Our Way Forward? (2022, also here) has some insights on how open source software plays a crucial role in enterprise environments and elsewhere

Peter N. M. Hansteen: No Project Is an Island: Why You Need SBOMs and Dependency Management (also here)

Peter N. M. Hansteen: EU CRA: It's Later Than You Think, Time to Engineer Up! (this article) (also here)

Peter N. M. Hansteen: EU CRA: It's Later Than You Think, Time to Engineer Up! (slides)

Sunday, August 10, 2025

Eighteen Years of Greytrapping - Is the Weirdness Finally Paying Off?

© 2025 Peter N. M. Hansteen

With the imaginary friends, also known as spamtraps, now more numerous than the inhabitants of their virtual landlord's home country, a greytrapping retrospective is in order.

Friends, it finally happened. On August 7th, 2025, the number of spamtraps intended to woo the unwary spammer rolled past the number of inhabitants in my home country of Norway, as tallied by the official statistics compiled by Statistisk Sentralbyrå, also known as Statistics Norway.

After the morning run that day, the number of spamtraps (imaginary friends) stood at 5620384, inching past the country's total population of 5601049. And yes, the first number is likely to have increased when you read this. Under normal circumstances, the second will likely move a bit in the near future too. To mark the occasion, I present to you the retrospective that some correspondents have been asking for in response to some recent mail related articles of mine.

The Experiment Started in 2007

Greytrapping at nxdomain.no, also known as bsdly.net and a few other domain names, has been a long running experiment. I had been running a mail service for my own and my colleagues' benefit for some years already when I converted that setup stepwise from a Debian Linux setup to one involving OpenBSD hosts as the outer line of defense and a mix of FreeBSD, OpenBSD and other hosts in an environment not unlike some of the rather basic configurations described early on in the PF tutorial and later The Book of PF.

Soon after converting the outer defense at that site to an OpenBSD one running a basic PF ruleset, I introduced the then blocklist-importing and greylisting only spamd, and experienced (as described elsewhere) that the fan noise coming from the mail server, obviously burdened by performing content filtering, just stopped immediately, only occasionally to rise to a quiet murmur for the rest of that server's service life.


Note: This piece is also available without trackers but classic formatting only here.

I did not retain records of when I did that conversion, but my original PF presentation slides from January 2005 describe a spamd setup with greylisting as well as imported lists from spews and spamhaus, which is a strong indication that I had had that running for a while at that point.

Greytrapping was only introduced a little later, but when the feature became available I was ready and eager to put it into production as soon as at all possible. I went on to initiate the greytrapping experiment some time in 2007 and announced it to the world in the article Hey, spammer! Here's a list for you! (also here) on July 9, 2007.

Unfortunately, or some would say fortunately, we have not been able to preserve all logs and records, but enough survives that we can sense the general thread and trends until we can get into the details of what we do have available from the last handful of years.

In Retrospect, What Changed Over the Years?

Looking back to the mid-noughties, the most significant change I see is that back then, people did this sort of thing.

Even for small organizations like the company I was attached to then, it was entirely normal to set up their own, in-house mail service as soon as they had some sort of Internet connectivity available.

In the years since then, the Internet in general, and SMTP email in particular, has been centralized to a degree we would not have considered even imaginable back in the mid-noughties.

We call it The Cloud, but as we all know it's really about running your stuff on other people's computers, and in the email case, the centralization is even more extreme.

In some of the field notes and articles linked at the end of this piece you will find mention of the major players in the hosted or cloud email field and the fallout from their policies. Those policies and the companies' actions hint strongly that they really think that unless you are them, you have no business running a mail service.

So if it is not clear already, this is a piece that is written for people who either run their own mail service or are considering setting up one, as well as people in their immediate surroundings.

If your perspective on email is "how can I do $THING in Outlook?" or similar, this is really not for you, but you are of course welcome to read on for entertainment and/or enlightenment value, if such is to be found.

If you are considering setting up your own mail service, my main recommendation to you, after you have skimmed this piece and a selection of the linked resources, is to get Michael W. Lucas' 2024 book Run Your Own Mail Server, read it from cover to cover, and do what the man says. That really is the best book on the subject currently available, and it is recent enough to not yet be outdated.

What I saw as the main attraction of the greylisting and greytrapping combo back in the day, and still do, was and is that a set of actually quite simple network-level tricks and a tending-towards-pedantic interpretation of the SMTP protocol specification could have such a dramatic effect on the amount of work involved in running a sane mail service.

With a greytrapping spamd and a mail service that would utilize the content filtering setup du jour, my colleagues in the various organizations where we had these setups in place never saw the need to even consider listening to sales pitches for other offerings.

The early field notes and articles very much reflect that situation. We were quite enthusiastic about what we had running. What we had was cheap and reliable, and when there was a need to debug something, we would either point to the other party's configuration fumble or do such things as slowly come to the realization that not all senders play well with greylisting (also here).

I Hear You Say It's Good, But You're Weird Anyway

Over the years, in my experience of advocating OpenBSD or FreeBSD as systems to use in general and implementing a greylisting and trapping spamd specifically, the attitude I would need to try turning around would more often than not be along the lines of I hear you say it's all good, but you're weird anyway.

In retrospect some of that may have come from me generally using various versions of the somewhat lengthy Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools (also here), sometimes supplemented with In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe (also here) more or less as promotional material. Both texts have to my mind stood up well over the years and are potentially useful for the right audience, but may not have been quite appropriate in a sales context.

There would be some update here and there, and questions I got during tutorial sessions and via various online channels indicate that people were setting up configurations similar to what I have described there, and the various exported blocklists (see e.g. Badness, Enumerated by Robots (also here)) are quite popular downloads both at the primary and the mirror site.

Over the years there would be some odd episodes, sometimes involving the big players, with a piece such as Does Your Email Provider Know What A "Joejob" Is? (also here) a prime example of behavior I personally do not appreciate experiencing from anyone. On the other hand, in A Life Lesson in Mishandling SMTP Sender Verification (also here) we see an example of a different big player actually contributing well to resolving a puzzling situation.

In addition to the big players, we have of course also at times run into less pleasant encounters with not-exactly-captains-of-industry. An early example was that in 2008, the notion that a challenge-response setup could be an effective antispam mechanism was apparently cultivated by some. In the field note I challenge your response, backscatterer (tracked only, sorry) we see how that went.

If you skim the field notes and articles linked at the end of this piece, you will find that there is, in fact, no end of weirdness in the email business. But one case involving what we must assume is pretty much a bit player had me write up Twenty-plus years on, SMTP callbacks are still pointless and need to die (also here). The TL;DR of that one is that what could have seemed like a bright idea way back when turned out not to be, but in some corners of the internet there are still true believers who can simply not be persuaded to change course even a little.

After a while, I found that though odd episodes did occur, it became harder to make the writeups interesting and fun to read. A case in point is the year 2019, where at the very end of the year I finally forced myself to write my only article of that year, The Year 2019 in Review: This Was, Once Again, Weirder Than the Last One (also here). That year had had its share of oddities, including a totally bizarre amount of backscatter from what must have been one or more phishing campaigns aimed at Chinese users. I found that episode hilarious myself, and while it prompted me to automate the spamtrap harvesting a bit, I tried and failed over and over to write what I thought would be a readable and enjoyable article about it.

Actually Running the Thing, and Finding Imaginary Friends

The day to day operation of the greytrapping is quite unremarkable, really. The script that dumps the trapped IP addresses at ten past every hour also presents me with a list of candidate spamtraps -- addresses in our domains currently in the greylist that do not match any existing valid address or spamtrap -- and I add those when I have the time at quasi-random points during the day.
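A candidate list of the kind described here could be produced along these lines. This is a minimal sketch and not the actual script: it assumes spamdb(8) output on standard input, with greylist entries laid out as GREY|source IP|helo|from|to|... and spamtraps as SPAMTRAP|address. The function name list_candidates and the file of valid addresses (one per line) are hypothetical.

```shell
#!/bin/sh
# Sketch only: print greylisted recipient addresses in our domain that are
# neither valid local addresses nor already spamtraps.
# $1 = domain (for example example.com), $2 = file of valid addresses.
# Reads spamdb-style output on stdin (assumed field layout, see lead-in).
list_candidates() {
    domain=$1
    valid_file=$2
    # Tag the valid addresses so awk can tell them apart from spamdb lines.
    { sed 's/^/VALID|/' "$valid_file"; cat; } |
    awk -F'|' -v dom="@$domain" '
        $1 == "VALID" || $1 == "SPAMTRAP" {
            addr = tolower($2)
            gsub(/[<>]/, "", addr)
            known[addr] = 1
            next
        }
        $1 == "GREY" {
            addr = tolower($5)          # envelope recipient
            gsub(/[<>]/, "", addr)
            n = length(addr) - length(dom)
            # keep only recipients that end in @domain
            if (n > 0 && substr(addr, n + 1) == dom) grey[addr] = 1
        }
        END { for (a in grey) if (!(a in known)) print a }
    ' | sort
}
```

With a live database this would be fed as spamdb | list_candidates example.com valid-addresses.txt, and the output reviewed by hand before anything is added as a spamtrap.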

The dump of trapped IP addresses is totally automated, and expiry is 24 hours. In 2013 I wrote a piece called Maintaining A Publicly Available Blacklist - Mechanisms And Principles (also here) that lays out the process in hopefully understandable terms. There is of course also the short version available on the website.
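The hourly dump itself needs very little code. The following is a sketch under the assumption that spamdb(8) lists trapped hosts as TRAPPED|ip|expire; the function name and the schedule comment are illustrative only, and the 24 hour expiry is left to spamd rather than handled here.

```shell
#!/bin/sh
# Sketch only: print the currently trapped IP addresses, one per line.
# Reads spamdb-style output on stdin; assumes lines of the form
#   TRAPPED|ip|expire
export_trapped() {
    awk -F'|' '$1 == "TRAPPED" { print $2 }' | sort -u
}

# A crontab schedule field of "10 * * * *" would run a wrapper script
# (hypothetical) at ten past every hour, for example:
#   spamdb | export_trapped > traplist.txt
```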

Over time we went from simply collecting from the greylist to also fishing out local parts from the logs of failed logon attempts to services such as ssh and (the obsolete, horrible) pop3.
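Fishing local parts out of failed ssh logins can be sketched as follows. The sketch assumes sshd log lines that contain the phrase "Invalid user NAME from ADDRESS"; the function name ssh_candidates and the filter on what counts as a plausible local part are my own assumptions, not the procedure actually in use.

```shell
#!/bin/sh
# Sketch only: turn user names from failed ssh logins into spamtrap
# candidates in the given domain. Reads sshd log lines on stdin.
# $1 = domain to append (for example example.com).
ssh_candidates() {
    domain=$1
    awk -v dom="$domain" '
        {
            for (i = 1; i + 3 <= NF; i++)
                if ($i == "Invalid" && $(i + 1) == "user" && $(i + 3) == "from") {
                    name = tolower($(i + 2))
                    # keep only names that look like plausible local parts
                    if (name ~ /^[a-z0-9][a-z0-9._-]*$/) print name "@" dom
                }
        }
    ' | sort -u
}
```

The resulting addresses would still need the same checks against valid local addresses as any other candidate before they become spamtraps.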

A little while later it occurred to me that it would perhaps be useful to make a record of when each spamtrap entry was added. History starts 2017-05-20. Any spamtraps that cannot be found in this data set are assumed to have been added before that date, and reconstructing earlier history of the data would take more time and effort than I have any motivation to expend on the task.
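To illustrate how a date added record can be turned into monthly counts, here is a small sketch. The log layout (one line per spamtrap: date, source tag, address, separated by spaces) and both function names are hypothetical; the actual data files may well look different.

```shell
#!/bin/sh
# Sketch only. Assumed log line layout (hypothetical):
#   YYYY-MM-DD SOURCE address
# where SOURCE is a tag such as SMTP, SSH or POP3.

# Append one record: $1 = source tag, $2 = address, $3 = log file.
record_trap() {
    printf '%s %s %s\n' "$(date +%Y-%m-%d)" "$1" "$2" >> "$3"
}

# Read records on stdin and print one line per month:
#   YYYY-MM total smtp ssh pop3 other
monthly_summary() {
    awk '
        {
            month = substr($1, 1, 7)
            src = toupper($2)
            if (src != "SMTP" && src != "SSH" && src != "POP3") src = "OTHER"
            total[month]++
            count[month, src]++
        }
        END {
            for (m in total)
                printf "%s %d %d %d %d %d\n", m, total[m],
                    count[m, "SMTP"], count[m, "SSH"],
                    count[m, "POP3"], count[m, "OTHER"]
        }
    ' | sort
}
```

The column order of the summary follows the tables in this article: total, SMTP, SSH, POP3, other.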

The first partial year's data are, summarized:

New traps per month, 2017
Month Total SMTP SSH POP3 Other
May 159 49 110 0 0
June 275 48 213 14 0
July 811 144 667 0 0
August 486 447 38 1 0
September - - - - -
October 886 513 367 6 0
November 825 57 768 0 0
December 299 91 208 0 0

From that year, the first article A New Year, a New Round of pop3 Gropers from China (also here) (January 9, 2017) was written before the date added data started, while the episode described in Twenty-plus years on, SMTP callbacks are still pointless and need to die (also here) (August 27, 2017) more likely than not produced more spamtraps around the time the article was written.

For 2018, we have the first in the series of a full year's data on traps added:

New traps per month, 2018
Month Total SMTP SSH POP3 Other
January 304 172 132 0 0
February 228 72 148 2 0
March 160 73 87 0 0
April 102 84 18 0 0
May 12206 811 11370(1) 22 3(2)
June 146 26 59 61 0
July 358 248 26 84 0
August 359 125 69 165 0
September - - - - -
October 671 241 413 17 0
November 311 297 12 0 2(3)
December 1038 116 922 0 0


1) From the Hail Mary Cloud data set
2) IMAP
3) JOKE (see the data)

From June 2018 onwards, we have hourly data on the number of hosts trapped in our spamd-greytrap, in a form that is relatively easy to graph:

Line graph of hosts in spamd-greytrap 2018 (15 Jun - 31 Dec)

The data that went into producing the graph is available as 2018-traplistcounts.txt.

The articles from 2018 include A Life Lesson in Mishandling SMTP Sender Verification (also here) (February 17, 2018) with that life lesson, while the next two show that I felt a need to explain exactly what that blocklist producing thing was about, first with Badness, Enumerated by Robots (also here) (August 13, 2018) and the followup Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting (also here) (November 4, 2018) which really only goes to show that I was starting to contemplate converting my setup to use OpenBSD's own OpenSMTPD -- part of the base system -- rather than trusty old exim.

The 2019 spamtraps added data shows, again, just how weird that year was -- see The Year 2019 in Review: This Was, Once Again, Weirder Than the Last One (also here) (December 28, 2019):

New traps per month, 2019
Month Total SMTP SSH POP3 Other
January 1829 192 1636 0 1(4)
February 19644 18782 860 0 2(5)
March 58005 57186 819 0 0
April 53856 52563 1290 3 0
May 2315 350 1964 1 0
June 3164 312 2852 0 0
July 1058 434 618 6 0
August 1229 331 898 0 0
September - - - - -
October 11016 630 10385 1 0
November 11119 222 10897 0 0
December 19304 208 19096 0 0


4) ARTICLE (see the data)
5) JOKE (see the data)

The year 2019 is the oldest preserved data set of number of hosts in our spamd-greytrap that covers an entire year, which in turn gives us this diagram of the year:

Line graph of hosts in spamd-greytrap 2019

The data that went into producing the graph is available as 2019-traplistcounts.txt.

The lockdown year 2020 again did not see much article activity, but after seeing the N!th wankstortion campaign aimed at a large subset of our imaginary friends, I wrote a rant-ish article about it: The 'sextortion' Scams: The Numbers Show That What We Have Is A Failure Of Education (also here) (February 28, 2020)

New traps per month, 2020
Month Total SMTP SSH POP3 Other
January 5085 171 4914 0 0
February 8941 150 8786 5 0
March 1363 258 1103 2 0
April 596 139 456 1 0
May 1406 108 1298 0 0
June 649 133 516 0 0
July 2405 98 2306 1 0
August 134 123 11 0 0
September - - - - -
October 591 185 403 3 0
November 2843 1318 1525 0 0
December 1571 169 1402 0 0

Again for 2020 we have complete data on the number of hosts in our spamd-greytrap, which in turn gives us this diagram of the year:

Line graph of hosts in spamd-greytrap 2020

The data that went into producing the graph is available as 2020-traplistcounts.txt.

In 2021, still mostly a lockdown year, RFC7505 Means Yes, Your Domain Can Refuse to Handle Mail. Please Leave Us a TXT If You Do. (also here) (February 22, 2021) indicates a small but potentially significant change in mail server configuration. It has been a while since I last saw anything heading for that .se domain.

New traps per month, 2021
Month Total SMTP SSH POP3 Other
January 179 129 49 1 0
February 172 97 75 0 0
March 112 95 17 0 0
April 150 88 62 0 0
May 1360 90 1270 0 0
June 307 41 266 0 0
July 68 58 8 2 0
August 144 61 82 1 0
September - - - - -
October 1035 160 875 0 0
November 166 94 72 0 0
December 304 192 112 0 0

The 2021 data of hosts in our spamd-greytrap produces this graph for the year:

Line graph of hosts in spamd-greytrap 2021

The data that went into producing the graph is available as 2021-traplistcounts.txt.

By 2022, we were back out of lockdowns and I produced several relevant articles -- Spammers in the Public Cloud, Protected by SPF; Intensified Password Groping Still Ongoing; Spamware Hawked to Spamtraps (also here) (April 3, 2022) showed that our imaginary friends or at least a significant subset are indeed in common spamto: lists out there.

The Things Spammers Believe - A Tale of 300,000 Imaginary Friends (also here) (September 7, 2022) -- in which I had somehow not gotten around to celebrating the day when the number of spamtraps went past the number of inhabitants of my home town of Bergen, Norway and decided that a nice round number would serve just as well.

Harvesting the Noise While it's Fresh, Revisited (also here) (December 9, 2022) -- I realized that spammers with freshly generated spamto addresses may try more variants after the first one that gets them trapped, so I turned to some further digging into logs for new data. The numbers swelled slightly as a result.

Can Your Spam-eater Manage to Catch Seventy-one Percent Like This Other Service? (also here) (December 23, 2022) -- yet another piece to explain what greylisting and greytrapping is good for and why it is good for you.

The Despicable, No Good, Blackmail Campaign Targeting ... Imaginary Friends? (also here) (December 25, 2022) -- the first "they're sending wankstortion mail to my imaginary friends" article had not gotten much attention so I tried again.

New traps per month, 2022
Month Total SMTP SSH POP3 Other
January 143 129 14 0 0
February 333 79 253 0 1(6)
March 915 179 736 0 0
April 20451 91 20360 0 0
May 254 139 114 1 0
June 3898 54 3844 0 0
July 700 86 611 3 0
August 979 514 461 4 0
September - - - - -
October 2111 597 1514 0 0
November 470 73 396 1 0
December 2030 1714 303 13 0


6) fatfinger (see the data)

The 2022 data of hosts in our spamd-greytrap produces this graph for the year:

Line graph of hosts in spamd-greytrap 2022

The data that went into producing the graph is available as 2022-traplistcounts.txt.

In 2023, we kept adding spamtraps as they came in and generating data, but I produced no mail-themed articles at all.

New traps per month, 2023
Month Total SMTP SSH POP3 Other
January 642 175 465 2 0
February 429 301 128 0 0
March 8838 5296 3542 0 0
April 1557 1243 314 0 0
May 104 39 65 0 0
June 2273 2234 38 1 0
July 182 76 106 0 0
August 2436 2285 151 0 0
September - - - - -
October 4008 3752 256 0 0
November 1912 96 1813 0 3(7)
December 1165 52 1113 0 0


7) HTTP (see the data)

The 2023 data of hosts in our spamd-greytrap produces this graph for the year:

Line graph of hosts in spamd-greytrap 2023

The data that went into producing the graph is available as 2023-traplistcounts.txt.

The year 2024 saw little innovation and no new episodes I found a reason to write about. However, that year saw the launch of Michael Lucas' much anticipated Run Your Own Mail Server, and events somewhat related to that had me write A Simpler Life: Trapping Spambots Based on Target Domain Only (also here) (January 24, 2024) and its followup Three Minimalist spamd Configurations for Your Spam Fighting Needs (With Bonus Points at the End) (also here) (January 25, 2024).

If you have been reading carefully up to this point, you may have noticed what I only noticed myself when I started massaging my spamtraps added data into tables: That during the logged years 2017 through 2023, no new spamtraps were added during the month of September.

As time went by I had noticed that there were periods of up to several weeks when no new spamtrap candidates appeared, but it did not occur to me that every year up to that point, that period had actually been the entire month of September. It is possible or even likely that the change to a more aggressive method of searching for candidates in the logs is what filled up September from 2024 on.

During late November of 2024, I decided that the time had come to ditch the quasi-empiricism of passively collecting the actual to: addresses and start making an effort to fill spammers' spamto: lists with as much junk as possible. So I started extracting local parts from the from: and hostname or host ID fields in my verbose spamd logs, splicing together a larger than ever number of fake @bsdly.net addresses for the spamtraps list. I also started digging back into archived spamd logs and extracting data from there. For obvious reasons, this means that from that point on, the overwhelming majority of the items tagged SMTP in the date added logs are of the synthetic kind. More of that later.
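The splicing step can be illustrated with a short sketch. Note the substitutions: instead of verbose spamd logs, it reads spamdb(8) style GREY lines (assumed layout GREY|source IP|helo|from|to|...), and it takes the first label of the HELO name as the "hostname" part. Both choices, and the function name synth_candidates, are assumptions made for the example, not a description of the actual extraction.

```shell
#!/bin/sh
# Sketch only: build synthetic spamtrap candidates in the given domain
# from the sender local part and the first label of the HELO host name.
# $1 = domain to append. Reads spamdb-style GREY lines on stdin.
synth_candidates() {
    domain=$1
    awk -F'|' -v dom="$domain" '
        function emit(s) {
            s = tolower(s)
            gsub(/[^a-z0-9._-]/, "", s)     # drop unusual characters
            sub(/^[._-]+/, "", s)           # trim leading punctuation
            sub(/[._-]+$/, "", s)           # trim trailing punctuation
            if (s != "") print s "@" dom
        }
        $1 == "GREY" {
            from = $4
            gsub(/[<>]/, "", from)
            sub(/@.*/, "", from)            # local part of envelope sender
            emit(from)
            helo = $3
            sub(/\..*/, "", helo)           # first label of the HELO name
            emit(helo)
        }
    ' | sort -u
}
```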

New traps per month, 2024
Month Total SMTP SSH POP3 Other
January 3122 92 3028 2 0
February 6442 202 6238 2 0
March 2150 198 1951 1 0
April 10028 5010 5018 0 0
May 633 413 219 1 0
June 680 72 608 0 0
July 177 151 25 1 0
August 561 433 125 3 0
September 3770 3675 95 0 0
October 10517 8631 1884 2 0
November 22899 18083 4815 1 0
December 167037 166605 428 4 0

The 2024 data of hosts in our spamd-greytrap produces this graph for the year:

Line graph of hosts in spamd-greytrap 2024

The data that went into producing the graph is available as 2024-traplistcounts.txt.

We continued adding synthetic spamtraps from the from and host fields in both new and archived spamd logs into the new year 2025. This and a few related items are described in A Suitably Bizarre Start of the Year 2025 (also here) (January 1, 2025). In June I found I needed to clarify some things about the exported IP address lists, specifically that one should be considered a historical artifact only, and wrote Should I Stop Caring and Let IP Address Reputation Sort Them Out? (also here) (June 8, 2025).

Seeing that the number of spamtraps now had run into the millions, I decided to speed up the process of filling spamto lists with garbage a bit more, by generating a few thousand extra items from short snippets of /dev/random output, base64 encoded and stripped of certain characters that would possibly lead to spamdb not accepting the result as valid. An example one-liner would be (vary to taste):

for ((foo=4096;foo>=0;foo--)); do barone=`dd if=/dev/random bs=4 count=1 | base64 | tr -d '+=/\r'`; bartwo=`dd if=/dev/random bs=6 count=1 | base64 | tr -d '+=/\r'`; echo $barone.$bartwo@bsdly.net ; done | tee -a rawbar

The contents of the file rawbar would then be subject to the same checks (eliminating actually valid local addresses as well as the more commonly used of the RFC2142 mandated set) as any other before being fed to spamdb to swell the imaginary friends population. I was sometimes surprised how many of the items output looked like they could conceivably have been part of something at least vaguely resembling human speech. Anyway, on to the data:
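The checks mentioned above could look roughly like this. It is a sketch under assumptions: the file of valid addresses (one full address per line) and the function name filter_candidates are hypothetical, and the list of role names is the full set of mailbox names from RFC 2142 rather than the subset actually used here.

```shell
#!/bin/sh
# Sketch only: drop valid local addresses and RFC 2142 role mailboxes
# from a file of raw candidates, then print the unique remainder.
# $1 = raw candidate file (for example rawbar), $2 = valid address file.
filter_candidates() {
    raw=$1
    valid=$2
    # First grep: remove exact, case-insensitive matches of valid addresses.
    # Second grep: remove the RFC 2142 role names in any domain.
    grep -v -i -x -F -f "$valid" "$raw" |
        grep -v -i -E '^(info|marketing|sales|support|abuse|noc|security|postmaster|hostmaster|usenet|news|webmaster|www|uucp|ftp)@' |
        sort -u
}

# Feeding the survivors to spamdb as spamtraps would then be:
#   filter_candidates rawbar valid-addresses.txt |
#       while read -r addr; do spamdb -T -a "$addr"; done
```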

New traps per month, 2025
Month Total SMTP SSH POP3 Other
January 1400109 1399950 139 23 0
February 1261530 1260708 823 0 0
March 1142404 1141980 423 2 0
April 333442 333332 110 0 0
May 220072 218045 2027 0 0
June 180348 180271 75 2 0
July 242346 240771 1573 2 0
August 1020877 1020804 72 1 0

The 2025 data up to the publication date of hosts in our spamd-greytrap produces this graph:

Line graph of hosts in spamd-greytrap 2025 up to publication date

The data that went into producing the graph is available as 2025-traplistcounts.txt.

Where to Next, What Is Missing or Needed?

What happens next is not necessarily much different from what we have seen during all of those long years. Looking at the graphed data of number of trapped hosts, it is quite clear that the number of trapped hosts or IP addresses is on a declining trend, but with bursts or spikes when one or more campaigns are active and aimed at our domains. That general trend is possibly a consequence of the trend towards centralization of Internet services in general.

While I have not done any thorough analysis of the data, it appears that there is not a similar decline in delivery attempts, and some quasi-random sampling seems to indicate that it is fairly common that traffic from a single trapped IP address presents with a number of different hostnames or host IDs. This could be an indication that the senders sit in a cloud somewhere, or possibly are old-style compromised personal systems tucked away behind NAT.
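A quick way to do that kind of sampling is to count distinct host names per source address. The sketch below assumes input in the spamdb(8) GREY layout (GREY|source IP|helo|from|to|...) as a stand-in for whatever log extract is actually sampled; the function name helo_per_ip is hypothetical.

```shell
#!/bin/sh
# Sketch only: for each source IP, count how many distinct HELO names
# it has presented. Reads spamdb-style GREY lines on stdin and prints
# "count ip", highest count first.
helo_per_ip() {
    awk -F'|' '
        # count each (ip, helo) pair only the first time it is seen
        $1 == "GREY" && !seen[$2, $3]++ { n[$2]++ }
        END { for (ip in n) print n[ip], ip }
    ' | sort -k1,1nr -k2,2
}
```

Addresses near the top of the output are the ones presenting under many different names, which fits either a shared cloud egress or several compromised machines behind one NAT address.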

That said, in my experience greylisting and greytrapping are useful techniques that work well within their limitations.

The limitation that irks me the most is that spamd is IPv4-only. While the migration to IPv6 has been slow, it is happening, and the portion of mail that is delivered over the modern protocol is increasing year by year. Around 2015 there was some work in the OpenBSD project on possibly extending spamd and supporting tools to support IPv6, but if I remember correctly the project was abandoned, at least partly because achieving both parts of "rough consensus and working code" was not possible. Reaching consensus on how greylisting should work in the IPv6 world proved hard, to the point of turning out to be impossible.

I would personally hope that we can make progress towards IPv6 support at some point in the future, but until that happens, we can rest assured that a large part of the spammers have stayed on IPv4, and our tools work well to stop them in their tracks on the legacy protocol.

When I started working on this article, I had only a vague idea of how much I had actually written on the subject. I was a bit surprised at the number of pieces that had accumulated. I have included the list of links in the next, final section.

If you found this article useful, irritating, provoking, thought provoking, or simply would like to comment or contact me personally on the subject, please do.


The most recent exports of all lists generated here can be found in this directory. Before making any inquiries about removal from any of the lists, check all files in this directory for occurrences of the IP address in question.
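For anyone who has downloaded the exported files, the check can be done with a one-function sketch like the following. It assumes the files are plain text in a local directory; the function name find_ip and the directory name in the usage line are hypothetical.

```shell
#!/bin/sh
# Sketch only: list the files in a directory that mention an IP address.
# $1 = IP address, $2 = directory holding the downloaded list files.
# -F treats the dots literally; -w requires a whole-word match, so
# 192.0.2.1 does not match a line containing only 192.0.2.10.
find_ip() {
    ip=$1
    dir=$2
    grep -l -w -F -- "$ip" "$dir"/* 2>/dev/null
}

# Usage: find_ip 192.0.2.1 ./downloaded-lists
```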

Previous spamd(8) Themed Articles and Field Notes

Hey, spammer! Here's a list for you! (also here) (July 9, 2007)

Spam is a solved problem (also here) (July 13, 2007)

The noise, we ignore it (tracked) (July 22, 2007)

Harvesting the noise while it's still fresh; SPF found potentially useful (also here) (July 25, 2007)

On the business end of a blacklist. Oh the hilarity. (tracked) (August 1, 2007)

We see your every move, spammer (tracked) (August 4, 2007)

A Lady in Distress; or Then Again, Maybe Not (tracked) (August 19, 2007)

Wanna help science? Study your greylists innards! (tracked) (September 8, 2007)

Always a pleasure to be wasting your time, guv (tracked) (September 29, 2007)

Of Course, It Had To Be A Webshield (tracked) (October 28, 2007)

I Must Be Living in a Parallel Universe, Then (also here) (November 25, 2007)

Fake Address Round Trip Time: 13 days (tracked) (May 21, 2008)

I challenge your response, backscatterer (tracked) (May 25, 2008)

Yes, we can! Make a difference, that is (tracked) (June 25, 2008)

Now that we have their addresses, do we name and shame? (tracked) (August 7, 2008)

Is one of your machines secretly a spambot? (tracked) (August 9, 2008)

“Name and Shame”, or socially responsible use of your log data (tracked) (September 22, 2008)

IETF failed to account for greylisting (also here) (October 20, 2008)

Oh yes, you signed up for this. You did. Honest. (also here) (March 21, 2009)

The Problem Isn't Email, It's Microsoft Exchange (also here) (February 27, 2011)

My First IPv6 Spam (also here) (June 8, 2011)

In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe (also here) (May 28, 2012)

Maintaining A Publicly Available Blacklist - Mechanisms And Principles (also here) (April 14, 2013)

Keep smiling, waste spammers' time (also here) (May 4, 2013)

The Hail Mary Cloud And The Lessons Learned (also here) (October 5, 2013)

Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools (also here) (February 2, 2014)

Password Gropers Take the Spamtrap Bait (also here) (August 12, 2014)

Does Your Email Provider Know What A "Joejob" Is? (also here) (April 23, 2016)

The Voicemail Scammers Never Got Past Our OpenBSD Greylisting (also here) (August 29, 2016)

Is SPF Simply Too Hard For Application Developers? (also here) (October 20, 2016)

So somebody is throwing HTML at your sshd. What to do? (also here) (December 22, 2016)

A New Year, a New Round of pop3 Gropers from China (also here) (January 9, 2017)

Twenty-plus years on, SMTP callbacks are still pointless and need to die (also here) (August 27, 2017)

A Life Lesson in Mishandling SMTP Sender Verification (also here) (February 17, 2018)

Badness, Enumerated by Robots (also here) (August 13, 2018)

Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting (also here) (November 4, 2018)

The Year 2019 in Review: This Was, Once Again, Weirder Than the Last One (also here) (December 28, 2019)

The 'sextortion' Scams: The Numbers Show That What We Have Is A Failure Of Education (also here) (February 28, 2020)

RFC7505 Means Yes, Your Domain Can Refuse to Handle Mail. Please Leave Us a TXT If You Do. (also here) (February 22, 2021)

Spammers in the Public Cloud, Protected by SPF; Intensified Password Groping Still Ongoing; Spamware Hawked to Spamtraps (also here) (April 3, 2022)

The Things Spammers Believe - A Tale of 300,000 Imaginary Friends (also here) (September 7, 2022)

Harvesting the Noise While it's Fresh, Revisited (also here) (December 9, 2022)

Can Your Spam-eater Manage to Catch Seventy-one Percent Like This Other Service? (also here) (December 23, 2022)

The Despicable, No Good, Blackmail Campaign Targeting ... Imaginary Friends? (also here) (December 25, 2022)

A Simpler Life: Trapping Spambots Based on Target Domain Only (also here) (January 24, 2024)

Three Minimalist spamd Configurations for Your Spam Fighting Needs (With Bonus Points at the End) (also here) (January 25, 2024)

A Suitably Bizarre Start of the Year 2025 (also here) (January 1, 2025)

Should I Stop Caring and Let IP Address Reputation Sort Them Out? (also here) (June 8, 2025)


Eighteen Years of Greytrapping - Is the Weirdness Finally Paying Off? is © 2025 Peter N. M. Hansteen (published 2025-08-10)
You might also be interested in reading selected pieces via That Grumpy BSD Guy: A Short Reading List (also here).

At EuroBSDcon 2025, there will be a Network Management with the OpenBSD Packet Filter Toolset session, a full day tutorial starting at 2025-09-25 10:30 CET. You can register for the conference and tutorial by following the links from the conference Registration and Prices page.

Separately, pre-orders of The Book of PF, 4th edition are now open. For a little background, see the blog post Yes, The Book of PF, 4th Edition Is Coming Soon (also here). We are hoping to have physical copies of the book available in time for the conference, and hopefully you will be able to find it in good book stores by then.