Thursday, March 13, 2025

No Project Is an Island: Why You Need SBOMs and Dependency Management

© 2025 Peter N. M. Hansteen

The system you develop and maintain does not exist in isolation. Providing SBOMs for our work is our way to show we care.

Software is a relatively recent phenomenon. For a long time, credibly for most of its existence, software was poorly understood by society and industry at large.

There was a time -- and I am old enough to remember that time -- when software was considered a minor but necessary component in IT deliveries. That perception changed over time, and in recent decades it has no longer been in doubt that the software industry is just that, an industry in its own right.

Note: This piece is also available without trackers but classic formatting only here.

What we did not have at all until recently was a set of formal requirements to verifiably show what it is we deliver.

In other fields, the term Bill of Materials, or BOM for short, is a familiar term. The Bill of Materials is a document or set of documents that lists all component parts of a delivery.

This is the kind of document that becomes crucial in contexts where the procuring organization is geared toward accounting for everything and auditing when the supplier least expects it.

One such context could be when your organization has landed a contract to supply a backhoe, an armored personnel carrier or even a ship, and the contract requires you to specify component materials used, down to the nuts and bolts level.

Your delivery would not be considered complete without the Bill of Materials or Manifest.

As an aside, it is likely worth noting that the US Department of Defense's need for structured text markup in processing inventory information such as bills of materials was one of the more important drivers, albeit not the only one, behind the creation of SGML, the direct precursor to HTML and XML.

For physical deliveries to organizations of some stature, a Bill of Materials has been a standard part of the process across industries as an important part of quality assurance and a fundamental part of maintenance processes.

Software, on the other hand, has traditionally not been subject to that kind of scrutiny.

Until software that made critical infrastructure work broke, that is.

During the twenty-tens and -teens, we had several incidents where software bugs were tickled enough to lead to costly and embarrassing episodes, and the powers that be (the kind wearing suits) discovered that software was indeed something they needed to care about.

These episodes spurred several things, one being memes like

(XKCD #2347, please also read the explainer), which led to the common belief that supply chain management and its subtopic dependency management are mainly problems that concern open source software.

This assertion is simply not true, in that no project is an island.

Whether you let others see the code you wrote or not, the software does not exist in isolation.

All software has dependencies, and in the open source world this fact has been treated as a truth out in the open. Every free operating system, and in fact most modern-ish programming languages, come with a package system to install software and to track and handle the web of dependencies, and you are supposed to use the corresponding package manager for the bulk of maintenance tasks.

So when the security relevant incidents hit, the open source world was fairly well stocked with code that did almost all the things that were needed for producing what became known as Software Bill of Materials, or SBOM for short.

So what would a Software Bill of Materials even look like?

Obviously nuts and bolts would not be involved, but items such as the source code files in your project and any libraries or tools needed to build the thing would be nice-to-knows. Once you have the thing built, the other things required in order to have it run -- libraries, suites of utilities, services that are required to be running or other software frameworks of any kind -- are obvious items of interest.

So basically, any item your code would need comes out as a dependency, and you will find that your code has both build time and run time dependencies.

Those terms will be quite familiar to users and the developers of the package manager systems for the various open source operating systems. The very same items you would recognize from a listing of package dependencies in a package management tool will turn up in our Software Bill of Materials too. Depending on the specific tool and options you use, the SBOM could contain additional information that may not be entirely relevant in a package manager context.

In any case, with package systems in place, and even vulnerability scanners available to scan for insecure code at rest or while running, the free and open source software communities were in fact well positioned for the legal requirements when they hit, and the lessons learned from package management came in quite useful in meeting and satisfying the updated requirements.

Several pieces of legislation emerged from the at times panic flavored fallout from the security incidents. Which ones are more relevant to you will become clear as we move on.

Depending on what parts of the world you care more about, the emphasis will either be on US Executive Order 14028 of May 12, 2021, Improving the Nation's Cybersecurity and its summaries found at the Software Bill of Materials Home page hosted by the National Telecommunications and Information Administration, or for the EU and our neighborhood, the EU Cyber Resilience Act (CRA) with slightly less hardcore legalese available at the Cyber Resilience Act start page.

So that's our backdrop for now. The name of the SBOM game is compliance with those legal requirements, and to not only generate the information -- that's the relatively easy part -- but also to present the information in a way that is understandable and actionable to stakeholders who are not themselves software developers.

The information is there in our code, and with development tools and code scanners a developer is well placed to poke around.

The next challenge is to take that information and present it in a way that conforms with the legal specification and is usable for stakeholders who are not developers.

In addition to module or package names and versions, the expected SBOM product will typically include information on any identified security problems such as CVEs and a specification of the licenses that apply to each of the identified dependencies.
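
To make that concrete, here is a minimal sketch of how those three kinds of information can sit together in a CycloneDX-style JSON document, and how a few lines of Python can summarize them. The sample data is invented for illustration, including the package names and the placeholder vulnerability identifier. The field names follow the CycloneDX JSON layout as I understand it, and real SBOMs carry far more detail.

```python
import json

# Hand-written sample in the CycloneDX JSON layout; all values are invented.
SAMPLE_SBOM = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"bom-ref": "pkg:pypi/examplelib@1.2.3", "type": "library",
     "name": "examplelib", "version": "1.2.3",
     "licenses": [{"license": {"id": "BSD-2-Clause"}}]},
    {"bom-ref": "pkg:pypi/othertool@0.9", "type": "library",
     "name": "othertool", "version": "0.9",
     "licenses": [{"license": {"id": "MIT"}}]}
  ],
  "vulnerabilities": [
    {"id": "CVE-0000-0001",
     "ratings": [{"severity": "high"}],
     "affects": [{"ref": "pkg:pypi/examplelib@1.2.3"}]}
  ]
}
""")


def summarize(sbom):
    """Return one dict per component: name, version, licenses, known CVEs."""
    # Map each component reference to the vulnerability IDs that affect it.
    vulns_by_ref = {}
    for vuln in sbom.get("vulnerabilities", []):
        for target in vuln.get("affects", []):
            vulns_by_ref.setdefault(target["ref"], []).append(vuln["id"])

    rows = []
    for comp in sbom.get("components", []):
        licenses = [
            entry["license"].get("id") or entry["license"].get("name", "unknown")
            for entry in comp.get("licenses", [])
            if "license" in entry
        ]
        rows.append({
            "name": comp["name"],
            "version": comp.get("version", ""),
            "licenses": licenses,
            "vulnerabilities": vulns_by_ref.get(comp.get("bom-ref"), []),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE_SBOM):
        print(row)
```

The point of the exercise is that each component can be tied to both its licenses and any vulnerabilities reported against it, which is the raw material for a report that non-developers can read.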

Thanks in large measure to the open source heritage of the specifications and tools, both of the commonly used SBOM specifications (SPDX and CycloneDX) consider information on licenses used in a file or project as tagging and tracking relevant items, and the tools we describe have some measure of support for tracking and reporting on licenses in use. This can be useful for flagging licenses that may be mutually incompatible or even incompatible with your organization's business goals.
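
A rough sketch of such a license check follows. The review list here is purely hypothetical (which licenses clash with your goals is a question for your organization, not for a script), and the input is assumed to be a CycloneDX JSON document already loaded into a Python dict.

```python
# Hypothetical policy: SPDX license identifiers this imaginary organization
# wants flagged for review. Replace with your own list.
REVIEW_LICENSES = {"AGPL-3.0-only", "GPL-3.0-only"}


def license_ids(component):
    """Collect the SPDX license IDs (or free-form names) of one component."""
    found = []
    for entry in component.get("licenses", []):
        if "license" in entry:
            lic = entry["license"]
            found.append(lic.get("id") or lic.get("name", "unknown"))
        elif "expression" in entry:
            # License expressions are kept whole; parsing them is out of scope.
            found.append(entry["expression"])
    return found


def flag_components(sbom, review=REVIEW_LICENSES):
    """Return (name, version, reason) tuples that need a closer look.

    Components with no license information at all are flagged as well,
    since "unknown" is rarely a comfortable answer in an audit.
    """
    flagged = []
    for comp in sbom.get("components", []):
        ids = license_ids(comp)
        if not ids:
            flagged.append((comp["name"], comp.get("version", ""), "no license data"))
        for lic in ids:
            if lic in review:
                flagged.append((comp["name"], comp.get("version", ""), lic))
    return flagged


if __name__ == "__main__":
    demo = {"components": [
        {"name": "examplelib", "version": "1.2.3",
         "licenses": [{"license": {"id": "BSD-2-Clause"}}]},
        {"name": "copyleftlib", "version": "2.0",
         "licenses": [{"license": {"id": "AGPL-3.0-only"}}]},
        {"name": "mystery", "version": "0.1"},
    ]}
    for item in flag_components(demo):
        print(item)
```

A check like this only flags candidates for human review. Whether two licenses are actually incompatible in your situation is a legal question the script cannot settle.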

As I hinted at earlier, there are tools available for all of this. If you want to go on and explore for yourself, I would recommend going to the awesome-sbom site, which offers a curated collection of SBOM resources and tools hosted as a GitHub repo.

There are a large number of tools available, with varying feature sets. In addition to the free tools you find via that collection, several tool suites exist that are either exclusively commercial or offered as free trial or reduced feature set versions, with the full features available only to paying customers.

The tool set I found the most accessible for my poking around was the combination of syft for generating SBOMs and bomber for display and presentation. The home pages for both are linked from the awesome-sbom collection.

As you can see from that page, there are several SBOM formats around, and to some extent standardization and interoperability efforts are under way. But enough of that, let's look at the actual tools in use.

As a first step, it is instructive to point syft at the base directory of your project and see if it can tell you something you did not know already. syft supports a number of output formats, so if XML is the more readable format to you,

$ syft . -s all-layers -o cyclonedx-xml | xq

will give you pretty-printed XML output (assuming you have xq installed) of what syft found. Do explore the command line options for extracting other information about your project.

If you prefer JSON over XML, something like

$ syft . -s all-layers -o cyclonedx-json | jq

will give you readable JSON of the same information. Again, there are a number of options to explore.
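
Once you have the JSON output saved to a file, a short script can answer simple questions about it. The sketch below counts components per packaging ecosystem by looking at the type part of each component's package URL (purl). It assumes the CycloneDX JSON layout with a top-level components list and a purl field per component; the sample data is made up.

```python
from collections import Counter


def purl_type(purl):
    """Return the ecosystem part of a package URL, e.g. 'npm' for pkg:npm/..."""
    if not purl or not purl.startswith("pkg:"):
        return "unknown"
    # A purl looks like pkg:<type>/<namespace>/<name>@<version>
    return purl[len("pkg:"):].split("/", 1)[0]


def count_by_ecosystem(sbom):
    """Count the components in a CycloneDX-style dict per purl type."""
    return Counter(purl_type(c.get("purl")) for c in sbom.get("components", []))


if __name__ == "__main__":
    # Invented sample; in practice, load the syft output with json.load().
    sample = {"components": [
        {"name": "left-pad", "purl": "pkg:npm/left-pad@1.3.0"},
        {"name": "requests", "purl": "pkg:pypi/requests@2.0.0"},
        {"name": "vendored-thing"},
    ]}
    for ecosystem, number in count_by_ecosystem(sample).most_common():
        print(f"{number:6d}  {ecosystem}")
```

A count like this is often enough to reveal that a project pulls in packages from an ecosystem nobody on the team remembered it depended on.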

When you have explored a bit, you may want to look into how you incorporate these tools in your project and make the SBOM a build artifact.
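
One way to do that, sketched here in Python, is a small build step that runs syft and writes the result next to your other build output. The function names and the artifact path are my own inventions for illustration, and the only syft option used is the -o output format option shown above.

```python
import shutil
import subprocess
from pathlib import Path


def sbom_command(project_dir, output_format="cyclonedx-json"):
    """Build the syft command line, mirroring the invocations shown above."""
    return ["syft", str(project_dir), "-o", output_format]


def write_sbom(project_dir, artifact_path):
    """Run syft on project_dir and store its output at artifact_path.

    Raises RuntimeError if syft is not installed and
    subprocess.CalledProcessError if syft exits with an error, so that
    a build relying on this step fails loudly instead of silently
    shipping without an SBOM.
    """
    if shutil.which("syft") is None:
        raise RuntimeError("syft not found in PATH")
    result = subprocess.run(
        sbom_command(project_dir),
        check=True,            # non-zero exit raises CalledProcessError
        capture_output=True,   # syft writes the SBOM to standard output
        text=True,
    )
    artifact = Path(artifact_path)
    artifact.parent.mkdir(parents=True, exist_ok=True)
    artifact.write_text(result.stdout, encoding="utf-8")
    return artifact
```

Calling write_sbom(".", "build/sbom.cdx.json") from a build script would then leave the SBOM in a (hypothetical) build directory, where your pipeline can pick it up like any other artifact.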

The bomber documentation has this example suggestion for inclusion in a CI/CD pipeline:

# Make sure you include the - character at the end of the command. This triggers bomber to read from STDIN
syft packages . -o cyclonedx-json | bomber scan --provider ossindex --output json -

In a real world scenario, I could imagine that non-developers would appreciate it if you supplement that line with one using the --output=html option. The HTML output provides a report that lists licenses involved before listing known vulnerabilities by severity and assigned CVE.

While I was writing this article, a colleague who had been reviewing it told me of an episode that shows that even extremely basic use of the SBOM tools can be useful. A customer had called, saying they needed a complete list of tools and dependencies involved in a project, and right away. As a first step, my colleague cd'ed into the main directory of one of the subprojects for that customer, and issued the command

$ cdxgen .

and was rewarded with a bom.json file that listed somewhere in excess of three hundred dependencies for that relatively minor subproject alone. The customer was suitably impressed and granted my colleague a more realistic and less immediate time frame for submitting the full dependency tree.

More SBOM-savvy co-stakeholders in your project may even be capable of processing your JSON or XML formatted SBOMs themselves, using tools of their choice.

Your project and customer may already have chosen a different toolset, or you may find that some other SBOM generating and presentation tool set is a better match for your requirements.

It is in fact conceivable that you have SBOM-capable tools within reach in your environment already. The fairly popular images-and-sundry repository system Harbor supports automatic SBOM generation on image push by hooking in trivy for image scanning duty, should you choose to enable that feature for your Harbor hosted projects.

If you want to explore further, please dive into the resource references at the end here.

For the more Bill of Materials savvy developers who want to explore even more, it may be of interest that the OWASP and SPDX teams are working on more specialized BOM variants, including OBOM (Operating system Bill of Materials), SaaSBOM (Software as a Service Bill of Materials), CBOM (Cryptography Bill of Materials), and several more. Again, see the referenced resources at the end here and follow the breadcrumbs.

SBOM Resources

The Software Bill of Materials home page at NTIA is the mother ship of SBOM documentation

Browse OWASP CycloneDX for all things about the CycloneDX specification and related tools, also their CycloneDX tool center

Browse the System Package Data Exchange specification (SPDX) for all things SPDX (supported by the Linux Foundation), including copious linked reference material

awesome-sbom is a curated list of SBOM tools and resources

EU residents will want to poke around the Cyber Resilience Act site for reference

Brewing Transparency: How OWASP's TEA Is Revolutionizing Software Supply Chains is a summary of recent work on OWASP Transparency Exchange API (TEA)

SBOM buyer’s guide: 8 top software bill of materials tools to consider is a readable overview of (some) SBOM tools

Olle Johansson's FOSDEM presentations are among several good SBOM talks at that conference (search the site for more)

Peter N. M. Hansteen: Open Source in Enterprise Environments - Where Are We Now and What Is Our Way Forward? (2022, also here) has some insights on how open source software plays a crucial role in enterprise environments and elsewhere

No Project Is an Island: Why You Need SBOMs and Dependency Management is this article (also here)

Wednesday, January 1, 2025

A Suitably Bizarre Start of the Year 2025

© 2025 Peter N. M. Hansteen

Already somewhat blasé from life in the honeypots, yours truly registers an even more bizarre level of events after some routine log spelunking

If you're reading this soon after the piece is published, 2025 is a fresh new year, and I would like to wish you all the best for the year ahead.

Then I want to relate what happened here (or rather at the Internet facing network interface of the server in question) during the initial few hours of the new year 2025.

Note: This piece is also available without trackers but classic formatting only here.

If you are a returning reader, you will be familiar with my ongoing experiment and studies of Internet miscreants and how to thwart their efforts as effectively as possible while expending no more than absolutely necessary in terms of time or energy on our end. Central to those efforts are the greytrapping based blocklist and the ever-growing list of spamtraps, which late in 2024 passed the half a million mark, right now numbering 568212 entries of known bad, not deliverable email addresses in our domains (almost certain to have increased by the time you read this).

I have written about the daily maintenance tasks for the lists, such as they are, in previous entries such as the list homepage pointed to in the previous paragraph and the traplist ethics page as well as the blog post Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting (November 2018, also here) or for that matter the piece I wrote about the arrival of the three hundred thousandth spamtrap, The Things Spammers Believe - A Tale of 300,000 Imaginary Friends (also here).

All of those pieces show that the original emphasis was to keep the working environment sane for the local users, and the fact that I could generate resources that I could make available for others to use was really just a byproduct of that work, while of course a welcome one for its users.

After some years, and certainly around the time the list of spamtraps had reached the hundreds of thousands, the "salt the mine and poison the well" part (the fourth principle listed on the ethics page) had subtly slid more into central focus, and I was adding incrementally to my arsenal of scripts and one-liners to expand the list of "imaginary friends" as I came to think of new angles.

Most of these would involve fishing out potential local parts (the parts before the '@') from the din of spamd log entries. Some of these are hinted at in Harvesting the Noise While it's Fresh, Revisited (also here).
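
As a rough illustration of that kind of fishing expedition, the sketch below pulls candidate local parts out of log lines. It is written in Python rather than as a shell one-liner, the sample lines only approximate what spamd log entries look like, and the regular expression is a simplification for illustration, not necessarily what the scripts in actual use do.

```python
import re

# Matches <local@domain> and captures the two halves separately.
ADDRESS = re.compile(r"<([^<>@\s]+)@([^<>@\s]+)>")

# Invented sample lines, loosely modelled on spamd greylisting log entries.
SAMPLE_LOG = """\
Jan  1 03:14:15 mx spamd[1234]: (GREY) 192.0.2.10: <sales@example.com> -> <nosuchuser@example.org>
Jan  1 03:14:16 mx spamd[1234]: 192.0.2.10: disconnected after 3 seconds.
Jan  1 03:15:01 mx spamd[1234]: (GREY) 198.51.100.7: <info@example.net> -> <Info-Request@example.org>
"""


def local_parts(log_text, domain):
    """Return the sorted, de-duplicated local parts seen for one domain."""
    found = set()
    for local, dom in ADDRESS.findall(log_text):
        if dom.lower() == domain.lower():
            found.add(local)
    return sorted(found)


if __name__ == "__main__":
    print(local_parts(SAMPLE_LOG, "example.org"))
```

Passing the sender's domain instead of the recipient's picks up the purported sender addresses in the same way.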

The pace of growth for the spamtraps list did pick up as a consequence, and as I reported in a fediverse post, the total made the half million mark at some point in December of 2024.

Part of the updating procedure is to search logs for addresses not already in the spamtraps list. One of the things I tend to do after extracting the list of addresses that somebot tried to deliver to and that have not already been included in the spamtraps is to extract the log entries involving those supposedly new addresses for further processing. The output from that grep-centered one-liner from the overnight run, taken during the late morning of January 1st, 2025, can be found here.
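
The core of that step is a set difference: everything seen in the logs minus everything already on the list. A minimal Python sketch follows, assuming both inputs have already been reduced to one address per item. The example addresses are placeholders, and the real procedure is built on grep rather than on this code.

```python
def new_candidates(seen_addresses, known_traps):
    """Return addresses seen in the logs that are not yet spamtraps.

    Comparison ignores case and surrounding whitespace; the result keeps
    the order of first appearance and drops duplicates.
    """
    known = {addr.strip().lower() for addr in known_traps if addr.strip()}
    fresh = []
    for addr in seen_addresses:
        key = addr.strip().lower()
        if key and key not in known:
            known.add(key)      # also prevents reporting the same address twice
            fresh.append(key)
    return fresh


def entries_for(log_lines, addresses):
    """Return the log lines that mention any of the given addresses."""
    wanted = [a.lower() for a in addresses]
    return [line for line in log_lines
            if any(a in line.lower() for a in wanted)]


if __name__ == "__main__":
    traps = ["old.friend@example.org"]
    seen = ["old.friend@example.org", "New.Friend@example.org",
            "new.friend@example.org"]
    print(new_candidates(seen, traps))
```

With a list of half a million entries, holding the known addresses in a set keeps each lookup cheap, which is the same reason a fixed-string grep against a pattern file works well for this job.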

Take a few moments to look at that one if you want.

You will be looking at the output of a series of grep searches for destination addresses.

The bulk of the data shows that hosts not in our local networks tried to deliver largish numbers of messages to third party domains such as qq.com and gmail.com, using our spamtrap addresses as the purported sender addresses, only of course to be added to the set of greytrapped addresses.

Making up addresses in other people's domains to use as From or Reply-to addresses on your spam messages is not a new thing, of course, as long as you do not care to get any feedback on what actually happened with those attempted deliveries.

What baffled me more than a little was that the addresses were apparently used in the exact sequence they would have been found at this site after a fairly recent update run.

Apart from the sheer number of addresses and their freshness, the only item of interest was that behind each of the IP addresses involved there appears to be a number of hosts -- likely virtual machines -- with distinct identifiers in their HELO/EHLO sequence, likely generated strings of a handful of characters such as AXBPvDt.

These quasi-random, generated IDs were of course soon made into local parts for new spamtraps. As would, at times, some other items that can be extracted from logs with common Unix commands.
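
Turning such strings into spamtrap candidates is mostly a matter of filtering out anything that would not make a plausible local part. The following is a sketch under assumptions of its own: the HELO strings are taken as already extracted, the accepted character set is deliberately conservative, example.org stands in for a real local domain, and every sample string other than AXBPvDt is invented.

```python
import re

# Conservative pattern for a local part: letters, digits and a few
# punctuation characters, with no leading, trailing or doubled dots.
LOCAL_PART = re.compile(r"^[A-Za-z0-9_+-]+(\.[A-Za-z0-9_+-]+)*$")


def helo_to_traps(helo_strings, domain="example.org"):
    """Build candidate spamtrap addresses from HELO/EHLO identifiers."""
    traps = []
    seen = set()
    for raw in helo_strings:
        candidate = raw.strip()
        # Local parts are limited to 64 characters.
        if len(candidate) > 64 or not LOCAL_PART.match(candidate):
            continue            # skip bracketed IP literals, empty strings etc.
        address = f"{candidate}@{domain}"
        if address.lower() not in seen:
            seen.add(address.lower())
            traps.append(address)
    return traps


if __name__ == "__main__":
    print(helo_to_traps(["AXBPvDt", "[192.0.2.1]", "", "AXBPvDt", "qZr7LmK"]))
```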

So as a start to the new year, this was surprisingly fitting. The general insanity we have seen in this particular field continues, but appears to have reached a new level at the tail end of the tumultuous year just past, possibly heading for new levels still.

Good night and good luck.

Addendum 2025-01-13

For those so inclined, it is perhaps worth noting that after a bit of pondering some time after writing this other piece (also here), I started looking at extracting other items from the spamd log entries.

I ended up extracting the local parts for new spamtraps from the purported sender addresses of entries for trapped delivery attempts some time mid-2024. This made for a significant increase in the number of new imaginary friends, and by the final months of that year I had also started extracting similarly from the string offered by the spam senders as their host name in the EHLO/HELO exchange, which of course swelled the population further.

The effect is clearly to be seen in the file that records the number of spamtraps added per year, updated via trivial scriptery roughly daily.

I hope this article and its addenda help inspire others in our efforts of green cybercrime prevention by giving the actually intelligent detection methods less work to do.

Addendum 2025-03-20

Only a couple of weeks after the previous addendum was written, it was outdated. Some trivial resource constraints led to a slightly different organization of the log data, now as per year files up to and including 2024, and per month files from 2025 onwards, in this directory, while the main traplist page still has the list of spamtraps itself in one piece.

