Wednesday, June 7, 2023

I asked ChatGPT to write a pf.conf to spec, 2023-06-07 version

ChatGPT prompt and answer


Prompted by this exchange

I just had to check what its capabilities were. Here is what happened:


Note: A version without trackers but “classical” formatting is available here.


PNMH:
write an OpenBSD pf.conf that blocks by default, allows traffic from the local host, allows name resolution, network time and ssh out from the DMZ, lets http and https through to the web server, and allows smtp that has been whitelisted by spamd to pass to the mail server.

To which ChatGPT answered:

Sure! Here's an example of an OpenBSD pf.conf configuration file that meets your requirements:

# Define network interfaces
ext_if = "em0"  # External interface (connected to the internet)
dmz_if = "em1"  # DMZ interface
lan_if = "em2"  # Local LAN interface

# Set default block policy
block all

# Allow traffic from the local host
pass quick on lo0 all

# Allow name resolution (DNS)
pass out on $ext_if inet proto udp to any port domain

# Allow network time (NTP)
pass out on $ext_if inet proto udp to any port ntp

# Allow SSH out from the DMZ
pass out on $dmz_if inet proto tcp to any port ssh

# Allow HTTP and HTTPS traffic to the web server
pass in on $ext_if inet proto tcp to $ext_if port { http, https }

# Allow SMTP traffic that has been whitelisted by spamd to the mail server
pass in on $ext_if inet proto tcp to $ext_if port smtp \
    rdr-to $lan_if port spamd

# Log and block any remaining traffic
block log all

    

Make sure to replace the interface names (em0, em1, em2) with the appropriate interface names on your OpenBSD system. Additionally, adjust the names of the web server and mail server as needed.

Remember to save the pf.conf file in the correct location (/etc/pf.conf by default on OpenBSD) and reload the firewall rules using the pfctl command for the changes to take effect.

Note: This configuration assumes you have already set up spamd and whitelisted the desired SMTP traffic.


I will leave it as an exercise to the reader to point out why this would earn an F.

Or you can take in things like our Network Management with the OpenBSD Packet Filter Toolset or just get yourself a copy of The Book of PF, as an option after looking up the PF user guide.

Update: Point it at specific sources! After posting the original result I got a few responses, with lots of smileys and other emojis.

Then, a colleague advised that pointing the system at specific sources such as the man pages would likely produce better results.

So here is the second exchange:

PNMH:
based on the information in https://man.openbsd.org/pf.conf and https://man.openbsd.org/spamd write an OpenBSD pf.conf that blocks by default, allows traffic from the local host, allows name resolution, network time and ssh out from the DMZ, lets http and https through to the web server, and allows smtp that has been whitelisted by spamd to pass to the mail server.

The response was,

Certainly! Here's an updated version of the pf.conf configuration file based on the information from the OpenBSD man pages for pf.conf and spamd:

# Define macros for network interfaces and IP addresses
ext_if = "em0"  # External interface (connected to the internet)
dmz_if = "em1"  # DMZ interface
lan_if = "em2"  # Local LAN interface

# Set default block policy
block all

# Allow traffic from the local host
pass quick on lo0 all

# Allow name resolution (DNS)
pass out on $ext_if inet proto udp to any port domain

# Allow network time (NTP)
pass out on $ext_if inet proto udp to any port ntp

# Allow SSH out from the DMZ
pass out on $dmz_if inet proto tcp to any port ssh

# Allow HTTP and HTTPS traffic to the web server
pass in on $ext_if inet proto tcp to $ext_if port { http, https }

# Allow SMTP traffic that has been whitelisted by spamd to the mail server
table  persist file "/etc/spamd.whitelist"
pass in on $ext_if inet proto tcp to $ext_if port smtp \
    rdr-to $lan_if port spamd if-bound \
    spammed by 

# Log and block any remaining traffic
block log all

In this version, we have added the table, which is populated by the /etc/spamd.whitelist file. This file should contain the IP addresses or networks that have been whitelisted by spamd. Make sure to save the pf.conf file in the correct location (/etc/pf.conf by default on OpenBSD) and reload the firewall rules using the pfctl command for the changes to take effect. Please note that this configuration assumes you have correctly set up spamd and have populated the /etc/spamd.whitelist file with the desired IP addresses or networks.


It did pick up a few clues from the man pages I fed it, but it still has fairly obvious flaws that would stem from it not being able to take in the actual interdependencies of even a very clearly worded, if slightly complex document like man pf.conf.

But then what would happen if we in addition feed it the PF user guide?

Let's try:

PNMH:
based on the information in https://www.openbsd.org/faq/pf/index.html and links therein as well as https://man.openbsd.org/pf.conf and https://man.openbsd.org/spamd write an OpenBSD pf.conf that blocks by default, allows traffic from the local host, allows name resolution, network time and ssh out from the DMZ, lets http and https through to the web server, and allows smtp that has been whitelisted by spamd to pass to the mail server.

The third try response was,

Certainly! Here's an updated version of the pf.conf configuration file based on the OpenBSD FAQ for pf, as well as the pf.conf and spamd man pages:

# Define macros for network interfaces and IP addresses
ext_if = "em0"  # External interface (connected to the internet)
dmz_if = "em1"  # DMZ interface
lan_if = "em2"  # Local LAN interface

# Set default block policy
block all

# Allow traffic from the local host
pass quick on lo0 all

# Allow name resolution (DNS)
pass out on $ext_if inet proto { udp, tcp } to any port domain

# Allow network time (NTP)
pass out on $ext_if inet proto udp to any port ntp

# Allow SSH out from the DMZ
pass out on $dmz_if inet proto tcp to any port ssh

# Allow HTTP and HTTPS traffic to the web server
pass in on $ext_if inet proto { tcp, udp } to $ext_if port { http, https }

# Allow SMTP traffic that has been whitelisted by spamd to the mail server
table  persist file "/etc/spamd.whitelist"
pass in on $ext_if inet proto tcp to $ext_if port smtp \
    rdr-to $lan_if port spamd if-bound \
    spammed by 

# Log and block any remaining traffic
block log all
    

In this version, the rules have been updated to align with the OpenBSD pf FAQ and provide a more comprehensive configuration. Make sure to save the pf.conf file in the correct location (/etc/pf.conf by default on OpenBSD) and reload the firewall rules using the pfctl command for the changes to take effect. Please note that this configuration assumes you have correctly set up spamd and have populated the /etc/spamd.whitelist file with the desired IP addresses or networks.


It is fairly obvious that this would not have passed the pfctl parser, and that feeding it the relevant man pages as well as the PF user guide did not make the system more capable of understanding how rule evaluation actually works.

In fact, all three tries would be functionally equivalent to the rather terser

block

But arriving at that conclusion would require the reader to actually understand the content fed to it.
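
The reason comes down to how PF evaluates rules: unless a rule carries the quick keyword, the last matching rule decides what happens to a packet. What follows is a toy model of that behavior in shell and awk, not PF itself. The three-column rule format (action, optional quick, interface) is made up for the illustration and only mimics the shape of the generated rulesets:

```sh
#!/bin/sh
# Toy model of PF's last-match rule evaluation. This is NOT pf.conf syntax:
# each rule is "action [quick] interface", where "all" matches any interface.
rules='block all
pass quick lo0
pass em0
pass em1
block all'

# verdict IFACE: print what the toy ruleset does with a packet on IFACE.
verdict() {
    printf '%s\n' "$rules" | awk -v ifc="$1" '
        BEGIN { result = "pass" }            # PF passes if no rule matches
        {
            quick = ($2 == "quick")
            rif = quick ? $3 : $2
            if (rif == "all" || rif == ifc) {
                result = $1                  # a later match overrides earlier ones
                if (quick) exit              # quick stops evaluation right here
            }
        }
        END { print result }'
}

for ifc in lo0 em0 em1; do
    printf '%s: %s\n' "$ifc" "$(verdict "$ifc")"
done
```

In this model only the loopback rule survives, because quick ends evaluation before the trailing block rule is reached. Every other pass rule is overridden by the final block.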


My favorite take on this so far came via Mastodon, from Mark Shane Hayden, in this toot, which reads in full:

@mwl @pitrh I suppose we can take solace in that if Skynet ever came into existence for real it would be unreachable

Friday, March 3, 2023

ChatGPT opines on IPv6 procrastination, waxes lyrical over OpenBSD

Are the robots indeed capable of helping us in our tech writing, or even help our creativity? This writer tries on the robot that's been in the news for size. OpenBSD material to be had at the end.

ChatGPT goes to work on our first task

For a number of years, those involved in networking things have all had one thing in common: A TODO item about either getting started implementing IPv6 in their networks, or at least getting started on planning to go dual-stack or even purely IPv6.

And as anybody who follows networking things is aware, IPv6 adoption in some parts of the world has been advancing hardly at all, even in jurisdictions where actual legislation is in place that mandates IPv6 use and leaves IPv4 support as an allowable option.

I have been involved in some capacity or other in enough of these efforts over the years that I have been pondering writing a piece about the slow rate of adoption, and of course I hoped to be able to write something witty or at least entertaining about something that is in fact, to some of us at least, a somewhat frustrating situation.

Then today I attended an internal presentation by a colleague that included some examples of ChatGPT use for writing as well as programming.

The demonstration was enough fun that I thought I would see if the robot could offer some insights and perhaps get me started on writing the piece I had been pondering.

Signing up and getting access to a trial was painless and quick, so some things have improved since I first tried to access the service a few weeks ago when news about the service first hit.

First try: Please write me a story

The first chat went like this:

PNMH: managers and admins put off setting up IPv6 in their networks. write a short story on how and why they procrastinate and make excuses.

ChatGPT: Once upon a time, in a large corporation, there was a group of managers and administrators responsible for the network infrastructure. They had heard about IPv6, the new internet protocol, and knew that it was important to adopt it to future-proof the company's network. However, they kept putting off the task of implementing IPv6, and here's why:

Firstly, they felt overwhelmed by the complexity of IPv6. They were used to IPv4, and the thought of transitioning to a new protocol seemed daunting. They convinced themselves that they needed more time to research and plan the transition thoroughly.

Secondly, they thought that their current IPv4 infrastructure was sufficient, and that there was no need to change anything. They believed that IPv6 adoption was only necessary for larger companies and service providers, not for their organization.

Thirdly, they were concerned about the cost of implementing IPv6. They felt that it would require a significant investment in new hardware, software, and training, and they were not sure if the benefits of IPv6 would justify the expense.

Finally, they were preoccupied with other pressing matters, such as meeting deadlines, managing budgets, and dealing with daily operational issues. IPv6 implementation was not a top priority for them, and they kept pushing it back on their to-do lists.

As time went by, the managers and administrators continued to procrastinate, making excuses and justifications for not implementing IPv6. They were oblivious to the fact that their competitors were already using IPv6 and gaining a competitive edge. They failed to realize that IPv6 adoption was not a choice but a necessity for staying relevant and competitive in the digital age.

In the end, the managers and administrators were forced to implement IPv6 when their IPv4 addresses ran out, causing network disruptions and downtime. They learned the hard way that procrastination and excuses only lead to costly delays and missed opportunities.

Also accessible as pure text as chatgpt_on_ipv6_procrastination.txt

As one of my Mastodon commenters said, it looks like this robot has peeked into too many IT Management meetings.

I would myself have been more worried about ensuring that services we run are accessible to IPv6-only networks, and for that matter that users in our own networks should be able to access services in IPv6-only parts of the Internet.

This shows that whatever they trained the system on seems to have included the most frequently cited reasons for not actually moving forward with IPv6 implementation or other network modernization. But the thing had stylistic issues and I was looking for something fresher. So on to the next try.

Second try: Tease the legal requirements

Some stakeholders, such as Norwegian public sector entities, are under legal obligation to make their services available on IPv6 and to enable clients in their network to access IPv6 resources. The final deadline for that implementation in principle passed at the end of 2022.

So I tried stressing that point in my second try. Here is a record of that session:

PNMH: why do network designers always postpone their IPv6 implementation even past legally required deadlines?

ChatGPT: There are a few reasons why network designers may postpone their IPv6 implementation, even past legally required deadlines:

Compatibility issues: IPv4 and IPv6 are not directly compatible with each other, which means that devices using IPv4 cannot communicate directly with devices using IPv6. This creates interoperability issues that can make it difficult for organizations to transition to IPv6 without causing disruptions to their existing networks.

Cost: Implementing IPv6 requires an investment in new hardware, software, and training. For some organizations, the cost of upgrading their infrastructure to support IPv6 may be prohibitive, especially if they have already invested heavily in their existing IPv4 infrastructure.

Lack of perceived need: Some organizations may not see the urgency in transitioning to IPv6, especially if they are not experiencing any immediate problems with their IPv4 networks. This can lead to a "wait and see" approach, where organizations postpone their IPv6 implementation until they are forced to do so by external factors.

Complexity: IPv6 is more complex than IPv4, which can make it more difficult for network designers to implement and manage. This complexity can also make it more challenging to troubleshoot network issues that may arise after the transition to IPv6.

Lack of IPv6 support from vendors: Some network hardware and software vendors may not yet fully support IPv6, which can make it difficult for organizations to transition to IPv6 without making significant changes to their existing infrastructure.

Overall, transitioning to IPv6 requires a significant investment of time and resources, and network designers may delay implementation for various reasons. However, as the pool of available IPv4 addresses continues to dwindle, and as more organizations begin to adopt IPv6, it is becoming increasingly important for organizations to make the transition to IPv6 in a timely manner.

Also accessible as pure text as chatgpt_on_ipv6_procrastination_legal.txt

This was actually an improvement. The reachability point is made clearer here, and the system clearly picked up on the legal requirement part.

Whether IPv6 is in fact more complex than IPv4 is debatable, but the robot is right that this is at least a common perception, factual or otherwise.

A clear improvement, but when I publish something, I make an effort to make it witty or at least a bit snarky.

Time to tweak the query once again.

Third try: Once more, with wit and feeling

My third and final IPv6-themed query of the day ran like this:

PNMH: write a funny story about why IPv6 implementation is so hard and scary to plan for and actually do

ChatGPT: Once upon a time, there was a small company that had been using IPv4 for all their networking needs. One day, their IT manager decided that it was time to upgrade to IPv6.

Excited by the prospect of faster speeds and improved security, the IT manager decided to jump right in and start planning for the implementation of IPv6. However, as they began to research and plan, they soon discovered that implementing IPv6 was not as easy as they had thought.

First, they realized that they would need to upgrade all of their networking equipment to support IPv6. This was going to be expensive, but the IT manager was determined to make it happen. They went ahead and ordered all the necessary equipment and started scheduling the installation.

However, as the installation date approached, the IT manager began to feel a sense of dread. What if the new equipment didn't work with their existing systems? What if they couldn't get everything up and running in time?

Feeling overwhelmed, the IT manager decided to take a break and watch some TV. As luck would have it, they stumbled upon a horror movie about a group of people who get lost in a network of tunnels and are pursued by an army of rogue IPv6 addresses.

Suddenly, the IT manager's fear of implementing IPv6 reached new heights. They couldn't shake the feeling that something was going to go horribly wrong. They started imagining all sorts of terrifying scenarios: their network crashing, their data being compromised, even their computers coming to life and taking over the office!

In the end, the IT manager decided to put the IPv6 implementation on hold indefinitely. They just couldn't bring themselves to take the risk. And so, the small company continued to chug along on IPv4, while their competitors raced ahead with faster, more secure networks.

The moral of the story? Don't let fear hold you back from upgrading to IPv6. Sure, it can be scary and complicated, but with the right planning and support, you can make the transition smoothly and reap the benefits of a faster, more secure network. And who knows, maybe you'll even get a good horror story out of it!

Also accessible as pure text as chatgpt_on_ipv6_procrastination_funny.txt

Ouch. The Once upon a time lead-in is back. But the rest is something that I might actually manage to editstomptwistcrunch into something presentable given enough coffee and time.

I have had worse material handed to me to edit by humans, and I may have done worse first drafts myself when writing when I should not have.

There are some logic issues in there, but nothing that could not be edited out or around.

By this time of the day, I had to go attend to a few things with deadlines, but I took the time to do a final query on a slightly different topic.

And now for something completely different: Write me OpenBSD poetry

So after looking into a slightly painful topic, what better than to see what our robot friend has learned about our favorite operating system and poetry:

PNMH: write a poem about OpenBSD

ChatGPT: In a world of code and bytes,
Where hackers lurk in endless nights,
There shines a system, strong and bright,
A fortress of security, with OpenBSD in sight.

Like a lighthouse on a stormy shore,
It guards the network from cyber war,
With firewalls, cryptography, and more,
It stands tall, unyielding, forevermore.

In its heart, a philosophy,
Of code that's clean and free,
No backdoors, no secret keys,
Just trust and transparency.

From humble beginnings, it arose,
A gift to the world, from the chosen few,
A beacon of hope, for those who oppose,
The tyranny of those who seek to control.

So let us raise a toast, to OpenBSD,
A shining example of what can be,
When we dare to dream, and we dare to be free,
For a better tomorrow, for you and for me.

Also accessible as pure text as chatgpt_poem_about_openbsd.txt

I just had to post that to openbsd-misc, and it will be interesting to see how the thread develops. And it is a fitting end to what is, I think, the piece I have posted under my own name that contains the least material written by myself.

Good night and good luck.


Sunday, December 25, 2022

The Despicable, No Good, Blackmail Campaign Targeting ... Imaginary Friends?

Natalia here speaks to our imaginary friend 185.150.184.92

In which we confront the pundits' assumption that the embarrassment-based extortion attempts would grow more “sophisticated and credible” over time with real data.

It's a problem that should not exist. 

It's a scam that's so obvious it should not work.

Yet we still see a stream of reports about people who have actually gone out and bought their first bitcoins (or more likely fractions of one) in order to pay off blackmailers who claim to have in their possession videos that record the victim while performing some autoerotic activity and the material they were supposedly viewing while performing that activity.

And occasionally one of those messages actually finds its way to some pundit's inbox (like yours truly), and at times some of those pundits will say that those messages represent a real problem and will evolve to be ever more sophisticated.

Note: This piece is also available, with more basic formatting but with no trackers, here.

I am here to tell you that

  1. That incriminating video does not exist, and
  2. The pundits who predicted that those scams would evolve to become more sophisticated were wrong.

If you stumbled on this article because one of those messages reached you, it's safe to not read any further and please do ignore the extortion attempt.

In 2019 I wrote a piece, The 'sextortion' Scams: The Numbers Show That What We Have Is A Failure Of Education, also available without trackers, where the summary is,

Every time I see one of those messages reach a mailbox that is actually read by one or more persons, I also see delivery attempts for near identical messages aimed at a subset of my now more than three hundred thousand spamtraps, also known as imaginary friends.

Over the years since the piece was originally written, I have added several updates — generally when some of this nonsense reaches a mailbox I read — and while I have seen the messages in several languages, no real development beyond some variations in wording has happened.

Whenever one of those things does reach an inbox, my sequence of actions is generally to save the message and add it to the archive, then see if the sending IP address has already entered the blocklist that is later exported, and add it by hand if not. Then check if the number of trapped addresses has swelled recently by checking the log file from the export script

$ tail -n 96 /var/log/traplistcounts

See if there is a sharp increase since the last blocklist export

$ doas spamdb | grep -c TRAPPED

Then check for related activity in the log

$ tail -n 500 -f /var/log/spamd

Check for the full subject in the same log file

$ grep "You are in really big troubles therefore, you much better read" /var/log/spamd

Then check older, archived logs to see how long this campaign has been going on for

$ zgrep "You are in really big troubles therefore, you much better read" /var/log/spamd.0.gz

This time, the campaign had not gone on for long enough to show traces in the older archive, so I go on to extracting the sending IP addresses

$ grep "You are in really big troubles therefore, you much better read" /var/log/spamd | awk '{print $6}' | tr -d ':' | sort -u

Check for activity from one of the extracted addresses

$ grep 183.111.115.4 /var/log/spamd | tee wankstortion/20221123_trapped_183.111.115.4.txt

Extract the sender IP addresses to a shell variable to use in the next oneliner,

$ troubles=$(grep trouble /var/log/spamd | awk '{print $6}' | tr -d ':' | sort -u | grep -v BLACK | tee -a wankstortion/20221123_campaign_ip_addresses.txt)

which will record all activity involving those IP addresses since the last log rotation:

$ for foo in $troubles ; do grep $foo /var/log/spamd | tee -a wankstortion/20221123_campaign_log_extract.txt ; done

You will find all those files, along with some earlier samples, and by the time you read this, possibly even newer samples, in the archive.

When something of the sort inboxes, I probably will go on adding to the archive, and if I have time on my hands, also run similar extraction activities as the ones I just described. But unless something unexpected such as actual development in the senders' methods occurs, I will not bother to write about it.

The subject is simply not worth attention past persuading supposed victims to not bother to get bitcoins or spend any they might have to hand. None of my imaginary friends have, and they are just as fine as they were before somebot tried to scam them.

Good night and good luck.


 

Friday, December 23, 2022

Can Your Spam-eater Manage to Catch Seventy-one Percent Like This Other Service?

Measuring the effect of what you do is important. Equally important is knowing what is the measure of your actions.

A question turned up on IRC that had me thinking.

Do you have a percentage of the spam traffic you catch on your MXes? The reason I ask is I just learned that fastmail.com claim they catch 71% of all incoming spam. Also a rate of false positives would be nice to have, but that's likely harder to measure.

My first impulse was that I would consider a seventy-one percent hit rate on the low side of what we are seeing here at bsdly.net and associated domains.

But getting actually useful data would require some thinking. That said, comparing a major mail operator that sells deliverability and promises a 71 percent catch rate for incoming spam and bsdly.net would be like comparing apples and oranges at best. 

While bsdly.net (which is also known under a few other domain names) is my main mail service for my personal use and for a very select number of other people, to the rest of the world it is primarily a honeypot that generates security relevant data that other sites use, and that contributes to IP reputation rankings.

The site has been in operation in those roles for a little more than 15 years, since shortly before the original announcement in the article Hey, spammer! Here's a list for you!. When we started using the greylisting and greytrapping based setup, we saw a sharp drop in undesirable messages actually reaching inboxes, and I observed a marked decrease in load on the mail servers that did the content filtering.

Not long after I had set up our early greylisting setup, a message turned up on the openbsd-misc mailing list that pretty much matched our experience — a 95% reduction in spam in line to be treated to content filtering — so setting up precise measuring became a thing to do when we could get around to it.

Now enough with the background. It is relatively easy to extract at least some data that would give us a rough picture of the relative effectiveness of the greylisting and greytrapping versus the content filtering on receipt. The setup is very similar to the one described in the practically-oriented parts of Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools and is part of a synchronizing multi-domain setup roughly as described in the earlier article In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe.

Using only tools found in the OpenBSD base system, I went on to collect data.

Whenever spamd(8) closes a connection it logs a message to that effect, so

$ zgrep "Nov  1" /var/log/spamd.6.gz | grep -c disconnected

supplies the total number of connections closed by spamd(8) during November 1st, fetched from the archived log file.

Similarly

$ zgrep "Nov  1" /var/log/spamd.6.gz | grep -c BLACK

provides the number of connections during the same 24 hour period initiated by hosts that were already in one of the blocklists used.

The command to get the number of connections that had cleared the first hurdle and entered greylisted status would be

$ zgrep "Nov  1" /var/log/spamd.6.gz | grep -c GREY

And the number of hosts that had been well behaved enough to enter the whitelist and be allowed to talk to the real SMTP service comes out of

$ zgrep "Nov  1" /var/log/spamd.6.gz | grep -c whitelisting
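
The four counts can also be collected in a single pass over the log. This is a minimal sketch that assumes only what the commands above rely on: the date string at the start of each line and the keywords disconnected, BLACK, GREY and whitelisting. The function name spamd_day_counts is made up for this illustration:

```sh
#!/bin/sh
# spamd_day_counts DATE: read spamd log lines on stdin and print, for the
# lines that start with DATE, how many contain each of the four keywords.
# (The function name is hypothetical; the keywords are those grepped for above.)
spamd_day_counts() {
    awk -v day="$1" '
        index($0, day) == 1 {
            if (/disconnected/) closed++
            if (/BLACK/)        black++
            if (/GREY/)         grey++
            if (/whitelisting/) white++
        }
        END {
            printf "closed=%d black=%d grey=%d whitelisted=%d\n",
                   closed, black, grey, white
        }'
}

# Typical use on a rotated log:
#   gzip -dc /var/log/spamd.6.gz | spamd_day_counts "Nov  1"
```

Reading from standard input keeps the function usable on both the live log and the compressed archives.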

For hosts that have reached this far and did not fail the content filtering we do during receipt, we get the number with

$ doas zgrep 2022-11-02 /var/spool/exim/logs/main.log.6.gz | grep -c Completed

It is however worth noting that our MTA exim apparently reports Completed for message deliveries in both directions, so the number of received messages, or messages that did inbox, is likely about thirty percent lower.

The number of messages rejected for one reason or the other, by being addressed to an undeliverable address or by failing content filtering we find with

$ doas zgrep 2022-11-02 /var/spool/exim/logs/main.log.6.gz | grep -c rejected

And finally, a side effect of a frequently run log reading script that adds hosts with certain kinds of characteristics such as not having a correct reverse DNS entry to a blocklist and kills all their connections will at times produce an unexpected disconnection while reading SMTP command message. We find those with

$ doas zgrep 2022-11-02 /var/spool/exim/logs/main.log.6.gz | grep -c unexpected

Those are hosts that somehow got past spamd(8) by behaving enough like a real SMTP server to clear greylisting. However spamd(8) does not have the ability to check for valid reverse, so that part is left in our case to check for by reading the log files at intervals.

The following table has the data for November 2022 —

Date        Incoming SMTP  BLACK        GREY         New whitelist  Deliveries  Rejected  Unexpected
            connections    connections  connections  entries                              disconnect
2022-11-01  53303          38951        2580         54             1347        409       384
2022-11-02  55653          40467        2174         121            1297        549       330
2022-11-03  59658          43901        2086         85             1260        865       759
2022-11-04  57462          45674        1683         71             1270        30        0
2022-11-05  44993          43571        2146         105            1182        43        0
2022-11-06  36768          37802        2322         86             1366        184       0
2022-11-07  49464          44213        2398         182            1424        67        0
2022-11-08  52285          45904        2676         113            1513        69        3
2022-11-09  47652          47988        2085         105            1438        154       0
2022-11-10  57850          49875        2614         104            1435        192       2
2022-11-11  60269          56719        2355         99             1420        90        1
2022-11-12  46139          54073        1160         96             1182        29        0
2022-11-13  40497          40221        1777         70             1239        189       0
2022-11-14  59965          59951        2062         63             1382        145       73
2022-11-15  56265          32727        2304         113            1298        351       301
2022-11-16  77252          58029        1925         109            1340        282       33
2022-11-17  43107          30713        786          131            1250        215       17
2022-11-18  49448          48999        1590         96             1327        194       1
2022-11-19  42413          45927        973          92             1182        182       70
2022-11-20  50890          55318        1558         77             1203        358       33
2022-11-21  36601          35070        1707         125            1321        241       146
2022-11-22  37840          35499        2055         99             1359        142       17
2022-11-23  43186          34545        1314         114            1345        103       21
2022-11-24  46802          45765        1856         66             1269        729       52
2022-11-25  70911          52404        1315         89             1326        1488      395
2022-11-26  39780          32226        1500         77             1175        954       379
2022-11-27  67578          41581        1743         85             1231        523       315
2022-11-28  54688          37534        2433         77             1337        321       269
2022-11-29  70893          45917        2502         65             1248        87        39
2022-11-30  50280          35585        2567         67             1324        1293      1113

The table is also available as a comma separated (CSV) file.
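
To put rough numbers on the table, two shares can be worked out per day: the BLACK connections and the deliveries, each as a percentage of incoming connections. This is a small sketch that assumes the columns arrive whitespace-separated in the order used in the table; the function name day_shares is made up. The BLACK count comes from a different grep than the connection total and exceeds it on some days, and the deliveries column includes outgoing mail, so the percentages are a rough picture only:

```sh
#!/bin/sh
# day_shares: read table rows (date, incoming, BLACK, GREY, new whitelist,
# deliveries, rejected, unexpected) and print two rough percentages per day.
# Header lines are skipped because their second field is not a number.
day_shares() {
    awk '$2 + 0 > 0 {
        printf "%s black=%.1f%% delivered=%.1f%%\n",
               $1, 100 * $3 / $2, 100 * $6 / $2
    }'
}

# First row of the November table:
echo "2022-11-01 53303 38951 2580 54 1347 409 384" | day_shares
```

For November 1st this works out to roughly 73 percent of connections coming from hosts already on a blocklist, while deliveries correspond to about 2.5 percent of the connection count.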

As I mentioned earlier, the number of connections to the outer layer spamd(8) is likely higher than what would be expected on sites that are not considered a honeypot and home to in excess of three hundred thousand imaginary friends (see The Things Spammers Believe - A Tale of 300,000 Imaginary Friends or the trackerless version).

That said, I think the data shows that catching the unwanted traffic early, and discarding as much as possible of that traffic before it reaches the resource hungry content filtering is definitely beneficial. 

Even sites that do not actively bait the baddies out there would likely see noticeable energy bill savings by having their mail servers run quieter and cooler, as they definitely will after getting a greylisting, and optionally greytrapping, setup in front of them. Those services have a truly low energy consumption profile.

If you found this article interesting, useful or just simply irritating, I would like to hear from you. Please use the comment field, or if you prefer, send email to nix at nxdomain dot no with a subject that at least tries to sound sensible and relevant.

As always, if you are interested in research on items mentioned in this article, I will be able to provide data for study. I will honor reasonable requests.


Friday, December 9, 2022

Harvesting the Noise While it's Fresh, Revisited

A year's worth of logs yields entertaining but unsurprising findings about spammer behavior.
Spam mail, masked but detected, from the archive

Returning readers will be almost painfully aware that here at nxdomain.no (also known as bsdly.net) we host and maintain a blocklist, which in turn is the product of traffic that hits our mail system with attempts at delivery to one or more of the now more than three hundred thousand known bad addresses, also featured at the blocklist home page.

Note: This piece is also available without trackers but only basic formatting here

When I first set up the greytrapping back in 2007, the initial spamtraps were non-deliverable addresses in our domains that I had extracted from mail server logs. I won't bore you with the details (which are anyway documented at length in earlier articles), but it was clear from those logs that the domains we hosted back then were more or less continuously subject to Joe jobs, as in somebody sending messages with a forged From: field with a made up address in our domains.

After a while I started extracting the potential new spamtraps from the greylist — actually dumping data from there once per hour as part of the script that also generated the exported blocklist. The basic process is described in the July 25 2007 article Harvesting the noise while it's still fresh; SPF found potentially useful (also available trackerless but with links to tracked articles).

Then today it struck me that while that method is useful, by extracting only from the greylist we will only ever collect the address from the initial connections. Any addresses attempted after the miscreants enter the blocklist will simply not be recorded there.

This of course led to the question: What did we miss?

Fortunately I keep my logs around for a while; the most easily accessible log archive for my main spamd spans a little over a year. So I set about with some very basic grep and awk, which netted me this raw list of targeted addresses from the spamd logs.

The list weighs in at a total of 269903 entries, as counted by wc -l.

Some of those addresses are valid, and a small, but actually significant, number are in domains we do not actually serve here, and some entries do not look like mail addresses at all. The stranger ones could be strings encoded in a character set that spamd is not equipped to handle, or could be other binary data that might have been intended to trigger bugs in some of the variants of fully equipped SMTP servers that are out there. Or simply noise of any other kind, including a byproduct of the not very intelligent extraction one-liner I used.
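For readers who want to try something similar, here is a minimal sketch of such an extraction. It is not the one-liner I actually used. It assumes spamd(8) runs in verbose mode and logs GREY and BLACK lines of the form "(GREY) 192.0.2.1: <sender> -> <recipient>", and the function name extract_targets is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: read spamd log lines on stdin and print the unique
# recipient addresses, one per line.
# Assumption: the relevant lines look like
#   ... spamd[1234]: (GREY) 192.0.2.1: <sender@example.org> -> <target@example.net>
extract_targets() {
    # Keep only the GREY/BLACK tuple lines, ignore connect/disconnect noise
    grep -E 'spamd\[[0-9]+\]: \((GREY|BLACK)\)' |
    # The recipient is the last field; strip the angle brackets and
    # lowercase it so that case variants collapse into one entry
    awk '{ addr = $NF; gsub(/[<>]/, "", addr); print tolower(addr) }' |
    sort -u
}
```

Something like `cat /var/log/spamd* | extract_targets | wc -l` would then give the count. The log file location is also an assumption and depends on your syslog setup.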

The target addresses in foreign domains I take as a sign that at least some spamming operators mistake a reasonably configured spamd for an open relay, just like they did all those years ago when I started running the greytrapping.

Some things apparently stay the same no matter how the rest of the world has found a way to move forward.

While I did a few other tasks and finally started writing this article, the bulk of the processes that would answer the question posed earlier (What did we miss?) could fortunately run unattended in the background, and after some manual massaging we are left with a results file, with 1530 entries that were none of

  • actually useful deliverable addresses in our domains
  • existing spamtraps

This means of course that the collection of imaginary friends expanded by the same number, and now stands at 304154 entries.

Which I suppose means that harvesting the noise even after a period of aging for refinement can be a good thing.
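The filtering that produced the results file boils down to a set difference: candidates minus existing spamtraps minus deliverable addresses. A rough sketch, assuming three plain text files with one address per line (the function name and the file arguments are placeholders, not the scripts I run in production):

```shell
#!/bin/sh
# Hypothetical helper: print the candidate addresses that are neither
# existing spamtraps nor deliverable addresses.
#   $1 = candidates, $2 = existing spamtraps, $3 = deliverable addresses
new_spamtraps() {
    tmpdir=$(mktemp -d) || return 1
    # comm(1) needs both inputs sorted the same way, so pin the locale
    LC_ALL=C sort -u "$1" > "$tmpdir/candidates"
    LC_ALL=C sort -u "$2" "$3" > "$tmpdir/excluded"
    # -23 keeps only the lines that are unique to the first file
    LC_ALL=C comm -23 "$tmpdir/candidates" "$tmpdir/excluded"
    rm -rf "$tmpdir"
}
```

Note that this only handles exact matches. The manual massaging mentioned above, such as weeding out strings that are not mail addresses at all, still has to happen somewhere.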

The entries added represent a wide variety of phenomena. Quite a few seem to be truncated versions of earlier spamtrap entries, and a fair number of the new entries look like they may have descended from artifacts of stupidity such as products of SMTP callbacks. Proving mainly that in mail and spam handling, there appears to be a space still for the less intellectually astute.

With all of this said, the natural followup question is, given the modest net result, was this worth the effort?

Well, the raw output that yielded 269903 entries needed some manual operations in order to weed out the obvious noise (exact time used not recorded), followed by another background task that took, according to time(1)

    real        105m24.220s
    user        73m3.280s
    sys	        29m14.930s
    

which yielded 1577 entries that were pared down to 1530 entries that met the criteria for inclusion in the circle of imaginary friends (also known as spamtraps).

Before this experiment, the spamtraps list numbered 302625, after including the result here, the count stands at 304154, for a gain of less than one percent of the previous total. Again, if you check back at the traplist home page now, the total number is likely to have increased again.

So was it worth the effort? I feel that as an experiment, it was worth doing.

Whether or not it is an experiment that is worth repeating is a question for another day.

If you have opinions on this, I would love to hear from you, in comments, via email or messages on whichever social media brought you the link to this article.

As always, parties interested in studying the data referenced in this article and other pieces I have written are welcome to contact me for arrangements. I can easily dig out more and rawer data than directly referenced here on request.

Stay safe out there.


As a side note, a slightly improved way of extracting useful data about other domains' mail service via SPF records can be found in the November 2018 article Goodness, Enumerated by Robots. Or, Handling Those Who Do Not Play Well With Greylisting.

That article (naturally) works from the premise that you are running a recent OpenBSD system.


Sunday, September 25, 2022

A Few of My Favorite Things About The OpenBSD Packet Filter Tools

The OpenBSD packet filter PF was introduced a little more than 20 years ago as part of OpenBSD 3.0. We'll take a short tour of PF features and tools that I have enjoyed using.



NOTE: If you are more of a slides person, the condensate for a SEMIBUG user group meeting is available here. A version without trackers but “classical” formatting is available here.

At the time the OpenBSD project introduced its new packet filter subsystem in 2001, I was nowhere near the essentially full time OpenBSD user I would soon become. I did however quickly recognize that even what was later dubbed “the working prototype” was reported to perform better in most contexts than the code it replaced.

The reason PF's predecessor needed to be replaced has been covered extensively by myself and others elsewhere, so I'll limit myself to noting that the reason was that several somebodies finally read and understood the code's license and decided that it was not in fact open source in any acceptable meaning of the term.

Anyway the initial PF release was very close in features and syntax to the code it replaced. And even at that time, the config syntax was a lot more human readable than the alternative I had been handling up to then, which was Linux' IPtables. The less is said about IPtables, the better.

But soon visible improvements in user friendliness, or at least admin friendliness, started turning up. With OpenBSD 3.2, the separate /etc/nat.conf network address translation configuration file moved to the attic and the NAT and redirection options moved into the main PF config file /etc/pf.conf.

The next version, OpenBSD 3.3, saw the ALTQ queueing configuration move into pf.conf as well, and the previously separate altq.conf file became obsolete. What did not change, however, was the syntax, which was to remain just bothersome enough that many of us put off playing with traffic shaping until some years later. Other PF news in that release included anchors, or named sub-rulesets, as well as tables, described as "a very efficient way for large address lists in rules" and the initial release of spamd(8), the spam deferral daemon.

More on all of these things later. I will not bore you with a detailed history of PF features introduced or changed in OpenBSD over the last twenty-some years.

PF Rulesets: The Basics

So how do we go about writing that perfect firewall config?

I could go on about that at length, and I have been known to on occasion, but let us start with the simplest possible, yet absolutely secure PF ruleset:

block

With that in place, you are totally secure. No traffic will pass.

Or as they say in the trade, you have virtually unplugged yourself from the rest of the world.

By way of getting ahead of ourselves, that particular ruleset will expand to the following:

block drop all

But we are getting ahead of ourselves.

To provide you with a few tools and some context, these are the basic building blocks of a PF rule:

verb criteria action ... options

Here are a few sample rules to put it into context, all lifted from configurations I have put into production:

pass in on egress proto tcp to egress port ssh

This first sample says that if a packet arrives on the egress — an interface belonging to the group of interfaces that has a default route — and that packet is a TCP packet with a destination service ssh, let the packet pass to the interfaces belonging to the egress interface group.

Yes, when you write PF rulesets, you do not necessarily need to write port numbers for services and memorize what services hide behind port 80, 53 or 443. The common or standard services are known to the rules parsing part of pfctl(8), generally with the service names you can look up in the /etc/services file.

The interface groups concept is as far as I know an OpenBSD innovation. You can put interfaces into logical groups and reference the group name in PF configurations. A few default interface groups exist without you doing anything, egress is one, another common one is wlan where all configured WiFi interfaces are members by default. Keep in mind that you can create your own interface groups — set them up using ifconfig(8) — and refer to them in your rules.

match out on egress nat-to egress

This one matches outbound traffic, again on egress (which in the simpler cases consists of one interface) and applies the nat-to action on the packets, transforming them so that the next hops all the way to the destination will see packets where the source address is equal to the egress interface's address. If your network runs IPv4 and you have only one routeable address assigned, you will more than likely have something like this configured on your Internet-facing gateway.

It is worth noting that early PF versions did not have the match verb. After a few years of PF practice, developers and practitioners alike saw the need for a way to apply actions such as nat-to or other transformations without making a decision on whether to pass or block the traffic. The match keyword arrived in OpenBSD 4.6 and in retrospect seems like a prelude to more extensive changes that followed over the next few releases.

Next up is a variation on the initial absolutely secure ruleset.

block all

I will tell you now so you will not be surprised later: If you had made a configuration with those three rules in that order, your configuration would be functionally the same as the one word one we started with. This is because in PF configurations, the rules are evaluated from top to bottom, and the last matching rule wins.

The only escape from this progression is to insert a quick modifier after the verb, as in

pass quick from (self)

which will stop evaluation when a packet matches the criteria in the quick rule. Please use sparingly if at all.

There is a specific reason why PF behaves like this. The system that PF replaced in OpenBSD had the top to bottom, last match wins logic, and the developers did not want to break existing configurations too badly during the transition away from the old system.

So in practice you would put them in this order for a more functional setup,

  block all
  match out on egress nat-to egress
  pass in on egress proto tcp to egress port ssh
    

but likely supplemented by a few other items.

For those supplementing items, we can take a look at some of the PF features that can help you write readable and maintainable rulesets. And while a readable ruleset is not automatically a more secure one, readability certainly helps spot errors in your logic that could put the systems and users in your care in reach of potential threats.

To help that readability, it is important to be aware of these features:

Options: General configuration options that set the parameters for the ruleset, such as

  set limit states 100000
  set debug debug
  set loginterface dc0
  set timeout tcp.first 120 
  set timeout tcp.established 86400 
  set timeout { adaptive.start 6000, adaptive.end 12000 }
  

If the meaning of some of those does not seem terribly obvious to you at this point, that's fine. They are all extensively documented in the pf.conf man page.

Macros: Content that will expand in place, such as lists of services, interface names or other items you feel useful. Some examples along with rules that use them:

  ext_if = "kue0" 
  all_ifs = "{" $ext_if lo0 "}" 
  pass out on $ext_if from any to any 
  pass in  on $ext_if proto tcp from any to any port 25
  

Keep in mind that if your macros expand to lists of either ports or IP addresses, the macro expansion will create several rules to cover your definitions in the ruleset that is eventually loaded.

Tables: Data structures that are specifically designed to store IP addresses and networks. Originally devised to be a more efficient way to store IP addresses than macros that contained IP addresses and expanded to several rules that needed to be evaluated separately. Rules can refer to tables so the rule will match any member of the table.

  table <badhosts> persist counters file "/home/peter/badhosts"
  # ...
  block from <badhosts>
      

Here the table is loaded from a file. You can also initialize a table in pf.conf itself, and you can even manipulate table contents from the command line without reloading the rules:

$ doas pfctl -t badhosts -T add 192.0.2.11 2001:db8::dead:beef:baad:f00d

In addition, several of the daemons in the OpenBSD base system such as spamd, bgpd and dhcpd can be set up to interact with your PF rules.

Rules: The rules with the verbs, criteria and actions that determine how your system handles network traffic.

A very simple and reasonable baseline is one that blocks all incoming traffic but allows all traffic initiated on the local system:

  block
  pass from (self)
      

The pass rule lets our traffic pass to elsewhere, and since PF is a stateful firewall by default, return traffic for the connections the local system sends out will be allowed back.

You probably noticed the configuration here references something called (self).

The string self is a default macro which expands to all configured local interfaces on the host. Here, self is set inside parentheses () which indicates that one or more of the interfaces in self may have dynamically allocated addresses and that PF will detect any changes in the configured interface IP addresses.

This exact ruleset expanded to this on my laptop in my home network at one point:

 $ doas pfctl -vnf /etc/pf.conf
   block drop all
   pass inet6 from ::1 to any flags S/SA
   pass on lo0 inet6 from fe80::1 to any flags S/SA
   pass on iwm0 inet6 from fe80::a2a8:cdff:fe63:abb9 to any flags S/SA
   pass inet6 from 2001:470:28:658:a2a8:cdff:fe63:abb9 to any flags S/SA
   pass inet6 from 2001:470:28:658:8c43:4c81:e110:9d83 to any flags S/SA
   pass inet from 127.0.0.1 to any flags S/SA
   pass inet from 192.168.103.126 to any flags S/SA

The pfctl command here says to verbosely parse but do not load rules from the file /etc/pf.conf.

This shows what the loaded ruleset will be, after any macro expansions or optimizations.

For that exact reason, it is strongly recommended to review the output of pfctl -vnf on any configuration you write before loading it as your running configuration.
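One way to make that review a habit is a small wrapper that refuses to load a ruleset that does not parse. This is only a sketch: the function name safe_reload is hypothetical, and the PFCTL variable exists purely so you can do a dry run (for example with PFCTL=echo) without touching the running configuration. On a real system you would run it as root or via doas.

```shell
#!/bin/sh
# Hypothetical helper: load a PF ruleset only if it parses cleanly.
# PFCTL can be overridden, for example PFCTL=echo, to print the commands
# instead of executing them.
safe_reload() {
    conf=${1:-/etc/pf.conf}
    # -n: parse only, do not load; -v: print the expanded rules; -f: file
    ${PFCTL:-pfctl} -vnf "$conf" || return 1
    # Only reached when the parse-only pass succeeded
    ${PFCTL:-pfctl} -f "$conf"
}
```

The verbose parse prints the expanded rules, so you still get to read what would be loaded. The difference is that a syntax error now stops the load instead of surprising you afterwards.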

If you look closely at that command output, you will see both the inet and inet6 keywords. These designate IPv4 and IPv6 addresses respectively. PF since the earliest days has supported both, and if you do not specify which address family your rule applies to, it will apply to both.

But this has all been on a boring single host configuration. In my experience, the more interesting settings for PF use are when the configuration is for a host that handles traffic for other hosts, as a gateway or other intermediate host.

To forward traffic to and from other hosts, you need to enable forwarding. You can do that from the command line:

 # sysctl net.inet.ip.forwarding=1 
 # sysctl net.inet6.ip6.forwarding=1
	

But you will want to make the change permanent by putting the following lines in your /etc/sysctl.conf so the change survives reboots.

  net.inet.ip.forwarding=1 
  net.inet6.ip6.forwarding=1
	

With these settings in place, a configuration (/etc/pf.conf) like this might make sense if your system has two network interfaces that are both of the bge kind:

  ext_if=bge0
  int_if=bge1
  client_out = "{ ftp-data ftp ssh domain pop3, imaps nntp https }"
  udp_services = "{ domain ntp }"
  icmp_types = "echoreq unreach"
  match out on egress inet nat-to ($ext_if)
  block
  pass inet proto icmp all icmp-type $icmp_types keep state
  pass quick proto { tcp, udp } to port $udp_services keep state
  pass proto tcp from $int_if:network to port $client_out
  pass proto tcp to self port ssh
	

Your network likely differs in one or more ways from this example. See the references at the end for a more thorough treatment of all these options.

And once again, please do use the readability features of the PF syntax to keep you sane and safe.

A Configuration That Learns From Network Traffic Seen and Adapts To Conditions

With PF, you can create a network that learns. Fairly early in PF's history it occurred to the developers that the network stack collects and keeps track of information about the traffic it sees, which could then be acted upon if the software became able to actively monitor the data and act on specified changes. So the state tracking options entered the pf.conf repertoire in their initial form with the OpenBSD 3.7 release.

A common use case: when you run an SSH service, or really any kind of listening service with the option to log in, you will see some number of failed authentication attempts that generate noise in the logs. The password guessing, or as some of us say, password groping, can turn out to be pretty annoying even if the miscreants do not actually manage to compromise any of your systems. So to eliminate noise in our logs we turn to the data that is anyway available in the state table, to track the state of active connections, and to act on limits you define such as number of connections from a single host over a set number of seconds.

The action could be to add the source IP that tripped the limit to a table. Additional rules could then subject the members of that table to special treatment. Since that time, my internet-facing rule sets have tended to include variations on

  table <bruteforce> persist
  block quick from <bruteforce>
  pass inet proto tcp from any to $localnet port $tcp_services \
        flags S/SA keep state \
	(max-src-conn 100, max-src-conn-rate 15/5, \
         overload <bruteforce> flush global)
	

which means that any host that tries more than 100 simultaneous connections or more than 15 new connections over 5 seconds is added to the table and blocked, with any existing connections terminated.

It is a good practice to let table entries in such setups expire eventually. How long entries stay is entirely up to you.

At first I set expiry at 24 hours, but with password gropers like those caught by this rule being what they are, I switched a few years ago to four weeks at first, then upped it again a few months later to six weeks. Groperbots tend to stay broken for that long. And since they target any service you may be running, state tracking options with overload tables can be useful in a lot of non-SSH contexts as well.
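Expiry itself is something pfctl(8) can do for you with its table expire command, which takes an age in seconds. A minimal sketch of a helper you could call from cron follows. The function name and the weeks argument are inventions for this example, and PFCTL can be overridden (PFCTL=echo) for a dry run.

```shell
#!/bin/sh
# Hypothetical helper: remove table entries older than a number of weeks.
#   $1 = table name, $2 = age in weeks
# PFCTL can be overridden, for example PFCTL=echo, to print the command.
expire_overload_table() {
    table=$1
    weeks=$2
    # pfctl -T expire wants seconds: weeks * 7 days * 24 h * 60 min * 60 s
    seconds=$((weeks * 7 * 24 * 60 * 60))
    ${PFCTL:-pfctl} -t "$table" -T expire "$seconds"
}
```

With six weeks that works out to 3628800 seconds, so `expire_overload_table bruteforce 6` run as root ends up calling `pfctl -t bruteforce -T expire 3628800`.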

A point that observers often miss is that with this configuration, you have a firewall that learns from the traffic it sees and adapts to network conditions.

It is also worth noting that state tracking actions can be applied to all TCP traffic and that they can be useful for essentially all services.

The buzzwordability potential in the learning configurations is enormous, and I for one fail to see how the big names have failed to copy or imitate this feature and greytrapping which we will look at later, and capitalize on products with those features.

The article Forcing the password gropers through a smaller hole with OpenBSD's PF queues has a few suggestions on how to handle noise sources with various other services. More on queues in a few moments.

The Adaptive Firewall and the Greytrapping Game

At the risk of showing my age, I must admit that I have more or less always run a mail service. Once TCP/IP networking became available in some form for even small businesses and individuals during the early 1990s, once you were connected, it was simply one of those things you would do. Setting up an SMTP service (initially wrestling with sendmail and its legendary sendmail.cf configuration file) with accompanying pop3 and/or imap service was the done thing.

Over time the choice of mail server software changed, and we introduced content filtering to beat the rise of the trashy, scammy spam mail and, since the majority of clients ran that operating system, mail-borne malware. But even with state of the art content filtering, some unwanted messages would make it into users' inboxes often enough to be annoying.

So when OpenBSD 3.3 shipped with the initial version of spamd it was quite a relief for people of my job category, even if that version would only load lists of known bad senders' IP addresses and stutter at them one byte per second until the other side gave up.

Later versions introduced greylisting — answering SMTP connections from previously unknown senders with a temporary local error code and only accepting delivery if the same host tried again — which reduced the load on the content filtering machines significantly, and the real fun started with the introduction of greytrapping in the version of spamd(8) that shipped with OpenBSD 3.7.

Greytrapping is yet another adaptive or learning feature. The system identifies bad actors by comparing the destination email address in incoming SMTP traffic from unknown or already greylisted hosts with a list of known invalid addresses in the domains the site serves. The spamdb(8) command was extended to add features to add addresses to and delete from the spamtrap list.
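Adding spamtraps is done with spamdb -T -a, one address at a time. If you have a file of candidate addresses, a loop along these lines will do. Treat it as a sketch: the function name is hypothetical, the "contains an @" check is deliberately crude, and SPAMDB can be overridden (SPAMDB=echo) to see what would be run.

```shell
#!/bin/sh
# Hypothetical helper: read addresses on stdin, one per line, and add each
# one to the spamtrap list.
# SPAMDB can be overridden, for example SPAMDB=echo, for a dry run.
add_spamtraps() {
    while IFS= read -r address; do
        case $address in
            # -T -a adds the key as a spamtrap address
            *@*) ${SPAMDB:-spamdb} -T -a "$address" ;;
            # blank lines and anything without an @ are silently skipped
        esac
    done
}
```

Usage would be something like `add_spamtraps < new_traps.txt`, where the file name is again just a placeholder.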

Greytrapping was an extremely welcome new feature, and I adopted it eagerly. Soon after the feature became available, I set up for greytrapping. The spamtrap addresses were initially the ones I fished out of my mail server logs (from entries produced by bounce messages that themselves turned out to be undeliverable at our end since the recipient did not exist), and after a few weeks I started publishing both the list of spamtraps and an hourly dump of currently trapped IP addresses.

The setup is amazingly easy. On a typical gateway in front of a mail server you instrument your /etc/pf.conf with a few lines, usually at the top,

  table <spamd-white> persist
  table <nospamd> persist file "/etc/mail/nospamd"
  pass in on egress proto tcp to any port smtp \
        divert-to 127.0.0.1 port spamd
  pass in on egress proto tcp from <nospamd> to any port smtp
  pass in log on egress proto tcp from <spamd-white> to any port smtp
  pass out log on egress proto tcp to any port smtp
    

Here we even suck in a file that contains the IP addresses of hosts that should not be subjected to the spamd treatment.

In addition you will need to set up with the correct options for spamd(8) and spamdlogd(8) in your /etc/rc.conf.local:

  spamd_flags="-v -G 2:8:864 -n "mailwalla 17.25" -c 1200 -C /etc/mail/fullchain.pem -K /etc/mail/privkey.pem -w 1 -y em1 -Y em1 -Y 158.36.191.225"
  spamdlogd_flags="-i em1 -Y 158.36.191.225"
      

The IP address here designates a sync partner, check out the spamd(8) man page for the other options. If you're interested, you can get the gory details of running a setup with several mail exchangers in the In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe article.

You probably do not need to edit the configuration file /etc/mail/spamd.conf much, but do look up the man page and possibly references to the bsdly.net blocklist. Finally, reload your PF configuration, start the daemons spamd(8) and spamdlogd(8) using rcctl, and set up a crontab(5) line to run spamd-setup(8) at reasonable intervals to fetch updated blocklists.

The number of trapped addresses in the hourly dump has been anything from a few hundred in the earliest days, later in the thousands and even at times in the hundreds of thousands. For the last couple of years the number has generally been in the mid to low four digits, with each host typically hanging around longer to try delivery to an ever expanding number of invalid addresses in their database.

Just a few weeks ago, the list of “imaginary friends” rolled past 300,000 entries. The article The Things Spammers Believe - A Tale of 300,000 Imaginary Friends tells the story with copious links to earlier articles and other resources, while Maintaining A Publicly Available Blacklist - Mechanisms And Principles details the work involved in maintaining a blocklist that is offered to the public.

It's been good fun, with a liberal helping of bizarre as the number of spamtraps grew, sometimes with truly weird contents.

Traffic Shaping You Can Actually Understand

You've heard it before: Traffic shaping is hard. Hard to do and hard to understand.

Traditionally traffic shaping was available on all BSDs in the form of ALTQ, a codebase that its developers labeled experimental and that contained implementations of several different traffic shaping algorithms. One central problem was that the configuration syntax was inelegant at best, even after the system was merged into the PF configuration.

In OpenBSD, which runs development on a strict six month release cycle, the code that would eventually replace ALTQ was introduced gradually over several releases.

The first feature to be introduced was always-on, settable priorities with the keyword prio.

A random example shows that this configuration prioritises ssh traffic above most others (the default is 3):

pass proto tcp to port ssh set prio 6

While this configuration makes an attempt at speeding up TCP traffic by assigning a higher priority to lowdelay packets, typically ACKs:

  match out on $ext_if proto tcp from $ext_if set prio (3, 7)
  match in  on $ext_if proto tcp to $ext_if set prio (3, 7)
	

Next up, the newqueue code did away with the multiple algorithms approach and settled on the Hierarchical fair-service curve (HFSC) as the most flexible option that would even make it possible to emulate or imitate the alternative shaping algorithms from the ALTQ experiment.

HFSC queues are defined on an interface with a hierarchy of child queues, where only the “leaf” queues can be assigned traffic. We take a look at a static allocation first:

  queue main on $ext_if bandwidth 20M
    queue defq parent main bandwidth 3600K default
    queue ftp parent main bandwidth 2000K
    queue udp parent main bandwidth 6000K
    queue web parent main bandwidth 4000K
    queue ssh parent main bandwidth 4000K
      queue ssh_interactive parent ssh bandwidth 800K
      queue ssh_bulk parent ssh bandwidth 3200K
    queue icmp parent main bandwidth 400K
  

You then tie in the queue assignment, here with match rules

  match log quick on $ext_if proto tcp to port ssh \
        queue (ssh_bulk, ssh_interactive)
  match in quick on $ext_if proto tcp to port ftp queue ftp
  match in quick on $ext_if proto tcp to port www queue web
  match out on $ext_if proto udp queue udp
  match out on $ext_if proto icmp queue icmp
  

which is definitely the way to add queueing to an existing configuration, and in my view also a good practice for configuration structure reasons. But you can also tack on queue this_or_that_queue at the end of pass rules.
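Going back to the static allocation for a moment: in that example the child queues add up exactly to what their parent has, 20000K under main and 4000K under ssh. If you want to check your own numbers, a small awk sketch like the following sums child bandwidth per parent. It is a toy with stated limits: the function name is made up, and it only understands bandwidth values given in K.

```shell
#!/bin/sh
# Hypothetical helper: read queue definitions on stdin and print the sum of
# the child queues' bandwidth per parent queue.
# Assumption: bandwidth values are written with a K suffix, as in 3600K.
queue_totals() {
    awk '$1 == "queue" {
        parent = ""; bw = ""
        # Find the words that follow the "parent" and "bandwidth" keywords
        for (i = 2; i < NF; i++) {
            if ($i == "parent") parent = $(i + 1)
            if ($i == "bandwidth") bw = $(i + 1)
        }
        # Root queues have no parent and are skipped
        if (parent != "" && bw ~ /^[0-9]+K$/)
            total[parent] += substr(bw, 1, length(bw) - 1)
    }
    END { for (p in total) printf "%s %dK\n", p, total[p] }' | sort
}
```

Feeding it the static queue definitions, for example with `queue_totals < /etc/pf.conf`, prints one line per parent queue that you can compare against the bandwidth set on the parent.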

There are two often forgotten facts about HFSC traffic shaping I would like to mention:

Traffic shaping is more often than not a matter of prioritizing which traffic you drop packets for, and no shaping at all takes place before the traffic volume approaches one or more of the limits set by the queue definitions.

One of the beautiful things about modern HFSC queueing is that you can build in flexibility, like this:

  queue rootq on $ext_if bandwidth 20M
    queue main parent rootq bandwidth 20479K min 1M max 20479K qlimit 100
    queue qdef parent main bandwidth 9600K min 6000K max 18M default
    queue qweb parent main bandwidth 9600K min 6000K max 18M
    queue qpri parent main bandwidth 700K min 100K max 1200K
    queue qdns parent main bandwidth 200K min 12K burst 600K for 3000ms
    queue spamd parent rootq bandwidth 1K min 0K max 1K qlimit 300
  
The min and max values are core to that flexibility. Subordinate queues can 'borrow' bandwidth up to their own max values within the allocation of the parent queue. The combined max queue bandwidth can exceed the root queue's bandwidth and still be valid. However, the allocation will always top out at the allocated or the actual physical limits of the interface the queue is configured on.

For bursty services such as DNS in our example you can allow burst for a specified time where the allocation can exceed the queue's max value, still within the limits set on the parent queue.

Finally, the qlimit sets the size of the queue's holding buffer. A larger buffer may lead to delays since packets may be kept longer in the buffer before being sent on their way out to the world.

And if you noticed the name of that final, tiny queue, you probably have guessed correctly what it was for. The traffic from hosts that were caught in the spamd net was really horrible, as this systat queues display shows:

 1 users Load 2.56 2.27 2.28                                      skapet.bsdly.net 20:55:50
 QUEUE                BW SCH  PRI    PKTS   BYTES   DROP_P   DROP_B QLEN BOR SUS  P/S   B/S
 rootq on bge0       20M                0       0        0        0    0            0     0
  main               20M                0       0        0        0    0            0     0
   qdef               9M          6416363   2338M      136    15371    0          462 30733
   qweb               9M           431590 144565K        0        0    0          0.6   480
   qpri               2M          2854556 181684K        5      390    0           79  5243
   qdns             100K           802874  68379K        0        0    0          0.6    52
  spamd               1K           596022  36021K  1177533 72871514  299            2   136
	    

It was good, clean fun. And that display did give me a feeling of Mission accomplished.

There are several other tools in the PF toolset such as carp(4) based redundancy for highly available service, relayd(8) for load balancing, application delivery and general network trickery, PF logs and the fact that tcpdump(8) is your friend, and several others that I have enjoyed using but I decided to skip since this was supposed to be a user group talk and a somewhat dense article.

I would encourage you to explore those topics further via the literature listed under the Resources heading.

Who Else Uses PF Today?

PF originated in OpenBSD, but word of the new subsystem reached other projects quickly and there was considerable interest from the very start.  Over the years, PF has been ported from the original OpenBSD to the other BSDs and a few other systems, including

Other than Oracle with their port to Solaris, most ports of the PF subsystem happened before the OpenBSD 4.7 NAT rewrite, and for that reason they have kept the previous syntax intact.

There may very well be others. There is no duty to actually advertise the fact that you have incorporated BSD licensed code in your product.

If you find other products using PF or other OpenBSD code in the wild, I am interested in hearing from you about it. Please comment or send email to nix at nxdomain dot no.

Resources for Further Exploration

The PF User's Guide

The Book of PF by Peter N. M. Hansteen

Absolute OpenBSD by Michael Lucas

Network Management with the OpenBSD Packet Filter toolset, by Peter N. M. Hansteen, Massimiliano Stucchi and Tom Smyth (A PF tutorial, this is the EuroBSDCon 2022 edition). An earlier, even more extensive set of slides can be found in the 2016-vintage PF tutorial.

That Grumpy BSD Guy Blog posts by Peter N. M. Hansteen

OpenBSD Journal News items about OpenBSD, generally short with references to material elsewhere.