Sunday, September 25, 2022

A Few of My Favorite Things About The OpenBSD Packet Filter Tools

The OpenBSD packet filter PF was introduced a little more than 20 years ago as part of OpenBSD 3.0. We'll take a short tour of PF features and tools that I have enjoyed using.



NOTE: If you are more of a slides person, the condensate for a SEMIBUG user group meeting is available here. A version without trackers but “classical” formatting is available here.

At the time the OpenBSD project introduced its new packet filter subsystem in 2001, I was nowhere near the essentially full time OpenBSD user I would soon become. I did however quickly recognize that even what was later dubbed “the working prototype” was reported to perform better in most contexts than the code it replaced.

The reason PF's predecessor needed to be replaced has been covered extensively by myself and others elsewhere, so I'll limit myself to noting that the reason was that several somebodies finally read and understood the code's license and decided that it was not in fact open source in any acceptable meaning of the term.

Anyway the initial PF release was very close in features and syntax to the code it replaced. And even at that time, the config syntax was a lot more human readable than the alternative I had been handling up to then, which was Linux' IPtables. The less is said about IPtables, the better.

But soon visible improvements in user friendliness, or at least admin friendliness, started turning up. With OpenBSD 3.2, the separate /etc/nat.conf network adress translation configuration file moved to the attic and the NAT and redirection options moved into the main PF config file /etc/pf.conf.

The next version, OpenBSD 3.3, saw the ALTQ queueing configuration move into pf.conf as well, and the previously separate altq.conf file became obsolete. What did not change, however, was the syntax, which was to remain just bothersome enough that many of us put off playing with traffic shaping until some years later. Other PF news in that release included anchors, or named sub-rulesets, as well as tables, described as "a very efficient way for large address lists in rules" and the initial release of spamd(8), the spam deferral daemon.

More on all of these things later, I will not bore you with a detailed history of PF features introduced or changed in OpenBSD over the last twenty-some years.

PF Rulesets: The Basics

So how do we go about writing that perfect firewall config?

I could go on about that at length, and I have been known to on occasion, but let us start with the simplest possible, yet absolutely secure PF ruleset:

block

With that in place, you are totally secure. No traffic will pass.

Or as they say in the trade, you have virtually unplugged yourself from the rest of the world.

By way of getting ahead of ourselves, that particular ruleset will expand to the following:

block drop all

But we are getting ahead of ourselves.

To provide you with a few tools and some context, these are the basic building blocks of a PF rule:

verb criteria action ... options

Here are a few sample rules to put it into context, all lifted from configurations I have put into production:

pass in on egress proto tcp to egress port ssh

This first sample says that if a packet arrives on the egress — an interface belonging to the group of interfaces that has a default route — and that packet is a TCP packet with a destination service ssh, let the packet pass to the interfaces belonging to the egress interface group.

Yes, when you write PF rulesets, you do not necessarily need to write port numbers for services and memorize what services hide behind port 80, 53 or 443. The common or standard services are known to the rules parsing part of pfctl(8), generally with the service names you can look up in the /etc/services file.

The interface groups concept is as far as I know an OpenBSD innovation. You can put interfaces into logical groups and reference the group name in PF configurations. A few default interface groups exist without you doing anything, egress is one, another common one is wlan where all configured WiFi interfaces are members by default. Keep in mind that you can create your own interface groups — set them up using ifconfig(8) — and refer to them in your rules.

match out on egress nat-to egress

This one matches outbound traffic, again on egress (which in the simpler cases consists of one interface) and applies the nat-to action on the packets, transforming them so that the next hops all the way to the destination will see packets where the source address is equal to the egress interface's address. If your network runs IPv4 and you have only one routeable address assigned, you will more than likely have something like this configured on your Internet-facing gateway.

It is worth noting that early PF versions did not have the match verb. After a few years of PF practice, developers and practitioners alike saw the need for a way to apply actions such as nat-to or other transformations without making a decision on whether to pass or block the traffic. The match keyword arrived in OpenBSD 4.6 and in retrospect seems like a prelude to more extensive changes that followed over the next few releases.

Next up is a variation on the initial absolutely secure ruleset.

block all

I will tell you now so you will not be surprised later: If you had made a configuration with those three rules in that order, your configuration would be functionally the same as the one word one we started with. This is because in PF configurations, the rules are evaluated from top to bottom, and the last matching rule wins.

The only escape from this progression is to insert a quick modifier after the verb, as in

pass quick from (self)

which will stop evaluation when a packet matches the criteria in the quick rule. Please use sparingly if at all.

There is a specific reason why PF behaves like this. The system that PF replaced in OpenBSD had the top to bottom, last match wins logic, and the developers did not want to break existing configurations too badly during the transition away from the old system.

So in practice you would put them in this order for a more functional setup,

  block all
  match out on egress nat-to egress
  pass in on egress proto tcp to egress port ssh
    

but likely supplemented by a few other items.

For those supplementing items, we can take a look at some of the PF features that can help you write readable and maintainable rulesets. And while a readable ruleset is not automatically a more secure one, readability certainly helps spot errors in your logic that could put the systems and users in your care in reach of potential threats.

To help that readability, it is important to be aware of these features:

Options: General configuration options that set the parameters for the ruleset, such as

  set limit states 100000
  set debug debug
  set loginterface dc0
  set timeout tcp.first 120 
  set timeout tcp.established 86400 
  set timeout { adaptive.start 6000, adaptive.end 12000 }
  

If the meaning of some of those do not seem terribly obvious to you at this point, that's fine. They are all extensively documented in the pf.conf man page.

Macros: Content that will expand in place, such as lists of services, interface names or other items you feel useful. Some examples along with rules that use them:

  ext_if = "kue0" 
  all_ifs = "{" $ext_if lo0 "}" 
  pass out on $ext_if from any to any 
  pass in  on $ext_if proto tcp from any to any port 25
  

Keep in mind that if your macros expand to lists of either ports or IP addresses, the macro expansion will create several rules to cover your definitions in the ruleset that is eventually loaded.

Tables: Data structures that are specifically designed to store IP addresses and networks. Originally devised to be a more efficient way to store IP addresses than macros that contained IP addresses and expanded to several rules that needed to be evaluated separately. Rules can refer to tables so the rule will match any member of the table.

  table <badhosts> persist counters file "/home/peter/badhosts"
  # ...
  block from <badhosts>
      

Here the table is loaded from a file. You can also initialize a table in pf.conf itself, and you can even manipulate table contents from the command line without reloading the rules:

$ doas pfctl -t badhosts -T add 192.0.2.11 2001:db8::dead:beef:baad:f00d

In addition, several of the daemons in the OpenBSD base system such as spamd, bgpd and dhcpd can be set up to interact with your PF rules.

Rules: The rules with the verbs, criteria and actions that determine how your system handles network traffic.

A very simple and reasonable baseline is one that blocks all incoming traffic but allows all traffic initiated on the local system:

  block
  pass from (self)
      

The pass rule lets our traffic pass to elsewhere, and since PF is a stateful firewall by default, return traffic for the connections the local system sends out will be allowed back.

You probably noticed the configuration here references something called (self).

The string self is a default macro which expands to all configured local interfaces on the host. Here, self is set inside parentheses () which indicates that one or more of the interfaces in self may have dynamically allocated addresses and that PF will detect any changes in the configured interface IP addresses.

This exact ruleset expanded to this on my laptop in my home network at one point:

 $ doas pfctl -vnf /etc/pf.conf
   block drop all
   pass inet6 from ::1 to any flags S/SA
   pass on lo0 inet6 from fe80::1 to any flags S/SA
   pass on iwm0 inet6 from fe80::a2a8:cdff:fe63:abb9 to any flags S/SA
   pass inet6 from 2001:470:28:658:a2a8:cdff:fe63:abb9 to any flags S/SA
   pass inet6 from 2001:470:28:658:8c43:4c81:e110:9d83 to any flags S/SA
   pass inet from 127.0.0.1 to any flags S/SA
   pass inet from 192.168.103.126 to any flags S/SA

The pfctl command here says to verbosely parse but do not load rules from the file /etc/pf.conf.

This shows what the loaded ruleset will be, after any macro expansions or optimizations.

For that exact reason, it is strongly recommended to review the output of pfctl -vnf on any configuration you write before loading it as your running configuration.

If you look closely at that command output, you will see both the inet and inet6 keywords. These designate IPv4 and IPv6 addresses respectively. PF since the earliest days has supported both, and if you do not specify which address family your rule applies to, it will apply to both.

But this has all been on a boring single host configuration. In my experience, the more interesting settings for PF use is when the configuration is for a host that handles traffic for other hosts, as a gateway or other intermediate host.

To forward traffic to and from other hosts, you need to enable forwarding. You can do that from the command line:

 # sysctl net.inet.ip.forwarding=1 
 # sysctl net.inet6.ip6.forwarding=1
	

But you will want to make the change permanent by putting the following lines in your /etc/sysctl.conf so the change survives reboots.

  net.inet.ip.forwarding=1 
  net.inet6.ip6.forwarding=1
	

With these settings in place, a configuration (/etc/pf.conf) like this might make sense if your system has two network interfaces that are both of the bge kind:

  ext_if=bge0
  int_if=bge1
  client_out = "{ ftp-data ftp ssh domain pop3, imaps nntp https }"
  udp_services = "{ domain ntp }"
  icmp_types = "echoreq unreach"
  match out on egress inet nat-to ($ext_if)
  block
  pass inet proto icmp all icmp-type $icmp_types keep state
  pass quick proto { tcp, udp } to port $udp_services keep state
  pass proto tcp from $int_if:network to port $client_out
  pass proto tcp to self port ssh
	

Your network likely differs in one or more ways from this example. See the references at the end for a more thorough treatment of all these options.

And once again, please do use the readability features of the PF syntax to keep you sane and safe.

A Configuration That Learns From Network Traffic Seen and Adapts To Conditions

With PF, you can create a network that learns. Fairly early in PF's history it occured to the developers that the network stack collects and keeps track of information about the traffic it sees, which could then be acted upon if the software became able to actively monitor the data and act on specified changes. So the state tracking options entered the pf.conf repertoire in their initial form with the OpenBSD 3.7 release.

A common use case is when you run an SSH service or really any kind of listening service with the option to log in, you will see some number of failed authentication attempts that generate noise in the logs. The password guessing, or as some of us say, password groping, can turn to be pretty annoying even if the miscreants do not actually manage to compromise any of your systems. So to eliminate noise in our logs we turn to the data that is anyway available in the state table, to track the state of active connections, and to act on limits you define such as number of connections from a single host over a set number of seconds.

The action could be to add the source IP that tripped the limit to a table. Additional rules could then subject the members of that table to special treatment. Since that time, my internet-facing rule sets have tended to include variations on

  table <bruteforce> persist
  block quick from <bruteforce>
  pass inet proto tcp from any to $localnet port $tcp_services \
        flags S/SA keep state \
	(max-src-conn 100, max-src-conn-rate 15/5, \
         overload <bruteforce> flush global)
	

which means that any host that tries more than 100 simultaneous connections or more than 15 new connections over 5 seconds are added to the table and blocked, with any existing connections terminated.

It is a good practice to let table entries in such setups expire eventually. How long entries stay is entirely up to you.

At first I set expiry at 24 hours, but with password gropers like those caught by this rule being what they are, I switched a few years ago to at four weeks at first, then upped again a few months later to six weeks. Groperbots tend to stay broken for that long. And since they target any service you may be running, state tracking options with overload tables can be useful in a lot of non-SSH contexts as well.

A point that observers often miss is that with this configuration, you have a firewall that learns from the traffic it sees and adapts to network conditions.

It is also worth noting that state tracking actions can be applied to all TCP traffic and that they can be useful for essentially all services.

The buzzwordability potential in the learning configurations is enormous, and I for one fail to see how the big names have failed to copy or imitate this feature and greytrapping which we will look at later, and capitalize on products with those features.

The article Forcing the password gropers through a smaller hole with OpenBSD's PF queues has a few suggestions on how to handle noise sources with various other services. More on queues in a few moments.

The Adaptive Firewall and the Greytrapping Game

At the risk of showing my age, I must admit that I have more or less always run a mail service. Once TCP/IP networking became available in some form for even small businesses and individuals during the early 1990s, once you were connected, it was simply one of those things you would do. Setting up an SMTP service (initially wrestling with sendmail and it legendary sendmail.cf configuration file) with accompanying pop3 and/or imap service was the done thing.

Over time the choice of mail server software changed, we introduced content filtering to beat the rise of the trashy, scanny spam mail and, since the majority of clients ran that operating system mail-borne malware. But even with state of the art content filtering some unwanted messages would make it into users' inboxes often enough to be annoying.

So when OpenBSD 3.3 shipped with the initial version of spamd it was quite a relief for people of my job category, even if that only would load lists of known bad senders' IP addresses and stutter at them one byte per second until the other side gave up.

Later versions introduced greylisting — answering SMTP connections from previously unknown senders with a temporary local error code and only accepting delivery if the same host tried again — which reduced the load on the content filtering machines significantly, and the real fun started with the introduction of greytrapping in the version of spamd(8) that shipped with OpenBSD 3.7.

Greytrapping is yet another adaptive or learning feature. The system identifies bad actors by comparing the destination email address in incoming SMTP traffic from unknown or already greylisted hosts with a list of known invalid addresses in the domains the site serves. The spamdb(8) command was extended to add features to add addresses to and delete from the spamtrap list.

Greytrapping was an extremely welcome new feature, and I adopted it eagerly. Soon after the feature became available, I set up for greytrapping. The spamtrap addresses were the ones initially addresses I fished out of my mail server logs — from entries produced by bounce messages that themselves turned out to be undeliverable at our end since the recipient did not exist — and after a few weeks I started publishing both the list of spamtraps and an hourly dump of currently trapped IP addresses.

The setup is amazingly easy. On a typical gateway in front of a mail server you instrument your /etc/pf.conf with a few lines, usually at the top,

  table <spamd-white> persist
  table <nospamd> persist file "/etc/mail/nospamd"
  pass in on egress proto tcp to any port smtp \
        divert-to 127.0.0.1 port spamd
  pass in on egress proto tcp from <nospamd> to any port smtp
  pass in log on egress proto tcp from <spamd-white> to any port smtp
  pass out log on egress proto tcp to any port smtp
    

Here we even suck in a file that contains the IP addresses of hosts that should not be subjected to the spamd treatment.

In addition you will need to set up with the correct options for spamd(8) and spamdlogd(8) in your /etc/rc.conf.local:

  spamd_flags="-v -G 2:8:864 -n "mailwalla 17.25" -c 1200 -C /etc/mail/fullchain.pem -K /etc/mail/privkey.pem -w 1 -y em1 -Y em1 -Y 158.36.191.225"
  spamdlogd_flags="-i em1 -Y 158.36.191.225"
      

The IP address here designates a sync partner, check out the spamd(8) man page for the other options. If you're interested, you can get the gory details of running a setup with several mail exchangers in the In The Name Of Sane Email: Setting Up OpenBSD's spamd(8) With Secondary MXes In Play - A Full Recipe article.

You probably do not need to edit the configuration file /etc/mail/spamd.conf much, but do look up the man page and possibly references to the bsdly.net blocklist. Finally, reload your PF configuration, start the daemons spamd(8) and spamdlogd(8) using rcctl, set up a crontab(5) line to run spamd-setup(8) at reasonable intervals to fetch updated blocklists.

The number of trapped addresses in the hourly dump has been anything from a few hundred in the earliest days, later in the thousands and even at times in the hundreds of thousands. For the last couple of years the number has generally been in the mid to low four digits, with each host typically hanging around longer to try delivery to an ever expanding number of invalid addresses in their database.

Just a few weeks ago, the list of “imaginary friends” rolled past 300,000 entries. The article The Things Spammers Believe - A Tale of 300,000 Imaginary Friends tells the story with copious links to earlier articles and other resources, while Maintaining A Publicly Available Blacklist - Mechanisms And Principles details the work involved in maintaining a blocklist that is offered to the public.

It's been good fun, with a liberal helping of bizarre as the number of spamtraps grew, sometimes with truly weird contents.

Traffic Shaping You Can Actually Understand

You've heard it before: Traffic shaping is hard. Hard to do and hard to understand.

Traditionally traffic shaping was available on all BSDs in the form of ALTQ, a codebase that its developers labeled experimental and contained implementations of several different traffic shaping algorithms. One central problem was that the configuration syntax was inelegant at best, even after the system was merged into the PF configuration.

In OpenBSD, which runs development on a strict six month release cycle, the code that would eventually replace ALTQ was introduced gradually over several releases.

The first feature to be introduced was always-on, settable priorities with the keyword prio.

A random example shows that this configuration prioritises ssh traffic above most others (the default is 3):

pass proto tcp to port ssh set prio 6

While this configuration makes an attempt at speeding up TCP traffic by assigning a higher priority to lowdelay packets, typically ACKs:

  match out on $ext_if proto tcp from $ext_if set prio (3, 7)
  match in  on $ext_if proto tcp to $ext_if set prio (3, 7)
	

Next up, the newqueue code did away with the multiple algorithms approach and settled on the Hierarchical fair-service curve (HFSC) as the most flexible option that would even make it possible to emulate or imitate the alternative shaping algorithms from the ALTQ experiment.

HFSC queues are defined on an interface with a hierarchy of child queues, where only the “leaf” queues can be assigned traffic. We take a look at a static allocation first:

  queue main on $ext_if bandwidth 20M
    queue defq parent main bandwidth 3600K default
    queue ftp parent main bandwidth 2000K
    queue udp parent main bandwidth 6000K
    queue web parent main bandwidth 4000K
    queue ssh parent main bandwidth 4000K
      queue ssh_interactive parent ssh bandwidth 800K
      queue ssh_bulk parent ssh bandwidth 3200K
    queue icmp parent main bandwidth 400K
  

You then tie in the queue assignment, here with match rules

  match log quick on $ext_if proto tcp to port ssh \
        queue (ssh_bulk, ssh_interactive)
  match in quick on $ext_if proto tcp to port ftp queue ftp
  match in quick on $ext_if proto tcp to port www queue http
  match out on $ext_if proto udp queue udp
  match out on $ext_if proto icmp queue icmp
  

which is definitely the way to add queueing to an existing configuration, and in my view also a good practice for configuration structure reasons. But you can also tack on queue this_or_that_queue at the end of pass rules.

There are two often forgotten facts about HFSC traffic shaping I would like to mention:

Traffic shaping is more often than not a matter of prioritizing which traffic you drop packets for, and no shaping at all takes place before the traffic volume approaches one or more of the limits set by the queue definitions.

One of the beautiful things about modern HFSC queueing is that you can build in flexibility, like this:

  queue rootq on $ext_if bandwidth 20M
    queue main parent rootq bandwidth 20479K min 1M max 20479K qlimit 100
    queue qdef parent main bandwidth 9600K min 6000K max 18M default
    queue qweb parent main bandwidth 9600K min 6000K max 18M
    queue qpri parent main bandwidth 700K min 100K max 1200K
    queue qdns parent main bandwidth 200K min 12K burst 600K for 3000ms
    queue spamd parent rootq bandwidth 1K min 0K max 1K qlimit 300
  
The min and max values are core to that flexibility. Subordinate queues can 'borrow' bandwidth up to their own max values within the allocation of the parent queue. The combined max queue bandwidth can exceed the root queue's bandwith and still be valid. However the allocation will always top out at the allocated or the actual physical limits of the interface the queue is configured on.

For bursty services such as DNS in our example you can allow burst for a specified time where the allocation can exceed the queue's max value, still within the limits set on the parent queue.

Finally, the qlimit sets the size of the queue's holding buffer. A larger buffer may lead to delays since it packets may be kept longer in the buffer before sending on their way out to the world.

And if you noticed the name of that final, tiny queue, you probably have guessed correctly what it was for. The traffic from hosts that were caught in the spamd net was really horrible, as this systat queues display shows:

 1 users Load 2.56 2.27 2.28                                      skapet.bsdly.net 20:55:50
 QUEUE                BW SCH  PRI    PKTS   BYTES   DROP_P   DROP_B QLEN BOR SUS  P/S   B/S
 rootq on bge0       20M                0       0        0        0    0            0     0
  main               20M                0       0        0        0    0            0     0
   qdef               9M          6416363   2338M      136    15371    0          462 30733
   qweb               9M           431590 144565K        0        0    0          0.6   480
   qpri               2M          2854556 181684K        5      390    0           79  5243
   qdns             100K           802874  68379K        0        0    0          0.6    52
  spamd               1K           596022  36021K  1177533 72871514  299            2   136
	    

It was good, clean fun. And that display did give me a feeling of Mission accomplished.

There are several other tools in the PF toolset such as carp(4) based redundancy for highly available service, relayd(8) for load balancing, application delivery and general network trickery, PF logs and the fact that tcpdump(8) is your friend, and several others that I have enjoyed using but I decided to skip since this was supposed to be a user group talk and a somewhat dense article.

I would encourage you to explore those topics further via the literature listed under the Resources heading for more on these.

Who Else Uses PF Today?

PF originated in OpenBSD, but word of the new subsystem reached other projects quickly and there was considerable interest from the very start.  Over the years, PF has been ported from the original OpenBSD to the other BSDs and a few other systems, including

Other than Oracle with their port to Solaris, most ports of the PF subsystem happened before the OpenBSD 4.7 NAT rewrite, and for that reason they have kept the previous syntax intact.

There may very well be others. There is no duty to actually advertise the fact that you have incorporated BSD licensed code in your product.

If you find other products using PF or other OpenBSD code in the wild, I am interested in hearing from you about it. Please comment or send email to nix at nxdomain dot no.

Resources for Further Exploration

The PF User's Guide

The Book of PF by Peter N. M. Hansteen

Absolute OpenBSD by Michael Lucas

Network Management with the OpenBSD Packet Filter toolset, by Peter N. M. Hansteen, Massimiliano Stucchi and Tom Smyth (A PF tutorial, this is the EuroBSDCon 2022 edition). An earlier, even more extensive set of slides can be found in the 2016-vintage PF tutorial.

That Grumpy BSD Guy Blog posts by Peter N. M. Hansteen

OpenBSD Journal News items about OpenBSD, generally short with references to material elsewhere.

No comments:

Post a Comment

Note: Comments are moderated. On-topic messages will be liberated from the holding queue at semi-random (hopefully short) intervals.

I invite comment on all aspects of the material I publish and I read all submitted comments. I occasionally respond in comments, but please do not assume that your comment will compel me to produce a public or immediate response.

Please note that comments consisting of only a single word or only a URL with no indication why that link is useful in the context will be immediately recycled so those poor electrons get another shot at a meaningful existence.

If your suggestions are useful enough to make me write on a specific topic, I will do my best to give credit where credit is due.