Almost everybody knows what a firewall is, and nobody disputes the need for one in a network design or to protect a host.
Linux has a very powerful firewall: iptables. It is actually more than just a firewall and can perform:
- Filtering
- NAT/PAT
- Packet mangling
- Packet marking
In this post, we will just look at the filtering aspect.
Historically, firewalls started out as simple filters: a firewall would drop or reject a packet based on the source and destination addresses, the protocol used (IP, ICMP, TCP, UDP) and the source and destination ports. However, this proved to have some serious limitations: for instance, an administrator had to write every rule in both directions, such as
permit TCP 10.0.0.0 255.255.255.0 -> 192.168.0.0 255.255.255.0 port 80
and
permit TCP 192.168.0.0 255.255.255.0 port 80 -> 10.0.0.0 255.255.255.0
This unfortunately has a very serious side effect: any machine on the 192.168.0.0/24 network can access any machine on the 10.0.0.0/24 network on any TCP port, provided that the source port is set to 80!
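To make the flaw concrete, the two stateless rules above could be written in iptables syntax roughly as follows. This is only a sketch: the use of the FORWARD chain is an assumption (the filtering machine routes between the two networks), and IPT is set to "echo iptables" so that the commands are printed rather than applied.

```shell
# Dry run: print the commands. Set IPT=iptables and run as root to apply them.
IPT="echo iptables"

stateless_pair() {
    # Clients on 10.0.0.0/24 may reach web servers on 192.168.0.0/24
    $IPT -A FORWARD -p tcp -s 10.0.0.0/24 -d 192.168.0.0/24 --dport 80 -j ACCEPT
    # "Return" rule: it matches ANY TCP packet whose source port is 80,
    # whatever its destination port, which is the hole described above
    $IPT -A FORWARD -p tcp -s 192.168.0.0/24 --sport 80 -d 10.0.0.0/24 -j ACCEPT
}

stateless_pair
```

Nothing in the second rule ties the packet to a connection that was opened by the first one.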
To address that, certain vendors developed workarounds such as the "established" keyword, which checks that the packet doesn't have only the SYN flag set. However, while this prevents establishing a session, it still permits someone to scan the inside network by using the same trick as previously and forcing the TCP flags to ACK, PSH or anything that is not a bare SYN.
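iptables offers a comparable stateless test through the --syn option of the tcp match, which can be negated. The sketch below is an illustration of the idea rather than a recommendation; the FORWARD chain is an assumption and IPT is set to "echo iptables" so that the command is only printed.

```shell
# Dry run: print the command. Set IPT=iptables and run as root to apply it.
IPT="echo iptables"

established_like() {
    # '! --syn' matches every TCP packet that is not a connection request.
    # A lone ACK or FIN sent from source port 80 still passes, so the
    # inside network can still be scanned.
    $IPT -A FORWARD -p tcp -s 192.168.0.0/24 --sport 80 -d 10.0.0.0/24 ! --syn -j ACCEPT
}

established_like
```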
Then came the stateful firewalls! The idea is simple: the firewall, having access to the packets, can track the state of each session and allow back only the packets leading to an adjacent valid state. For example, if host 192.168.0.1 has sent a SYN packet to 10.0.0.1 from port 3340 to port 80, the only valid replies are either a RST from 10.0.0.1 port 80 to 192.168.0.1 port 3340 (port 80 on 10.0.0.1 is closed or the connection has been refused), or a SYN-ACK from 10.0.0.1 port 80 to 192.168.0.1 port 3340. In the latter case, the connection is "half open", and the only valid next packets are a RST (the connection is being dropped) or an ACK from 192.168.0.1 port 3340 to 10.0.0.1 port 80, at which point the connection is fully open and the data transfer can start.
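With iptables, this kind of tracking is exposed through the state match. As a rough sketch of the example above (192.168.0.0/24 clients talking to web servers on 10.0.0.0/24), and assuming again that the rules live in the FORWARD chain, the return rule no longer has to trust the source port alone. IPT is set to "echo iptables" so the commands are printed, not applied.

```shell
# Dry run: print the commands. Set IPT=iptables and run as root to apply them.
IPT="echo iptables"

stateful_pair() {
    # Outgoing direction: new and already tracked connections to port 80
    $IPT -A FORWARD -p tcp -s 192.168.0.0/24 -d 10.0.0.0/24 --dport 80 \
         -m state --state NEW,ESTABLISHED -j ACCEPT
    # Return direction: only packets that belong to a tracked connection
    $IPT -A FORWARD -p tcp -s 10.0.0.0/24 --sport 80 -d 192.168.0.0/24 \
         -m state --state ESTABLISHED -j ACCEPT
}

stateful_pair
```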
This brought a slight disadvantage: in order to be able to track the session, the firewall needs to see all the packets. Asymmetric routing is no longer possible and, if load balancing between multiple firewalls is needed, either the state tables need to be shared between all the members or the load balancing needs to take the sessions into account.
There is also another issue: while the notion of state is well defined for TCP, UDP and ICMP are connectionless, meaning that there is no notion of session. So how to proceed in this case? The usual answer is to consider that the first datagram "opens" the "session", and that anything in return on the same set of ports is to be accepted. For ICMP, it is a bit different: there is no notion of port, so the ICMP Type is used instead. For instance, an ICMP Type 8 indicates an "Echo Request", for which there are only a few valid replies: Type 0 (Echo Reply), Type 3 (Unreachable), Type 4 (Source Quench) or Type 11 (TTL Expired).
TCP has a mechanism to indicate to a sender that a segment has been received out of state, or that the port is closed, by replying with the RST (reset) flag set. UDP doesn't have the same mechanism, and instead relies on ICMP Type 3 Code 3 ("Port Unreachable") to convey the information.
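A firewall that rejects a packet, rather than silently dropping it, can imitate both behaviours. In iptables this is the job of the REJECT target and its --reject-with option. The sketch below is only an illustration: the port numbers (113 and 161) are arbitrary, and IPT is set to "echo iptables" so that the commands are printed rather than applied.

```shell
# Dry run: print the commands. Set IPT=iptables and run as root to apply them.
IPT="echo iptables"

reject_examples() {
    # TCP: answer with a RST, as a closed port would
    $IPT -A INPUT -p tcp --dport 113 -j REJECT --reject-with tcp-reset
    # UDP: answer with ICMP Type 3 Code 3 (Port Unreachable)
    $IPT -A INPUT -p udp --dport 161 -j REJECT --reject-with icmp-port-unreachable
}

reject_examples
```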
Okay, now that we set the scene, let's start with iptables.
As I said, iptables is also a firewall. It has a modular design that allows for the quick development and implementation of new protocols and services. More on this a bit later. It operates by using chains: a chain is a collection of rules whose possible actions are, for the packet filter, to accept, drop or reject a packet, or even to call another chain. This feature allows an almost "programmatic" view of the rules, with calls and returns, code sharing and so forth.
Let's start with an example.
I have a small web, name and mail server exposed to the Internet. It has a public IP assigned directly, so I won't need NAT, and it doesn't provide access to a network: this is purely a server.
It needs to allow SMTP, POP/IMAP, DNS, HTTP and HTTPS access from the Internet, and SSH access only from a small number of IPs. I also want a blacklist for the undesirable IPs and a noise list to suppress the various unsolicited packets generated by neighboring machines (NetBIOS, DHCP, ...). Lastly, we want to keep track of the various regions as defined by the IANA.
The rule set will look like this:
Don't panic! It looks way more complex than it really is.
Note: iptables does more than filtering by the use of various tables. The "filter" is one table among many, the others include "nat" for address translation, "mangle" for various operations on packets and "raw" for very, very specialized treatment. In the following, we will use the "filter" table, which is also the default table.
The "filter" table comes with 3 chains:
- INPUT - for packets destined to local sockets
- OUTPUT - for packets locally generated
- FORWARD - for packets that are routed between two subnets
Our server is not used as a router, so the only chains of concern are INPUT and OUTPUT. These built-in chains also take a policy: this is the default action to take when none of the rules matches the packet. There are 4 built-in actions (targets):
- ACCEPT - the packet is let through
- DROP - the packet is discarded
- QUEUE - the packet is put in the queue to the userspace. This won't be discussed here.
- RETURN - the process returns to the calling chain
For the INPUT chain, the only two targets that make sense as a policy are ACCEPT and DROP. This is the equivalent of an implicit last rule.
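The policy of a built-in chain is set with the -P switch. As a minimal sketch, assuming we want to discard anything not explicitly accepted on input while letting locally generated traffic out, this could look as follows. IPT is set to "echo iptables" so the commands are only printed; on a remote machine, a DROP policy applied before the accept rules exist would cut the current session.

```shell
# Dry run: print the commands. Set IPT=iptables and run as root to apply them.
IPT="echo iptables"

set_policies() {
    # Anything that no INPUT rule accepts is discarded
    $IPT -P INPUT DROP
    # Locally generated traffic is let out
    $IPT -P OUTPUT ACCEPT
}

set_policies
```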
To create a new chain, the -N <chain> switch is used. In our case, we need to create the chains NOISE, BLACKHOLE, REGIONS, APNIC, LACNIC, ARIN, RIPE, AFRINIC and SSH.
iptables -N NOISE
iptables -N BLACKHOLE
iptables -N REGIONS
iptables -N APNIC
iptables -N LACNIC
iptables -N ARIN
iptables -N RIPE
iptables -N AFRINIC
iptables -N SSH
To verify the chains and their content, the -L [<chain>] switch can be used. There are also the -n (numerical) and -v (verbose) switches.
iptables -L -nv
iptables -L INPUT -n
iptables -L REGIONS
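Two other listing forms can be handy when a chain grows: --line-numbers prefixes each rule with its position in the chain, and -S prints the rules in the same syntax used to create them. A small sketch, with IPT set to "echo iptables" so that it only prints the commands it would run:

```shell
# Dry run: print the commands. Set IPT=iptables and run as root to execute them.
IPT="echo iptables"

inspect_input() {
    # Numbered, numeric and verbose listing of the INPUT chain
    $IPT -L INPUT -n -v --line-numbers
    # Same chain, printed as the commands that would recreate it
    $IPT -S INPUT
}

inspect_input
```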
Let's populate the INPUT chain. The order in which I want the process to occur is:
- Drop the NOISE
- Drop the BLACKHOLE
- Accept new HTTP sessions (go to REGIONS)
- Accept new HTTPS sessions (go to REGIONS)
- Accept new DNS sessions (go to REGIONS)
- Accept new SSH sessions (go to SSH)
- Accept already established sessions
- Drop and log everything else
This translates to:
iptables -A INPUT -i eth0 -j NOISE
iptables -A INPUT -i eth0 -j BLACKHOLE
iptables -A INPUT -i eth0 -p tcp -m tcp --dport 80 \
-m state --state NEW -j REGIONS
iptables -A INPUT -i eth0 -p tcp -m tcp --dport 443 \
-m state --state NEW -j REGIONS
iptables -A INPUT -i eth0 -p tcp -m tcp --dport 53 \
-m state --state NEW -j REGIONS
iptables -A INPUT -i eth0 -p udp -m udp --dport 53 \
-m state --state NEW -j REGIONS
iptables -A INPUT -i eth0 -p tcp -m tcp --dport 22 \
-m state --state NEW -j SSH
iptables -A INPUT -i eth0 -m state --state ESTABLISHED \
-j ACCEPT
iptables -A INPUT -i eth0 -j LOG
iptables -A INPUT -i eth0 -j DROP
Note: the LOG target requires that the ipt_LOG module be loaded.
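On a busy public address, a bare LOG rule can fill the logs quickly. One possible refinement, sketched here with arbitrary values, is to combine LOG with the limit match and to add a prefix that makes the entries easy to find. The rate, burst, prefix and log level below are assumptions to adapt, and IPT is set to "echo iptables" so that the command is printed rather than applied.

```shell
# Dry run: print the command. Set IPT=iptables and run as root to apply it.
IPT="echo iptables"

log_with_limit() {
    # Log at most 5 packets per minute on average (after a burst of 10),
    # tagging each entry so it can be filtered out of the system log
    $IPT -A INPUT -i eth0 -m limit --limit 5/min --limit-burst 10 \
         -j LOG --log-prefix "INPUT-DROP: " --log-level 4
}

log_with_limit
```

Packets over the limit simply do not match this rule and carry on to the next one, so they are still dropped, only without a log entry.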
There are a few switches that I will describe later.
If we left the firewall that way, nothing would go through. Why? Because the REGIONS and SSH chains are empty, and would just return to INPUT. The process would go down to the DROP target, at which point the packet would be discarded.
Let's then populate REGIONS:
iptables -A REGIONS -j APNIC
iptables -A REGIONS -j LACNIC
iptables -A REGIONS -j ARIN
iptables -A REGIONS -j RIPE
iptables -A REGIONS -j AFRINIC
iptables -A REGIONS -j ACCEPT
The closing ACCEPT target makes sure that IPs in subnets not allocated to a region are still processed. At this point, our firewall lets anyone on the Internet access the server on HTTP, HTTPS and DNS. The regions can be populated with the information from the IANA.
That's also where an explicit RETURN target is needed: APNIC has been assigned the 202.0.0.0/8 subnets, but the 202.123.0.0/19 was transferred to AFRINIC. This would give the following rules:
[...]
iptables -A APNIC -s 202.123.0.0/19 -j RETURN
iptables -A APNIC -s 202.0.0.0/8 -j ACCEPT
[...]
Dropping the noise:
If the server is on a network where there are MS Windows machines, there is a good chance the firewall is going to log an incredible quantity of noise, mostly broadcasts due to NetBIOS.
A way to avoid that is to drop that noise early in the process, and that's the reason for the NOISE chain.
iptables -A NOISE -i eth0 -p udp -m udp --sport 138 -j DROP
iptables -A NOISE -i eth0 -p udp -m udp --dport 138 -j DROP
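The same chain can absorb the other chatty protocols mentioned earlier. As a sketch, and assuming the server has a static address and therefore no use for DHCP, the NetBIOS name service and DHCP broadcasts could be silenced as follows. IPT is set to "echo iptables" so the commands are printed rather than applied.

```shell
# Dry run: print the commands. Set IPT=iptables and run as root to apply them.
IPT="echo iptables"

more_noise() {
    # NetBIOS name service (UDP port 137)
    $IPT -A NOISE -p udp -m udp --dport 137 -j DROP
    # DHCP server (67) and client (68) ports, given as a port range
    $IPT -A NOISE -p udp -m udp --dport 67:68 -j DROP
}

more_noise
```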
This is not to be confused with the role of the BLACKHOLE: the latter chain drops all traffic coming from certain sources. Technically, both could be in the NOISE chain, but I usually like to separate them.
An entry in the BLACKHOLE would be:
iptables -A BLACKHOLE -i eth0 -s 46.20.33.192/27 -j DROP
The reason for being in the BLACKHOLE, in my case, is too many logged attempts to access my machine on closed ports, or the presence in several blacklists.
Managing the server through SSH
In the current state, it is possible to access the server on HTTP, HTTPS and DNS, but not SSH. Again, to populate the chain and assuming that only the RFC1918 192.168.0.0/16 network should be able to manage that server:
iptables -A SSH -i eth0 -s 192.168.0.0/16 -j ACCEPT
There is no need to duplicate the tcp/22 configuration: that chain is accessed only through the statement in INPUT that has the -j SSH, which already matches only new SSH connections.
These switches I haven't talked about yet
In the various examples, I mentioned a few switches I haven't described. That's the case for the -i <interface>, the -m tcp, -m udp or -m state.
The -i <interface> adds a condition to the rule, namely that the packet entered through the mentioned interface. This makes it possible to separate the roles, and for example have one interface dedicated to management and another to the services provided. If this is not mentioned, the rule is applied to all interfaces, including the loopback.
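Since every rule in the example is tied to eth0, traffic on the loopback interface matches none of them and is left to the chain policy. A sketch of how -i could be used to treat interfaces differently is shown below; the eth1 management interface is hypothetical, and IPT is set to "echo iptables" so that the commands are printed rather than applied.

```shell
# Dry run: print the commands. Set IPT=iptables and run as root to apply them.
IPT="echo iptables"

per_interface() {
    # Local traffic over the loopback interface is always accepted
    $IPT -A INPUT -i lo -j ACCEPT
    # New SSH sessions arriving on the (hypothetical) management interface
    # eth1 are handed to the SSH chain
    $IPT -A INPUT -i eth1 -p tcp -m tcp --dport 22 -m state --state NEW -j SSH
}

per_interface
```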
The -p switch specifies the protocol to match, in this case, tcp or udp. Other values are possible.
-m specifies a match: it gives access to command-line switches depending on the modules. For example, if the rule has to match a udp connection, there is no need to use the module responsible for inspecting tcp segments. I used three different modules: tcp, udp and state. The first two are used to inspect specific options of these protocols, such as source or destination ports; the last one is used to access the information of the conntrack module, which keeps track of all connections. This makes it possible to differentiate between a NEW session (trying to establish), an ESTABLISHED session (already initiated), an INVALID session (for example out-of-state TCP segments) and RELATED sessions (sessions related to another session, for example FTP data ports).
With -m tcp or -m udp, two of the additional switches are --sport <port> and --dport <port>, which match respectively the source and the destination port. If they are not specified, any port is valid. With -m state, the --state <state> switch becomes available, to match specific conntrack information.
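Recent versions of iptables document the conntrack match, with its --ctstate switch, as the successor of the state match; the state names are the same. As a sketch, the rule accepting established sessions could be written this way, here also accepting RELATED traffic such as the ICMP errors tied to an existing connection. Whether RELATED should be accepted is a choice left to the administrator, and IPT is set to "echo iptables" so that the command is printed rather than applied.

```shell
# Dry run: print the command. Set IPT=iptables and run as root to apply it.
IPT="echo iptables"

related_established() {
    # Accept packets that belong to, or are related to, a tracked connection
    $IPT -A INPUT -i eth0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
}

related_established
```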
This was a short introduction to iptables, and going over all of its possibilities would require several books. A good starting point is the man page.