Issues with iptables stateful filtering

I hit a weird issue today, I have Apache configured as a reverse proxy using mod_proxy_balancer which is a welcome addition in Apache 2.2.x. This is forwarding selected requests to more Apache instances running mod_perl, although this could be any sort of application layer, pretty standard stuff.

With ProxyStatus enabled and pushing a reasonable amount of traffic through the proxy, I started to notice that the application layer Apache instances would consistently get marked in error state every so often, removing them from the pool of available servers until their cooldown period expired and the proxy enabled them again.

Investigating the logs showed up this error:

[Tue Sep 28 00:24:39 2010] [error] (113)No route to host: proxy: HTTP: attempt to connect to 192.0.2.1:80 (192.0.2.1) failed
[Tue Sep 28 00:24:39 2010] [error] ap_proxy_connect_backend disabling worker for (192.0.2.1)

A spot of the ol’ google-fu turned up descriptions of similar problems, some not even related to Apache. The problem looked related to iptables.

All the servers are running CentOS 5 and anyone who runs this is probably aware of the stock Red Hat iptables ruleset. With HTTP access enabled, it looks something similar to this:

1
2
3
4
5
6
7
8
9
10
11
12
13
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -p icmp --icmp-type any -j ACCEPT
-A RH-Firewall-1-INPUT -p 50 -j ACCEPT
-A RH-Firewall-1-INPUT -p 51 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp --dport 5353 -d 224.0.0.251 -j ACCEPT
-A RH-Firewall-1-INPUT -p udp -m udp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m tcp --dport 631 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited

Almost all of it is boilerplate apart from line 12, which I added and is identical to the line above it granting access to SSH. Analysing the live hit counts against each of these rules showed a large number hitting that last catch-all rule on line 13 and indeed, this is what is causing the Apache errors.

Analysing the traffic with tcpdump/wireshark showed that the frontend Apache server is only getting as far as sending the initial SYN packet and it’s failing to match either the dedicated rule on line 12 for HTTP traffic, or even the rule on line 10 to match any related or previously established traffic, although I wouldn’t really expect it to match that.

Adding a rule before the last one to match and log any HTTP packets that are considered to be in the state INVALID showed that indeed for some strange reason, iptables is deciding that an initial SYN is somehow invalid.

More information about why it might be invalid can be coaxed from the kernel by issuing the following:

# echo 255 > /proc/sys/net/ipv4/netfilter/ip_conntrack_log_invalid

Although all this gave me was some extra text saying “invalid state” and the same output you get from the standard logging target:

ip_ct_tcp: invalid state IN= OUT= SRC=192.0.2.2 DST=192.0.2.1 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=21158 DF PROTO=TCP SPT=57351 DPT=80 SEQ=1402246598 ACK=0 WINDOW=5840 RES=0x00 SYN URGP=0 OPT (020405B40402080A291DC7600000000001030307)

From searching the netfilter bugzilla for any matching bug reports I found a knob that relaxes the connection tracking, enabled with the following:

# echo 1 > /proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_be_liberal

This didn’t fix things, plus it’s recommended to only use this in extreme cases with a broken router or firewall. As all of these machines are on the same network segment with just a switch connecting them there shouldn’t be anything mangling the packets.

With those avenues exhausted the suggested workaround of adding rules similar to the following:

1
-A RH-Firewall-1-INPUT -m tcp -p tcp --dport 80 --syn -j REJECT --reject-with tcp-reset

before your last rule doesn’t really work as the proxy still drops the servers from the pool as before, but just logs a different reason instead:

[Tue Sep 28 13:59:23 2010] [error] (111)Connection refused: proxy: HTTP: attempt to connect to 192.0.2.1:80 (192.0.2.1) failed
[Tue Sep 28 13:59:23 2010] [error] ap_proxy_connect_backend disabling worker for (192.0.2.1)

This isn’t really acceptable to me anyway as it requires your proxy to retry in response to a server politely telling it to go away and it’s going to be a performance hit. The whole point of using mod_proxy_balancer for me was so it could legitimately remove servers that are dead or administratively stopped, how is it supposed to tell the difference between that and a dodgy firewall?

The only solution that worked and was deemed acceptable was to simply remove the state requirement on matching the HTTP traffic, like so:

-A RH-Firewall-1-INPUT -m tcp -p tcp --dport 80 -j ACCEPT

This will match both new and invalid packets, however it’s left me with a pretty low opinion of iptables now. Give me pf any day.

Tags: , , , ,

3 Responses to “Issues with iptables stateful filtering”

  1. Saturn says:

    I have a similar issue on Cent OS in which the ssh connection times out, and
    i get icmp Type 3 code 10 host prohibited.

    • matt says:

      That sounds entirely normal if your SSH connection remains idle long enough for the natural state expiry to occur. I have seen that happen with SSH connections through a few different firewalls.

      A simple way to prevent this happening is to make use of the ServerAliveInterval SSH option. Put something like the following in your ~/.ssh/config file:

      Host *
              ServerAliveInterval 60
      
  2. Andrey says:

    I am having exact same issue on CentOS 6, but in my case even with outgoing HTTP. I am trying to only allow inbound traffic when it was initiated from the machine. As soon as I DROP INVALIDs on INPUT, outbound HTTP stops working. Did you find any solution?

Leave a Reply