ip_conntrack: table full, dropping packet

Postby lik » Fri Mar 11, 2011 3:34 pm

You can check the server's current number of tracked connections with the following command (as root):
Code:
cat /proc/sys/net/ipv4/netfilter/ip_conntrack_count

or
Code:
wc -l /proc/net/ip_conntrack

To see the current limit:
Code:
cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
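
To watch usage against the limit at a glance, here is a small convenience sketch (not from the original post; same proc paths as above):
Code:
watch -n1 'echo "$(cat /proc/sys/net/ipv4/netfilter/ip_conntrack_count) / $(cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max)"'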

If you want to raise the limit, run the following as root:
Code:
echo 131072 > /proc/sys/net/ipv4/ip_conntrack_max

To make this change persistent across reboots, add a line like 'net.ipv4.ip_conntrack_max = 131072' to /etc/sysctl.conf.
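
For example, a minimal sketch (same sysctl key as above; "sysctl -p" reloads /etc/sysctl.conf):
Code:
echo "net.ipv4.ip_conntrack_max = 131072" >> /etc/sysctl.conf
sysctl -p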

http://rackerhacker.com/2008/01/24/ip_conntrack-table-full-dropping-packet/#comment-15408
Some readers may be interested to know what ip_conntrack is in the first place, and why it fills up. If you run an iptables firewall and have rules that act upon the state of a packet, the kernel uses ip_conntrack to keep track of what state each connection is in, so that the firewall rule logic can be applied against it. If the system gets a lot of network activity (high connection rates, lots of concurrent connections, etc.), the table accumulates entries.
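
For illustration, rules like these (typical stateful rules, not from the original post) are what cause every packet to be checked against the conntrack table:
Code:
# accept packets belonging to connections conntrack already knows about
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# accept new SSH connections; "-m state" is what pulls in ip_conntrack
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -j ACCEPT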

The entries remain until an RST packet is sent from the original IP address. If you have a flaky network somewhere between you and the clients accessing your server, packet loss can cause those RST packets to be dropped, leaving orphaned entries in your ip_conntrack table. This can also happen with a malfunctioning switch or NIC; it isn't necessarily a routing problem out on the internet somewhere.

Typically, I've seen this trouble crop up when a server is the target of a DDoS attack. Filling up the ip_conntrack table is a relatively easy way to knock a server offline, and attackers know this.

As Major suggested, you can get short-term relief by increasing the size of the table. However, these entries are held in kernel memory, and the bigger you make the table, the more memory it consumes. That memory could otherwise be used to serve requests, so don't waste resources on stateful firewall capability if you really don't need it.

Another option to consider is turning OFF the iptables rules that use ip_conntrack so the state table is not used at all. Anything with "-m state" or "-t nat" can be turned off. If you want to flush all your iptables rules, set default allow policies with "iptables -P INPUT ACCEPT" (likewise for FORWARD and OUTPUT) and flush the rules with "iptables -F". On an RHEL or CentOS system you can just do "service iptables stop".
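
As a sketch, assuming you truly don't need the firewall (note this leaves the box wide open):
Code:
# default-allow on every chain, then flush all rules
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
iptables -F
iptables -t nat -F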

Once iptables is no longer using ip_conntrack, you can reclaim the memory the table was using by unloading the related kernel modules.

Code:
rmmod ipt_MASQUERADE
rmmod iptable_nat
rmmod ipt_state
rmmod ip_conntrack

Then you will have an empty ip_conntrack table that stays empty. I mention this because a lot of sysadmins install hordes of iptables rules as a matter of course and don't recognize the downside of having them present. You can still use iptables; to avoid the use of ip_conntrack, simply don't use rules based on stateful logic.


http://rackerhacker.com/2008/01/24/ip_conntrack-table-full-dropping-packet/#comment-15409

One other aspect to consider when raising your max conntrack setting is the depth of the hash buckets the kernel uses to store these connections, henceforth referred to as "buckets".

On Red Hat, the default hashsize for the conntrack module is 8192. The rule of thumb is to allow no more than 8 connections per bucket, so you would set your conntrack max to 8 * hashsize. This is why Red Hat defaults ip_conntrack_max to 65536 (8 * 8192).

You can tweak these settings by adjusting not just the ip_conntrack_max setting but the hashsize option to the ip_conntrack module.

So, for example, if you were to set your ip_conntrack_max to 131072 without modifying the default hashsize of 8192, you would be allowing a bucket depth of 16 entries (131072 / 8192). The kernel then potentially has to dig deeper to find a given connection object in its bucket.

There are a number of schools of thought on how best to address this but in practice I have found that, given the resources, a shallower bucket is better.

For a server that handles extremely heavy network traffic, and of course has the memory to spare, you would want to keep the average bucket depth at 2 or 4.

Hashsize, to my knowledge, isn't a dynamic setting, so you will need to load the ip_conntrack module with the option:

hashsize=<number of buckets>

So in Major's example above, if you want to double your server's capacity for tracked connections without doubling the lookup depth, you would reload the module with this option (on a RHEL-era system it goes in /etc/modprobe.conf):

options ip_conntrack hashsize=16384
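
A sketch of applying it immediately, assuming the same modules listed earlier (rules must be flushed first or the modules won't unload):
Code:
service iptables stop    # flush rules so the modules can be removed
rmmod ipt_MASQUERADE iptable_nat ipt_state ip_conntrack
modprobe ip_conntrack hashsize=16384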

This keeps the entries per bucket at 8. I have seen machines with a depth beyond 8 get completely cowed under heavy network load, and since memory is relatively plentiful nowadays, you can increase the efficiency of the lookups by dropping to 4 or even 2 connections per bucket: just do the simple division and reload the module with the right options.
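
The arithmetic, for illustration (numbers match the example above):
Code:
# depth = ip_conntrack_max / hashsize
# 131072 / 16384 = 8 entries per bucket
# for a depth of 4:  hashsize = 131072 / 4 = 32768
# for a depth of 2:  hashsize = 131072 / 2 = 65536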

Hope that helps.

Here is some relatively dated yet still applicable information on the subject:
http://www.wallfire.org/misc/netfilter_ ... k_perf.txt