neigh: Keep neighbour cache entries if number of them is small enough.
author YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Tue, 22 Jan 2013 05:20:05 +0000 (05:20 +0000)
committer David S. Miller <davem@davemloft.net>
Tue, 22 Jan 2013 19:25:28 +0000 (14:25 -0500)
Since we removed the NCE (Neighbour Cache Entry) reference from
routing entries, the only refcnt holders of an NCE are usually its
timer (if running) and its owner table.  As a result,
neigh_periodic_work() purges NCEs over and over again, even those
for gateways.

It does not make sense to purge entries when the table holds only
a few of them, so keep them.  The minimum number of entries to keep
is specified by gc_thresh1.
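
The rule described above can be sketched in user space as follows.
This is only an illustrative model, not kernel code: struct
toy_neigh_table and toy_skip_periodic_gc() are hypothetical names, and
a plain int stands in for the atomic entry counter.  The sketch assumes
that the comparison is strict, so that a table holding exactly
gc_thresh1 entries is still scanned.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two neighbour table fields involved. */
struct toy_neigh_table {
	int entries;	/* current number of cache entries */
	int gc_thresh1;	/* minimum number of entries to keep */
};

/*
 * Return true when the periodic scan should leave the table alone,
 * i.e. when it holds fewer than gc_thresh1 entries.
 */
static bool toy_skip_periodic_gc(const struct toy_neigh_table *tbl)
{
	return tbl->entries < tbl->gc_thresh1;
}
```

With gc_thresh1 set to 256, a table of 3 entries (for example a host
that only talks to its gateways) is skipped, while a table of 256 or
more entries is scanned as before.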

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Documentation/networking/ip-sysctl.txt
net/core/neighbour.c

index 4976564..19ac180 100644
@@ -26,6 +26,11 @@ route/max_size - INTEGER
        Maximum number of routes allowed in the kernel.  Increase
        this when using large numbers of interfaces and/or routes.
 
+neigh/default/gc_thresh1 - INTEGER
+       Minimum number of entries to keep.  Garbage collector will not
+       purge entries if there are fewer than this number.
+       Default: 256
+
 neigh/default/gc_thresh3 - INTEGER
        Maximum number of neighbor entries allowed.  Increase this
        when using large numbers of interfaces and when communicating
index c815f28..7bd0eed 100644
@@ -778,6 +778,9 @@ static void neigh_periodic_work(struct work_struct *work)
        nht = rcu_dereference_protected(tbl->nht,
                                        lockdep_is_held(&tbl->lock));
 
+       if (atomic_read(&tbl->entries) < tbl->gc_thresh1)
+               goto out;
+
        /*
         *      periodically recompute ReachableTime from random function
         */
@@ -832,6 +835,7 @@ next_elt:
                nht = rcu_dereference_protected(tbl->nht,
                                                lockdep_is_held(&tbl->lock));
        }
+out:
        /* Cycle through all hash buckets every base_reachable_time/2 ticks.
         * ARP entry timeouts range from 1/2 base_reachable_time to 3/2
         * base_reachable_time.