Rehash but don't grow when full of tombstones.
authorHoward Hinnant <hhinnant@apple.com>
Wed, 30 Oct 2013 15:10:54 +0000 (15:10 +0000)
committerHoward Hinnant <hhinnant@apple.com>
Wed, 30 Oct 2013 15:10:54 +0000 (15:10 +0000)
This problem was found and fixed by José Fonseca in March 2011 for
SmallPtrSet, committed r128566.  But as far as I can tell, all other
llvm hash tables retain the same problem:  the bucket count can grow
without bound while size() remains near constant by repeated
insert/erase cycles that tend to fill the container with tombstones.
Here is a demo that has been reduced to a trivial case:

int
main()
{
   llvm::DenseSet<unsigned> d;
   for (unsigned i = 0; i < 0xFFFFFFF; ++i)
   {
       d.insert(i);
       d.erase(i);
   }
}

While the container size() never grows above 1, the bucket count grows
like this:

nb = 64
nb = 128
nb = 256
nb = 512
nb = 1024
nb = 2048
nb = 4096
nb = 8192
nb = 16384
nb = 32768
nb = 65536
nb = 131072
nb = 262144
nb = 524288
nb = 1048576
nb = 2097152
nb = 4194304
nb = 8388608
nb = 16777216
nb = 33554432
nb = 67108864
nb = 134217728
nb = 268435456

The above program currently consumes a few GB ram.  This patch brings
the memory consumption down by several orders of magnitude, and keeps
the bucket count at 64 for the above test.

llvm-svn: 193689

llvm/include/llvm/ADT/DenseMap.h

index fac0b95..ce322cc 100644 (file)
@@ -438,9 +438,8 @@ private:
       this->grow(NumBuckets * 2);
       LookupBucketFor(Key, TheBucket);
       NumBuckets = getNumBuckets();
-    }
-    if (NumBuckets-(NewNumEntries+getNumTombstones()) <= NumBuckets/8) {
-      this->grow(NumBuckets * 2);
+    } else if (NumBuckets-(NewNumEntries+getNumTombstones()) <= NumBuckets/8) {
+      this->grow(NumBuckets);
       LookupBucketFor(Key, TheBucket);
     }
     assert(TheBucket);