libkmod-hash: always align n_buckets to power of 2
By aligning n_buckets to power of 2 we can turn the "bucket = hashval %
n_buckets" into a less expensive bucket = hashval & (n_buckets - 1).
This removes the DIV instruction as shown below.
Before:
xor %edx,%edx
divl 0x8(%rbx)
mov %edx,%eax
add $0x1,%rax
shl $0x4,%rax
add %rbx,%rax
After:
lea -0x1(%rdi),%edx
and %edx,%eax
add $0x1,%rax
shl $0x4,%rax
add %rbx,%rax
With a microbenchmark, measuring the time to locate the bucket (i.e.
time_to_calculate_hashval + time_to_calculate_bucket_position) we have
the results below (time in clock cycles):
keylen before after
2-10 79.0 61.9 (-21.65%)
11-17 81.0 64.4 (-20.48%)
18-25 90.0 73.2 (-18.69%)
26-32 104.7 87.0 (-16.82%)
33-40 108.4 89.6 (-17.37%)
41-48 111.2 91.9 (-17.38%)
49-55 120.1 102.1 (-15.04%)
56-63 134.4 115.7 (-13.91%)
As expected the gain is constant, regardless of the key length.
The time to clculate the hashval varies with the key length, which
explains the bigger gains for short keys.