Improve macro table performance
In the existing implementation, when a new macro is added, the whole
table has to be sorted again. Hence the cost of adding n macros is
worse than O(n^2), due to arithmetic progression.
This change drops all qsort(3) stuff altogether, by carefully preserving
table in sorted order. In findEntry routine, bsearch(3) is replaced
with customized binary search which tracks position for insertion.
In the addMacro routine, if a matching entry is not found, this
position is used for direct insertion, after the rest of the elements
are "shifted to the right" with memmove(3). Likewise, in delMacro
routine, the elements are shifted back to the left when the last macro
definition is popped. Technically, shifting half of the array with
memmove(3) is still O(n^2); however, modern CPUs process contiguous
memory in a very efficient manner, and glibc provides a fine-tuned
memmove(3) implementation.
Also, macro table entries are now allocated in a single chunk.
This change reduces rpm startup costs by factor of 6. Also, this change
improves specfile parser performance by a factor of 2 (e.g. the parse
time of texlive.spec is reduced from 67s to 35s).
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
(cherry picked from commit
301d5450a1c7849f9eb4ead11d00c8a20ea6a6bd)