gdb/23712: Introduce multidictionaries
gdb/23712 is a new manifestation of the now-infamous (at least to me)
symtab/23010 assertion failure (DICT_LANGUAGE == SYMBOL_LANGUAGE).
An example of the problem (using test case from symtab/23010):
Reading symbols from /home/rdiez/rdiez/arduino/JtagDue/BuildOutput/JtagDue-obj-release/firmware.elf...done.
(gdb) p SysTick_Handler
dwarf2read.c:9715: internal-error: void dw2_add_symbol_to_list(symbol*, pending**): Assertion `(*listhead) == NULL || (SYMBOL_LANGUAGE ((*listhead)->symbol[0]) == SYMBOL_LANGUAGE (symbol))' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)
This assertion was added specifically to catch this condition (of adding
symbols of different languages to a single pending list).
The problems we're now seeing on systems utilizing DWARF debugging seem to
be caused by the use of LTO, which adds a CU with an artificial DIE of
language C99 which references DIEs in other CUs of language C++.
Thus, we create a dictionary for C99 symbols but end up stuffing C++
symbols into it, and the assertion in dw2_add_symbol_to_list triggers.
The approach taken here to fix this is to introduce multi-language
dictionaries to "replace" the standard, single-language dictionaries
used today.
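To illustrate the idea only, here is a minimal sketch of a container that
keeps one single-language dictionary per language. All names in it
(lang, sym, single_dict, multi_dict) are hypothetical stand-ins invented
for this example; they are not the types or functions added by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not GDB's real definitions.
enum class lang { c99, cplus };

struct sym
{
  std::string name;
  lang language;
};

// A single-language dictionary: every symbol in it shares one language.
struct single_dict
{
  lang language;
  std::vector<sym> symbols;
};

// A multi-language dictionary: one single_dict per language seen so far.
struct multi_dict
{
  std::vector<single_dict> dicts;

  // Add S to the dictionary for its language, creating one if needed.
  void add (const sym &s)
  {
    for (single_dict &d : dicts)
      if (d.language == s.language)
        {
          d.symbols.push_back (s);
          return;
        }

    single_dict fresh;
    fresh.language = s.language;
    fresh.symbols.push_back (s);
    dicts.push_back (std::move (fresh));
  }

  // Total number of symbols across all per-language dictionaries.
  std::size_t size () const
  {
    std::size_t n = 0;
    for (const single_dict &d : dicts)
      n += d.symbols.size ();
    return n;
  }
};
```

In the common one-language case such a container holds exactly one inner
dictionary, so a lookup only pays for finding that single entry.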
Note to reviewers: This patch introduces some temporary functions to
aid with review. These and other artifacts (such as "See dictionary.h",
which may appear incorrect) will all be valid at the end of the series.
This first patch introduces the new multidictionary and its API (which
is, by design, identical to the old dictionary interface). It also
mutates dict_create_hashed and dict_create_linear so that they take
a std::vector instead of the usual struct pending linked list. This will
be needed later on.
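As a rough sketch of what flattening a chunked linked list into a
std::vector can look like, consider the following. The types symbol_stub
and pending_stub, the constant PENDING_BLOCK_SIZE and the function
flatten_pending are all hypothetical and simplified for this example; the
real struct pending and pending_to_vector may differ, including in the
order in which symbols are visited.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins; not GDB's real definitions.
struct symbol_stub
{
  const char *name;
};

// Illustrative block size only.
constexpr int PENDING_BLOCK_SIZE = 4;

// A linked list of fixed-size blocks of symbol pointers.
struct pending_stub
{
  pending_stub *next;
  int nsyms;                                  // Used slots in SYMBOL.
  symbol_stub *symbol[PENDING_BLOCK_SIZE];
};

// Copy every symbol pointer reachable from LIST into one flat vector.
// The visiting order (front to back within each block) is an arbitrary
// choice made for this sketch.
std::vector<symbol_stub *>
flatten_pending (const pending_stub *list)
{
  std::vector<symbol_stub *> result;

  for (const pending_stub *p = list; p != nullptr; p = p->next)
    for (int i = 0; i < p->nsyms; ++i)
      result.push_back (p->symbol[i]);

  return result;
}
```

Once the symbols are in a flat vector, they can be regrouped (for
example, by language) without touching the original list.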
This patch does /not/ actually enable multidictionaries. That is left
for a subsequent patch in the series.
I've done exhaustive performance testing with this approach, and I've
attempted to minimize the overhead for the (overwhelmingly) most common
one-language scenario.
On average, a -g3 -O0 GDB (the one we developers use) will see
approximately a 4% slowdown when initially reading symbols. [I've
tested only GDB and firefox with -readnow.] When using -O2, this
difference shrinks to ~0.5%. Since a number of runs with these
patches actually run /faster/ than unpatched GDB, I conclude that
these tests have at least a 0.5% error margin.
On our own gdb.perf test suite, again, results appear to be pretty
negligible. Differences to unpatched GDB range from -7.8% (yes,
patched version is again faster than unpatched) to 27%. All tests
lying outside "negligible," such as the 27% slowdown, involve a total
run time of 0.0007 (or less) with smaller numbers of CUs/DSOs (usually 10
or 100). In all cases, the follow-up tests with more CUs/DSOs never
differ by more than 3% from the baseline, unpatched GDB.
In my opinion, these results are satisfactory.
gdb/ChangeLog:
PR gdb/23712
PR symtab/23010
* dictionary.c: Include unordered_map.
(pending_to_vector): New function.
(dict_create_hashed_1, dict_create_linear_1, dict_add_pending_1):
Rewrite the non-"_1" functions to take vector instead
of linked list.
(dict_create_hashed, dict_create_linear, dict_add_pending): Use the
"new" _1 versions of the same name.
(multidictionary): Define.
(std::hash<enum language>): New definition.
(collate_pending_symbols_by_language, mdict_create_hashed)
(mdict_create_hashed_expandable, mdict_create_linear)
(mdict_create_linear_expandable, mdict_free)
(find_language_dictionary, create_new_language_dictionary)
(mdict_add_symbol, mdict_add_pending, mdict_iterator_first)
(mdict_iterator_next, mdict_iter_match_first, mdict_iter_match_next)
(mdict_size, mdict_empty): New functions.
* dictionary.h (mdict_iterator): Define.