review.tizen.org Git - platform/upstream/libabigail.git/log

Fix a typo in a comment of abg-dwarf-reader.cc

* src/abg-dwarf-reader.cc (compare_dies_string_attribute_value):
Fix a typo in the comment of this function.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

PR25042 - Support string form DW_FORM_strx{1,4} from DWARF 5

* configure.ac: Detect the presence of the DW_FORM_strx{1,4}
enumerators.
* src/abg-dwarf-reader.cc (form_is_DW_FORM_strx): Define new
function.
(compare_dies_string_attribute_value): Use the new
form_is_DW_FORM_strx here.
* tests/data/Makefile.am: Add the new test input files below to
source distribution.
* tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0:
New binary test input file.
* tests/data/test-read-dwarf/PR25042-libgdbm-clang-dwarf5.so.6.0.0.abi:
Reference output of the new binary test input file.
* tests/test-read-dwarf.cc (in_out_specs): Add the input test
files above to the test harness, for platforms that support the
DW_FORM_strx form.
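
A minimal sketch of what a predicate like form_is_DW_FORM_strx could look like. The body of the real function is not shown in this log, so everything below is an assumption: the SKETCH_* constants are local copies of the DWARF 5 form codes (so that the example builds without <dwarf.h>), and sketch_form_is_strx is a hypothetical name.

```cpp
#include <cassert>

// DWARF 5 form codes, copied locally so the sketch builds without <dwarf.h>.
// Real code would use the DW_FORM_strx1..DW_FORM_strx4 enumerators instead.
enum : unsigned
{
  SKETCH_FORM_string = 0x08,
  SKETCH_FORM_strx1 = 0x25,
  SKETCH_FORM_strx2 = 0x26,
  SKETCH_FORM_strx3 = 0x27,
  SKETCH_FORM_strx4 = 0x28
};

// Hypothetical stand-in for form_is_DW_FORM_strx: return true if FORM is
// one of the fixed-size indexed string forms strx1 .. strx4.
inline bool
sketch_form_is_strx(unsigned form)
{ return form >= SKETCH_FORM_strx1 && form <= SKETCH_FORM_strx4; }

// Usage: an indexed string form is recognized, an inline string is not.
inline void
sketch_usage_strx()
{
  assert(sketch_form_is_strx(SKETCH_FORM_strx2));
  assert(!sketch_form_is_strx(SKETCH_FORM_string));
}
```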

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Support the "name_not_regexp" property in the [suppress_type] section

When writing a suppression specification in which the user wants to
keep a family of types (whose set of names is specified by a regular
expression) and suppress/drop all other types, one needs to write
something like:

[suppress_type]
  name_regexp = (?!the-regexp-of-the-types-to-keep)

It would be nicer (like what is done for other properties that take
regular expressions as value in suppression specifications) to be able
to write:

[suppress_type]
  name_not_regexp = the-regexp-of-types-to-keep

This patch does just that.
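
The intended semantics can be sketched as follows.  This is only an illustration using std::regex, not the suppression engine itself: sketch_type_suppression and sketch_suppresses are hypothetical names, and requiring the whole type name to match is an assumption of the sketch.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical model of a [suppress_type] directive that carries only the
// name_not_regexp property.
struct sketch_type_suppression
{
  std::regex name_not_regex;
};

// A type is suppressed when its name does NOT match name_not_regexp.
// Assumption: the whole name has to match (std::regex_match).
inline bool
sketch_suppresses(const sketch_type_suppression& s,
		  const std::string& type_name)
{ return !std::regex_match(type_name, s.name_not_regex); }

// Usage: keep the "public_*" family of types, drop everything else.
inline void
sketch_usage_name_not_regexp()
{
  sketch_type_suppression keep_public{std::regex("public_.*")};
  assert(!sketch_suppresses(keep_public, "public_api_t"));
  assert(sketch_suppresses(keep_public, "internal_state_t"));
}
```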

It augments the abigail::suppr::type_suppression type to make it carry
the new 'name_not_regex' property.  It updates the suppression engine
to take the 'name_not_regex' property into account when interpreting
instances of abigail::suppr::type_suppression.  The parser for type
suppression directives is updated to recognize the new name_not_regexp
property.  The manual has been updated accordingly to describe the new
property.  New regression tests have been added.

* doc/manuals/libabigail-concepts.rst: Update this to document the
new name_not_regexp property of the suppress_type directive.
* include/abg-suppression.h
(type_suppression::{g,s}et_type_name_not_regex_str): Declare new accessors.
* src/abg-suppression-priv.h
(type_suppression::priv::{type_name_not_regex_str_,
type_name_not_regex_}): Define new data members.
(type_suppression::priv::{get_type_name_not_regex,
set_type_name_not_regex, get_type_name_not_regex_str,
set_type_name_not_regex_str}): Define new member functions.
* src/abg-suppression.cc
(type_suppression::get_type_name_regex_str): Fix comments.
(type_suppression::{set_type_name_not_regex_str,
get_type_name_not_regex_str}): Define new member functions.
(suppression_matches_type_name): Adapt to support the new
type_name_not_regex property.
(read_type_suppression): Support parsing the type_name_not_regexp
property.
* tests/data/test-diff-suppr/test42-negative-suppr-type-report-0.txt:
New test reference output.
* tests/data/test-diff-suppr/test42-negative-suppr-type-report-1.txt: Likewise.
* tests/data/test-diff-suppr/test42-negative-suppr-type-suppr-1.txt:
New test input.
* tests/data/test-diff-suppr/test42-negative-suppr-type-suppr-2.txt: Likewise.
* tests/data/test-diff-suppr/test42-negative-suppr-type-v0.{cc, o}: Likewise.
* tests/data/test-diff-suppr/test42-negative-suppr-type-v1.{cc,
o}: Likewise.
* tests/data/Makefile.am: Add the test files above to source
distribution.
* tests/test-diff-suppr.cc (in_out_specs): Add the new tests to
the harness.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Better propagation of suppressed-ness to function types

In the comparison engine, when a sub-type of a function type (say, a
parameter type size change) has been suppressed, this suppression is
not necessarily well propagated to the function carrying the function
type, because the parameter type size, for instance, is considered as
a type local change to that function; and we generally don't propagate
suppression to a non-suppressed parent diff node that already carries
a local change.

This leads to an empty change report for the function we are looking
at because the only sub-type change has been suppressed.

This patch properly propagates the suppressed-ness in that case, so
that the parent function diff node is suppressed as well.
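
The propagation rule can be sketched with a toy model.  The sketch_fn_diff type and its fields are hypothetical; the real logic lives in suppression_categorization_visitor::visit_end and operates on the diff node tree.

```cpp
#include <cassert>

// Toy model of a function diff node and the state of its function type
// diff child.  All names here are hypothetical.
struct sketch_fn_diff
{
  bool type_diff_suppressed;      // the function type diff is suppressed
  bool has_local_non_type_change; // e.g. a change local to the function
  bool suppressed;                // state of the parent function diff
};

// Propagate suppressed-ness from the function type diff node to its parent
// function diff node, unless the parent carries a local non-type change.
inline void
sketch_propagate_suppression(sketch_fn_diff& d)
{
  if (d.type_diff_suppressed && !d.has_local_non_type_change)
    d.suppressed = true;
}

// Usage: the only change is a suppressed sub-type change, so the parent
// function diff node ends up suppressed as well.
inline void
sketch_usage_propagation()
{
  sketch_fn_diff d = {true, false, false};
  sketch_propagate_suppression(d);
  assert(d.suppressed);
}
```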

* src/abg-comparison.cc
(suppression_categorization_visitor::visit_end): Propagate
suppression-ness from suppressed function type diff node to its
parent function node if the latter doesn't have any local non-type
change.
* tests/data/test-diff-suppr/test43-suppr-direct-fn-subtype-report-1.txt:
New test reference output.
* tests/data/test-diff-suppr/test43-suppr-direct-fn-subtype-suppr-1.txt:
New test input suppression file.
* tests/data/test-diff-suppr/test43-suppr-direct-fn-subtype-v{0,1}.cc:
Source code of input binary file.
* tests/data/test-diff-suppr/test43-suppr-direct-fn-subtype-v{0,1}.o:
Input binary files.
* tests/data/Makefile.am: Add the new test input files above to
source distribution.
* tests/test-diff-suppr.cc (in_out_specs): Add the test input to
test harness.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[has_type_change] Better detect type size changes

While looking at something else, I noticed that we could be missing
type size changes on function parameters when using has_type_change.

Fixed thus.

* src/abg-comp-filter.cc (has_type_change): Support function
parameters.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Fix reading of relocation sections when endianness mismatches

When the endianness of the ELF binary differs from the endianness of
the host, some byte swapping needs to happen when we read the reloc
section to either determine the format of the kernel symbol table or
to get the set of symbols referenced by the kernel symbol table.

So we need to use elf_getdata rather than elf_rawdata to read the data
from the reloc section, because the former handles the proper byte
swapping for us.
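
To illustrate what the byte swapping amounts to for a single 64-bit field, here is a small sketch.  It is not how libelf implements the conversion; sketch_read_u64 is a hypothetical helper that decodes a value from raw bytes independently of the endianness of the host.

```cpp
#include <cassert>
#include <cstdint>

// Decode a 64-bit value stored in BYTES using the byte order of the file
// (LITTLE_ENDIAN_FILE), whatever the byte order of the host is.
inline std::uint64_t
sketch_read_u64(const unsigned char* bytes, bool little_endian_file)
{
  std::uint64_t value = 0;
  for (int i = 0; i < 8; ++i)
    {
      // Walk from the most significant byte to the least significant one.
      int index = little_endian_file ? 7 - i : i;
      value = (value << 8) | bytes[index];
    }
  return value;
}

// Usage: the same raw bytes yield different values depending on the byte
// order they were written in.
inline void
sketch_usage_read_u64()
{
  const unsigned char raw[8] = {0x34, 0x12, 0, 0, 0, 0, 0, 0};
  assert(sketch_read_u64(raw, true) == 0x1234);
  assert(sketch_read_u64(raw, false) == 0x3412000000000000ULL);
}
```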

This patch does just that and thus fixes the build breakage that is
occurring when running the testreaddwarf test on s390x (big endian),
especially when trying to read the AARCH64 little endian binary
data/test-read-dwarf/PR25007-sdhci.ko.

* src/abg-dwarf-reader.cc
(read_context::{get_ksymtab_format_module,
populate_symbol_map_from_ksymtab_reloc}): Use elf_getdata rather
than elf_rawdata.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Guard testing v4.19+ AARCH64 kernel module loading for EL6 support

When analyzing an AARCH64 linux kernel module built with support for
either R_AARCH64_ABS64 or R_AARCH64_PREL32 relocations, we need these
macros to be defined in elf.h (i.e., a recent enough version of libelf),
otherwise we cannot properly support those kernel modules using the
scheme that uses the relocation table of the __ksymtab and
__ksymtab_gpl sections to read those sections.

In the future, I think we should automatically fall back to another way
of trying to read those sections if those macros are not defined, and
emit a message hinting at what is happening, when in verbose mode.  I
am keeping it as is for the moment, so that we can get a better picture
of when these macros are not defined.

In the meantime, this patch conditionalizes the test that reads a
kernel module built with support for these relocations to avoid
running it on platforms that do not support these relocations.

* tests/test-read-dwarf.cc: Do not run the test on
PR25007-sdhci.ko if the macros R_AARCH64_PREL32 and
R_AARCH64_ABS64 are not defined.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Remove the elf_symbol::get_value property

Now that there are proper facilities to lookup ELF symbols inside the
ELF/DWARF reader and get a native GElf_Sym type instance (from
libelf), we don't need to carry the value of the symbol (that is
relevant only at that low level anyway) in the abigail::ir::elf_symbol
type.

This patch removes that property.

* include/abg-ir.h (elf_symbol::{elf_symbol, create}): Remove the
'val' parameter.
* src/abg-dwarf-reader.cc (elf_symbol::get_value): Remove this
member function declaration.
(lookup_symbol_from_sysv_hash_tab)
(lookup_symbol_from_gnu_hash_tab, lookup_symbol_from_symtab)
(create_default_var_sym, create_default_fn_sym)
(read_context::lookup_elf_symbol_from_index): Adjust calls to
creating elf_symbol instances.
* src/abg-ir.cc (elf_symbol::priv::value_): Remove this data
member.
(elf_symbol::{priv::priv, elf_symbol, create}): Adjust.
* src/abg-reader.cc (build_elf_symbol): Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Bug 25007 - Don't use section-relative symbol values on ET_REL binaries

In relocatable files, two symbols listed in the .symtab section can
have the same value and yet be different.  That is because those
symbols can be *defined* in different sections.  And the values of
those symbols represent addresses (offsets) within their own
respective sections (a.k.a. section-relative addresses).

At the same time, symbol addresses as referred to in the DWARF
information are *not* section-relative, rather, they are relative to
the beginning of the whole binary.

Until now, the DWARF-referred-to symbol addresses were translated into
section-relative addresses, so that they could be compared to the
other section-relative addresses we were getting from listing the
symbols and their values from the .symtab section.  The problem with
that approach is that, during the translation from binary-relative to
section-relative addresses we were wrongly assuming that all symbols
referenced from the DWARF were defined in the .text section.  This is
wrong especially for ET_REL files because they could be defined in
sections named .foo.text or .bar.text, for instance.

This leads to issues where we wrongly consider that two symbols having the
same value are the same, because we wrongly assume that they are all
defined in the same .text section.

This patch fixes this problem by translating the section-relative
addresses we see in .symtab into binary-relative addresses by adding
the address of the section to the section-relative address.  Those
binary-relative addresses can thus safely be compared to the binary-relative
addresses we see in the DWARF.  And also, when two symbols have the
same binary-relative address, we can now safely assume that they are
the same -- they are aliases, basically.
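
The translation can be sketched as follows.  The sketch_section and sketch_symbol types are hypothetical simplifications of what libelf exposes; the point is only that the same section-relative value in two different sections gives two distinct binary-relative addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified views of a section and of a .symtab entry.
struct sketch_section { std::uint64_t addr; };
struct sketch_symbol { std::uint64_t value; std::size_t shndx; };

// Turn the section-relative value of SYM into a binary-relative address by
// adding the address of the section the symbol is defined in.
inline std::uint64_t
sketch_binary_relative_addr(const sketch_symbol& sym,
			    const std::vector<sketch_section>& sections)
{
  assert(sym.shndx < sections.size());
  return sections[sym.shndx].addr + sym.value;
}
```

For instance, with two sections at 0x1000 and 0x2000, two symbols whose value is 0x10 end up at 0x1010 and 0x2010 respectively, so they are no longer mistaken for aliases.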

* src/abg-dwarf-reader.cc
(read_context::{lookup_native_elf_symbol_from_index,
maybe_adjust_et_rel_sym_addr_to_abs_addr}): Define new member
functions.
(read_context::lookup_elf_symbol_from_index): Add a new overload.
Write the old overloads in terms of the new one.
(read_context::{load_symbol_maps_from_symtab_section,
populate_symbol_map_from_ksymtab_reloc}): Use the new
maybe_adjust_et_rel_sym_addr_to_abs_addr function to translate the
symbol value/address into a binary-relative address before adding
it to the addr->sym maps.
(read_context::maybe_adjust_{fn, var}_sym_address): Do not adjust
DWARF-referred-to addresses of ET_REL symbols anymore.
* tests/data/test-read-dwarf/PR25007-sdhci.ko: New binary test input.
* tests/data/test-read-dwarf/PR25007-sdhci.ko.abi: ABI
representation of the above.
* tests/test-read-dwarf.cc: Add the new test input to the harness.
* tests/data/test-diff-dwarf/test28-vtable-changes-report-0.txt: Adjust.
* tests/data/test-diff-filter/test20-inline-report-0.txt: Likewise.
* tests/data/test-diff-filter/test20-inline-report-1.txt: Likewise.
* tests/data/test-diff-filter/test41-report-0.txt: Likewise.
* tests/data/test-diff-filter/test9-report.txt: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Detect the presence of R_AARCH64_{ABS64, PREL32} macros

The patch:

"e687032 Support pre and post v4.19 ksymtabs for Linux kernel modules"

introduces the use of the R_AARCH64_{ABS64, PREL32} macros. However,
some older "elf.h" don't define these. When compiling on these older
platforms, we thus need to avoid using these new macros.

With this patch, the configure system detects the presence of these
macros and defines the HAVE_R_AARCH64_{ABS64, PREL32}_MACRO macros
accordingly.

Note that just to comply with what's in there in the code already, we
don't directly do "#ifdef R_AARCH64_ABS64", but rather "#ifdef
HAVE_R_AARCH64_ABS64_MACRO", to allow cases where we want to
artificially disable the "feature" at configure time, in the future.
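
A sketch of the guarding pattern described above.  The function sketch_is_aarch64_abs64 is hypothetical; in the real tree the HAVE_R_AARCH64_ABS64_MACRO macro would come from the header generated by configure.

```cpp
#include <cassert>

// In a configured tree, the generated configuration header would be
// included first and would define HAVE_R_AARCH64_ABS64_MACRO when elf.h
// provides R_AARCH64_ABS64.
#ifdef HAVE_R_AARCH64_ABS64_MACRO
#include <elf.h>
#endif

// Hypothetical helper: tell whether RELOC_TYPE is R_AARCH64_ABS64.  When
// the feature is disabled, no relocation type is ever recognized.
inline bool
sketch_is_aarch64_abs64(unsigned reloc_type)
{
#ifdef HAVE_R_AARCH64_ABS64_MACRO
  return reloc_type == R_AARCH64_ABS64;
#else
  (void) reloc_type;
  return false;
#endif
}

// Usage: relocation type 0 (no relocation) is never R_AARCH64_ABS64,
// whether or not the feature is enabled.
inline void
sketch_usage_guard()
{ assert(!sketch_is_aarch64_abs64(0)); }
```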

* configure.ac: Define macros HAVE_R_AARCH64_{ABS64, PREL32}_MACRO
if the macros R_AARCH64_{ABS64, PREL32} are present.
* src/abg-dwarf-reader.cc
(read_context::get_ksymtab_format_module): Conditionalize the use
of R_AARCH64_{ABS64, PREL32} using HAVE_R_AARCH64_{ABS64, PREL32}_MACRO.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Support pre and post v4.19 ksymtabs for Linux kernel modules

As described in commit ad8c2531fb9, the format of the Linux kernel
ksymtab changed in v4.19 to use relative references instead of absolute
references. This changes the type of relocations emitted for ksymtab
sections to be place-relative 32-bit relocations instead of absolute
relocations. One side-effect of this is that libdwfl will not relocate
the ksymtab sections due to the PC-relative relocations. This breaks
load_kernel_symbol_table() for kernel modules because it only reads in
zeros from the unrelocated ksymtab section and is subsequently unable to
determine what exported symbols it refers to. Since a vmlinux binary is
already fully linked and relocated (and therefore we can read its
ksymtab section just fine), this problem is only relevant to Linux
kernel modules.

To work around this, we utilize the ksymtab relocation sections to
determine which symbols the ksymtab entries refer to. We do this by
inspecting each relocation's r_info field for the symbol table index and
from there we are able to read each symbol's value and subsequently add
that to the set of exported symbols.
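
As an illustration, here is a sketch of how the symbol table index can be pulled out of a 64-bit r_info field.  The helpers mirror what the ELF64_R_SYM and ELF64_R_TYPE macros do, but they are hypothetical names written out locally so that the example does not depend on elf.h.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// For 64-bit ELF, the upper 32 bits of r_info hold the symbol table index
// and the lower 32 bits hold the relocation type.
inline std::uint32_t
sketch_r_sym(std::uint64_t r_info)
{ return static_cast<std::uint32_t>(r_info >> 32); }

inline std::uint32_t
sketch_r_type(std::uint64_t r_info)
{ return static_cast<std::uint32_t>(r_info & 0xffffffffu); }

// Collect the indexes of the symbols referenced by a list of relocations.
inline std::set<std::uint32_t>
sketch_referenced_symbol_indexes(const std::vector<std::uint64_t>& r_infos)
{
  std::set<std::uint32_t> result;
  for (std::uint64_t info : r_infos)
    result.insert(sketch_r_sym(info));
  return result;
}

// Usage: symbol index 5, relocation type 2.
inline void
sketch_usage_r_info()
{
  std::uint64_t info = (std::uint64_t(5) << 32) | 2;
  assert(sketch_r_sym(info) == 5);
  assert(sketch_r_type(info) == 2);
}
```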

In addition, for Linux kernel modules, we can utilize relocation types
to implement a new heuristic to determine the ksymtab format we have.
The presence of PC-relative relocations suggests the new v4.19 format,
and absolute relocation types suggest the old pre v4.19 format.
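
The heuristic can be sketched like this.  The enumerations and the function are hypothetical, and deciding on the first conclusive relocation is an assumption of the sketch rather than a description of get_ksymtab_format_module.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of relocation types and ksymtab formats.
enum class sketch_reloc_kind { absolute, pc_relative, other };
enum class sketch_ksymtab_format { undefined, pre_v4_19, v4_19 };

// Guess the ksymtab format from the kinds of relocations found in the
// ksymtab relocation section.  Assumption: the first conclusive relocation
// decides.
inline sketch_ksymtab_format
sketch_guess_ksymtab_format(const std::vector<sketch_reloc_kind>& relocs)
{
  for (sketch_reloc_kind kind : relocs)
    {
      if (kind == sketch_reloc_kind::pc_relative)
	return sketch_ksymtab_format::v4_19;
      if (kind == sketch_reloc_kind::absolute)
	return sketch_ksymtab_format::pre_v4_19;
    }
  return sketch_ksymtab_format::undefined;
}

// Usage: a PC-relative relocation points at the v4.19 format.
inline void
sketch_usage_ksymtab_format()
{
  std::vector<sketch_reloc_kind> relocs = {sketch_reloc_kind::pc_relative};
  assert(sketch_guess_ksymtab_format(relocs) == sketch_ksymtab_format::v4_19);
}
```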

* include/abg-ir.h (elf_symbol::{elf_symbol, create}): Take new
symbol value and shndx parameters.
(elf_symbol::{get_value, get_shndx}): Declare new accessors.
* src/abg-ir.cc (elf_symbol::priv::{value_, shndx_}): New data
members.
(elf_symbol::priv::priv): Adjust.
(elf_symbol::elf_symbol): Take new value and is_linux_string_cst
parameters.
(elf_symbol::create): Likewise.
(elf_symbol::{get_value, get_is_linux_string_cst}): Define new
accessors.
* src/abg-reader.cc (build_elf_symbol): Adjust.
* src/abg-dwarf-reader.cc (binary_is_linux_kernel)
(binary_is_linux_kernel): New static functions.
(lookup_symbol_from_sysv_hash_tab)
(lookup_symbol_from_gnu_hash_tab)
(lookup_symbol_from_symtab): Adjust.
(read_context::{ksymtab_reloc_section_,
ksymtab_gpl_reloc_section_, ksymtab_strings_section_}): New data
members.
(read_context::read_context): Initialize ksymtab_reloc_section_,
ksymtab_gpl_reloc_section_, ksymtab_strings_section_.
(read_context::{find_ksymtab_reloc_section,
find_ksymtab_gpl_reloc_section, find_ksymtab_strings_section,
find_any_ksymtab_reloc_section, get_ksymtab_format_module,
populate_symbol_map_from_ksymtab,
populate_symbol_map_from_ksymtab_reloc, is_linux_kernel_module}):
New member functions.
(read_context::load_kernel_symbol_table): Adjust to call either
populate_symbol_map_from_ksymtab{_reloc,} depending on ksymtab
format.
(read_context::get_ksymtab_format): Adjust to call
get_ksymtab_format_module for linux kernel modules.
(read_context::lookup_elf_symbol_from_index): Adjust.
(create_default_var_sym, create_default_fn_sym): Adjust.

Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Serialize canonical types to avoid testing if types have been emitted

When emitting abixml, profiling shows that we spend a great deal of
time testing if a given type has been emitted already, to avoid
emitting a given type more than once.  This makes the serialization
phase take more time than the binary analysis phase!

This patch leverages the fact that we already have the set of
canonical types in the system.  While emitting that set entirely, we
don't need to test if a type has been emitted already because we know
by definition that every type is present just once in that set, more
or less.  OK, because there are also types that don't have canonical
types (for instance, declaration-only class/structs), we'll still have
to check if those types have already been emitted, but this is a very
small set to handle.

The patch thus organizes the canonical types per scope, so that when
emitting a scope and the canonical types within it, the type is
emitted in its correct namespace.

Then, when emitting a translation unit and each namespace in it, the
patch emits the canonical types of those namespaces.
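
The idea can be sketched with a toy model.  The sketch_type structure and sketch_emit function are hypothetical and much simpler than the abixml writer: canonical types are grouped per scope and emitted without any "already emitted?" lookup, and only the few non-canonical types go through such a check.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, flattened description of a type.
struct sketch_type { std::string name; std::string scope; bool canonical; };

// Return the qualified names of the emitted types, in emission order.
inline std::vector<std::string>
sketch_emit(const std::vector<sketch_type>& types)
{
  std::map<std::string, std::vector<std::string>> canonical_per_scope;
  std::vector<std::string> non_canonical;
  for (const sketch_type& t : types)
    if (t.canonical)
      canonical_per_scope[t.scope].push_back(t.name);
    else
      non_canonical.push_back(t.scope + "::" + t.name);

  std::vector<std::string> out;
  // Canonical types are unique by construction: no lookup is needed.
  for (const auto& scope : canonical_per_scope)
    for (const std::string& name : scope.second)
      out.push_back(scope.first + "::" + name);

  // Non-canonical types (e.g. declaration-only classes) still need the
  // check, but this set is small.
  std::set<std::string> emitted;
  for (const std::string& qualified_name : non_canonical)
    if (emitted.insert(qualified_name).second)
      out.push_back(qualified_name);
  return out;
}

// Usage: a declaration-only type seen twice is emitted only once.
inline void
sketch_usage_emit()
{
  std::vector<sketch_type> types =
    {{"A", "ns1", true}, {"fwd", "ns1", false}, {"fwd", "ns1", false}};
  assert(sketch_emit(types).size() == 2);
}
```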

The patch arranges for some ancillary things that are needed to make
the whole picture be coherent enough for things to keep working.

Testing shows that we gained ~ 30% of performance by doing this, while
analysing the whole linux kernel 5.1 version.  We went from ~ 3m30s
to less than 2m30s.

With this patch, the serialization phase now takes less time than the
analysis time.

* include/abg-fwd.h (is_decl_slow)
(peel_pointer_or_reference_type): Declare new functions.
* include/abg-ir.h (struct canonical_type_hash): Define new type.
(type_base_ptr_set_type, type_base_ptrs_type)
(type_base_sptrs_type, canonical_type_sptr_set_type): Define new
typedefs.
(environment::get_canonical_types_map): Declare new member
function.
(scope_decl::{get_canonical_types, get_sorted_canonical_types}):
Declare new member functions.
* src/abg-ir.cc (is_ptr_ref_or_qual_type)
(peel_pointer_or_reference_type, is_decl_slow): Define new
functions.
(environment::{get_canonical_types_map}): Define new member
functions.
(canonical_type_hash::operator()): Likewise.
(scope_decl::{get_canonical_types, get_sorted_canonical_types}):
Likewise.
(struct type_topo_comp): Define new comparison functor type.
(environment::{sorted_canonical_types_}): Define new data member.
(scope_decl::priv::{canonical_types_, sorted_canonical_types_}):
Likewise.
(scope_decl::is_empty): Take the presence of canonical types into
account when determining if a scope is empty or not.
(is_decl): Make this work for cases where the artifact at hand is
a type which has a declaration, as opposed to being a pure
declaration like a variable or a function.
(canonicalize): Add the canonical type to the list of canonical types
of its scope.
* src/abg-dwarf-reader.cc (read_context::die_is_in_cplus_plus):
Define new member function.
* src/abg-writer.cc (write_type, write_canonical_types_of_scope):
Define new static functions.
(fn_type_ptr_set_type): Define new typedef.
(write_context::{m_referenced_fn_types_set,
m_referenced_non_canonical_types_set}): Add new data members.
(write_context::m_referenced_types_set): Renamed
m_referenced_types_map into this.
(write_context::get_referenced_types): Adjust.
(write_context::get_referenced_{function_types,
non_canonical_types}):
(write_context::record_type_as_referenced): Adjust to add the
referenced type in the proper set which would be one of the three
following: write_context::{get_referenced_types,
get_referenced_function_types,
get_referenced_non_canonical_types}.
(write_context::{type_is_referenced, clear_referenced}): Adjust.
(write_translation_unit): Use the new
write_canonical_types_of_scope.  Also emit declaration-only
classes that have member types.  Do not test if a given type of a
given scope has been emitted, in general, as this was super slow
given the number of types.  Emit referenced function types (as
these don't belong to any scope).  Rather than using the expensive
"is_function_type" on *all* the referenced types, just walk the
set write_context::get_referenced_function_types.  Likewise,
rather than using type_base::get_naked_canonical_type on
*all* the referenced types, just walk the set
write_context::get_referenced_non_canonical_types
(write_class): Use write_canonical_types_of_scope here.
* tools/abilint.cc (main): Support linting corpus group abixml
files.
* tests/data/test-annotate/libtest23.so.abi: Adjust.
* tests/data/test-annotate/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-annotate/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-annotate/test-anonymous-members-0.o.abi: Likewise.
* tests/data/test-annotate/test0.abi: Likewise.
* tests/data/test-annotate/test1.abi: Likewise.
* tests/data/test-annotate/test13-pr18894.so.abi: Likewise.
* tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
* tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-annotate/test2.so.abi: Likewise.
* tests/data/test-annotate/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-annotate/test4.so.abi: Likewise.
* tests/data/test-annotate/test6.so.abi: Likewise.
* tests/data/test-annotate/test7.so.abi: Likewise.
* tests/data/test-annotate/test8-qualified-this-pointer.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/PR24378-fn-is-not-scope.abi: Likewise.
* tests/data/test-read-dwarf/libtest23.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns-2.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest24-drop-fns.so.abi: Likewise.
* tests/data/test-read-dwarf/test0.abi: Likewise.
* tests/data/test-read-dwarf/test1.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test13-pr18894.so.abi: Likewise.
* tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi: Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi: Likewise.
* tests/data/test-read-dwarf/test2.so.abi: Likewise.
* tests/data/test-read-dwarf/test20-pr19025-libvtkParallelCore-6.1.so.abi: Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise.
* tests/data/test-read-dwarf/test4.so.abi: Likewise.
* tests/data/test-read-dwarf/test6.so.abi: Likewise.
* tests/data/test-read-dwarf/test7.so.abi: Likewise.
* tests/data/test-read-dwarf/test8-qualified-this-pointer.so.abi: Likewise.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.
* tests/data/test-read-write/test10.xml: Likewise.
* tests/data/test-read-write/test14.xml: Likewise.
* tests/data/test-read-write/test15.xml: Likewise.
* tests/data/test-read-write/test17.xml: Likewise.
* tests/data/test-read-write/test18.xml: Likewise.
* tests/data/test-read-write/test19.xml: Likewise.
* tests/data/test-read-write/test2.xml: Likewise.
* tests/data/test-read-write/test20.xml: Likewise.
* tests/data/test-read-write/test21.xml: Likewise.
* tests/data/test-read-write/test22.xml: Likewise.
* tests/data/test-read-write/test23.xml: Likewise.
* tests/data/test-read-write/test24.xml: Likewise.
* tests/data/test-read-write/test25.xml: Likewise.
* tests/data/test-read-write/test26.xml: Likewise.
* tests/data/test-read-write/test27.xml: Likewise.
* tests/data/test-read-write/test28-without-std-fns-ref.xml: Likewise.
* tests/data/test-read-write/test28-without-std-vars-ref.xml: Likewise.
* tests/data/test-read-write/test3.xml: Likewise.
* tests/data/test-read-write/test6.xml: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

abg-dwarf-reader: detect kernel modules without exports as such

Kernel modules without exported symbols (no use of EXPORT_SYMBOL*()),
will not have a __ksymtab_strings section. Libabigail will therefore
assume they are usual ELF binaries. That leads to wrong results as
now all ELF symbols are considered part of the ABI. That is obviously
wrong. Instead consider binaries having a .modinfo section to be kernel
binaries. We keep the __ksymtab_strings condition as vmlinux has no
.modinfo section but a __ksymtab_strings if symbols are exported.
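
The resulting condition can be sketched as below.  The function name and the use of a set of section names are assumptions made for the example; the real code inspects the ELF handle.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical predicate: a binary is treated as a Linux kernel binary if
// it has a __ksymtab_strings section (vmlinux, modules with exports) or a
// .modinfo section (modules, with or without exports).
inline bool
sketch_is_linux_kernel_binary(const std::set<std::string>& section_names)
{
  return (section_names.count("__ksymtab_strings") != 0
	  || section_names.count(".modinfo") != 0);
}

// Usage: a module without exported symbols is still detected.
inline void
sketch_usage_kernel_binary()
{
  std::set<std::string> module_without_exports = {".text", ".modinfo"};
  assert(sketch_is_linux_kernel_binary(module_without_exports));
}
```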

One case is still open (and requires maybe some documentation): if a
kernel does not export symbols (no module support), none of the
conditions apply. But, who would be interested in the ABI of a kernel
that does not expose any?

* src/abg-dwarf-reader.cc (is_linux_kernel_binary): consider
binaries only having a .modinfo section to be kernel binaries

Co-developed-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Matthias Maennich <maennich@google.com>

Ensure a consistent C++ standard use

On older compilers (such as g++ 4.8), the default C++ standard is set to
gnu++98. When compiling libabigail with --enable-cxx11=yes, src/ and
tests/ were compiled with the correct flag, while tools/ was compiled
without specifying a standard. With a compiler falling back to gnu++98
that leads to unresolved references when linking the tools against the
libabigail library. Fix that by consistently using the std= flag across
the code base.

* configure.ac: add -std=c++11 flag to CXXFLAGS when compiling
for C++11
* src/Makefile.am: drop now obsolete setting of the -std flag
* tests/Makefile.am: likewise

Reported-by: Chun-Hung Wu <Chun-hung.Wu@mediatek.com>
Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Bug 24787 - Filter out enum changes into compatible integer types

Libabigail's filtering engine fails to recognize an enum changing into
a compatible integer (or vice versa) as a harmless change.

This patch fixes that.
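
A toy version of such a check is sketched below.  The types and the function are hypothetical, and treating "compatible" as "same size in bits" is an assumption of the sketch, not necessarily the exact criterion used by has_harmless_enum_to_int_change.

```cpp
#include <cassert>

// Hypothetical, minimal description of a type.
enum class sketch_kind { enumeration, integer, other };
struct sketch_type_desc { sketch_kind kind; unsigned size_in_bits; };

// An enum changing into an integer type (or vice versa) is considered
// harmless here when both types have the same size.
inline bool
sketch_harmless_enum_int_change(const sketch_type_desc& before,
				const sketch_type_desc& after)
{
  bool enum_to_int = (before.kind == sketch_kind::enumeration
		      && after.kind == sketch_kind::integer);
  bool int_to_enum = (before.kind == sketch_kind::integer
		      && after.kind == sketch_kind::enumeration);
  return ((enum_to_int || int_to_enum)
	  && before.size_in_bits == after.size_in_bits);
}

// Usage: a 32-bit enum becoming a 32-bit integer is harmless.
inline void
sketch_usage_enum_int()
{
  sketch_type_desc e = {sketch_kind::enumeration, 32};
  sketch_type_desc i = {sketch_kind::integer, 32};
  assert(sketch_harmless_enum_int_change(e, i));
}
```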

* include/abg-comparison.h (peel_typedef_or_qualified_type_diff):
Declare new function.
(peel_pointer_or_qualified_type_diff): Rename
peel_pointer_or_qualified_type into this.
* include/abg-fwd.h (is_enum_type): Declare a new overload for
type_or_decl_base*.
* src/abg-comp-filter.cc (has_harmless_enum_to_int_change): Define
new static function.
* src/abg-comparison.cc (categorize_harmless_diff_node): Use the
new has_harmless_enum_to_int_change here.
(peel_pointer_or_qualified_type_diff): Renamed
peel_pointer_or_qualified_type into this.
(is_diff_of_basic_type): Adjust.
(peel_typedef_or_qualified_type_diff): Define new function.
* test-diff-filter/PR24787-lib{one, two}.so: New test input
binaries.
* test-diff-filter/PR24787-{one, two}.c: Source files of the test
input binaries above.
* test-diff-filter/PR24787-report-0.txt: Test output reference.
* tests/data/Makefile.am: Add the new testing material to source
distribution.
* tests/test-diff-filter.cc (in_out_specs): Add the new test to
the test harness.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Add timing to the verbose logs of abidw

While doing my recent optimization work, it became useful to have an
idea of the time different parts of the processing pipeline are
taking.

This patch introduces an abigail::tools_utils::timer type that is easy
to use to time a given part of the code and emit the elapsed time to
an output stream.

This abigail::tools_utils::timer type is thus used to time various
parts of the processing pipeline involved in abidw. Just using the
existing --verbose option now yields timing information.
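
For illustration, a timer of this kind could look like the sketch below.  It is built on std::chrono and is not the actual abigail::tools_utils::timer implementation; the sketch_timer name is hypothetical and only a subset of the member functions listed in the ChangeLog is mirrored.

```cpp
#include <cassert>
#include <chrono>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical minimal timer: start it, stop it, then stream it.
class sketch_timer
{
  typedef std::chrono::steady_clock clock_type;
  clock_type::time_point begin_;
  clock_type::time_point end_;

public:
  void start() { begin_ = end_ = clock_type::now(); }
  void stop() { end_ = clock_type::now(); }

  // Elapsed time between start() and stop(), in seconds.
  double value_in_seconds() const
  { return std::chrono::duration<double>(end_ - begin_).count(); }

  std::string value_as_string() const
  {
    std::ostringstream o;
    o << value_in_seconds() << "s";
    return o.str();
  }
};

inline std::ostream&
operator<<(std::ostream& o, const sketch_timer& t)
{ return o << t.value_as_string(); }

// Usage: time a piece of code, then check (or log) the elapsed time.
inline void
sketch_usage_timer()
{
  sketch_timer t;
  t.start();
  // ... the code to time would go here ...
  t.stop();
  assert(t.value_in_seconds() >= 0.0);
}
```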

* include/abg-tools-utils.h (class timer): Declare new type.
(operator<<(ostream&, const timer&)): Declare new streaming
operator for the new timer type.
* src/abg-tools-utils.cc (struct timer::priv): Define new type.
(timer::{timer, start, stop, value_in_seconds, value,
value_as_string, ~timer}): Define member functions.
(operator<<(ostream& o, const timer& t)): Define streaming
operator.
(build_corpus_group_from_kernel_dist_under): Add timing logs to
the linux kernel reading process.
* src/abg-dwarf-reader.cc
(read_context::canonicalize_types_scheduled): Add timing logs to
type canonicalization.
(read_debug_info_into_corpus): Add timing logs for the whole debug
info loading and internal representation building process.
* tools/abidw.cc (load_corpus_and_write_abixml): Add timing logs
for the binary loading and serialization process.
(load_kernel_corpus_group_and_write_abixml): Add timing logs to the
Linux Kernel binary loading and writing process.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[ir] Fix indentation and add comments

GCC 9+ rightfully complains about some indentation issues in the
types_defined_same_linux_kernel_corpus_public function.

This patch fixes it and adds more comments.

* src/abg-ir.cc (types_defined_same_linux_kernel_corpus_public):
Fix indentation and add comments.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Implement fast comparison of Linux Kernel types when applicable

During type canonicalization there are observations that can speed-up
type comparison significantly without impacting correctness too much.

Typically, when two types are of the same name and kind, are found in
the same corpus and are defined in the same translation unit, they
ought to be the same type, even in C.  So there is no need in this
case to actually perform the structural comparison of the two types
which does have a quadratic performance at best.
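
The shortcut can be sketched as a cheap predicate that is evaluated before any structural comparison.  The sketch_type_id structure and the function name are hypothetical; the real test is types_defined_same_linux_kernel_corpus_public.

```cpp
#include <cassert>
#include <string>

// Hypothetical identity card of a type.
struct sketch_type_id
{
  std::string name;
  std::string kind;
  std::string corpus;
  std::string translation_unit;
};

// If this returns true, the two types are assumed to be the same and the
// expensive structural comparison can be skipped.
inline bool
sketch_can_skip_structural_comparison(const sketch_type_id& a,
				      const sketch_type_id& b)
{
  return (a.name == b.name
	  && a.kind == b.kind
	  && a.corpus == b.corpus
	  && a.translation_unit == b.translation_unit);
}

// Usage: same name and kind, same corpus, same translation unit.
inline void
sketch_usage_fast_comparison()
{
  sketch_type_id a = {"foo", "struct", "vmlinux", "a.c"};
  sketch_type_id b = {"foo", "struct", "vmlinux", "a.c"};
  assert(sketch_can_skip_structural_comparison(a, b));
}
```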

Using this optimization made the loading of the
drivers/gpu/drm/i915/i915.ko module go from a quasi infinite time (many
hours on my system) to less than two minutes.  I am confining this
optimization to the Linux kernel case only for now, but I believe it
could benefit all C programs.  I am waiting for more testing before
applying it more broadly.

Also, while looking at this, I noticed that when loading several
corpora into a given corpus group (i.e., loading several linux kernel
binaries to represent a single conceptual kernel), we sometimes fail
to recognize that a type defined in a header file that is included in
several corpora is actually the same type, and should be re-used,
rather than being re-defined in each corpus.  This later adds stress
(time and space) on the system as we need to canonicalize and
de-duplicate these types later on.

This is because the "per-corpus" type maps that we use to lookup a
type by name and location when we see it (so that we know it's defined
in a different corpus of our current group) should really be
per-corpus-group type maps!  That is, a type can be defined in the
corpus representing a .ko binary, and that type would be seen again in
another .ko binary later.  Until now, we were wrongly considering that
types were to be first defined in the corpus of the vmlinux binary,
and then could be re-used later.

I have thus fixed the code so that whenever we add a type to its
scope, the relevant per-corpus type maps are updated, as well as the
per-corpus-group ones, so that we can later lookup types in those
per-corpus-group type maps to know if a type is already defined in any
corpus of the group.

* include/abg-corpus.h (corpus::origin): Add a new
LINUX_KERNEL_BINARY_ORIGIN enumerator.
(corpus::{s,g}et_group): Declare new member
functions.
(class corpus): Make the corpus_group class friend of this one.
(corpus_group::get_main_corpus): Declare new member function.
* src/abg-corpus-priv.h (corpus::priv::group): Define new data
member.
(corpus::priv::priv): Initialize the new corpus::priv::group data
member.
* src/abg-corpus.cc (corpus::{g,s}et_group): Define new member
functions.
(corpus_group::get_main_corpus): Likewise.
(corpus_group::add_corpus): Use the new corpus::set_group() here
to make the corpus be aware of the group it belongs to.
* src/abg-dwarf-reader.cc (read_debug_info_into_corpus): Set the
current corpus origin to the corpus::LINUX_KERNEL_BINARY_ORIGIN if
we are looking at a Linux Kernel binary.
(read_context::main_corpus_from_current_group): Use the
corpus_group::get_main_corpus method.
(should_reuse_type_from_corpus_group): Return the corpus group,
rather than the main corpus.
(read_debug_info_into_corpus): Add the current corpus to the
current corpus group before the debug info reading is done.  That
way, the corpus group will be accessible from the current corpus
during the construction of the internal representation.
(read_and_add_corpus_to_group_from_elf): Add the corpus to the
group only if it wasn't added to it before.
* include/abg-ir.h (operator{==,!=}): Declare new deep equality
and inequality operators for class_or_union_sptr and
union_decl_sptr.
* src/abg-ir.cc (types_defined_same_linux_kernel_corpus_public):
Define a new static function.
(type_base::get_canonical_type_for): Use the new
types_defined_same_linux_kernel_corpus_public here to speed up
type comparison.
(equals): In the overload of class_or_union, use the new
types_defined_same_linux_kernel_corpus_public as well, to speed up
type comparison.
(operator{==,!=}): Define new deep equality and inequality
operators for class_or_union_sptr and union_decl_sptr.
(maybe_update_types_lookup_map): In the overload function for
type_decl_sptr, class_decl_sptr, union_decl_sptr,
enum_type_decl_sptr, typedef_decl_sptr, qualified_type_def_sptr,
reference_type_def_sptr, array_type_def_sptr,
array_type_def::subrange_sptr, and function_type_sptr, update the
type lookup maps of the containing corpus group as well, not just
the ones of the current corpus.
* src/abg-reader.cc (build_enum_type_decl): Forgot to set the
"is-anonymous" flag.  Oops, fix this.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Adjust.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Adjust.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Adjust.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Adjust.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

abg-tools-utils: add missing header include guards

* include/abg-tools-utils.h: add header include guards

Signed-off-by: Matthias Maennich <maennich@google.com>

Add compatibility layer for C++11 mode

Introduce a compatibility layer for C++11 code by adding
include/abg-cxx-compat.h. abg-cxx-compat defines a new namespace
abg_compat and defines
abg_compat::hash
abg_compat::shared_ptr
abg_compat::weak_ptr
abg_compat::dynamic_pointer_cast
abg_compat::static_pointer_cast
abg_compat::unordered_map
abg_compat::unordered_set
based on definitions from std::tr1 (std=gnu++98) or std:: (std=gnu++11).

I decided to introduce abg_compat:: rather than polluting abigail::
to allow an easier transition to C++11 at a later time and to not subtly
break existing code.

As the shared_ptr in C++11 declares shared_ptr::operator bool() as
explicit, some locations where a shared_ptr is assigned to a boolean
needed to be adjusted to explicitly cast to bool.
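
A minimal sketch of the kind of adjustment this requires, assuming a
C++11 compiler and std::shared_ptr; the has_value helper is hypothetical
and only serves the illustration:

```cpp
#include <memory>

// Hypothetical helper, not part of libabigail.
inline bool
has_value(const std::shared_ptr<int>& p)
{
  // "bool b = p;" is ill-formed with std::shared_ptr in C++11, because
  // the conversion operator to bool is declared explicit.
  bool b = static_cast<bool>(p);
  return b;
}
```

Contextual conversions such as "if (p)" keep working unchanged; only
assignments and returns of a plain bool need the explicit cast.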

* include/abg-cxx-compat.h: new file introducing the abg_compat
  namespace to provide C++11 functionality from either std::tr1
  or std::
* include/Makefile.am: Add the new abg-cxx-compat.h to source
  distribution.
* include/abg-comparison.h: replace std::tr1 usage by abg_compat
  and adjust includes accordingly: likewise
* include/abg-diff-utils.h: likewise
* include/abg-fwd.h: likewise
* include/abg-ini.h: likewise
* include/abg-interned-str.h: likewise
* include/abg-ir.h: likewise
* include/abg-libxml-utils.h: likewise
* include/abg-libzip-utils.h: likewise
* include/abg-reporter.h: likewise
* include/abg-sptr-utils.h: likewise
* include/abg-suppression.h: likewise
* include/abg-tools-utils.h: likewise
* include/abg-workers.h: likewise
* src/abg-comp-filter.cc: likewise
* src/abg-comparison-priv.h: likewise
* src/abg-corpus.cc: likewise
* src/abg-dwarf-reader.cc: likewise
* src/abg-hash.cc: likewise
* src/abg-ir.cc: likewise
* src/abg-reader.cc: likewise
* src/abg-suppression.cc: likewise
* src/abg-tools-utils.cc: likewise
* src/abg-writer.cc: likewise
* tests/test-diff-filter.cc: likewise
* tests/test-diff-pkg.cc: likewise
* tests/test-read-dwarf.cc: likewise
* tests/test-read-write.cc: likewise
* tests/test-types-stability.cc: likewise
* tests/test-write-read-archive.cc: likewise
* tools/abicompat.cc: likewise
* tools/abidiff.cc: likewise
* tools/abidw.cc: likewise
* tools/abilint.cc: likewise
* tools/abipkgdiff.cc: likewise

Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Update tests/.gitignore to ignore runtesttoolsutils

* tests/.gitignore: ignore runtesttoolsutils

Signed-off-by: Matthias Maennich <maennich@google.com>

Drop requirement to compile with GNU extensions

__gnu_cxx::stdio_filebuf is a GNU extension only available in certain
standard libraries; libc++, for example, does not provide it. In order
to be able to compile with libc++, replace the usage of
__gnu_cxx::stdio_filebuf with standard C++ methods. In this case, reopen
the temporary file with a std::fstream and expose that stream rather
than the previously exposed std::iostream.
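
The approach can be sketched as below, assuming a POSIX system that
provides mkstemp.  The temp_file_sketch class and the "/tmp/sketch-XXXXXX"
template are hypothetical; the real temp_file class is organized
differently (it uses a priv structure and shared pointers).

```cpp
#include <fstream>
#include <stdlib.h>   // mkstemp (POSIX)
#include <string>
#include <unistd.h>   // close, unlink (POSIX)

// Hypothetical sketch: create a temporary file, then re-open it by name
// with a standard std::fstream instead of wrapping the file descriptor
// in a __gnu_cxx::stdio_filebuf.
class temp_file_sketch
{
  std::string  path_;
  std::fstream stream_;

public:
  temp_file_sketch()
  {
    char tmpl[] = "/tmp/sketch-XXXXXX";
    int fd = mkstemp(tmpl);
    if (fd == -1)
      return;
    close(fd);  // only the name is needed from here on
    path_ = tmpl;
    stream_.open(path_.c_str(), std::ios::in | std::ios::out);
  }

  ~temp_file_sketch()
  {
    if (stream_.is_open())
      stream_.close();
    if (!path_.empty())
      unlink(path_.c_str());
  }

  // Test the stream rather than the file descriptor.
  bool is_good() const { return stream_.is_open() && stream_.good(); }

  std::fstream& get_stream() { return stream_; }
};
```

Re-opening by name costs an extra open, but it only relies on facilities
that every conforming standard library offers.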

* include/abg-tools-utils.h (get_stream): Change return type to
  std::fstream
* src/abg-corpus.cc: remove unused #include of ext/stdio_filebuf.h
* src/abg-tools-utils (temp_file::priv): remove filebuf_ member,
  and replace iostream_ by fstream_ with changing the shared_ptr
  type accordingly
  (temp_file::priv::priv): initialize fstream_ based on
  temporary file name
  (temp_file::priv::~priv): adjust destruction accordingly
  (temp_file::is_good): test the fstream rather than the fd
  (temp_file::get_stream): adjust return type to std::fstream
  and adjust implementation based on the changes in temp_file::priv
* src/Makefile.am: remove gnu extension from c++ standard flag
* tests/Makefile.am: likewise

Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Misc indent cleanup

* src/abg-dwarf-reader.cc (addr_elf_symbol_sptr_map_sptr): Fix a
typo in the comment of this typedef.
* src/abg-ir.cc (hash_type_or_decl): Fix typo in a comment.
* src/abg-writer.cc (write_translation_unit): Remove useless
vertical space.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[xml-writer] Remove a useless kludge

* src/abg-writer.cc (write_context::type_is_emitted): Remove
useless kludge from here.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[xml-writer] Speedup function_type::get_cached_name

It looks like, due to a typo, we are never caching the name of the
function_type, so we are computing it all the time, *OOOPS*. This has
an impact when comparing instances of function_type during
de-duplication at abixml writing time.

Things are faster now, thanks to this patch.
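
The caching pattern at stake can be sketched like this.  The
function_type_sketch class, its members and the computations() counter
are hypothetical; they only show a name being computed once and then
served from the cache.

```cpp
#include <string>

// Hypothetical stand-in for a type whose name is expensive to compute.
class function_type_sketch
{
  std::string         signature_;
  mutable std::string cached_name_;
  mutable unsigned    computations_;

public:
  explicit function_type_sketch(const std::string& signature)
    : signature_(signature), computations_(0)
  {}

  const std::string&
  get_cached_name() const
  {
    if (cached_name_.empty())
      {
        ++computations_;
        // Stand-in for the real, costly name computation.  The result
        // must actually be stored, otherwise it is recomputed each time.
        cached_name_ = "function type " + signature_;
      }
    return cached_name_;
  }

  unsigned computations() const { return computations_; }
};
```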

* src/abg-ir.cc (function_type::get_cached_name): Really cache the
computed name of function_type instances.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[xml-writer] Avoid using RTTI when dynamically hashing types

When we dynamically hash types in the abixml writer, we use
hash_type_or_decl.  This function uses runtime type identification to
determine if the (type) artifact is a decl or a type, and based on
that, choose how to compute its hash value.  Profiling shows that
using the RTTI in hash_type_or_decl at this point is a hotspot.

Because we know that the type ABI is a *type*, we obviously can avoid
using RTTI there.

The patch thus implements a hash_type function, and uses that in the
xml writer.  Emitting the abixml output is faster with this patch.

* include/abg-fwd.h (hash_type): Declare new function.
* src/abg-ir.cc (hash_type): Define new function.
* src/abg-writer.cc (type_hasher::operator()): Use the new
  hash_type rather than the old hash_type_or_decl.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Implement a poor-man's RTTI for performance

Profiling showed that a number of uses of dynamic_cast are a speed
bottleneck.

This patch implements a poor-man's RTTI that allows us to implement a
form of dynamic_cast that is specific to the types of the internal
representation that are in the namespace abigail::ir. It speeds up
things greatly.

Basically, the base type of all ABI artifacts
(abigail::ir::type_or_decl_base) now contains three new data members.
The first one contains a bitmap that identifies the type of artifact.
The second one contains a pointer to the dynamic type sub-object of
the current instance of the artifact. The last one contains either a
pointer to the type_base sub-object of the current instance of ABI
artifact if it's a type, or a pointer to the decl_base sub-object of
the current instance if it's a decl.

Together these three data members allow the patch to implement the
abigail::ir::{is_type(), is_decl(), is_<type_kind>_type} functions
that we need to make the code base noticeably faster when using abidw
on a big vmlinux binary.
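
As a rough illustration of the idea, here is a minimal sketch that keeps
a kind bitmap and a pointer to the most-derived object in the base
class.  All the names (artifact, type_s, class_s, decl_s, the *_KIND
enumerators) are hypothetical, and the third data member described above
is left out for brevity.

```cpp
// Hypothetical kind bits; the real enum is
// type_or_decl_base::type_or_decl_kind.
enum kind_bits
{
  DECL_KIND  = 1,
  TYPE_KIND  = 1 << 1,
  CLASS_KIND = 1 << 2
};

class artifact
{
  unsigned kind_;      // bitmap identifying the kind(s) of the artifact
  void*    instance_;  // most-derived object, set by derived constructors

protected:
  void
  set_kind(unsigned bits, void* instance)
  {
    kind_ |= bits;
    instance_ = instance;
  }

public:
  artifact() : kind_(0), instance_(nullptr) {}
  virtual ~artifact() {}

  unsigned kind() const { return kind_; }
  void* runtime_type_instance() const { return instance_; }
};

class type_s : public artifact
{
public:
  type_s() { set_kind(TYPE_KIND, this); }
};

class class_s : public type_s
{
public:
  class_s() { set_kind(CLASS_KIND, this); }
};

class decl_s : public artifact
{
public:
  decl_s() { set_kind(DECL_KIND, this); }
};

// A bit test replaces dynamic_cast<const type_s*>(a).
inline bool
is_type(const artifact* a)
{ return a && (a->kind() & TYPE_KIND); }

// The stored instance pointer replaces dynamic_cast<class_s*>(a).
inline class_s*
is_class_type(const artifact* a)
{
  if (a && (a->kind() & CLASS_KIND))
    return static_cast<class_s*>(a->runtime_type_instance());
  return nullptr;
}
```

Each constructor in the hierarchy adds its own bit and overwrites the
instance pointer, so the last one to run (the most-derived one) wins.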

* include/abg-fwd.h (is_type_decl): Replace the overloads
that takes a type_base* and/or a decl_base* by one that takes a
type_or_decl_base*.
* include/abg-ir.h (type_or_decl_base::type_or_decl_kind): Define
new enum.
(type_or_decl_base::{kind, runtime_type_instance,
type_or_decl_base_pointer}): Declare new accessors.
(operator{|,|=,&,&=}): Declare new operators for the new
type_or_decl_base::type_or_decl_kind enum.
(global_scope::global_scope): Move the definition of this
constructor to ...
* src/abg-ir.cc (global_scope::global_scope): ... here.
(type_or_decl_base::priv::{kind_, rtti_, type_or_decl_ptr_}):
Add new data members.
(type_or_decl_base::priv::priv): Take a
type_or_decl_base::type_or_decl_kind enum.
(type_or_decl_base::priv::kind): Define new accessors.
(operator{|,|=,&,&=}): Define new operators for the new
type_or_decl_base::type_or_decl_kind enum.
(type_or_decl_base::type_or_decl_base): Take a
type_or_decl_base::type_or_decl_kind enum.
(type_or_decl_base::{kind, runtime_type_instance,
type_or_decl_base_pointer}): Define new accessors.
(decl_base::decl_base, scope_decl::scope_decl)
(type_base::type_base, scope_type_decl::scope_type_decl)
(class_or_union::class_or_union) : Adjust to set the runtime type
identifier of the instances of these types.
(global_scope::global_scope, type_decl::type_decl)
(qualified_type_def::qualified_type_def)
(pointer_type_def::pointer_type_def)
(reference_type_def::reference_type_def)
(array_type_def::subrange_type::subrange_type)
(array_type_def::array_type_def, enum_type_decl::enum_type_decl)
(typedef_decl::typedef_decl, var_decl::var_decl)
(function_type::function_type, method_type::method_type)
(function_decl::function_decl)
(function_decl::parameter::parameter, method_decl::method_decl)
(class_decl::class_decl, class_decl::base_spec::base_spec)
(union_decl::union_decl, template_decl::template_decl)
(type_tparameter::type_tparameter)
(non_type_tparameter::non_type_tparameter)
(template_tparameter::template_tparameter)
(type_composition::type_composition)
(function_tdecl::function_tdecl, function_tdecl::function_tdecl)
(class_tdecl::class_tdecl):
Likewise and call runtime_type_instance() here to set the runtime
type instance pointers of the current instance.
(is_decl, is_type, is_class_type, is_pointer_type): Adjust to use
the new poor-man's rtti machinery.
(is_type_decl): Replace the overloads that takes a type_base*
and/or a decl_base* by one that takes a type_or_decl_base*.
(pointer_type_def::operator==, class_decl::operator==): Use the
poor-man's rtti machinery to replace dynamic_cast.
(hash_type_or_decl): Replace dynamic_cast<const type_base*> by
is_type() and dynamic_cast<const decl_base*> by is_decl().

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[dwarf-reader] Make sure to canonicalize anonymous types

For some reason, anonymous types are not canonicalized.  I think this is
due to the fact that because they have no name,
read_context::lookup_type_from_die(die) used by maybe_canonicalize_type()
falls short in trying to canonicalize the *DIE*.

So later, at comparison time, things can be really slow because we
can't do canonical comparison; we resort to structural comparison.

This patch ensures that even anonymous types are canonicalized.

* src/abg-dwarf-reader.cc (maybe_canonicalize_type): Add two new
overloads.  One that takes type_base_sptr, one that takes a
Dwarf_Die* and type_base_sptr.  These force canonicalization for
anonymous types.
(build_function_type): Schedule function types for
canonicalization.
(build_ir_node_from_die): For struct/classes and unions, use the
new overload of maybe_canonicalize_type to schedule
canonicalization.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Adjust.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Adjust.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Adjust.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Adjust.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[dwarf-reader] Constify the first parameter of maybe_canonicalize_type

In preparation for some coming patches, I figured it'd be more type
safe to make the Dwarf_Die parameter of maybe_canonicalize_type be a
const pointer. The patch subsequently adjusts code that needs adjusting.

* src/abg-dwarf-reader.cc (maybe_canonicalize_type): Make the
first parameter const.
(read_context::{get_canonical_die, lookup_artifact_from_die,
lookup_type_from_die, schedule_type_for_late_canonicalization}):
Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Make abidiff --harmless show harmless changes in unions

Since the previous commit filters out harmless changes inside unions,
this one allows those changes to be reported whenever the user runs
abidiff with the --harmless option.

The patch just displays the "before" and "after" of the union because
some of the harmless changes are not tracked anymore.

Because we want to display the before and after of the union we use
the function get_class_or_union_flat_representation. The patch adds a
"qualified_name" boolean parameter to that function so that we can
choose to display union members names in a non-qualified fashion,
which is the natural way of displaying those names in the context of a
union (or class) representation. Because
get_class_or_union_flat_representation uses
type_or_decl_base::get_pretty_representation, the patch has also added a
"qualified_name" boolean parameter to that function so that we can
choose to display names in a non-qualified manner.

* include/abg-fwd.h (get_class_or_union_flat_representation): Add
a "qualified_name" boolean parameter.
* include/abg-ir.h ({type_or_decl_base, decl_base, type_decl,
namespace_decl, array_type_def::subrange_type, array_type_def,
enum_type_decl, typedef_decl, var_decl, function_decl,
function_decl::parameter, function_type, method_type, class_decl,
union_decl}::get_pretty_representation): Likewise.
* src/abg-ir.cc ({type_or_decl_base, decl_base, type_decl,
namespace_decl, array_type_def::subrange_type, array_type_def, enum_type_decl,
typedef_decl, var_decl, function_decl, function_decl::parameter,
function_type, method_type, class_decl, union_decl,
}::get_pretty_representation): Adjust the code to emit qualified
or non-qualified names depending on the new "qualified_name"
boolean parameter.
(get_class_or_union_flat_representation): Likewise.
* src/abg-default-reporter.cc (default_reporter::report): Use
get_class_or_union_flat_representation with the new
"qualified_name" boolean set to false.
* tests/data/test-diff-dwarf/test38-union-report-0.txt: Adjust.
* tests/test-diff-filter.cc (in_out_specs): Run the test harness
on test-PR24731-v{0,1}.o and make abidiff use the --harmless option.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Bug 24731 - Wrongly reporting union members order change

When union data members are re-ordered, abidiff reports the
re-ordering as if it was a meaningful ABI change.

This patch teaches Libabigail to categorize that benign type layout
change as a HARMLESS_UNION_CHANGE_CATEGORY kind of change and ignore it.
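
One way to picture the check is the sketch below.  It is an assumption
about the general idea, not the actual logic of
union_diff_has_harmless_changes: members are modeled as hypothetical
(type, name) pairs, and a change counts as a pure re-ordering when the
two member lists differ but hold the same members once sorted.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a union data member: (type name, member name).
typedef std::pair<std::string, std::string> member;

// Return true if the two versions of the union differ only by the order
// of their data members.  All union members live at offset 0, so such a
// re-ordering does not change the layout.
inline bool
only_member_order_changed(std::vector<member> before,
                          std::vector<member> after)
{
  if (before == after)
    return false;  // nothing changed at all
  std::sort(before.begin(), before.end());
  std::sort(after.begin(), after.end());
  return before == after;
}
```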

* include/abg-comp-filter.h (union_diff_has_harmless_changes):
Declare new function and ...
* src/abg-comp-filter.cc (union_diff_has_harmless_changes):
... define it here.
(categorize_harmless_diff_node): Use the new
union_diff_has_harmless_changes here.
* include/abg-comparison.h (HARMLESS_UNION_CHANGE_CATEGORY): Add a
new enumerator to diff_category enum. Adjust the value of the
other enumerators.
* src/abg-comparison.cc (get_default_harmless_categories_bitmap):
Add the new HARMLESS_UNION_CHANGE_CATEGORY in here.
(operator<<(ostream& o, diff_category c)): Support the new
HARMLESS_UNION_CHANGE_CATEGORY.
* tests/data/test-diff-filter/test-PR24731-report-0.txt: Likewise.
* tests/data/test-diff-filter/test-PR24731-report-1.txt: Likewise.
* tests/data/test-diff-filter/test-PR24731-v0.c: Likewise.
* tests/data/test-diff-filter/test-PR24731-v0.o: Likewise.
* tests/data/test-diff-filter/test-PR24731-v1.c: Likewise.
* tests/data/test-diff-filter/test-PR24731-v1.o: Likewise.
* tests/data/Makefile.am: Add the new test material above to
source distribution.
* tests/test-diff-filter.cc (in_out_spec): Add the new test input
to this test harness.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Fully account for anonymous-ness of scopes when comparing decl names

When comparing internal decl names (as part of decl comparison), we
need to take into account the fact that a given decl might be
anonymous and that it might have anonymous scopes in its tree of
containing scopes.

For instance, "__anonymous_struct__1::foo" and
"__anonymous_struct__2::foo" are considered equivalent.

So are "__anonymous_struct__1::foo::__anonymous_struct__2::bar" and
"__anonymous_struct__10::foo::__anonymous_struct__11::bar".

But "__anonymous_struct__1::bar::__anonymous_struct__2::baz" and
"__anonymous_struct__10::foo::__anonymous_struct__11::bar" are not.

This patch introduces the function tools_utils::decl_names_equal that
compares fully qualified names by taking into account anonymous
component names.
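
The comparison rule illustrated by the examples above can be sketched as
follows.  This is not the code of tools_utils::decl_names_equal: the
helper names are hypothetical, and the "__anonymous_union__" and
"__anonymous_enum__" prefixes are assumed by analogy with the
"__anonymous_struct__" prefix shown in the examples.

```cpp
#include <string>
#include <vector>

// Split a qualified name like "a::b::c" into its components.
inline std::vector<std::string>
split_scopes(const std::string& name)
{
  std::vector<std::string> out;
  std::string::size_type start = 0, pos;
  while ((pos = name.find("::", start)) != std::string::npos)
    {
      out.push_back(name.substr(start, pos - start));
      start = pos + 2;
    }
  out.push_back(name.substr(start));
  return out;
}

// Return the anonymous prefix a component starts with, or "" if none.
// Only the struct prefix appears in the text; the other two are assumed.
inline std::string
anonymous_prefix(const std::string& component)
{
  static const char* prefixes[] =
    {"__anonymous_struct__", "__anonymous_union__", "__anonymous_enum__"};
  for (unsigned i = 0; i < 3; ++i)
    {
      std::string p = prefixes[i];
      if (component.compare(0, p.size(), p) == 0)
        return p;
    }
  return std::string();
}

// Compare two qualified names component by component.  Two anonymous
// components of the same kind match whatever their numbering.
inline bool
names_equal_sketch(const std::string& l, const std::string& r)
{
  std::vector<std::string> a = split_scopes(l), b = split_scopes(r);
  if (a.size() != b.size())
    return false;
  for (std::vector<std::string>::size_type i = 0; i < a.size(); ++i)
    {
      std::string pa = anonymous_prefix(a[i]), pb = anonymous_prefix(b[i]);
      if (!pa.empty() || !pb.empty())
        {
          if (pa != pb)
            return false;
          continue;
        }
      if (a[i] != b[i])
        return false;
    }
  return true;
}
```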

That function is thus used in the equals() function overload for
decl_base types.  Because tools_utils::decl_names_equal compares strings the
usual way (character by character) it's slower than comparing
instances of interned_string in O(1) time.  So the patch carefully
tries to use tools_utils::decl_names_equal sparingly; that is, it
uses it only when we are looking at decls that have some anonymous
scope.  That way, we use the fast interned_string comparison most of
the time.  By doing this, we barely see any performance degradation
while running abidw --noout on a full blown vmlinux binary.

* include/abg-ir.h (decl_base::{get_has_anonymous_parent,
set_has_anonymous_parent,
get_is_anonymous_or_has_anonymous_parent}): Declare new member
functions.
* src/abg-ir.cc (decl_base::priv::has_anonymous_parent_): Define
new data member.
(decl_base::priv): Initialize the new data member.
(decl_base::{get_has_anonymous_parent, set_has_anonymous_parent,
get_is_anonymous_or_has_anonymous_parent}): Define new member
functions.
(equals): In the overload for decl_base, use the new
decl_names_equal for decls that have anonymous scopes.
(scope_decl::add_member_decl): Propagate the
decl_base::has_anonymous_parent_ property.
* include/abg-tools-utils.h
(get_anonymous_struct_internal_name_prefix)
(get_anonymous_union_internal_name_prefix)
(get_anonymous_enum_internal_name_prefix, decl_names_equal):
Declare new functions.
* src/abg-comp-filter.cc (has_harmless_name_change): Handle the
case where the name change is actually from an anonymous name to
another one, using the new decl_names_equal function.
* src/abg-dwarf-reader.cc
(get_internal_anonymous_die_prefix_name): Renamed
get_internal_anonynous_die_base_name into this.  Use the new
get_anonymous_{struct, union, enum}_internal_name_prefix functions
here.
(get_internal_anonymous_die_name, die_qualified_type_name)
(build_enum_type, add_or_update_class_type)
(add_or_update_union_type): Adjust.
* src/abg-tools-utils.cc (get_anonymous_struct_internal_name_prefix)
(get_anonymous_union_internal_name_prefix)
(get_anonymous_enum_internal_name_prefix, decl_names_equal):
Define new functions.
* tests/test-tools-utils.cc: New test file.
* tests/Makefile.am: Add new runtesttoolsutils test, built from
test-tools-utils.cc.
* tests/data/test-diff-dwarf/test46-rust-report-0.txt: Adjust.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt:
Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

abg-reporter.h: add missing includes / using declarations

In order to build this (external!) header file stand-alone, it required
some minor fixes. I.e. adding some includes and using declarations.

* include/abg-reporter.h: fix includes and using declarations

Signed-off-by: Matthias Maennich <maennich@google.com>

[dwarf-reader] Fix indentation in compare_dies_string_attribute_value

While looking at something else, I realized that
compare_dies_string_attribute_value had some indentation that was off.

Fixed thus.

* src/abg-dwarf-reader.cc (compare_dies_string_attribute_value):
Fix indentation.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[dwarf-reader] Optimize speed of compare_as_decl_dies

Profiling shows that the *number of calls* to
compare_dies_string_attribute_value by compare_as_decl_dies represents
a hotspot. That is, compare_dies_string_attribute_value itself
doesn't necessarily take long, but it's being called "too much",
especially when compare_as_decl_dies is called for DIEs representing
classes/structs.

This patch thus reduces the calls to
compare_dies_string_attribute_value from 3 to 1 for classes/structs.

This optimization makes abidw's reading 10% faster (from ~5min:15s
to ~4min:45s) on a full-blown vmlinux. Note that abidw writing time
hasn't been optimized yet.

* src/abg-dwarf-reader.cc (die_is_class_type): Take a const
pointer to Dwarf_Die.
(compare_as_decl_dies): For classes/structs, call
compare_dies_string_attribute_value just once to compare the
DW_AT_name attribute values.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[dwarf-reader] Better use of linkage name for fn decl de-duplication

When looking at a C program, during function decl DIE de-duplication,
we can rely on linkage names of function declarations to quickly
determine if two function decls are equal, in a given binary.

This patch uses that observation to speed up function decl DIE
de-duplication. abidw --noout vmlinux goes from 8 to 5 minutes with
this.

* src/abg-dwarf-reader.cc (read_context::{die_is_in_c,
die_is_in_c_or_cplusplus}): Define new member functions.
(fn_die_equal_by_linkage_name): Define new static function.
(compare_dies): In the case for DW_TAG_subprogram, use the new
fn_die_equal_by_linkage_name.
* tests/data/test-annotate/test15-pr18892.so.abi: Adjust.
* tests/data/test-annotate/test21-pr19092.so.abi: Adjust.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Adjust.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[dwarf-reader] Re-use function types inside a given TU

Whenever we see a function type inside a translation unit, if it
matches one that has already been seen -- i.e., one that has the same
textual representation -- we should be able to re-use that same
function type without having to compare their types to be sure they
are the same, as part of the type canonicalization process.

This slightly increases analysis speed (by a few tens of seconds on a
total of 8 minutes) by decreasing the load on type canonicalization
when analyzing vmlinux. It also slightly reduces memory consumption,
so I am getting it in for now.
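
A minimal sketch of such a per-translation-unit lookup, keyed on the
textual representation of the function type.  The fn_type_sketch
structure and the tu_fn_type_cache class are hypothetical; the real code
works on DIEs and on the read_context maps listed below.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a function type IR node.
struct fn_type_sketch { std::string repr; };
typedef std::shared_ptr<fn_type_sketch> fn_type_sptr;

// One such cache would exist per translation unit.
class tu_fn_type_cache
{
  std::map<std::string, fn_type_sptr> repr_to_fn_type_;

public:
  // Return the function type already associated with this textual
  // representation, or build a new one and remember it.
  fn_type_sptr
  lookup_or_build(const std::string& repr)
  {
    std::map<std::string, fn_type_sptr>::const_iterator it =
      repr_to_fn_type_.find(repr);
    if (it != repr_to_fn_type_.end())
      return it->second;

    fn_type_sptr t(new fn_type_sketch);
    t->repr = repr;
    repr_to_fn_type_[repr] = t;
    return t;
  }

  std::size_t size() const { return repr_to_fn_type_.size(); }
};
```

Because a hit returns the very same object, no structural comparison is
needed later to learn that the two function types are identical.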

* src/abg-dwarf-reader.cc (istring_fn_type_map_type): Declare new
typedef.
(die_is_function_type): Define new static function.
(read_context::per_tu_repr_to_fn_type_maps_): Define new data
member ...
(read_context::per_tu_repr_to_fn_type_maps): ... and its accessor.
(read_context::{associate_die_repr_to_fn_type_per_tu,
lookup_fn_type_from_die_repr_per_tu}): Define new member
functions.
(build_function_type): Use the new
read_context::lookup_fn_type_from_die_repr_per_tu and
read_context::associate_die_repr_to_fn_type_per_tu functions,
instead of read_context::lookup_type_from_die.
* tests/data/test-annotate/test13-pr18894.so.abi: Adjust.
* tests/data/test-annotate/test14-pr18893.so.abi: Adjust.
* tests/data/test-annotate/test21-pr19092.so.abi: Adjust.
* tests/data/test-read-dwarf/test13-pr18894.so.abi: Adjust.
* tests/data/test-read-dwarf/test14-pr18893.so.abi: Adjust.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Adjust.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

[dwarf-reader] const-ify Dwarf_Die* use in many places

This is useful to prepare a subsequent patch that uses
get_die_pretty_type_representation with a const Dwarf_Die*. Trying to
const-ify the use of Dwarf_Die* in that function leads to a cascade of
const-ification. Much needed anyway.

* src/abg-dwarf-reader.cc (get_parent_die, get_scope_die)
(die_is_anonymous, die_is_type, die_is_decl, die_is_namespace)
(die_is_pointer_type, pointer_or_qual_die_of_anonymous_class_type)
(die_is_reference_type, die_is_pointer_or_reference_type)
(die_is_qualified_type, die_has_object_pointer)
(die_is_at_class_scope, die_unsigned_constant_attribute)
(die_signed_constant_attribute, die_attribute_is_signed)
(die_attribute_is_unsigned, die_attribute_has_no_signedness)
(die_name, die_location, die_qualified_type_name)
(die_qualified_decl_name, die_qualified_name)
(die_qualified_type_name_empty)
(die_return_and_parm_names_from_fn_type_die)
(die_function_signature, die_function_type_is_method_type)
(die_pretty_print_type, die_pretty_print_decl, die_pretty_print)
(maybe_canonicalize_type, build_subrange_type)
(build_subranges_from_array_type_die, compare_dies)
(read_context::get_container)
(read_context::compute_canonical_die_offset)
(read_context::get_or_compute_canonical_die)
(read_context::get_die_source)
(read_context::get_die_qualified_type_name)
(read_context::get_die_pretty_representation)
(read_context::get_die_language, read_context::odr_is_relevant)
(read_context::set_canonical_die_offset)
(read_context::associate_die_to_type, die_is_anonymous)
(die_string_attribute, die_constant_attribute)
(die_attribute_has_form, die_linkage_name)
(die_decl_file_attribute, die_die_attribute, die_size_in_bits)
(die_is_decl, die_is_namespace)
(pointer_or_qual_die_of_anonymous_class_type, die_is_array_type)
(die_is_pointer_reference_or_typedef_type)
(die_peel_pointer_and_typedef, die_function_type_is_method_type)
(die_virtuality, die_is_virtual)
(compare_dies_string_attribute_value, compare_dies_cu_decl_file)
(die_location_expr, die_member_offset)
(get_internal_anonynous_die_base_name, compare_as_decl_dies)
(compare_as_type_dies): Const-ify the Dwarf_Die* parameter(s) of
these functions.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Take anonymous scopes into account when comparing decls

This is another attempt at handling anonymous decls comparison.  It's
not the full blown method that I'd like, but this one seems to be fast
enough.  In this method, we take the immediate scope (and whether it's
anonymous or not) of the anonymous decl into account.

* include/abg-interned-str.h (interned_string::clear): Add new
member function.
* src/abg-ir.cc (equals): In the overload for decl_base, consider
the scope of the current (anonymous) decl.  If that scope is
anonymous then take that into account as well.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt:
Adjust.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

.clang-format: Add more options to match existing coding style

Add options for constructor initializers, using declarations and
consecutive declarations.

Even though sorting using declarations could be useful, it changes too
much existing code as of now.

* .clang-format: Add options for ConstructorInitializers
Set SortUsingDeclarations=false
Set AlignConsecutiveDeclarations=true

Signed-off-by: Matthias Maennich <maennich@google.com>

.gitignore: Add libabigail-?.* *.orig files

- Artifacts produced by `make dist` should be ignored.
- Artifacts produced by git merge resolution should be ignored.

* .gitignore: add entries for distribution artifacts
* .gitignore: add *.orig files

Signed-off-by: Matthias Maennich <maennich@google.com>

abg-writer: drop deprecated API

Drop the deprecated overloads for write_translation_unit, write_corpus,
write_corpus_group. Also remove the deprecation facilities as they are
not used anymore.

* include/abg-fwd.h (ABG_DEPRECATED): Remove this macro.
* include/abg-writer.h (write_translation_unit, write_corpus)
(write_corpus_group): Drop the deprecated overloads of these
declarations.
* src/abg-writer.cc (write_translation_unit, write_corpus)
(write_corpus_group): Drop the deprecated overloads of these
definitions.

Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>

abidw: add option to only emit file names (--short-locs)

Various emitted directories contain machine specific information and
therefore break reproducibility of abidw's output across different
build paths.

Hence introduce --short-locs to only emit file names.
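
The effect on an emitted location can be sketched as follows; the
file_name_only and location_to_emit helpers are hypothetical and only
assume '/'-separated paths.

```cpp
#include <string>

// Strip the directory part of a '/'-separated path.
inline std::string
file_name_only(const std::string& path)
{
  std::string::size_type pos = path.rfind('/');
  return pos == std::string::npos ? path : path.substr(pos + 1);
}

// Pick what to emit depending on a short-locs style flag.
inline std::string
location_to_emit(const std::string& path, bool short_locs)
{ return short_locs ? file_name_only(path) : path; }
```

Two builds of the same source tree under different directories then
yield the same location strings.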

Thanks to earlier changes, adding an option boils down to adding it to
set_opts and to the write_context along with some auxiliary functions
for setting and getting.

* include/abg-writer.h (set_short_locs): Declare new function.
(set_common_options): Use it.
set_opts
* src/abg-writer.cc (write_context::m_short_locs): New data
member.
(write_context::write_context): Initialize it.
(write_context::{g,s}et_short_locs): Define new accessors.
(write_location, write_translation_unit, write_corpus): Honour the
new write_context::get_short_locs property.
(set_short_locs): Define new function.
* tools/abidw.cc (options::short_locs): New data member.
(display_usage): Help string for the new --short-locs option.
(parse_command_line): Parse the new --short-locs option.

Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>

abidw: add option to omit the compilation directory

The compilation directory contains machine specific information and
therefore breaks reproducibility of abidw's output across different
build paths. Hence introduce --no-comp-dir-path (as in the xml
attribute). Internally I decided to not carry on the duplication of
'dir' and 'path' and used 'comp_dir'.

Thanks to earlier changes, adding an option boils down to adding it to
set_common_options and to the write_context along with some auxiliary
functions for setting and getting.

write_translation_unit uses the flag in the write_context and omits the
comp-dir-path if asked for.

* include/abg-writer.h (set_write_comp_dir): Declare new function.
(set_common_options): Use it.
* src/abg-writer.cc (write_context::m_write_comp_dir): Define new
data member.
(write_context::write_context): Initialize it.
(write_context::{g,s}et_write_comp_dir): Define new member
accessors.
(set_write_comp_dir): Define new free-form getter.
(write_translation_unit): Teach to respect write_comp_dir flag of
write_context.
* tools/abidw.cc (options::write_comp_dir): Define new data
member.
(options::options): Initialize it.
(display_usage): Add doc string for a new command line option: --no-comp-dir-path.
(parse_command_line): Parse the new command line option --no-comp-dir-path.

Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Make write_architecture and write_corpus_path flags in the write_context

Having write_context carry corresponding flags for output-sensitive
command line options is useful to ensure these options are not lost in
chains of write_* calls. In particular, these options can have various
meanings depending on the context (corpus, corpus_group, etc.)

Hence add them to the write_context along with getters and setters and
make the writers aware of their existence. We do not need to modify the
corpus or corpus group's path or architecture any longer as they get
ignored for a different reason now.

Finally, drop the flag handling in abidw as it is already done via
set_common_options, which learned about these new flags.

* include/abg-writer.h (set_write_architecture)
(set_write_corpus_path): Declare new setter functions.
(write_corpus): Take a new "member_of_group" argument.
(set_common_options): Use set_write_{architecture, corpus_path}
here.
* src/abg-writer.cc (write_context::m_write_{architecture,
corpus_path}): Add new data members.
(write_context::write_context): Initialize the new data members.
(write_context::{s,g}et_write_{architecture, corpus_path}): Define new
accessors.
(set_write_{architecture, corpus_path}): Define new free-form setter
functions.
(write_corpus): Add a flag telling whether the corpus is written as
part of a group.
* tools/abidw.cc (load_corpus_and_write_abixml)
(load_kernel_corpus_group_and_write_abixml): Drop obsolete option
handling as xml_writer::set_common_options now takes care of it.

Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
diff --git a/include/abg-writer.h b/include/abg-writer.h
index 200b5f7..729b455 100644
--- a/include/abg-writer.h
+++ b/include/abg-writer.h
@@ -53,6 +53,11 @@ set_show_locs(write_context& ctxt, bool flag);
void
set_annotate(write_context& ctxt, bool flag);

+void
+set_write_architecture(write_context& ctxt, bool flag);
+
+void
+set_write_corpus_path(write_context& ctxt, bool flag);

/// A convenience generic function to set common options (usually used
/// by Libabigail tools) from a generic options carrying-object, into
@@ -69,6 +74,8 @@ set_common_options(write_context& ctxt, const OPTS& opts)
{
   set_annotate(ctxt, opts.annotate);
   set_show_locs(ctxt, opts.show_locs);
+  set_write_architecture(ctxt, opts.write_architecture);
+  set_write_corpus_path(ctxt, opts.write_corpus_path);
}

void
@@ -105,7 +112,10 @@ write_corpus_to_archive(const corpus_sptr corp,
const bool annotate = false);

bool
-write_corpus(write_context& ctxt, const corpus_sptr& corpus, unsigned indent);
+write_corpus(write_context& ctxt,
+      const corpus_sptr& corpus,
+      unsigned indent,
+      bool member_of_group = false);

bool ABG_DEPRECATED
write_corpus(const corpus_sptr& corpus, unsigned indent, write_context& ctxt);

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

abidw: Consolidate setting options

When setting options meant to be used for the write_context, it is easy
to forget to change all relevant locations. To consolidate that,
introduce a set_common_options function that sets the various known
options.

We benefit from the earlier refactoring, as the write_context is now
passed around to carry options for writers. Hence we can be sure that if
we set up the context correctly (and do not use deprecated functionality),
the respective write_* function will see the options set in the context.
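
The consolidation idea can be sketched as a small function template that
copies every known flag from a tool's options object into the context.
The write_context stand-in and the tool_options struct below are
assumptions made for this illustration; only the shape of
set_common_options follows the declaration in include/abg-writer.h.

```cpp
#include <cassert>

// Simplified stand-in for the writer context.
class write_context
{
  bool m_annotate = false;
  bool m_show_locs = true;

public:
  void set_annotate(bool f) {m_annotate = f;}
  bool get_annotate() const {return m_annotate;}
  void set_show_locs(bool f) {m_show_locs = f;}
  bool get_show_locs() const {return m_show_locs;}
};

// Free-form setters, one per option.
void set_annotate(write_context& ctxt, bool flag) {ctxt.set_annotate(flag);}
void set_show_locs(write_context& ctxt, bool flag) {ctxt.set_show_locs(flag);}

// One place that knows about every common option.  Any options type
// with the expected data members can be passed in.
template <typename OPTS>
void
set_common_options(write_context& ctxt, const OPTS& opts)
{
  set_annotate(ctxt, opts.annotate);
  set_show_locs(ctxt, opts.show_locs);
}

// Hypothetical options struct of a command line tool.
struct tool_options
{
  bool annotate = true;
  bool show_locs = false;
};
```

A tool then calls set_common_options(ctxt, opts) once, and a new option
only needs one extra line in the template.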

* include/abg-writer.h (set_common_options): Declare new function.
* tools/abidw.cc (load_corpus_and_write_abixml)
(load_kernel_corpus_group_and_write_abixml): Use the newly
introduced set_common_options.

Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>

write_context: allow mutating the ostream used

Allowing the ostream to be mutated enables heavy reuse of the
write_context, as the ostream is the only thing that differs in most
places where write_context is used.
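
A reference member cannot be re-seated, which is why the stream has to
be held through a pointer to become mutable.  The sketch below shows
that pattern with a simplified write_context stand-in; it is an
illustration of the idea, not the class from src/abg-writer.cc.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Simplified stand-in for the writer context.
class write_context
{
  // A pointer, not a reference, so that it can be changed later.
  std::ostream* m_ostream;

public:
  explicit write_context(std::ostream& os) : m_ostream(&os) {}

  std::ostream& get_ostream() {return *m_ostream;}
  void set_ostream(std::ostream& os) {m_ostream = &os;}
};

// Free-form counterpart of the member setter.
void
set_ostream(write_context& ctxt, std::ostream& os)
{ctxt.set_ostream(os);}
```

The same context, with all of its options, can then be pointed at one
output stream after another.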

Hence, fix up various users of write_context and use common objects where
applicable.

* include/abg-writer.h (set_ostream): Declare new function.
* src/abg-writer.cc (write_context::m_ostream): Make this data
member be a pointer rather than a reference.
(write_context::{write_context, get_ostream}): Adjust.
(write_context::set_ostream): Define new member function.
(set_ostream): Define new free-form function.
* tools/abidw.cc (load_corpus_and_write_abixml)
(load_kernel_corpus_group_and_write_abixml): Use the feature of
mutating the ostream and reuse the write_context in most cases.

Signed-off-by: Matthias Maennich <maennich@google.com>

abg-writer: Refactor write_corpus_group API

Introduce a new call overload for write_corpus_group that follows the
parameter order context, object (i.e. corpus_group), indent.

Deprecate all other overloads that were part of the API and mostly
forward them to the new overload. That effort is made to ensure
write_context is always provided. write_context allows access to all
options that influence the output format.

* include/abg-writer.h (write_corpus_group): Introduce new
overload write_corpus_group(ctxt, corpus_group, indent) and
deprecate all others.
* src/abg-writer.cc (write_corpus_group): Likewise for the
definitions and adjust.
* tools/abidw.cc (load_kernel_corpus_group_and_write_abixml):
Migrate to new API of write_corpus_group()

Signed-off-by: Matthias Maennich <maennich@google.com>

abg-writer: Refactor write_corpus API

Introduce a new overload for write_corpus that follows the parameter
order context, object (i.e. corpus), indent.

Deprecate all other overloads that were part of the API and mostly
forward them to the new overload. That effort is made to ensure
write_context is always provided. write_context allows access to all
options that influence the output format.

* include/abg-writer.h (write_corpus): Introduce new overload
write_corpus(ctxt, corpus, indent) and deprecate all others.
* src/abg-writer.cc (write_corpus): Likewise for the definitions
and adjust.
* tests/test-read-dwarf.cc (test_task::perform): Use the new
write_corpus which requires a write_context.
* tools/abidw.cc (load_corpus_and_write_abixml): Likewise.
* tools/abilint.cc (main): Likewise. Also simplify logic around the
locations as they now can be expressed with less code.

Signed-off-by: Matthias Maennich <maennich@google.com>

abg-writer: Refactor write_translation_unit API

Introduce a new overload for write_translation_unit that follows the
parameter order context, object (i.e. translation unit), indent.

Deprecate all other overloads that were part of the API and mostly
forward them to the new one. That effort is made to ensure write_context
is always provided. write_context allows access to all options that
influence the output format.

* include/abg-writer.h (write_translation_unit): Declare a new
overload write_translation_unit(ctxt, tu, indent) and deprecate
all others.
* src/abg-writer.cc (write_translation_unit): Likewise in the
definitions.
(write_corpus, dump, write_translation_unit): Adjust.
* tools/abilint.cc (main): use new write_translation_unit() API

Signed-off-by: Matthias Maennich <maennich@google.com>

Add deprecation facilities

Add the macro 'ABG_DEPRECATED' to mark APIs that are to be removed in
the next major release. APIs marked with that flag are supposed to work as
before, but might come with downsides. E.g. they could perform worse or
provide only limited functionality. All deprecated functions shall come
with a hint to equivalent functionality within the non-deprecated part
of the API.
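
A minimal sketch of how such a marker can be combined with a forwarding
overload follows.  The macro name SKETCH_DEPRECATED, its GCC/Clang
attribute spelling, and the write_thing functions are assumptions for
this example; the real ABG_DEPRECATED definition in include/abg-fwd.h
may differ.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Assumed definition for this sketch (GCC/Clang attribute syntax).
#define SKETCH_DEPRECATED __attribute__((deprecated))

// Simplified stand-in for the writer context.
struct write_context
{
  std::ostringstream out;
};

// Preferred overload: context first, then the object, then the indent.
bool
write_thing(write_context& ctxt, const std::string& thing, unsigned indent)
{
  ctxt.out << std::string(indent, ' ') << thing;
  return true;
}

// Old parameter order.  It keeps working but is flagged, and it simply
// forwards to the overload above.
// Hint for callers: use write_thing(ctxt, thing, indent) instead.
bool SKETCH_DEPRECATED
write_thing(const std::string& thing, unsigned indent, write_context& ctxt)
{return write_thing(ctxt, thing, indent);}
```

Callers of the old overload get a compiler warning but no change in
behaviour, which gives them a release cycle to migrate.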

* include/abg-fwd.h: Introduce deprecation macro ABG_DEPRECATED

Signed-off-by: Matthias Maennich <maennich@google.com>

abg-writer: Simplify 'annotate' propagation

'annotate' is one of many flags that could potentially influence the
way output is written. Remove the default parameter from
write_context's constructor and let users explicitly set that flag on
the context.

* src/abg-writer.cc (write_context::write_context): remove
'annotate' parameter.
(write_translation_unit, write_corpus, write_corpus_group, dump): Adjust.

Signed-off-by: Matthias Maennich <maennich@google.com>

Add .clang-format approximation

Add .clang-format definitions that are an approximation of the current
coding style. As I understand it, the current style is based on what GNU
Emacs implements for C++. Hence these rules might not be entirely
accurate, but a good-enough approximation to allow contributors to
follow the coding style more easily.

I expect modifications for specific cases and when clang-format itself
evolves over time.

As of now, this definition is most useful in partial code formatting,
such as executed by `git clang-format` on staged files or
clang-format.py as a means of integration into various editors.

* .clang-format: New File.

Signed-off-by: Matthias Maennich <maennich@google.com>

Bug 24552 - abidiff fails comparing a corpus against a corpus group

In this problem report, the issue is that when comparing two corpus
groups, especially when looking up function/variable symbols, the
get_fun_symbol_map() and get_var_symbol_map() member functions used
are corpus::get_{fun,var}_symbol_map, rather than
corpus_group::get_{fun, var}_symbol_map.  Note that the type
corpus_group inherits from the type corpus.  That leads to unexpected
comparison results, especially for symbols.

This patch fixes this by making the corpus::get_{fun, var}_symbol_map
member functions virtual and by using them during the lookup of
function/variable symbols.  That way, the right symbol map gets used.
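
The effect of the fix can be shown with a reduced class hierarchy.  The
types below only borrow the names corpus and corpus_group; the string
return values and the lookup helper are simplifications assumed for this
sketch.

```cpp
#include <cassert>
#include <string>

struct corpus
{
  virtual ~corpus() = default;

  // Virtual, so that a call made through a corpus reference reaches
  // the override of the dynamic type.
  virtual std::string
  get_fun_symbol_map() const
  {return "corpus symbols";}

  // The lookup lives in the base class and goes through the virtual
  // accessor.
  std::string
  lookup_function_symbol() const
  {return get_fun_symbol_map();}
};

struct corpus_group : corpus
{
  std::string
  get_fun_symbol_map() const override
  {return "group symbols";}
};
```

Without the virtual keyword, the lookup in the base class would always
use the corpus version of the accessor, even for a corpus_group.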

* include/abg-corpus.h (corpus{_group}::get_{fun,
var}_symbol_map): Make these member functions virtual.
* src/abg-corpus.cc (corpus::lookup_{function, variable}_symbol):
Use the virtual corpus::get_{fun, var}_symbol_map() member
function to get the symbols of the current corpus or corpus_group.
* tests/data/Makefile.am: Add the new test input material below to
source distribution.
* tests/data/test-abidiff/test-PR24552-report0.txt: New test input.
* tests/data/test-abidiff/test-PR24552-v0.abi: Likewise.
* tests/data/test-abidiff/test-PR24552-v1.abi: Likewise.
* tests/test-abidiff.cc (main): Support comparing corpus groups.
(specs): Add the new test inputs to the harness.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Bug 24560 - Assertion failure on an abixml with an anonymous type

When reading an abixml file, we should not try to re-use an anonymous
class, union or enum because by construction two anonymous unions of
the same (internal) name don't necessarily designate the same type.
We already do that in the ELF/DWARF reader so we need to update the
abixml reader too.

Fixed thus.

* src/abg-reader.cc (read_context::maybe_canonicalize_type): Delay
canonicalization of union types too.
(build_class_decl, build_union_decl): Do not try to re-use
anonymous types.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Handle Linux kernel binaries with no __ksymtab section

Some Linux kernel binaries can have no __ksymtab section. It's
possible that they only have a __ksymtab_gpl section, or no
__ksymtab_gpl section either. And apparently they can also have a
__ksymtab* section full of zeroes.

This patch gets the ELF/DWARF reader prepared to handle these cases.

* src/abg-dwarf-reader.cc (find_section): Use elf_getshdrstrndx
rather than poking at the elf header on our own.
(read_context::find_any_ksymtab_section): Define new member
function.
(read_context::{get_symtab_format,
try_reading_first_ksymtab_entry_using_pre_v4_19_format}): Use the
new find_any_ksymtab_section rather than find_ksymtab_section.
(read_context::get_nb_ksymtab_entries): Handle the absence of
__ksymtab.
(read_context::get_nb_ksymtab_gpl_entries): Handle the absence of
__ksymtab_gpl.
(read_context::load_kernel_symbol_table): Handle the case of zero
ksymtab entries.
(read_context::{maybe_adjust_address_for_exec_or_dyn,
maybe_adjust_fn_sym_address, load_kernel_symbol_table}): Handle an
address that is zero.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Fix logic of get_binary_load_address

The value of the pointer to the program header returned by
gelf_getphdr is always the same, assuming the value of the last
parameter to gelf_getphdr stays the same.  What changes is what is
pointed to by that pointer.  So rather than storing the program
header (to determine the lowest load address among several program
headers returned by gelf_getphdr) we need to store the load address
pointed to by the pointer to the program header.
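
The aliasing problem can be reproduced without libelf.  In the sketch
below, fake_getphdr is a hypothetical function that mimics the contract
described above (copy entry i into the caller's buffer and return a
pointer to that buffer); the phdr struct and both lowest_vaddr_*
functions are likewise assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a program header; only the load address matters here.
struct phdr
{
  std::uint64_t vaddr;
};

// Copies entry I into *DST and returns DST, or nullptr if I is out of
// range.  The returned pointer is therefore always the same.
phdr*
fake_getphdr(const std::vector<phdr>& table, std::size_t i, phdr* dst)
{
  if (i >= table.size())
    return nullptr;
  *dst = table[i];
  return dst;
}

// Flawed logic: remembers the pointer, which aliases the scratch
// buffer, so it ends up reading the last header that was fetched.
std::uint64_t
lowest_vaddr_storing_pointer(const std::vector<phdr>& table)
{
  phdr mem;
  phdr* lowest = nullptr;
  for (std::size_t i = 0; i < table.size(); ++i)
    {
      phdr* cur = fake_getphdr(table, i, &mem);
      if (!lowest || cur->vaddr < lowest->vaddr)
        lowest = cur;
    }
  return lowest ? lowest->vaddr : 0;
}

// Corrected logic: remembers the load address itself.
std::uint64_t
lowest_vaddr_storing_value(const std::vector<phdr>& table)
{
  phdr mem;
  bool found = false;
  std::uint64_t lowest = 0;
  for (std::size_t i = 0; i < table.size(); ++i)
    {
      phdr* cur = fake_getphdr(table, i, &mem);
      if (!found || cur->vaddr < lowest)
        {
          lowest = cur->vaddr;
          found = true;
        }
    }
  return lowest;
}
```

With headers at 0x1000 and 0x400000, the first variant reports the
address of whichever header was read last, while the second one reports
the lowest address.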

Thanks to Matthias Männich for spotting this and discussing it at
https://sourceware.org/ml/libabigail/2019-q2/msg00064.html.

Testing this is a bit annoying as we'd need to consider a prelinked binary which has
split debuginfo.  I should probably add such a binary to the regression
test suite at some point.

* src/abg-dwarf-reader.cc (get_binary_load_address): Consider the
load address pointed to by the program header pointer returned by
gelf_getphdr rather than the program header itself.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Bug 24431 Treat __ksymtab as int32_t for v4.19+ kernels

Calculating the relocation for values in __ksymtab with GElf_Addr (i.e.
uint64_t) makes the calculation rely on overflows for negative offsets.
Address that by treating these as 32bit signed values for the v4.19+
__ksymtabs and calculating the offset with them. This also allows,
similarly to an earlier commit, dropping the distinction between 64bit
and 32bit kernels.
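
The arithmetic can be illustrated in isolation.  The sketch assumes that
a v4.19+ entry holds a 32-bit offset relative to the entry's own
address; the two function names are hypothetical and this is not the
code of maybe_adjust_sym_address_from_v4_19_ksymtab.

```cpp
#include <cassert>
#include <cstdint>

// Treats the 32-bit entry as unsigned: a negative offset turns into a
// large positive one and the result lands far past the entry.
std::uint64_t
resolve_zero_extended(std::uint64_t entry_address, std::uint32_t raw)
{return entry_address + static_cast<std::uint64_t>(raw);}

// Treats the 32-bit entry as signed and sign-extends it before the
// addition, so that negative offsets point before the entry.
std::uint64_t
resolve_sign_extended(std::uint64_t entry_address, std::int32_t offset)
{
  return entry_address
    + static_cast<std::uint64_t>(static_cast<std::int64_t>(offset));
}
```

For an entry at 0x1000 holding the bytes of -16, the signed variant
yields 0xff0, whereas the zero-extended one yields 0x100000ff0.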

* src/abg-dwarf-reader.cc (maybe_adjust_sym_address_from_v4_19_ksymtab):
treat passed addr as 32bit signed offset in case of v4.19+ __ksymtabs

Signed-off-by: Matthias Maennich <maennich@google.com>

Bug 24431 Read 32bit values when testing for the v4.19 symbol table format

Reading into uint64_t when reading the symbol table values drops the
sign and subsequently offset calculations will only be correct if either
the offset is positive or if the calculation overflows.

Read the relative value as signed int32_t (independently of the target's
bitness) to allow negative offsets. That also allows dropping the code
that formerly handled the overflow.

That change fixes an assertion raised when dealing with aarch64 kernel
binaries that have a __ksymtab with 32bit relocations, i.e. Bug #24431.

* src/abg-dwarf-reader.cc
(try_reading_first_ksymtab_entry_using_v4_19_format): attempt to
read first __ksymtab entry into int32_t to preserve sign

Signed-off-by: Matthias Maennich <maennich@google.com>

dwarf-reader: templatize read_int_from_array_of_bytes

Making the return value type a template type allows for signed types to
be passed and successfully interpreted.
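
A sketch of what a templatized reader can look like is given below.  It
is a simplified, little-endian-only illustration under the hypothetical
name read_le_int; the signature and endianness handling of the real
read_int_from_array_of_bytes are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Reads N little-endian bytes into RESULT.  T can be a signed or an
// unsigned integer type; the caller picks the interpretation.
template <typename T>
bool
read_le_int(const std::uint8_t* bytes, std::size_t n, T& result)
{
  static_assert(std::is_integral<T>::value, "T must be an integer type");
  typedef typename std::make_unsigned<T>::type unsigned_type;

  if (!bytes || n == 0 || n > sizeof(T))
    return false;

  // Accumulate in the unsigned counterpart to avoid shifting into the
  // sign bit, then convert once at the end.
  unsigned_type acc = 0;
  for (std::size_t i = 0; i < n; ++i)
    acc |= static_cast<unsigned_type>(bytes[i]) << (8 * i);

  result = static_cast<T>(acc);
  return true;
}
```

Reading the four bytes fc ff ff ff into an int32_t gives -4, while
reading them into a uint64_t gives 0xfffffffc; the sign is only
preserved when the destination type is signed and of matching width.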

* src/abg-dwarf-reader.cc (read_int_from_array_of_bytes):
templatize return type to allow passing of signed integer references

Signed-off-by: Matthias Maennich <maennich@google.com>

dwarf-reader: Fix comments for try_reading_first_ksymtab_entry_using_{pre_,}v4_19_format

Swap the descriptive comments for the two functions.

* src/abg-dwarf-reader.cc: swap the comments of
try_reading_first_ksymtab_entry_using_{pre_,}v4_19_format

Signed-off-by: Matthias Maennich <maennich@google.com>

Better handle several anonymous types of the same kind

This is a follow-up patch for the commit:

   43d56de Handle several member anonymous types of the same kind

It allows support for several anonymous types even when these are not
members of a class/union.

The patch introduces the concept of a scoped name.  It's a qualified
name for a decl made of the name of the decl appended to the
*unqualified* name of its scope.  Unlike for qualified names, the
scoped name won't have a "__anonymous_*__" string in its name if its
directly containing scope is not anonymous; a qualified name might
still have that string in its name because the decl has a parent scope
(not necessarily its directly containing scope though) that is
anonymous.
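
The difference between the two kinds of names can be sketched with a
toy scope chain.  The decl struct and both helper functions are
assumptions made for this example and are far simpler than
decl_base::get_scoped_name in src/abg-ir.cc.

```cpp
#include <cassert>
#include <string>

// Toy declaration: a name plus a pointer to the enclosing scope.
struct decl
{
  std::string name;
  const decl* scope; // nullptr at the top level
};

// Walks the whole scope chain.
std::string
qualified_name(const decl& d)
{
  if (d.scope && !d.scope->name.empty())
    return qualified_name(*d.scope) + "::" + d.name;
  return d.name;
}

// Only looks at the directly containing scope, using its unqualified
// name.
std::string
scoped_name(const decl& d)
{
  if (d.scope && !d.scope->name.empty())
    return d.scope->name + "::" + d.name;
  return d.name;
}
```

For a member m of a named struct S nested inside an anonymous struct,
the qualified name still contains "__anonymous_struct__" while the
scoped name is just "S::m".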

The patch goes on to update the logic for comparison of decls that are
anonymous.  For a decl whose direct scope is *NOT* anonymous, the
scoped name is what's used in the comparison.  Otherwise, only the
name of the decl is used.

The patch also updates how we detect changes in data members and
member types, in the comparison engine.  It now uses the names of the
data members, rather than their qualified name.  This is in the scope
of the current class/union anyway.  The improvement is that the fact
that the class/union itself is anonymous (even if its anonymous name
changes to another anonymous name) won't have any spurious impact on
the detection of name change of the members.

The patch considers the change of an anonymous decl's name to
another anonymous name to be harmless.

The patch updates the logic of category propagation in the comparison
engine.  Although a public typedef to a private underlying type needs to
stay public and thus not propagate the PRIVATE_TYPE_CATEGORY from its
child diff node to itself, it still needs to suppress the changes to
the private underlying diff node that were suppressed (because of the
private-ness), unless that typedef has local changes.

* include/abg-ir.h (decl_base::get_scoped_name): Declare new
member function.
(scope_decl::get_num_anonymous_member_{classes, unions, enums}):
Declare new virtual member functions.
(class_decl::get_num_anonymous_member_{classes, unions, enums}):
Adjust to make these virtual.  It's not necessary but I feel
redundancy is a kind of self-documentation here.
* src/abg-comp-filter.cc (has_harmless_name_change): Consider
anonymous name changes as harmless.
* src/abg-comparison.cc
(class_or_union_diff::ensure_lookup_tables_populated): Consider
the names of the members rather than their qualified names.
(suppression_categorization_visitor::visit_end): Suppress the
changes to the private underlying diff node that were suppressed
because of the private-ness, unless that typedef has local
changes.
* src/abg-dwarf-reader.cc (build_enum_type)
(add_or_update_class_type, add_or_update_union_type): Handle
anonymous types in namespaces as well, not just in class/unions.
* src/abg-ir.cc (decl_base::priv::scoped_name_): Define new data
member.
(decl_base::get_scoped_name): Define new member function.
(equals): For the decl_base overload, use scoped name in the
comparison, unless the decl belongs to an anonymous type.  For the
class_or_union_diff, only consider scoped_name during comparison.
Avoid name comparison between anonymous types.
(scope_decl::get_num_anonymous_member_{classes, unions, enums}):
Define new member functions.
(types_have_similar_structure): Do not compare names between
anonymous types.
(qualified_name_setter::do_update): Update scoped names too.
* tests/data/test-abidiff/test-PR18791-report0.txt: Adjust.
* tests/data/test-annotate/libtest23.so.abi: Likewise.
* tests/data/test-annotate/test13-pr18894.so.abi: Likewise.
* tests/data/test-annotate/test14-pr18893.so.abi: Likewise.
* tests/data/test-annotate/test15-pr18892.so.abi: Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-diff-dwarf/test43-PR22913-report-0.txt:
Likewise.
* tests/data/test-diff-dwarf/test46-rust-report-0.txt: Likewise.
* tests/data/test-diff-filter/test30-pr18904-rvalueref-report0.txt:
Likewise.
* tests/data/test-diff-filter/test30-pr18904-rvalueref-report1.txt:
Likewise.
* tests/data/test-diff-filter/test30-pr18904-rvalueref-report2.txt:
Likewise.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-0.txt:
Likewise.
* tests/data/test-diff-filter/test31-pr18535-libstdc++-report-1.txt:
Likewise.
* tests/data/test-diff-filter/test33-report-0.txt: Likewise.
* tests/data/test-diff-filter/test35-pr18754-no-added-syms-report-0.txt:
Likewise.
* tests/data/test-diff-filter/test44-anonymous-data-member-report-0.txt:
Likewise.
* tests/data/test-diff-pkg/libsigc++-2.0-0c2a_2.4.0-1_amd64--libsigc++-2.0-0v5_2.4.1-1ubuntu2_amd64-report-0.txt:
Likewise.
* tests/data/test-diff-pkg/nss-3.23.0-1.0.fc23.x86_64-report-0.txt:
Likewise.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-0.txt:
Likewise.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-1.txt:
Likewise.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-2.txt:
Likewise.
* tests/data/test-diff-pkg/spice-server-0.12.4-19.el7.x86_64-0.12.8-1.el7.x86_64-report-3.txt:
Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi:
Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/libtest23.so.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test13-pr18894.so.abi: Likewise.
* tests/data/test-read-dwarf/test14-pr18893.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi:
Likewise.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Handle several member anonymous types of the same kind

When there are several anonymous types (e.g, anonymous classes, unions
or enums) in a given class or union, libabigail's internals do
struggle.

An anonymous class, for instance, is named __anonymous_struct__.  When
there are more than one of these inside a given class, then we can't
name and look them up, because they all have the same name.

Incidentally, when add_or_update_class_type completes a class type
that was initially constructed before, it fails to determine that an
anonymous member type of that class was already present in that
context.  It thus wrongly duplicates anonymous structs/unions/enums in
there and that leads to spurious textual (abixml) representation
differences later, where duplicated anonymous member types would
appear intermittently, depending on the order in which the class was
built.

This patch addresses this general issue by naming anonymous member
types in a way that allows several of them to exist. That is, if there
are two anonymous structs in a class, they are going to be named
__anonymous_struct__ and __anonymous_struct__1.  We do follow a
similar scheme for anonymous unions and enums. This is handled by the
DWARF reader that builds the internal representation.
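
The naming scheme amounts to appending the index of the anonymous
member type to a base name, except for the first one.  The helper below
is a hypothetical sketch of that rule and does not reproduce
build_internal_anonymous_die_name from src/abg-dwarf-reader.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds the internal name of the INDEX-th anonymous member type of a
// given kind.  Index 0 keeps the bare base name.
std::string
build_anonymous_name(const std::string& base_name, std::size_t index)
{
  std::string name = base_name;
  if (index > 0)
    name += std::to_string(index);
  return name;
}
```

Two anonymous structs in the same class thus get distinct names and can
be looked up independently.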

While looking at this issue, I also fixed a tangential bug; some DWARF
emitters wrongly *define* types in the scope of a
DW_TAG_subroutine_type or DW_TAG_array_type.  We handle that by
actually defining those types in the scope of that subroutine or
array.  But then it appears that if that scope is itself a class and if
the type defined is an anonymous type, then putting that anonymous
type in the class scope might interfere with the *naming* of the
existing legit anonymous types of that scope.  I decided to put those
anonymous types in the containing namespace instead.  We'll see how
that goes in real time use.

The patch also updates lots of existing tests and adds a new one.

* include/abg-ir.h
(class_or_union::get_num_anonymous_member_{classes, unions,
enums}): Declare new member functions.
* src/abg-dwarf-reader.cc (get_internal_anonynous_die_base_name)
(build_internal_anonymous_die_name)
(get_internal_anonymous_die_name, is_anonymous_type_die): Define
new static functions.
(die_qualified_type_name): Use the new
get_internal_anonymous_die_name.
(get_scope_for_die): Fix this to put anonymous types that were
wrongly emitted into the scope of DW_TAG_subroutine_type or
DW_TAG_array_type by buggy DWARF emitters into the enclosing
namespace, rather than into the enclosing class/union.
(build_enum_type): Take the scope of the enum to have a chance to
properly name potential anonymous enums.
(lookup_class_typedef_or_enum_type_from_corpus): Take an anonymous
member type index for when the DIE we are looking up represents an
anonymous type.  Support proper building of the internal anonymous
name of the anonymous type we are looking up.
(add_or_update_class_type): Use the new
get_internal_anonynous_die_base_name and
build_internal_anonymous_die_name functions.  Support making sure
that the anonymous member type we are adding to the class wasn't
already there, especially for cases where we are updating a class
type.
(add_or_update_union_type): Use the new
get_internal_anonynous_die_base_name and
build_internal_anonymous_die_name functions.
(build_ir_node_from_die): Adjust the use of build_enum_type to
pass it the scope of the enum type we are building.
* src/abg-ir.cc (lookup_union_type): Add a new overload.
(lookup_class_or_typedef_type): Use the new overload of
lookup_union_type above to support looking up union types too.
(class_or_union::get_num_anonymous_member_{classes, unions,
enums}): Define new member functions.
* src/abg-reporter-priv.cc (represent): Detect when anonymous
types of anonymous data members have their internal names change,
probably because anonymous member types were inserted in the scope.
* tests/data/Makefile.am: Add the new test-anonymous-members-0.*
test input files to the source distribution.
* tests/data/test-annotate/test-anonymous-members-0.cc: New test
input file.
* tests/data/test-annotate/test-anonymous-members-0.o: Likewise.
* tests/data/test-annotate/test-anonymous-members-0.o.abi: Likewise.
* tests/data/test-annotate/test17-pr19027.so.abi: Adjust.
* tests/data/test-annotate/test18-pr19037-libvtkRenderingLIC-6.1.so.abi:
Likewise.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-diff-filter/test30-pr18904-rvalueref-report0.txt:
Likewise.
* tests/data/test-diff-filter/test30-pr18904-rvalueref-report1.txt:
Likewise.
* tests/data/test-diff-filter/test30-pr18904-rvalueref-report2.txt:
Likewise.
* tests/data/test-diff-filter/test35-pr18754-no-added-syms-report-0.txt:
Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test18-pr19037-libvtkRenderingLIC-6.1.so.abi:
Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi:
Likewise.
* tests/test-annotate.cc (in_out_specs): Add the new test inputs
to this test harness.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Use canonical types hash maps for type IDs in abixml writer

This patch stores canonical types in the hash maps that are used in
the abixml writer to compute the type IDs. This limits the
possibility that two types that are equivalent (especially when one is
a declaration-only class) end up in the same abixml.
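
Keying the ID map on the canonical type rather than on the type itself
can be sketched as follows.  The type struct, the type_id_map class and
the "type-id-N" format are assumptions made for this illustration; the
member function names merely echo those of write_context.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy type node: equivalent types share the same canonical pointer.
struct type
{
  std::string name;
  const type* canonical; // nullptr if not canonicalized
};

class type_id_map
{
  std::unordered_map<const type*, std::string> m_ids;
  unsigned m_next = 1;

  // The map key is the canonical type when there is one.
  static const type*
  key_for(const type& t)
  {return t.canonical ? t.canonical : &t;}

public:
  bool
  type_has_existing_id(const type& t) const
  {return m_ids.find(key_for(t)) != m_ids.end();}

  std::string
  get_id_for_type(const type& t)
  {
    const type* key = key_for(t);
    auto it = m_ids.find(key);
    if (it != m_ids.end())
      return it->second;
    std::string id = "type-id-" + std::to_string(m_next++);
    m_ids[key] = id;
    return id;
  }
};
```

A declaration-only class and its definition then resolve to one ID as
long as they share a canonical type.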

* src/abg-writer.cc (write_context::{type_has_existing_id,
get_id_for_type}): Save the canonical type of the type in the map,
not the type itself.
(write_context::{type_is_emitted}): Use the canonical type rather
than the type itself.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Don't try to de-duplicate all anonymous struct DIEs

Trying to de-duplicate anonymous struct DIEs can lead to subtle
issues, because there can be two different naming typedefs designating
two anonymous structs that are equivalent, in the same translation
unit. In that case, de-duplicating the two leaf anonymous struct
DIEs leads to a non-resolvable conflict.

This patch avoids de-duplicating anonymous structs DIEs and rather
de-duplicates (naming) typedefs.

* include/abg-fwd.h (is_typedef): Remove the overloads for
type_base_sptr and decl_base_sptr. Replace those with an overload
for type_or_decl_base_sptr.
* src/abg-ir.cc (is_typedef): Do the same for the definitions.
* src/abg-dwarf-reader.cc (add_or_update_class_type)
(add_or_update_union_type): Do not de-duplicate anonymous
struct/union DIEs.
(build_typedef_type): Try to de-duplicate typedefs DIEs.
* tests/data/test-annotate/test17-pr19027.so.abi: Adjust.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22015-libboost_iostreams.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Canonicalize types not tied to any DWARF DIE

During DWARF reading, it can happen that some types are created that
are not tied to a given DIE.  This happens for instance when editing
the internal representation of types to better reflect the intent of
the code rather than the letter of the code; more precisely this
happens when a qualified array is edited to represent an array of
qualified elements, for instance.  In that case, the qualified element
type is created by the DWARF analyser.  And in the present incarnation
of the code, we forget to canonicalize that type.

The fact that that type is not canonicalized leads to cases where two
equivalent types can still be present in a translation unit in the
resulting abixml representation, without the abixml writer noticing
it.  That duplication can lead to unnecessary cycles in the type graph
where passes can either take one side or the other of the cycle; and
that leads to subtle heisen-changes in the emitted abixml representation.

This patch thus canonicalizes those newly created types.

* src/abg-dwarf-reader.cc
(read_context::extra_types_to_canonicalize_): Add new data member.
(read_context::{initialize, clear_types_to_canonicalize}): Adjust.
(read_context::extra_types_to_canonicalize): Create new accessor.
(read_context::schedule_type_for_late_canonicalization): Add new
overload for type_base_sptr.
(read_context::perform_late_type_canonicalizing): Perform the
canonicalization of the types created by the DWARF analyzer, but
that are not tied to any DIE.
(maybe_strip_qualification): Take a read_context&.  Schedule newly
created types (during type edition) for late canonicalization.
(build_ir_node_from_die): Adjust the call to
maybe_strip_qualification to pass a read_context.
* tests/data/test-annotate/test15-pr18892.so.abi: Adjust.
* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi:
Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Don't try to read a build_id as string in find_alt_debug_info_link.

The GCC8 address sanitizer found an issue in find_alt_debug_info_link.
It tried to convert a build-id byte sequence into a string. But the
build-id byte sequence is not a zero terminated sequence of chars.
So it could run off way past the section data.

The code never actually uses the build-id. It could use it to verify
the referenced alt-file is the correct one. But since it uses elfutils
to actually load the alt file it doesn't have to, since elfutils will
already check the build-id matches.

So just remove the build_id argument from find_alt_debug_info_link
and don't try to convert and return it as a string.
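
To illustrate the hazard described above (this is not the libabigail
code), here is a minimal sketch of handling a length-delimited byte
sequence by its explicit size rather than as a NUL-terminated string.
The helper name bytes_to_hex is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: format exactly `size` bytes as lowercase hex.
// A build-id is a raw byte sequence with a known length and no
// terminating NUL, so it must not be handed to strlen() or to the
// std::string(const char*) constructor.
std::string
bytes_to_hex(const unsigned char* data, std::size_t size)
{
  static const char digits[] = "0123456789abcdef";
  std::string result;
  result.reserve(size * 2);
  for (std::size_t i = 0; i < size; ++i)
    {
      result += digits[data[i] >> 4];
      result += digits[data[i] & 0x0f];
    }
  return result;
}
```

Note that an embedded zero byte is formatted like any other byte; only
the explicit size bounds the loop.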

* src/abg-dwarf-reader.cc (find_alt_debug_info_link): Remove
build_id argument. Don't try to read the buildid chars as a
string.
(find_alt_debug_info): Don't call find_alt_debug_info_link
with a build_id string argument.

Signed-off-by: Mark Wielaard <mark@klomp.org>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Fix an undefined behaviour in has_var_type_cv_qual_change

* src/abg-comp-filter.cc (has_var_type_cv_qual_change):
Initialize the ch_kind variable before using it.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Add --enable-{asan,ubsan} configure options

Add options to enable building with -fsanitize=address or
-fsanitize=undefined.

* configure.ac: Add configure options for -fsanitize=address and
-fsanitize=undefined.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

abg-tools-utils.cc: Plug a leak in find_file_under_dir

We were forgetting to call fts_close on a file hierarchy returned by
fts_open. Plugged thus.

* src/abg-tools-utils.cc (find_file_under_dir): Call fts_close
before return.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

dwarf-reader: fix undefined behaviour in get_binary_load_address

Within the loop, the call `gelf_getphdr(elf_handle, i, &ph_mem)` is
returning a pointer to `ph_mem` that is only valid in this loop
iteration. The later assignment to *lowest_program_header and its
eventual use to assign load_address leads to undefined behaviour.
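
The pattern can be sketched without libelf as follows.  The header
type and the get_header function are hypothetical stand-ins for an API
that, like gelf_getphdr, fills a caller-provided buffer and returns a
pointer to it.  The sketch keeps every object that is pointed to
alive for the whole function.

```cpp
#include <cstddef>

// Hypothetical, simplified program header.
struct header
{
  unsigned long vaddr;
};

// Hypothetical stand-in for an API that fills `mem` and returns a
// pointer to that same buffer.
const header*
get_header(const header* table, std::size_t i, header* mem)
{
  *mem = table[i];
  return mem;
}

// Return the lowest vaddr among `count` headers, or 0 if there are none.
unsigned long
lowest_vaddr(const header* table, std::size_t count)
{
  // Both buffers are declared outside the loop, so pointers to them
  // stay valid after the loop ends.
  header mem = {0};
  header lowest_mem = {0};
  const header* lowest = 0;

  for (std::size_t i = 0; i < count; ++i)
    {
      const header* h = get_header(table, i, &mem);
      if (!lowest || h->vaddr < lowest->vaddr)
        {
          // Copy the value: `mem` is overwritten at the next iteration.
          lowest_mem = *h;
          lowest = &lowest_mem;
        }
    }
  return lowest ? lowest->vaddr : 0;
}
```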

* src/abg-dwarf-reader.cc (get_binary_load_address): Move the
ph_mem and program_header variables out of the inner for-loop.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Delay canonicalization for array and qualified types

Array and qualified types can be edited after they are built, e.g., to
fold the qualifiers of the array as those should always apply to the
element types of the array, at least in C and C++.  So we need to
delay the canonicalization of these types until *after* all that type
editing is done.  We were not doing that properly and I suspect this
is the cause of some "heisenregressions" we are seeing on some
platforms intermittently.

This patch does delay the type canonicalization and ensures that we
don't edit types that are canonicalized already.

* src/abg-dwarf-reader.cc (maybe_canonicalize_type): Delay the
canonicalization of array and qualified types, just like what we
do for classes and function types already.
(maybe_strip_qualification):  Do not
re-canonicalize array and qualified types here because it should
not be necessary anymore.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Fix a memory leak in real_path

Running the testsuite with AddressSanitizer turned on flagged a memory
leak in real_path, in abg-tools-utils.cc.

Fixed thus.
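
As a minimal sketch of this kind of fix (not the actual libabigail
code): when realpath is called with a null second argument, POSIX.1-2008
specifies that it returns a malloc()ed buffer which the caller must
free.  The function name resolved_path is hypothetical.

```cpp
#include <stdlib.h>
#include <string>

// Hypothetical sketch: resolve `path` and return the result by value.
// Returns an empty string if the path cannot be resolved.
std::string
resolved_path(const std::string& path)
{
  std::string result;
  char* resolved = realpath(path.c_str(), NULL);
  if (resolved)
    {
      result = resolved;
      // The buffer was allocated by realpath; release it once copied.
      free(resolved);
    }
  return result;
}
```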

* src/abg-tools-utils.cc (real_path): Free the returned pointer of
realpath.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Enable building with AddressSanitizer activated

With this patch, configuring the project with the environment variables
ABIGAIL_DEVEL=on and ABIGAIL_DEVEL_ASAN=on will turn on
AddressSanitizer.

* configure.ac: If ABIGAIL_DEVEL_ASAN=on (in addition to
ABIGAIL_DEVEL=on), then turn on AddressSanitizer in the build.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Bug 24431 - ELF reader fails to determine __ksymtab format

This is a fallout from this commit:

ad8c253 Bug 24431 - ELF reader can't interpret ksymtab with Kernel 4.19+

The problem lies in the method used to determine whether the format of
the __ksymtab section is pre or post v4.19.  In that commit, we assumed
that a kernel without relocations for the __ksymtab section implies a
v4.19 format, because in that format, we don't need relocations for
that section.  But then it can happen that the entire kernel is built
without relocations, in the case of non-relocatable kernels.  So just
looking at the presence of relocations is not a good enough
discriminant to determine the format of the __ksymtab section.

This patch rather tries to read the first entry of the __ksymtab
section assuming a pre v4.19 format, comes up with an address, and
then looks that address up in the .symtab section.  If the lookup
succeeds and we find a symbol, then we can reasonably assume that the
__ksymtab section is in the pre v4.19 format.  Otherwise, we do the
same assuming a v4.19 format, etc.
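
A simplified sketch of such a heuristic is given below.  It is not the
libabigail implementation: the function, the enumerator names and the
use of a std::set of symbol addresses in place of a real .symtab lookup
are all assumptions made for illustration.  It also assumes a 64 bits
kernel and host byte order.

```cpp
#include <cstddef>
#include <cstring>
#include <set>
#include <stdint.h>

// Hypothetical result type for the guess.
enum ksymtab_format_guess
{
  FORMAT_UNKNOWN,
  FORMAT_PRE_V4_19,
  FORMAT_V4_19
};

// `section` holds the raw bytes of __ksymtab, `section_addr` is its
// address, and `symtab_addrs` stands in for a lookup in .symtab.
ksymtab_format_guess
guess_ksymtab_format(const unsigned char* section, std::size_t size,
                     uint64_t section_addr,
                     const std::set<uint64_t>& symtab_addrs)
{
  // Pre v4.19: the first entry starts with an absolute 64 bits address.
  if (size >= sizeof(uint64_t))
    {
      uint64_t absolute = 0;
      std::memcpy(&absolute, section, sizeof(absolute));
      if (symtab_addrs.count(absolute))
        return FORMAT_PRE_V4_19;
    }

  // v4.19: the first entry starts with a signed 32 bits value that is
  // relative to its own location (offset 0 for the first entry).
  if (size >= sizeof(int32_t))
    {
      int32_t relative = 0;
      std::memcpy(&relative, section, sizeof(relative));
      uint64_t address =
        section_addr + static_cast<uint64_t>(static_cast<int64_t>(relative));
      if (symtab_addrs.count(address))
        return FORMAT_V4_19;
    }

  return FORMAT_UNKNOWN;
}
```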

Tested on v4.9, v4.16 and v4.16 kernels in 32 and 64 bits.

* src/abg-dwarf-reader.cc
(read_context::{try_reading_first_ksymtab_entry_using_pre_v4_19_format,
try_reading_first_ksymtab_entry_using_v4_19_format}): Define new
member functions.
(read_context::maybe_adjust_sym_address_from_v4_19_ksymtab): Make
this member function const.
(read_context::get_ksymtab_format): Implement the new heuristic
here, using try_reading_first_ksymtab_entry_using_pre_v4_19_format
and try_reading_first_ksymtab_entry_using_v4_19_format, rather
than assuming that if we have no relocations, then we are in the
v4.19 format.
(maybe_adjust_sym_address_from_v4_19_ksymtab): When on a 64 bits
architecture, ensure the 32 bits address read from the v4.19
format __ksymtab section is converted into a 64 bits address.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Update .gitignore files to ignore typical dev side products

Add / update .gitignore files for tests/ and tools to ignore
binaries, logs, traces typically produced during development.

* tests/.gitignore: exclude tests binaries and test results
* tools/.gitignore: update to ignore produced binaries

Signed-off-by: Matthias Maennich <maennich@google.com>

dwarf-reader: fix recursion in expr_result::operator&

operator& was implemented in terms of itself, leading to infinite
recursion. Fix that by implementing it in terms of &='ing the const
value. That is consistent with all other ?= operators.
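
The idiom can be sketched on a stripped-down type.  The class
expr_value below is a hypothetical stand-in, not the actual
expr_result from abg-dwarf-reader.cc.

```cpp
#include <stdint.h>

// Hypothetical, minimal stand-in for an expression result type.
class expr_value
{
  int64_t value_;

public:
  expr_value(int64_t v) : value_(v) {}

  int64_t
  value() const
  {return value_;}

  expr_value&
  operator&=(const expr_value& other)
  {
    value_ &= other.value_;
    return *this;
  }

  // Copy the left operand, then delegate to the compound assignment.
  // Writing `return *this & other;` here would call this very
  // function again and never terminate.
  expr_value
  operator&(const expr_value& other) const
  {
    expr_value result(*this);
    result &= other;
    return result;
  }
};
```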

* src/abg-dwarf-reader.cc: fix expr_result::operator&

Signed-off-by: Matthias Maennich <maennich@google.com>

distinct_diff: avoid expression with side effects within typeid

When compiling with clang, the following warning is emitted:

abg-comparison.cc:2674:17: warning: expression with side effects will be
evaluated despite being used as an operand to 'typeid' [-Wpotentially-evaluated-expression]
return typeid(*first.get()) != typeid(*second.get());
^

Mitigate that warning by moving potential side effects out of typeid.
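
One way to do this, sketched below under assumptions, is to bind
references first so that the operand of typeid is a plain lvalue.  The
class names are hypothetical, and std::shared_ptr is used here for
brevity; the actual code may use a different smart pointer type.

```cpp
#include <memory>
#include <typeinfo>

// Hypothetical polymorphic hierarchy.
struct base
{
  virtual ~base() {}
};

struct derived_a : base {};
struct derived_b : base {};

// Compare the dynamic types of the two pointed-to objects.
bool
have_distinct_types(const std::shared_ptr<base>& first,
                    const std::shared_ptr<base>& second)
{
  // The dereferences happen here, outside of typeid, so the typeid
  // operands below contain no function call.
  const base& f = *first;
  const base& s = *second;
  return typeid(f) != typeid(s);
}
```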

* src/abg-comparison.cc: fix clang warning "potentially-evaluated-expression"

Signed-off-by: Matthias Maennich <maennich@google.com>

ir: drop unused data members from {environment,qualified_name}_setter

The data members environment_setter::artifact_ and
qualified_name_setter::node_ are not used and can therefore be dropped
along with all their references.

* src/abg-ir.cc: drop unused data members

Signed-off-by: Matthias Maennich <maennich@google.com>

suppressions: drop unused parameter from type_is_suppressed

Drop 'require_drop_property' as it is not used at all.
That fixes a clang warning.

* include/abg-suppression-priv.h: drop unused argument from type_is_suppressed

Signed-off-by: Matthias Maennich <maennich@google.com>

viz-dot: remove unused members from dot

_M_canvas and _M_typo are unused within "dot". Remove them and all
references.

* include/abg-viz-dot.h: remove unused data members from 'dot'

Signed-off-by: Matthias Maennich <maennich@google.com>

add missing virtual destructors

Several virtual destructors were missing. Even though there might not
have been actual leaks or similar bugs, it is worth fixing these
locations as they might lead to bugs in the future.

Clang also warns at these locations:

warning: delete called on non-final 'abigail::ir::corpus' that has virtual
functions but non-virtual destructor [-Wdelete-non-virtual-dtor]
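
As a minimal sketch of why this matters (class names are hypothetical):
deleting a derived object through a pointer to a base class is only
well defined when the base class destructor is virtual.

```cpp
// Hypothetical base class with virtual functions.
class traversable
{
public:
  // Without `virtual` here, the `delete` below would have undefined
  // behaviour and ~node() might never run.
  virtual ~traversable() {}

  virtual bool
  traverse()
  {return true;}
};

class node : public traversable
{
  int& destroyed_;

public:
  explicit node(int& destroyed) : destroyed_(destroyed) {}

  ~node()
  {++destroyed_;}
};

// Create a node, destroy it through a base class pointer, and report
// how many times the derived destructor ran.
int
count_destroyed_through_base()
{
  int destroyed = 0;
  traversable* t = new node(destroyed);
  delete t;
  return destroyed;
}
```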

* include/abg-comparison.h: add virtual destructor for corpus_diff and diff_node_visitor
* include/abg-corpus.h: add virtual destructor for corpus
* include/abg-reporter.h: add virtual destructor for reporter_base
* include/abg-traverse.h: add virtual destructor for traversable_base

Signed-off-by: Matthias Maennich <maennich@google.com>

diff-utils: point: fix postfix decrement/increment operator

The postfix increment / decrement operators were implemented by calling
themselves recursively. Fix that by implementing these in terms of their
prefix counterparts.
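
The idiom looks like the sketch below.  This point class is a
hypothetical simplification and its prefix operators may not match the
semantics of the real one in abg-diff-utils.h.

```cpp
// Hypothetical, simplified point type.
class point
{
  int x_;
  int y_;

public:
  point(int x, int y) : x_(x), y_(y) {}

  int x() const {return x_;}
  int y() const {return y_;}

  // Prefix forms do the actual work.
  point& operator++() {++x_; ++y_; return *this;}
  point& operator--() {--x_; --y_; return *this;}

  // Postfix forms save a copy, delegate to the prefix form, and
  // return the saved copy.  Writing `(*this)++` here instead would
  // recurse forever.
  point
  operator++(int)
  {
    point previous(*this);
    ++(*this);
    return previous;
  }

  point
  operator--(int)
  {
    point previous(*this);
    --(*this);
    return previous;
  }
};
```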

* include/abg-diff-utils.h: fix postfix dec/inc operator

Signed-off-by: Matthias Maennich <maennich@google.com>

abg-reader: clarify boolean use of assignment

When compiling with clang, the following warning is emitted:

abg-reader.cc:1981:15: warning: using the result of an assignment as a condition without parentheses [-Wparentheses]
while (corp = read_corpus_from_input(ctxt))
~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

That is certainly a common pitfall and can be mitigated by placing
parentheses around the assignment.
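
A small self-contained sketch of the parenthesized form follows.  The
item_reader class and the sum_items function are hypothetical and only
serve to give the loop something to iterate over.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reader that hands out items one at a time and returns
// a null pointer once exhausted.
class item_reader
{
  std::vector<int> items_;
  std::size_t pos_;

public:
  explicit item_reader(const std::vector<int>& items)
    : items_(items), pos_(0)
  {}

  const int*
  read_next()
  {return pos_ < items_.size() ? &items_[pos_++] : 0;}
};

int
sum_items(item_reader& reader)
{
  int sum = 0;
  const int* item;
  // The extra parentheses signal that the assignment is intended and
  // that its result is what gets tested.
  while ((item = reader.read_next()))
    sum += *item;
  return sum;
}
```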

* src/abg-reader.cc: clarify boolean use of assignment

Signed-off-by: Matthias Maennich <maennich@google.com>

abilint: fix return types bool -> int

Returning bool literals from main can be misleading. Returning booleans
maps to (by convention):

   return false -> converted to 0 -> rc=0 considered SUCCESS
   return true  -> converted to 1 -> rc=1 considered FAILURE

Compiling with clang also emits:

  abilint.cc:258:7: warning: bool literal returned from 'main' [-Wmain]
      return true;
      ^      ~~~~

The issues can be addressed by consistently returning integers as also
done in all other mains across the project.

Same issue applies to print-diff-tree.cc.

* tools/abilint.cc: return int in main rather than bool.
* tests/print-diff-tree.cc: Likewise.

Signed-off-by: Matthias Maennich <maennich@google.com>

abg-fwd.h: fix mismatched tags for ir_node_visitor

ir_node_visitor is defined as `class` in include/abg-ir.h:4429 and
should therefore also be forward-declared as such.

* include/abg-fwd.h: forward-declare ir_node_visitor as class

Signed-off-by: Matthias Maennich <maennich@google.com>

Bug 24431 - ELF reader can't interpret ksymtab with Kernel 4.19+

Since the commit
https://github.com/torvalds/linux/commit/7290d58095712a89f845e1bca05334796dd49ed2
of the kernel, the format of the ksymtab section changed, so the ELF
reader of Libabigail cannot get the list of kernel symbols that are
meaningful to analyze.

In the new format, each ksymtab entry is now 32 bits long,
regardless of whether the kernel was built in 32 or 64 bits mode.  This
way, 64 bit kernels save 32 bits per entry, from the ksymtab section
alone.

Also, note that the symbol address in the ksymtab entry is stored in a
"place-relative address" format.  That means for a given symbol value
'Sym_Addr', the value which is stored at a given offset 'Off' of the
ksymtab section whose address is Ksymtab_Addr is:

  Sym_Addr - Ksymtab_Addr - Off.

This is what is meant by storing Sym_Addr in a "place-relative" manner.

One advantage, of course, is to make Sym_Addr take 32 bits, even on a
64 bits arch.  Another advantage is that the resulting value doesn't
need to be relocated!  So we don't need any relocation section either!
So lots of space saving.
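
The arithmetic above can be sketched as a pair of functions.  The
function names are hypothetical and the addresses used with them are
made up; this is an illustration of the formula, not the reader's code.

```cpp
#include <stdint.h>

// Value stored at offset `off` of a ksymtab section located at
// `ksymtab_addr`, for a symbol located at `sym_addr`:
//   Sym_Addr - Ksymtab_Addr - Off
// The narrowing to 32 bits relies on two's complement conversion,
// which mainstream compilers implement and C++20 guarantees.
int32_t
to_place_relative(uint64_t sym_addr, uint64_t ksymtab_addr, uint64_t off)
{return static_cast<int32_t>(sym_addr - ksymtab_addr - off);}

// Inverse operation: sign-extend the stored 32 bits value to 64 bits
// and add back the address of the place it was read from.
uint64_t
from_place_relative(int32_t stored, uint64_t ksymtab_addr, uint64_t off)
{
  return ksymtab_addr + off
    + static_cast<uint64_t>(static_cast<int64_t>(stored));
}
```

The sign extension in from_place_relative matters on 64 bits
architectures, because the symbol may sit below the ksymtab section and
the stored value is then negative.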

This patch teaches the ELF reader about this new format.

* src/abg-dwarf-reader.cc (enum kernel_symbol_table_kind): Move this
enum at the top.
(enum ksymtab_format): Define new enum.
(read_context::{ksymtab_format_, ksymtab_entry_size_,
nb_ksymtab_entries_, nb_ksymtab_gpl_entries_}): Define new data
members.
(read_context::initialize): Initialize the new data members above.
(read_context::{get_ksymtab_format, get_ksymtab_symbol_value_size,
get_ksymtab_entry_size, get_nb_ksymtab_entries,
get_nb_ksymtab_gpl_entries,
maybe_adjust_sym_address_from_v4_19_ksymtab}): Define new member
functions.
(read_context::load_kernel_symbol_table): Support loading from
both pre and post v4.19 linux kernels with their different ksymtab
formats.  Add more comments.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Bug 24430 - Fold away const for array types

In C and C++ the qualifiers of a qualified array type apply to the
element type of said array.

In this particular problem report, GCC and Clang emit DWARF that
represent the qualifiers either on the array type, or on the element
type.  Quoting the text of the bug:

    Given the following example:

struct test {
  const char asdf[4];
};

void func(struct test arg) {}

    abidiff says:

     1 function with some indirect sub-type change:

       [C]'function void func(test)' at test.c:5:1 has some indirect sub-type
     changes:
parameter 1 of type 'struct test' has sub-type changes:
   type size hasn't changed
   1 data member change:
    type of 'const char test::asdf[4]' changed:
      entity changed from 'const char[4]' to 'const char[4] const'

This is because GCC represents the array as const array of const
signed char, whereas clang represents it as an array of const signed
char.

In this patch, libabigail's DWARF reader detects qualified array types
and appropriately qualifies the array element type, instead of
qualifying the array type.  The patch accordingly adjusts the various
regression tests and adds a new test which comes from the problem
report.
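
The language rule this folding relies on can be checked at compile time
in C++.  The sketch below uses only standard type traits; the typedef
name char_array is an assumption made for the example.

```cpp
#include <type_traits>

typedef char char_array[4];

// Adding const to the array type yields an array of const elements.
static_assert(std::is_same<const char_array, const char[4]>::value,
              "const applies to the elements of the array");

// The element type of the qualified array is itself const-qualified.
static_assert(std::is_same<std::remove_extent<const char_array>::type,
                           const char>::value,
              "the element type carries the qualifier");

// The data member from the bug report has the same folded type.
struct test
{
  const char asdf[4];
};

static_assert(std::is_same<decltype(test::asdf), const char[4]>::value,
              "the member is an array of const char");
```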

* include/abg-fwd.h (is_array_of_qualified_element): Declare 2
overloads of this function.
(re_canonicalize): Declare a new function.
* include/abg-ir.h (class {decl_base, type_base}): Declare
re_canonicalize as a friend of these classes.
* src/abg-dwarf-reader.cc (maybe_strip_qualification): Detect
qualified array types and appropriately qualifies the array
element type, instead of qualifying the array type itself.
Re-canonicalize the resulting type if necessary.
* src/abg-ir.cc (is_array_of_qualified_element): Define 2
overloads of this function.
(re_canonicalize): Define new function.
* tests/data/Makefile.am: Add the two new test binary input files
PR24430-fold-qualified-array-clang and
PR24430-fold-qualified-array-gcc to source distribution, as well
as the expected reference output.
* tests/data/test-annotate/test15-pr18892.so.abi: Adjust.
* tests/data/test-annotate/test17-pr19027.so.abi: Likewise.
* tests/data/test-annotate/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-diff-filter/PR24430-fold-qualified-array-clang:
New binary test input coming from the bug report.
* tests/data/test-diff-filter/PR24430-fold-qualified-array-gcc:
Likewise.
* tests/data/test-diff-filter/PR24430-fold-qualified-array-report-0.txt:
Expected reference abi difference.
* tests/data/test-diff-filter/test33-report-0.txt: Adjust.
* tests/data/test-diff-pkg/nss-3.23.0-1.0.fc23.x86_64-report-0.txt:
Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test15-pr18892.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test19-pr19023-libtcmalloc_and_profiler.so.abi:
Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi:
Likewise.
* tests/test-diff-filter.cc: Add the new binary test input to this
test harness.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Fix "Add test for the fix for PR24410"

Oops, seems like I forgot to add some of the binaries.

There you go.

* tests/data/test-diff-pkg/PR24410-new/poppler-debuginfo-0.73.0-8.fc30.x86_64.rpm:
Really add this.
* tests/data/test-diff-pkg/PR24410-new/poppler-qt5-0.73.0-8.fc30.x86_64.rpm:
Likewise.
* tests/data/test-diff-pkg/PR24410-new/poppler-qt5-debuginfo-0.73.0-8.fc30.x86_64.rpm:
Likewise.
* tests/data/test-diff-pkg/PR24410-new/poppler-qt5-devel-0.73.0-8.fc30.x86_64.rpm:
Likewise.
* tests/data/test-diff-pkg/PR24410-old/poppler-debuginfo-0.73.0-4.fc30.x86_64.rpm:
Likewise.
* tests/data/test-diff-pkg/PR24410-old/poppler-qt5-0.73.0-4.fc30.x86_64.rpm:
Likewise.
* tests/data/test-diff-pkg/PR24410-old/poppler-qt5-debuginfo-0.73.0-4.fc30.x86_64.rpm:
Likewise.
* tests/data/test-diff-pkg/PR24410-old/poppler-qt5-devel-0.73.0-4.fc30.x86_64.rpm:
Likewise.
* tests/data/test-diff-pkg/PR24410-report-0.txt: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Add test for the fix for PR24410

PR24410 was fixed by these recent commits:

    1b83138 Propagate private type diff category through refs/qualified type diffs
    dc84fee Fix anonymous union constructed under the wrong context
    522ac25 Internal pretty repr of union cannot be flat representation

But then I forgot to add a regression test for that issue.

This patch does that.

* tests/data/test-diff-pkg/PR24410-new/poppler-debuginfo-0.73.0-8.fc30.x86_64.rpm:
Add new test input.
* tests/data/test-diff-pkg/PR24410-new/poppler-qt5-0.73.0-8.fc30.x86_64.rpm:
Add new test input.
* tests/data/test-diff-pkg/PR24410-new/poppler-qt5-debuginfo-0.73.0-8.fc30.x86_64.rpm:
Add new test input.
* tests/data/test-diff-pkg/PR24410-new/poppler-qt5-devel-0.73.0-8.fc30.x86_64.rpm:
Add new test input.
* tests/data/test-diff-pkg/PR24410-old/poppler-debuginfo-0.73.0-4.fc30.x86_64.rpm:
Add new test input.
* tests/data/test-diff-pkg/PR24410-old/poppler-qt5-0.73.0-4.fc30.x86_64.rpm:
Add new test input.
* tests/data/test-diff-pkg/PR24410-old/poppler-qt5-debuginfo-0.73.0-4.fc30.x86_64.rpm:
Add new test input.
* tests/data/test-diff-pkg/PR24410-old/poppler-qt5-devel-0.73.0-4.fc30.x86_64.rpm:
Add new test input.
* tests/data/test-diff-pkg/PR24410-report-0.txt: Add new test
input.
* tests/data/Makefile.am: Add the test input above to source
distribution.
* tests/test-diff-pkg.cc: Make this test harness use the new input
rpms above.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Propagate private type diff category through refs/qualified type diffs

This patch is the third of the series:

    Internal pretty repr of union cannot be flat representation
    Fix anonymous union constructed under the wrong context
    Propagate private type diff category through refs/qualified type diffs

The intent of this series is to fix the bug:

    https://sourceware.org/bugzilla/show_bug.cgi?id=24410
    "Empty change report emitted for libpoppler-qt5.so.1.18.0"

We (mistakenly) don't propagate private type diff categories through
reference and qualified type diffs.  This leads to some diff nodes not
being suppressed just because they are private type diffs whose
category wasn't properly propagated.

This patch fixes this.

Note that the tests updated in this patch reflect the regression tests
changes needed for the entire set of 3 patches.

* src/abg-comparison.cc
(suppression_categorization_visitor::visit_end): Propagate
suppressed and private type diff categories for reference and
qualified types.  For qualified types, make sure they don't have
local changes.  Even when there are no local changes, do not
propagate private diff categories to typedefs.
* tests/data/test-annotate/test17-pr19027.so.abi: Adjust.
* tests/data/test-annotate/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/PR22122-libftdc.so.abi: Likewise.
* tests/data/test-read-dwarf/test10-pr18818-gcc.so.abi: Likewise.
* tests/data/test-read-dwarf/test11-pr18828.so.abi: Likewise.
* tests/data/test-read-dwarf/test12-pr18844.so.abi: Likewise.
* tests/data/test-read-dwarf/test16-pr18904.so.abi: Likewise.
* tests/data/test-read-dwarf/test17-pr19027.so.abi: Likewise.
* tests/data/test-read-dwarf/test21-pr19092.so.abi: Likewise.
* tests/data/test-read-dwarf/test22-pr19097-libstdc++.so.6.0.17.so.abi: Likewise.
* tests/data/test-read-dwarf/test9-pr18818-clang.so.abi: Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Fix anonymous union constructed under the wrong context

This patch is the second of the series:

    Internal pretty repr of union cannot be flat representation
    Fix anonymous union constructed under the wrong context
    Propagate private type diff category through refs/qualified type diffs

The intent of this series is to fix the bug:

    https://sourceware.org/bugzilla/show_bug.cgi?id=24410
    "Empty change report emitted for libpoppler-qt5.so.1.18.0"

What happens here is that when the DWARF reader sees an anonymous
union/struct, it can mistakenly think that it has seen it before
(because the comparison doesn't take the scope of the union/struct
into account), and thus mistakenly represent the union/struct.

The solution implemented by this patch is to take the scope of the
anonymous union/struct into account.
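
The idea can be sketched with a lookup table keyed on both the scope
and the flat representation.  The registry class below is hypothetical
and much simpler than what the DWARF reader actually does.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical registry handing out one id per distinct anonymous
// type.  Keying on the flat representation alone would merge
// look-alike anonymous types that live in different scopes.
class anonymous_type_registry
{
  typedef std::pair<std::string, std::string> key_type;
  std::map<key_type, int> ids_;

public:
  int
  get_or_add(const std::string& scope, const std::string& flat_repr)
  {
    key_type key(scope, flat_repr);
    std::map<key_type, int>::const_iterator it = ids_.find(key);
    if (it != ids_.end())
      return it->second;

    int id = static_cast<int>(ids_.size());
    ids_.insert(std::make_pair(key, id));
    return id;
  }
};
```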

Note that regression tests are all updated in the last patch of the
series.

* src/abg-dwarf-reader.cc (add_or_update_class_type)
(add_or_update_union_type): Only reuse anonymous class/union types
which have the same scope as the current one.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Internal pretty repr of union cannot be flat representation

This is the first patch of this series:

    Internal pretty repr of union cannot be flat representation
    Fix anonymous union constructed under the wrong context
    Propagate private type diff category through refs/qualified type diffs

The intent of this series is to fix the bug:

    https://sourceware.org/bugzilla/show_bug.cgi?id=24410
    "Empty change report emitted for libpoppler-qt5.so.1.18.0"

The internal pretty representation of a union must be its fully
qualified name, even when it's an anonymous union.  It cannot be its
flat representation because, for anonymous unions, that would lead to
confusion between anonymous unions that have the same flat
representation but are in different scopes.

Fixed thus.

Note that regression tests are all updated in the last patch of the
series.

* src/abg-ir.cc (union_decl::get_pretty_representation): The
internal pretty representation of an anonymous union is its fully
qualified name.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Misc comment fixes

* src/abg-comp-filter.cc (has_harmless_name_change): Fix comment.
* src/abg-ir.cc (var_decl::get_qualified_name): Likewise.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Bump version number to 1.7

* configure.ac: Bump version number to 1.7

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Update website mainpage for 1.6 release

* doc/website/mainpage.txt: Update for 1.6 release.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Update ChangeLog

* ChangeLog: Update automatically by using "make
update-changelog".

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Update NEWS file for 1.6

* NEWS: Update for 1.6

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Add missing assignment operators

This was detected by compiling with GCC 9.0.1

* include/abg-interned-str.h (interned_string::operator=): Define
assignment operator.
* include/abg-ir.h
({location, enum_type_decl::enumerator}::operator=): Declare
assignment operator.
* src/abg-ir.cc (enum_type_decl::enumerator::operator=): Define
assignment operator.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Bug 24378 - DW_TAG_subroutine_type as a DIE scope causes infinite loop

GCC 4.3.2 wrongly emits some type definition DIEs in the scope of a
DW_TAG_subroutine_type.  Whenever the DWARF reader tries to get the
scope of a DIE during the computation of the pretty name of a type DIE
whose scope is (wrongly) emitted as being a DW_TAG_subroutine_type,
things end up in an infinite loop.

This patch makes get_scope_die look through the
DW_TAG_subroutine_type to return the proper scope instead, just like
what we already do for DW_TAG_subprogram and DW_TAG_array_type.
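
A toy model of this "look through" behaviour is sketched below.  The
die structure, the tag enumerators and the get_scope function are
hypothetical simplifications; the real code works on elfutils DIEs.

```cpp
// Hypothetical, simplified DIE tags.
enum die_tag
{
  TAG_COMPILE_UNIT,
  TAG_NAMESPACE,
  TAG_STRUCTURE_TYPE,
  TAG_SUBPROGRAM,
  TAG_ARRAY_TYPE,
  TAG_SUBROUTINE_TYPE
};

// Hypothetical, simplified DIE: a tag and a link to its parent.
struct die
{
  die_tag tag;
  const die* parent;
};

// Return the nearest ancestor of `d` that is a proper scope, skipping
// parents that should not act as scopes.  Returns null if there is none.
const die*
get_scope(const die* d)
{
  const die* scope = d ? d->parent : 0;
  while (scope
         && (scope->tag == TAG_SUBPROGRAM
             || scope->tag == TAG_ARRAY_TYPE
             || scope->tag == TAG_SUBROUTINE_TYPE))
    scope = scope->parent;
  return scope;
}
```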

* src/abg-dwarf-reader.cc (get_scope_die): Look through
DW_TAG_subroutine_type to get the scope of a given DIE.
* tests/data/Makefile.am: Add the two new files below to source
distribution.
* tests/data/test-read-dwarf/PR24378-fn-is-not-scope.abi: New
reference test output.
* tests/data/test-read-dwarf/PR24378-fn-is-not-scope.o: New binary
test input.
* tests/test-read-dwarf.cc (in_out_specs): Add the new test input
to the test harness.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>