From: Tamar Christina
Date: Mon, 25 Mar 2019 12:16:17 +0000 (+0000)
Subject: Arm: Fix Arm disassembler mapping symbol search.
X-Git-Tag: binutils-2_33~1748
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=796d6298bb11deab06814cc38cfe74a1bfc57551;p=platform%2Fupstream%2Fbinutils.git

Similar to the AArch64 patches, the Arm disassembler has the same issues with
out-of-order sections, but it also has a few shortcomings of its own. For one
thing, there are multiple code blocks that determine mapping symbols; they all
work slightly differently and none of them is fully correct.

The first thing this patch does is centralise the mapping symbol search into
one function, mapping_symbol_for_insn. This function is then updated to
perform the search in a similar way to AArch64.

There used to be a value, has_mapping_symbols, which was used to determine the
default disassembly for objects that have no mapping symbols. The problem with
that approach was that the value was determined in the same loop that needed
it, which is why the field could take on the states -1, 0 and 1, where -1
means "don't know". However, this means that until you actually find a mapping
symbol or reach the end of the disassembly glob, you don't know whether you
took the right action, and if you didn't, you can no longer correct it. This
is why the two jump-reloc-veneers-* testcases end up disassembling some insns
as data when they shouldn't.

Out of order here refers to an object file where sections are not listed in
monotonically increasing VMA order.

The ELF ABI for Arm [1] specifies the following for mapping symbols:

  1) A text section must always have a corresponding mapping symbol at its
     start.
  2) Data sections do not require any mapping symbols.
  3) The range of a mapping symbol extends from the address it starts on up to
     the next mapping symbol (exclusive) or section end (inclusive).

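The three ABI rules above can be illustrated with a small, self-contained C sketch. It is not the code from arm-dis.c: the names `map_kind`, `map_sym` and `kind_at` are hypothetical and exist only for this example, and the section is assumed to be described by its first and last address.

```c
#include <stddef.h>

/* Hypothetical types for illustration only; arm-dis.c uses its own.  */
enum map_kind { KIND_DATA, KIND_ARM, KIND_THUMB };

struct map_sym
{
  unsigned long addr;  /* Address the mapping symbol is defined at.  */
  enum map_kind kind;  /* $d, $a or $t.  */
};

/* Return the kind governing PC in the section [sec_start, sec_last].
   SYMS may be in any order and may include symbols of other sections.  */
static enum map_kind
kind_at (unsigned long pc, const struct map_sym *syms, size_t nsyms,
         unsigned long sec_start, unsigned long sec_last)
{
  enum map_kind kind = KIND_DATA;  /* Rule 2: data needs no symbol.  */
  unsigned long best = 0;
  int found = 0;
  size_t i;

  if (pc < sec_start || pc > sec_last)
    return kind;

  for (i = 0; i < nsyms; i++)
    {
      unsigned long a = syms[i].addr;

      /* Rule 3: a symbol covers PC only if it starts inside this
         section at or before PC; the closest such symbol wins.  */
      if (a < sec_start || a > pc)
        continue;
      if (!found || a >= best)
        {
          best = a;
          kind = syms[i].kind;
          found = 1;
        }
    }
  return kind;
}
```

Because the lower bound is the section start, a section without mapping symbols never inherits the mapping symbol of a preceding text section and is classified as data.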
However, there is no defined order between a symbol and its corresponding
mapping symbol in the symbol table. This means that while in general we look
upwards for the corresponding mapping symbol, we have to make at least one
check of the symbol below the address being disassembled.

When disassembling different PCs within the same section, the search for the
mapping symbol can be cached somewhat. We know that the mapping symbol
corresponding to the current PC is either the previous one used, or one at the
same address as the current PC.

However, this optimization and the mapping symbol search must stop as soon as
we reach the end or start of the section. Furthermore, if we're only
disassembling a part of a section, the search is allowed to look further back
than the current chunk, but is not allowed to look past it (the mapping
symbol, if there, must be at the same address, so in practice we usually stop
at PC+4).

Lastly, since only data sections don't require a mapping symbol, the default
mapping type should be DATA and not INSN as previously defined. However, if
the binary has had all its symbols stripped then this isn't very useful. To
fix this we determine the default based on the section flags. This will allow
the disassembler to be more useful on stripped binaries. If there is no
section then we assume you are disassembling INSN.

[1] https://developer.arm.com/docs/ihi0044/latest/elf-for-the-arm-architecture-abi-2018q4-documentation#aaelf32-table4-7

binutils/ChangeLog:

        * testsuite/binutils-all/arm/in-order-all.d: New test.
        * testsuite/binutils-all/arm/in-order.d: New test.
        * testsuite/binutils-all/arm/objdump.exp: Support .d tests.
        * testsuite/binutils-all/arm/out-of-order-all.d: New test.
        * testsuite/binutils-all/arm/out-of-order.T: New test.
        * testsuite/binutils-all/arm/out-of-order.d: New test.
        * testsuite/binutils-all/arm/out-of-order.s: New test.

ld/ChangeLog:

        * testsuite/ld-arm/jump-reloc-veneers-cond-long.d: Update disassembly.
        * testsuite/ld-arm/jump-reloc-veneers-long.d: Update disassembly.

opcodes/ChangeLog:

        * arm-dis.c (struct arm_private_data): Remove has_mapping_symbols.
        (mapping_symbol_for_insn): Implement new algorithm.
        (print_insn): Remove duplicate code.
---

diff --git a/binutils/ChangeLog b/binutils/ChangeLog
index aad8bdd..83aa37c 100644
--- a/binutils/ChangeLog
+++ b/binutils/ChangeLog
@@ -1,5 +1,16 @@
 2019-03-25 Tamar Christina

+        * testsuite/binutils-all/arm/in-order-all.d: New test.
+        * testsuite/binutils-all/arm/in-order.d: New test.
+        * testsuite/binutils-all/arm/objdump.exp: Support .d tests.
+        * testsuite/binutils-all/arm/out-of-order-all.d: New test.
+        * testsuite/binutils-all/arm/out-of-order.T: New test.
+        * testsuite/binutils-all/arm/out-of-order.d: New test.
+        * testsuite/binutils-all/arm/out-of-order.s: New test.
+
+
+2019-03-25 Tamar Christina
+
         * testsuite/binutils-all/aarch64/in-order-all.d: New test.
         * testsuite/binutils-all/aarch64/out-of-order-all.d: New test.
         * testsuite/binutils-all/aarch64/out-of-order.d:
diff --git a/binutils/testsuite/binutils-all/arm/in-order-all.d b/binutils/testsuite/binutils-all/arm/in-order-all.d
new file mode 100644
index 0000000..3a098dd
--- /dev/null
+++ b/binutils/testsuite/binutils-all/arm/in-order-all.d
@@ -0,0 +1,50 @@
+#PROG: objcopy
+#source: out-of-order.s
+#ld: -e v1 -Ttext-segment=0x400000
+#objdump: -D
+#name: Check if disassembler can handle all sections in default order
+
+.*: +file format .*arm.*
+
+Disassembly of section \.func1:
+
+00400000 :
+ 400000: e0800001 add r0, r0, r1
+ 400004: 00000000 andeq r0, r0, r0
+
+Disassembly of section \.func2:
+
+00400008 <\.func2>:
+ 400008: e0800001 add r0, r0, r1
+
+Disassembly of section \.func3:
+
+0040000c <\.func3>:
+ 40000c: e0800001 add r0, r0, r1
+ 400010: e0800001 add r0, r0, r1
+ 400014: e0800001 add r0, r0, r1
+ 400018: e0800001 add r0, r0, r1
+ 40001c: e0800001 add r0, r0, r1
+ 400020: 00000000 andeq r0, r0, r0
+
+Disassembly of section \.rodata:
+
+00400024 <\.rodata>:
+ 400024: 00000004 andeq r0, r0, r4
+
+Disassembly of section \.global:
+
+00410028 <__data_start>:
+ 410028: 00000001 andeq r0, r0, r1
+ 41002c: 00000001 andeq r0, r0, r1
+ 410030: 00000001 andeq r0, r0, r1
+
+Disassembly of section \.ARM\.attributes:
+
+00000000 <\.ARM\.attributes>:
+ 0: 00001141 andeq r1, r0, r1, asr #2
+ 4: 61656100 cmnvs r5, r0, lsl #2
+ 8: 01006962 tsteq r0, r2, ror #18
+ c: 00000007 andeq r0, r0, r7
+ 10: Address 0x0000000000000010 is out of bounds.
+
diff --git a/binutils/testsuite/binutils-all/arm/in-order.d b/binutils/testsuite/binutils-all/arm/in-order.d
new file mode 100644
index 0000000..a0b63c2
--- /dev/null
+++ b/binutils/testsuite/binutils-all/arm/in-order.d
@@ -0,0 +1,28 @@
+#PROG: objcopy
+#source: out-of-order.s
+#ld: -e v1 -Ttext-segment=0x400000
+#objdump: -d
+#name: Check if disassembler can handle sections in default order
+
+.*: +file format .*arm.*
+
+Disassembly of section \.func1:
+
+00400000 :
+ 400000: e0800001 add r0, r0, r1
+ 400004: 00000000 \.word 0x00000000
+
+Disassembly of section \.func2:
+
+00400008 <\.func2>:
+ 400008: e0800001 add r0, r0, r1
+
+Disassembly of section \.func3:
+
+0040000c <\.func3>:
+ 40000c: e0800001 add r0, r0, r1
+ 400010: e0800001 add r0, r0, r1
+ 400014: e0800001 add r0, r0, r1
+ 400018: e0800001 add r0, r0, r1
+ 40001c: e0800001 add r0, r0, r1
+ 400020: 00000000 \.word 0x00000000
diff --git a/binutils/testsuite/binutils-all/arm/objdump.exp b/binutils/testsuite/binutils-all/arm/objdump.exp
index 5013b18..33e3fd1 100644
--- a/binutils/testsuite/binutils-all/arm/objdump.exp
+++ b/binutils/testsuite/binutils-all/arm/objdump.exp
@@ -111,3 +111,17 @@ if {![binutils_assemble $srcdir/$subdir/rvct_symbol.s tmpdir/rvct_symbol.o]} the
         fail "skip rvct symbol"
     }
 }
+
+###########################
+# Set up generic test framework
+###########################
+
+set tempfile tmpdir/armtemp.o
+set copyfile tmpdir/armcopy
+
+set test_list [lsort [glob -nocomplain $srcdir/$subdir/*.d]]
+foreach t $test_list {
+    # We need to strip the ".d", but can leave the dirname.
+    verbose [file rootname $t]
+    run_dump_test [file rootname $t]
+}
diff --git a/binutils/testsuite/binutils-all/arm/out-of-order-all.d b/binutils/testsuite/binutils-all/arm/out-of-order-all.d
new file mode 100644
index 0000000..58c4057
--- /dev/null
+++ b/binutils/testsuite/binutils-all/arm/out-of-order-all.d
@@ -0,0 +1,50 @@
+#PROG: objcopy
+#source: out-of-order.s
+#ld: -T out-of-order.T
+#objdump: -D
+#name: Check if disassembler can handle all sections in different order than header
+
+.*: +file format .*arm.*
+
+Disassembly of section \.global:
+
+ffe00000 <\.global>:
+ffe00000: 00000001 andeq r0, r0, r1
+ffe00004: 00000001 andeq r0, r0, r1
+ffe00008: 00000001 andeq r0, r0, r1
+
+Disassembly of section \.func2:
+
+04018280 <\.func2>:
+ 4018280: e0800001 add r0, r0, r1
+
+Disassembly of section \.func1:
+
+04005000 :
+ 4005000: e0800001 add r0, r0, r1
+ 4005004: 00000000 andeq r0, r0, r0
+
+Disassembly of section \.func3:
+
+04015000 <\.func3>:
+ 4015000: e0800001 add r0, r0, r1
+ 4015004: e0800001 add r0, r0, r1
+ 4015008: e0800001 add r0, r0, r1
+ 401500c: e0800001 add r0, r0, r1
+ 4015010: e0800001 add r0, r0, r1
+ 4015014: 00000000 andeq r0, r0, r0
+
+Disassembly of section \.rodata:
+
+04015018 <\.rodata>:
+ 4015018: 00000004 andeq r0, r0, r4
+
+Disassembly of section \.ARM\.attributes:
+
+00000000 <\.ARM\.attributes>:
+ 0: 00001141 andeq r1, r0, r1, asr #2
+ 4: 61656100 cmnvs r5, r0, lsl #2
+ 8: 01006962 tsteq r0, r2, ror #18
+ c: 00000007 andeq r0, r0, r7
+ 10: Address 0x0000000000000010 is out of bounds.
+
diff --git a/binutils/testsuite/binutils-all/arm/out-of-order.T b/binutils/testsuite/binutils-all/arm/out-of-order.T
new file mode 100644
index 0000000..489ae80
--- /dev/null
+++ b/binutils/testsuite/binutils-all/arm/out-of-order.T
@@ -0,0 +1,14 @@
+ENTRY(v1)
+SECTIONS
+{
+  . = 0xffe00000;
+  .global : { *(.global) }
+  . = 0x4018280;
+  .func2 : { *(.func2) }
+  . = 0x4005000;
+  .func1 : { *(.func1) }
+  . = 0x4015000;
+  .func3 : { *(.func3) }
+  .data : { *(.data) }
+  .rodata : { *(.rodata) }
+}
\ No newline at end of file
diff --git a/binutils/testsuite/binutils-all/arm/out-of-order.d b/binutils/testsuite/binutils-all/arm/out-of-order.d
new file mode 100644
index 0000000..9351af7
--- /dev/null
+++ b/binutils/testsuite/binutils-all/arm/out-of-order.d
@@ -0,0 +1,27 @@
+#PROG: objcopy
+#ld: -T out-of-order.T
+#objdump: -d
+#name: Check if disassembler can handle sections in different order than header
+
+.*: +file format .*arm.*
+
+Disassembly of section \.func2:
+
+04018280 <\.func2>:
+ 4018280: e0800001 add r0, r0, r1
+
+Disassembly of section \.func1:
+
+04005000 :
+ 4005000: e0800001 add r0, r0, r1
+ 4005004: 00000000 \.word 0x00000000
+
+Disassembly of section \.func3:
+
+04015000 <\.func3>:
+ 4015000: e0800001 add r0, r0, r1
+ 4015004: e0800001 add r0, r0, r1
+ 4015008: e0800001 add r0, r0, r1
+ 401500c: e0800001 add r0, r0, r1
+ 4015010: e0800001 add r0, r0, r1
+ 4015014: 00000000 \.word 0x00000000
diff --git a/binutils/testsuite/binutils-all/arm/out-of-order.s b/binutils/testsuite/binutils-all/arm/out-of-order.s
new file mode 100644
index 0000000..4e43ddf
--- /dev/null
+++ b/binutils/testsuite/binutils-all/arm/out-of-order.s
@@ -0,0 +1,29 @@
+        .text
+        .arm
+        .global v1
+        .section .func1,"ax",%progbits
+        .type v1 %function
+        .size v1, 4
+v1:
+        add r0, r0, r1
+        .word 0
+
+        .section .func2,"ax",%progbits
+        add r0, r0, r1
+
+        .section .func3,"ax",%progbits
+        add r0, r0, r1
+        add r0, r0, r1
+        add r0, r0, r1
+        add r0, r0, r1
+        add r0, r0, r1
+        .word 0
+
+        .data
+        .section .global,"aw",%progbits
+        .word 1
+        .word 1
+        .word 1
+
+        .section .rodata
+        .word 4
diff --git a/ld/ChangeLog b/ld/ChangeLog
index 2853c09..0029150 100644
--- a/ld/ChangeLog
+++ b/ld/ChangeLog
@@ -1,3 +1,8 @@
+2019-03-25 Tamar Christina
+
+        * testsuite/ld-arm/jump-reloc-veneers-cond-long.d: Update disassembly.
+        * testsuite/ld-arm/jump-reloc-veneers-long.d: Update disassembly.
+
 2019-03-21 Sudakshina Das

         * testsuite/ld-aarch64/aarch64-elf.exp: Add new test.
diff --git a/ld/testsuite/ld-arm/jump-reloc-veneers-cond-long.d b/ld/testsuite/ld-arm/jump-reloc-veneers-cond-long.d
index d818cf5..88481f0 100644
--- a/ld/testsuite/ld-arm/jump-reloc-veneers-cond-long.d
+++ b/ld/testsuite/ld-arm/jump-reloc-veneers-cond-long.d
@@ -10,7 +10,7 @@ Disassembly of section destsect:
 Disassembly of section .text:

 000080.. <[^>]*>:
- 80..: (8002f040|f0408002) .word 0x(8002f040|f0408002)
+ 80..: f040 8002 bne.w 8008 <__dest_veneer>
  80..: 0000 movs r0, r0
  ...
diff --git a/ld/testsuite/ld-arm/jump-reloc-veneers-long.d b/ld/testsuite/ld-arm/jump-reloc-veneers-long.d
index 6bd5652..ae176be 100644
--- a/ld/testsuite/ld-arm/jump-reloc-veneers-long.d
+++ b/ld/testsuite/ld-arm/jump-reloc-veneers-long.d
@@ -10,8 +10,9 @@ Disassembly of section destsect:
 Disassembly of section .text:

 000080.. <[^>]*>:
- 80..: (b802f000|f000b802) .word 0x(b802f000|f000b802)
- 80..: 00000000 andeq r0, r0, r0
+ 80..: f000 b802 b.w 8008 <__dest_veneer>
+ 80..: 0000 movs r0, r0
+ ...

 000080.. <[^>]*>:
  80..: 4778 bx pc
diff --git a/opcodes/ChangeLog b/opcodes/ChangeLog
index 433e43f..7b7237f 100644
--- a/opcodes/ChangeLog
+++ b/opcodes/ChangeLog
@@ -1,5 +1,11 @@
 2019-03-25 Tamar Christina

+        * arm-dis.c (struct arm_private_data): Remove has_mapping_symbols.
+        (mapping_symbol_for_insn): Implement new algorithm.
+        (print_insn): Remove duplicate code.
+
+2019-03-25 Tamar Christina
+
         * aarch64-dis.c (print_insn_aarch64): Implement override.
diff --git a/opcodes/arm-dis.c b/opcodes/arm-dis.c
index 71d7c52..d47ef32 100644
--- a/opcodes/arm-dis.c
+++ b/opcodes/arm-dis.c
@@ -56,15 +56,14 @@ struct arm_private_data
   /* The features to use when disassembling optional instructions. */
   arm_feature_set features;

-  /* Whether any mapping symbols are present in the provided symbol
-     table. -1 if we do not know yet, otherwise 0 or 1. */
-  int has_mapping_symbols;
-
   /* Track the last type (although this doesn't seem to be useful) */
   enum map_type last_type;

   /* Tracking symbol table information */
   int last_mapping_sym;
+
+  /* The end range of the current range being disassembled. */
+  bfd_vma last_stop_offset;
   bfd_vma last_mapping_addr;
 };

@@ -6351,52 +6350,114 @@ static bfd_boolean
 mapping_symbol_for_insn (bfd_vma pc, struct disassemble_info *info,
                          enum map_type *map_symbol)
 {
-  bfd_vma addr;
-  int n, start = 0;
+  bfd_vma addr, section_vma = 0;
+  int n, last_sym = -1;
   bfd_boolean found = FALSE;
-  enum map_type type = MAP_ARM;
+  bfd_boolean can_use_search_opt_p = FALSE;
+
+  /* Default to DATA. A text section is required by the ABI to contain an
+     INSN mapping symbol at the start. A data section has no such
+     requirement, hence if no mapping symbol is found the section must
+     contain only data. This however isn't very useful if the user has
+     fully stripped the binaries. If this is the case use the section
+     attributes to determine the default. If we have no section default to
+     INSN as well, as we may be disassembling some raw bytes on a baremetal
+     HEX file or similar. */
+  enum map_type type = MAP_DATA;
+  if ((info->section && info->section->flags & SEC_CODE) || !info->section)
+    type = MAP_ARM;
   struct arm_private_data *private_data;

-  if (info->private_data == NULL || info->symtab_size == 0
+  if (info->private_data == NULL
       || bfd_asymbol_flavour (*info->symtab) != bfd_target_elf_flavour)
     return FALSE;

   private_data = info->private_data;
-  if (pc == 0)
-    start = 0;
-  else
-    start = private_data->last_mapping_sym;

-  start = (start == -1)? 0 : start;
-  addr = bfd_asymbol_value (info->symtab[start]);
+  /* First, look for mapping symbols. */
+  if (info->symtab_size != 0)
+  {
+    if (pc <= private_data->last_mapping_addr)
+      private_data->last_mapping_sym = -1;
+
+    /* Start scanning at the start of the function, or wherever
+       we finished last time. */
+    n = info->symtab_pos + 1;
+
+    /* If the last stop offset is different from the current one it means we
+       are disassembling a different glob of bytes. As such the optimization
+       would not be safe and we should start over. */
+    can_use_search_opt_p
+      = private_data->last_mapping_sym >= 0
+        && info->stop_offset == private_data->last_stop_offset;
+
+    if (n >= private_data->last_mapping_sym && can_use_search_opt_p)
+      n = private_data->last_mapping_sym;
+
+    /* Look down while we haven't passed the location being disassembled.
+       The reason for this is that there's no defined order between a symbol
+       and an mapping symbol that may be at the same address. We may have to
+       look at least one position ahead. */
+    for (; n < info->symtab_size; n++)
+      {
+        addr = bfd_asymbol_value (info->symtab[n]);
+        if (addr > pc)
+          break;
+        if (get_map_sym_type (info, n, &type))
+          {
+            last_sym = n;
+            found = TRUE;
+          }
+      }

-  if (pc >= addr)
-    {
-      if (get_map_sym_type (info, start, &type))
-        found = TRUE;
-    }
-  else
+    if (!found)
+      {
+        n = info->symtab_pos;
+        if (n >= private_data->last_mapping_sym && can_use_search_opt_p)
+          n = private_data->last_mapping_sym;
+
+        /* No mapping symbol found at this address. Look backwards
+           for a preceeding one, but don't go pass the section start
+           otherwise a data section with no mapping symbol can pick up
+           a text mapping symbol of a preceeding section. The documentation
+           says section can be NULL, in which case we will seek up all the
+           way to the top. */
+        if (info->section)
+          section_vma = info->section->vma;
+
+        for (; n >= 0; n--)
+          {
+            addr = bfd_asymbol_value (info->symtab[n]);
+            if (addr < section_vma)
+              break;
+
+            if (get_map_sym_type (info, n, &type))
+              {
+                last_sym = n;
+                found = TRUE;
+                break;
+              }
+          }
+      }
+  }
+
+  /* If no mapping symbol was found, try looking up without a mapping
+     symbol. This is done by walking up from the current PC to the nearest
+     symbol. We don't actually have to loop here since symtab_pos will
+     contain the nearest symbol already. */
+  if (!found)
     {
-      for (n = start - 1; n >= 0; n--)
+      n = info->symtab_pos;
+      if (n >= 0 && get_sym_code_type (info, n, &type))
         {
-          if (get_map_sym_type (info, n, &type))
-            {
-              found = TRUE;
-              break;
-            }
+          last_sym = n;
+          found = TRUE;
         }
     }

-  /* No mapping symbols were found. A leading $d may be
-     omitted for sections which start with data; but for
-     compatibility with legacy and stripped binaries, only
-     assume the leading $d if there is at least one mapping
-     symbol in the file. */
-  if (!found && private_data->has_mapping_symbols == 1)
-    {
-      type = MAP_DATA;
-      found = TRUE;
-    }
+  private_data->last_mapping_sym = last_sym;
+  private_data->last_type = type;
+  private_data->last_stop_offset = info->stop_offset;

   *map_symbol = type;
   return found;
@@ -6535,9 +6596,9 @@ print_insn (bfd_vma pc, struct disassemble_info *info, bfd_boolean little)
          during disassembly.... */
       select_arm_features (info->mach, & private.features);

-      private.has_mapping_symbols = -1;
       private.last_mapping_sym = -1;
       private.last_mapping_addr = 0;
+      private.last_stop_offset = 0;

       info->private_data = & private;
     }
@@ -6554,121 +6615,13 @@ print_insn (bfd_vma pc, struct disassemble_info *info, bfd_boolean little)
       && bfd_asymbol_flavour (*info->symtab) == bfd_target_elf_flavour)
     {
       bfd_vma addr;
-      int n, start;
+      int n;
       int last_sym = -1;
       enum map_type type = MAP_ARM;

-      /* Start scanning at the start of the function, or wherever
-         we finished last time. */
-      /* PR 14006. When the address is 0 we are either at the start of the
-         very first function, or else the first function in a new, unlinked
-         executable section (eg because of -ffunction-sections). Either way
-         start scanning from the beginning of the symbol table, not where we
-         left off last time. */
-      if (pc == 0)
-        start = 0;
-      else
-        {
-          start = info->symtab_pos + 1;
-          if (start < private_data->last_mapping_sym)
-            start = private_data->last_mapping_sym;
-        }
-      found = FALSE;
-
-      /* First, look for mapping symbols. */
-      if (private_data->has_mapping_symbols != 0)
-        {
-          /* Scan up to the location being disassembled. */
-          for (n = start; n < info->symtab_size; n++)
-            {
-              addr = bfd_asymbol_value (info->symtab[n]);
-              if (addr > pc)
-                break;
-              if (get_map_sym_type (info, n, &type))
-                {
-                  last_sym = n;
-                  found = TRUE;
-                }
-            }
-
-          if (!found)
-            {
-              /* No mapping symbol found at this address. Look backwards
-                 for a preceding one. */
-              for (n = start - 1; n >= 0; n--)
-                {
-                  if (get_map_sym_type (info, n, &type))
-                    {
-                      last_sym = n;
-                      found = TRUE;
-                      break;
-                    }
-                }
-            }
-
-          if (found)
-            private_data->has_mapping_symbols = 1;
-
-          /* No mapping symbols were found. A leading $d may be
-             omitted for sections which start with data; but for
-             compatibility with legacy and stripped binaries, only
-             assume the leading $d if there is at least one mapping
-             symbol in the file. */
-          if (!found && private_data->has_mapping_symbols == -1)
-            {
-              /* Look for mapping symbols, in any section. */
-              for (n = 0; n < info->symtab_size; n++)
-                if (is_mapping_symbol (info, n, &type))
-                  {
-                    private_data->has_mapping_symbols = 1;
-                    break;
-                  }
-              if (private_data->has_mapping_symbols == -1)
-                private_data->has_mapping_symbols = 0;
-            }
-
-          if (!found && private_data->has_mapping_symbols == 1)
-            {
-              type = MAP_DATA;
-              found = TRUE;
-            }
-        }
-
-      /* Next search for function symbols to separate ARM from Thumb
-         in binaries without mapping symbols. */
-      if (!found)
-        {
-          /* Scan up to the location being disassembled. */
-          for (n = start; n < info->symtab_size; n++)
-            {
-              addr = bfd_asymbol_value (info->symtab[n]);
-              if (addr > pc)
-                break;
-              if (get_sym_code_type (info, n, &type))
-                {
-                  last_sym = n;
-                  found = TRUE;
-                }
-            }
-
-          if (!found)
-            {
-              /* No mapping symbol found at this address. Look backwards
-                 for a preceding one. */
-              for (n = start - 1; n >= 0; n--)
-                {
-                  if (get_sym_code_type (info, n, &type))
-                    {
-                      last_sym = n;
-                      found = TRUE;
-                      break;
-                    }
-                }
-            }
-        }
+      found = mapping_symbol_for_insn (pc, info, &type);
+      last_sym = private_data->last_mapping_sym;

-      private_data->last_mapping_sym = last_sym;
-      private_data->last_type = type;
       is_thumb = (private_data->last_type == MAP_THUMB);
       is_data = (private_data->last_type == MAP_DATA);