[lto] Do not try to internalize symbols with escaped name
author serge-sans-paille <sguelton@redhat.com>
Thu, 13 Oct 2022 08:26:41 +0000 (10:26 +0200)
committer serge-sans-paille <serge.guelton@telecom-bretagne.eu>
Fri, 14 Oct 2022 20:34:17 +0000 (22:34 +0200)
commit 232e0a011e8c07053bdc0156f312046eb09f52b3
tree 181809e710161ddd53314af008a3b20360959a31
parent 0c8dde551c801f319271fe662e82fef462dd07e0

Because of LLVM's mangling escape sequence (the '\01' prefix), it is possible
for a single symbol to have two different IR representations.

For instance, consider @symbol and @"\01_symbol". On OSX, because of the system
mangling rules, these two IR names are converted to the same final symbol at
link time.
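
The collision described above can be illustrated with a minimal sketch. It
assumes a simplified model of Mach-O mangling in which a leading '\01' byte
means "emit the rest of the name verbatim" and any other global name receives
a '_' prefix. The helper name machoLinkerName is hypothetical and is not part
of LLVM's Mangler API.

```cpp
#include <cassert>
#include <string>

// Simplified model of symbol mangling on a Mach-O (OSX) target.
// Hypothetical helper for illustration only, not LLVM's Mangler.
std::string machoLinkerName(const std::string &irName) {
  // A leading '\01' byte asks for the rest of the name to be used verbatim.
  if (!irName.empty() && irName[0] == '\1')
    return irName.substr(1);
  // Otherwise the platform's global prefix '_' is prepended.
  return "_" + irName;
}

// Two distinct IR names that end up as the same linker-level symbol.
void demoMangling() {
  const std::string plain = "symbol";       // @symbol
  const std::string escaped = "\1_symbol";  // @"\01_symbol"
  assert(plain != escaped);
  assert(machoLinkerName(plain) == machoLinkerName(escaped));
}
```

Any table keyed on the IR name treats the two spellings as unrelated entries,
even though the linker sees a single symbol.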

LTO doesn't model this behavior, which may result in symbols being incorrectly
internalized (if all references use the escape sequence while the definition
doesn't).

The proper approach is probably to compute the GUID from the mangled name, which
would avoid the dual representation, but we can also avoid discarding symbols
that are bound to two different IR names. This is an approximation, but it's
less intrusive on the codebase.
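
The approximation can be sketched roughly as follows. This is an illustration
under stated assumptions, not the code of this patch: Resolution,
ResolutionTable, recordSymbol and mayInternalize are hypothetical names, and
the table is assumed to be keyed by the final linker-level symbol name.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a per-symbol resolution record.
struct Resolution {
  std::string irName;          // first IR name seen for this linker symbol
  bool mayInternalize = true;  // cleared once conflicting IR names are seen
};

// Assumption: entries are keyed by the final (linker-level) symbol name.
using ResolutionTable = std::map<std::string, Resolution>;

void recordSymbol(ResolutionTable &table, const std::string &linkerName,
                  const std::string &irName) {
  Resolution &res = table[linkerName];
  if (res.irName.empty()) {
    res.irName = irName;  // first sighting: remember its IR spelling
    return;
  }
  // Same linker symbol reached through a different IR name: be conservative
  // and keep the symbol visible instead of internalizing it.
  if (res.irName != irName)
    res.mayInternalize = false;
}

void demoResolution() {
  ResolutionTable table;
  recordSymbol(table, "_symbol", "symbol");     // definition: @symbol
  recordSymbol(table, "_symbol", "\1_symbol");  // reference: @"\01_symbol"
  assert(!table["_symbol"].mayInternalize);

  recordSymbol(table, "_other", "other");
  recordSymbol(table, "_other", "other");
  assert(table["_other"].mayInternalize);
}
```

The trade-off is that a symbol with two IR spellings is never internalized,
even when doing so would have been safe.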

Fix #57864

Differential Revision: https://reviews.llvm.org/D135710
llvm/lib/LTO/LTO.cpp
llvm/test/LTO/X86/hidden-escaped-symbols-alt.ll [new file with mode: 0644]
llvm/test/LTO/X86/hidden-escaped-symbols.ll [new file with mode: 0644]
llvm/test/ThinLTO/X86/hidden-escaped-symbols-alt.ll [new file with mode: 0644]
llvm/test/ThinLTO/X86/hidden-escaped-symbols.ll [new file with mode: 0644]