[COFF] Cope with GCC produced weak aliases referring to comdat functions
author Martin Storsjo <martin@martin.st>
Fri, 5 Oct 2018 19:43:16 +0000 (19:43 +0000)
committer Martin Storsjo <martin@martin.st>
Fri, 5 Oct 2018 19:43:16 +0000 (19:43 +0000)
For certain inline functions emitted into comdat sections, GCC 5.x
additionally produces a weak alias, whose target symbol would end up
undefined when the comdat section it points to is discarded.

This no longer seems to happen with GCC 6.x or newer though.

Differential Revision: https://reviews.llvm.org/D52602

llvm-svn: 343877

lld/COFF/InputFiles.cpp
lld/test/COFF/Inputs/inline-weak.o [new file with mode: 0644]
lld/test/COFF/Inputs/inline-weak2.o [new file with mode: 0644]
lld/test/COFF/comdat-weak.test [new file with mode: 0644]

diff --git a/lld/COFF/InputFiles.cpp b/lld/COFF/InputFiles.cpp
index 7c83214..cdbde0c 100644
@@ -280,6 +280,13 @@ Symbol *ObjFile::createRegular(COFFSymbolRef Sym) {
     COFFObj->getSymbolName(Sym, Name);
     if (SC)
       return Symtab->addRegular(this, Name, Sym.getGeneric(), SC);
+    // For MinGW symbols named .weak.* that point to a discarded section,
+    // don't create an Undefined symbol. If nothing ever refers to the symbol,
+    // everything should be fine. If something actually refers to the symbol
+    // (e.g. the undefined weak alias), linking will fail due to undefined
+    // references at the end.
+    if (Config->MinGW && Name.startswith(".weak."))
+      return nullptr;
     return Symtab->addUndefined(Name, this, false);
   }
   if (SC)
diff --git a/lld/test/COFF/Inputs/inline-weak.o b/lld/test/COFF/Inputs/inline-weak.o
new file mode 100644
index 0000000..5987e60
Binary files /dev/null and b/lld/test/COFF/Inputs/inline-weak.o differ
diff --git a/lld/test/COFF/Inputs/inline-weak2.o b/lld/test/COFF/Inputs/inline-weak2.o
new file mode 100644
index 0000000..b413f5b
Binary files /dev/null and b/lld/test/COFF/Inputs/inline-weak2.o differ
diff --git a/lld/test/COFF/comdat-weak.test b/lld/test/COFF/comdat-weak.test
new file mode 100644
index 0000000..a2b4688
--- /dev/null
@@ -0,0 +1,82 @@
+RUN: lld-link -lldmingw %S/Inputs/inline-weak.o %S/Inputs/inline-weak2.o -out:%t.exe
+
+When compiling certain forms of templated inline functions, some
+versions of GCC (tested with 5.4) produce a weak symbol for the function.
+Newer versions of GCC don't do this though.
+
+The bundled object files are an example of that; they can be produced
+with test code like this:
+
+$ cat inline-weak.h
+class MyClass {
+public:
+    template<typename... _Args> int get(_Args&&... args) {
+        return a;
+    }
+private:
+    int a;
+};
+
+$ cat inline-weak.cpp
+#include "inline-weak.h"
+
+int get(MyClass& a);
+
+int main(int argc, char* argv[]) {
+    MyClass a;
+    int ret = a.get();
+    ret += get(a);
+    return ret;
+}
+extern "C" void mainCRTStartup(void) {
+    main(0, (char**)0);
+}
+extern "C" void __main(void) {
+}
+
+$ cat inline-weak2.cpp
+#include "inline-weak.h"
+
+int get(MyClass& a) {
+    return a.get();
+}
+
+$ x86_64-w64-mingw32-g++ -std=c++11 -c inline-weak.cpp
+$ x86_64-w64-mingw32-g++ -std=c++11 -c inline-weak2.cpp
+
+$ x86_64-w64-mingw32-nm inline-weak.o | grep MyClass3get
+0000000000000000 p .pdata$_ZN7MyClass3getIJEEEiDpOT_
+0000000000000000 t .text$_ZN7MyClass3getIJEEEiDpOT_
+0000000000000000 T .weak._ZN7MyClass3getIIEEEiDpOT_.main
+0000000000000000 r .xdata$_ZN7MyClass3getIJEEEiDpOT_
+                 w _ZN7MyClass3getIIEEEiDpOT_
+0000000000000000 T _ZN7MyClass3getIJEEEiDpOT_
+
+$ x86_64-w64-mingw32-nm inline-weak2.o | grep MyClass3get
+0000000000000000 p .pdata$_ZN7MyClass3getIJEEEiDpOT_
+0000000000000000 t .text$_ZN7MyClass3getIJEEEiDpOT_
+0000000000000000 T .weak._ZN7MyClass3getIIEEEiDpOT_._Z3getR7MyClass
+0000000000000000 r .xdata$_ZN7MyClass3getIJEEEiDpOT_
+                 w _ZN7MyClass3getIIEEEiDpOT_
+0000000000000000 T _ZN7MyClass3getIJEEEiDpOT_
+
+This can't be reproduced by assembling .s files with llvm-mc, since that
+always produces a symbol named .weak.<weaksymbol>.default. Therefore,
+the test uses prebuilt object files instead.
+
+In these cases, the undefined weak symbol points to the regular symbol
+.weak._ZN7MyClass3getIIEEEiDpOT_.<othersymbol>, where <othersymbol>
+varies among the object files that emit the same function. This regular
+symbol points to the same location as the comdat function
+_ZN7MyClass3getIJEEEiDpOT_.
+
+When linking, the comdat section from the second object file gets
+discarded, as it matches the one that already exists. This means that
+the uniquely named symbol .weak.<weakname>.<othername> points to a
+discarded section chunk.
+
+Previously, this case triggered the addition of an Undefined symbol,
+which would later break linking. However, even before this change, if
+the second object file was linked in via a static library, the
+leftover symbol was retained as a Lazy symbol, which made the link
+succeed.
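
The decision made in the InputFiles.cpp hunk can be sketched in isolation as follows. This is only an illustration, not lld code: the `Outcome` enum and the `classifySymbol` and `runExamples` names are hypothetical, `sectionKept` stands in for the `SC` pointer being non-null, and `minGW` stands in for `Config->MinGW`.

```cpp
#include <cassert>
#include <string>

// Hypothetical outcome categories; the real code returns a Symbol pointer
// from addRegular(), nullptr, or a Symbol pointer from addUndefined().
enum class Outcome { Regular, Skipped, Undefined };

// Sketch of the patched branch. sectionKept is false when the comdat
// section the symbol points to was discarded as a duplicate.
Outcome classifySymbol(const std::string &name, bool sectionKept, bool minGW) {
  if (sectionKept)
    return Outcome::Regular;
  // New in this patch: in MinGW mode, .weak.* symbols that point to a
  // discarded section are dropped instead of becoming Undefined.
  if (minGW && name.compare(0, 6, ".weak.") == 0)
    return Outcome::Skipped;
  return Outcome::Undefined;
}

// The cases described in the test file, using the symbol names from the
// nm output above.
void runExamples() {
  const std::string alias =
      ".weak._ZN7MyClass3getIIEEEiDpOT_._Z3getR7MyClass";
  // First object file: the comdat section is kept.
  assert(classifySymbol(alias, true, true) == Outcome::Regular);
  // Second object file in MinGW mode: section discarded, symbol skipped.
  assert(classifySymbol(alias, false, true) == Outcome::Skipped);
  // Outside MinGW mode the behaviour is unchanged.
  assert(classifySymbol(alias, false, false) == Outcome::Undefined);
}
```

If something does refer to a skipped symbol, the sketch does not model what happens next; per the code comment in the patch, linking is expected to fail later with an undefined reference.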