[lld][WebAssembly] Allow data symbols to extend past end of segment
authorSam Clegg <sbc@chromium.org>
Tue, 11 May 2021 22:16:00 +0000 (15:16 -0700)
committerSam Clegg <sbc@chromium.org>
Wed, 12 May 2021 20:43:37 +0000 (13:43 -0700)
This fixes a bug with string merging with string symbols that contain
NULLs, as is the case in the `merge-string.s` test.

The bug only showed when we run with `--relocatable` and then try read
the resulting object back in.  In this case we would end up with string
symbols that extend past the end of the segment in which they live.

The problem comes from the fact that sections which are flagged as
string mergable assume that all strings are NULL terminated.  The
merging algorithm will drop trailing chars that follow a NULL since they
are essentially unreachable.  However, the "size" attribute (in the
symbol table) of such a truncated symbol is not updated resulting a
symbol size that can overlap the end of the segment.

I verified that this can happen in ELF too given the right conditions
and the its harmless enough.  In practice Strings that contain embedded
null should not be part of a mergable section.

Differential Revision: https://reviews.llvm.org/D102281

lld/test/wasm/merge-string.s
llvm/lib/Object/WasmObjectFile.cpp
llvm/test/Object/wasm-bad-data-symbol.yaml [new file with mode: 0644]

index c422bc7..a4b89ab 100644 (file)
 // RUN: wasm-ld -O0 %t.o -o %t2.wasm --no-gc-sections --no-entry
 // RUN: obj2yaml %t2.wasm | FileCheck --check-prefixes=COMMON,NOMERGE %s
 
+// Check relocatable
+// RUN: wasm-ld -r %t.o -o %t2.o
+// RUN: obj2yaml %t2.o | FileCheck --check-prefixes=RELOC %s
+
         .section .rodata1,"S",@
         .asciz "abc"
 foo:
@@ -18,7 +22,7 @@ foo:
 bar:
         .asciz "bc"
         .asciz "bc"
-        .size bar, 4
+        .size bar, 6
 
         .section .rodata_relocs,"",@
 negative_addend:
@@ -74,3 +78,24 @@ negative_addend:
 //  COMMON-NEXT:          Value:           1024
 //   MERGE-NEXT:          Content:         '61626300'
 // NOMERGE-NEXT:          Content:         '6162630061626300626300'
+
+
+//      RELOC:  - Type:            DATA
+// RELOC-NEXT:    Relocations:
+// RELOC-NEXT:      - Type:            R_WASM_MEMORY_ADDR_I32
+// RELOC-NEXT:        Index:           0
+// RELOC-NEXT:        Offset:          0xF
+// RELOC-NEXT:        Addend:          -10
+// RELOC-NEXT:    Segments:
+// RELOC-NEXT:      - SectionOffset:   6
+// RELOC-NEXT:        InitFlags:       0
+// RELOC-NEXT:        Offset:
+// RELOC-NEXT:          Opcode:          I32_CONST
+// RELOC-NEXT:          Value:           0
+// RELOC-NEXT:        Content:         '61626300'
+// RELOC-NEXT:      - SectionOffset:   15
+// RELOC-NEXT:        InitFlags:       0
+// RELOC-NEXT:        Offset:
+// RELOC-NEXT:          Opcode:          I32_CONST
+// RELOC-NEXT:          Value:           4
+// RELOC-NEXT:        Content:         F6FFFFFF
index df64a46..8004c5e 100644 (file)
@@ -637,9 +637,12 @@ Error WasmObjectFile::parseLinkingSectionSymtab(ReadContext &Ctx) {
                                                 object_error::parse_failed);
         auto Offset = readVaruint64(Ctx);
         auto Size = readVaruint64(Ctx);
-        if (Offset + Size > DataSegments[Index].Data.Content.size())
-          return make_error<GenericBinaryError>("invalid data symbol offset",
-                                                object_error::parse_failed);
+        size_t SegmentSize = DataSegments[Index].Data.Content.size();
+        if (Offset > SegmentSize)
+          return make_error<GenericBinaryError>(
+              "invalid data symbol offset: `" + Info.Name + "` (offset: " +
+                  Twine(Offset) + " segment size: " + Twine(SegmentSize) + ")",
+              object_error::parse_failed);
         Info.DataRef = wasm::WasmDataReference{Index, Offset, Size};
       }
       break;
diff --git a/llvm/test/Object/wasm-bad-data-symbol.yaml b/llvm/test/Object/wasm-bad-data-symbol.yaml
new file mode 100644 (file)
index 0000000..a30b2fe
--- /dev/null
@@ -0,0 +1,31 @@
+# RUN: yaml2obj %s | not llvm-objdump -s - 2>&1 | FileCheck %s
+
+# Check that data symbols must have and offset that is within the
+# bounds of the containing segment
+
+# CHECK: invalid data symbol offset: `foo` (offset: 42 segment size: 5)
+
+--- !WASM
+FileHeader:
+  Version:         0x00000001
+Sections:
+  - Type:            DATA
+    Segments:
+      - SectionOffset:   0
+        InitFlags:       0
+        Offset:
+          Opcode:          I32_CONST
+          Value:           0
+        Content:         '6401020304'
+  - Type:            CUSTOM
+    Name:            linking
+    Version:         2
+    SymbolTable:
+      - Index:           0
+        Kind:            DATA
+        Name:            foo
+        Flags:           [ ]
+        Segment:         0
+        Offset:          42
+        Size:            1
+...