platform/upstream/llvm.git
2 years ago[lldb/test] Allow indentation in inline tests
Pavel Labath [Thu, 28 Oct 2021 09:23:24 +0000 (11:23 +0200)]
[lldb/test] Allow indentation in inline tests

This makes it possible to use for loops (and other language constructs)
in inline tests.

Differential Revision: https://reviews.llvm.org/D112706

2 years ago[InstCombine] allow Negator to fold multi-use select with constant arms
Sanjay Patel [Thu, 28 Oct 2021 12:11:59 +0000 (08:11 -0400)]
[InstCombine] allow Negator to fold multi-use select with constant arms

The motivating test is reduced from:
https://llvm.org/PR52261

Note that the more general problem of folding any binop into a multi-use
select of constants is still there. We need to ease the restriction in
InstCombinerImpl::FoldOpIntoSelect() to catch those. But these examples
never reach that code because Negator exclusively handles negation
patterns within visitSub().
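
The constant-arm case rests on the identity -(select c, C1, C2) == select c, -C1, -C2.
A minimal sketch in plain C++, assuming nothing about the actual Negator code
(both function names are hypothetical), of why no negation instruction is left
over once both arms are constants:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical model of the unfolded form: select first, then negate (sub 0, x).
inline int32_t negateOfSelect(bool cond, int32_t c1, int32_t c2) {
  int32_t selected = cond ? c1 : c2;
  // Plain C++ negation overflows on INT32_MIN, so this sketch excludes it.
  assert(selected != std::numeric_limits<int32_t>::min());
  return -selected;
}

// Hypothetical model of the folded form: both constant arms are negated up
// front, so a new select of constants replaces the negation entirely.
inline int32_t selectOfNegated(bool cond, int32_t c1, int32_t c2) {
  assert(c1 != std::numeric_limits<int32_t>::min());
  assert(c2 != std::numeric_limits<int32_t>::min());
  return cond ? -c1 : -c2;
}
```

Both forms agree for either condition value, e.g. with arms 42 and -7.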

Differential Revision: https://reviews.llvm.org/D112657

2 years ago[InstCombine][ConstantFolding] Make ConstantFoldLoadThroughBitcast TypeSize-aware
Peter Waller [Thu, 28 Oct 2021 12:14:52 +0000 (12:14 +0000)]
[InstCombine][ConstantFolding] Make ConstantFoldLoadThroughBitcast TypeSize-aware

The newly added test previously caused the compiler to fail an
assertion. It looks like a straightforward TypeSize upgrade.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D112142

2 years ago[InstSimplify] Add tests for the range of a half float. NFC
David Green [Thu, 28 Oct 2021 11:58:13 +0000 (12:58 +0100)]
[InstSimplify] Add tests for the range of a half float. NFC

2 years ago[GlobalISel][Tablegen] Fix SameOperandMatcher's isIdentical check
Konstantin Schwarz [Sun, 10 Oct 2021 09:17:07 +0000 (11:17 +0200)]
[GlobalISel][Tablegen] Fix SameOperandMatcher's isIdentical check

During rule optimization, identical SameOperandMatchers are hoisted into a common group.
Previously, however, only one operand index was considered.
Commutable patterns can introduce SameOperandMatcher checks where the second index is commuted,
resulting in a different check that cannot be hoisted.

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D111506

2 years ago[libomptarget] Build DeviceRTL for amdgpu
Jon Chesterfield [Thu, 28 Oct 2021 11:33:25 +0000 (12:33 +0100)]
[libomptarget] Build DeviceRTL for amdgpu

Passes same tests as the current deviceRTL. Includes cmake change from D111987.
CI is showing a different set of passes and failures than local runs, so this is
committed without the tests enabled by default while that difference is debugged.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112227

2 years agotsan: move memory access functions to a separate file
Dmitry Vyukov [Wed, 27 Oct 2021 14:00:23 +0000 (16:00 +0200)]
tsan: move memory access functions to a separate file

tsan_rtl.cpp is huge and does lots of things.
Move everything related to memory access and tracing
to a separate tsan_rtl_access.cpp file.
No functional changes, only code movement.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D112625

2 years ago[AMDGPU] Add 24-bit mulhi intrinsics in INTRINSIC_WO_CHAIN combine.
Abinav Puthan Purayil [Thu, 28 Oct 2021 10:38:59 +0000 (16:08 +0530)]
[AMDGPU] Add 24-bit mulhi intrinsics in INTRINSIC_WO_CHAIN combine.

The operands of the mul24 intrinsics are simplified by
AMDGPUTargetLowering::performIntrinsicWOChainCombine(). This change adds
the mul24hi intrinsics to the combine since their operands can be
simplified like those of the mul24 intrinsics.
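
A rough sketch of why the operands can be simplified, assuming the usual
semantics that only the low 24 bits of each operand are read. The helper names
are hypothetical and model the unsigned variants only:

```cpp
#include <cassert>
#include <cstdint>

// Assumption of this sketch: bits 24..31 of each operand are ignored.
constexpr uint32_t kMask24 = 0x00FFFFFFu;

// Low 32 bits of the 48-bit product of the low 24 bits of a and b.
inline uint32_t mulU24(uint32_t a, uint32_t b) {
  uint64_t product = uint64_t(a & kMask24) * uint64_t(b & kMask24);
  return static_cast<uint32_t>(product);
}

// High 32 bits of the same product, shifted down as a 64-bit value.
inline uint32_t mulhiU24(uint32_t a, uint32_t b) {
  uint64_t product = uint64_t(a & kMask24) * uint64_t(b & kMask24);
  return static_cast<uint32_t>(product >> 32);
}
```

Because the top 8 bits never reach the multiply, any computation that only
feeds those bits is dead for both the low and the high result, which is what
makes the same operand simplification valid for the mulhi forms.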

Differential Revision: https://reviews.llvm.org/D112702

2 years ago[AMDGPU] Fix rhs of the tests in amdgpu-codegenprepare-mul24.ll.
Abinav Puthan Purayil [Thu, 28 Oct 2021 01:33:48 +0000 (07:03 +0530)]
[AMDGPU] Fix rhs of the tests in amdgpu-codegenprepare-mul24.ll.

Differential Revision: https://reviews.llvm.org/D112685

2 years ago[libc] automemcpy
Guillaume Chatelet [Mon, 11 Oct 2021 15:26:43 +0000 (15:26 +0000)]
[libc] automemcpy

2 years ago[flang] Checks for pointers to intrinsic functions
Emil Kieri [Mon, 25 Oct 2021 19:43:17 +0000 (21:43 +0200)]
[flang] Checks for pointers to intrinsic functions

Check that when a procedure pointer is initialised or assigned with an intrinsic
function, or when its interface is being defined by one, that intrinsic function
is unrestricted specific (listed in Table 16.2 of F'2018).

Mark intrinsics LGE, LGT, LLE, and LLT as restricted specific. Getting their
classifications right helps in designing the tests.

Differential Revision: https://reviews.llvm.org/D112381

2 years ago[clangd] NFC: Use more idiomatic way of checking for definition
Kirill Bobyrev [Thu, 28 Oct 2021 10:25:12 +0000 (12:25 +0200)]
[clangd] NFC: Use more idiomatic way of checking for definition

2 years ago[clangd] NFC: Match function signature in the header and source file
Kirill Bobyrev [Thu, 28 Oct 2021 10:11:31 +0000 (12:11 +0200)]
[clangd] NFC: Match function signature in the header and source file

2 years ago[dexter] XFAIL feature_test source-root-dir.cpp
OCHyams [Thu, 28 Oct 2021 09:17:26 +0000 (10:17 +0100)]
[dexter] XFAIL feature_test source-root-dir.cpp

Test is failing for unknown reasons and needs investigating.

2 years ago[AMDGPU] Add gfx10 uaddsat test coverage. NFC.
Jay Foad [Thu, 28 Oct 2021 08:39:19 +0000 (09:39 +0100)]
[AMDGPU] Add gfx10 uaddsat test coverage. NFC.

2 years ago[Test] Regenerate some of llc test checks using auto updater
Max Kazantsev [Thu, 28 Oct 2021 09:18:30 +0000 (16:18 +0700)]
[Test] Regenerate some of llc test checks using auto updater

2 years ago[analyzer] sprintf is a taint propagator not a source
Balazs Benics [Thu, 28 Oct 2021 09:03:02 +0000 (11:03 +0200)]
[analyzer] sprintf is a taint propagator not a source

Due to a typo, `sprintf()` was recognized as a taint source instead of a
taint propagator. It was because an empty taint source list - which is
the first parameter of the `TaintPropagationRule` - encoded the
unconditional taint sources.
This typo effectively turned the `sprintf()` into an unconditional taint
source.

This patch fixes that typo and demonstrates the correct behavior with
tests.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D112558

2 years ago[MLIR][OpenMP] Fixed the missing inclusive clause in omp.wsloop and fix order clause
Shraiysh Vaishay [Thu, 28 Oct 2021 05:34:40 +0000 (11:04 +0530)]
[MLIR][OpenMP] Fixed the missing inclusive clause in omp.wsloop and fix order clause

This patch adds the inclusive clause (which was missed in previous
reorganization - https://reviews.llvm.org/D110903) in omp.wsloop operation.
Added a test for validating it.

Also fixes the order clause, which was not accepting any values. It now accepts
"concurrent" as a value, as specified in the standard.

Reviewed By: kiranchandramohan, peixin, clementval

Differential Revision: https://reviews.llvm.org/D112198

2 years ago[AMDGPU][GlobalISel] Fix waterfall loops
Sebastian Neubauer [Thu, 28 Oct 2021 08:29:06 +0000 (10:29 +0200)]
[AMDGPU][GlobalISel] Fix waterfall loops

- Move the `s_and exec` to its correct position before the content of
  the waterfall loop
- Use the SI_WATERFALL pseudo instruction, like for sdag, to benefit
  from optimizations
- Add support for indirect function calls

To support indirect calls, add a G_SI_CALL instruction without register
class restrictions and insert a waterfall loop when applying register
banks.

Differential Revision: https://reviews.llvm.org/D109052

2 years ago[GlobalISel] Simplify RegBankSelect
Neubauer, Sebastian [Mon, 25 Oct 2021 14:11:42 +0000 (16:11 +0200)]
[GlobalISel] Simplify RegBankSelect

Save the instruction list of a block before selecting banks.
This makes it possible to cope with moved instructions, even if they are reordered
or split across multiple basic blocks.

Differential Revision: https://reviews.llvm.org/D111223

2 years ago[lldb] Remove ConstString from Process, ScriptInterpreter and StructuredData plugin...
Pavel Labath [Fri, 22 Oct 2021 17:53:43 +0000 (19:53 +0200)]
[lldb] Remove ConstString from Process, ScriptInterpreter and StructuredData plugin names

2 years ago[Test] Regenerate checks using auto-update script
Max Kazantsev [Thu, 28 Oct 2021 08:13:09 +0000 (15:13 +0700)]
[Test] Regenerate checks using auto-update script

2 years ago[Driver][AArch64]Add driver support for neoverse-512tvb target
Caroline Concatto [Fri, 22 Oct 2021 08:22:14 +0000 (09:22 +0100)]
[Driver][AArch64]Add driver support for neoverse-512tvb target

The support for neoverse-512tvb mirrors the same option available in GCC[1].
There is no functional effect for this option yet.
This patch ensures the driver accepts "-mcpu=neoverse-512tvb", and enough
plumbing is in place to allow the new option to be used in the future.

[1]https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html

Differential Revision: https://reviews.llvm.org/D112406

2 years ago[lldb] [Host/Socket] Make DecodeHostAndPort() return a dedicated struct
Michał Górny [Wed, 27 Oct 2021 15:52:45 +0000 (17:52 +0200)]
[lldb] [Host/Socket] Make DecodeHostAndPort() return a dedicated struct

Differential Revision: https://reviews.llvm.org/D112629

2 years ago[flang] runtime: Read environment variables directly
Diana Picus [Tue, 12 Oct 2021 12:40:48 +0000 (12:40 +0000)]
[flang] runtime: Read environment variables directly

Add support for reading environment variables directly, via std::getenv.
This needs to allocate a C-style string to pass into std::getenv. If the
memory allocation for that fails, we terminate.

This also changes the interface for EnvVariableLength to receive the
source file and line so we can crash gracefully.

Note that we are now completely ignoring the envp pointer passed into
ProgramStart, since that could go stale if the environment is modified
during execution.
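
A minimal sketch of the approach described above, not the flang runtime code:
the function name and the error message are hypothetical, and the name is
assumed to arrive as a (pointer, length) pair without a terminating NUL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns the length of the variable's value, or 0 when
// the variable is unset.
std::size_t envVariableLengthSketch(const char *name, std::size_t nameLength,
                                    const char *sourceFile, int line) {
  // std::getenv needs a NUL-terminated name, so make a terminated copy.
  char *cName = static_cast<char *>(std::malloc(nameLength + 1));
  if (!cName) {
    // Report where the request came from, then terminate.
    std::fprintf(stderr, "%s:%d: out of memory reading environment\n",
                 sourceFile, line);
    std::abort();
  }
  std::memcpy(cName, name, nameLength);
  cName[nameLength] = '\0';
  // Query the live environment instead of a saved envp snapshot.
  const char *value = std::getenv(cName);
  std::free(cName);
  return value ? std::strlen(value) : 0;
}
```

Querying the environment at call time is what avoids the stale-envp problem:
a variable set after program start is still visible.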

Differential Revision: https://reviews.llvm.org/D111785

2 years ago[Support] [Windows] Manually clean up temp files if not setting delete disposition
Martin Storsjö [Mon, 4 Oct 2021 12:10:52 +0000 (15:10 +0300)]
[Support] [Windows] Manually clean up temp files if not setting delete disposition

Since D81803 / 79657e2339b58bc01fe1b85a448bb073d57d90bb, temp files
created on network shares don't set "Disposition.DeleteFile = true".
This flag normally takes care of removing the temp file both if the
process exits abnormally (either crashing or killed externally), and
when the file is closed cleanly.

For network shares, we voluntarily choose to not set the flag, and
if the operation to inspect the file handle (as a prerequisite to
setting the flag since 79657e2339b58bc01fe1b85a448bb073d57d90bb)
fails we also error out. In both of these cases, we can at least make
sure to remove the temp files when they are closed cleanly.

Adjust the semantics of "OF_Delete" to not set the delete
disposition, but only set the access mode for allowing deletion.
Move the call to setDeleteDisposition into TempFile::create,
where we can check if it failed, and if it did, set a flag noting
that the file should be removed manually at the end.

This does leak files on crash, but at least doesn't leak files
in regular successful runs. (Technically, the alternative codepath
could use the RemoveFileOnSignal function, but that might complicate
the TempFile implementation further.)

This fixes https://github.com/mstorsjo/llvm-mingw/issues/233 and
https://bugs.llvm.org/show_bug.cgi?id=52080.

Differential Revision: https://reviews.llvm.org/D111875

2 years ago[clang] [MinGW] Rename the 'Arch' member to 'SubdirName'. NFC.
Martin Storsjö [Sat, 16 Oct 2021 14:20:16 +0000 (17:20 +0300)]
[clang] [MinGW] Rename the 'Arch' member to 'SubdirName'. NFC.

This string isn't a plain architecture name, but contains the whole
subdir name used for the sysroot, which often is equal to the target
triple.

Differential Revision: https://reviews.llvm.org/D112387

2 years ago[clang][MIPS] Fix search path for Debian multilib O32
YunQiang Su [Wed, 27 Oct 2021 13:16:42 +0000 (16:16 +0300)]
[clang][MIPS] Fix search path for Debian multilib O32

In the multilib situation, the gcc objects are in a /32 directory. On
Debian, the libraries are under /libo32 to avoid conflicts. This patch
enables clang to find gcc in /32, and the C library in /libo32.

Differential Revision: https://reviews.llvm.org/D112158

2 years ago[clangd] Avoid expensive checks of buffer names in IncludeCleaner
Sam McCall [Wed, 27 Oct 2021 19:13:32 +0000 (21:13 +0200)]
[clangd] Avoid expensive checks of buffer names in IncludeCleaner

This changes the handling of special buffers (<command-line> etc) that
SourceManager treats as files but FileManager does not.

We now include them in findReferencedFiles() and drop them as part of
translateToHeaderIDs(). This pairs more naturally with the data representations
we're using, and so avoids a bunch of converting between representations for
filtering.

Differential Revision: https://reviews.llvm.org/D112652

2 years ago[CSSPGO] Trim cold base profiles for the CS preinliner.
Hongtao Yu [Wed, 27 Oct 2021 23:56:06 +0000 (16:56 -0700)]
[CSSPGO] Trim cold base profiles for the CS preinliner.

Adding support to the CS preinliner to trim cold base profiles. This makes trimming consistent with the inline decision made by the preinliner. Also disable the existing profile merger when preinliner is on unless explicitly specified.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D112489

2 years ago[RISCV] Sync Zvlsseg register order as the same as vector registers.
Hsiangkai Wang [Wed, 22 Sep 2021 23:48:46 +0000 (07:48 +0800)]
[RISCV] Sync Zvlsseg register order as the same as vector registers.

Sync the order of Zvlsseg registers with vector registers to avoid
unnecessary register copies between vector instructions and zvlsseg
instructions.

Differential Revision: https://reviews.llvm.org/D110250

2 years agoAdd unix signal hit counts to the target statistics.
Greg Clayton [Thu, 28 Oct 2021 01:33:17 +0000 (18:33 -0700)]
Add unix signal hit counts to the target statistics.

Android and other platforms make wide use of signals when running applications and this can slow down debug sessions. Tracking this statistic can help us to determine why a debug session is slow.

The new data appears inside each target object and reports the signal hit counts:

      "signals": [
        {
          "SIGSTOP": 1
        },
        {
          "SIGUSR1": 1
        }
      ],

Differential Revision: https://reviews.llvm.org/D112683

2 years ago[mlir][GPUtoNVVM] Relax restriction on wmma op lowering
thomasraoux [Mon, 25 Oct 2021 19:42:36 +0000 (12:42 -0700)]
[mlir][GPUtoNVVM] Relax restriction on wmma op lowering

Allow lowering of wmma ops with 64bits indexes. Change the default
version of the test to use default layout.

Differential Revision: https://reviews.llvm.org/D112479

2 years ago[AMDGPU] Remove unused declaration findNumUsedRegistersSI (NFC)
Kazu Hirata [Thu, 28 Oct 2021 04:24:02 +0000 (21:24 -0700)]
[AMDGPU] Remove unused declaration findNumUsedRegistersSI (NFC)

2 years ago[Test] Add test showing missing simplifycfg opportunity for Phi with undef inputs
Max Kazantsev [Thu, 28 Oct 2021 04:22:34 +0000 (11:22 +0700)]
[Test] Add test showing missing simplifycfg opportunity for Phi with undef inputs

2 years ago[X86] Add a dependency breaking xor before any gathers with an undef passthru value.
Phoebe Wang [Thu, 28 Oct 2021 02:56:04 +0000 (10:56 +0800)]
[X86] Add a dependency breaking xor before any gathers with an undef passthru value.

In the instruction encoding, the passthru register is always
tied to the destination register. The CPU scheduler has to wait
for the last writer of this register to finish executing before
the gather can start. This is true even if the initial mask is
all ones so that the passthru will never be used.

By explicitly zeroing the register we can break the false
dependency. The zero idiom is executed completely by the
register renamer and so is immediately considered ready.

Authored by Craig.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D112505

2 years ago[RISCV] Use vmv.v.[v|i] if we know COPY is under the same vl and vtype.
Hsiangkai Wang [Fri, 2 Jul 2021 01:25:50 +0000 (09:25 +0800)]
[RISCV] Use vmv.v.[v|i] if we know COPY is under the same vl and vtype.

If we know the source operand of COPY is defined by a vector instruction
with a tail-agnostic policy and the same LMUL, and there is no vsetvli between
COPY and the defining instruction that changes vl and vtype, we could use
vmv.v.v or vmv.v.i to copy vector registers and get better performance than
with the whole vector register move instructions.

If the source of COPY is from vmv.v.i, we could use vmv.v.i for the
COPY.

This patch only considers all these instructions within one basic block.

Case 1:
```
bb.0:
  ...
  VSETVLI          # The first VSETVLI before COPY and VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli between VOP and COPY.
  vx = COPY vy
```

Case 2:
```
bb.0:
  ...
  VSETVLI          # The first VSETVLI before VOP.
  ...              # Use this VSETVLI to check LMUL and tail agnostic.
  ...
  vy = VOP va, vb  # Define vy.
  ...              # There is no vsetvli to change vl between VOP and COPY.
  ...
  VSETVLI          # The first VSETVLI before COPY.
  ...              # This VSETVLI does not change vl and vtype.
  ...
  vx = COPY vy
```

Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Co-Authored-by: Kito Cheng <kito.cheng@sifive.com>
Differential Revision: https://reviews.llvm.org/D103510

2 years ago[clang] Fortify warning for scanf calls with field width too big.
Michael Benfield [Thu, 14 Oct 2021 20:02:28 +0000 (20:02 +0000)]
[clang] Fortify warning for scanf calls with field width too big.

Differential Revision: https://reviews.llvm.org/D111833

2 years ago[AMDGPU] Add more llc tests for 48-bit mul generation.
Abinav Puthan Purayil [Tue, 26 Oct 2021 16:08:21 +0000 (21:38 +0530)]
[AMDGPU] Add more llc tests for 48-bit mul generation.

Differential Revision: https://reviews.llvm.org/D112554

2 years ago[SCEV] Invalidate user SCEVs along with operand SCEVs to avoid cache corruption
Max Kazantsev [Thu, 28 Oct 2021 02:08:48 +0000 (09:08 +0700)]
[SCEV] Invalidate user SCEVs along with operand SCEVs to avoid cache corruption

Following discussion in D110390, it seems that we are suffering from an inability
to traverse users of a SCEV being invalidated. As a result, ScalarEvolution's
inner caches may store obsolete data about SCEVs even if their operands are
forgotten. This creates problems when we try to verify the contents of those caches.

Messing with the cache also frequently causes very sneaky and
hard-to-analyze memory corruption bugs when dealing with cached
data. They are lurking there because ScalarEvolution's verification is not powerful
enough and misses many problematic cases. I plan to make SCEV's verification
much stricter in follow-ups, and this requires caches free of dangling pointers.

This patch makes sure that, whenever we forget cached information for a SCEV,
we also forget it for all SCEVs that (transitively) use it.
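
The transitive invalidation can be sketched as a worklist walk over a
user map. This is an illustration under simplifying assumptions, not the
ScalarEvolution implementation: expressions are named by strings, and the
struct and its members are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a memoization cache keyed by expression.
struct ExprCacheSketch {
  // Memoized result per expression.
  std::unordered_map<std::string, int> cached;
  // Operand -> expressions that directly use it.
  std::unordered_map<std::string, std::vector<std::string>> users;

  // Forget `root` and every expression that transitively uses it.
  void forgetWithUsers(const std::string &root) {
    std::vector<std::string> worklist{root};
    std::unordered_set<std::string> visited{root};
    while (!worklist.empty()) {
      std::string current = worklist.back();
      worklist.pop_back();
      cached.erase(current);
      auto it = users.find(current);
      if (it == users.end())
        continue;
      for (const std::string &user : it->second)
        if (visited.insert(user).second) // enqueue each user only once
          worklist.push_back(user);
    }
  }
};
```

The visited set keeps the walk linear in the number of reachable users even
when several operands share a user, which is where the compile-time cost
mentioned below comes from.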

This may have negative compile time impact. It's a sacrifice we are more
than willing to make to enforce correctness. We can also save some time by
reworking invokers of forgetMemoizedResults (maybe we can forget multiple
SCEVs with single query).

Differential Revision: https://reviews.llvm.org/D111533
Reviewed By: reames

2 years ago[RISCV] Replace most uses of RISCVSubtarget::hasStdExtV. NFCI
Craig Topper [Thu, 28 Oct 2021 02:19:03 +0000 (19:19 -0700)]
[RISCV] Replace most uses of RISCVSubtarget::hasStdExtV. NFCI

Add new hasVInstructions() which is currently equivalent.

Replace vector uses of hasStdExtZfh/F/D with new vector specific
versions. The vector spec no longer requires that the vectors implement the
same types as scalar. It only requires that the scalar type is
the maximum size the vectors can support. This is currently
implemented using the scalar rule we were using before.

Add new hasVInstructionsI64() and begin using it to qualify code that
requires i64 vector elements.

This is all NFC for now, but we can start using this to better
implement D112408 which introduces the Zve extensions.

Reviewed By: frasercrmck, eopXD

Differential Revision: https://reviews.llvm.org/D112496

2 years ago[hwasan] print exact mismatch offset for short granules.
Florian Mayer [Fri, 22 Oct 2021 00:23:45 +0000 (01:23 +0100)]
[hwasan] print exact mismatch offset for short granules.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D104463

2 years ago[clang][compiler-rt][atomics] Add `__c11_atomic_fetch_nand` builtin and support ...
Kai Luo [Thu, 28 Oct 2021 02:18:16 +0000 (02:18 +0000)]
[clang][compiler-rt][atomics] Add `__c11_atomic_fetch_nand` builtin and support `__atomic_fetch_nand` libcall

Add `__c11_atomic_fetch_nand` builtin to language extensions and support `__atomic_fetch_nand` libcall in compiler-rt.
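
For reference, fetch-nand stores ~(old & operand) and returns the old value.
The sketch below uses the generic `__atomic_fetch_nand` builtin that GCC and
Clang already provide, rather than the new `__c11_atomic_fetch_nand` (which
operates on `_Atomic` objects); the wrapper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics, written non-atomically for comparison.
inline uint32_t fetchNandReference(uint32_t &object, uint32_t operand) {
  uint32_t old = object;
  object = ~(old & operand);
  return old;
}

// The same operation performed atomically via the compiler builtin.
inline uint32_t fetchNandAtomic(uint32_t &object, uint32_t operand) {
  return __atomic_fetch_nand(&object, operand, __ATOMIC_SEQ_CST);
}
```

Starting from 0xC with operand 0xA, both versions return 0xC and leave
0xFFFFFFF7 in the object.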

Reviewed By: theraven

Differential Revision: https://reviews.llvm.org/D112400

2 years ago[OpenMP] Declare variants for templates need to match # template args
Johannes Doerfert [Thu, 28 Oct 2021 00:39:28 +0000 (19:39 -0500)]
[OpenMP] Declare variants for templates need to match # template args

A declare variant template is only compatible with a base when the
number of template arguments is equal, otherwise our instantiations will
produce nonsensical results.

Exposed as part of D109344.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D109770

2 years ago[Attributor][FIX] Do not ignore memory writes in AAMemoryBehavior
Johannes Doerfert [Mon, 13 Sep 2021 12:40:41 +0000 (07:40 -0500)]
[Attributor][FIX] Do not ignore memory writes in AAMemoryBehavior

Even if we look for `nocapture` we need to bail on escaping pointers.
The crucial thing is that we might not look at a big enough scope when
we derive the memory behavior. Thus, it might be `nocapture` in a larger
context while it is "captured" in a smaller context.

2 years ago[Attributor][NFX] Pre-commit test case exposing a problem
Johannes Doerfert [Mon, 13 Sep 2021 12:34:51 +0000 (07:34 -0500)]
[Attributor][NFX] Pre-commit test case exposing a problem

The test case is the IR of:

```
  void func(float * restrict a, float *b, int N) {
    N = 199;
    #pragma omp parallel for
    for (int i = 1; i < N; i++)
      a[i] = b[i] + 1.0;
  }
```

2 years ago[Attributor][NFC] Improve debug messages
Johannes Doerfert [Wed, 8 Sep 2021 20:57:18 +0000 (15:57 -0500)]
[Attributor][NFC] Improve debug messages

2 years ago[CMake] Cache the compiler-rt library search results
Petr Hosek [Tue, 29 Sep 2020 00:37:20 +0000 (17:37 -0700)]
[CMake] Cache the compiler-rt library search results

There are a lot of duplicated calls to find various compiler-rt libraries
from the builds of runtime libraries like libunwind, libc++, libc++abi and
compiler-rt. The compiler-rt helper module already implements caching
of results to avoid repeated Clang invocations.

This change moves the compiler-rt implementation into a shared location
and reuses it from other runtimes to reduce duplication and speed up
the build.

Differential Revision: https://reviews.llvm.org/D88458

2 years ago[openmp] Fix a git misfire in cf37a94c1e42ce
Jon Chesterfield [Thu, 28 Oct 2021 00:35:16 +0000 (01:35 +0100)]
[openmp] Fix a git misfire in cf37a94c1e42ce

2 years ago[lld-macho] Implement -S
Vincent Lee [Wed, 27 Oct 2021 04:42:25 +0000 (21:42 -0700)]
[lld-macho] Implement -S

There are a couple internal builds that require the use of this flag.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D112594

2 years agoRevert "[libomptarget] Build DeviceRTL for amdgpu"
Jon Chesterfield [Thu, 28 Oct 2021 00:01:53 +0000 (01:01 +0100)]
Revert "[libomptarget] Build DeviceRTL for amdgpu"
 - more tests failing on CI than failed locally when writing this patch

This reverts commit 33427fdb7b52b79ce5e25b7e14e0f1a44d876bd2.

2 years ago[openmp] Add amdgpu impl missed from D112153
Jon Chesterfield [Wed, 27 Oct 2021 23:54:29 +0000 (00:54 +0100)]
[openmp] Add amdgpu impl missed from D112153

2 years agoAdd breakpoint resolving stats to each target.
Greg Clayton [Wed, 27 Oct 2021 00:48:42 +0000 (17:48 -0700)]
Add breakpoint resolving stats to each target.

This patch adds breakpoints to each target's statistics so we can track how long it takes to resolve each breakpoint. It also includes the structured data for each breakpoint so the exact breakpoint details are logged to allow for reproduction of slow resolving breakpoints. Each target gets a new "breakpoints" array that contains breakpoint details. Each breakpoint has "details" which is the JSON representation of a serialized breakpoint resolver and filter, "id" which is the breakpoint ID, and "resolveTime" which is the time in seconds it took to resolve the breakpoint. A snippet of the new data is shown here:

  "targets": [
    {
      "breakpoints": [
        {
          "details": {...},
          "id": 1,
          "resolveTime": 0.00039291599999999999
        },
        {
          "details": {...},
          "id": 2,
          "resolveTime": 0.00022679199999999999
        }
      ],
      "totalBreakpointResolveTime": 0.00061970799999999996
    }
  ]

This provides full details on exactly how breakpoints were set and how long it took to resolve them.

Differential Revision: https://reviews.llvm.org/D112587

2 years ago[ARM] Use hardware TLS register in Thumb2 mode when -mtp=cp15 is passed
Ard Biesheuvel [Wed, 27 Oct 2021 23:27:00 +0000 (16:27 -0700)]
[ARM] Use hardware TLS register in Thumb2 mode when -mtp=cp15 is passed

In ARM mode, passing -mtp=cp15 forces the use of an inline MRC system register read to move the thread pointer value into a register.

Currently, in Thumb2 mode, -mtp=cp15 is ignored, and a call to the __aeabi_read_tp helper is emitted instead.

This is inconsistent, and breaks the Linux/ARM build for Thumb2 targets, as the Linux kernel does not provide an implementation of __aeabi_read_tp.

Reviewed By: nickdesaulniers, peter.smith

Differential Revision: https://reviews.llvm.org/D112600

2 years ago[libomptarget] Build DeviceRTL for amdgpu
Jon Chesterfield [Wed, 27 Oct 2021 23:39:37 +0000 (00:39 +0100)]
[libomptarget] Build DeviceRTL for amdgpu

Passes same tests as the current deviceRTL. Includes cmake change from D111987.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112227

2 years ago[lldb] The os and version are not separate components in the triple
Jonas Devlieghere [Wed, 27 Oct 2021 23:15:49 +0000 (16:15 -0700)]
[lldb] The os and version are not separate components in the triple

Create a valid triple in the Darwin builder. Previously it was
incorrectly treating the os and version as two separate components in
the triple.
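
To illustrate the shape of a valid triple, here is a small sketch. It is not
the lldb builder code; the helper and its parameters are hypothetical, and the
optional trailing environment component is an assumption of the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the OS name and its version form ONE component,
// e.g. "ios14.0", never "ios-14.0".
std::string makeTripleSketch(const std::string &arch, const std::string &vendor,
                             const std::string &os, const std::string &version,
                             const std::string &environment = "") {
  std::string triple = arch + "-" + vendor + "-" + os + version;
  if (!environment.empty())
    triple += "-" + environment;
  return triple;
}
```

With this shape, `makeTripleSketch("arm64", "apple", "ios", "14.0", "simulator")`
yields `arm64-apple-ios14.0-simulator`.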

Differential revision: https://reviews.llvm.org/D112676

2 years agoRevert "[ORC] Change SPSExecutorAddr serialization, SupportFunctionCall struct."
Lang Hames [Wed, 27 Oct 2021 23:39:24 +0000 (16:39 -0700)]
Revert "[ORC] Change SPSExecutorAddr serialization, SupportFunctionCall struct."

This reverts commit e32b1eee6aab52e2b7b75ee15e506b3e7dd30e68.

Reverting while I fix some broken unit tests.

2 years ago[Attributor][FIX] Use right address space to avoid assertion
Johannes Doerfert [Tue, 26 Oct 2021 15:06:56 +0000 (10:06 -0500)]
[Attributor][FIX] Use right address space to avoid assertion

When we strip and accumulate constant offsets we need to pick the right
address space such that the offset APInt has the right bit width.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D112544

2 years ago[OpenMP] Remove obsolete external interface for device RT
Johannes Doerfert [Wed, 20 Oct 2021 15:45:13 +0000 (10:45 -0500)]
[OpenMP] Remove obsolete external interface for device RT

We do not generate _serialized_parallel calls in device mode, so there is no
need for an external API.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D112145

2 years ago[OpenMP][FIX] Do not adjust the level after the environment was popped
Johannes Doerfert [Wed, 20 Oct 2021 15:37:50 +0000 (10:37 -0500)]
[OpenMP][FIX] Do not adjust the level after the environment was popped

Exiting a data environment will reset all values, so it is wrong to adjust
them afterwards.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D112144

2 years ago[OpenMP] Introduce aligned synchronization into the new device RT
Johannes Doerfert [Wed, 20 Oct 2021 16:28:18 +0000 (11:28 -0500)]
[OpenMP] Introduce aligned synchronization into the new device RT

We will later use the fact that a barrier is aligned to reason about
thread divergence. For now we introduce the assumption and some more
documentation.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D112153

2 years ago[ORC] Change SPSExecutorAddr serialization, SupportFunctionCall struct.
Lang Hames [Wed, 27 Oct 2021 20:40:52 +0000 (13:40 -0700)]
[ORC] Change SPSExecutorAddr serialization, SupportFunctionCall struct.

SPSExecutorAddr will now be serializable to/from ExecutorAddr, rather than
uint64_t. This improves type safety when working with serialized addresses.

Also updates the SupportFunctionCall to use an ExecutorAddrRange (rather than
a separate ExecutorAddr addr and uint64_t size field), and updates the
tpctypes::*Write data structures to use ExecutorAddr rather than
JITTargetAddress.
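
The type-safety argument can be sketched with stand-in types. These are
hypothetical simplifications named after the ideas in this message, not the
LLVM definitions:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical address wrapper: no implicit conversion from plain integers,
// so a size or count cannot be passed where an address is expected.
class AddrSketch {
public:
  constexpr explicit AddrSketch(uint64_t value) : value(value) {}
  constexpr uint64_t getValue() const { return value; }

private:
  uint64_t value;
};

// Hypothetical range type replacing a loose (address, size) pair.
struct AddrRangeSketch {
  AddrSketch start;
  AddrSketch end;
  constexpr uint64_t size() const { return end.getValue() - start.getValue(); }
};

// Builds a range from a start address and a byte count.
constexpr AddrRangeSketch makeRange(AddrSketch start, uint64_t size) {
  return AddrRangeSketch{start, AddrSketch(start.getValue() + size)};
}
```

Keeping start and end together means the two halves of a buffer description
cannot be serialized in the wrong order or separated by accident.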

2 years ago[OpenMP][FIX] Query proper thread ID information to support nesting
Johannes Doerfert [Sat, 16 Oct 2021 19:39:55 +0000 (14:39 -0500)]
[OpenMP][FIX] Query proper thread ID information to support nesting

The OpenMP thread ID is not the hardware thread ID if we have nesting.
We need to ask the runtime properly to ensure correct results.

Note that the loop interface is going to change soon so we do not adjust
it now but simply ignore the extra argument.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D111950

2 years ago[OpenMP][FIX] Do check the level before return team size
Johannes Doerfert [Sat, 16 Oct 2021 19:28:30 +0000 (14:28 -0500)]
[OpenMP][FIX] Do check the level before return team size

The team size could/should be an ICV but since we know it is either 1 or
a value we can leave it in the team state for now. However, we still
need to determine if the current level is nested before we use it.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D111949

2 years ago[OpenMP][FIX] Do not dereference a potential nullptr
Johannes Doerfert [Sat, 16 Oct 2021 19:20:36 +0000 (14:20 -0500)]
[OpenMP][FIX] Do not dereference a potential nullptr

The first thread state in the new GPU runtime doesn't have a previous
one and we should not dereference the nullptr placeholder.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D111946

2 years agoRevert rest of `IRBuilderBase`'s short-circuiting folds
Roman Lebedev [Wed, 27 Oct 2021 23:00:05 +0000 (02:00 +0300)]
Revert rest of `IRBuilderBase`'s short-circuiting folds

Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.

Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.

This reverts commit f3190dedeef9da2109ea57e4cb372f295ff53b88.
This reverts commit 749581d21f2b3f53e4fca4eb8728c942d646893b.
This reverts commit f3df87d57e096143670e0fd396e81d43393a2dd2.
This reverts commit ab1dbcecd6f0969976fafd62af34730436ad5944.

2 years ago[lldb] Skip TestCCallingConventions.test_ms_abi on arm64
Jonas Devlieghere [Wed, 27 Oct 2021 23:07:56 +0000 (16:07 -0700)]
[lldb] Skip TestCCallingConventions.test_ms_abi on arm64

rdar://84528755

2 years ago[ORC-RT] Fix objc selector corruption
Ben Langmuir [Wed, 27 Oct 2021 22:39:51 +0000 (15:39 -0700)]
[ORC-RT] Fix objc selector corruption

We were writing a pointer to a selector string into the contents of a
string instead of overwriting the pointer to the string, leading to
corruption. This was causing non-deterministic failures of the
'trivial-objc-methods' test case.
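
The difference between the two writes can be sketched as follows. This is a minimal illustration, not the ORC runtime's code: `SelRef`, `buggyFixup`, and `correctFixup` are hypothetical names, and a selector reference is modeled as a plain slot holding a pointer to a name string.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a selector reference: a slot that holds a
// pointer to the selector's name string.
using SelRef = char *;

// Buggy pattern: copies the bytes of the pointer into the string that the
// slot points at. The slot is left unchanged and the string is clobbered.
// The pointed-to buffer must hold at least sizeof(char *) bytes.
inline void buggyFixup(SelRef &Slot, char *Canonical) {
  std::memcpy(Slot, &Canonical, sizeof(Canonical));
}

// Intended pattern: overwrite the slot so it points at the canonical string.
inline void correctFixup(SelRef &Slot, char *Canonical) { Slot = Canonical; }

inline void example() {
  char Name[16] = "foo:";
  char Canonical[16] = "foo:";
  SelRef Slot = Name;
  correctFixup(Slot, Canonical);
  assert(Slot == Canonical);              // the slot was redirected
  assert(std::strcmp(Name, "foo:") == 0); // the original string is intact
}
```

With `buggyFixup`, the slot would still point at `Name` afterwards, and the first `sizeof(char *)` bytes of `Name` would hold pointer bytes rather than text.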

Differential Revision: https://reviews.llvm.org/D112671

2 years ago[Sema] Recognize format argument indicated by format attribute inside blocks
Félix Cloutier [Wed, 27 Oct 2021 22:38:31 +0000 (15:38 -0700)]
[Sema] Recognize format argument indicated by format attribute inside blocks

- `[[format(archetype, fmt-idx, ellipsis)]]` specifies that a function accepts a
  format string and arguments according to `archetype`. This is how Clang
  type-checks `printf` arguments based on the format string.
- Clang has a `-Wformat-nonliteral` warning that is triggered when a function
  with the `format` attribute is called with a format string that is not
  inspectable because it isn't constant. This warning is suppressed if the
  caller has the `format` attribute itself and the format argument to the callee
  is the caller's own format parameter.
- When using the `format` attribute on a block, Clang wouldn't recognize its
  format parameter when calling another function with the format attribute. This
  would cause unsuppressed -Wformat-nonliteral warnings for no supported reason.
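
The suppression rule in the second bullet can be illustrated with an ordinary function rather than a block. This is a sketch, assuming the GCC/Clang `__attribute__((format(printf, 1, 2)))` spelling; `formatMessage` is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// The attribute says that argument 1 is a printf-style format string and
// that the arguments to check against it start at position 2.
__attribute__((format(printf, 1, 2)))
inline std::string formatMessage(const char *Fmt, ...) {
  char Buf[128];
  va_list Args;
  va_start(Args, Fmt);
  // Fmt is not a literal here, but it is this function's own format
  // parameter, so -Wformat-nonliteral should not fire on this call.
  std::vsnprintf(Buf, sizeof(Buf), Fmt, Args);
  va_end(Args);
  return std::string(Buf);
}

inline void example() {
  assert(formatMessage("%d-%s", 7, "x") == "7-x");
}
```

The patch extends the same recognition to the case where the caller is a block carrying the `format` attribute.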

Reviewed By: ahatanak

Differential Revision: https://reviews.llvm.org/D112569

Radar-Id: rdar://84603673

2 years ago[amdgpu] Handle the case where there is no scavenged register.
Michael Liao [Fri, 16 Jul 2021 16:14:49 +0000 (12:14 -0400)]
[amdgpu] Handle the case where there is no scavenged register.

- When an unconditional branch is expanded into an indirect branch, if
  there is no scavenged register, an SGPR pair needs spilling to enable
  the destination PC calculation. In addition, before jumping into the
  destination, that clobbered SGPR pair needs restoring.
- As SGPR cannot be spilled to or restored from memory directly, the
  spilling/restoring of that SGPR pair reuses the regular SGPR spilling
  support but without spilling it into memory. As those spilling and
  restoring points are fully controlled, we only need to spill that SGPR
  into the temporary VGPR, which needs spilling into its emergency slot.
- The target-specific hook is revised to take an additional restore block,
  where the restoring code is filled. After that, the relaxation will
  place that restore block directly before the destination block and
  insert an unconditional branch in any fall-through block into the
  destination block.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D106449

2 years ago[clang] NFC: remove carriage return from AST tests
Matheus Izvekov [Sat, 23 Oct 2021 22:44:17 +0000 (00:44 +0200)]
[clang] NFC: remove carriage return from AST tests

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D112372

2 years agoRevert "[lldb] [Host/ConnectionFileDescriptor] Refactor to improve code reuse"
Med Ismail Bennani [Wed, 27 Oct 2021 21:49:49 +0000 (23:49 +0200)]
Revert "[lldb] [Host/ConnectionFileDescriptor] Refactor to improve code reuse"

This reverts commit e1acadb61dfc0810656219c6314019d5132f2c61.

2 years ago[InstCombine] add tests for select-of-constants; NFC
Sanjay Patel [Wed, 27 Oct 2021 19:50:57 +0000 (15:50 -0400)]
[InstCombine] add tests for select-of-constants; NFC

2 years ago[InstCombine] add tests for icmp with trunc operand; NFC
Sanjay Patel [Wed, 27 Oct 2021 14:22:10 +0000 (10:22 -0400)]
[InstCombine] add tests for icmp with trunc operand; NFC

2 years ago[lldb] Fixup code addresses in the Objective-C language runtime
Jonas Devlieghere [Wed, 27 Oct 2021 21:23:32 +0000 (14:23 -0700)]
[lldb] Fixup code addresses in the Objective-C language runtime

Upstream the calls to ABI::FixCodeAddress in the Objective-C language
runtime.

Differential revision: https://reviews.llvm.org/D112662

2 years ago[libc++] Make __decay_copy constexpr
Louis Dionne [Wed, 27 Oct 2021 19:12:07 +0000 (15:12 -0400)]
[libc++] Make __decay_copy constexpr

This is going to be necessary to implement some range adaptors.
As a fly-by fix, rename _LIBCPP_INLINE_VISIBILITY to _LIBCPP_HIDE_FROM_ABI
and remove a redundant inline keyword.

Differential Revision: https://reviews.llvm.org/D112650

2 years ago[libunwind] Simplify the executor used in the tests
Louis Dionne [Wed, 27 Oct 2021 19:04:18 +0000 (15:04 -0400)]
[libunwind] Simplify the executor used in the tests

Instead of going through libc++'s run.py, we can simply run the executable
directly since we don't need to set up a working directory or control the
environment.

Differential Revision: https://reviews.llvm.org/D112649

2 years ago[libc++][test] Fix invalid test for views::view_interface
Joe Loser [Wed, 27 Oct 2021 21:12:33 +0000 (17:12 -0400)]
[libc++][test] Fix invalid test for views::view_interface

The type `MoveOnlyForwardRange` violates the precondition stated in
`view.interface.general`. Specifically, the type passed to
`view_interface` shall model the `view` concept. In turn, this requires the
type to satisfy `movable` concept (and others), but this type
`MoveOnlyForwardRange` does not satisfy the `movable` concept.

Add a move assignment operator so that `MoveOnlyForwardRange` satisfies the
`movable` concept. While we're here, ensure the neighboring types that inherit
from `view_interface` also satisfy the `view` concept to avoid similar issues.
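
Why a move constructor alone is not enough can be checked with type traits. The sketch below is a pre-C++20 approximation of `movable` (the real concept also requires `swappable`); `MoveCtorOnly`, `MoveCtorAndAssign`, and `IsRoughlyMovable` are hypothetical names, not the types from the test.

```cpp
#include <type_traits>

// Declaring a move constructor suppresses the implicit move assignment
// operator and deletes the copy operations, so this type is not assignable.
struct MoveCtorOnly {
  MoveCtorOnly() = default;
  MoveCtorOnly(MoveCtorOnly &&) = default;
};

// Adding a move assignment operator makes the type assignable from rvalues.
struct MoveCtorAndAssign {
  MoveCtorAndAssign() = default;
  MoveCtorAndAssign(MoveCtorAndAssign &&) = default;
  MoveCtorAndAssign &operator=(MoveCtorAndAssign &&) = default;
};

// Approximation of the `movable` requirements using type traits only.
template <class T>
struct IsRoughlyMovable
    : std::integral_constant<bool, std::is_object<T>::value &&
                                       std::is_move_constructible<T>::value &&
                                       std::is_move_assignable<T>::value> {};

static_assert(!IsRoughlyMovable<MoveCtorOnly>::value,
              "a move constructor alone is not enough");
static_assert(IsRoughlyMovable<MoveCtorAndAssign>::value,
              "move constructor plus move assignment passes the check");
```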

Fixes https://bugs.llvm.org/show_bug.cgi?id=50720

Reviewed By: Quuxplusone, Mordante, #libc

Differential Revision: https://reviews.llvm.org/D112631

2 years ago[clang] NFC: include non friendly types and missing sugar in test expectations
Matheus Izvekov [Wed, 22 Sep 2021 00:33:45 +0000 (02:33 +0200)]
[clang] NFC: include non friendly types and missing sugar in test expectations

The dump of all diagnostics of all tests under `clang/test/{CXX,SemaCXX,SemaTemplate}` was analyzed, and all the cases where obviously bad canonical types were being printed, like `type-parameter-*-*` and `<overloaded function type>`, were identified. A small number of cases of missing sugar were also analyzed.

This patch then spells those explicitly in the test expectations, as preparatory work for future fixes for these problems.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D110210

2 years ago[clang] deprecate frelaxed-template-template-args, make it on by default
Matheus Izvekov [Thu, 9 Sep 2021 10:11:03 +0000 (12:11 +0200)]
[clang] deprecate frelaxed-template-template-args, make it on by default

A resolution to the ambiguity issues created by P0522, which is a DR solving
CWG 150, did not come as expected, so we are just going to accept the change,
and watch how users digest it.

For now we deprecate the flag with a warning, and make it on by default.
We don't remove the flag completely in order to give users a chance to
work around any problems by disabling it.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D109496

2 years ago[AST] fail rather than crash when const evaluating invalid c++ foreach
Sam McCall [Wed, 27 Oct 2021 16:50:24 +0000 (18:50 +0200)]
[AST] fail rather than crash when const evaluating invalid c++ foreach

Differential Revision: https://reviews.llvm.org/D112633

2 years ago[ORC][ORC-RT] Enable the MachO platform for arm64
Ben Langmuir [Wed, 27 Oct 2021 18:53:37 +0000 (11:53 -0700)]
[ORC][ORC-RT] Enable the MachO platform for arm64

Enables the arm64 MachO platform, adds basic tests, and implements the
missing TLV relocations and runtime wrapper function. The TLV
relocations are just handled as GOT accesses.

rdar://84671534

Differential Revision: https://reviews.llvm.org/D112656

2 years ago[lld/mac] Don't crash on undefined symbols with --icf=all
Nico Weber [Wed, 27 Oct 2021 18:40:00 +0000 (14:40 -0400)]
[lld/mac] Don't crash on undefined symbols with --icf=all

ICF runs before relocation processing, but undefined symbol errors
are only emitted during relocation processing.

So just ignore Undefineds during ICF (instead of crashing) -- lld
will emit an error once ICF is done.

Fixes PR52330.

Differential Revision: https://reviews.llvm.org/D112643

2 years ago[Clang] Add elementwise abs builtin.
Florian Hahn [Wed, 27 Oct 2021 17:15:17 +0000 (18:15 +0100)]
[Clang] Add elementwise abs builtin.

This patch implements __builtin_elementwise_abs as specified in
D111529.

Reviewed By: aaron.ballman, scanon

Differential Revision: https://reviews.llvm.org/D111986

2 years agoutils/release: Add script for building release documentation
Tom Stellard [Wed, 27 Oct 2021 19:56:55 +0000 (12:56 -0700)]
utils/release: Add script for building release documentation

Reviewed By: hans, kuhnel

Differential Revision: https://reviews.llvm.org/D95284

2 years agoRevert "[NFC] `IRBuilderBase::CreateAdd()`: place constant onto RHS"
Roman Lebedev [Wed, 27 Oct 2021 19:19:59 +0000 (22:19 +0300)]
Revert "[NFC] `IRBuilderBase::CreateAdd()`: place constant onto RHS"

Clang OpenMP codegen tests are failing,
will recommit afterwards.

This reverts commit 4723c9b3c6c46632a5d66e65d198899894b1e2c5.

2 years agoRevert "[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`"
Roman Lebedev [Wed, 27 Oct 2021 19:19:10 +0000 (22:19 +0300)]
Revert "[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`"

Clang OpenMP codegen tests are failing.

This reverts commit 288f1f8abe5835180a0021f142043ee261ab3846.
This reverts commit cb90e5356ac1594e95fed8e208d6e0e9b6a87db1.

2 years ago[ConstantRange] Optimize smul_sat() (NFC)
Nikita Popov [Wed, 27 Oct 2021 13:12:02 +0000 (15:12 +0200)]
[ConstantRange] Optimize smul_sat() (NFC)

Base the implementation on the APInt smul_sat() implementation,
which is much more efficient than performing calculations in
double the bitwidth.
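
The idea of saturating at the original width, rather than widening first, can be sketched for a fixed 32-bit type. This is an illustration only, not the APInt or ConstantRange code; `mulOverflows` and `smulSat` are hypothetical helpers and the sketch assumes C++14 or later.

```cpp
#include <cstdint>
#include <limits>

// Returns true if A * B does not fit in int32_t. Uses only 32-bit division,
// never a wider intermediate type.
constexpr bool mulOverflows(int32_t A, int32_t B) {
  constexpr int32_t Max = std::numeric_limits<int32_t>::max();
  constexpr int32_t Min = std::numeric_limits<int32_t>::min();
  if (A == 0 || B == 0)
    return false;
  if (A > 0)
    return B > 0 ? A > Max / B : B < Min / A;
  return B > 0 ? A < Min / B : B < Max / A;
}

// Signed saturating multiply: exact when the product fits, otherwise
// clamped to the limit that has the sign of the exact product.
constexpr int32_t smulSat(int32_t A, int32_t B) {
  if (!mulOverflows(A, B))
    return A * B;
  return ((A < 0) != (B < 0)) ? std::numeric_limits<int32_t>::min()
                              : std::numeric_limits<int32_t>::max();
}

static_assert(smulSat(-3, 7) == -21, "in-range products are exact");
static_assert(smulSat(46341, 46341) == std::numeric_limits<int32_t>::max(),
              "positive overflow clamps to the maximum");
static_assert(smulSat(std::numeric_limits<int32_t>::max(), -2) ==
                  std::numeric_limits<int32_t>::min(),
              "negative overflow clamps to the minimum");
```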

2 years ago[lld-macho] If export_size is zero, export_off must be zero
Jez Ng [Wed, 27 Oct 2021 18:58:15 +0000 (14:58 -0400)]
[lld-macho] If export_size is zero, export_off must be zero

Otherwise tools like codesign_allocate will choke. We were already
handling this correctly for the other DYLD_INFO sections.

Doing this correctly is a bit subtle: we don't know if export_size will
be zero until we have run `ExportSection::finalizeContents()`. However,
we must still add the ExportSection to the `__LINKEDIT` segment in order
that it gets sorted during `sortSectionsAndSegments()`.

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D112589

2 years ago[BasicAA] Remove misleading overflow check
Nikita Popov [Wed, 27 Oct 2021 12:58:34 +0000 (14:58 +0200)]
[BasicAA] Remove misleading overflow check

GEP decomposition currently checks whether the multiplication of
the linear expression offset and GEP scale overflows. However, if
everything else works correctly, this overflow check is both
unnecessary and dangerously misleading. While it will avoid an
overflow in Scale * Offset in particular, other parts of the
calculation (including those on dynamic values) may still overflow.
The code working on the decomposed GEPs is responsible for ensuring
that it remains correct in the presence of overflow. D112611 fixes
the last issue of that kind that I'm aware of (in fact, the overflow
check was originally introduced to work around precisely that issue).

Differential Revision: https://reviews.llvm.org/D112618

2 years ago[formatters] Add a libstdcpp formatter for set and unify tests across stdlibs
Danil Stefaniuc [Wed, 27 Oct 2021 18:54:19 +0000 (11:54 -0700)]
[formatters] Add a libstdcpp formatter for set and unify tests across stdlibs

This diff adds a data formatter for libstdcpp's set. It also unifies the set tests for libcxx and libstdcpp to make them easier to maintain.

Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D112537

2 years agoFix MLIR LLVMIR test after 4723c9b3c6c46632a5d66e65d198899894b1e2c5
Roman Lebedev [Wed, 27 Oct 2021 18:52:32 +0000 (21:52 +0300)]
Fix MLIR LLVMIR test after 4723c9b3c6c46632a5d66e65d198899894b1e2c5

2 years ago[LowerTypeTests] Emit cfi_jt aliases regardless of function export
Nick Desaulniers [Wed, 27 Oct 2021 18:36:23 +0000 (11:36 -0700)]
[LowerTypeTests] Emit cfi_jt aliases regardless of function export

A constant complaint we get is that the __typeid__ symbols in the CFI
jump tables cause confusing stack traces in applications. Emit the more
readable cfi_jt aliases regardless of function export (LTO vs Thin LTO).

Reviewed By: pcc, tejohnson

Differential Revision: https://reviews.llvm.org/D107934

2 years ago[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`
Roman Lebedev [Wed, 27 Oct 2021 17:35:55 +0000 (20:35 +0300)]
[IR] `IRBuilderBase::CreateAdd()`: short-circuit `x + 0` --> `x`

There's precedent for that in `CreateOr()`/`CreateAnd()`.

The motivation here is to avoid bloating the run-time check's IR
in `SCEVExpander::generateOverflowCheck()`.

Refs. https://reviews.llvm.org/D109368#3089809

2 years ago[NFC] `IRBuilderBase::CreateAdd()`: place constant onto RHS
Roman Lebedev [Wed, 27 Oct 2021 16:58:10 +0000 (19:58 +0300)]
[NFC] `IRBuilderBase::CreateAdd()`: place constant onto RHS

2 years ago[Operator] Add hasPoisonGeneratingFlags [mostly NFC]
Philip Reames [Wed, 27 Oct 2021 17:51:03 +0000 (10:51 -0700)]
[Operator] Add hasPoisonGeneratingFlags [mostly NFC]

This method parallels the dropPoisonGeneratingFlags on Instruction, but is hoisted to Operator to handle constant expressions as well.

This is mostly code movement, but I did go ahead and add the inrange constexpr gep case. This had been discussed previously, but apparently never followed up on.

2 years agoRevert "[SLP]Improve/fix reordering of the gathered graph nodes."
Alexey Bataev [Wed, 27 Oct 2021 18:16:20 +0000 (11:16 -0700)]
Revert "[SLP]Improve/fix reordering of the gathered graph nodes."

This reverts commit 64d1617d18cb8b6f9511d0eda481fc5a5d0ebddf to fix test
non-stability.

2 years ago[libc][obvious] fix strdup being listed twice
Michael Jones [Wed, 27 Oct 2021 18:11:29 +0000 (11:11 -0700)]
[libc][obvious] fix strdup being listed twice

strdup was being included even if malloc wasn't and that was causing
a build failure.

Differential Revision: https://reviews.llvm.org/D112641

2 years agoAdd "REQUIRES: native" to test.
Douglas Yung [Wed, 27 Oct 2021 17:49:52 +0000 (10:49 -0700)]
Add "REQUIRES: native" to test.

This test was failing on the PS4 bot because the test attempts to link, but the PS4 platform requires an external
linker that is not present, causing the test to fail. This should get the PS4 bot green again.

2 years ago[lld/mac] Don't assert when ICFing arm64 code
Nico Weber [Mon, 25 Oct 2021 14:25:14 +0000 (10:25 -0400)]
[lld/mac] Don't assert when ICFing arm64 code

WordLiteralSection dedupes literals by content.
WordLiteralInputSection::getOffset() used to read a literal at the passed-in
offset and look up this value in the deduping map to find the offset of the
deduped value.

But it's possible that (e.g.) a 16-byte literal's value is accessed 4 bytes in.
To get the offset at that address, we have to get the deduped value at offset 0
and then apply the offset 4 to the result.

(See also WordLiteralSection::finalizeContents() which fills in those maps.)
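
The lookup described above can be sketched with a toy model. This is not lld's implementation: `LiteralDeduper`, `LiteralInput`, and `LiteralSize` are hypothetical, and 8-byte literals are used instead of 16-byte ones so that a literal fits in a `uint64_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

constexpr uint64_t LiteralSize = 8; // assumed fixed literal size in bytes

// Toy output section: each distinct literal value gets one output offset.
struct LiteralDeduper {
  std::map<uint64_t, uint64_t> OutputOffsetOf; // literal value -> offset
  uint64_t NextOffset = 0;

  uint64_t intern(uint64_t Value) {
    auto It = OutputOffsetOf.find(Value);
    if (It != OutputOffsetOf.end())
      return It->second; // already present: reuse the existing offset
    uint64_t Off = NextOffset;
    OutputOffsetOf.emplace(Value, Off);
    NextOffset += LiteralSize;
    return Off;
  }
};

// Toy input section: literals in input order.
struct LiteralInput {
  std::vector<uint64_t> Values;

  // Maps an input offset to an output offset. The literal is looked up at
  // its start (offset 0 within the literal), and the remainder is applied
  // to the deduplicated offset afterwards.
  uint64_t getOffset(const LiteralDeduper &D, uint64_t Off) const {
    uint64_t Index = Off / LiteralSize;
    uint64_t Within = Off % LiteralSize;
    return D.OutputOffsetOf.at(Values[Index]) + Within;
  }
};

inline void example() {
  LiteralDeduper D;
  LiteralInput In;
  In.Values = {0xAAAA, 0xBBBB, 0xAAAA};
  for (uint64_t V : In.Values)
    D.intern(V);
  assert(In.getOffset(D, 16) == 0); // third literal dedupes to the first
  assert(In.getOffset(D, 20) == 4); // 4 bytes into that same literal
}
```

Reading a literal at the raw offset 20 would instead pick up bytes that straddle two literals, which is the failure mode the fix avoids.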

Only a problem on arm64 because in x86_64 the offset is part of the instruction
instead of a separate ARM64_RELOC_ADDEND relocation. (See bug for more details.)

Fixes PR51999.

Differential Revision: https://reviews.llvm.org/D112584