review.tizen.org Git - platform/upstream/llvm.git/log

[LoongArch] Refine the condition to return Match_RequiresAMORdDifferRkRj in AsmParser. NFC

This suppresses compilation warnings such as `enumerated mismatch in conditional expression`.

See:
https://lab.llvm.org/staging/#/builders/236/builds/645/steps/6/logs/warnings__1_

[MLIR][Tensor] Fix example for pack/unpack (NFC)

[LoopVectorize] Clear cache of `LoopAccessInfoManager`

LAI is cached during the LoopDistribute pass, and is later re-used during LoopVectorize. The problem is that LoopVectorize changes SCEV, and the cached LAI does not get updated. Hence, when re-using the cached LAI, it references an invalid SCEV.

Fixes #59319

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D139601

[InstCombine] Add tests for icmp ne non-zero power of 2; NFC

Tests for D140851.

[TargetParser] Generate the defs for RISCV CPUs using llvm-tblgen.

This patch removes the file `llvm/include/llvm/TargetParser/RISCVTargetParser.def` and replaces it with a tablegen-generated `.inc` file out of `llvm/lib/Target/RISCV/RISCV.td`.

The module system has been updated to make sure we can build clang/llvm with `-DLLVM_ENABLE_MODULES=On`

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137517

[OpenMP][FIX] Avoid performance regression accidentally introduced

[AArch64] Make -march and target(arch=..) attributes imply dependent features

Specifying an architecture revision should also add feature strings for
any dependent default extensions. Otherwise the new checks for
target-dependent features for acle intrinsics from D134353 and D132034
can fail.

This patch does that in setFeatureEnabled, similar to the addition of
dependent architecture revisions. +sve also needs to be added to armv9
architectures in the target parser, as it is implied by +sve2.
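
The idea of implying dependent features can be pictured as a transitive closure over a dependency table. The following is a minimal Python model, not the code in `setFeatureEnabled`. The function name `expand_features` is hypothetical, and so is the `armv9-a` entry in the table; only the "sve2 implies sve" relation is taken from the text above.

```python
# Hypothetical dependency table. Only "sve2 implies sve" comes from the
# commit message; the "armv9-a" entry is an assumption for illustration.
IMPLIED = {
    "armv9-a": ["sve2"],
    "sve2": ["sve"],
}


def expand_features(requested):
    """Return the requested features plus everything they imply."""
    enabled = set()
    worklist = list(requested)
    while worklist:
        feature = worklist.pop()
        if feature in enabled:
            continue
        enabled.add(feature)
        # Queue the direct dependencies; the loop picks up indirect ones.
        worklist.extend(IMPLIED.get(feature, []))
    return sorted(enabled)
```

With this table, asking for the architecture alone also yields `sve2` and, through it, `sve`.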

Fixes #59911

Differential Revision: https://reviews.llvm.org/D141411

Reland "[CMake][LoongArch] Add LoongArch to LLVM_ALL_TARGETS so it is built by default"

This follows the [[ https://discourse.llvm.org/t/rfc-promoting-the-loongarch-backend-from-experimental-to-official/67506/ | RFC ]].

Follow-on commits will add appropriate release notes changes etc.

Submit this now and in a minimal form so there is reasonable time before
16.0.0 is branched to resolve any issues arising from e.g. the backend
being exposed on different compiler/sanitizer setups.

The current builder for LoongArch is on the [[ https://lab.llvm.org/staging/#/builders/236 | staging area ]].

Reviewed By: jyknight, MaskRay, echristo, myhsu, tstellar, arsenm

Differential Revision: https://reviews.llvm.org/D141191

[flang] Lowering and implementation for extends_type_of

Add the implementation and lowering for the extends_type_of
intrinsic.

The standard says: otherwise, if the dynamic type of A or MOLD is
extensible, the result is true if and only if the dynamic type of A is an
extension type of the dynamic type of MOLD. This could be read to mean that
`extends_type_of(a, a)` is false, since a type is not an extension of
itself. Gfortran returns `true` in this case, so the same behavior is
applied here as well.
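
The chosen behavior can be sketched as a walk up the parent chain that starts at the type itself. This is a small Python model, not the flang runtime; `TypeDesc` and its fields are hypothetical stand-ins for a derived type descriptor.

```python
class TypeDesc:
    """Hypothetical stand-in for a derived type descriptor."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # the type this one extends, if any


def extends_type_of(a_type, mold_type):
    """True if a_type is mold_type itself or an extension of it."""
    current = a_type
    while current is not None:
        if current is mold_type:
            return True
        current = current.parent
    return False
```

Because the walk starts at `a_type` rather than at its parent, comparing a type with itself gives true, matching the gfortran result described above.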

Depends on D141364

Reviewed By: jeanPerier, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D141376

[flang] Lowering and implementation for same_type_as

The test performed by same_type_as does not consider kind type
parameters. If an exact match is not found, the name of the
derived type is compared. The name in the runtime info does not include
the kind type parameters as it does in the mangled name.

Reviewed By: jeanPerier, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D141364

[flang] Only deallocate intent(out) allocatable through runtime if allocated

Deallocation of intent(out) allocatable was done in D133348. This patch adds
an if guard when the deallocation is done through a runtime call. The runtime
is crashing if the box is not allocated. Call the runtime only if the box is
allocated. This is the case for derived type, polymorphic and unlimited
polymorphic entities.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D141427

[PowerPC][GISel] Select sync instructions required by atomic operations

This is part of selecting `G_ATOMIC*` instructions. Select `isync`, `sync` and `lwsync` in GISel.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D141360

[LoongArch] Fixed llvm/test/CodeGen/LoongArch/intrinsic.ll test failure when EXPENSIVE_CHECKS is enabled [1].

Specifically:
```
*** Bad machine code: Using an undefined physical register ***
- function:    movgr2fcsr
- basic block: %bb.0 entry (0x1af5e60)
- instruction: MOVGR2FCSR $fcsr1, %0:gpr
- operand 0:   $fcsr1

*** Bad machine code: Using an undefined physical register ***
- function:    movfcsr2gr
- basic block: %bb.0 entry (0x133fae0)
- instruction: %0:gpr = MOVFCSR2GR $fcsr1
- operand 1:   $fcsr1
```

By building MachineInstructions, the state of the register is
clarified, and the error caused by using undefined physical registers
is fixed.

[1]: https://lab.llvm.org/buildbot/#/builders/16/builds/41677

[NFC][AsmWriter] Use HasSubstr in test

[AsmWriter] Fix leak after D141343

[DWARFLibrary] Init field after D137882

[Clang][RISCV][NFC] Reorganize test case for rvv intrinsics

The file hierarchy is reorganized into:

```
├── rvv-intrinsics-autogenerated
│   ├── non-policy
│   │   ├── non-overloaded
│   │   └── overloaded
│   └── policy
│       ├── non-overloaded
│       └── overloaded
└── rvv-intrinsics-handcrafted
```

This separates auto-generated test cases from hand-crafted ones. The
auto-generated ones are basic API tests generated from the intrinsic
generator [0].

This re-organization allows direct copy-and-paste from the produced
outputs of the generator in future API changes, which is discussed
and needs to be implemented towards a v1.0 RVV intrinsic.

[0] https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/181

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D141198

[libc] Use the bootstrap build's target triple if available.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D141428

[Test] Add test showing one more missing case of turn-to-invariant with widening

Do not short circuit hoistIVInc when recomputation of poison flags is needed.

Fixes https://github.com/llvm/llvm-project/issues/59777

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D140836

[libc] Implement str{,n}casecmp

Differential Revision: https://reviews.llvm.org/D141236

[Test] Give test variables more reasonable names

[IndVars] Support AND/OR in optimizeLoopExitWithUnknownExitCount

This patch allows optimizeLoopExitWithUnknownExitCount to deal with
branches by conditions that are not immediately ICmp's, but aggregates
of ICmp's joined by arithmetic or logical AND/OR. Each ICmp is optimized
independently.

Differential Revision: https://reviews.llvm.org/D139832
Reviewed By: nikic

[gn build] Port e4e0f9330798

Remove the ThreadLocal template from LLVM.

This has been obsoleted by C++ thread_local for a long time.
As far as I know, Xcode was the last supported toolchain to add
support for C++ thread_local in 2016.

As a precaution, use LLVM_THREAD_LOCAL which provides even greater
backwards compatibility, allowing this to function even with pre-C++11
versions of GCC.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D141349

Remove a FIXME that will never be fixed since undef is being removed.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D141344

[clang-format] Properly handle the C11 _Generic keyword.

This patch properly recognizes the generic selection expression
introduced in C11, by adding an additional token type for the colons
present in such expressions.

Previously, they would be recognized as
"inline ASM colons" purely by the fact that those are the last thing
checked for.

I tried to avoid adding an additional token type, but since colons by
default like having spaces around them, I chose to add a new type so
that no space is added after the type selector.

Currently, no aspect of the formatting of these expressions can be
configured, as I'm not sure what could even be configured here.

One notable thing is that the association list is always formatted as
either entirely on one line, if it can fit, or with line breaks
after every comma in the expression (also after the controlling expr.)

This visually makes them more similar to switch statements when long,
matching the behaviour of the selection expression, being that of a sort
of switch on types, but also allows for terseness when only selecting
for a few things.

Fixes https://github.com/llvm/llvm-project/issues/18080

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D139211

[clang-format] Inherit RightAlign options across scopes

D119599 added the ability to align compound assignments, right aligning
them in order to line up at the equals sign.
However, that patch didn't account for AlignTokens being called
recursively across scopes, which reset the right justification to be
false in any scope besides the top scope. This meant the compound
assignments were aligned, just not at the right place.
(Also, no tests ever introduced any scopes.)

This patch makes sure to inherit the right justification value, just as
every other parameter is passed on.

Fixes https://github.com/llvm/llvm-project/issues/58029

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D141288

[DAGCombiner] Update test comments for pr59898; NFC

[AArch64] Only enable `foldCSELOfCSEl` DAG combine when x != y

https://alive2.llvm.org/ce/z/Uy_x_b

Fix: https://github.com/llvm/llvm-project/issues/59902

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D141359

[mlir][vector] Fix crash in extractelement vec distribution

Prevent creating a vector of size 0 that would fail verifier.
Vector 1d with a single element should be treated like 0d vectors.

Differential Revision: https://reviews.llvm.org/D141452

[X86][CodeGen]Fix extract f16 from big vectors

llc crashes when run with -mattr=+avx512fp16 on the following input:

```
define half @test(<64 x half> %x, i64 %idx){
%res = extractelement <64 x half> %x, i64 %idx
ret half %res
}
```
The root cause is that enabling avx512fp16 drops the custom handler
for extracting an f16 from a big vector that is not loaded from a pointer.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D141348

[CMake] Make libLLVM functional for Android

libLLVM.so is empty if it is built with Android NDK (`-DLLVM_BUILD_LLVM_DYLIB=ON`). The patch fixes it.

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D140268

[mlir][affine] Canonicalize single value affine.min/max

Canonicalize identity affine.min/max to allow further optimization to follow the def-use chain of the given values. The reported issue is https://github.com/llvm/llvm-project/issues/59399.

Differential Revision: https://reviews.llvm.org/D141354

AMDGPU/SIInsertWait: Skip dummy tied source

For D16 memory load instructions, the hardware usually writes only half
of the 32-bit register, but the MachineIR instruction defines the
destination as a full 32-bit register. Without the extra tied source
register, LLVM would treat a previous write to the other half of the
register as dead, because a 32-bit destination implies that the
instruction always overwrites the whole register. With the extra tied
source, LLVM sees the instruction as reading the register, so the previous
write stays live. This dummy tied source, however, introduces an
unnecessary read-after-write dependency. The change here is to bypass the
tied source where it can be skipped, thus avoiding an unnecessary s_waitcnt.
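
Why the earlier write must stay live can be seen from a bit-level model of a half-register write. This is an illustrative Python sketch under the assumption that the load targets the low 16 bits; the function names are hypothetical and do not correspond to any LLVM API.

```python
MASK32 = 0xFFFFFFFF


def d16_load_lo(old_reg, loaded16):
    """Write 16 loaded bits to the low half and keep the old high half."""
    return (old_reg & 0xFFFF0000) | (loaded16 & 0xFFFF)


def full_write(old_reg, value32):
    """A full 32-bit write: the old register contents do not matter."""
    return value32 & MASK32
```

The result of `d16_load_lo` depends on `old_reg`, so a prior write to the high half is observable afterwards, whereas `full_write` returns the same value for any `old_reg`.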

Reviewed by: foad

Differential Revision: https://reviews.llvm.org/D140537

AMDGPU: Remove IsSourceOfDivergence check

This bit is not set/reserved in td file. Let's remove it for now,
we can always add it back if we need it.

Reviewed by: foad

Differential Revision: https://reviews.llvm.org/D141223

AMDGPU: Promote array alloca if used by memmove/memcpy

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D140599

[NFC][AVR] Use inline field initializers

[AVR] Init member added by D140830

clang: Convert test to generated checks and opaque pointers

AMDGPU: Use constant and externally_initialized for block handle

The runtime initializes this.

AMDGPU: Fix opaque pointer handling for enqueued blocks, again

Revert "[CMake][LoongArch] Add LoongArch to LLVM_ALL_TARGETS so it is built by default"

This reverts commit b5cb91fc1cb851dcc4d3dc82a19a9d56dd352350.

This commit causes an expensive-checks failure:
https://lab.llvm.org/buildbot/#/builders/16/builds/41677.

This fixes a runtime problem for member data used in a target region.

The problem occurs when a base class member field is used in a target
region: the computed size is wrong, which causes the runtime to fail.
Currently the size calculation depends on the index of the field; since
the field is in a base class, the calculation is wrong.

According to OpenMP 5.2 148:21:
If the target construct is within a class non-static member function,
and a variable is an accessible data member of the object for which the
non-static data member function is invoked, the variable is treated as
if the this[:1] expression had appeared in a map clause with a map-type
of tofrom.

One way to fix this is to emit code that generates this[:1] instead, but
only when the class has a base class.

Differential Revision: https://reviews.llvm.org/D141350

[NFC][AIX] Temporarily XFAIL test while investigating
Previous attempt to restrict this test to x86 and aarch64 targets only didn't work.
So XFAIL this test while investigating to get the AIX bot back to green.

[AsmWriter] Don't crash when printing a null operand bundle.

Differential Revision: https://reviews.llvm.org/D141415

[libc++] Use _LIBCPP_HIDE_FROM_ABI_VIRTUAL instead of _LIBCPP_INLINE_VISIBILITY attribute on virtual function

Reviewed By: #libc, philnik, ldionne

Differential Revision: https://reviews.llvm.org/D141388

[lldb] Only add lldb-framework-cleanup dependency to existing targets

The Xcode standalone build doesn't have the install-liblldb and
install-liblldb-stripped targets. Fix the resulting CMake error "Cannot
add target-level dependencies to non-existent target" by only adding the
dependency when the targets exist.

[libc++][test] Move `common_input_iterator` to `test_iterators.h`

There's a bit of duplication in use of `common_input_iterator` as noted in
https://reviews.llvm.org/D141216. Lift `common_input_iterator` into
`test_iterators.h`.

Differential Revision: https://reviews.llvm.org/D141238

[HWASAN] Implemented LSAN-specific thread support.

Implemented LSAN interface on HwasanThreadList and added os_id to __hwasan::Thread.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D141143

[CMake][LoongArch] Add LoongArch to LLVM_ALL_TARGETS so it is built by default

This follows the [[ https://discourse.llvm.org/t/rfc-promoting-the-loongarch-backend-from-experimental-to-official/67506/ | RFC ]].

Follow-on commits will add appropriate release notes changes etc.

Submit this now and in a minimal form so there is reasonable time before
16.0.0 is branched to resolve any issues arising from e.g. the backend
being exposed on different compiler/sanitizer setups.

The current builder for LoongArch is on the [[ https://lab.llvm.org/staging/#/builders/236 | staging area ]].

Reviewed By: jyknight, MaskRay, echristo, myhsu, tstellar, arsenm

Differential Revision: https://reviews.llvm.org/D141191

[HWASAN] Added thread annotations to HwasanThreadList.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D141438

[LegacyPM] Port example pass SimplifyCFG to new PM

This is part of the effort to remove the -enable-new-pm flag.
As a part of this effort, one of the example passes, SimplifyCFG, must
be ported to the new PM, which will allow removing the flag
from the tests that use this pass.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D137103

[RISCV] Remove RISCVSubtarget::getArchMinVLen()/getArchMaxVLen().

Fold them into their only callers, getRealMinVLen()/getRealMaxVLen().

It is unclear right now when these are needed so removing to discourage
misuse.

Between Zvl*b extensions, vector length command line options, and
vscale range, we have several ways to influence vector length. We
need to try to keep all code on the same page.

[DWARFLibrary] Add support to re-construct cu-index

According to DWARF5 specification and gnu specification for DWARF4 the offset
entry in the CU/TU Index is 32 bits. This presents a problem when
.debug_info.dwo in DWP file grows beyond 4GB. The CU Index becomes partially
corrupted.
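
The corruption can be illustrated with plain arithmetic. This is a minimal Python sketch that assumes an oversized offset simply wraps when stored in a 32-bit field; `stored_offset` is a hypothetical helper, not part of the DWARF library.

```python
UINT32_MASK = 0xFFFFFFFF
GIB = 1 << 30


def stored_offset(real_offset):
    """Offset as it would appear in a 32-bit index entry (assumed to wrap)."""
    return real_offset & UINT32_MASK
```

Under that assumption an offset of 3 GiB is stored unchanged, while an offset of 5 GiB is stored as 1 GiB and points at the wrong contribution.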

This diff adds manual parsing of .debug_info.dwo/.debug_abbrev.dwo to
reconstruct CU index in general, and TU index for DWARF5. This is a workaround
until DWARF6 spec is finalized.

Next patch will change internal CU/TU struct to 64 bit, and change uses as
necessary. The plan is to land all the patches in one go after all are approved.

This patch originates from the discussion in: https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D137882

[AsmWriter] Don't crash when printing addrspace and the operand is null

This helps print instructions with detached operands in the debugger.

Differential Revision: https://reviews.llvm.org/D141343

[gn build] Remove --rtlib=libgcc for Android builds

Recent Android NDKs don't ship with libgcc

[OpenMP] Avoid running openmp-opt on dead functions

The Attributor has logic to run only on assumed live functions and this
is exposed to users now. OpenMP-opt will (mostly) ignore dead internal
functions now but run the same deduction as before if an internal
function is marked live.

This should lower compile time as we run on less code and delete more
code early on. For the full OpenMC module compiled with noinline and
JITed at runtime, we save ~25%, or ~10s on my machine during JITing.

[OpenMP] Ensure AAHeapToShared is only looking at one function

When we collect and process allocations we did not verify the call
against the anchor scope / associated function. This should be done to
avoid processing calls multiple times and generally looking at calls not
in the AAs scope.

[OpenMP] Disable ICV deduction by default.

This is not tested well and needs to be revisited in the future.

[test] Use -verify-cfg-preserved=0 in new-pass-manager.ll

This matches other tests and makes this test less sensitive

Revert "[perf-training] Check extension in findFilesWithExtension"

This reverts commit 1fbbf92e4fda3c7a3be1c02e1f7240135557846d.

[Attributor] Improve use of dominating writes during reasoning

This resolves a recent regression introduced by a bug fix and allows us
to use dominating write information (formerly HasBeenWrittenTo
information) to skip potential interfering accesses.

Generally, there are two changes here:
1) If we have dominating writes they form a chain and we can look at the
   least one to minimize the distance between the write and the (read)
   access in question.
2) If such a least dominating write exists, we can ignore writes in
   other functions as long as they cannot be reached from code between
   this write and the (read) access in question.

We have all the tools available to make such queries and the positive
tests show the result. Note that the negative test from the bug fix is
still in tree and not affected.

As a side-effect, we can remove the (arbitrary) threshold now on the
number of interfering accesses since we do not iterate over dominating
ones anymore.

[libc][NFC] Remove the now unused WrapperGen tool and tests.

[mlir][tosa] Remove redundant "tosa.transpose" operations

We can fold redundant Tosa::TransposeOp actions like identity transpose/transpose(transpose).
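
The algebra behind both folds can be checked with a small Python model. It is not the MLIR pattern itself: tensors are stored as dictionaries from index tuples to values, and `transpose`, `compose` and `is_identity` are hypothetical helper names. The model assumes output dimension `i` takes its extent from input dimension `perms[i]`.

```python
from itertools import product


def transpose(tensor, shape, perms):
    """Transpose a tensor stored as {index tuple: value}.

    Output dimension i takes its extent from input dimension perms[i].
    """
    out_shape = tuple(shape[p] for p in perms)
    out = {}
    for idx in product(*(range(d) for d in out_shape)):
        src = [0] * len(shape)
        for i, p in enumerate(perms):
            src[p] = idx[i]
        out[idx] = tensor[tuple(src)]
    return out, out_shape


def compose(first, second):
    """Single permutation equivalent to applying `first`, then `second`."""
    return tuple(first[s] for s in second)


def is_identity(perms):
    """An identity permutation makes the transpose a no-op."""
    return tuple(perms) == tuple(range(len(perms)))
```

Two stacked transposes give the same result as one transpose with the composed permutation, and when that composition is the identity the pair can be dropped altogether.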

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D140466

[Sanitizer][Apple] Enable sanitizer common unittests for arm64 archs on Apple

This patch enables sanitizer common unit tests for arm64 architecture only on Apple devices.

It also lowers the expected compression ratio for 64-bit machines. The justification for 130 is unclear. On Apple arm64 we're seeing this number come out to 128.6.

rdar://101436019

Differential Revision: https://reviews.llvm.org/D141170

[mlir][Index] Add index.mins and index.minu

Signed and unsigned minimum operations were missing from the Index
dialect and are needed to test integer range inference.
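
The difference between the two operations shows up only when the sign bit is set. The Python sketch below models them on 8-bit patterns; the width is an assumption chosen for readability (the real index width is target-dependent), and the functions model the semantics rather than the dialect's implementation.

```python
WIDTH = 8  # illustrative width only
MASK = (1 << WIDTH) - 1


def as_signed(bits):
    """Interpret a WIDTH-bit pattern as a two's complement integer."""
    bits &= MASK
    return bits - (1 << WIDTH) if bits & (1 << (WIDTH - 1)) else bits


def mins(lhs, rhs):
    """Signed minimum, returned as a bit pattern."""
    return (lhs & MASK) if as_signed(lhs) <= as_signed(rhs) else (rhs & MASK)


def minu(lhs, rhs):
    """Unsigned minimum, returned as a bit pattern."""
    return min(lhs & MASK, rhs & MASK)
```

For the patterns `0xFF` and `0x01`, the signed minimum is `0xFF` (that is, -1) while the unsigned minimum is `0x01`.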

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D141299

[mlir][Arith] Fix bug in zero-extension range inference

D135089 extracted the extui code into a helper, but used fromSigned
instead of fromUnsigned.
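
Why zero extension has to be described through unsigned bounds can be shown on bit patterns. This Python sketch is a simplified model and does not use the MLIR range inference API; all function names are hypothetical.

```python
def as_signed(bits, width):
    """Interpret a `width`-bit pattern as a two's complement integer."""
    return bits - (1 << width) if bits & (1 << (width - 1)) else bits


def unsigned_bounds(patterns):
    """Smallest and largest pattern under the unsigned ordering."""
    return min(patterns), max(patterns)


def signed_bounds(patterns, width):
    """Patterns of the signed minimum and maximum, in that order."""
    lo = min(patterns, key=lambda p: as_signed(p, width))
    hi = max(patterns, key=lambda p: as_signed(p, width))
    return lo, hi
```

Take the 8-bit patterns 100 through 150. Zero extension keeps each pattern's unsigned value, so the unsigned bounds (100, 150) stay valid in the wider type. The signed bounds are the patterns 128 (that is, -128) and 127; after zero extension both are non-negative and the "minimum" 128 exceeds the "maximum" 127, so they no longer describe a range.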

Reviewed By: Mogball, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D141296

[RISCV] Add test coverage for TSO AMOs [nfc]

[mlir][tosa] Add uint16 to TOSA type support

UInt16 is included as a required TOSA type for basic
operations. Added support as a TOSA Tensor.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D141341

[DAGCombine] fold (sext (sext_inreg x)) -> (sext (trunc x))

This fixes a regression introduced by D127115 in test/CodeGen/PowerPC/store-forward-be64.ll
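
The equivalence named in the title can be checked exhaustively on small widths. The Python model below uses 16-bit values with an 8-bit inner type and a 32-bit result; these widths and the helper names are assumptions for illustration, not the DAGCombiner code.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def trunc(value, to_bits):
    """Keep only the low `to_bits` bits."""
    return value & ((1 << to_bits) - 1)


def sext(value, from_bits, to_bits):
    """Sign-extend a `from_bits` pattern to a `to_bits` pattern."""
    return to_signed(value, from_bits) & ((1 << to_bits) - 1)


def sext_inreg(value, inner_bits, width):
    """Sign-extend the low `inner_bits` bits within a `width`-bit value."""
    return sext(trunc(value, inner_bits), inner_bits, width)


def fold_holds(x):
    """Check sext(sext_inreg x) == sext(trunc x) for one 16-bit input."""
    return sext(sext_inreg(x, 8, 16), 16, 32) == sext(trunc(x, 8), 8, 32)
```

Calling `all(fold_holds(x) for x in range(1 << 16))` covers every 16-bit input for these widths.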

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D140993

Fix math.cbrt with vector and f16 arguments.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D141421

Reapply "[Attributor] Introduce AA[Intra/Inter]Reachability"

This reverts commit e425a4c45618fcfa8ffb13be4ddfaa5d28aa38f1 after the
memory leak has been fixed.

[libc++][utils] Remove unused import in `run.py`

`import sys` is unused in `libcxx/utils/run.py`: remove it.

Differential Revision: https://reviews.llvm.org/D141331

[libc++] Remove warning for `LIBCXX_SYSROOT`, `LIBCXX_TARGET_TRIPLE`, and `LIBCXX_GCC_TOOLCHAIN`

Support for these CMake variables has triggered a warning for a while now. The
comment indicates that the warning message should be removed entirely, as anyone
impacted had to have updated to the new mechanisms in order to use `libc++`. So,
remove the warning message.

Differential Revision: https://reviews.llvm.org/D141345

[llvm][ADT] Add deduction guides for `MutableArrayRef`

Similar to https://reviews.llvm.org/D140896, this adds deduction guides for the
counterpart of `ArrayRef`: `MutableArrayRef`. The update plan is the following:

1) Add deduction guides for `MutableArrayRef`.
2) Change uses in-tree from `makeMutableArrayRef` to use deduction guides
3) Mark `makeMutableArrayRef` as deprecated for some time before removing to
give downstream users time to update.

The deduction guides are similar to those provided by the `makeMutableArrayRef`
function templates, except we don't need one explicitly from `MutableArrayRef`.

Differential Revision: https://reviews.llvm.org/D141327

build: with -DCLANGD_ENABLE_REMOTE=ON, search for grpc++ dependencies too

Fixes: https://github.com/llvm/llvm-project/issues/59844

Differential Revision: https://reviews.llvm.org/D141047

[flang] Load unlimited polymorphic box the same way as other

Unlimited polymorphic descriptors have a set size and can be loaded the
same way as polymorphic or monomorphic descriptors. The descriptor
code gen has been set in D138587.
Of course, the data held by those descriptors has an unknown size
at compile time.

Depends on D141383

Reviewed By: jeanPerier, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D141390

[perf-training] Check extension in findFilesWithExtension

`findFilesWithExtension` helper checks for `endswith(extension)` instead of
exactly matching the file extension. This causes it to match unrelated files,
for example, `.profdata` files while matching `.fdata` files:

http://157.230.108.44:8011/#/builders/56/builds/247
```
Merging data from /worker/worker/bolt-x86_64-ubuntu-clang-bolt-gcc/build/tools/clang/prof.fdata.1124569.fdata...
Merging data from /worker/worker/bolt-x86_64-ubuntu-clang-bolt-gcc/build/tools/clang/test/Frontend/Output/optimization-remark-with-hotness-new-pm.c.tmp.profdata...
```
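
The difference between the two checks can be reproduced in a few lines of Python. This is a sketch of the idea, not the patch itself: `matches_by_suffix` and `matches_by_extension` are hypothetical names, and the extension is assumed to be passed without a leading dot.

```python
import os


def matches_by_suffix(filename, extension):
    """The loose check: any name ending in the extension text matches."""
    return filename.endswith(extension)


def matches_by_extension(filename, extension):
    """The strict check: compare against the actual file extension."""
    return os.path.splitext(filename)[1] == "." + extension
```

With `extension="fdata"`, the suffix check accepts both `prof.fdata.1124569.fdata` and `optimization-remark-with-hotness-new-pm.c.tmp.profdata`, while the extension check accepts only the first.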

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D141342

AMDGPU/GlobalISel: Widen s1 SGPR constants during regbankselect

To unambiguously interpret these as 32-bit SGPRs, we need to widen
these to s32. This was selecting to a copy from a 64-bit SGPR to a
32-bit SGPR for wave64.

clang: Fix handling of __builtin_elementwise_copysign

I realized the handling of copysign made no sense at all.
Only the type of the first operand should really matter, and
it should not perform a conversion between them.

Also fixes misleading errors and producing broken IR for
integers.

We could accept different FP types for the sign argument,
if the intrinsic permitted different types like the DAG node.
As it is we would need to insert a cast, which could have
other effects (like trapping on snan) which should not happen
for a copysign.

[CallGraph][FIX] Ensure generic intrinsics are represented in the CG

Intrinsics have historically been excluded from the call graph with an
exception of 3 special ones added at some point. This meant that passes
depending on the call graph needed to handle intrinsics explicitly as
the underlying assumption, namely that intrinsics can't call or modify
things, doesn't hold. We are slowly moving away from special handling of
intrinsics, or at least towards explicitly checking what intrinsics we
want to handle differently.

This patch:
- Includes most intrinsics in the call graph. Debug intrinsics are
still excluded.
- Removes the special handling of intrinsics in the GlobalsAA pass.
- Removes the `IntrinsicInst::isLeaf` method.

Properly
Fixes: https://github.com/llvm/llvm-project/issues/52706

See also:
https://discourse.llvm.org/t/intrinsics-are-not-special-stop-pretending-i-mean-it/67545

Differential Revision: https://reviews.llvm.org/D14119

[lldb] Remove tools copied into LLDB.framework before install

CMake supports building Framework bundles for Apple platforms. We rely
on this functionality to create LLDB.framework. From CMake's
perspective, a framework is associated with a single target. In reality,
this is often not the case. In addition to a main library (liblldb), the
framework often contains associated resources that are their own CMake
targets, such as debugserver and lldb-argdumper.

When building and using the framework to run the test suite, those
binaries are expected to be in LLDB.framework. We have a function
(lldb_add_to_buildtree_lldb_framework) that copies those targets into
the framework.

When CMake installs a framework, it copies over the content of the
framework directory and "installs" the associated target. In addition to
copying the target, installing also means updating the RPATH. However,
the RPATH is only updated for the target associated with the framework.
Everything else is naively copied over, including executables or
libraries that have a different build and install RPATH. This means that
those tools need to have their own install rules.

If the framework is installed first, the aforementioned tools are copied
over with build RPATHs from the build tree. And when the tools
themselves are installed, the binaries get overwritten and the RPATHs
updated to the install RPATHs.

The problem is that CMake provides no guarantees when it comes to the
order in which components get installed. If those tools get installed
first, the inverse happens and the binaries get overwritten with the
ones that have build RPATHs.

lldb_add_to_buildtree_lldb_framework has a comment correctly stating
that those copied binaries should be removed before install. This patch
adds a custom target (lldb-framework-cleanup) that will be run before
the install phase and remove those files from LLDB.framework in the
build tree.

Differential revision: https://reviews.llvm.org/D141021

[libc++][test] Silence allocator conversion warnings

... by accepting `std::size_t` instead of `int` in `allocate` and `deallocate` functions.

Drive-by: To conform to the allocator requirements, the `Allocator` types in these tests need to have (1) converting constructors and (2) cross-specialization `==` that returns `true` at least for copies of the same allocator.
Differential Revision: https://reviews.llvm.org/D141334

[AMDGPU] Fix big endian bots after 7c327c2

The pass that this test case is testing has host-endianness
dependencies. This fixes the pertinent one causing failures
on BE bots.

AMDGPU: Don't assert on printf of half

The comment says fields should be 4-byte aligned, so just pass through
after conversion to integer. The conformance test lacks any testing of
half.

LangRef: Add !associated to list of preserved global metadata

AMDGPU: Fix opaque pointer and other bugs in printf of constant strings

Strip pointer casts to get to the global. Fixes not respecting indexed
constant strings. Tolerate non-null terminated and empty strings.

[llvm][dwarf] Change CU/TU index to 64-bit

Changed the contribution data structure to 64 bit. I added the 32bit and 64bit
accessors to make it explicit where we use 32bit and where we use 64bit, and to
make sure we catch all the cases where this data structure is used.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D139379

tsan: fix a race when assigning ThreadSignalContext

The SigCtx function lazily allocates a ThreadSignalContext, and stores it
in the ThreadState. This function may be called by various interceptors and
the signal handler itself.

If SigCtx itself is interrupted by a signal, then (prior to this fix) there
was a possibility of allocating two ThreadSignalContexts. This not only
leaks, it fails to deliver the signal to the program's signal handler, as
the recorded signal is overwritten by the new ThreadSignalContext.

Fix this by using a CAS to swap in the ThreadSignalContext, preventing the
race. Add a test for this case.

Reviewed By: dvyukov, melver

Differential Revision: https://reviews.llvm.org/D140582

[GWP-ASan] Fix up bad report for in-page underflow w/ UaF

Complex scenario, but reports when there's both a use-after-free and
buffer-underflow that is in-page (i.e. doesn't touch the guard page)
ended up generating a pretty bad report:

'Use After Free at 0x7ff392e88fef (18446744073709551615 bytes into a
1-byte allocation at 0x7ff392e88ff0) by thread 3836722 here:'

(note the 2^64-bytes-into-alloc, very cool and good!)
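
The huge number in that report is what an unsigned 64-bit subtraction produces when the access lies one byte below the allocation. The Python sketch below reproduces the arithmetic; it assumes the offset was computed as access address minus allocation address in unsigned arithmetic, and `offset_into_allocation` is a hypothetical name.

```python
UINT64_MOD = 1 << 64


def offset_into_allocation(access_addr, alloc_addr):
    """Unsigned 64-bit difference, wrapping like a uintptr_t subtraction."""
    return (access_addr - alloc_addr) % UINT64_MOD
```

For the addresses in the report, `offset_into_allocation(0x7ff392e88fef, 0x7ff392e88ff0)` wraps around to 2^64 - 1.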

Fix up that case, and add a diagnostic about when you have both a
use-after-free and a buffer-overflow that it's probably a bogus report
(assuming the developer didn't *really* screw up and have a uaf+overflow
bug at the same time).

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D139885

[GWP-ASan] Fix atfork handlers being installed multiple times in tests

We incorrectly install the atfork handlers multiple times in the test
harness, tracked down to the default parameter used by
CheckLateInitIsOK. This manifested in a hang if running the tests with
--gtest_repeat={>=2} as the atfork handler ran multiple times, causing
double-lock and double-unlock, which on my machine hung.

Add a check-fail for this case as well to prevent this from happening
again (it was difficult to track down and is an easy mistake to make).

Differential Revision: https://reviews.llvm.org/D139731

[SLP] Do not ignore ordering for root node when it has in-tree uses.

When rooted with PHIs, a vectorization tree may have another node with PHIs
which have roots as their operands. We cannot ignore ordering information
for root in such a case.

Differential Revision: https://reviews.llvm.org/D141309

[RISCV] Avoid emitting hardware fences for singlethread fences

singlethread fences only synchronize with code running on the same hardware thread (i.e. signal handlers). Because of this, we need to prevent instruction reordering, but do not need to emit hardware fence instructions.

The implementation strategy here matches many other backends. The main motivation of this patch is to introduce the MEMBARRIER node and get some test coverage for it.

Differential Revision: https://reviews.llvm.org/D141311

AMDGPU: Don't insert ptrtoint for printf lowering

AMDGPU: Stop trying to specially handle vector stores in printf lowering

This was broken for 1 element vectors and trying to create invalid
casts. We can directly store any type just fine, so don't bother with
this buggy conversion logic.

clang/AMDGPU: Add missing tests for some builtin

These were tested under opencl but need hip testing for the potential
addrspacecasts.

AMDGPU: Move intrinsic definition point

This was incorrectly listed under the block for backend internal
intrinsics only.

AMDGPU: Set some more attributes on intrinsics

[WebAssembly][LiveDebugValues] Handle target index defs

This adds the missing handling for defs for target index operands, as is
already done for registers.

There are two kinds of target indices: local indices and stack operands.

- Locals are something similar to registers in Wasm-land. For local
  indices, we can check for local-defining instructions (`local.set` or
  `local.tee`).

- Wasm is a stack machine, so we have values in certain Wasm value stack
  location, which change when Wasm instructions produce or consume
  values. So basically any value-producing instruction, i.e., instruction
  with defs, can change values in the Wasm stack. But I think we don't
  need to worry about this here, because `WebAssemblyDebugFixup`, which
  runs right before this analysis, makes sure to insert terminating
  `DBG_VALUE $noreg` instructions whenever a stack value gets popped.
  After `WebAssemblyDebugFixup`, there shouldn't be any `DBG_VALUE`s for
  stack operands that don't have a terminating `DBG_VALUE $noreg` within
  the same BB.

So this CL only works on `DBG_VALUE`s for locals. When we encounter a
`local.set` or `local.tee` instruction, we delete `DBG_VALUE`s for
those target index locations from the open range set, so they will not
be available in `OutLocs`. For example,
```
bb.0:
  successors: %bb.1
  DBG_VALUE target-index(wasm-local) + 2, $noreg, "var", ...
  ...
  local.set 2 ...

bb.1:
; predecessors: %bb.0
  ; We shouldn't add `DBG_VALUE target-index(wasm-local) + 2` here because
  ; it was killed by 'local.set' in bb.0
```
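
The transfer step for a single block can be modeled in a few lines of Python. This is a heavily simplified sketch, not the LiveDebugValues implementation: instructions are hypothetical tuples, each local holds at most one variable, and only wasm locals are tracked.

```python
def transfer_block(in_locs, instructions):
    """Compute the open DBG_VALUE locations at the end of one block.

    in_locs:      {local index: variable name} open on entry
    instructions: tuples such as ("DBG_VALUE", 2, "var"),
                  ("local.set", 2), ("local.tee", 2) or ("other",)
    """
    open_ranges = dict(in_locs)
    for inst in instructions:
        opcode = inst[0]
        if opcode == "DBG_VALUE":
            _, local, var = inst
            open_ranges[local] = var
        elif opcode in ("local.set", "local.tee"):
            # A def of the local ends any debug value that lived in it.
            open_ranges.pop(inst[1], None)
    return open_ranges
```

For the block above, a `DBG_VALUE` on local 2 followed by `local.set 2` leaves nothing open, so nothing is propagated into `bb.1`.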

After disabling register coalescing at -O1, the average PC ranges
covered for Emscripten core benchmarks is currently 20.6% in the LLVM
tot. After applying D138943 and this CL, the coverage goes up to 57%.
This also enables LiveDebugValues analysis in the Wasm pipeline by
default.

Reviewed By: dschuff, jmorse

Differential Revision: https://reviews.llvm.org/D140373