review.tizen.org Git - platform/upstream/llvm.git/log

Fix COMPILER_RT_DEBUG build for targets that don't support thread local storage.

022439931f5be77efaf80b44d587666b0c9b13b5 added code that is only enabled
when COMPILER_RT_DEBUG is enabled. This code doesn't build on targets
that don't support thread local storage because the code added uses the
THREADLOCAL macro. Consequently the COMPILER_RT_DEBUG build broke for
some Apple targets (e.g. 32-bit iOS simulators).

```
/Volumes/user_data/dev/llvm/llvm.org/main/src/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:216:8: error: thread-local storage is not supported for the current target
static THREADLOCAL InternalDeadlockDetector deadlock_detector;
       ^
/Volumes/user_data/dev/llvm/llvm.org/main/src/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h:227:24: note: expanded from macro 'THREADLOCAL'
# define THREADLOCAL   __thread
                        ^
1 error generated.
```

To fix this, this patch introduces a `SANITIZER_SUPPORTS_THREADLOCAL`
macro that is `1` iff thread local storage is supported by the current
target. That condition is then added to `SANITIZER_CHECK_DEADLOCKS` to
ensure the code is only enabled when thread local storage is available.

The implementation of `SANITIZER_SUPPORTS_THREADLOCAL` currently assumes
Clang. See `llvm-project/clang/include/clang/Basic/Features.def` for the
definition of the `tls` feature.

rdar://81543007

Differential Revision: https://reviews.llvm.org/D107524

[llvm-ar] Fix for handling thin archive with SYM64 and a test case for it

WHen thin archives are created which have symbol table of type SYM64 then all the tools will not work since they cannot read the files properly.
One can reproduce the problem as follows:
1. Take a hello world program and create an archive out of it. The SYM64_THRESHOLD=0 will force the generation of SYM64 symbol table.
    clang -c hello.cpp
    SYM64_THRESHOLD=0 llvm-ar crsT mylib.a hello.o
2. Now try to use any of the tools on this mylib.a and it will fail.
    llvm-nm -M mylib.a

THis fix will eliminate these failures. A regression test is created in llvm/test/Object/archive-symtab.test

Reviewed By: MaskRay, Ramesh

Differential Revision: https://reviews.llvm.org/D107322

Fix clang-interpreter build after 2487db1f286222e2501c2fa8e8244eda13f6afc3

Revert "[X86] combineX86ShuffleChain(): canonicalize mask elts picking from splats"

This reverts commits f819e4c7d0f6efef3cc1042cc45582320bf6c0a2 and
35c0848b570214ed2b2d96cca4dd62bb7ae725cd. It triggers an infinite loop during
compilation.

$ cat t.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @MaxPoolGradGrad_1.65() local_unnamed_addr #0 {
entry:
  %wide.vec78 = load <64 x i32>, <64 x i32>* null, align 16
  %strided.vec83 = shufflevector <64 x i32> %wide.vec78, <64 x i32> poison, <8 x i32> <i32 4, i32 12, i32 20, i32 28, i32 36, i32 44, i32 52, i32 60>
  %0 = lshr <8 x i32> %strided.vec83, <i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16>
  %1 = add <8 x i32> zeroinitializer, %0
  %2 = shufflevector <8 x i32> %1, <8 x i32> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  %3 = shufflevector <16 x i32> %2, <16 x i32> undef, <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
  %interleaved.vec = shufflevector <32 x i32> undef, <32 x i32> %3, <64 x i32> <i32 0, i32 8, i32 16, i32 24, i32 32, i32 40, i32 48, i32 56, i32 1, i32 9, i32 17, i32 25, i32 33, i32 41, i32 49, i32 57, i32 2, i32 10, i32 18, i32 26, i32 34, i32 42, i32 50, i32 58, i32 3, i32 11, i32 19, i32 27, i32 35, i32 43, i32 51, i32 59, i32 4, i32 12, i32 20, i32 28, i32 36, i32 44, i32 52, i32 60, i32 5, i32 13, i32 21, i32 29, i32 37, i32 45, i32 53, i32 61, i32 6, i32 14, i32 22, i32 30, i32 38, i32 46, i32 54, i32 62, i32 7, i32 15, i32 23, i32 31, i32 39, i32 47, i32 55, i32 63>
  store <64 x i32> %interleaved.vec, <64 x i32>* undef, align 16
  unreachable
}

$ llc < t.ll -mcpu=skylake
<hang>

[AArch64][GlobalISel] Mark v16s8 <- v8s8, v8s8 G_CONCAT_VECTOR as legal

G_CONCAT_VECTORS shows up from time to time when legalizing other instructions.

We actually import patterns for the v16s8 <- v8s8, v8s8 case so marking it
as legal gives us selection for free.

Differential Revision: https://reviews.llvm.org/D107512

Add llvm-stress binary to Bazel build configuration.

The `llvm-stress` binary is currently missing from the Bazel `BUILD` file for llvm. This patch adds it.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D107571

[SLP] Add additional memory version tests.

Fix signal during the call to checkOpenMPLoop.

The root problem is a null pointer is accessed during the call to
checkOpenMPLoop, because loop up bound expr is an error expression
due to error diagnostic was emit early.

To fix this, in setLCDeclAndLB, setUB and setStep instead return false,
return true when LB, UB or Step contains Error, so that the checking is
stopped in checkOpenMPLoop.

Differential Revision: https://reviews.llvm.org/D107385

[Transforms] Drop unnecessary const from return types (NFC)

Identified with readability-const-return-type.

[SLP]Do not emit extra shuffle for insertelements vectorization.

If the vectorized insertelements instructions form indentity subvector
(the subvector at the beginning of the long vector), it is just enough
to extend the vector itself, no need to generate inserting subvector
shuffle.

Differential Revision: https://reviews.llvm.org/D107494

[DAGCombiner][RISCV][AMDGPU] Call SimplifyDemandedBits at the end of visitMULHU to enable known bits contant folding.

We don't have real demanded bits support for MULHU, but we can
still use the known bits based constant folding support at the end
of SimplifyDemandedBits to simplify a MULHU. This helps with cases
where we know the LHS and RHS have enough leading zeros so that
the high multiply result is always 0.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D106471

Fix build issues caused by 95800da914938129083df2fa0165c1901909c273

[LV] Consider ExtractValue as uniform.

Since all operands to ExtractValue must be loop-invariant when we deem
the loop vectorizable, we can consider ExtractValue to be uniform.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D107286

[PowerPC][AIX] attribute aligned cannot decrease align of a vector var.

On AIX an aligned attribute cannot decrease the alignment of a variable
when placed on a variable declaration of vector type.

Differential Revision: https://reviews.llvm.org/D107522

[NFC][LoopIdiom] rename boolean variable NegStride to IsNegStride

Rename variable for better code readability.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107570

[AMDGPU][SDag] Better lowering for 32-bit ctlz/cttz

Differential Revision: https://reviews.llvm.org/D107566

[AMDGPU][SDag] Better lowering for 64-bit ctlz/cttz

Differential Revision: https://reviews.llvm.org/D107546

tsan: pass thr/pc to MemoryResetRange

Pass thr/pc args to MemoryResetRange as we do everywhere.
Currently they are unused by MemoryResetRange,
but there is no reason to be inconsistent.

Depends on D107562.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107563

tsan: qualify autos

clang-tidy warning requires qualifying auto pointers:

clang-tidy: warning: 'auto ctx' can be declared as 'auto *ctx' [llvm-qualified-auto]

Fix remaing cases we have in tsan.

Depends on D107561.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107562

tsan: don't include tsan_interceptors.h for Go

None of the interceptors machinery is used/enabled for Go,
so don't include the header, it's not needed (must not be).
The problem is that we have fields in ThreadState that are
not present in the Go build, so changes in thread_interceptors.h
can cause Go build breakages due to missing fields.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107561

[SimpifyCFG] Speculate a store preceded by a local non-escaping load

In SimplifyCFG we may simplify the CFG by speculatively executing
certain stores, when they are preceded by a store to the same
location. This patch allows such speculation also when the stores are
similarly preceded by a load.

In order for this transformation to be correct we need to ensure that
the memory location is writable and the store in the new location does
not introduce a data race.

Local objects (created by an `alloca` instruction) are always
writable, so once we are past a read from a location it is valid to
also write to that same location.

Seeing just a load does not guarantee absence of a data race (unlike
if we see a store) - the load may still be part of a race, just not
causing undefined behaviour
(cf. https://llvm.org/docs/Atomics.html#optimization-outside-atomic).

In the original program, a data race might have been prevented by the
condition, but once we move the store outside the condition, we must
be sure a data race wasn't possible anyway, no matter what the
condition evaluates to.

One way to be sure that a local object is never concurrently
read/written is check that its address never escapes the function.

Hence this transformation is restricted to local, non-escaping
objects.

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D107281

tsan: handle bugs in symbolizer more gracefully

For symbolizer we only process SIGSEGV signals synchronously
(which means bug in symbolizer or in tsan).
But we still want to reset in_symbolizer to fail gracefully.
Symbolizer and user code use different memory allocators,
so if we don't reset in_symbolizer we can get memory allocated
with one being feed with another, which can cause more crashes.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107564

tsan: modernize MaybeReportThreadLeak

Use C++ casts and auto.
Rename to CollectThreadLeaks b/c it's only collecting, not reporting.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D107568

[clang] [clang-repl] Fix linking against LLVMLineEditor

LLVMLineEditor library is part of the LLVM dylib. Move it into
LLVM_LINK_COMPONENTS to avoid duplicate linking when dylib is being
used. This fixes building standalone clang against installed LLVM
without static libraries.

Differential Revision: https://reviews.llvm.org/D107558

[DAG] DAGCombiner::visitVECTOR_SHUFFLE - recognise INSERT_SUBVECTOR patterns

IR typically creates INSERT_SUBVECTOR patterns as a widening of the subvector with undefs to pad to the destination size, followed by a shuffle for the actual insertion - SelectionDAGBuilder has to do something similar for shuffles when source/destination vectors are different sizes.

This combine attempts to recognize these patterns by looking for a shuffle of a subvector (from a CONCAT_VECTORS) that starts at a modulo of its size into an otherwise identity shuffle of the base vector.

This uncovered a couple of target-specific issues as we haven't often created INSERT_SUBVECTOR nodes in generic code - aarch64 could only handle insertions into the bottom of undefs (i.e. a vector widening), and x86-avx512 vXi1 insertion wasn't keeping track of undef elements in the base vector.

Fixes PR50053

Differential Revision: https://reviews.llvm.org/D107068

[VectorCombine] Limit scalarization known non-poison indices.

We can only trust the range of the index if it is guaranteed
non-poison.

Fixes PR50949.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D107364

[BuildLibCalls][NFC] Remove redundant attribute list from emitCalloc

Additionally with this patch aligned DSE which is the only user of emitCalloc.

Differential Revision: https://reviews.llvm.org/D103523

[LoopVectorize] Add support for replication of more intrinsics with scalable vectors

This patch adds more instructions to the Uniforms list, for example certain
intrinsics that are uniform by definition or whose operands are loop invariant.
This list includes:

  1. The intrinsics 'experimental.noalias.scope.decl' and 'sideeffect', which
  are always uniform by definition.
  2. If intrinsics 'lifetime.start', 'lifetime.end' and 'assume' have
  loop invariant input operands then these are also uniform too.

Also, in VPRecipeBuilder::handleReplication we check if an instruction is
uniform based purely on whether or not the instruction lives in the Uniforms
list. However, there are certain cases where calls to some intrinsics can
be effectively treated as uniform too. Therefore, we now also treat the
following cases as uniform for scalable vectors:

  1. If the 'assume' intrinsic's operand is not loop invariant, then we
  are free to treat this as uniform anyway since it's only a performance
  hint. We will get the benefit for the first lane.
  2. When the input pointers for 'lifetime.start' and 'lifetime.end' are loop
  variant then for scalable vectors we assume these still ultimately come
  from the broadcast of an alloca. We do not support scalable vectorisation
  of loops containing alloca instructions, hence the alloca itself would
  be invariant. If the pointer does not come from an alloca then the
  intrinsic itself has no effect.

I have updated the assume test for fixed width, since we now treat it
as uniform:

  Transforms/LoopVectorize/assume.ll

I've also added new scalable vectorisation tests for other intriniscs:

  Transforms/LoopVectorize/scalable-assume.ll
  Transforms/LoopVectorize/scalable-lifetime.ll
  Transforms/LoopVectorize/scalable-noalias-scope-decl.ll

Differential Revision: https://reviews.llvm.org/D107284

Revert "[SystemZ][z/OS] Update target specific __attribute__((aligned)) value for test"

This reverts commit d91234b21c1a1a34d98157089a8769d8f9a32f06.

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D107565

[SimplifyLibCalls][NFC] Clean up LibCallSimplifier from 'memset + malloc into calloc' transformation

FoldMallocMemset can be safely removed because since https://reviews.llvm.org/D103009
such transformation is already performed in DSE.

Differential Revision: https://reviews.llvm.org/D103451

Delay initialization of OptBisect

When LLVM is used in other projects, it may happen that global cons-
tructors will execute before the call to ParseCommandLineOptions.
Since OptBisect is initialized via a constructor, and has no ability
to be updated at a later time, passing "-opt-bisect-limit" to the
parse function may have no effect.

To avoid this problem use a cl::cb (callback) to set the bisection
limit when the option is actually processed.

Differential Revision: https://reviews.llvm.org/D104551

[NFC] Clean up tests in test/Transforms/LoopVectorize/assume.ll

The tests previously had lots of unnecessary CHECK lines, where
all we really need to check is the presence (or absence) of the
assume intrinsic and the correct input operands.

Differential Revision: https://reviews.llvm.org/D107157

[PowerPC][AIX] Limit attribute aligned to 4096.

Limit the maximum alignment for attribute aligned to 4096 to match
the limit of the .align pseudo op in the system assembler.

Differential Revision: https://reviews.llvm.org/D107497

[DA] control compile-time spent by MIV tests

Function exploreDirections() in DependenceAnalysis implements a recursive
algorithm for refining direction vectors. This algorithm has worst-case
complexity of O(3^(n+1)) where n is the number of common loop levels.
In this patch I'm adding a threshold to control the amount of time we
spend in doing MIV tests (which most of the time end up resulting in over
pessimistic direction vectors anyway).

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D107159

[LV] Remove a change that was added in D106164.

This change wasn't strictly necessary for D106164 and could be removed.
This patch addresses the post-commit comments from @fhahn on D106164, and
also changes sve-widen-gep.ll to use the same IR test as shown in
pointer-induction.ll.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D106878

[NFC] Remove redundant test in Transforms/LoopVectorize/lifetime.ll

The two tests (@testloopvariant and @testbitcast) are actually
identical as in both loops the bitcast gets widened, forcing the
lifetime marker to be replicated using each lane of the input
vector.

Differential Revision: https://reviews.llvm.org/D107150

Add a DIExpression const-folder to prevent silly expressions.

It's entirely possible (because it actually happened) for a bool
variable to end up with a 256-bit DW_AT_const_value. This came about
when a local bool variable was initialized from a bitfield in a
32-byte struct of bitfields, and after inlining and constant
propagation, the variable did have a constant value. The sequence of
optimizations had it carrying "i256" values around, but once the
constant made it into the llvm.dbg.value, no further IR changes could
affect it.

Technically the llvm.dbg.value did have a DIExpression to reduce it
back down to 8 bits, but the compiler is in no way ready to emit an
oversized constant *and* a DWARF expression to manipulate it.
Depending on the circumstances, we had either just the very fat bool
value, or an expression with no starting value.

The sequence of optimizations that led to this state did seem pretty
reasonable, so the solution I came up with was to invent a DWARF
constant expression folder. Currently it only does convert ops, but
there's no reason it couldn't do other ops if that became useful.

This broke three tests that depended on having convert ops survive
into the DWARF, so I added an operator that would abort the folder to
each of those tests.

Differential Revision: https://reviews.llvm.org/D106915

[VectorCombine] Add additional tests with freeze combinations.

Suggested in D107364.

GlobalISel: Fix matchEqualDefs for instructions with multiple defs

Instructions that produceSameValue produce same values for operands with
same index. matchEqualDefs used to return true for any two values from
different instructions that produce same values. Fix this by checking if
values are defined by operands with the same index.

Differential Revision: https://reviews.llvm.org/D107362

[flang][driver] Delete `f18` (i.e. the old Flang driver)

This patch removes `f18`, a.k.a. the old driver. It is being replaced
with the new driver, `flang-new`, which has reached feature parity with
`f18` a while ago. This was discussed in [1] and also in [2].

With this change, `FLANG_BUILD_NEW_DRIVER` is no longer needed and is
also deleted. This means that we are making the dependency on Clang permanent
(i.e. it cannot be disabled with a CMake flag).

LIT set-up is updated accordingly. All references to `f18` or `f18.cpp`
are either updated or removed.

The `F18_FC` variable from the `flang` bash script is replaced with
`FLANG_FC`. The former is still supported for backwards compatibility.

[1] https://lists.llvm.org/pipermail/flang-dev/2021-June/000742.html
[2] https://reviews.llvm.org/D103177

Differential Revision: https://reviews.llvm.org/D105811

[AMDGPU] Add globalisel checks for ctlz_zero_undef/cttz_zero_undef

[X86] Rename Subtarget Tuning Feature Flag Prefix. NFC.

As suggested on D107370, this patch renames the tuning feature flags to start with 'Tuning' instead of 'Feature'.

Differential Revision: https://reviews.llvm.org/D107459

[GlobalISel] Combine shr(shl x, c1), c2 to G_SBFX/G_UBFX

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D107330

tsan: introduce RawShadow type

Currently we hardcode u64 type for shadow everywhere
and do lots of uptr<->u64* casts. It makes it hard to
change u64 to another type (e.g. u32) and makes it easy
to introduce bugs.
Introduce RawShadow type and use it in MemToShadow, ShadowToMem,
IsShadowMem and throughout the code base as u64 replacement.
This makes it possible to change u64 to something else in future
and generally improves static typing.

Depends on D107481.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D107482

tsan: make IsMetaMem accept u32*

MemToMeta returns u32*, so it's reasonable for IsMetaMem
to accept u32* as well.
Changing the argument type just removes few type casts.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D107481

Correct a lot of diagnostic wordings for the driver

Clang diagnostics should not start with a capital letter or use
trailing punctuation (https://clang.llvm.org/docs/InternalsManual.html#the-format-string),
but quite a few driver diagnostics were not following this advice. This
corrects the grammar and punctuation to improve consistency, but does
not change the circumstances under which the diagnostics are produced.

[flang][driver] Refactor boolean options

For boolean options, e.g. `-fxor-operator`/`-fno-xor-operator`, we ought
to be using TableGen multi-classes. This way, we only have to write one
definition to have both forms auto-generated. This patch refactors all
of Flang's boolean options to use two new multi-classes:
`OptInFC1FFOption` and `OptOutFC1FFOption`. These multi-classes are
based on `OptInFFOption`/`OptOutFFOption`, respectively. I've also
simplified the processing of the updated options in
CompilerInvocation.cpp.

With the new approach, "empty" help text (i.e. no `HelpText`) is now
replaced with an empty string (i.e. HelpText<"">). When running
flang-new --help, that's considered as non-empty help messages, which is
then printed (that's controlled by `printHelp` from
llvm/lib/Option/OptTable.cpp). This means that with this patch,
flang-new --help will start printing e.g. -fno-backslash, even though
there is no actual help text to print for this option (apart from the
empty string ""). Tests are updated accordingly.

Note that with this patch, both `-fxor-operator` and `-fno-xor-operator`
(and other boolean options refactored here) remain available in
`flang-new` and `flang-new -fc1`. In this respect, nothing changes. In a
forthcoming patch, I will refine this so that `flang-new -fc1` only
accepts `-ffoo` (`OptInFC1FFOption`) or `-fno-foo` (`OptOutCC1FFOption`).

For clarity, `OptInFFOption`/`OptOutFFOption` are renamed as
`OptInCC1FFOption`/`OptOutCC1FFOption`, respectively. Otherwise, this is
an NFC from Clang's perspective.

Differential Revision: https://reviews.llvm.org/D105881

[OpenCL] Reduce duplicate defs by using multiclasses; NFC

Builtin definitions with pointer arguments were duplicated to provide
overloads differing in the pointer argument's address space.

Reduce this duplication by capturing the definitions in multiclasses.
This still results in the same number of builtins in the generated
tables, but the description is more concise now.

Differential Revision: https://reviews.llvm.org/D107151

Revert "D106035: Remove conditional compilation for WCHAR support in libedit"

This reverts commit 7529f0e3e1427fea93a6a66a2aed5394710e5fb5.

[AMDGPU] Generate checks for ctlz_zero_undef/cttz_zero_undef

Mark tests as requiring AMDGPU target

[SelectionDAG] Correctly determine the VECREDUCE_SEQ_FMUL action

The LegalizeAction for this node should follow the logic for
`VECREDUCE_SEQ_FADD` and be determined using the vector operand's type.

here isn't an in-tree target that makes use of this, but I think it's safe to
say this is how it should behave, should a target want to customize the action
for this node.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D107478

[llvm-jitlink] Don't try to guess the ORC runtime path.

ORC-runtime regression tests will now explicitly specify the runtime path.

[AMDGPU] Make more use of getHiHalf64 and split64BitValue. NFCI.

[clang] Add clang builtins support for gfx90a

Implement target builtins for gfx90a including fadd64, fadd32, add2h,
max and min on various global, flat and ds address spaces for which
intrinsics are implemented.

Differential Revision: https://reviews.llvm.org/D106909

D106035: Remove conditional compilation for WCHAR support in libedit

This change moves to using narrow character types and libedit APIs in
Editline, because those are the same types that the rest of LLVM/LLDB
uses, and it's generally considered better practice to use UTF-8
encoded in char than it is to use wider characters. However, for
character input, the change leaves in using a wchar to enable input of
multi-byte characters.

Differential Revision: https://reviews.llvm.org/D106035

[llvm-rc] Allow specifying language with a leading 0x prefix

This option is always interpreted strictly as a hexadecimal string,
even if it has no prefix that indicates the number format, hence
the existing call to StringRef::getAsInteger(16, ...).

StringRef::getAsInteger(0, ...) consumes a leading "0x" prefix is
present, but when the radix is specified, the radix shouldn't
be included.

Both MS rc.exe and GNU windres accept the language with that
prefix.

Also allow specifying the codepage to llvm-windres with a different
radix, as GNU windres allows that (but MS rc.exe doesn't).

This fixes https://llvm.org/PR51295.

Differential Revision: https://reviews.llvm.org/D107263

[llvm] [lit] Fix inconsistent test order in shtest-keyword-parse-errors

Remove test times when running shtest-keyword-parse-errors test,
in order to prevent the previous executions from impacting subtest
order and therefore causing FileCheck to fail.

Differential Revision: https://reviews.llvm.org/D107427

[ARM][llvm-objdump] Annotate PC-relative memory operands of VLDR instructions

This extends D105979 and adds support for VLDR instructions.

Differential Revision: https://reviews.llvm.org/D105980

[ARM][llvm-objdump] Annotate PC-relative memory operands

This implements `MCInstrAnalysis::evaluateMemoryOperandAddress()` for
Arm so that the disassembler can print the target address of memory
operands that use PC+immediate addressing.

Differential Revision: https://reviews.llvm.org/D105979

[ELF] Apply version script patterns to non-default version symbols

Currently version script patterns are ignored for .symver produced
non-default version (single @) symbols. This makes such symbols
not localizable by `local:`, e.g.

```
.symver foo3_v1,foo3@v1
.globl foo_v1
foo3_v1:

ld.lld --version-script=a.ver -shared a.o
```

This patch adds the support:

* Move `config->versionDefinitions[VER_NDX_LOCAL].patterns` to `config->versionDefinitions[versionId].localPatterns`
* Rename `config->versionDefinitions[versionId].patterns` to `config->versionDefinitions[versionId].nonLocalPatterns`
* Allow `findAllByVersion` to find non-default version symbols when `includeNonDefault` is true. (Note: `symtab` keys do not have `@@`)
* Make each pattern check both the unversioned `pat.name` and the versioned `${pat.name}@${v.name}`
* `localPatterns` can localize `${pat.name}@${v.name}`. `nonLocalPatterns` can prevent localization by assigning `verdefIndex` (before `parseSymbolVersion`).

---

If a user notices new `undefined symbol` errors with a version script containing
`local: *;`, the issue is likely due to a missing `global:` pattern.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D107234

Revert "[ELF] Apply version script patterns to non-default version symbols"

This reverts commit 7ed22a6fa90cbdc70d6806c1121a0c50c1978dce.

buf is not cleared so the commit misses some cases.

[NFCI] [LoopIdiom] Let processLoopStridedStore take StoreSize as SCEV instead of unsigned

Letting it take SCEV allows further modification on the function to optimize
if the StoreSize / Stride is runtime determined.

This is a preceeding of D107353.
The big picture is to let LoopIdiom deal with runtime-determined sizes.

Reviewed By: Whitney, lebedev.ri

Differential Revision: https://reviews.llvm.org/D104595

[WebAssembly] Cleanup Emscripten SjLj tests

- Remove a redundant test: there were `longjmp_only` and `only_longjmp`,
which do the same thing
- Add `CHECK-LABEL` lines for function names

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D107511

[WebAssembly] Use `SDValue::getConstantOperandVal` (NFC)

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D107499

[mlir] Clean up includes in Transforms/Passes.h

Differential Revision: https://reviews.llvm.org/D107520

Disable LibFuncs for stpcpy and stpncpy for Android < 21

These functions don't exist in android API levels < 21. A change in
llvm-12 (rG6dbf0cfcf789) caused Oz builds to emit this symbol assuming
it's available and thus is causing link errors. Simply disable it here.

Differential Revision: https://reviews.llvm.org/D107509

[Compiler-RT] On Apple Platforms switch to always emitting full debug info

Previously the build used `-gline-tables-only` when `COMPILER_RT_DEBUG`
was off (default) and `-g` when `COMPILER_RT_DEBUG` was on. The end
result of this meant that the release build of the Sanitizer runtimes
were difficult to debug (e.g. information about variables and function
arguments were missing).

Presumably the reason for preferring `-gline-tables-only` for release
builds was to save space. However, for Apple platforms this doesn't
matter because debug info lives in separate `.dSYM` files (which aren't
shipped) rather than in the shipped `.dylib` files.

Now on Apple platforms we always emit full debug info if the compiler
supports it and we emit a fatal error if `-g` isn't supported.

rdar://79223184

Differential Revision: https://reviews.llvm.org/D107501

[AVR] emit 'MCSA_Global' references to '__do_global_ctors' and '__do_global_dtors'

Emit references to '__do_global_ctors' and '__do_global_dtors' to allow
constructor/destructor routines to run.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D107133

[mlir] Update comment in Region.h

The file in which `Region::viewGraph` is defined has changed. This should have been updated with D106342.

Differential Revision: https://reviews.llvm.org/D107517

[CMake][gn] lldMachO=>lldMachOOld, lldMachO2=>lldMachO

Now that D95204 switched default to new Darwin backend, rename some CMake
targets to match.

Reviewed By: #lld-macho, smeenai, int3

Differential Revision: https://reviews.llvm.org/D107516

[Bazel] Add support for lld

This patch adds a Bazel configuration to build lld. That includes a
BUILD.bazel file to export the libunwind headers for use by lld. Since
the lld target itself requires libxml2 (through WindowsManifest) it's
currently disabled on Buildkite and marked manual, but all the libraries
build.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D107414

[libFuzzer] tests/examples for using libFuzzer for out-of-process targets

[libFuzzer] tests/examples for using libFuzzer for out-of-process targets

Reviewed By: kostik

Differential Revision: https://reviews.llvm.org/D107498

[CSSPGO] Remove used of PseudoProbeAttributes::Reserved

D106861 added usage of PseudoProbeAttributes::Reserved as TailCall however this usage hasn't been committed/reviewed. Removing this usage.

Testing
ninja check-all

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D107514

BPF: avoid NE/EQ loop exit condition

Kuniyuki Iwashima reported in [1] that llvm compiler may
convert a loop exit condition with "i < bound" to "i != bound", where
"i" is the loop index variable and "bound" is the upper bound.
In case that "bound" is not a constant, verifier will always have "i != bound"
true, which will cause verifier failure since to verifier this is
an infinite loop.

The fix is to avoid transforming "i < bound" to "i != bound".
In llvm, the transformation is done by IndVarSimplify pass.
The compiler checks loop condition cost (i = i + 1) and if the
cost is lower, it may transform "i < bound" to "i != bound".
This patch implemented getArithmeticInstrCost() in BPF TargetTransformInfo
class to return a higher cost for such an operation, which
will prevent the transformation for the test case
added in this patch.

[1] https://lore.kernel.org/netdev/1994df05-8f01-371f-3c3b-d33d7836878c@fb.com/

Differential Revision: https://reviews.llvm.org/D107483

Adding missing filter check to SourceMgrDiagnosticHandler::EmitDiagnostics

There is a case in EmitDiagnostics where the filter check is bypassed (when locationStack is empty). Filter might also be bypassed when loc instead of showableLoc is added to the locationStack.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D106522

[AArch64][GlobalISel] Legalize wide vector G_PHIs

Clamp the max number of elements when legalizing G_PHI. This allows us to
legalize some common fallbacks like 4 x s64.

Here's an example: https://godbolt.org/z/6YocsEYTd

Had to add -global-isel-abort=0 to legalize-phi.mir to account for the
G_EXTRACT_VECTOR_ELT from the 32 x s8 G_PHI.

Differential Revision: https://reviews.llvm.org/D107508

Apply -fmacro-prefix-map to __builtin_FILE()

This matches the behavior of GCC.
Patch does not change remapping logic itself, so adding one simple smoke test should be enough.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D107393

[gwpAsan] revert minor change

This change reverts a small cmake change that was causing buildbot
failures.

Differential Revision: https://reviews.llvm.org/D107510

[mlir][sparse] Remove comment w/ code in it

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D107484

[OpenCL] allow generic address and non-generic defs for CL3.0

This allows both sets of definitions to exist on CL 3.0

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D107318

[llvm-nm][test] Avoid deprecated alias -M (--print-armap)

-M was inappropriately added since -s is taken by Darwin nm.
-M is deprecated. Tests should use --print-armap instead.

[WebAssembly] Make result of 'catch' inst variadic

`catch` instruction can have any number of result values depending on
its tag, but so far we have only needed a single i32 return value for
C++ exception so the instruction was specified that way. But using the
instruction for SjLj handling requires multiple return values.

This makes `catch` instruction's results variadic and moves selection of
`throw` and `catch` instruction from ISelLowering to ISelDAGToDAG.
Moving `catch` to ISelDAGToDAG is necessary because I am not aware of
a good way to do instruction selection for variadic output instructions
in TableGen. This also moves `throw` because 1. `throw` and `catch`
share the same utility function and 2. there is really no reason we
should do that in ISelLowering in the first place. What we do is mostly
the same in both places, and moving them to ISelDAGToDAG allows us to
remove unnecessary mid-level nodes for `throw` and `catch` in
WebAssemblyISD.def and WebAssemblyInstrInfo.td.

This also adds handling for new `catch` instruction to AsmTypeCheck.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D107423

[X86] Remove -x86-experimental-pref-loop-alignment in favor of -align-loops

[Bazel] Drop deprecated tblgen includes mechanism

Includes can now be fully managed via td_library and specified locally
to the tablegen files that require them. This has been deprecated for a
while and is not used upstream. I'm not aware of any downstream users
either.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D107389

[mlir-lsp-server] Only use one MLIRContext per MLIRTextFile

A text file may be comprised of many different "chunks", when
the input file contains the `// -----` split markers. We don't
need to use a unique MLIRContext per chunk, as having
separate contexts is intended to allow for easy unloading of
unused data and all chunks have the same lifetime (tied to the
input file). This commit uses one context for the entire file,
greatly reducing memory consumption in certain situations (up
to 70%).

Differential Revision: https://reviews.llvm.org/D107488

[libc] add integration tests for scudo in libc

This change adds tests to make sure that SCUDO is being properly
included with llvm libc. This change also adds the toggles to properly
use SCUDO, as GWP-ASan is enabled by default and must be included for
SCUDO to function.

Reviewed By: sivachandra, hctim

Differential Revision: https://reviews.llvm.org/D106919

[lld] Remove unused LLD_REPOSITORY

Remnant after D72803.

Distributions who want to customize the string can customize
LLD_VERSION_STRING instead.

Reviewed By: #lld-macho, mstorsjo, thakis

Differential Revision: https://reviews.llvm.org/D107416

[CodeGen] Add -align-loops

to `lib/CodeGen/CommandFlags.cpp`. It can replace
-x86-experimental-pref-loop-alignment=.

The loop alignment is only used by MachineBlockPlacement.
The implementation uses a new `llvm::TargetOptions` for now, as
an IR function attribute/module flags metadata may be overkill.

This is the llvm part of D106701.

[amdgpu] Add an enhanced conversion from i64 to f32.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D107187

[flang] runtime: For Fw.d formatting, don't oscillate forever

The algorithm for Fw.d output will drive binary to decimal conversion for
an initial fixed number of digits, then adjust that number based on the
result's exposent. For value close to a power of ten, this adjustment
process wouldn't terminate; e.g., formatting 9.999 as F10.2 would start
with 1e2, boost the digits to 2, get 9.99e1, decrease the digits, and loop.
Solve by refusing to boost the digits a second time.

Differential Revision: https://reviews.llvm.org/D107490

[flang] Support DFLOAT legacy extension intrinsic function

Like the similar legacy extension FLOAT(), DFLOAT() represents a
conversion from default integer to DOUBLE PRECISION. Rewrite
into a conversion operation.

Differential Revision: https://reviews.llvm.org/D107489

[MemCpyOpt] Relax libcall checks

Rather than blocking the whole MemCpyOpt pass if the libcalls are
not available, only disable creation of new memset/memcpy intrinsics
where only load/stores were used previously. This only affects the
store merging and load-store conversion optimization. Other
optimizations are derived from existing intrinsics, which are
well-defined in the absence of libcalls -- not having the libcalls
just means that call simplification won't convert them to intrinsics.

This is a weaker variation of D104801, which dropped these checks
entirely. Ideally we would not couple emission of intrinsics to
libcall availability at all, but as the intrinsics may be legalized
to libcalls we need to be a bit careful right now.

Differential Revision: https://reviews.llvm.org/D106769

[MLIR][NFC] Get DiagnosticEngine as a reference in doc

'mlir::DiagnosticEngine::DiagnosticEngine(const mlir::DiagnosticEngine&)' is implicitly deleted because the default definition would be ill-formed.

Reviewed By: rdzhabarov

Differential Revision: https://reviews.llvm.org/D107287

[gn build] Add cfi ignorelist to compiler-rt/lib

So that building the compiler-rt target also copies the cfi ignorelist

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D107411

[SLP][NFC]Add tests for constants/undefs used in insertelements, NFC.

[OpenMPOpt] Expand SPMDization with guarding for target parallel regions

This patch expands SPMDization (converting generic execution mode to SPMD for target regions) by guarding code regions that should be executed only by the main thread. Specifically, it generates guarded regions, which only the main thread executes, and the synchronization with worker threads using simple barriers. For correctness, the patch aborts SPMDization for target regions if the same code executes in a parallel region, thus must be not be guarded. This check is implemented using the ParallelLevels AA.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D106892

[DAGCombiner][AMDGPU] Canonicalize constants to the RHS of MULHU/MULHS.

This allows special constants like to 0 to be recognized. It's also
expected by isel patterns if a target had a mulh with immediate instructions.
The commuting done by tablegen won't commute patterns with immediates since it
expects DAGCombine to have done it.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D107486

[RISCV] Add test cases for conditional add/sub. NFC

InstCombine canonicalizes c ? (x+y) : x to (c ? y : 0) + x. It
does the same for and/or/xor. We already reverse this transform
for those, but don't do add/sub yet.

[nfc] [lldb] Prevent needless copies of DataExtractor

lldb_private::DataExtractor contains DataBufferSP m_data_sp which is
relatively expensive to copy (due to multi-threading locking).

llvm::DataExtractor does not have this problem as it uses StringRef
instead.

The copy constructor is explicit as otherwise it is easy to make
unintended modification of a local copy instead of a caller's instance
(D107470 but that is llvm::DataExtractor).

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D107485