review.tizen.org Git - platform/upstream/llvm.git/log

Drop REQUIRES: arm-registered-target from an IR-only test

This works just fine even if the Arm backend is not built.

[UpdateTestChecks][NFC] Drop a python2 workaround

Fix incorrect GEP bitwidth in areNonOverlapSameBaseLoadAndStore()

When using a datalayout that has pointer index width != pointer size this
code triggers an assertion in Value::stripAndAccumulateConstantOffsets().
I encountered this this while compiling FreeBSD for CHERI-RISC-V.
Also update LoadsTest.cpp to use a DataLayout with index width != pointer
width to ensure this case is tested.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D110406

[update_llc_test_checks.py] Fix MIPS ASM regex for functions with EH

On MIPS, functions with exception handling code emits an additional
temporary label at the start of the function (due to UseAssignmentForEHBegin):

    _Z8do_catchv:                           # @_Z8do_catchv
    .Ltmp3:
    .set .Lfunc_begin0, .Ltmp3
    .cfi_startproc
    .cfi_personality 128, DW.ref.__gxx_personality_v0
    .cfi_lsda 0, .Lexception0
    .frame $c11,48,$c17
    .mask 0x00000000,0
    .fmask 0x00000000,0
    .set noreorder
    .set nomacro
    .set noat
    # %bb.0:                                # %entry

The `[^:]*` regex was terminating the search after .Ltmp<N>: and therefore
not detecting functions with exception handling.

Reviewed By: atanasyan, MaskRay

Differential Revision: https://reviews.llvm.org/D100027

[update_llc_test_checks] Baseline test for D100027

Show that we fail to generate CHECK lines for MIPS64 functions with EH.

Differential Revision: https://reviews.llvm.org/D110408

[libc++] [compare] Rip out more vestiges of *_equality. NFCI.

There's really no reason to even have two different enums here,
but *definitely* we shouldn't have *three*, and they don't need
so many synonymous enumerator values.

Differential Revision: https://reviews.llvm.org/D110516

[X86][Costmodel] Load/store i16 Stride=6 VF=16 interleaving costs

The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For this tuple, measuring becomes problematic since there's a lot of spilling going on,
but apparently all these memory ops do not affect worst-case estimate at all here.

For load we have:
https://godbolt.org/z/5qGb9odP6 - for intels `Block RThroughput: <=106.0`; for ryzens, `Block RThroughput: <=34.8`
So pick cost of `106`.

For store we have:
https://godbolt.org/z/KrWcv4Ph7 - for intels `Block RThroughput: =58.0`; for ryzens, `Block RThroughput: <=20.5`
So pick cost of `58`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110593

[X86][Costmodel] Load/store i16 Stride=6 VF=8 interleaving costs

The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/3Tc5s897j - for intels `Block RThroughput: =39.0`; for ryzens, `Block RThroughput: <=13.5`
So pick cost of `39`.

For store we have:
https://godbolt.org/z/fo1h9E67e - for intels `Block RThroughput: =21.0`; for ryzens, `Block RThroughput: <=12.0`
So pick cost of `21`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110592

[X86][Costmodel] Load/store i16 Stride=6 VF=4 interleaving costs

The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/1Wcaf9c7T - for intels `Block RThroughput: =9.0`; for ryzens, `Block RThroughput: <=4.5`
So pick cost of `9`.

For store we have:
https://godbolt.org/z/1Wcaf9c7T - for intels `Block RThroughput: =15.0`; for ryzens, `Block RThroughput: <=6.0`
So pick cost of `15`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110591

[X86][Costmodel] Load/store i16 Stride=6 VF=2 interleaving costs

The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/bhscej4WM - for intels `Block RThroughput: =13.0`; for ryzens, `Block RThroughput: <=7.0`
So pick cost of `13`.

For store we have:
https://godbolt.org/z/Yf4Pfnxbq - for intels `Block RThroughput: =10.0`; for ryzens, `Block RThroughput: <=3.5`
So pick cost of `10`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D110590

[llvm-profgen][CSSPGO] On-demand function size computation for preinliner

Similar to https://reviews.llvm.org/D110465, we can compute function size on-demand for the functions that's hit by samples.

Here we leverage the raw range samples' address to compute a set of sample hit function. Then `BinarySizeContextTracker` just works on those function range for the size.

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D110466

[llvm-profgen] On-demand symbolization

Previously we do symbolization for all the functions and actually we only need the symbols that's hit by the samples.

This can significantly speed up the time for large size binary.

Optimization for per-inliner will come along with next patch.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D110465

[PowerPC] FP compare and test XL compat builtins.

This patch is in a series of patches to provide builtins for
compatability with the XL compiler. This patch adds builtins for compare
exponent and test data class operations on floating point values.

Reviewed By: #powerpc, lei

Differential Revision: https://reviews.llvm.org/D109437

[SystemZ] Remove redundant declaration SystemZMnemonicSpellCheck (NFC)

Note that SystemZMnemonicSpellCheck is defined in
SystemZGenAsmMatcher.inc, which SystemZAsmParser.cpp includes.

Identified with readability-redundant-declaration.

Revert "[CMake] Enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR by default on Linux"

See original review https://reviews.llvm.org/D107799

This reverts commit f9dbca68d48e705f6d45df8f58d6b2ee88bce76c.

tsan: print a meaningful frame for stack races

Depends on D110631.

Differential Revision: https://reviews.llvm.org/D110632

tsan: fix tls_race3 test on darwin

Darwin also needs to use __tsan_tls_initialization
to pass the test.

Differential Revision: https://reviews.llvm.org/D110631

[fir][NFC] Rename operand of EmboxOp

Rename `lenParams` to `typeparams` to be in sync with fir-dev.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D110628

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

[lldb] [unittests] Fix building the FreeBSD arm64 Register Context test

Differential Revision: https://reviews.llvm.org/D110545

tsan: fix cur_thread alignment

Commit 354ded67b3 ("tsan: align ThreadState to cache line")
did an incomplete thing. It marked ThreadState as cache line
aligned, but the thread local ThreadState instance is declared
as an aligned char array with hard-coded 64-byte alignment.
On PowerPC cache line size is 128 bytes, so the hard-coded
64-byte alignment is not enough.
Use cache line alignment consistently.

Differential Revision: https://reviews.llvm.org/D110629

[lldb] [DynamicRegisterInfo] Refactor SetRegisterInfo()

Move the "slice" and "composite" handling into separate methods to avoid
if/else hell. Use more LLVM types whenever possible. Replace printf()s
with llvm::Error combined with LLDB logging.

Differential Revision: https://reviews.llvm.org/D110619

[ARM] Delay reverting WLS in arm-block-placement

As we have to split blocks, we may be left in an invalid loop state
after a WLS is reverted to a DLS. Instead remember the WLS that could
not be fixed and revert them after finishing processing all other loops.

Differential Revision: https://reviews.llvm.org/D110567

Fix missing return from 9324cc2ca951fe5fe11c85470cb08e699c59499c

No idea how my local machine missed this, but I saw no warning for it,
it seems to have been lost in some level of translating this back for
upstreaming.

Recommit "[AArch64] Split bitmask immediate of bitwise AND operation"

This reverts the revert commit f85d8a5bed95cc17a452b6b63b9866fbf181d94d
with bug fixes.

Original message:

    MOVi32imm + ANDWrr ==> ANDWri + ANDWri
    MOVi64imm + ANDXrr ==> ANDXri + ANDXri

    The mov pseudo instruction could be expanded to multiple mov instructions later.
    In this case, try to split the constant operand of mov instruction into two
    bitmask immediates. It makes only two AND instructions intead of multiple
    mov + and instructions.

    Added a peephole optimization pass on MIR level to implement it.

    Differential Revision: https://reviews.llvm.org/D109963

[SLP]Improve vectorization of phi nodes by trying wider vectors.

Try to improve vectorization of the PHI nodes by trying to vectorize
similar instructions at the size of the widest possible vectors, then
aggregating with compatible type PHIs and trying to vectoriza again and
only if this failed, try smaller sizes of the vector factors for
compatible PHI nodes. This restores performance of several benchmarks
after tuning of the fp/int conversion instructions costs.

Differential Revision: https://reviews.llvm.org/D108740

[LoopFlatten] Updating Phi nodes after IV widening

In rG6a076fa9539e, a problem with updating the old/narrow phi nodes after IV
widening was introduced. If after widening of the IV the transformation is
*not* applied, the narrow phi node was incorrectly modified, which should only
happen if flattening happens. This can be seen in the added test widen-iv2.ll,
which incorrectly had 1 incoming value, but should have its original 2 incoming
values, which is now restored.

Differential Revision: https://reviews.llvm.org/D110234

Refine the constraint for isInlineBuiltinDeclaration

Require it to be always_inline, to more closely match how _FORITFY_SOURCE
behaves.

This avoids generation of `.inline` suffixed functions - these should always be
inlined.

[InstCombine] reduce code for swapped predicate; NFC

[InstCombine] add tests for icmp-gep; NFC

We need more coverage for commuted and (un)signed preds to
verify that things behave as expected here. Currently, we
do not transform signed preds or non-inbounds geps.

[ARM] Skip debug info in recomputeVPTBlockMask

The ARMLowOverheadLoops pass recalculates VPT block masks when it
converts VCMP's inside VPT blocks into VPT's. The function to do so
doesn't seem to handle debug info though, leading to invalid block
creation or asserts at compile time. Make sure the function skips any
debug info between the MVE instructions it inspects.

Differential Revision: https://reviews.llvm.org/D110564

Change __builtin_sycl_unique_stable_name to just use an Itanium mangling

After significant problems in our downstream with the previous
implementation, the SYCL standard has opted to make using macros/etc to
change kernel-naming-lambdas in any way UB (even passively). As a
result, we are able to just emit the itanium mangling.

However, this DOES require a little work in the CXXABI, as the microsoft
and itanium mangler use different numbering schemes for lambdas. This
patch adds a pair of mangling contexts that use the normal 'itanium'
mangling strategy to fill in the "DeviceManglingNumber" used previously
by CUDA.

Differential Revision: https://reviews.llvm.org/D110281

[InstCombine] add/move tests for icmp with gep operand(s); NFC

[Analysis] Be defensive when matching size_t in lib call signatures

When TargetLibraryInfoImpl::isValidProtoForLibFunc is checking
function signatures to detect lib calls it may check that a parameter
or return value matches with the "size_t" type. For this to work it
has to derive the IR type matching with "size_t". Depending on if
a DataLayout is provided or not, this has been done in two different
way. Either a more strict check being based on IntPtrType (which is
given by the DataLayout) or a more relaxed check assuming that any
integer type matches with "size_t".

Given that the stricter approach exist it seems like we do not want
to trigger rewrites etc if we aren't sure that a function calls
actually match with the library function. Therefore it was questioned
why we actually have the more relaxed approach when not being able
to derive an IR type for "size_t". This patch will take a more
defensive approach, requiring that a DataLayout is passed to
isValidProtoForLibFunc.

Differential Revision: https://reviews.llvm.org/D110584

[Analysis] Add FIXME:s related to size_t type checks

Differential Revision: https://reviews.llvm.org/D110583

[IR] Change the default value of InstertElement to poison (1/4)

This patch is for fixing potential insertElement-related bugs like D93818.
```
V = UndefValue::get(VecTy);
for(...)
V = Builder.CreateInsertElementy(V, Elt, Idx);
=>
V = PoisonValue::get(VecTy);
for(...)
V = Builder.CreateInsertElementy(V, Elt, Idx);
```
Like above, this patch changes the placeholder V to poison.
The patch will be separated into several commits.

Reviewed By: aqjune

Differential Revision: https://reviews.llvm.org/D110311

[SLP]No need to schedule/check parent for extract{element/value} instruction.

The instruction extractelement/extractvalue are not required to
be scheduled since they only depend on the source vector/aggregate (with
constant indices), smae applies to the parent basic block checks.
Improves compile time and saves scheduling budget.

Differential Revision: https://reviews.llvm.org/D108703

Update the message for template-template param keyword for C++17

C++17 permits using 'typename' or 'class' for a template template
parameter, but the error message in the parser only refers to 'class'.
This patch, in C++17 or newer modes, adds "or 'template'" to the
diagnostic.

[lldb/test] Remove a check from TestLoadAfterAttach

The two module retrieval methods (qXfer:libraries-svr4 and manual list
traversal) differ in how the handle the
manually-added-but-not-yet-loaded modules. The svr4 path will remove it,
while the manual one will keep in the list.

It's likely the two paths need ought to be synchronized, but right now,
this distinction is not relevant for the test.

Recommit "[Test] Add more tests with cycled phis"

Reland "[flang] GET_COMMAND_ARGUMENT runtime implementation"

Recommit https://reviews.llvm.org/D109813 and
https://reviews.llvm.org/D109814.

This implements the second and final entry point for GET_COMMAND_ARGUMENT,
handling the VALUE, STATUS and ERRMSG parameters.

It has a small fix in that we're now using memcpy instead of strncpy
(which was a bad idea to begin with, since we're not actually interested
in a string copy).

Revert "[Test] Add more tests with cycled phis"

This reverts commit 7128a545b3baa62c1164843103fb08daeba5cd9d.

Need to regenerate tests after rebase.

Revert "[AArch64] Split bitmask immediate of bitwise AND operation"

This reverts commit 864b206796ae8aa7f35f830655337751dbd9176c.

Reverting due to error on buildbots.

[CMake] Add detection for the mold linker in AddLLVM.cmake.

mold says it is compatible with GNU ld and gold linkers:

```
$ mold -v
mold 0.9.5 (compatible with GNU ld and GNU gold)
```

And thus it currently gets detected as Gold.

With the following diff, CMake now correctly reports the linker name, and mold keeps being identified as Gold internally for now.

Reviewed By: ldionne, MaskRay

Differential Revision: https://reviews.llvm.org/D110035

[lldb/test] Add ability to specify environment when spawning processes

We only had that ability for regular debugger launches. This meant that
it was not possible to use the normal dlopen patterns in attach tests.
This fixes that.

[lldb] Remove non-stop mode code

We added some support for this mode back in 2015, but the feature was
never productionized. It is completely untested, and there are known
major structural lldb issues that need to be resolved before this
feature can really be supported.

It also complicates making further changes to stop reply packet
handling, which is what I am about to do.

Differential Revision: https://reviews.llvm.org/D110553

Revert "[flang] GET_COMMAND_ARGUMENT(VALUE) runtime implementation"

This reverts commit 0446f1299f6be9fd35bc5f458c78b34dca3105f6 and
df6302311f88d0fbc666b6277d029aa371039945.

There's a warning on flang-aarch64-latest-gcc related to strncpy using
the result of strlen as a bound. I'll recommit with a fix.

[Test] Add more tests with cycled phis

Add support for `NOLINTBEGIN` ... `NOLINTEND` comments

Add support for NOLINTBEGIN ... NOLINTEND comments to suppress
clang-tidy warnings over multiple lines. All lines between the "begin"
and "end" markers are suppressed.

Example:

// NOLINTBEGIN(some-check)
<Code with warnings to be suppressed, line 1>
<Code with warnings to be suppressed, line 2>
<Code with warnings to be suppressed, line 3>
// NOLINTEND(some-check)
Follows similar syntax as the NOLINT and NOLINTNEXTLINE comments
that are already implemented, i.e. allows multiple checks to be provided
in parentheses; suppresses all checks if the parentheses are omitted,
etc.

If the comments are misused, e.g. using a NOLINTBEGIN but not
terminating it with a NOLINTEND, a clang-tidy-nolint diagnostic
message pointing to the misuse is generated.

As part of implementing this feature, the following bugs were fixed in
existing code:

IsNOLINTFound(): IsNOLINTFound("NOLINT", Str) returns true when Str is
"NOLINTNEXTLINE". This is because the textual search finds NOLINT as
the stem of NOLINTNEXTLINE.

LineIsMarkedWithNOLINT(): NOLINTNEXTLINEs on the very first line of a
file are ignored. This is due to rsplit('\n\').second returning a blank
string when there are no more newline chars to split on.

[VectorCombine] Discard ScalarizationResult state in early exit.

ScalarizationResult's destructor makes sure ToFreeze is not ignored if
set. Currently, scalarizeLoadExtract has an early exit if the index is
not safe directly. But when it is SafeWithFreeze, we need to discard the
state first, otherwise we hit the assert in the destructor.

Fixes PR51992.

[Docs][NFC] Add doxygen comment for AtomicExpandPass in passes.h

Revert "[clangd] Refactor IncludeStructure: use File (unsigned) for most computations"

This reverts

- d1c6e54930f200c40820868c086e114cee1ec693
- 7394d3ba276adeb1527428b2355a920129a2b9b1
- e50771181b7e0d96b30ee33049dc05172125b927
- 1bcd6b51a98263d440ff7549070060f7e7b0326a

Simplify handling of builtin with inline redefinition

It is a common practice in glibc header to provide an inline redefinition of an
existing function. It is especially the case for fortified function.

Clang currently has an imperfect approach to the problem, using a combination of
trivially recursive function detection and noinline attribute.

Simplify the logic by suffixing these functions by `.inline` during codegen, so
that they are not recognized as builtin by llvm.

After that patch, clang passes all tests from https://github.com/serge-sans-paille/fortify-test-suite

Differential Revision: https://reviews.llvm.org/D109967

[gn build] Port 864b206796ae

[AArch64] Split bitmask immediate of bitwise AND operation

MOVi32imm + ANDWrr ==> ANDWri + ANDWri
MOVi64imm + ANDXrr ==> ANDXri + ANDXri

The mov pseudo instruction could be expanded to multiple mov instructions later.
In this case, try to split the constant operand of mov instruction into two
bitmask immediates. It makes only two AND instructions intead of multiple
mov + and instructions.

Added a peephole optimization pass on MIR level to implement it.

Differential Revision: https://reviews.llvm.org/D109963

Fix documentation typos; NFC

Fixes bugprone-virtual-near-miss & performance-type-promotion-in-math-fn.

[clang-format][docs] mark new clang-format configuration options based on which version they would GA

Sometimes I see people unsure about which options they can use in specific versions of clang-format because
https://clang.llvm.org/docs/ClangFormatStyleOptions.html points to the latest and greatest versions.

The reality is this says its version 13.0, but actually anything we add now, will not be in 13.0 GA but
instead 14.0 GA (as 13.0 has already been branched).

How about we introduce some nomenclature to the Format.h so that we can mark which options in the
documentation were introduced for which version?

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D110432

[LiveIntervals] Fix another asan debug build failure

Call RemoveMachineInstrFromMaps before erasing instrs.
repairIntervalsInRange will do this for you after erasing the
instruction, but it's not safe to rely on it because assertions in
SlotIndexes::removeMachineInstrFromMaps refer to fields in the erased
instruction.

This fixes asan buildbot failures caused by D110335.

Investigate D110386 failures even further

[fir] Add fir.save_result op

Add the fir.save_result operation. It is use to save an
array, box, or record function result SSA-value to a memory location

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D110407

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>

Recommit "[SCEV] Look through single value PHIs." (take 2)

This reverts commit 8fdac7cb7abbeeaed016ef9eb7a087458e41e33f.

The issue causing the revert has been fixed a while ago in 60b852092c98.

Original message:

    Now that SCEVExpander can preserve LCSSA form,
    we do not have to worry about LCSSA form when
    trying to look through PHIs. SCEVExpander will take
    care of inserting LCSSA PHI nodes as required.

    This increases precision of the analysis in some cases.

    Reviewed By: mkazantsev, bmahjour

    Differential Revision: https://reviews.llvm.org/D71539

[DebugInfo] Emit DW_TAG_namelist and DW_TAG_namelist_item

This patch emits DW_TAG_namelist and DW_TAG_namelist_item for fortran
namelist variables. DICompositeType is extended to support this fortran
feature.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D108553

[fir] Update fir.insert_on_range op

Update the fir.insert_on_range operation. Add a better description,
builder and verifier.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D110389

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>

[flang] GET_COMMAND_ARGUMENT(ERRMSG) runtime implementation

Implement the final part of GET_COMMAND_ARGUMENT, i.e. the handling of
ERRMSG. This uses some of the infrastructure in stat.h and gets rid of
the magic numbers that we were using for return codes.

Differential Revision: https://reviews.llvm.org/D109814

[flang] GET_COMMAND_ARGUMENT(VALUE) runtime implementation

Partial implementation for the second entry point for
GET_COMMAND_ARGUMENT. It handles the VALUE and STATUS arguments, and
doesn't touch ERRMSG.

Differential Revision: https://reviews.llvm.org/D109813

[flang] GET_COMMAND_ARGUMENT(LENGTH) runtime implementation

Implement the ArgumentLength entry point of GET_COMMAND_ARGUMENT. Also
introduce a fixture for the tests.

Note that this also changes the interface for ArgumentLength from
returning a 4-byte integer to returning an 8-byte integer.

Differential Revision: https://reviews.llvm.org/D109227

Investigate D110386 Windows failures

Add more information for test failures inspection.

[mlir] Add min/max operations to Standard.

[RFC: Add min/max ops](https://llvm.discourse.group/t/rfc-add-min-max-operations/4353)

I was following the naming style for Arith dialect in
https://reviews.llvm.org/D110200,
i.e. similar to DivSIOp and DivUIOp I defined MaxSIOp, MaxUIOp.

When Arith PR is landed, I will migrate these ops as well.

Differential Revision: https://reviews.llvm.org/D110540

[LiveIntervals] Repair subreg ranges in processTiedPairs

In TwoAddressInstructionPass::processTiedPairs, update subranges of the
live interval for RegB as well as the main range.

This is a small step towards switching TwoAddressInstructionPass over
from LiveVariables to LiveIntervals. Currently this path is only tested
if you explicitly enable -early-live-intervals.

Differential Revision: https://reviews.llvm.org/D110526

[LiveIntervals] Improve repair after convertToThreeAddress

After TwoAddressInstructionPass calls
TargetInstrInfo::convertToThreeAddress, improve the LiveIntervals repair
to cope with convertToThreeAddress creating more than one new
instruction.

This mostly seems to benefit X86. For example in
test/CodeGen/X86/zext-trunc.ll it converts:

  %4:gr32 = ADD32rr %3:gr32(tied-def 0), %2:gr32, implicit-def dead $eflags

to:

  undef %6.sub_32bit:gr64 = COPY %3:gr32
  undef %7.sub_32bit:gr64_nosp = COPY %2:gr32
  %4:gr32 = LEA64_32r killed %6:gr64, 1, killed %7:gr64_nosp, 0, $noreg

Differential Revision: https://reviews.llvm.org/D110335

Fix URLs to the prod/staging buildbot master in the doc

Differential Revision: https://reviews.llvm.org/D110565

Attempt to fix Windows builds after D110386

http://45.33.8.238/win/46013/summary.html

[clangd] Refactor IncludeStructure: use File (unsigned) for most computations

Preparation for D108194.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D110386

[X86][FP16] Fix a bug when Combine the FADD(A, FMA(B, C, 0)) to FMA(B, C, A).

This bug was introduced by D109953. The operand order of generated FMA
is wrong.

Differential Revision: https://reviews.llvm.org/D110606

[ORC] Fix the LLJITWithRemoteDebugging example.

This was broken by the switch from JITTargetAddress to ExecutorAddr in
21a06254a3a.

[ISel] Legalized arithmetic.fence.f128 for 32-bits target

Reviewed By: Craig Topper, Wang Pengfei

Differential Revision: https://reviews.llvm.org/D110467

[LoopPred Test] Fix lld-x86_64-win BB failure

Need a more general CHECK line for testcase in 5df9112 for correctly
handling lld-x86_64-win buildbot.

Revert "tsan: fix trace tests on darwin"

This reverts commit 94ea36649ecc854d290c6797e6adb91bdfac756d.

Reverting due to errors on buildbots.

Reland "[LoopPredication] Add testcase showing BPI computation. NFC"

This relands commit 16a62d4f.
Relanded after fixing CHECK-LINES for opt pipeline output to be more
general (based on failures seen in buildbot).

clang-format

[llvm-jitlink] Add more information about allocation failures.

Slab allocator failures will now report requested size and remaining capacity.

[PowerPC] MMA - Add __builtin_vsx_build_pair and __builtin_mma_build_acc builtins

This patch adds the following built-ins:

__builtin_vsx_build_pair
__builtin_mma_build_acc

Reviewed By: #powerpc, nemanjai, lei

Differential Revision: https://reviews.llvm.org/D107647

[ORC] Switch from JITTargetAddress to ExecutorAddr for EPC-call APIs.

Part of the ongoing move to ExecutorAddr.

[Polly] Reject regions entered by an indirectbr/callbr.

SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimzed code.

This patches rejects regions entered by an indirectbr/callbr to not fail
later at code generation.

This fixes llvm.org/PR51964

Recommit with "REQUIRES: asserts" in test that uses statistics.

[libc++][NFC] s/enable_if<...>::type/enable_if_t<...> in span

There is some use of `enable_if<...>::type` when the rest of the file
uses `enable_if_t`. So, use `enable_if_t` consistently throughout.

Revert "[Polly] Reject reject regions entered by an indirectbr/callbr."

This reverts commit 91f46bb77e6d56955c3b96e9e844ae6a251c41e9 which
causes test failures when assertions are off.

[ORC] Hold shared_ptr<SymbolStringPool> in errors containing SymbolStringPtrs.

This allows these error values to remain valid, even if they tear down the JIT
itself.

[CodeMoverUtils] Enhance isSafeToMoveBefore() when control flow equivalence is satisfied

With improved analysis in determining CFG equivalence that does
not require strict dominance and post-dominance conditions, we
now relax  isSafeToMoveBefore() such that an instruction I can
be moved before InsertPoint even if they do not strictly dominate
each other, as long as they follow the same control flow path.

For example,  we can move Instruction 0 before Instruction 1,
and vice versa.

```
if (cond1)
   // Instruction 0: %add = add i32 1, 2
if (cond1)
   // Instruction 1: %add2 = add i32 2, 1
```

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D110456

Revert "tsan: add a test for stack init race"

This reverts commit b72176b9bc06146d12e495167977effe050dc326.

Broke bot: https://lab.llvm.org/buildbot/#/builders/70/builds/12193

[gn build] Port 6cfb4d46bae1

[llvm-readobj] Support dumping of MSP430 ELF attributes

The MSP430 ABI supports build attributes for specifying
the ISA, code model, data model and enum size in ELF object files.

Differential Revision: https://reviews.llvm.org/D107969

[libomptarget][amdgpu] Follow on to D110513, empty kernarg pools are not fatal

[libomptarget][amdgpu] Report zero devices if plugin construction fails, instead of segv

Revert "[LoopPredication] Add testcase showing BPI computation. NFC"

This reverts commit 16a62d4f3dca189b0e0565c7ebcd83ddfcc67629.

Needs some update to check lines to fix bb failure.

[libc++] Do not enable P1951 before C++23, since it's a breaking change

In reaction to the issues raised by Richard in https://llvm.org/D109066,
this commit does not apply P1951 as a DR in previous standard modes,
since it breaks valid code.

I do believe it should be applied as a DR, however ideally we'd get some
sort of statement from the Committee to this effect (and all implementations
would behave consistently). In the meantime, only implement P1951 starting
with C++23 -- we can always come back and apply it as a DR if that's what
the Committee says.

Differential Revision: https://reviews.llvm.org/D110347

[LoopPredication] Add testcase showing BPI computation. NFC

Precommit testcase for D110438. Since we do not preserve BPI in loop
pass manager, we are forced to compute BPI everytime Loop predication is
invoked.
The patch referenced changes that behaviour by preserving lossy BPI for
loop passes.

[X86] Add slow/fast pmulld test coverage to vector-mul.ll

[gwp-asan] Initialize AllocatorVersionMagic at runtime

GWP-ASan's `AllocatorState` was recently extended with a
`AllocatorVersionMagic` structure required so that GWP-ASan bug reports
can be understood by tools at different versions.

On Fuchsia, this in included in the `scudo::Allocator` structure, and
by having non-zero initializers, this effectively moved the static
allocator structure from the `.bss` segment to the `.data` segment, thus
increasing (significantly) the size of the libc.

This CL proposes to initialize the structure with its magic numbers at
runtime, allowing for the allocator to go back into the `.bss` segment.

I will work on adding a test on the Scudo side to ensure that this type
of changes get detected early on. Additional work is also needed to
reduce the footprint of the (large) memory-tagging related structures
that are currently part of the allocator.

Differential Revision: https://reviews.llvm.org/D110575

[NFC][X86] Add 'gather' optsize/minsize test coverage

[NFC] [PSI] explain encoding of PercentileCutoff.

Reviewed By: mtrofin, davidxl

Differential Revision: https://reviews.llvm.org/D109764

[Driver] Remove confusing *-linux-android detection with non-android --target=

These values allow, for example, `--target=aarch64` and
`--target=aarch64-linux-gnu` to detect `aarch64-linux-android`. This is
confusing. Users should specify `--target=aarch64-linux-android` to get Android GCC
installation.

Reverts D53463.

Reviewed By: nickdesaulniers, danalbert

Differential Revision: https://reviews.llvm.org/D110379