platform/upstream/llvm.git
3 years ago[flang] TRANSFER() intrinsic function
peter klausler [Fri, 2 Apr 2021 16:30:31 +0000 (09:30 -0700)]
[flang] TRANSFER() intrinsic function

API, implementation, and unit tests for the intrinsic
function TRANSFER.

Differential Revision: https://reviews.llvm.org/D99799

3 years ago[RISCV] Improve 64-bit integer constant materialization for more cases.
Craig Topper [Fri, 2 Apr 2021 17:17:54 +0000 (10:17 -0700)]
[RISCV] Improve 64-bit integer constant materialization for more cases.

For positive constants we try shifting left to remove leading zeros
and fill the bottom bits with 1s. We then materialize that constant
and shift it right.

This patch adds a new strategy to try filling the bottom bits with
zeros instead. This catches some additional cases.
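
The two strategies described above can be sketched in a few lines. This is an illustrative model only, not the RISC-V backend code: the helper names (`leading_zeros64`, `shifted_candidates`, `logical_shift_right`) are hypothetical, and the sketch does not try to decide which candidate is cheaper to materialize.

```python
MASK64 = (1 << 64) - 1


def leading_zeros64(value):
    """Number of leading zero bits in a 64-bit positive constant."""
    return 64 - value.bit_length()


def shifted_candidates(value):
    """Return (shift, fill_ones, fill_zeros) for a positive 64-bit constant.

    Both candidates are the constant shifted left until its top bit is set;
    they differ only in whether the vacated low bits are 1s or 0s.
    """
    assert 0 < value <= MASK64
    shift = leading_zeros64(value)
    shifted = (value << shift) & MASK64
    fill_ones = shifted | ((1 << shift) - 1)   # existing strategy
    fill_zeros = shifted                       # strategy added by the patch
    return shift, fill_ones, fill_zeros


def logical_shift_right(value, shift):
    """Model of a 64-bit logical right shift."""
    return (value & MASK64) >> shift
```

Either candidate recovers the original constant after the right shift, because the filled low bits are shifted out; the point of trying both is that one of them may need fewer instructions to build.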

3 years ago[RISCV] Add missing CHECK-EXPAND line to one case in rv64i-aliases-valid.s.
Craig Topper [Fri, 2 Apr 2021 17:16:49 +0000 (10:16 -0700)]
[RISCV] Add missing CHECK-EXPAND line to one case in rv64i-aliases-valid.s.

Use -NEXT to protect against other missing lines.

3 years ago[InstCombine] fold not+or+neg
Sanjay Patel [Fri, 2 Apr 2021 15:57:34 +0000 (11:57 -0400)]
[InstCombine] fold not+or+neg

~((-X) | Y) --> (X - 1) & (~Y)

We generally prefer 'add' over 'sub', this reduces the
dependency chain, and this looks better for codegen on
x86, ARM, and AArch64 targets.

https://llvm.org/PR45755

https://alive2.llvm.org/ce/z/cxZDSp
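
As a quick sanity check of the identity above (separate from the Alive2 proof), the fold can be tested exhaustively at a small bit width. The function name `fold_holds` is hypothetical, and the sketch models two's-complement wraparound by masking.

```python
def fold_holds(bits=8):
    """Check ~((-X) | Y) == (X - 1) & (~Y) for all X, Y of the given width."""
    mask = (1 << bits) - 1
    for x in range(1 << bits):
        for y in range(1 << bits):
            before = ~(((-x) & mask) | y) & mask
            after = ((x - 1) & mask) & (~y & mask)
            if before != after:
                return False
    return True
```

The identity follows from De Morgan's law plus `~(-X) == X - 1` in two's complement.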

3 years ago[InstCombine] add tests for not+or+neg; NFC
Sanjay Patel [Fri, 2 Apr 2021 15:38:24 +0000 (11:38 -0400)]
[InstCombine] add tests for not+or+neg; NFC

https://llvm.org/PR45755

3 years ago[GVNSink] auto-generate test checks; NFC
Sanjay Patel [Thu, 1 Apr 2021 18:13:24 +0000 (14:13 -0400)]
[GVNSink] auto-generate test checks; NFC

3 years ago[SCCP] Avoid modifying AdditionalUsers while iterating over it
Dimitry Andric [Sun, 14 Mar 2021 16:41:21 +0000 (17:41 +0100)]
[SCCP] Avoid modifying AdditionalUsers while iterating over it

When run under valgrind, or with a malloc that poisons freed memory,
this can lead to segfaults or other problems.

To avoid modifying the AdditionalUsers DenseMap while still iterating,
save the instructions to be notified in a separate SmallPtrSet, and use
this to later call OperandChangedState on each instruction.
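
The pattern is the usual "snapshot, then notify" idiom. Below is a minimal Python analogue, assuming a plain dict and set in place of the DenseMap and SmallPtrSet; `notify_additional_users` and the callback are hypothetical stand-ins, not the SCCP code.

```python
def notify_additional_users(additional_users, changed_value, on_operand_changed):
    """Notify every user registered for changed_value.

    The users are copied into a separate set first, so the callback is free
    to add entries to additional_users without invalidating the iteration.
    """
    to_notify = set(additional_users.get(changed_value, ()))
    for user in to_notify:
        on_operand_changed(user)
    return to_notify
```

Iterating over the map's own storage while the callback inserts into it is what the commit avoids; in Python the equivalent mistake typically raises a `RuntimeError`, while in C++ it can silently touch freed memory.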

Fixes PR49582.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D98602

3 years ago[gn build] add build file for tsan runtime
Nico Weber [Fri, 2 Apr 2021 14:37:01 +0000 (10:37 -0400)]
[gn build] add build file for tsan runtime

Linux-only for now. Some mac bits stubbed out, but not tested.

Good enough for the tiny_race.c example at
https://clang.llvm.org/docs/ThreadSanitizer.html :

   $ out/gn/bin/clang -fsanitize=thread -g -O1 tiny_race.c
   $ while true; do ./a.out || echo $? ; done

While here, also make `-fsanitize=address` work for .c files.

Differential Revision: https://reviews.llvm.org/D99795

3 years ago[LV] Hoist mapping of IR operands to VPValues (NFC).
Florian Hahn [Fri, 2 Apr 2021 12:28:44 +0000 (13:28 +0100)]
[LV] Hoist mapping of IR operands to VPValues (NFC).

This patch moves mapping of IR operands to VPValues out of
tryToCreateWidenRecipe. This allows using existing VPValue operands when
widening recipes directly, which will be introduced in future patches.

3 years ago[rs4gc] Use loops instead of straightline code for attribute stripping [nfc]
Philip Reames [Fri, 2 Apr 2021 16:23:50 +0000 (09:23 -0700)]
[rs4gc] Use loops instead of straightline code for attribute stripping [nfc]

Mostly because I'm about to add more attributes and the straightline copies get much uglier.  What's currently there isn't too bad.

3 years ago[lld-macho][NFC] Remove redundant member from class Defined
Greg McGary [Fri, 2 Apr 2021 01:42:44 +0000 (18:42 -0700)]
[lld-macho][NFC] Remove redundant member from class Defined

`class Symbol` defines a data member `InputFile *file;`
`class Defined` inherits from `Symbol` and also defines a data member `InputFile *file;` for no apparent purpose.

Differential Revision: https://reviews.llvm.org/D99783

3 years ago[rs4gc] Strip nofree and nosync attributes when lowering from abstract model
Philip Reames [Fri, 2 Apr 2021 16:12:24 +0000 (09:12 -0700)]
[rs4gc] Strip nofree and nosync attributes when lowering from abstract model

The safepoints being inserted exist to free memory, or coordinate with another thread to do so.  Thus, we must strip any inferred attributes and reinfer them after the lowering.

I'm not aware of any active miscompiles caused by this, but since I'm working on strengthening inference of both and leveraging them in the optimization decisions, I figured a bit of future proofing was warranted.

3 years ago[rs4gc] add tests for existing code stripping attributes from function signatures
Philip Reames [Fri, 2 Apr 2021 15:56:49 +0000 (08:56 -0700)]
[rs4gc] add tests for existing code stripping attributes from function signatures

3 years agoRemove attribute handling code for simple attributes; NFC
Aaron Ballman [Fri, 2 Apr 2021 15:33:04 +0000 (11:33 -0400)]
Remove attribute handling code for simple attributes; NFC

Attributes that set the SimpleHandler flag in Attr.td don't need to be
explicitly handled in SemaDeclAttr.cpp.

3 years ago[flang] Fix MSVC build breakage
peter klausler [Fri, 2 Apr 2021 15:26:39 +0000 (08:26 -0700)]
[flang] Fix MSVC build breakage

A recent patch exposed an assumption that "long double" is (at least)
an 80-bit floating-point type, which of course it is not in MSVC.
Also get it right for non-x87 floating-point.

3 years ago[GlobalISel] Allow different types for G_SBFX and G_UBFX operands
Brendon Cahoon [Tue, 30 Mar 2021 15:19:29 +0000 (11:19 -0400)]
[GlobalISel] Allow different types for G_SBFX and G_UBFX operands

Change the definition of G_SBFX and G_UBFX so that the lsb and width
can have different types than the src and dst operands.
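
For readers unfamiliar with these opcodes, here is a rough model of bitfield extraction, based on my understanding that G_UBFX zero-extends and G_SBFX sign-extends the extracted field. The function names and the `src_bits` parameter are assumptions for illustration; Python integers are untyped, so the operand-type change itself is not modeled.

```python
def ubfx(src, lsb, width, src_bits=32):
    """Unsigned bitfield extract: width bits of src starting at bit lsb."""
    assert 0 <= lsb < lsb + width <= src_bits
    src &= (1 << src_bits) - 1
    return (src >> lsb) & ((1 << width) - 1)


def sbfx(src, lsb, width, src_bits=32):
    """Signed bitfield extract: like ubfx, but sign-extends the field."""
    field = ubfx(src, lsb, width, src_bits)
    sign_bit = 1 << (width - 1)
    return (field ^ sign_bit) - sign_bit
```

In this model `lsb` and `width` are small indices, which is one reason it can be convenient for them to use a narrower type than the source and destination.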

Differential Revision: https://reviews.llvm.org/D99739

3 years ago[LVI] Use range metadata on intrinsics
Nikita Popov [Fri, 2 Apr 2021 14:41:58 +0000 (16:41 +0200)]
[LVI] Use range metadata on intrinsics

If we don't know how to handle an intrinsic, we should still
make use of normal call range metadata.

3 years ago[CVP] Add test for !range on intrinsic (NFC)
Nikita Popov [Fri, 2 Apr 2021 14:38:38 +0000 (16:38 +0200)]
[CVP] Add test for !range on intrinsic (NFC)

3 years ago[SLP]Added a test for min/max reductions with the key store inside, NFC.
Alexey Bataev [Fri, 2 Apr 2021 14:37:40 +0000 (07:37 -0700)]
[SLP]Added a test for min/max reductions with the key store inside, NFC.

3 years ago[SLP]Fix a bug in min/max reduction, number of condition uses.
Alexey Bataev [Thu, 1 Apr 2021 18:02:34 +0000 (11:02 -0700)]
[SLP]Fix a bug in min/max reduction, number of condition uses.

The ultimate reduction node may have multiple uses, but if the ultimate
reduction is a min/max reduction based on SelectInstruction, the
condition of this select instruction must have only a single use.

Differential Revision: https://reviews.llvm.org/D99753

3 years agoRevert "[X86][SSE] isHorizontalBinOp - use getTargetShuffleInputs helper"
Nico Weber [Fri, 2 Apr 2021 13:54:22 +0000 (09:54 -0400)]
Revert "[X86][SSE] isHorizontalBinOp - use getTargetShuffleInputs helper"

This reverts commit 500969f1d0b1d92d7c4ccfb6bf8807de96b7e4a0.
It makes clang assert when compiling AVX2 code, see
https://bugs.chromium.org/p/chromium/issues/detail?id=1195353#c4
for a standalone repro.

3 years ago[TableGen] [Docs] Add lldb-tblgen to command guide; add 4 guide stubs
Paul C. Anagnostopoulos [Tue, 30 Mar 2021 16:37:13 +0000 (12:37 -0400)]
[TableGen] [Docs] Add lldb-tblgen to command guide; add 4 guide stubs

Differential Revision: https://reviews.llvm.org/D99605

3 years agoRestore 8954fd436c7 after c06a8f9caa51c
Nico Weber [Fri, 2 Apr 2021 13:19:31 +0000 (09:19 -0400)]
Restore 8954fd436c7 after c06a8f9caa51c

Else, just-built clang can't build programs that include libc++ headers
on macOS if you build via the 'all' target.

3 years ago[NFC][SVE] update sve-intrinsics-int-arith.ll under update_llc_test_checks.py
Jun Ma [Fri, 2 Apr 2021 02:10:35 +0000 (10:10 +0800)]
[NFC][SVE] update sve-intrinsics-int-arith.ll under update_llc_test_checks.py

3 years ago[AArch64][SVE] Lowering sve.dot to DOT node
Jun Ma [Thu, 1 Apr 2021 11:44:59 +0000 (19:44 +0800)]
[AArch64][SVE] Lowering sve.dot to DOT node

Differential Revision: https://reviews.llvm.org/D99699

3 years ago[NFC][SVE] Use SVE_4_Op_Imm_Pat for sve_intx_dot_by_indexed_elem
Jun Ma [Wed, 31 Mar 2021 09:52:22 +0000 (17:52 +0800)]
[NFC][SVE] Use SVE_4_Op_Imm_Pat for sve_intx_dot_by_indexed_elem

3 years ago[mlir][spirv] Add utilities for push constant value
Lei Zhang [Fri, 2 Apr 2021 11:50:45 +0000 (07:50 -0400)]
[mlir][spirv] Add utilities for push constant value

This commit adds utility functions for creating a push constant
storage variable and loading values from it.

Along the way, it performs some cleanup:

* Deleted `setABIAttrs`, which is just a 4-liner function
  with one user.
* Moved `SPIRVConversionTarget` into `mlir` namespace,
  to be consistent with `SPIRVTypeConverter` and
  `LLVMConversionTarget`.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D99725

3 years ago[InstCombine] Fix out-of-bounds ashr(shl) optimization
Jeroen Dobbelaere [Fri, 2 Apr 2021 11:45:11 +0000 (13:45 +0200)]
[InstCombine] Fix out-of-bounds ashr(shl) optimization

This fixes a crash found by the oss fuzzer and reported by @fhahn.
The suggestion of @RKSimon seems to be the correct fix here. (See D91343).

The oss fuzz report can be found here: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32759

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D99792

3 years ago[RISCV] Test llvm.experimental.vector.insert intrinsics on RV32
Fraser Cormack [Wed, 31 Mar 2021 11:59:22 +0000 (12:59 +0100)]
[RISCV] Test llvm.experimental.vector.insert intrinsics on RV32

RV32 is able to use the llvm.experimental.vector.insert intrinsics too.
This patch ensures they're tested.

Reviewed By: khchen, asb

Differential Revision: https://reviews.llvm.org/D99655

3 years ago[LLDB] Skip TestLoadUsingLazyBind.py on arm/linux
Muhammad Omair Javaid [Fri, 2 Apr 2021 10:54:27 +0000 (15:54 +0500)]
[LLDB] Skip TestLoadUsingLazyBind.py on arm/linux

3 years ago[X86][SSE] isHorizontalBinOp - use getTargetShuffleInputs helper
Simon Pilgrim [Fri, 2 Apr 2021 08:43:51 +0000 (09:43 +0100)]
[X86][SSE] isHorizontalBinOp - use getTargetShuffleInputs helper

Use the getTargetShuffleInputs helper for all shuffle decoding

3 years ago[gn build] Port 0f7bbbc481e2
LLVM GN Syncbot [Fri, 2 Apr 2021 10:22:54 +0000 (10:22 +0000)]
[gn build] Port 0f7bbbc481e2

3 years agoAlways emit error for wrong interfaces to scalable vectors, unless cmdline flag is...
Sander de Smalen [Wed, 17 Mar 2021 21:35:09 +0000 (21:35 +0000)]
Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed.

In order to bring up scalable vector support in LLVM incrementally,
we introduced behaviour to emit a warning, instead of an error, when
asking the wrong question of a scalable vector, like asking for the
fixed number of elements.

This patch puts that behaviour under a flag. The default behaviour is
that the compiler will always error, which means that all LLVM unit
tests and regression tests will now fail when a code-path is taken that
still uses the wrong interface.

The behaviour to demote an error to a warning can be individually enabled
for tools that want to support experimental use of scalable vectors.
This patch enables that behaviour when driving compilation from Clang.
This means that for users who want to try out scalable-vector support,
fixed-width codegen support, or build user-code with scalable vector
intrinsics, Clang will not crash and burn when the compiler encounters
such a case.

This allows us to do away with the following pattern in many of the SVE tests:
  RUN: .... 2>%t
  RUN: cat %t | FileCheck --check-prefix=WARN
  WARN-NOT: warning: ...

The behaviour to emit warnings is only temporary and we expect this flag
to be removed in the future when scalable vector support is more stable.

This patch also fixes the following tests:
 unittests:
   ScalableVectorMVTsTest.SizeQueries
   SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects
   AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG

 regression tests:
   Transforms/InstCombine/vscale_gep.ll

Reviewed By: paulwalker-arm, ctetreau

Differential Revision: https://reviews.llvm.org/D98856

3 years ago[SLP] Better estimate cost of no-op extracts on target vectors.
Florian Hahn [Fri, 2 Apr 2021 09:40:12 +0000 (10:40 +0100)]
[SLP] Better estimate cost of no-op extracts on target vectors.

The motivation for this patch is to better estimate the cost of
extractelement instructions in cases where they are going to be free,
because the source vector can be used directly.

A simple example is

    %v1.lane.0 = extractelement <2 x double> %v.1, i32 0
    %v1.lane.1 = extractelement <2 x double> %v.1, i32 1

    %a.lane.0 = fmul double %v1.lane.0, %x
    %a.lane.1 = fmul double %v1.lane.1, %y

Currently we only consider the extracts free, if there are no other
users.

In this particular case, on AArch64 which can fit <2 x double> in a
vector register, the extracts should be free, independently of other
users, because the source vector of the extracts will be in a vector
register directly, so it should be free to use the vector directly.

The SLP vectorized version of noop_extracts_9_lanes is 30%-50% faster on
certain AArch64 CPUs.

It looks like this does not impact any code in
SPEC2000/SPEC2006/MultiSource both on X86 and AArch64 with -O3 -flto.

This originally regressed after D80773, so if there's a better
alternative to explore, I'd be more than happy to do that.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99719

3 years ago[RISCV] Optimize more redundant VSETVLIs
Fraser Cormack [Thu, 1 Apr 2021 14:17:32 +0000 (15:17 +0100)]
[RISCV] Optimize more redundant VSETVLIs

D99717 introduced some test cases which showed that the output of one
vsetvli into another would not be picked up by the RISCVCleanupVSETVLI
pass. This patch teaches the optimization about such a pattern. The
pattern is quite common when using the RVV vsetvli intrinsic to pass the
VL onto other intrinsics.

The second test case introduced by D99717 is left unoptimized by this
patch. It is a rarer case and will require us to rewire any uses of the
redundant vset[i]vli's output to the previous one's.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99730

3 years ago[RISCV] Add some tests showing vsetvli cleanup opportunities
Fraser Cormack [Thu, 1 Apr 2021 10:24:42 +0000 (11:24 +0100)]
[RISCV] Add some tests showing vsetvli cleanup opportunities

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D99717

3 years ago[libc++] Fix build on macOS older than 10.15.
Marek Kurdej [Fri, 2 Apr 2021 08:31:18 +0000 (10:31 +0200)]
[libc++] Fix build on macOS older than 10.15.

* The breakage was introduced in D99515, which added the -Wundef flag. CI runs on macOS 10.15, so this problem wasn't caught before.

3 years ago[NARY-REASSOCIATE] Support reassociation of min/max
Evgeniy Brevnov [Tue, 2 Mar 2021 08:14:32 +0000 (15:14 +0700)]
[NARY-REASSOCIATE] Support reassociation of min/max

Support reassociation for min/max. With that we should be able to transform min(min(a, b), c) -> min(min(a, c), b) if min(a, c) is already available.
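
A small sketch of why the rewrite is safe and when it pays off, assuming plain integer min. The `available` dictionary is a hypothetical stand-in for "expressions already computed"; it is not how the pass tracks them.

```python
def reassociated_min(a, b, c, available):
    """Compute min(min(a, b), c), reusing min(a, c) when it already exists.

    available maps ("min", x, y) keys to previously computed results.
    """
    key = ("min", a, c)
    if key in available:
        # min(min(a, b), c) == min(min(a, c), b), so only one new min is needed.
        return min(available[key], b)
    return min(min(a, b), c)
```

The rewrite relies on min being associative and commutative, so the grouping of the three operands does not change the result.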

Reviewed By: mkazantsev, lebedev.ri

Differential Revision: https://reviews.llvm.org/D88287

3 years ago[PassManager] Run additional LICM before LoopRotate
Roman Lebedev [Fri, 2 Apr 2021 07:40:12 +0000 (10:40 +0300)]
[PassManager] Run additional LICM before LoopRotate

Loop rotation often has to perform code duplication
from header into preheader, which introduces PHI nodes.

>>! In D99204, @thopre wrote:
>
> With loop peeling, it is important that unnecessary PHIs be avoided or
> it will lead to spurious peeling. One source of such PHIs is loop
> rotation which creates PHIs for invariant loads. Those PHIs are
> particularly problematic since loop peeling is now run as part of simple
> loop unrolling before GVN is run, and are thus a source of spurious
> peeling.
>
> Note that while some of the loads can be hoisted and eventually
> eliminated by instruction combine, this is not always possible due to
> alignment issues. In particular, the motivating example [1] was a load
> inside a class instance which cannot be hoisted because the `this'
> pointer has an alignment of 1.
>
> [1] http://lists.llvm.org/pipermail/llvm-dev/attachments/20210312/4ce73c47/attachment.cpp

Now, we could enhance LoopRotate to avoid duplicating code when not needed,
but instead hoist loop-invariant code, but isn't that a code duplication? (*sic*)
We have LICM, and in fact we already run it right after LoopRotation.

We could try to move it to before LoopRotation,
that is basically free from compile-time perspective:
https://llvm-compile-time-tracker.com/compare.php?from=6c93eb4477d88af046b915bc955c03693b2cbb58&to=a4bee6d07732b1184c436da489040b912f0dc271&stat=instructions
But, looking at stats, i think it isn't great that we would no longer do LICM after LoopRotation, in particular:
| statistic name                                   | LoopRotate-LICM | LICM-LoopRotate |     Δ |       % | abs(%) |
| asm-printer.EmittedInsts                         | 9015930         | 9015799         |  -131 |   0.00% |  0.00% |
| indvars.NumElimCmp                               | 3536            | 3544            |     8 |   0.23% |  0.23% |
| indvars.NumElimExt                               | 36725           | 36580           |  -145 |  -0.39% |  0.39% |
| indvars.NumElimIV                                | 1197            | 1187            |   -10 |  -0.84% |  0.84% |
| indvars.NumElimIdentity                          | 143             | 136             |    -7 |  -4.90% |  4.90% |
| indvars.NumElimRem                               | 4               | 5               |     1 |  25.00% | 25.00% |
| indvars.NumLFTR                                  | 29842           | 29890           |    48 |   0.16% |  0.16% |
| indvars.NumReplaced                              | 2293            | 2227            |   -66 |  -2.88% |  2.88% |
| indvars.NumSimplifiedSDiv                        | 6               | 8               |     2 |  33.33% | 33.33% |
| indvars.NumWidened                               | 26438           | 26329           |  -109 |  -0.41% |  0.41% |
| instcount.TotalBlocks                            | 1178338         | 1173840         | -4498 |  -0.38% |  0.38% |
| instcount.TotalFuncs                             | 111825          | 111829          |     4 |   0.00% |  0.00% |
| instcount.TotalInsts                             | 9905442         | 9896139         | -9303 |  -0.09% |  0.09% |
| lcssa.NumLCSSA                                   | 425871          | 423961          | -1910 |  -0.45% |  0.45% |
| licm.NumHoisted                                  | 378357          | 378753          |   396 |   0.10% |  0.10% |
| licm.NumMovedCalls                               | 2193            | 2208            |    15 |   0.68% |  0.68% |
| licm.NumMovedLoads                               | 35899           | 31821           | -4078 | -11.36% | 11.36% |
| licm.NumPromoted                                 | 11178           | 11154           |   -24 |  -0.21% |  0.21% |
| licm.NumSunk                                     | 13359           | 13587           |   228 |   1.71% |  1.71% |
| loop-delete.NumDeleted                           | 8547            | 8402            |  -145 |  -1.70% |  1.70% |
| loop-instsimplify.NumSimplified                  | 12876           | 11890           |  -986 |  -7.66% |  7.66% |
| loop-peel.NumPeeled                              | 1008            | 925             |   -83 |  -8.23% |  8.23% |
| loop-rotate.NumNotRotatedDueToHeaderSize         | 368             | 365             |    -3 |  -0.82% |  0.82% |
| loop-rotate.NumRotated                           | 42015           | 42003           |   -12 |  -0.03% |  0.03% |
| loop-simplifycfg.NumLoopBlocksDeleted            | 240             | 242             |     2 |   0.83% |  0.83% |
| loop-simplifycfg.NumLoopExitsDeleted             | 497             | 20              |  -477 | -95.98% | 95.98% |
| loop-simplifycfg.NumTerminatorsFolded            | 618             | 336             |  -282 | -45.63% | 45.63% |
| loop-unroll.NumCompletelyUnrolled                | 11028           | 11032           |     4 |   0.04% |  0.04% |
| loop-unroll.NumUnrolled                          | 12608           | 12529           |   -79 |  -0.63% |  0.63% |
| mem2reg.NumDeadAlloca                            | 10222           | 10221           |    -1 |  -0.01% |  0.01% |
| mem2reg.NumPHIInsert                             | 192110          | 192106          |    -4 |   0.00% |  0.00% |
| mem2reg.NumSingleStore                           | 637650          | 637643          |    -7 |   0.00% |  0.00% |
| scalar-evolution.NumBruteForceTripCountsComputed | 814             | 812             |    -2 |  -0.25% |  0.25% |
| scalar-evolution.NumTripCountsComputed           | 283108          | 282934          |  -174 |  -0.06% |  0.06% |
| scalar-evolution.NumTripCountsNotComputed        | 106712          | 106718          |     6 |   0.01% |  0.01% |
| simple-loop-unswitch.NumBranches                 | 5178            | 4752            |  -426 |  -8.23% |  8.23% |
| simple-loop-unswitch.NumCostMultiplierSkipped    | 914             | 503             |  -411 | -44.97% | 44.97% |
| simple-loop-unswitch.NumSwitches                 | 20              | 18              |    -2 | -10.00% | 10.00% |
| simple-loop-unswitch.NumTrivial                  | 183             | 95              |   -88 | -48.09% | 48.09% |

... but that actually regresses LICM (-11.36% `licm.NumMovedLoads`),
loop-simplifycfg (`NumLoopExitsDeleted`, `NumTerminatorsFolded`),
simple-loop-unswitch (`NumTrivial`).
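
For reference, the Δ and % columns in these tables are the absolute and relative change from the first configuration to the second. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def stat_delta(before, after):
    """Return (absolute change, percent change rounded to 2 places)."""
    delta = after - before
    percent = round(100.0 * delta / before, 2)
    return delta, percent
```

For example, `licm.NumMovedLoads` going from 35899 to 31821 is a change of -4078, or -11.36%.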

What if we instead have LICM both before and after LoopRotate?
| statistic name                                | LoopRotate-LICM | LICM-LoopRotate-LICM |     Δ |       % | abs(%) |
| asm-printer.EmittedInsts                      | 9015930         | 9014474              | -1456 |  -0.02% |  0.02% |
| indvars.NumElimCmp                            | 3536            | 3546                 |    10 |   0.28% |  0.28% |
| indvars.NumElimExt                            | 36725           | 36681                |   -44 |  -0.12% |  0.12% |
| indvars.NumElimIV                             | 1197            | 1185                 |   -12 |  -1.00% |  1.00% |
| indvars.NumElimIdentity                       | 143             | 146                  |     3 |   2.10% |  2.10% |
| indvars.NumElimRem                            | 4               | 5                    |     1 |  25.00% | 25.00% |
| indvars.NumLFTR                               | 29842           | 29899                |    57 |   0.19% |  0.19% |
| indvars.NumReplaced                           | 2293            | 2299                 |     6 |   0.26% |  0.26% |
| indvars.NumSimplifiedSDiv                     | 6               | 8                    |     2 |  33.33% | 33.33% |
| indvars.NumWidened                            | 26438           | 26404                |   -34 |  -0.13% |  0.13% |
| instcount.TotalBlocks                         | 1178338         | 1173652              | -4686 |  -0.40% |  0.40% |
| instcount.TotalFuncs                          | 111825          | 111829               |     4 |   0.00% |  0.00% |
| instcount.TotalInsts                          | 9905442         | 9895452              | -9990 |  -0.10% |  0.10% |
| lcssa.NumLCSSA                                | 425871          | 425373               |  -498 |  -0.12% |  0.12% |
| licm.NumHoisted                               | 378357          | 383352               |  4995 |   1.32% |  1.32% |
| licm.NumMovedCalls                            | 2193            | 2204                 |    11 |   0.50% |  0.50% |
| licm.NumMovedLoads                            | 35899           | 35755                |  -144 |  -0.40% |  0.40% |
| licm.NumPromoted                              | 11178           | 11163                |   -15 |  -0.13% |  0.13% |
| licm.NumSunk                                  | 13359           | 14321                |   962 |   7.20% |  7.20% |
| loop-delete.NumDeleted                        | 8547            | 8538                 |    -9 |  -0.11% |  0.11% |
| loop-instsimplify.NumSimplified               | 12876           | 12041                |  -835 |  -6.48% |  6.48% |
| loop-peel.NumPeeled                           | 1008            | 924                  |   -84 |  -8.33% |  8.33% |
| loop-rotate.NumNotRotatedDueToHeaderSize      | 368             | 365                  |    -3 |  -0.82% |  0.82% |
| loop-rotate.NumRotated                        | 42015           | 42005                |   -10 |  -0.02% |  0.02% |
| loop-simplifycfg.NumLoopBlocksDeleted         | 240             | 241                  |     1 |   0.42% |  0.42% |
| loop-simplifycfg.NumTerminatorsFolded         | 618             | 619                  |     1 |   0.16% |  0.16% |
| loop-unroll.NumCompletelyUnrolled             | 11028           | 11029                |     1 |   0.01% |  0.01% |
| loop-unroll.NumUnrolled                       | 12608           | 12525                |   -83 |  -0.66% |  0.66% |
| mem2reg.NumPHIInsert                          | 192110          | 192073               |   -37 |  -0.02% |  0.02% |
| mem2reg.NumSingleStore                        | 637650          | 637652               |     2 |   0.00% |  0.00% |
| scalar-evolution.NumTripCountsComputed        | 283108          | 282998               |  -110 |  -0.04% |  0.04% |
| scalar-evolution.NumTripCountsNotComputed     | 106712          | 106691               |   -21 |  -0.02% |  0.02% |
| simple-loop-unswitch.NumBranches              | 5178            | 5185                 |     7 |   0.14% |  0.14% |
| simple-loop-unswitch.NumCostMultiplierSkipped | 914             | 925                  |    11 |   1.20% |  1.20% |
| simple-loop-unswitch.NumTrivial               | 183             | 179                  |    -4 |  -2.19% |  2.19% |

I.e. we end up with fewer instructions, less peeling, more LICM activity;
also note how none of those 4 regressions are here. Namely:

| statistic name                                   | LICM-LoopRotate | LICM-LoopRotate-LICM |     Δ |        % |   abs(%) |
| asm-printer.EmittedInsts                         | 9015799         | 9014474              | -1325 |   -0.01% |    0.01% |
| indvars.NumElimCmp                               | 3544            | 3546                 |     2 |    0.06% |    0.06% |
| indvars.NumElimExt                               | 36580           | 36681                |   101 |    0.28% |    0.28% |
| indvars.NumElimIV                                | 1187            | 1185                 |    -2 |   -0.17% |    0.17% |
| indvars.NumElimIdentity                          | 136             | 146                  |    10 |    7.35% |    7.35% |
| indvars.NumLFTR                                  | 29890           | 29899                |     9 |    0.03% |    0.03% |
| indvars.NumReplaced                              | 2227            | 2299                 |    72 |    3.23% |    3.23% |
| indvars.NumWidened                               | 26329           | 26404                |    75 |    0.28% |    0.28% |
| instcount.TotalBlocks                            | 1173840         | 1173652              |  -188 |   -0.02% |    0.02% |
| instcount.TotalInsts                             | 9896139         | 9895452              |  -687 |   -0.01% |    0.01% |
| lcssa.NumLCSSA                                   | 423961          | 425373               |  1412 |    0.33% |    0.33% |
| licm.NumHoisted                                  | 378753          | 383352               |  4599 |    1.21% |    1.21% |
| licm.NumMovedCalls                               | 2208            | 2204                 |    -4 |   -0.18% |    0.18% |
| licm.NumMovedLoads                               | 31821           | 35755                |  3934 |   12.36% |   12.36% |
| licm.NumPromoted                                 | 11154           | 11163                |     9 |    0.08% |    0.08% |
| licm.NumSunk                                     | 13587           | 14321                |   734 |    5.40% |    5.40% |
| loop-delete.NumDeleted                           | 8402            | 8538                 |   136 |    1.62% |    1.62% |
| loop-instsimplify.NumSimplified                  | 11890           | 12041                |   151 |    1.27% |    1.27% |
| loop-peel.NumPeeled                              | 925             | 924                  |    -1 |   -0.11% |    0.11% |
| loop-rotate.NumRotated                           | 42003           | 42005                |     2 |    0.00% |    0.00% |
| loop-simplifycfg.NumLoopBlocksDeleted            | 242             | 241                  |    -1 |   -0.41% |    0.41% |
| loop-simplifycfg.NumLoopExitsDeleted             | 20              | 497                  |   477 | 2385.00% | 2385.00% |
| loop-simplifycfg.NumTerminatorsFolded            | 336             | 619                  |   283 |   84.23% |   84.23% |
| loop-unroll.NumCompletelyUnrolled                | 11032           | 11029                |    -3 |   -0.03% |    0.03% |
| loop-unroll.NumUnrolled                          | 12529           | 12525                |    -4 |   -0.03% |    0.03% |
| mem2reg.NumDeadAlloca                            | 10221           | 10222                |     1 |    0.01% |    0.01% |
| mem2reg.NumPHIInsert                             | 192106          | 192073               |   -33 |   -0.02% |    0.02% |
| mem2reg.NumSingleStore                           | 637643          | 637652               |     9 |    0.00% |    0.00% |
| scalar-evolution.NumBruteForceTripCountsComputed | 812             | 814                  |     2 |    0.25% |    0.25% |
| scalar-evolution.NumTripCountsComputed           | 282934          | 282998               |    64 |    0.02% |    0.02% |
| scalar-evolution.NumTripCountsNotComputed        | 106718          | 106691               |   -27 |   -0.03% |    0.03% |
| simple-loop-unswitch.NumBranches                 | 4752            | 5185                 |   433 |    9.11% |    9.11% |
| simple-loop-unswitch.NumCostMultiplierSkipped    | 503             | 925                  |   422 |   83.90% |   83.90% |
| simple-loop-unswitch.NumSwitches                 | 18              | 20                   |     2 |   11.11% |   11.11% |
| simple-loop-unswitch.NumTrivial                  | 95              | 179                  |    84 |   88.42% |   88.42% |

(this is vanilla llvm testsuite + rawspeed + darktable)

As an example of the code where early LICM only is bad, see:
https://godbolt.org/z/GzEbacs4K

This does have an observable compile-time regression of +~0.5% geomean
https://llvm-compile-time-tracker.com/compare.php?from=7c5222e4d1a3a14f029e5f614c9aefd0fa505f1e&to=5d81826c3411982ca26e46b9d0aff34c80577664&stat=instructions
but I think that's basically nothing, and there's potential that it might
be avoidable in the future by fixing clang to produce alignment information
on function arguments, thus making the second run unneeded.

Differential Revision: https://reviews.llvm.org/D99249

3 years ago[NFC][scudo] Inline some functions into ScudoPrimaryTest
Vitaly Buka [Fri, 2 Apr 2021 07:58:09 +0000 (00:58 -0700)]
[NFC][scudo] Inline some functions into ScudoPrimaryTest

3 years ago[NFC][scudo] Convert ScudoPrimaryTest into TYPED_TEST
Vitaly Buka [Fri, 2 Apr 2021 07:17:45 +0000 (00:17 -0700)]
[NFC][scudo] Convert ScudoPrimaryTest into TYPED_TEST

3 years ago[libcxx] [test] Fix invocable tests on Windows
Martin Storsjö [Wed, 31 Mar 2021 06:17:00 +0000 (09:17 +0300)]
[libcxx] [test] Fix invocable tests on Windows

MSVC had a bug regarding preferring integral conversions over
floating conversions. This is fixed in MSVC 19.28 and newer. Clang in
MSVC mode so far only mimics the old, buggy behaviour, but will
hopefully soon be fixed to comply with the new behaviour too
(see https://reviews.llvm.org/D99663).

Make the negative test use a distinctly different type,
leaving checks for compiler specific bugs out of the libcxx test.

Differential Revision: https://reviews.llvm.org/D99641

3 years ago[libcxx] [test] Make the condvar wait_for tests a bit more understandable. NFC.
Martin Storsjö [Thu, 1 Apr 2021 20:53:36 +0000 (23:53 +0300)]
[libcxx] [test] Make the condvar wait_for tests a bit more understandable. NFC.

This was requested in the review of D99175; rename the "runs"
variable to clarify what it means wrt the test, and move updating of
it to the main function to clarify its behaviour wrt the two runs
further.

Differential Revision: https://reviews.llvm.org/D99768

3 years ago[clang][ItaniumMangle] Check SizeExpr for DependentSizedArrayType
oToToT [Thu, 1 Apr 2021 23:18:51 +0000 (07:18 +0800)]
[clang][ItaniumMangle] Check SizeExpr for DependentSizedArrayType
(PR49478)

As ArrayType::ArrayType mentioned in clang/lib/AST/Type.cpp, a
DependentSizedArrayType might not have a size expression because it is
used as the type of a dependent array of unknown bound with a dependent
braced initializer.

Thus, I add a check when mangling an array of that type.

This should fix https://bugs.llvm.org/show_bug.cgi?id=49478

Reviewed By: Richard Smith - zygoloid

Differential Revision: https://reviews.llvm.org/D99407

3 years ago[mlir] add memref dialect as dependent of lower-affine pass
Alex Zinenko [Thu, 1 Apr 2021 12:19:56 +0000 (14:19 +0200)]
[mlir] add memref dialect as dependent of lower-affine pass

The lower-affine pass also processes affine load and store operations
that get converted to load and store operations now available in the
memref dialect. Since it produces operations from the memref dialect,
this dialect should be registered as dependent for this pass. It is rare
but possible to have code that doesn't have memref operations in the
input and calls this pass.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D99720

3 years ago[clang-cl] [Sema] Do not prefer integral conversion over floating-to-integral for...
Marek Kurdej [Fri, 2 Apr 2021 06:57:42 +0000 (08:57 +0200)]
[clang-cl] [Sema] Do not prefer integral conversion over floating-to-integral for MS compatibility 19.28 and higher.

As of MSVC 19.28 (2019 Update 8), integral conversion is no longer preferred over floating-to-integral, so MSVC is more standards-conformant and generates a compiler error on an ambiguous call.
Cf. https://godbolt.org/z/E8xsdqKsb.
Initially found during the review of D99641.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D99663

3 years agoTweak SimpleFastHash
Aaron Green [Fri, 2 Apr 2021 06:20:35 +0000 (23:20 -0700)]
Tweak SimpleFastHash

This change adds a SimpleFastHash64 variant of SimpleFastHash which allows call sites to specify a starting value and get a 64 bit hash in return. This allows a hash to be "resumed" with more data.

A later patch needs this to be able to hash a sequence of module-relative values one at a time, rather than just a region of memory.
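
As a rough illustration of the "resumable" idea (this is not the actual SimpleFastHash64 code; the name `simple_hash64` and the multiplier are assumptions made for this sketch), a hash that accepts a starting value can be fed data piece by piece and still match the result of hashing everything at once:

```python
MASK64 = (1 << 64) - 1  # keep the running value within 64 bits


def simple_hash64(data: bytes, initial: int = 0) -> int:
    """Hypothetical multiplicative byte hash that can be seeded with a prior result."""
    result = initial & MASK64
    for byte in data:
        result = (result * 11 + byte) & MASK64
    return result


# Hash the whole buffer in one call.
whole = simple_hash64(b"module-relative values")

# Hash the same bytes in two steps, resuming from the first result.
resumed = simple_hash64(b" values", simple_hash64(b"module-relative"))
```

Because the only state carried between bytes is the running value, passing that value back in as `initial` continues the hash exactly where it stopped.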

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D94510

3 years ago[libcxx] adds concepts `std::totally_ordered` and `std::totally_ordered_with`
Christopher Di Bella [Wed, 31 Mar 2021 21:42:37 +0000 (21:42 +0000)]
[libcxx] adds concepts `std::totally_ordered` and `std::totally_ordered_with`

Implements parts of:
    - P0898R3 Standard Library Concepts
    - P1754 Rename concepts to standard_case for C++20, while we still can

Reviewed By: Mordante

Differential Revision: https://reviews.llvm.org/D98983

3 years ago[CSSPGO] Skip dangling probe value when computing profile summary
Wenlei He [Fri, 2 Apr 2021 05:04:40 +0000 (22:04 -0700)]
[CSSPGO] Skip dangling probe value when computing profile summary

Recently we switched to use InvalidProbeCount = UINT64_MAX (instead of 0) to represent dangling probe, but UINT64_MAX is not excluded when computing profile summary. This caused profile summary to produce incorrect hot/cold threshold. The change fixed it by excluding UINT64_MAX from summary builder.
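
A minimal sketch of the idea, assuming a summary that only tracks a total and a maximum (the names `INVALID_PROBE_COUNT` and `summarize_counts` are hypothetical and do not mirror the real summary builder API):

```python
INVALID_PROBE_COUNT = (1 << 64) - 1  # UINT64_MAX, used here to mark a dangling probe


def summarize_counts(counts):
    """Return (total, max) over the counts, ignoring dangling probes."""
    valid = [c for c in counts if c != INVALID_PROBE_COUNT]
    return sum(valid), max(valid, default=0)


counts = [120, INVALID_PROBE_COUNT, 30, 5]
total, max_count = summarize_counts(counts)
```

Without the filter, the sentinel would dominate both the total and the maximum, which is the kind of distortion that skews a hot/cold threshold.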

Differential Revision: https://reviews.llvm.org/D99788

3 years ago[RISCV] Add missing nxvXf64 intrinsics tests cases for floating-point compare for...
Craig Topper [Fri, 2 Apr 2021 03:48:18 +0000 (20:48 -0700)]
[RISCV] Add missing nxvXf64 intrinsics tests cases for floating-point compare for RV32.

3 years ago[llvm-reduce] Add header guards and fix clang-tidy warnings
Samuel [Fri, 2 Apr 2021 03:38:39 +0000 (20:38 -0700)]
[llvm-reduce] Add header guards and fix clang-tidy warnings

Add header guards and fix other clang-tidy warnings in .h files.
Also align misaligned header docs

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D99634

3 years ago[RISCV] Add more nxvi64 vector intrinsic tests for RV32. NFC
Craig Topper [Fri, 2 Apr 2021 03:20:15 +0000 (20:20 -0700)]
[RISCV] Add more nxvi64 vector intrinsic tests for RV32. NFC

This confirms we handle most instructions gracefully. We do
currently fail for vslide1up and vslide1down though.

3 years ago[lld][MachO] Fix -Wsign-compare warning (NFC)
Yang Fan [Fri, 2 Apr 2021 03:25:20 +0000 (11:25 +0800)]
[lld][MachO] Fix -Wsign-compare warning (NFC)

GCC warning:
```
/llvm-project/lld/MachO/InputFiles.cpp:484:24: warning: comparison of integer expressions of different signedness: ‘int64_t’ {aka ‘long int’} and ‘uint64_t’ {aka ‘long unsigned int’} [-Wsign-compare]
484 |           return value < subsectionEntry.offset;
    |                  ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
```

3 years ago[AssumeBundles] offset should be added to correctly calculate align
Juneyoung Lee [Fri, 2 Apr 2021 02:53:45 +0000 (11:53 +0900)]
[AssumeBundles] offset should be added to correctly calculate align

This is a patch to fix the bug in alignment calculation (see https://reviews.llvm.org/D90529#2619492).

Consider this code:

```
call void @llvm.assume(i1 true) ["align"(i32* %a, i32 32, i32 28)]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 -1
; alignment of %arrayidx?
```

The llvm.assume guarantees that `%a - 28` is 32-byte aligned, meaning that `%a` is 32k + 28 for some k.
Therefore `%a - 4` cannot be 32-byte aligned, but the existing code was calculating the pointer as 32-byte aligned.

The reason why this happened is as follows.
`DiffSCEV` stores `%arrayidx - %a` which is -4.
`OffSCEV` stores the offset value of “align”, which is 28.
`DiffSCEV` + `OffSCEV` = 24 should be used for `%a - 4`'s offset from 32k, but `DiffSCEV` - `OffSCEV` = -32 (a multiple of 32) was being used instead.
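
The arithmetic can be checked with a small model. This is only a sketch of the modular reasoning, not the pass's code; `known_alignment` is a hypothetical helper, and the pass itself may derive a more conservative answer than this model does:

```python
ALIGN = 32    # alignment operand of the "align" bundle
OFFSET = 28   # offset operand of the "align" bundle (OffSCEV)
DIFF = -4     # %arrayidx - %a in bytes (DiffSCEV)


def known_alignment(residue: int, align: int) -> int:
    """Largest power of two (<= align) dividing every address equal to residue mod align."""
    residue %= align
    if residue == 0:
        return align
    return residue & -residue  # lowest set bit of the residue


# %arrayidx = (32k + 28) - 4 = 32k + 24
fixed = known_alignment(DIFF + OFFSET, ALIGN)

# The old computation used -4 - 28 = -32, which looks like a multiple of 32.
buggy = known_alignment(DIFF - OFFSET, ALIGN)
```

In this model the corrected residue of 24 only supports 8-byte alignment, while the old residue wrongly supports the full 32 bytes.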

Reviewed By: Tyker

Differential Revision: https://reviews.llvm.org/D98759

3 years ago[CMake] Use append instead of set with the list
Petr Hosek [Fri, 2 Apr 2021 03:30:49 +0000 (20:30 -0700)]
[CMake] Use append instead of set with the list

This addresses an issue introduced by D99706.

3 years ago[NFC][scudo] Move some shared stuff into ScudoCombinedTest
Vitaly Buka [Fri, 2 Apr 2021 01:44:55 +0000 (18:44 -0700)]
[NFC][scudo] Move some shared stuff into ScudoCombinedTest

3 years ago[lld] Add missing header guard (NFC)
Yang Fan [Fri, 2 Apr 2021 03:03:08 +0000 (11:03 +0800)]
[lld] Add missing header guard (NFC)

3 years ago[lldb] Account for objc_debug_class_getNameRaw returning NULL
Jonas Devlieghere [Fri, 2 Apr 2021 02:57:46 +0000 (19:57 -0700)]
[lldb] Account for objc_debug_class_getNameRaw returning NULL

On macOS Catalina, calling objc_debug_class_getNameRaw on some of the
ISA pointers returns NULL, causing us to crash and unwind before reading
all the Objective-C classes. This does not happen on macOS Big Sur.
Account for that possibility and skip the class when that happens.

3 years agoHandle all standalone combinations of LC_NOTEs w/ & w/o addr & uuid
Jason Molenda [Fri, 2 Apr 2021 01:59:36 +0000 (18:59 -0700)]
Handle all standalone combinations of LC_NOTEs w/ & w/o addr & uuid

Fill out ProcessMachCore::DoLoadCore to handle LC_NOTE hints with
a UUID or with a UUID+address, and load the binary at the specified
offset correctly.  Add tests for all four combinations.  Change
DynamicLoaderStatic to not re-set a Section's load address in the
Target if it's already been specified.

Differential Revision: https://reviews.llvm.org/D99571
rdar://51490545

3 years ago[X86] Fix -Wunused-function warning (NFC)
Yang Fan [Fri, 2 Apr 2021 01:29:18 +0000 (09:29 +0800)]
[X86] Fix -Wunused-function warning (NFC)

GCC warning:
```
/llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:9212:13: warning: ‘bool isHorizOp(unsigned int)’ defined but not used [-Wunused-function]
 9212 | static bool isHorizOp(unsigned Opcode) {
      |             ^~~~~~~~~
```

3 years ago[NFC][scudo] Move globals into related test
Vitaly Buka [Fri, 2 Apr 2021 01:33:49 +0000 (18:33 -0700)]
[NFC][scudo] Move globals into related test

3 years ago[debug-info][XCOFF] set `-gno-column-info` by default for DBX
Chen Zheng [Thu, 1 Apr 2021 04:50:49 +0000 (00:50 -0400)]
[debug-info][XCOFF] set `-gno-column-info` by default for DBX

DBX does not handle column info well, so set -gno-column-info
by default for DBX.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D99703

3 years ago[mlir][sparse] support for very narrow index and pointer types
Aart Bik [Thu, 1 Apr 2021 23:23:17 +0000 (16:23 -0700)]
[mlir][sparse] support for very narrow index and pointer types

Rationale:
Small indices and values, when allowed by the required range of the
input tensors, can reduce the memory footprint of sparse tensors
even more. Note, however, that we must be careful to zero extend
the values (since sparse tensors never use negatives for indexing),
because LLVM treats the index type as signed in most memory operations
(like the scatter and gather). This CL dots all the i's in this regard.
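
To see why the extension matters, here is a small sketch using an 8-bit index slot (the helper `load_index` is hypothetical and only models how the stored bits are widened):

```python
def load_index(raw: bytes, signed: bool) -> int:
    """Widen a narrow stored index, either zero extending or sign extending it."""
    return int.from_bytes(raw, byteorder="little", signed=signed)


stored = (200).to_bytes(1, "little")  # index 200 fits in an unsigned 8-bit slot

zero_extended = load_index(stored, signed=False)  # the intended index
sign_extended = load_index(stored, signed=True)   # what a signed reading yields
```

Any narrow index with its top bit set turns negative under a signed reading, so it has to be zero extended before it reaches an operation that treats indices as signed.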

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D99777

3 years ago[NFC][AMDGPU] Add product names for gfx908 and gfx10 processors
Tony [Fri, 26 Mar 2021 02:01:18 +0000 (02:01 +0000)]
[NFC][AMDGPU] Add product names for gfx908 and gfx10 processors

Reviewed By: msearles

Differential Revision: https://reviews.llvm.org/D99781

3 years ago[indvars] Fix pr49802 by checking for SCEVCouldNotCompute
Philip Reames [Fri, 2 Apr 2021 00:45:53 +0000 (17:45 -0700)]
[indvars] Fix pr49802 by checking for SCEVCouldNotCompute

The code is assuming that having an exact exit count for the loop implies that exit counts for every exit are known.  This used to be true, but when we added handling for dead exits we broke this invariant.  The new invariant is that an exact loop count implies that any exits that are not trivially dead have exit counts.

We could have fixed this by either a) explicitly checking for a dead exit, or b) just testing for SCEVCouldNotCompute.  I chose the second as it was simpler.

(Debugging this took longer than it should have since I'd mistyped the original assert and it wasn't checking what it was meant to...)

p.s. Sorry for the lack of test case.  Getting things into a state to actually hit this is difficult and fragile.  The original repro involves loop-deletion leaving SCEV in a slightly imprecise state which lets us bypass other transforms in IndVarSimplify on the way to this one.  All of my attempts to separate it into a standalone test failed.

3 years ago[lld][MachO] Make emitEndFunStab independent from .subsections_via_symbols
Alexander Shaposhnikov [Fri, 2 Apr 2021 00:48:09 +0000 (17:48 -0700)]
[lld][MachO] Make emitEndFunStab independent from .subsections_via_symbols

This diff addresses FIXME in SyntheticSections.cpp and removes
the dependency of emitEndFunStab on .subsections_via_symbols.

Test plan: make check-lld-macho

Differential revision: https://reviews.llvm.org/D99054

3 years ago[NFC][scudo] Use TYPED_TEST to split large test
Vitaly Buka [Fri, 2 Apr 2021 00:31:52 +0000 (17:31 -0700)]
[NFC][scudo] Use TYPED_TEST to split large test

3 years ago[TextAPI] Add support for arm64_32
Daniel Rodríguez Troitiño [Fri, 2 Apr 2021 00:19:09 +0000 (17:19 -0700)]
[TextAPI] Add support for arm64_32

Add a new architecture definition for arm64_32. The change should allow
the new architecture arm64_32 to be recognized in several pieces of
code, TextAPI parsing one of them. llvm-lipo will also recognize the
architecture and will allow lipoing files with this architecture without
failing.

Includes a small test that the architecture is recognized by llvm-nm.

Reviewed By: cishida

Differential Revision: https://reviews.llvm.org/D99673

3 years ago[builtins] Build for arm64_32 for watchOS (Darwin)
Daniel Rodríguez Troitiño [Fri, 2 Apr 2021 00:15:56 +0000 (17:15 -0700)]
[builtins] Build for arm64_32 for watchOS (Darwin)

Trying to build the builtins code fails because `arm64_32_SOURCES` is
missing. Setting it to the same list used for `aarch64_SOURCES` solves
that problem and allows the builtins to compile for that architecture.

Additionally, arm64_32 is added as a possible architecture for watchos
platforms.

Reviewed By: compnerd

Differential Revision: https://reviews.llvm.org/D99690

3 years ago[RISCV] Add nxvXi64 test cases to the RV32 Zvamo intrinsic test files. NFC
Craig Topper [Thu, 1 Apr 2021 22:58:46 +0000 (15:58 -0700)]
[RISCV] Add nxvXi64 test cases to the RV32 Zvamo intrinsic test files. NFC

3 years ago[flang] Disable some new unit tests (non-portable results)
peter klausler [Thu, 1 Apr 2021 23:27:15 +0000 (16:27 -0700)]
[flang] Disable some new unit tests (non-portable results)

Due to architectural variation in the C++ functions std::ceil, std::floor,
and std::trunc, disable for now some new Fortran unit tests that depend
on specific results for IEEE floating-point edge cases of infinities
and NaNs.

3 years ago[MIPS, test] Fix use of undef FileCheck var
Thomas Preud'homme [Thu, 1 Apr 2021 23:41:56 +0000 (00:41 +0100)]
[MIPS, test] Fix use of undef FileCheck var

LLVM test CodeGen/Mips/sr1.ll tries to check for the absence of a
sequence of instructions with several CHECK-NOT with one of those
directives using a variable defined in another. However CHECK-NOT are
checked independently so that is using a variable defined in a pattern
that should not occur in the input.

This commit removes the definition and uses of variable to check each
line independently, making the check stronger than the current one.

Reviewed By: dsanders

Differential Revision: https://reviews.llvm.org/D99776

3 years agoRevert "[globalisel][unittests] Rename setUp() to avoid potential mix up with SetUp...
Daniel Sanders [Thu, 1 Apr 2021 23:47:43 +0000 (16:47 -0700)]
Revert "[globalisel][unittests] Rename setUp() to avoid potential mix up with SetUp() from gtest"

Forgot to apply commit message changes from phabricator

This reverts commit 3a016e31ecef7eeb876b540c928a25a7c5d2e07a.

3 years ago[globalisel][unittests] Rename setUp() to avoid potential mix up with SetUp() from...
Daniel Sanders [Wed, 31 Mar 2021 23:06:46 +0000 (16:06 -0700)]
[globalisel][unittests] Rename setUp() to avoid potential mix up with SetUp() from gtest

Also, make it structurally required so it can't be forgotten and re-introduce
the bug that led to the rotten green tests.

Differential Revision: https://reviews.llvm.org/D99692

3 years ago[OpenMP, test] Fix use of undef VAR_PRIV FileCheck var
Thomas Preud'homme [Thu, 1 Apr 2021 23:25:37 +0000 (00:25 +0100)]
[OpenMP, test] Fix use of undef VAR_PRIV FileCheck var

Remove the CHECK-NOT directive referring to as-of-yet undefined VAR_PRIV
variable since the pattern of the following CHECK-NOT in the same
CHECK-NOT block covers a superset of the case caught by the first
CHECK-NOT.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99775

3 years ago[OpenMP, test] Fix use of undef DECL FileCheck var
Thomas Preud'homme [Thu, 1 Apr 2021 21:31:06 +0000 (22:31 +0100)]
[OpenMP, test] Fix use of undef DECL FileCheck var

OpenMP test target_data_use_device_ptr_if_codegen contains a CHECK-NOT
directive using an undefined DECL FileCheck variable. It seems copied
from target_data_use_device_ptr_codegen where there's a CHECK for a load
that defined the variable. Since there is no corresponding load in this
testcase, the simplest is to simply forbid any store and get rid of the
variable altogether.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99771

3 years ago[OpenMP, test] Fix uses of undef S*VAR FileCheck var
Thomas Preud'homme [Thu, 1 Apr 2021 16:12:14 +0000 (17:12 +0100)]
[OpenMP, test] Fix uses of undef S*VAR FileCheck var

Fix the many cases of use of undefined SIVAR/SVAR/SFVAR in OpenMP
*private_codegen tests, due to a missing BLOCK directive to capture the
IR variable when it is declared. It also fixes a few typos in its use.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D99770

3 years agoSetup OpBuilder to support detached block in loopUnrollByFactor (NFC)
Mehdi Amini [Thu, 1 Apr 2021 23:34:03 +0000 (23:34 +0000)]
Setup OpBuilder to support detached block in loopUnrollByFactor (NFC)

Setting the builder from a block looks up a parent operation
to get a context. By setting up the builder with an explicit
context instead, we can support invoking this helper in the absence
of a parent operation.

3 years ago[flang] Fix unit test failure on POWER
peter klausler [Thu, 1 Apr 2021 22:59:56 +0000 (15:59 -0700)]
[flang] Fix unit test failure on POWER

A new unit test for the Fortran runtime needs to allow for some
architectural variation on Infinity and NaN edge cases of NINT().

3 years ago[OpenMP51] Accept `primary` as proc bind affinity policy in Clang
cchen [Thu, 1 Apr 2021 23:07:12 +0000 (18:07 -0500)]
[OpenMP51] Accept `primary` as proc bind affinity policy in Clang

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99622

3 years ago[flang] Implement numeric intrinsic functions in runtime
peter klausler [Thu, 1 Apr 2021 19:59:59 +0000 (12:59 -0700)]
[flang] Implement numeric intrinsic functions in runtime

Adds APIs, implementations, and unit tests for AINT, ANINT,
CEILING, EXPONENT, FLOOR, FRACTION, MOD, MODULO, NEAREST, NINT,
RRSPACING, SCALE, SET_EXPONENT, & SPACING.

Differential Revision: https://reviews.llvm.org/D99764

3 years agollvm-shlib: Create object libraries for each component and link against them
Tom Stellard [Thu, 1 Apr 2021 04:35:04 +0000 (21:35 -0700)]
llvm-shlib: Create object libraries for each component and link against them

This makes it possible to build libLLVM.so without first creating a
static library for each component.  In the case where only libLLVM.so is
built (i.e. ninja LLVM) this eliminates 150 linker jobs.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D95727

3 years ago[funcattrs] Respect nofree attribute on callsites (not just callee)
Philip Reames [Thu, 1 Apr 2021 21:45:14 +0000 (14:45 -0700)]
[funcattrs] Respect nofree attribute on callsites (not just callee)

3 years ago[Driver] -nostdinc -nostdinc++: don't warn for -Wunused-command-line-argument
Fangrui Song [Thu, 1 Apr 2021 21:37:34 +0000 (14:37 -0700)]
[Driver] -nostdinc -nostdinc++: don't warn for -Wunused-command-line-argument

3 years ago[RISCV] Add isel patterns to handle vrsub intrinsic with 2 vector operands.
Craig Topper [Thu, 1 Apr 2021 21:07:04 +0000 (14:07 -0700)]
[RISCV] Add isel patterns to handle vrsub intrinsic with 2 vector operands.

This occurs when we type legalize an i64 scalar input on RV32. We
need to manually splat, which requires a vector input. Rather
than special case this in lowering just pattern match it.

3 years ago[tests] Add tests for forthcoming funcattrs nosync inference improvement
Philip Reames [Thu, 1 Apr 2021 20:58:13 +0000 (13:58 -0700)]
[tests] Add tests for forthcoming funcattrs nosync inference improvement

These are basically all the attributor tests for the same attribute with some minor cleanup for readability and autogened.

3 years agoReland "Add support to -Wa,--version in clang""
Jian Cai [Thu, 1 Apr 2021 19:22:50 +0000 (12:22 -0700)]
Reland "Add support to -Wa,--version in clang""

This relands commit 3cc3c0f8352ec33ca2f2636f94cb1d85fc57ac16 with fixed
test cases, which was reverted by commit
bf2479c347c8ca88fefdb144d8bae0a7a4231e2a.

3 years ago[libc++][NFC] Increase readability of typeinfo comparison of ARM64
Louis Dionne [Tue, 2 Mar 2021 21:18:37 +0000 (16:18 -0500)]
[libc++][NFC] Increase readability of typeinfo comparison of ARM64

We wasted a good deal of time trying to figure out whether our implementation
was correct. In the end, it was, but it wasn't so easy to determine. This
patch dumbs down the implementation and improves the documentation to make
it easier to validate.

See https://lists.llvm.org/pipermail/libcxx-dev/2020-December/001060.html.

Differential Revision: https://reviews.llvm.org/D97802

3 years ago[scudo][NFC] Make tests runs with --gtest_repeat=2
Vitaly Buka [Thu, 1 Apr 2021 19:40:28 +0000 (12:40 -0700)]
[scudo][NFC] Make tests runs with --gtest_repeat=2

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D99766

3 years ago[Scudo] Fix SizeClassAllocatorLocalCache::drain
Vitaly Buka [Thu, 1 Apr 2021 19:44:31 +0000 (12:44 -0700)]
[Scudo] Fix SizeClassAllocatorLocalCache::drain

It left a few blocks in PerClassArray[0].

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D99763

3 years ago[ARM] Allow v6m runtime loop unrolling
David Green [Thu, 1 Apr 2021 20:21:40 +0000 (21:21 +0100)]
[ARM] Allow v6m runtime loop unrolling

This removes the restriction that only Thumb2 targets enable runtime
loop unrolling, allowing it for Thumb1 only cores as well. The existing
T2 heuristics are used (for the time being) to control when and how
unrolling is performed.

Differential Revision: https://reviews.llvm.org/D99588

3 years ago[NFC][scudo] Simplify UseQuarantine initialization
Vitaly Buka [Thu, 1 Apr 2021 04:46:53 +0000 (21:46 -0700)]
[NFC][scudo] Simplify UseQuarantine initialization

3 years ago[flang] Fix arm clang build
peter klausler [Thu, 1 Apr 2021 19:54:22 +0000 (12:54 -0700)]
[flang] Fix arm clang build

The new source file flang/runtime/complex-reduction.c contains
a portability work-around that implicitly assumed that a recent
version of clang would be used; this patch changes the code and
should be portable to older clangs and any other C compilers that
don't support the standard CMPLXF/CMPLX/CMPLXL macros.

3 years ago[OpenMP] Pass mapping names to add components in a user defined mapper
Joseph Huber [Wed, 31 Mar 2021 20:00:38 +0000 (16:00 -0400)]
[OpenMP] Pass mapping names to add components in a user defined mapper

Summary:
Currently the mapping names are not passed to the mapper components that set up
the array region. This means array mappings will not have their names available
in the runtime. This patch fixes this by passing the argument name to the region
correctly. This means that the mapped variable's name will be the declared
mapper that placed it on the device.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D99681

3 years ago[RISCV] Use softPromoteHalf legalization for fp16 without Zfh rather than PromoteFloat.
Craig Topper [Thu, 1 Apr 2021 18:47:11 +0000 (11:47 -0700)]
[RISCV] Use softPromoteHalf legalization for fp16 without Zfh rather than PromoteFloat.

The default legalization strategy is PromoteFloat which keeps
half in single precision format through multiple floating point
operations. Conversion to/from float is done at loads, stores,
bitcasts, and other places that care about the exact size being 16
bits.

This patch switches to the alternative method softPromoteHalf.
This aims to keep the type in 16-bit format between every operation.
So we promote to float and immediately round for any arithmetic
operation. This should be closer to the IR semantics since we
are rounding after each operation and not accumulating extra
precision across multiple operations. X86 is the only other
target that enables this today. See https://reviews.llvm.org/D73749
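
The difference between the two strategies can be modelled numerically. This is only a sketch: the function names are made up for illustration, and Python's `struct` half and single formats stand in for the target's f16 and f32 rounding:

```python
import struct


def round_to_half(x: float) -> float:
    """Round a value to IEEE half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_single(x: float) -> float:
    """Round a value to IEEE single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_promote_float(values):
    """Keep the running sum in single precision; round to half only at the end."""
    acc = 0.0
    for v in values:
        acc = round_to_single(acc + v)
    return round_to_half(acc)


def sum_soft_promote_half(values):
    """Promote to single for each add, then immediately round back to half."""
    acc = 0.0
    for v in values:
        acc = round_to_half(round_to_single(acc + v))
    return acc


values = [2048.0, 1.0, 1.0]
kept_in_float = sum_promote_float(values)
rounded_each_step = sum_soft_promote_half(values)
```

Half precision has a spacing of 2 at 2048, so each `+ 1.0` is lost when rounding happens after every add, whereas the single-precision accumulator keeps both increments and lands on 2050.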

I had to update getRegisterTypeForCallingConv to force f16 to
use f32 when the F extension is enabled. This way we can still
pass it in the lower bits of an FPR for ilp32f and lp64f ABIs.
The softPromoteHalf would otherwise always give i16 as the
argument type.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D99148

3 years ago[OpenCL][Docs] Update links to the C++ for OpenCL documentation
Anastasia Stulova [Thu, 1 Apr 2021 19:31:00 +0000 (20:31 +0100)]
[OpenCL][Docs] Update links to the C++ for OpenCL documentation

3 years agoUpdate a test missed in 6ef4505
Philip Reames [Thu, 1 Apr 2021 19:17:01 +0000 (12:17 -0700)]
Update a test missed in 6ef4505

3 years ago[Attributor] Cleanup detection of non-relaxed atomics in nosync inference
Philip Reames [Thu, 1 Apr 2021 19:01:29 +0000 (12:01 -0700)]
[Attributor] Cleanup detection of non-relaxed atomics in nosync inference

The code was checking for cases which are disallowed by the verifier.  Delete dead code and adjust style.

3 years ago[Attributor] Cleanup intrinsic handling in nosync inference [mostly NFC]
Philip Reames [Thu, 1 Apr 2021 18:48:19 +0000 (11:48 -0700)]
[Attributor] Cleanup intrinsic handling in nosync inference [mostly NFC]

Mostly stylistic adjustment, but the old code didn't handle the memcpy.inline intrinsic.  By using the matcher class, we now do.

3 years ago[libcxx] [test] Make the condvar wait_for tests less brittle
Martin Storsjö [Tue, 23 Mar 2021 11:09:26 +0000 (13:09 +0200)]
[libcxx] [test] Make the condvar wait_for tests less brittle

These seem to fail occasionally (they are marked as possibly requiring
a retry).

When doing a condvar wait_for(), it can wake up before the timeout
as a spurious wakeup. In these cases, the wait_for() method returns that
the timeout wasn't hit, and the test reruns another wait_for().

On Windows, it seems like the wait_for() operation often can end up
returning slightly before the intended deadline - when intending to
wait for 250 milliseconds, it can return after e.g. 235 milliseconds.
In these cases, the wait_for() doesn't indicate a timeout.

Previously, the test then reran a new wait_for() for a full 250
milliseconds each time. So for N consecutive wakeups slightly too early,
we'd wait for (N+1)*250 milliseconds. Now it only reruns wait_for() for
the remaining intended wait duration.
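
The same pattern, sketched with Python's `threading.Condition` rather than the libc++ test code (the helper name `wait_full_duration` is an assumption made for this example):

```python
import threading
import time


def wait_full_duration(cond: threading.Condition, timeout: float) -> int:
    """Wait until the deadline, re-waiting only for the time that is left."""
    deadline = time.monotonic() + timeout
    wakeups = 0
    with cond:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return wakeups
            # An early or spurious wakeup just loops with a shorter timeout.
            cond.wait(remaining)
            wakeups += 1


cond = threading.Condition()
start = time.monotonic()
wakeups = wait_full_duration(cond, 0.05)
elapsed = time.monotonic() - start
```

Computing the deadline once and waiting only for the remainder keeps the total wait close to the intended duration, however many early wakeups occur.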

Differential Revision: https://reviews.llvm.org/D99175