platform/upstream/llvm.git
2 years ago[X86] combineMul - move MUL_IMM comment inside function. NFC.
Simon Pilgrim [Sun, 22 Aug 2021 17:27:03 +0000 (18:27 +0100)]
[X86] combineMul - move MUL_IMM comment inside function. NFC.

combineMul is now used for other things as well as the mul-with-constant expansion - move the comment to where it's actually relevant.

2 years ago[DWARF][Verifier] Do not add child DieRangeInfo with empty address range to the parent.
Alexey Lapshin [Wed, 4 Aug 2021 16:17:33 +0000 (19:17 +0300)]
[DWARF][Verifier] Do not add child DieRangeInfo with empty address range to the parent.

The verifyDieRanges function checks for intersecting address ranges.
It adds each child DieRangeInfo to the parent DieRangeInfo to check
whether children have overlapping address ranges. It is safe to not add
a DieRangeInfo with an empty address range to the parent's children list.
This decreases the number of children that must be navigated and, as a result,
decreases execution time (parents with many empty-range children
spend a lot of time navigating them). For this command: "llvm-dwarfdump --verify clang-repl"
execution time decreased from 220 sec to 75 sec.
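
The idea above can be sketched in a few lines. This is a simplified illustration, not the verifier's code: `AddrRange`, `RangeInfo`, `coversNothing` and `addChild` are hypothetical stand-ins for the real DieRangeInfo machinery.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a half-open address range [Low, High).
struct AddrRange {
  std::uint64_t Low;
  std::uint64_t High;
  bool empty() const { return Low >= High; }
};

// Hypothetical stand-in for the verifier's per-DIE range record.
struct RangeInfo {
  std::vector<AddrRange> Ranges;
  std::vector<RangeInfo> Children;

  // True when this record covers no addresses at all.
  bool coversNothing() const {
    for (const AddrRange &R : Ranges)
      if (!R.empty())
        return false;
    return true;
  }

  // Records a child for later overlap checks. A child that covers no
  // addresses cannot overlap anything, so it is skipped.
  // Returns true if the child was stored.
  bool addChild(const RangeInfo &Child) {
    if (Child.coversNothing())
      return false;
    Children.push_back(Child);
    return true;
  }
};
```

With this shape, the cost of the later overlap check depends only on the children that can actually overlap.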

Differential Revision: https://reviews.llvm.org/D107554

2 years ago[Transforms] Remove unused declaration emitStrNLen (NFC)
Kazu Hirata [Sun, 22 Aug 2021 16:08:21 +0000 (09:08 -0700)]
[Transforms] Remove unused declaration emitStrNLen (NFC)

The corresponding definition has been missing for at least 5 years.

2 years ago[libc++] Eliminate needless `add_lvalue_reference` from <algorithm> helpers. NFCI.
Arthur O'Dwyer [Thu, 19 Aug 2021 19:06:21 +0000 (15:06 -0400)]
[libc++] Eliminate needless `add_lvalue_reference` from <algorithm> helpers. NFCI.

When `_Compare` is a function parameter already (so it's not `void`
and it's not an abominable function type), `add_lvalue_reference_t<_Compare>`
is simply a synonym for `_Compare&`. We don't need to pull in `<type_traits>`
and instantiate a template trait to figure that out.
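
A small sketch of the equivalence described above. `Less` and `firstIsSmaller` are hypothetical names used only for illustration; they are not libc++ internals.

```cpp
#include <type_traits>

// Hypothetical comparator type, used only for illustration.
struct Less {
  bool operator()(int A, int B) const { return A < B; }
};

// For a type that can be a function parameter, the trait and the plain
// reference spelling name the same type.
static_assert(
    std::is_same<std::add_lvalue_reference<Less>::type, Less &>::value,
    "trait and plain reference agree for object types");

// The trait only behaves differently for types that cannot form a
// reference, such as void, which it leaves unchanged.
static_assert(std::is_same<std::add_lvalue_reference<void>::type, void>::value,
              "void has no reference form");

// Hypothetical helper in the style of an <algorithm> internal: the
// comparator reference is spelled directly as Compare&.
template <class Compare>
bool firstIsSmaller(int A, int B, Compare &Comp) {
  return Comp(A, B);
}
```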

Differential Revision: https://reviews.llvm.org/D108400

2 years ago[InstCombine] Perform "eq of parts" fold with logical ops
Nikita Popov [Sun, 22 Aug 2021 14:55:53 +0000 (16:55 +0200)]
[InstCombine] Perform "eq of parts" fold with logical ops

The pattern matched here is too complex for the general logical
and/or to bitwise and/or conversion to trigger. However, the
fold is poison-safe, so match it with a select root as well:

https://alive2.llvm.org/ce/z/vNzzSg
https://alive2.llvm.org/ce/z/Beyumt
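
A minimal sketch of the value equivalence behind this fold, written in plain C++ rather than LLVM IR. It assumes the "parts" are the low and high bytes of a 16-bit value; the function names are made up for illustration and are not InstCombine APIs. The sketch only shows that the results agree; it says nothing about poison, which is what the Alive2 links above cover.

```cpp
#include <cstdint>

// Compares two 16-bit values one byte at a time, joined by a
// short-circuit (logical) AND.
bool equalByParts(std::uint16_t X, std::uint16_t Y) {
  bool LowEqual = static_cast<std::uint8_t>(X) == static_cast<std::uint8_t>(Y);
  bool HighEqual =
      static_cast<std::uint8_t>(X >> 8) == static_cast<std::uint8_t>(Y >> 8);
  return LowEqual && HighEqual;
}

// The single comparison that the two partial comparisons add up to.
bool equalWhole(std::uint16_t X, std::uint16_t Y) { return X == Y; }

// Checks that both forms agree for every X against a few related Y values.
bool partsMatchWholeOnSample() {
  for (unsigned V = 0; V <= 0xFFFF; ++V) {
    const std::uint16_t X = static_cast<std::uint16_t>(V);
    const std::uint16_t Others[] = {X, static_cast<std::uint16_t>(X ^ 0x0001),
                                    static_cast<std::uint16_t>(X ^ 0x0100),
                                    static_cast<std::uint16_t>(~X)};
    for (std::uint16_t Y : Others)
      if (equalByParts(X, Y) != equalWhole(X, Y))
        return false;
  }
  return true;
}
```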

2 years ago[InstCombine] Add tests for "eq of parts" with logical op (NFC)
Nikita Popov [Sun, 22 Aug 2021 14:43:27 +0000 (16:43 +0200)]
[InstCombine] Add tests for "eq of parts" with logical op (NFC)

We currently only handle this with a bitwise and/or instruction,
but not a logical.

2 years ago[X86][AVX] matchShuffleAsBlend - use isElementEquivalent to help match broadcast...
Simon Pilgrim [Sun, 22 Aug 2021 14:26:17 +0000 (15:26 +0100)]
[X86][AVX] matchShuffleAsBlend - use isElementEquivalent to help match broadcast/repeated elements

Extend matchShuffleAsBlend to not only match against known in-place elements for BLEND shuffles, but use isElementEquivalent to determine if the shuffle mask's referenced element is the same as the in-place element.

This allows us to replace a number of insertps instructions with more general blendps instructions (better opportunities for commutation, concatenation etc.).

2 years agoFix signed/unsigned comparison warning. NFCI.
Simon Pilgrim [Sun, 22 Aug 2021 14:02:19 +0000 (15:02 +0100)]
Fix signed/unsigned comparison warning. NFCI.

2 years ago[X86] Expose memory codegen in element insert load tests to improve accuracy of checks
Simon Pilgrim [Sun, 22 Aug 2021 13:54:36 +0000 (14:54 +0100)]
[X86] Expose memory codegen in element insert load tests to improve accuracy of checks

Also replace X32 with X86 check prefixes for i686 tests (we tend to try to use X32 for gnux32 targets)

2 years ago[X86][SSE] lowerVECTOR_SHUFFLE - canonicalize with horizontal ops.
Simon Pilgrim [Sun, 22 Aug 2021 13:17:39 +0000 (14:17 +0100)]
[X86][SSE] lowerVECTOR_SHUFFLE - canonicalize with horizontal ops.

Before lowering shuffles, see if we can merge horizontal ops or canonicalize the shuffle mask to point to the same LHS/RHS of the HOps when an HOp's args are repeated.

2 years ago[InstSimplify] fold rotate of -1 to -1
Sanjay Patel [Sun, 22 Aug 2021 13:13:59 +0000 (09:13 -0400)]
[InstSimplify] fold rotate of -1 to -1

This is part of solving more general rotate patterns seen in
bugs related to:
https://llvm.org/PR51575

https://alive2.llvm.org/ce/z/GpkFCt

2 years ago[InstSimplify] fold rotate of zero to zero
Sanjay Patel [Sun, 22 Aug 2021 13:10:52 +0000 (09:10 -0400)]
[InstSimplify] fold rotate of zero to zero

This is part of solving more general rotate patterns seen in
bugs related to:
https://llvm.org/PR51575

https://alive2.llvm.org/ce/z/fjKwqv
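
Why the zero and all-ones rotate folds are safe can be checked with a small stand-alone sketch. `rotl32` is a hand-written helper for illustration, not an LLVM function: rotating a value whose bits are all equal cannot change it, whatever the rotate amount.

```cpp
#include <cstdint>

// Rotate-left of a 32-bit value. The amount is reduced modulo 32 so
// every amount is well defined.
std::uint32_t rotl32(std::uint32_t V, unsigned Amt) {
  Amt %= 32;
  if (Amt == 0)
    return V;
  return (V << Amt) | (V >> (32 - Amt));
}

// Checks that 0 and all-ones are fixed points of rotation for a range
// of rotate amounts, including amounts larger than the bit width.
bool rotateKeepsUniformBits() {
  for (unsigned Amt = 0; Amt < 64; ++Amt) {
    if (rotl32(0u, Amt) != 0u)
      return false;
    if (rotl32(0xFFFFFFFFu, Amt) != 0xFFFFFFFFu)
      return false;
  }
  return true;
}
```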

2 years ago[InstSimplify] add tests for rotates of 0/-1; NFC
Sanjay Patel [Sun, 22 Aug 2021 13:09:49 +0000 (09:09 -0400)]
[InstSimplify] add tests for rotates of 0/-1; NFC

2 years ago[X86] Try to sync HSW + BDW model class defs to simplify comparisons. NFC.
Simon Pilgrim [Sun, 22 Aug 2021 12:02:51 +0000 (13:02 +0100)]
[X86] Try to sync HSW + BDW model class defs to simplify comparisons. NFC.

Broadwell is mainly a die shrink of Haswell, but the model had many of the scheduling classes in different orders, making side-by-side comparisons very difficult.

The InstRW overrides are still quite different, but at least that part of the side-by-side diff is now in the same position.

This was noticed while I was trying to investigate diffs between llvm-mca and other perf analyzers in https://uica.uops.info/ - we used to be able to do diffs between most of the models very easily, but we seem to have lost that simplicity as classes have been altered, models have been refined and other models have rotted.

2 years ago[InstCombine] generalize subtract with 'not' operands
Sanjay Patel [Sat, 21 Aug 2021 17:05:35 +0000 (13:05 -0400)]
[InstCombine] generalize subtract with 'not' operands

The motivation was to get min/max intrinsics to parity
with cmp+select idioms, but this unlocks a few more
folds because isFreeToInvert recognizes add/sub with
constants too.

In the min/max example, we have too many extra uses
for smaller folds to improve things, but this fold
is able to eliminate uses even though we can't reduce
the number of instructions.
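
As an illustration only: assuming the fold in question builds on the classic identity (~X) - (~Y) == Y - X (the message above does not spell the pattern out), the identity itself can be checked with wrapping unsigned arithmetic. The function names are hypothetical.

```cpp
#include <cstdint>

// Subtraction of two inverted operands, as it would appear before the fold.
std::uint32_t subOfNots(std::uint32_t X, std::uint32_t Y) { return ~X - ~Y; }

// The same value with both 'not' operations removed and operands swapped.
std::uint32_t swappedSub(std::uint32_t X, std::uint32_t Y) { return Y - X; }

// Checks the identity on a handful of boundary values.
bool identityHoldsOnSample() {
  const std::uint32_t Samples[] = {0u,          1u,          5u,         9u,
                                   0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  for (std::uint32_t X : Samples)
    for (std::uint32_t Y : Samples)
      if (subOfNots(X, Y) != swappedSub(X, Y))
        return false;
  return true;
}
```

The identity follows from ~V == -V - 1 in two's complement: the two "- 1" terms cancel and the signs flip.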

2 years agoCGBuiltin.cpp - pass SVETypeFlags by const reference. NFC.
Simon Pilgrim [Sat, 21 Aug 2021 19:46:12 +0000 (20:46 +0100)]
CGBuiltin.cpp - pass SVETypeFlags by const reference. NFC.

Don't pass the struct by value.

2 years ago[LV] Adjust reduction recipes before recurrence handling.
Florian Hahn [Sun, 22 Aug 2021 09:45:20 +0000 (10:45 +0100)]
[LV] Adjust reduction recipes before recurrence handling.

Adjusting the reduction recipes still relies on references to the
original IR, which can become outdated by the first-order recurrence
handling. Until reduction recipe construction no longer requires IR
references, move it before first-order recurrence handling, to prevent a
crash as exposed by D106653.

2 years ago[DAGCombiner] Add target hook function to decide folding (mul (add x, c1), c2)
Ben Shi [Thu, 19 Aug 2021 13:51:09 +0000 (21:51 +0800)]
[DAGCombiner] Add target hook function to decide folding (mul (add x, c1), c2)

Reviewed by: lebedev.ri, spatel, craig.topper, luismarques, jrtc27

Differential Revision: https://reviews.llvm.org/D107711

2 years ago[JITLink] Add support of R_X86_64_32S relocation
luxufan [Sun, 22 Aug 2021 08:43:02 +0000 (16:43 +0800)]
[JITLink] Add support of R_X86_64_32S relocation

This patch supports the R_X86_64_32S relocation and adds the Pointer32Signed generic edge kind.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D108446

2 years ago[ORC] Add std::tuple support to SimplePackedSerialization.
Lang Hames [Sun, 22 Aug 2021 00:43:06 +0000 (10:43 +1000)]
[ORC] Add std::tuple support to SimplePackedSerialization.

2 years ago[ORC] Rename blobSerializationRoundTrip, drop explicit arg types on calls.
Lang Hames [Sun, 22 Aug 2021 00:58:58 +0000 (10:58 +1000)]
[ORC] Rename blobSerializationRoundTrip, drop explicit arg types on calls.

Renames the blobSerializationRoundTrip test helper function to
spsSerializationRoundTrip ('blob' was the placeholder name for the serialization
scheme during prototyping, this function was missed when renaming everything
for the mainline). Also drops explicit template arguments at call sites where
they can be inferred (and are obvious) from the call argument type.

2 years ago[X86] AVX512FP16 instructions enabling 4/6
Wang, Pengfei [Sun, 22 Aug 2021 00:24:20 +0000 (08:24 +0800)]
[X86] AVX512FP16 instructions enabling 4/6

Enable FP16 unary operator instructions.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105267

2 years ago[ORC] Add missing header.
Lang Hames [Sun, 22 Aug 2021 00:34:38 +0000 (10:34 +1000)]
[ORC] Add missing header.

Should fix bot failure at
https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/4367

2 years ago[TargetCallingConv] Change OutputArg ctor to match its members
Fangrui Song [Sat, 21 Aug 2021 23:41:48 +0000 (16:41 -0700)]
[TargetCallingConv] Change OutputArg ctor to match its members

This avoids unneeded MVT->EVT conversion.

2 years ago[AArch64] Replace unneeded CCAssignToRegWithShadow with CCAssignToReg
Fangrui Song [Sat, 21 Aug 2021 23:33:29 +0000 (16:33 -0700)]
[AArch64] Replace unneeded CCAssignToRegWithShadow with CCAssignToReg

CCState::AllocateReg handles aliased registers.

2 years ago[TargetMachine] Drop special case for *-win32-macho
Fangrui Song [Sat, 21 Aug 2021 20:59:17 +0000 (13:59 -0700)]
[TargetMachine] Drop special case for *-win32-macho

clang CodeGenModule shouldAssumeDSOLocal has set dso_local.

2 years ago[TargetMachine] Simplify shouldAssumeDSOLocal. NFC
Fangrui Song [Sat, 21 Aug 2021 19:37:29 +0000 (12:37 -0700)]
[TargetMachine] Simplify shouldAssumeDSOLocal. NFC

2 years ago[clang] Fix typos in documentation (NFC)
Kazu Hirata [Sat, 21 Aug 2021 19:17:58 +0000 (12:17 -0700)]
[clang] Fix typos in documentation (NFC)

2 years ago[InstCombine] combine constants by reassociating add/sub/add
Sanjay Patel [Fri, 20 Aug 2021 22:34:09 +0000 (18:34 -0400)]
[InstCombine] combine constants by reassociating add/sub/add

This may overlap partially with the reassociate pass,
but it seems simple enough that we should try it here
in InstCombine to enable other folds.

This shows up as an opportunity and potential regression
if we improve a subtract fold with 'not' ops to be more
general.

2 years ago[InstCombine] add tests for add/sub/add combines; NFC
Sanjay Patel [Fri, 20 Aug 2021 21:50:18 +0000 (17:50 -0400)]
[InstCombine] add tests for add/sub/add combines; NFC

2 years ago[InstCombine] add tests for min/max with nots and sub; NFC
Sanjay Patel [Fri, 20 Aug 2021 21:33:19 +0000 (17:33 -0400)]
[InstCombine] add tests for min/max with nots and sub; NFC

2 years ago[ARM] Fix VQDMULH fold for scalar smin
David Green [Sat, 21 Aug 2021 15:33:18 +0000 (16:33 +0100)]
[ARM] Fix VQDMULH fold for scalar smin

Add a variant of mve-vqdmulh tests that uses min/max intrinsics
directly, including a scalar test that shows it misbehaving for min
intrinsics and a fix for the combine to prevent it from misbehaving.

2 years ago[flang] Refine output file generation
Andrzej Warzynski [Fri, 20 Aug 2021 10:25:11 +0000 (10:25 +0000)]
[flang] Refine output file generation

This patch cleans up the file generation code in Flang's frontend
driver. It improves the layering between
`CompilerInstance::CreateDefaultOutputFile`,
`CompilerInstance::CreateOutputFile` and their various clients.

* Rename `CreateOutputFile` as `CreateOutputFileImpl` and make it
  private. This method is an implementation detail.
* Instead of passing an `std::error_code` out parameter into
  `CreateOutputFileImpl`, have it return Expected<>. This is a bit shorter
  and more idiomatic LLVM.
* Make `CreateDefaultOutputFile` (which calls `CreateOutputFileImpl`)
  issue an error when file creation fails. The error code from
  `CreateOutputFileImpl` is used to generate a meaningful diagnostic
  message.
* Remove error reporting from `PrintPreprocessedAction::ExecuteAction`.
  This is only for cases when output file generation fails. This is
  handled in `CreateDefaultOutputFile` instead (see the previous point).
* Inline `AddOutputFile` into its only caller,
  `CreateDefaultOutputFile`.
* Switch from `llvm::buffer_ostream` to `llvm::buffer_unique_ostream`
  for non-seekable output streams. This simplifies the logic in the driver
  and was introduced for this very reason in [1].
* Make sure that the diagnostics from the prescanner when running `-E`
  (`PrintPreprocessedAction::ExecuteAction`) are printed before the actual
  output is generated.
* Update comments, add test.

NOTE: This patch relands [2]. As suggested by Michael Kruse in the
post-commit/post-revert review, I've added the following:
```
config.errc_messages = "@LLVM_LIT_ERRC_MESSAGES@"
```
in Flang's `lit.site.cfg.py.in`. This way, `%errc_ENOENT` in
output-paths.f90 gets the correct value on Windows as well as on Linux.

[1] https://reviews.llvm.org/D93260
[2] fd21d1e198e381a2b9e7af1701044462b2d386cd

Reviewed By: ashermancinelli

Differential Revision: https://reviews.llvm.org/D108390

2 years ago[lldb] Fix typo in the description of breakpoint options
Kirill Shmakov [Fri, 20 Aug 2021 12:27:37 +0000 (15:27 +0300)]
[lldb] Fix typo in the description of breakpoint options

2 years ago[gn build] Port 7f99337f9bcf
LLVM GN Syncbot [Sat, 21 Aug 2021 09:44:22 +0000 (09:44 +0000)]
[gn build] Port 7f99337f9bcf

2 years ago[ORC] Add EPCGenericMemoryAccess: generic executor memory access via EPC calls.
Lang Hames [Fri, 20 Aug 2021 05:52:42 +0000 (15:52 +1000)]
[ORC] Add EPCGenericMemoryAccess: generic executor memory access via EPC calls.

All ExecutorProcessControl subclasses must provide an
ExecutorProcessControl::MemoryAccess object that can be used to access executor
memory from the JIT process. The EPCGenericMemoryAccess class provides an
off-the-shelf MemoryAccess implementation for JITs that do not need (or cannot
provide) a specialized MemoryAccess implementation. This simplifies the process
of creating new ExecutorProcessControl implementations.

2 years ago[NFC][LoopIdiom] Let processLoopStoreOfLoopLoad take StoreSize as SCEV instead of...
eopXD [Thu, 19 Aug 2021 06:15:38 +0000 (23:15 -0700)]
[NFC][LoopIdiom] Let processLoopStoreOfLoopLoad take StoreSize as SCEV instead of unsigned

Letting it take SCEV allows further modification on the function to optimize
if the StoreSize / Stride is runtime determined.

The plan is to let memcpy / memmove deal with runtime-determined sizes, just
like what D107353 did to memset.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D108289

2 years ago[libc] Add a new suite called "libc-long-running-tests".
Siva Chandra Reddy [Mon, 21 Jun 2021 06:05:29 +0000 (06:05 +0000)]
[libc] Add a new suite called "libc-long-running-tests".

This suite is helpful in adding long running tests which take a long
time to finish that they can be run on the public builders. They
will probably be run on special builders in future.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D104816

2 years ago[CodeGen] Remove unused declaration setLiveInsUsed (NFC)
Kazu Hirata [Sat, 21 Aug 2021 02:19:54 +0000 (19:19 -0700)]
[CodeGen] Remove unused declaration setLiveInsUsed (NFC)

The corresponding definition was removed on Jan 20, 2017 in commit
710a4c1f3ddba3aa9313c72c43f9619afbc3e259.

2 years ago[OpenMP] Correctly add member expressions to OpenMP info
Joseph Huber [Fri, 20 Aug 2021 20:43:31 +0000 (16:43 -0400)]
[OpenMP] Correctly add member expressions to OpenMP info

Mapping expressions that have `this` as their base expression aren't
considered a valid base variable and the rest of the runtime expects
this. However, if we have an expression with no value declaration we can
try to extract it manually to provide more helpful debugging information.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108483

2 years ago[AArch64][GlobalISel] Add legalizer support for the @llvm.get.dynamic.area.offset...
Amara Emerson [Sat, 21 Aug 2021 00:04:36 +0000 (17:04 -0700)]
[AArch64][GlobalISel] Add legalizer support for the @llvm.get.dynamic.area.offset intrinsic.

This is just 0 on AArch64.

2 years ago[Bazel] Fix version defines
Geoffrey Martin-Noble [Fri, 20 Aug 2021 23:53:46 +0000 (16:53 -0700)]
[Bazel] Fix version defines

Some of these were the wrong version and some of them were the wrong
format. Did some hunting around to figure out what exactly they're
supposed to be. Since basically everything is derived from the LLVM
version we should probably make this a bit less hardcoded, but just
fixing the values for now.

Sources:
https://github.com/llvm/llvm-project/blob/b686fc7a1bea/clang/include/clang/Basic/Version.inc.in
https://github.com/llvm/llvm-project/blob/b686fc7a1bea/clang/CMakeLists.txt#L353-L363
https://github.com/llvm/llvm-project/blob/b686fc7a1bea/llvm/CMakeLists.txt#L13-L29
https://github.com/llvm/llvm-project/blob/b686fc7a1bea/lld/CMakeLists.txt#L131-L138

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D108500

2 years ago[Driver] Remove discouraged -gcc-toolchain
Fangrui Song [Fri, 20 Aug 2021 23:36:42 +0000 (16:36 -0700)]
[Driver] Remove discouraged -gcc-toolchain

Space separated driver options are uncommon but Clang traditionally
did not do a good job. --gcc-toolchain= is the preferred form.

This discouraged form appears to be rare, so we can just drop it.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D108494

2 years ago[AArch64][GlobalISel] Don't contract cross-bank copies into truncating stores.
Amara Emerson [Fri, 20 Aug 2021 23:23:23 +0000 (16:23 -0700)]
[AArch64][GlobalISel] Don't contract cross-bank copies into truncating stores.

Truncating stores with GPR bank sources shouldn't be mutated into using FPR bank
sources, since those aren't supported.

Ideally this should be a selection failure in the tablegen patterns, but for now
avoid generating them.

2 years ago[Bazel] Reduce quote escaping
Geoffrey Martin-Noble [Fri, 20 Aug 2021 22:58:49 +0000 (15:58 -0700)]
[Bazel] Reduce quote escaping

There's a lot of unnecessary backslashes here that we can avoid to
reduce confusion.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D108495

2 years ago[MLIR][OMP] Ensure nested scf.parallel execute all iterations
William S. Moses [Thu, 19 Aug 2021 23:45:58 +0000 (19:45 -0400)]
[MLIR][OMP] Ensure nested scf.parallel execute all iterations

Presently, the lowering of nested scf.parallel loops to OpenMP creates one omp.parallel region, with two (nested) OpenMP worksharing loops on the inside. When lowered to LLVM and executed, this results in incorrect results. The reason for this is as follows:

An OpenMP parallel region results in the code being run with whatever number of threads is available to OpenMP. Within a parallel region, a worksharing loop divides the total number of requested iterations by the available number of threads and distributes them accordingly. For a single ws loop in a parallel region, this works as intended.

Now consider nested ws loops as follows:

omp.parallel {
   A: omp.ws %i = 0...10 {
      B: omp.ws %j = 0...10 {
          code(%i, %j)
      }
   }
}

Suppose we ran this on two threads. The first workshare loop would decide to execute iterations 0, 1, 2, 3, 4 on thread 0, and iterations 5, 6, 7, 8, 9 on thread 1. The second workshare loop would decide the same for its iterations. This means thread 0 would execute i in [0, 5) and j in [0, 5). Thread 1 would execute i in [5, 10) and j in [5, 10). This means that iterations i in [5, 10), j in [0, 5) and i in [0, 5), j in [5, 10) never get executed, which is clearly wrong.

This permits two options for a remedy:
1) Change the semantics of the omp.wsloop to be distinct from that of the OpenMP runtime call or equivalently #pragma omp for. This could then allow some lowering transformation to remedy the aforementioned issue. I don't think this is desirable from an abstraction standpoint.
2) When lowering an scf.parallel always surround the wsloop with a new parallel region (thereby causing the innermost wsloop to use the number of threads available only to it).

This PR implements the latter change.
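
The missing iterations can be reproduced with a small single-threaded simulation. This is a sketch under stated assumptions, not the MLIR lowering: it assumes static, contiguous chunking, and `chunk`, `sharedRegion` and `nestedRegion` are hypothetical names. Threads are modelled as loop indices, so no OpenMP runtime is needed.

```cpp
#include <set>
#include <utility>

// Assumed static chunking: thread T of NumThreads gets the contiguous
// block [T*N/NumThreads, (T+1)*N/NumThreads).
std::pair<int, int> chunk(int N, int NumThreads, int T) {
  return std::make_pair(T * N / NumThreads, (T + 1) * N / NumThreads);
}

// Two nested worksharing loops inside ONE parallel region: each thread
// applies its own chunk to both the outer and the inner loop.
std::set<std::pair<int, int>> sharedRegion(int N, int NumThreads) {
  std::set<std::pair<int, int>> Executed;
  for (int T = 0; T < NumThreads; ++T) {
    const std::pair<int, int> C = chunk(N, NumThreads, T);
    for (int I = C.first; I < C.second; ++I)
      for (int J = C.first; J < C.second; ++J)
        Executed.emplace(I, J);
  }
  return Executed;
}

// Option 2 from the text: the inner loop gets its own parallel region,
// so a fresh team covers the whole inner range for every outer iteration.
std::set<std::pair<int, int>> nestedRegion(int N, int NumThreads) {
  std::set<std::pair<int, int>> Executed;
  for (int T = 0; T < NumThreads; ++T) {
    const std::pair<int, int> Outer = chunk(N, NumThreads, T);
    for (int I = Outer.first; I < Outer.second; ++I)
      for (int U = 0; U < NumThreads; ++U) {
        const std::pair<int, int> Inner = chunk(N, NumThreads, U);
        for (int J = Inner.first; J < Inner.second; ++J)
          Executed.emplace(I, J);
      }
  }
  return Executed;
}
```

For the 10 x 10 example on two threads, the first function visits only the two diagonal blocks, while the second visits every (i, j) pair.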

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108426

2 years ago[test] Migrate -gcc-toolchain with space separator to --gcc-toolchain=
Fangrui Song [Fri, 20 Aug 2021 22:24:58 +0000 (15:24 -0700)]
[test] Migrate -gcc-toolchain with space separator to --gcc-toolchain=

Space separated driver options are uncommon but Clang traditionally
did not do a good job. --gcc-toolchain= is the preferred form.

2 years ago[AArch64][GlobalISel] Legalize non-register-sized scalar G_BITREVERSE
Jessica Paquette [Wed, 18 Aug 2021 23:48:04 +0000 (16:48 -0700)]
[AArch64][GlobalISel] Legalize non-register-sized scalar G_BITREVERSE

Clamp types to [s32, s64] and make them a power of 2.

This matches SDAG's behaviour.
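
The width arithmetic implied by "clamp to [s32, s64] and make a power of 2" can be sketched as follows. `legalScalarWidth` is a hypothetical helper, not part of the GlobalISel LegalizerInfo API, and it only computes the resulting bit width; it does not model how the legalizer rewrites the operation.

```cpp
#include <cassert>

// Rounds a scalar bit width up to a power of two, then clamps the
// result to the range [32, 64].
unsigned legalScalarWidth(unsigned Bits) {
  assert(Bits > 0 && "a scalar type needs at least one bit");
  if (Bits >= 64)
    return 64; // Clamp to the upper bound.
  unsigned Pow2 = 1;
  while (Pow2 < Bits)
    Pow2 <<= 1; // Round up to the next power of two.
  return Pow2 < 32 ? 32 : Pow2; // Clamp to the lower bound.
}
```

Under these assumptions a narrow type such as s24 ends up as s32, and s33 ends up as s64.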

https://godbolt.org/z/vTeGqf4vT

Differential Revision: https://reviews.llvm.org/D108344

2 years ago[AArch64][GlobalISel] Legalize 32-bit + narrow G_SMULO + G_UMULO
Jessica Paquette [Tue, 17 Aug 2021 20:58:58 +0000 (13:58 -0700)]
[AArch64][GlobalISel] Legalize 32-bit + narrow G_SMULO + G_UMULO

SDAG lowers 32-bit and 64-bit G_SMULO + G_UMULO. We were missing the 32-bit
case.

For other sizes, make the 0th type a power of 2 and clamp it to either 32 bits
or 64 bits.

Right now, this will allow us to handle narrow types (e.g. s4, s24, etc.). The
LegalizerHelper doesn't support narrowing G_SMULO or G_UMULO right now. I think
we want clamping behaviour either way, so we might as well include it now to
be explicit.

Differential Revision: https://reviews.llvm.org/D108240

2 years ago[AArch64][GlobalISel] Clamp vectors of p0 when legalizing G_LOAD/G_STORE
Jessica Paquette [Fri, 20 Aug 2021 21:00:17 +0000 (14:00 -0700)]
[AArch64][GlobalISel] Clamp vectors of p0 when legalizing G_LOAD/G_STORE

We had a rule for <n x s64> but not one for <n x p0>. As a result, we'd fall
back on like <5 x p0> or whatever.

Differential Revision: https://reviews.llvm.org/D108484

2 years ago[AArch64][GlobalISel] Add regbankselect support for G_LROUND
Jessica Paquette [Thu, 19 Aug 2021 22:53:39 +0000 (15:53 -0700)]
[AArch64][GlobalISel] Add regbankselect support for G_LROUND

Destination is always a GPR, since the result is always an integer.

Source is always a FPR, since the source is always floating point.

Differential Revision: https://reviews.llvm.org/D108419

2 years ago[libunwind] Add UNW_AARCH64_* beside UNW_ARM64_*
Fangrui Song [Fri, 20 Aug 2021 21:26:27 +0000 (14:26 -0700)]
[libunwind] Add UNW_AARCH64_* beside UNW_ARM64_*

The original libunwind project defines UNW_AARCH64_* instead of UNW_ARM64_*.
Rename the enum members to match. This allows some applications with simple
`unw_init_local` usage to migrate to llvm-project libunwind.

Note: the canonical names of `UNW_ARM64_D{0..31}` are now `UNW_AARCH64_V{0..31}`,
to match the original libunwind.

UNW_ARM64_* are kept for now for compatibility. Some may be unneeded and can be
cleaned up in the future.

Reviewed By: #libunwind, compnerd

Differential Revision: https://reviews.llvm.org/D107996

2 years ago[AArch64][GlobalISel] Mark G_LROUND as legal for s64 dst + s32/s64 src.
Jessica Paquette [Thu, 19 Aug 2021 23:06:15 +0000 (16:06 -0700)]
[AArch64][GlobalISel] Mark G_LROUND as legal for s64 dst + s32/s64 src.

Matches SDAG's behaviour for these types.

Differential Revision: https://reviews.llvm.org/D108420

2 years ago[NFC] addAttribute(FunctionIndex) => addFnAttribute()
Arthur Eubanks [Fri, 20 Aug 2021 21:18:30 +0000 (14:18 -0700)]
[NFC] addAttribute(FunctionIndex) => addFnAttribute()

2 years ago[GlobalISel] Add G_LLROUND
Jessica Paquette [Fri, 20 Aug 2021 00:39:30 +0000 (17:39 -0700)]
[GlobalISel] Add G_LLROUND

Basically the same as G_LROUND. Handles the llvm.llround family of intrinsics.

Also add a helper function to the MachineVerifier for checking if all of the
(virtual register) operands of an instruction are scalars. Seems like a useful
thing to have.

Differential Revision: https://reviews.llvm.org/D108429

2 years ago[LoopPassManager] Assert that MemorySSA is preserved if used
Nikita Popov [Thu, 19 Aug 2021 18:56:09 +0000 (20:56 +0200)]
[LoopPassManager] Assert that MemorySSA is preserved if used

Currently it's possible to silently use a loop pass that does not
preserve MemorySSA in a loop-mssa pass manager, as we don't
statically know which loop passes preserve MemorySSA (as was the
case with the legacy pass manager).

However, we can at least add a check after the fact that if
MemorySSA is used, then it should also have been preserved.
Hopefully this will reduce confusion as seen in
https://bugs.llvm.org/show_bug.cgi?id=51020.

Differential Revision: https://reviews.llvm.org/D108399

2 years ago[NFC][MLGO] Use std::move when moving protobufs
Mircea Trofin [Fri, 20 Aug 2021 19:50:20 +0000 (12:50 -0700)]
[NFC][MLGO] Use std::move when moving protobufs

Because of an odd linking problem, we need to temporarily support
building with TF C API 1.15 + tensorflow 2.50 pip package in
'development' mode scenarios. Protobuf Message 'Swap' is partially
implemented in the header (2.50) and relies on a symbol not found in TF
C API 1.15. std::move avoids that, at no semantic cost.

2 years agoRevert "[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64"
Florian Hahn [Fri, 20 Aug 2021 20:22:59 +0000 (21:22 +0100)]
Revert "[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64"

This reverts commit f4122398e7c195147cde120d070f9b72905d7c91 to
investigate a crash exposed by it.

The patch breaks building the code below with `clang -O2 --target=aarch64-linux`

     int a;
     double b, c;
     void d() {
       for (; a; a++) {
         b += c;
         c = a;
       }
     }

2 years ago[DebugInfo] convert btf_tag attrs to DI annotations for record fields
Yonghong Song [Fri, 20 Aug 2021 19:52:51 +0000 (12:52 -0700)]
[DebugInfo] convert btf_tag attrs to DI annotations for record fields

Generate btf_tag annotations for record fields. The annotations
are represented as a DINodeArray in DebugInfo.

Differential Revision: https://reviews.llvm.org/D106616

2 years ago[mlir][linalg] Finish refactor of TC ops to YAML
Rob Suderman [Thu, 12 Aug 2021 23:20:56 +0000 (16:20 -0700)]
[mlir][linalg] Finish refactor of TC ops to YAML

Multiple operations were still defined as TC ops that had equivalent versions
as YAML operations. Reducing to a single compilation path guarantees that
frontends can lower to their equivalent operations without missing the
optimized fastpath.

Some operations are maintained purely for testing purposes (mainly conv{1,2,3}D,
as they are included as sole tests in the vectorization transforms).

Differential Revision: https://reviews.llvm.org/D108169

2 years agoFix SEH table addresses for Windows
Daniel Paoliello [Fri, 20 Aug 2021 18:38:50 +0000 (21:38 +0300)]
Fix SEH table addresses for Windows

Issue Details:
The addresses for SEH tables for Windows are incorrect as 1 was unconditionally being added to all addresses. +1 is required for the SEH end address (as it is exclusive), but the SEH start address is inclusive and so should be used as-is.

In the IP2State tables, the addresses are +1 for AMD64 to handle the return address for a call being after the actual call instruction but are as-is for ARM and ARM64 as the `StateFromIp` function in the VC runtime automatically takes this into account and adjusts the address that it is looking up.

Fix Details:
* Split the `getLabel` function into two: `getLabel` (used for the SEH start address and ARM+ARM64 IP2State addresses) and `getLabelPlusOne` (for the SEH end address, and AMD64 IP2State addresses).

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D107784

2 years ago[TypePromotion] Remove unused IRBuilder object. NFC
Craig Topper [Fri, 20 Aug 2021 19:08:42 +0000 (12:08 -0700)]
[TypePromotion] Remove unused IRBuilder object. NFC

2 years ago[DebugInfo] generate btf_tag annotations for DIDerived types
Yonghong Song [Mon, 19 Jul 2021 07:12:15 +0000 (00:12 -0700)]
[DebugInfo] generate btf_tag annotations for DIDerived types

Generate btf_tag annotations for DIDerived types. More specifically,
the clang frontend generates the btf_tag annotations for record
fields. The annotations are represented as a DINodeArray
in DebugInfo. The following example illustrates how
annotations are encoded in IR:
      distinct !DIDerivedType(tag: DW_TAG_member, ..., annotations: !10)
      !10 = !{!11, !12}
      !11 = !{!"btf_tag", !"a"}
      !12 = !{!"btf_tag", !"b"}

Differential Revision: https://reviews.llvm.org/D106616

2 years ago[libc++] Remove test-suite annotations for unsupported Clang versions
Louis Dionne [Fri, 20 Aug 2021 15:42:38 +0000 (11:42 -0400)]
[libc++] Remove test-suite annotations for unsupported Clang versions

Differential Revision: https://reviews.llvm.org/D108471

2 years ago[libc++] Include <__iterator/distance.h> instead of <iterator> in a few algorithm...
Joe Loser [Fri, 20 Aug 2021 19:02:03 +0000 (15:02 -0400)]
[libc++] Include <__iterator/distance.h> instead of <iterator> in a few algorithm headers

A few headers in algorithm include `<iterator>` when
`<__iterator/distance.h>` would suffice. Change them
to just include `<__iterator/distance.h>`.

Differential Revision: https://reviews.llvm.org/D108393

2 years ago[Coverage][llvm-cov] Correctly export branch coverage in LCOV format
Christian Fetzer [Fri, 20 Aug 2021 18:24:44 +0000 (13:24 -0500)]
[Coverage][llvm-cov] Correctly export branch coverage in LCOV format

Commit 9f2967bcfe2f7d1fc02281f0098306c90c2c10a5 introduced support for
branch coverage including export to the LCOV format.

This commit corrects the LCOV field name for branches from BFH to BRH.
The mistake seems to have slipped in as a typo because the correct field
name BRH is used in the comment section at the beginning of the file.
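
For reference, a minimal sketch of the two branch summary records in an LCOV tracefile: BRF carries the number of branches found and BRH the number of branches hit. `lcovBranchSummary` is a hypothetical helper, not the llvm-cov exporter.

```cpp
#include <string>

// Formats the LCOV branch summary records for one source file:
// BRF:<branches found> followed by BRH:<branches hit>.
std::string lcovBranchSummary(unsigned Found, unsigned Hit) {
  return "BRF:" + std::to_string(Found) + "\n" +
         "BRH:" + std::to_string(Hit) + "\n";
}
```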

Differential Revision: https://reviews.llvm.org/D108358

2 years agoPR46874: Reset stack after visiting a node
Aditya Kumar [Fri, 20 Aug 2021 02:03:41 +0000 (19:03 -0700)]
PR46874: Reset stack after visiting a node

When the stack is not reset, it keeps previously visited basic blocks,
which results in bugs where an instruction is hoisted to a
predecessor where the instruction was not fully anticipable.

Differential Revision: https://reviews.llvm.org/D108425

2 years ago[mlir][sparse] add test for DimOp folding
Aart Bik [Fri, 20 Aug 2021 16:41:28 +0000 (09:41 -0700)]
[mlir][sparse] add test for DimOp folding

Folding in the MLIR uses the order of the type directly
but folding in the underlying implementation must take
the dim ordering into account. These tests clarify that
behavior and verify it is done right.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108474

2 years ago[SystemZ][z/OS] Avoid assumption for character value in futures tests
Muiez Ahmed [Fri, 20 Aug 2021 18:03:03 +0000 (14:03 -0400)]
[SystemZ][z/OS] Avoid assumption for character value in futures tests

The aim of this patch is to remove the assumption that the character 'a' is always 97. In turn, this patch explicitly uses the character values to account for the EBCDIC 'a' that is not 97.
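
A short sketch of the portability point, using hypothetical helper names rather than the actual futures tests. A character literal is translated to the execution character set by the compiler, whereas a hard-coded number is not.

```cpp
// Portable: the literal 'a' has whatever value the execution character
// set assigns to it, so this holds on ASCII and EBCDIC targets alike.
bool isLowerA(char C) { return C == 'a'; }

// Brittle variant, shown for contrast: 97 is the ASCII value of 'a',
// so this check silently breaks on an EBCDIC target.
bool isLowerAAsciiOnly(char C) { return C == 97; }
```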

Differential Revision: https://reviews.llvm.org/D108321

2 years ago[libc] make the scudo integration test run
Michael Jones [Thu, 19 Aug 2021 21:07:46 +0000 (21:07 +0000)]
[libc] make the scudo integration test run

Adds a custom command for libc-scudo-integration-test that makes it run
when it is built.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D108409

2 years ago[NFC] Cleanup/remove some AttributeList setter methods
Arthur Eubanks [Fri, 20 Aug 2021 17:28:32 +0000 (10:28 -0700)]
[NFC] Cleanup/remove some AttributeList setter methods

2 years ago[NFC] Remove unused CallBase::addDereferenceableOrNullAttr()
Arthur Eubanks [Fri, 20 Aug 2021 17:23:27 +0000 (10:23 -0700)]
[NFC] Remove unused CallBase::addDereferenceableOrNullAttr()

2 years ago[MCA] Fixing bug that was causing LSUnit not to realize an instruction finished execu...
Patrick Holland [Fri, 20 Aug 2021 03:04:03 +0000 (20:04 -0700)]
[MCA] Fixing bug that was causing LSUnit not to realize an instruction finished executing when the instruction has 0 latency.

Differential Revision: https://reviews.llvm.org/D108443

2 years agoMake test_symbols.py compare files line-by-line
Ivan Zhechev [Fri, 20 Aug 2021 17:14:59 +0000 (18:14 +0100)]
Make test_symbols.py compare files line-by-line

We currently feed full files to Python's unified_diff.
It's not quite what we want though -
line-by-line comparison makes more sense
(we want to be able to identify missing/unnecessary lines)
and is also easier to parse for humans.
This patch makes sure that we compare one line at a time.

This change pretties up the output formatting in the script.
Output before:

```
!DEF:/m/s/xINTENT(IN)(Implicit)ObjectEntityREAL(4)
!DEF:/m/s/yINTENT(INOUT)(Implicit)ObjectEntityREAL(4)
-!-D-E-F-:-f-o-o-b-a-r-
puresubroutines(x,y)bind(c)
!REF:/m/s/x
intent(in)::x
```
Proposed output after:

```
!DEF:/m/s/xINTENT(IN)(Implicit)ObjectEntityREAL(4)
!DEF:/m/s/yINTENT(INOUT)(Implicit)ObjectEntityREAL(4)
-!DEF:foobar
puresubroutines(x,y)bind(c)
!REF:/m/s/x
intent(in)::x
```
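
A minimal sketch of a line-by-line comparison, assuming Python's standard difflib; `diff_lines` is a hypothetical helper and not the code from test_symbols.py. Passing whole strings to `difflib.unified_diff` makes it iterate over individual characters, which is consistent with the dash-separated output shown before the change, while passing lists of lines yields one diff entry per line:

```python
import difflib


def diff_lines(expected, actual):
    """Return a unified diff that compares the two texts one line at a time."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),  # a list of lines, not one long string
            actual.splitlines(),
            lineterm="",
        )
    )


expected_text = "!DEF: /m/s/x\n!DEF: foobar\npure subroutine s(x,y)\n"
actual_text = "!DEF: /m/s/x\npure subroutine s(x,y)\n"
for entry in diff_lines(expected_text, actual_text):
    print(entry)
```

The sample strings are made up for the example; the removed line shows up as a single `-` entry rather than one entry per character.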

Reviewed By: Meinersbur, awarzynski

Differential Revision: https://reviews.llvm.org/D107954

2 years ago[libc++] Remove more test-suite workarounds for unsupported GCC versions
Louis Dionne [Fri, 20 Aug 2021 15:39:46 +0000 (11:39 -0400)]
[libc++] Remove more test-suite workarounds for unsupported GCC versions

Differential Revision: https://reviews.llvm.org/D108466

2 years ago[libc++][PowerPC] Fix a test case failure when compiled with libcxx
Albion Fung [Fri, 20 Aug 2021 17:23:32 +0000 (13:23 -0400)]
[libc++][PowerPC] Fix a test case failure when compiled with libcxx

The test case is not run unless libcxx is used, and a macro
may be undefined. This patch checks for the definition of the
macro before using it.

Fixes http://llvm.org/PR51430

Differential Revision: https://reviews.llvm.org/D108352

2 years agoRevert "[openmp][nfc] Refactor GridValues"
Jon Chesterfield [Fri, 20 Aug 2021 17:17:27 +0000 (18:17 +0100)]
Revert "[openmp][nfc] Refactor GridValues"

Failed a nvptx codegen test
This reverts commit 2a47a84b40115b01e03e4d89c1d47ba74beb7bf3.

2 years ago[cmake] Fix native tooling when cross-compiling on Linux
Shoaib Meenai [Fri, 20 Aug 2021 16:20:28 +0000 (09:20 -0700)]
[cmake] Fix native tooling when cross-compiling on Linux

At least as of CMake 3.20.3, the CMake platform file for Linux doesn't
define the file type prefix and suffix variables, relying on them being
implicitly empty when they're unset. If we're cross-compiling targeting
Windows on a Linux machine, the values of these prefixes and suffixes
populated by the Windows platform file will still be set after including
the Linux platform file, so we'll incorrectly assume the ".exe" suffix
for the host machine. Explicitly unset the variables before including
the platform file, to prevent any previous values from leaking. Thanks
@beanz for suggesting the fix.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D108473

2 years ago[NFC] Simplify some CallBase attribute methods
Arthur Eubanks [Thu, 19 Aug 2021 21:41:25 +0000 (14:41 -0700)]
[NFC] Simplify some CallBase attribute methods

2 years ago[NFC] Remove some unused functions
Arthur Eubanks [Thu, 19 Aug 2021 21:38:06 +0000 (14:38 -0700)]
[NFC] Remove some unused functions

2 years ago[X86][SchedModels] Fix missing ReadAdvance for MULX and ADCX/ADOX (PR51494)
Andrea Di Biagio [Wed, 18 Aug 2021 19:35:57 +0000 (20:35 +0100)]
[X86][SchedModels] Fix missing ReadAdvance for MULX and ADCX/ADOX (PR51494)

Before this patch, instructions MULX32rm and MULX64rm were missing a ReadAdvance
for the implicit read of register EDX/RDX.  This patch fixes the issue, and it
also introduces a new SchedWrite for the two variants of MULX. The general idea
behind this last change is to eventually decrease the number of InstRW in the
scheduling models.

This patch also adds a ReadAdvance for the implicit read of EFLAGS in ADCX/ADOX.

Differential Revision: https://reviews.llvm.org/D108372

2 years ago[X86] Add missing __inline__ to functions in amxintrin.h
Craig Topper [Fri, 20 Aug 2021 16:34:03 +0000 (09:34 -0700)]
[X86] Add missing __inline__ to functions in amxintrin.h

2 years ago[AggressiveInstCombine] guard against applying instruction flags with constant folding
Sanjay Patel [Fri, 20 Aug 2021 16:20:04 +0000 (12:20 -0400)]
[AggressiveInstCombine] guard against applying instruction flags with constant folding

This is a minimized version of a crash reported in:
D108201

2 years ago[WebAssembly] Restore builtins and intrinsics for pmin/pmax
Thomas Lively [Fri, 20 Aug 2021 16:21:31 +0000 (09:21 -0700)]
[WebAssembly] Restore builtins and intrinsics for pmin/pmax

Partially reverts 85157c007903, which had removed these builtins and intrinsics
in favor of normal codegen patterns. It turns out that it is possible for the
patterns to be split over multiple basic blocks, however, which means that DAG
ISel is not able to select them to the pmin/pmax instructions. To make sure the
SIMD intrinsics generate the correct instructions in these cases, reintroduce
the clang builtins and corresponding LLVM intrinsics, but also keep the normal
pattern matching as well.

Differential Revision: https://reviews.llvm.org/D108387

2 years ago[mlir][sparse][python] migrate more code from boilerplate into proper numpy land
Aart Bik [Thu, 19 Aug 2021 22:50:24 +0000 (15:50 -0700)]
[mlir][sparse][python] migrate more code from boilerplate into proper numpy land

The boilerplate was setting up some arrays for testing. To fully illustrate
python - MLIR potential, however, this data should also come from numpy land.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108336

2 years ago[libc++][NFC] Fix minor errors and inconsistencies in the test suite
Louis Dionne [Fri, 20 Aug 2021 16:09:41 +0000 (12:09 -0400)]
[libc++][NFC] Fix minor errors and inconsistencies in the test suite

2 years ago[WebAssembly] Make shift values unsigned in wasm_simd128.h
Thomas Lively [Fri, 20 Aug 2021 16:10:36 +0000 (09:10 -0700)]
[WebAssembly] Make shift values unsigned in wasm_simd128.h

On some platforms, negative shift values mean to shift in the opposite
direction, but this is not true with WebAssembly. To avoid confusion, make the
shift values in the shift intrinsics unsigned.
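
To illustrate, here is a sketch that models a single 32-bit lane, assuming the WebAssembly rule that the shift count is taken modulo the lane width. The function name is hypothetical and this is not the header's implementation:

```python
LANE_BITS = 32
LANE_MASK = (1 << LANE_BITS) - 1


def lane_shl(value, count):
    """Model a left shift of one 32-bit lane with a wrapped shift count."""
    # The count wraps modulo the lane width, so a "negative" count never
    # reverses the direction; -1 behaves like 31.
    return (value << (count % LANE_BITS)) & LANE_MASK
```

Under this model `lane_shl(1, -1)` sets the top bit instead of shifting right by one, which is the kind of surprise an unsigned parameter type makes harder to write by accident.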

Differential Revision: https://reviews.llvm.org/D108415

2 years agoReplace an unnecessary null check with an assert; NFC
Aaron Ballman [Fri, 20 Aug 2021 16:04:09 +0000 (12:04 -0400)]
Replace an unnecessary null check with an assert; NFC

2 years ago[WebAssembly] Add SIMD intrinsics using unsigned integers
Thomas Lively [Fri, 20 Aug 2021 15:56:51 +0000 (08:56 -0700)]
[WebAssembly] Add SIMD intrinsics using unsigned integers

For each SIMD intrinsic function that takes or returns a scalar signed integer
value, ensure there is a corresponding intrinsic that takes or returns an
unsigned value. This is a convenience for users who use -Wsign-conversion so
they don't have to insert explicit casts, especially when the intrinsic
arguments are integer literals that fit into the unsigned integer type but not
the signed type.

Differential Revision: https://reviews.llvm.org/D108412

2 years ago[openmp][nfc] Refactor GridValues
Jon Chesterfield [Fri, 20 Aug 2021 15:41:25 +0000 (16:41 +0100)]
[openmp][nfc] Refactor GridValues

Remove redundant fields and replace pointer with virtual function

Of fourteen fields, three are dead and four can be computed from the
remainder. This leaves a couple of currently dead fields in place as
they are expected to be used from the deviceRTL shortly. Two of the
fields that can be computed are only used from codegen and require a
log2() implementation so are inlined into codegen instead.
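
As a rough illustration of replacing stored fields with computed ones, the sketch below keeps a single value and derives the rest. The class and field names are hypothetical and do not mirror the real GridValues struct:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridValuesSketch:
    """Hypothetical stand-in with one stored field and two derived ones."""

    warp_size: int  # assumed to be a power of two

    @property
    def warp_size_log2(self) -> int:
        # For a power of two, the bit length minus one is the exact log2.
        return self.warp_size.bit_length() - 1

    @property
    def warp_lane_mask(self) -> int:
        # Low bits that select a lane within a warp.
        return self.warp_size - 1
```

Deriving the values keeps them consistent by construction, at the cost of a small computation at each use.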

This change leaves the new methods in the same location in the struct
as the previous fields for convenience at review.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108380

2 years agoMake wide multi-character character literals ill-formed
Corentin Jabot [Fri, 20 Aug 2021 15:10:53 +0000 (11:10 -0400)]
Make wide multi-character character literals ill-formed

This implements P2362, which has not yet been approved by the
C++ committee, but because wide multi-character literals are
implementation-defined, clang might not have to wait for WG21.

This change is also being applied in C mode as the behavior is
implementation-defined in C as well and there's no benefit to
having different rules between the languages.

The other part of P2362, making non-representable character
literals ill-formed, is already implemented by clang

2 years agoUse DeclContext::getNonTransparentContext(); NFC
Aaron Ballman [Fri, 20 Aug 2021 15:08:58 +0000 (11:08 -0400)]
Use DeclContext::getNonTransparentContext(); NFC

2 years ago[RISCV] Optimize add in the zba extension with SH*ADD
Ben Shi [Tue, 17 Aug 2021 08:40:16 +0000 (16:40 +0800)]
[RISCV] Optimize add in the zba extension with SH*ADD

Optimize (add x, c) to (SH*ADD (c>>b), x) if c is not simm12
while (c>>b) is simm12 and c has b trailing zeros.
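
The condition can be checked with a short sketch, assuming SHbADD computes (rs1 << b) + rs2 for b in 1, 2, 3. The helper names are hypothetical, and trying larger shifts first is an arbitrary choice for the example, not necessarily what the backend does:

```python
def is_simm12(value):
    """True if the value fits a signed 12-bit immediate."""
    return -2048 <= value <= 2047


def shadd(b, rs1, rs2):
    """Model SHbADD: shift rs1 left by b and add rs2."""
    return (rs1 << b) + rs2


def pick_shadd(c):
    """Return (b, c >> b) if (add x, c) can use SHbADD, else None."""
    if is_simm12(c):
        return None  # a plain ADDI already handles this constant
    for b in (3, 2, 1):
        has_trailing_zeros = c % (1 << b) == 0
        if has_trailing_zeros and is_simm12(c >> b):
            return b, c >> b
    return None
```

For c = 8192 this picks b = 3 with the immediate 1024, and `shadd(3, 1024, x)` equals `x + 8192`; a constant such as 4098 has only one trailing zero and 2049 does not fit, so no rewrite applies.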

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D108193

2 years ago[libc++] Fix XFAIL annotation
Louis Dionne [Fri, 20 Aug 2021 14:17:21 +0000 (10:17 -0400)]
[libc++] Fix XFAIL annotation

The triple can sometimes be arm64-apple-macos, where the previous XFAIL
annotation wouldn't match (and hence the test would fail unexpectedly).

2 years ago[asan] Implemented getAddressSanitizerParams used by the ASan callback optimization...
Kirill Stoimenov [Thu, 19 Aug 2021 18:11:10 +0000 (18:11 +0000)]
[asan] Implemented getAddressSanitizerParams used by the ASan callback optimization code.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D108397

2 years ago[mlir][ods] Skip adding TOC in doc gen when present
Jacques Pienaar [Fri, 20 Aug 2021 14:01:54 +0000 (07:01 -0700)]
[mlir][ods] Skip adding TOC in doc gen when present

Enables adding a TOC in the description to be able to interleave
documentation before and after the TOC.

2 years ago[msan] Hotfix clang/test/CodeGen/sanitize-memory-disable.c
Alexander Potapenko [Fri, 20 Aug 2021 13:50:47 +0000 (15:50 +0200)]
[msan] Hotfix clang/test/CodeGen/sanitize-memory-disable.c

Because KMSAN is not supported on many architectures, explicitly build
the test with -target x86_64-linux-gnu.

Fixes the 'unsupported architecture' and 'unsupported operating system'
errors reported by the clang-armv7-quick (https://lab.llvm.org/buildbot#builders/171/builds/2595)
and llvm-clang-x86_64-sie-ubuntu-fast (https://lab.llvm.org/buildbot#builders/139/builds/9079)
builders.

Differential Revision: https://reviews.llvm.org/D108465

2 years ago[CVP] add tests for unreachable switch default; NFC
Sanjay Patel [Fri, 20 Aug 2021 13:27:26 +0000 (09:27 -0400)]
[CVP] add tests for unreachable switch default; NFC

Goes with the proposal at D106056.

2 years ago[DebugInfo][InstrRef] Correctly ignore DBG_VALUE_LIST in InstrRef mode
Jeremy Morse [Fri, 20 Aug 2021 13:48:45 +0000 (14:48 +0100)]
[DebugInfo][InstrRef] Correctly ignore DBG_VALUE_LIST in InstrRef mode

This patch makes InstrRefBasedLDV "safe" to work with DBG_VALUE_LISTs. It
doesn't actually interpret them, but it recognises that they specify
variable locations and avoids propagating false locations, which is better
than the current state. Observe the attached test:

 * We avoid propagating DBG_VALUE_LISTs into successor blocks, as they're
   not "currently" supported,
 * We don't propagate other variable locations across DBG_VALUE_LISTs,
   because we know that the variable location is terminated by the
   DBG_VALUE_LIST.

Differential Revision: https://reviews.llvm.org/D108143

2 years agoFix assertion when generating diagnostic for inline namespaces
Aaron Ballman [Fri, 20 Aug 2021 13:49:07 +0000 (09:49 -0400)]
Fix assertion when generating diagnostic for inline namespaces

When calculating the name to display for inline namespaces, we have
custom logic to try to hide redundant inline namespaces from the
diagnostic. Calculating these redundancies requires performing a lookup
in the parent declaration context, but that lookup should not try to
look through transparent declaration contexts, like linkage
specifications. Instead, loop up the declaration context chain until we
find a non-transparent context and use that instead.
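
The walk up the context chain can be sketched as follows. The `Context` class and its fields are hypothetical and only model the idea; they are not clang's DeclContext API:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Context:
    """Hypothetical declaration context with a link to its parent."""

    name: str
    transparent: bool = False
    parent: Optional["Context"] = None


def nearest_non_transparent(ctx):
    """Walk up the parent chain until a non-transparent context is found."""
    while ctx is not None and ctx.transparent:
        ctx = ctx.parent
    return ctx
```

For a namespace nested in a linkage specification, the lookup would start from the result of this walk rather than from the linkage specification itself.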

This fixes PR49954.