platform/upstream/llvm.git
3 years ago[X86] Pre-commit tests for D103192. NFC
Craig Topper [Thu, 27 May 2021 15:21:07 +0000 (08:21 -0700)]
[X86] Pre-commit tests for D103192. NFC

3 years ago[mlir] Add error state and error propagation to async runtime values
Eugene Zhulenev [Tue, 25 May 2021 22:06:34 +0000 (15:06 -0700)]
[mlir] Add error state and error propagation to async runtime values

Depends On D103102

Not yet implemented:
1. Error handling after synchronous await
2. Error handling for async groups

Will be addressed in the followup PRs

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D103109

3 years ago[Clang] Enable __has_feature(coverage_sanitizer)
Marco Elver [Thu, 27 May 2021 16:24:21 +0000 (18:24 +0200)]
[Clang] Enable __has_feature(coverage_sanitizer)

Like other sanitizers, enable __has_feature(coverage_sanitizer) if clang
has enabled at least one SanitizerCoverage instrumentation type.

Because coverage instrumentation selection is not handled via normal
-fsanitize= (and thus not in SanitizeSet), passing this information
through to LangOptions required propagating the already parsed
-fsanitize-coverage= options from CodeGenOptions through to LangOptions
in FixupInvocation().

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D103159
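
As an illustration of how user code might query the feature described above, here is a minimal sketch. The names `kCoverageInstrumented` and `coverageModeName` are hypothetical and are not part of the patch; the fallback definition of `__has_feature` is the usual portability idiom for compilers that lack it.

```cpp
// Portable fallback: compilers without __has_feature answer 0 to every query.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// Expected to be true only when Clang (with this change) has at least one
// SanitizerCoverage instrumentation type enabled for this translation unit.
#if __has_feature(coverage_sanitizer)
constexpr bool kCoverageInstrumented = true;
#else
constexpr bool kCoverageInstrumented = false;
#endif

// Hypothetical helper: reports which mode the translation unit was built in.
constexpr const char *coverageModeName() {
  return kCoverageInstrumented ? "coverage" : "plain";
}
```

Code that must not run under coverage instrumentation (for example, timing-sensitive paths) could branch on `kCoverageInstrumented` in the same way it would for other sanitizers.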

3 years ago[mlir] Async reference counting for block successors with divergent reference counted...
Eugene Zhulenev [Tue, 25 May 2021 18:02:42 +0000 (11:02 -0700)]
[mlir] Async reference counting for block successors with divergent reference counted liveness

Support reference counted values implicitly passed (live) only to some of the successors.

Example: if control branches from ^entry directly to ^bb2, the token leaks unless a `drop_ref` operation is created for that path.

```
^entry:
  %token = async.runtime.create : !async.token
  cond_br %cond, ^bb1, ^bb2
^bb1:
  async.runtime.await %token
  async.runtime.drop_ref %token
  br ^bb2
^bb2:
  return
```

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D103102

3 years ago[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass
maekawatoshiki [Thu, 27 May 2021 16:17:23 +0000 (01:17 +0900)]
[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass

This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D99149

3 years ago[SPE] Disable strict-fp for SPE by default
Qiu Chaofan [Thu, 27 May 2021 16:10:04 +0000 (00:10 +0800)]
[SPE] Disable strict-fp for SPE by default

As discussed in PR50385, strict-fp on PowerPC SPE has not been handled
well. This patch disables it by default for SPE.

Reviewed By: nemanjai, vit9696, jhibbits

Differential Revision: https://reviews.llvm.org/D103235

3 years ago[mlir][gpu] Relax restriction on MMA store op to allow chain of mma ops.
thomasraoux [Thu, 27 May 2021 15:58:11 +0000 (08:58 -0700)]
[mlir][gpu] Relax restriction on MMA store op to allow chain of mma ops.

To allow large matmul operations using the MMA ops, we need to chain
operations. This is not possible unless the "DOp" and "COp" types have matching
layouts, so remove the "DOp" layout and force the accumulator and result types
to match.
Added a test for the case where the MMA value is accumulated.

Differential Revision: https://reviews.llvm.org/D103023

3 years ago[HIP] Check compatibility of -fgpu-sanitize with offload arch
Yaxun (Sam) Liu [Sun, 23 May 2021 03:45:15 +0000 (23:45 -0400)]
[HIP] Check compatibility of -fgpu-sanitize with offload arch

-fgpu-sanitize is incompatible with offload arch containing xnack-.

This patch checks that.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D102975

3 years ago[RISCV] Add a test case showing incorrect call-conv lowering
Fraser Cormack [Thu, 27 May 2021 15:54:42 +0000 (16:54 +0100)]
[RISCV] Add a test case showing incorrect call-conv lowering

@HsiangKai helped find a bug in the lowering of indirect split
scalable-vector types in our calling convention. An imminent patch will
fix this.

3 years agoGlobalISel: Do not change register types in lowerLoad
Matt Arsenault [Tue, 18 May 2021 21:05:49 +0000 (17:05 -0400)]
GlobalISel: Do not change register types in lowerLoad

Adjusting the load register type is a widenScalar type action, not a
lowering. lowerLoad should be reserved for operations that change the
memory access size, such as unaligned load decomposition. With this
trying to adjust the register type, it was hard to avoid infinite
loops in the legalizer. Adds a bandaid to avoid regressing a few
AArch64 tests, but I'm not sure what the exact condition is and
there's probably a cleaner way to do this.

For AMDGPU this regresses handling of some cases for unaligned loads,
but the way this is currently working is a pretty ugly hack.

3 years ago[AIX] Add -lc++abi and -lunwind for linking
jasonliu [Thu, 27 May 2021 15:47:20 +0000 (15:47 +0000)]
[AIX] Add -lc++abi and -lunwind for linking

Summary:
We are going to have libc++abi.a and libunwind.a on AIX.
Add the necessary linking command to pick the libraries up.

Reviewed By: daltenty

Differential Revision: https://reviews.llvm.org/D102813

3 years agoThread safety analysis: Allow exclusive/shared joins for managed and asserted capabilities
Aaron Puchert [Thu, 27 May 2021 15:45:59 +0000 (17:45 +0200)]
Thread safety analysis: Allow exclusive/shared joins for managed and asserted capabilities

Similar to how we allow managed and asserted locks to be held and not
held in joining branches, we also allow them to be held shared and
exclusive. The scoped lock should restore the original state at the end
of the scope in any event, and asserted locks need not be released.

We should probably only allow asserted locks to be subsumed by managed,
not by (directly) acquired locks, but that's for another change.

Reviewed By: delesley

Differential Revision: https://reviews.llvm.org/D102026

3 years agoThread safety analysis: Factor out function for merging locks (NFC)
Aaron Puchert [Thu, 27 May 2021 15:44:43 +0000 (17:44 +0200)]
Thread safety analysis: Factor out function for merging locks (NFC)

It's going to become a bit more complicated, so let's have it separate.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D102025

3 years agoRevert "Emit correct location lists with basic block sections."
Nico Weber [Thu, 27 May 2021 15:40:51 +0000 (11:40 -0400)]
Revert "Emit correct location lists with basic block sections."

Breaks check-llvm on non-linux, see comments on https://reviews.llvm.org/D85085
This reverts commit caae570978c490a137921b9516162a382831209e
and follow-up commit 1546c52d971292ed4145b6d41aaca0d02229ebff.

3 years ago[libc++] NFC: Parenthesize expression to satisfy GCC 11
Louis Dionne [Thu, 27 May 2021 15:41:26 +0000 (11:41 -0400)]
[libc++] NFC: Parenthesize expression to satisfy GCC 11

Otherwise it issues a -Werror=parentheses suggesting parentheses.

3 years ago[libc++] Deprecate std::iterator and remove it as a base class
Louis Dionne [Tue, 25 May 2021 22:15:58 +0000 (18:15 -0400)]
[libc++] Deprecate std::iterator and remove it as a base class

C++17 deprecated std::iterator and removed it as a base class for all
iterator adaptors. We implement that change, but we still provide a way
to inherit from std::iterator in the few cases where doing otherwise
would be an ABI break.

Supersedes D101729 and the std::iterator base parts of D103101 and D102657.

Differential Revision: https://reviews.llvm.org/D103171
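
For user code affected by the deprecation, the usual replacement is to declare the five iterator member types directly instead of inheriting them. The sketch below uses a hypothetical `CountingIterator` that is not taken from libc++; it only illustrates the pattern.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical input iterator that yields consecutive ints.
// Instead of inheriting from the deprecated
//   std::iterator<std::input_iterator_tag, int>
// it spells out the member types that the base class used to provide.
class CountingIterator {
public:
  using iterator_category = std::input_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = const int *;
  using reference = int; // values are returned by copy

  constexpr explicit CountingIterator(int start) : value_(start) {}

  constexpr reference operator*() const { return value_; }

  CountingIterator &operator++() {
    ++value_;
    return *this;
  }

  CountingIterator operator++(int) {
    CountingIterator old = *this;
    ++value_;
    return old;
  }

  friend constexpr bool operator==(CountingIterator a, CountingIterator b) {
    return a.value_ == b.value_;
  }

  friend constexpr bool operator!=(CountingIterator a, CountingIterator b) {
    return !(a == b);
  }

private:
  int value_;
};

// std::iterator_traits finds the member types without any base class.
static_assert(
    std::is_same<std::iterator_traits<CountingIterator>::value_type,
                 int>::value,
    "value_type should be visible through iterator_traits");
```

Because no base class is involved, the layout of `CountingIterator` does not depend on whether the standard library still provides `std::iterator`.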

3 years agoAMDGPU/GlobalISel: Use IncomingValueAssigner for implicit return
Matt Arsenault [Tue, 25 May 2021 20:25:34 +0000 (16:25 -0400)]
AMDGPU/GlobalISel: Use IncomingValueAssigner for implicit return

This makes no real difference since we assign the same register either
way.

3 years agoAMDGPU/GlobalISel: Fix broken test run line
Matt Arsenault [Fri, 21 May 2021 00:50:34 +0000 (20:50 -0400)]
AMDGPU/GlobalISel: Fix broken test run line

3 years ago[CostModel][X86] AVX512 truncation ops are slower than cost models indicate.
Simon Pilgrim [Thu, 27 May 2021 14:36:29 +0000 (15:36 +0100)]
[CostModel][X86] AVX512 truncation ops are slower than cost models indicate.

The SkylakeServer model (and later IceLake/TigerLake targets according to Agner) have the PMOV truncations as uops=2, rthroughput=2 instructions.

Noticed while trying to reduce the diffs between cost tables and llvm-mca analysis.

3 years ago[X86][SSE] Regenerate some tests to expose the rip relative vector/broadcast loads
Simon Pilgrim [Wed, 26 May 2021 16:42:22 +0000 (17:42 +0100)]
[X86][SSE] Regenerate some tests to expose the rip relative vector/broadcast loads

3 years ago[OpenCL][NFC] Fix typos in test
Sven van Haastregt [Thu, 27 May 2021 15:06:33 +0000 (16:06 +0100)]
[OpenCL][NFC] Fix typos in test

3 years ago[Flang][Openmp] Fortran specific semantic checks for Allocate directive
Isaac Perry [Thu, 27 May 2021 07:56:16 +0000 (08:56 +0100)]
[Flang][Openmp] Fortran specific semantic checks for Allocate directive

This patch adds the following Fortran specific semantic checks for the OpenMP
Allocate directive.
1) A type parameter inquiry cannot appear in an ALLOCATE directive.
2) List items specified in the ALLOCATE directive must not have the ALLOCATABLE
attribute unless the directive is associated with an ALLOCATE statement.

Co-authored-by: Irina Dobrescu <irina.dobrescu@arm.com>
Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D102061

3 years agoVirtRegMap: Preserve LiveDebugVariables
Matt Arsenault [Mon, 29 Oct 2018 22:55:33 +0000 (18:55 -0400)]
VirtRegMap: Preserve LiveDebugVariables

This avoids recomputing it between regalloc runs when allocation is
split, and also avoids a debug info test regression.

3 years agoDisable misc-no-recursion checking in Clang
Aaron Ballman [Thu, 27 May 2021 14:37:33 +0000 (10:37 -0400)]
Disable misc-no-recursion checking in Clang

We currently enable misc-no-recursion, but Clang uses recursion
intentionally in a fair number of places (like RecursiveASTVisitor).
Disabling this check reduces noise in reviews that add new AST nodes,
like https://reviews.llvm.org/D103112#2780747 which has five CI
warnings that the author can do nothing about.

3 years ago[VP][SelectionDAG] Add a target-configurable EVL operand type
Fraser Cormack [Mon, 24 May 2021 14:24:54 +0000 (15:24 +0100)]
[VP][SelectionDAG] Add a target-configurable EVL operand type

This patch adds a way for the target to configure the type it uses for
the explicit vector length operands of VP SDNodes. The type must be a
legal integer type (there is still no target-independent legalization of
this operand) and must currently be at least as big as i32, the type
used by the IR intrinsics. An implicit zero-extension takes place on
targets which choose a larger type. All VP nodes should be created with
this type used for the EVL operand.

This allows 64-bit RISC-V to avoid custom legalization of all VP nodes,
keeping them in their target-independent form for that bit longer.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D103027

3 years ago[OpenMP]Add support for workshare loop modifier in lowering
Mats Petersson [Fri, 30 Apr 2021 13:13:55 +0000 (14:13 +0100)]
[OpenMP]Add support for workshare loop modifier in lowering

When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.

Also implements tests for the variants.

Differential Revision: https://reviews.llvm.org/D102008

3 years agoHopefully fix the Clang sphinx doc build.
Aaron Ballman [Thu, 27 May 2021 14:25:39 +0000 (10:25 -0400)]
Hopefully fix the Clang sphinx doc build.

This was broken several days ago in 826905787ae4c8540bb8a2384fac59c606c7eaff.

3 years agoCorrect the 'KEYALL' mask.
Erich Keane [Thu, 27 May 2021 14:19:20 +0000 (07:19 -0700)]
Correct the 'KEYALL' mask.

It should technically be a 1, since we are only setting the first bit.

3 years agoReuse temporary files for print-changed=diff
Jamie Schmeiser [Thu, 27 May 2021 14:19:13 +0000 (10:19 -0400)]
Reuse temporary files for print-changed=diff

Summary:
Make the file name and descriptors static so that they are reused by
print-changed=diff. This avoids errors about being unable to create
temporary files when doing the later comparisons in a large compile.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: aeubanks (Arthur Eubanks)
Differential Revision: https://reviews.llvm.org/D100116

3 years agoReimplement __builtin_unique_stable_name-
Erich Keane [Fri, 23 Apr 2021 15:22:35 +0000 (08:22 -0700)]
Reimplement __builtin_unique_stable_name-

The original version of this was reverted, and @rjmcall provided some
advice to architect a new solution.  This is that solution.

This implements a builtin to provide a unique name that is stable across
compilations of this TU for the purposes of implementing the library
component of the unnamed kernel feature of SYCL.  It does this by
running the Itanium mangler with a few modifications.

Because it is somewhat common to wrap non-kernel-related lambdas in
macros that aren't present on the device (such as for logging), this
uniquely generates an ID for all lambdas involved in the naming of a
kernel. It uses the lambda-mangling number to do this, but replaces
it with its own number (starting at 10000 for readability reasons)
for lambdas used to name a kernel.

Additionally, this implements itself as constexpr with a slight catch:
if a name would be invalidated by the use of this lambda in a later
kernel invocation, it is diagnosed as an error (see the Sema tests).

Differential Revision: https://reviews.llvm.org/D103112

3 years agoSpeculatively fix this harder and with improved spelling capabilities.
Aaron Ballman [Thu, 27 May 2021 13:54:09 +0000 (09:54 -0400)]
Speculatively fix this harder and with improved spelling capabilities.

3 years agoSpeculatively fix a -Woverloaded-virtual diagnostic; NFC
Aaron Ballman [Thu, 27 May 2021 13:48:43 +0000 (09:48 -0400)]
Speculatively fix a -Woverloaded-virtual diagnostic; NFC

3 years agoAMDGPU/GlobalISel: Lower constant-32-bit zextload/sextload consistently
Matt Arsenault [Tue, 18 May 2021 22:22:09 +0000 (18:22 -0400)]
AMDGPU/GlobalISel: Lower constant-32-bit zextload/sextload consistently

We were accidentally relying on code in lowerLoad that expands
extending loads; that code should be removed.

3 years agoAMDGPU/GlobalISel: Remove redundant parameter from function
Matt Arsenault [Tue, 18 May 2021 21:02:25 +0000 (17:02 -0400)]
AMDGPU/GlobalISel: Remove redundant parameter from function

3 years agoFix -Wswitch warning; NFC
Aaron Ballman [Thu, 27 May 2021 13:23:20 +0000 (09:23 -0400)]
Fix -Wswitch warning; NFC

3 years ago[DAGCombine][RISCV] Don't try to trunc-store combined vector stores
Fraser Cormack [Wed, 26 May 2021 15:04:59 +0000 (16:04 +0100)]
[DAGCombine][RISCV] Don't try to trunc-store combined vector stores

DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told
whether it's to use vector types and also whether it's to issue a
truncating store. However, the truncating store code path assumes a
scalar integer `ConstantSDNode`, and when using vector types it creates
either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store: neither of which
is a constant.

The `riscv64` target is able to expose a crash here because it switches
on both code paths at the same time. The `f32` is stored as `i32` which
must be promoted to `i64`, necessitating a truncating store.
It also decides later that it prefers a vector store of `v2f32`.

While vector truncating stores are legal, this combine is not able to
emit them. We also don't have a test case. This patch adds an assert to
catch this case more gracefully, and updates one of the function's
callers to turn off the use of truncating stores when preferring
vectors.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103173

3 years ago[RISCV] Allow passing fixed-length vectors via the stack
Fraser Cormack [Thu, 13 May 2021 16:34:29 +0000 (17:34 +0100)]
[RISCV] Allow passing fixed-length vectors via the stack

The vector calling convention dictates that when the vector argument
registers are exhausted, GPRs are used to pass the address via the stack.
When the GPRs themselves are exhausted, at best we would previously
crash with an assertion, and at worst we'd generate incorrect code.

This patch addresses this issue by passing fixed-length vectors via the
stack with their full fixed-length size and aligned to their element
type size. Since the calling convention lowering can't yet handle
scalable vector types, this patch adds a fatal error to make it clear
that we are lacking in this regard.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D102422

3 years ago[VPlan] Do not sink uniform recipes in sinkScalarOperands.
Florian Hahn [Thu, 27 May 2021 12:53:33 +0000 (13:53 +0100)]
[VPlan] Do not sink uniform recipes in sinkScalarOperands.

For uniform ReplicateRecipes, only the first lane should be used, so
sinking them would mean we have to compute the value of the first lane
multiple times. Also, at the moment, sinking them causes a crash because
the value of the first lane is re-used by all users.

Reported post-commit for D100258.

3 years agoAdd support for #elifdef and #elifndef
Aaron Ballman [Thu, 27 May 2021 12:41:00 +0000 (08:41 -0400)]
Add support for #elifdef and #elifndef

WG14 adopted N2645 and WG21 EWG has accepted P2334 in principle (still
subject to full EWG vote + CWG review + plenary vote), which add
support for #elifdef as shorthand for #elif defined and #elifndef as
shorthand for #elif !defined. This patch adds support for the new
preprocessor directives.
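
The equivalence described above can be sketched as follows. The macro names are hypothetical. The longhand forms are used in the directives so that the snippet also compiles on toolchains that predate this patch; the shorthand each one corresponds to is shown in the trailing comment.

```cpp
// Hypothetical configuration macro chosen for this example.
#define HYPOTHETICAL_USE_FAST_PATH 1

#ifdef HYPOTHETICAL_USE_SAFE_PATH
constexpr int kSelectedPath = 0;
#elif defined(HYPOTHETICAL_USE_FAST_PATH) // same as: #elifdef HYPOTHETICAL_USE_FAST_PATH
constexpr int kSelectedPath = 1;
#elif !defined(HYPOTHETICAL_USE_DEBUG_PATH) // same as: #elifndef HYPOTHETICAL_USE_DEBUG_PATH
constexpr int kSelectedPath = 2;
#else
constexpr int kSelectedPath = 3;
#endif
```

Here the second branch is taken, because `HYPOTHETICAL_USE_SAFE_PATH` is not defined and `HYPOTHETICAL_USE_FAST_PATH` is.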

3 years ago[mlir][Linalg] Add comprehensive bufferization support for subtensor (5/n)
Nicolas Vasilache [Thu, 27 May 2021 12:19:39 +0000 (12:19 +0000)]
[mlir][Linalg] Add comprehensive bufferization support for subtensor (5/n)

This revision refactors and simplifies the pattern detection logic: thanks to SSA value properties, we can actually look at all the uses of a given value and avoid having to pattern-match specific chains of operations.

A bufferization pattern for subtensor is added and specific inplaceability analysis is implemented for the simple case of subtensor. More advanced use cases will follow.

Differential revision: https://reviews.llvm.org/D102512

3 years agoAdd --quiet option to llvm-gsymutil to suppress output of warnings.
Simon Giesecke [Thu, 20 May 2021 08:04:33 +0000 (08:04 +0000)]
Add --quiet option to llvm-gsymutil to suppress output of warnings.

Differential Revision: https://reviews.llvm.org/D102829

3 years agoRevert "[OpenMP]Add support for workshare loop modifier in lowering"
Mats Petersson [Thu, 27 May 2021 12:07:20 +0000 (13:07 +0100)]
Revert "[OpenMP]Add support for workshare loop modifier in lowering"

This reverts commit ea4c5fb04c6d9618d451fb2d2c360dc95c6d9131.

3 years ago[AMDGPU][Libomptarget][NFC] Remove atmi_mem_place_t
Pushpinder Singh [Thu, 27 May 2021 10:55:38 +0000 (10:55 +0000)]
[AMDGPU][Libomptarget][NFC] Remove atmi_mem_place_t

This struct was used to specify the device on which memory was
being allocated/free in atmi_malloc/free. It has now been replaced
with int DeviceId.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D103239

3 years ago[OpenMP]Add support for workshare loop modifier in lowering
Mats Petersson [Fri, 30 Apr 2021 13:13:55 +0000 (14:13 +0100)]
[OpenMP]Add support for workshare loop modifier in lowering

When lowering the dynamic, guided, auto and runtime types of scheduling,
there is an optional monotonic or non-monotonic modifier. This patch
adds support in the OMP IR Builder to pass this down to the runtime
functions.

Also implements tests for the variants.

Differential Revision: https://reviews.llvm.org/D102008

3 years ago[ARM] Extra test for reverted WLS memset. NFC
David Green [Thu, 27 May 2021 11:20:19 +0000 (12:20 +0100)]
[ARM] Extra test for reverted WLS memset. NFC

3 years ago[clang-format] [NFC] realign documentation in Format.h...
Max Sagebaum [Thu, 27 May 2021 11:10:45 +0000 (13:10 +0200)]
[clang-format] [NFC] realign documentation in Format.h...

... and ClangFormatStyleOptions.rst for EmptyLineAfterAccessModifier

Differential-Revision: https://reviews.llvm.org/D102989

3 years agoAdd triples to a bunch of x86-specific tests that currently fail on PPC
Benjamin Kramer [Thu, 27 May 2021 10:31:00 +0000 (12:31 +0200)]
Add triples to a bunch of x86-specific tests that currently fail on PPC

3 years ago[lit][test] Improve testing of use_llvm_tool
James Henderson [Wed, 26 May 2021 11:04:24 +0000 (12:04 +0100)]
[lit][test] Improve testing of use_llvm_tool

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D103154

3 years ago[Matrix] Include matrix pipeline for new PM in new-pm-defaults.ll.
Florian Hahn [Thu, 27 May 2021 09:54:08 +0000 (10:54 +0100)]
[Matrix] Include matrix pipeline for new PM in new-pm-defaults.ll.

-enable-matrix just adds a single pass, so it's easier to just check in
new-pm-defaults.ll rather than duplicating the full checks for -O3 with
the new pass manager.

Suggested post-commit by @aeubanks.

3 years ago[SelectionDAG][RISCV] Don't unroll 0/1-type bool VSELECTs
Fraser Cormack [Wed, 26 May 2021 09:54:35 +0000 (10:54 +0100)]
[SelectionDAG][RISCV] Don't unroll 0/1-type bool VSELECTs

This patch extends the cases in which the legalizer is able to express
VSELECT in terms of XOR/AND/OR. When dealing with a VSELECT between
boolean vector types, the mask itself is an all-zeros or all-ones value
of the operand type, so a 0/1 boolean type behaves identically to a 0/-1
type.

This greatly helps RISC-V which relies on expansion for these nodes. It
also allows scalable-vector bool VSELECTs to use the default expansion,
where before it would crash in SelectionDAG::UnrollVectorOp.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D103147
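
The XOR/AND/OR form of a select mentioned above can be illustrated on a single boolean lane. This is a minimal sketch, not the legalizer code: `selectLane` is a hypothetical helper, and a 1-bit lane is modelled as an `unsigned` holding 0 or 1. In a 1-bit type the values 1 and -1 have the same bit pattern, which is why a 0/1 mask behaves like a 0/-1 mask here.

```cpp
// Computes "mask ? a : b" for one 1-bit lane using only AND, OR and XOR.
// The NOT of the mask is expressed as XOR with the all-ones value, which is
// 1 for a 1-bit lane. The final "& 1u" keeps the result within one bit.
constexpr unsigned selectLane(unsigned mask, unsigned a, unsigned b) {
  return ((mask & a) | ((mask ^ 1u) & b)) & 1u;
}
```

For lanes wider than one bit the same expansion needs the mask to be all-zeros or all-ones across the full lane width, so a 0/1 mask would not be sufficient there.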

3 years ago[AMDGPU][GlobalISel] Allow amdgpu_gfx calling conv
Sebastian Neubauer [Wed, 26 May 2021 16:50:19 +0000 (18:50 +0200)]
[AMDGPU][GlobalISel] Allow amdgpu_gfx calling conv

Calling functions from shaders already works with the SelectionDAG.

Differential Revision: https://reviews.llvm.org/D103183

3 years ago[mlir] Support dialect-wide canonicalization pattern registration
Matthias Springer [Thu, 27 May 2021 08:26:45 +0000 (17:26 +0900)]
[mlir] Support dialect-wide canonicalization pattern registration

* Add `hasCanonicalizer` option to Dialect.
* Initialize canonicalizer with dialect-wide canonicalization patterns.
* Add test case to TestDialect.

Dialect-wide canonicalization patterns are useful if a canonicalization pattern does not conceptually associate with any single operation, i.e., it should not be registered as part of an operation's `getCanonicalizationPatterns` function. E.g., this is the case for canonicalization patterns that match an op interface.

Differential Revision: https://reviews.llvm.org/D103226

3 years ago[NFCI][LoopDeletion] Do not call complex analysis for known non-zero BTC
Max Kazantsev [Thu, 27 May 2021 08:18:30 +0000 (15:18 +0700)]
[NFCI][LoopDeletion] Do not call complex analysis for known non-zero BTC

3 years ago[NFC] Reuse existing variables instead of re-requesting successors
Max Kazantsev [Thu, 27 May 2021 08:01:20 +0000 (15:01 +0700)]
[NFC] Reuse existing variables instead of re-requesting successors

3 years ago[GlobalISel] Implement splitting of G_SHUFFLE_VECTOR.
Amara Emerson [Thu, 20 May 2021 04:35:05 +0000 (21:35 -0700)]
[GlobalISel] Implement splitting of G_SHUFFLE_VECTOR.

This is a port from the DAG legalization. We're still missing some of the
canonicalizations of shuffles but it's a start.

Differential Revision: https://reviews.llvm.org/D102828

3 years ago[mlir] Add TestLinalgDistribution.cpp to cmake build.
Alexander Belyaev [Thu, 27 May 2021 06:59:05 +0000 (08:59 +0200)]
[mlir] Add TestLinalgDistribution.cpp to cmake build.

3 years ago[docs] llvm-objdump: Mention -M no-aliases is supported on AArch64
Fangrui Song [Thu, 27 May 2021 06:57:32 +0000 (23:57 -0700)]
[docs] llvm-objdump: Mention -M no-aliases is supported on AArch64

3 years ago[mlir] Add a pass to distribute linalg::TiledLoopOp.
Alexander Belyaev [Wed, 26 May 2021 18:22:49 +0000 (20:22 +0200)]
[mlir] Add a pass to distribute linalg::TiledLoopOp.

Differential Revision: https://reviews.llvm.org/D103194

3 years ago[NFCI] Lazily evaluate SCEVs of PHIs
Max Kazantsev [Thu, 27 May 2021 06:20:57 +0000 (13:20 +0700)]
[NFCI] Lazily evaluate SCEVs of PHIs

Eager evaluation costs compile time. Only query them if they are
required for proving predicates.

3 years ago[NFC] Formatting fix
Max Kazantsev [Thu, 27 May 2021 05:50:54 +0000 (12:50 +0700)]
[NFC] Formatting fix

3 years ago[NFCI][LoopDeletion] Only query SCEV about loop successor if another successor is...
Max Kazantsev [Thu, 27 May 2021 04:47:30 +0000 (11:47 +0700)]
[NFCI][LoopDeletion] Only query SCEV about loop successor if another successor is also in loop

3 years ago[llvm-objdump] Print the DEBUG type under `--section-headers`.
Esme-Yi [Thu, 27 May 2021 04:53:14 +0000 (04:53 +0000)]
[llvm-objdump] Print the DEBUG type under `--section-headers`.

Summary: Under the option --section-headers, we can only
print the section types of TEXT, DATA, and BSS for now.
This patch adds the DEBUG type.

Reviewed By: jhenderson, Higuoxing

Differential Revision: https://reviews.llvm.org/D102603

3 years ago[gn build] Port 857fa7b7b187
LLVM GN Syncbot [Thu, 27 May 2021 04:42:56 +0000 (04:42 +0000)]
[gn build] Port 857fa7b7b187

3 years ago[gn build] Port 0dc7fd1bc167
LLVM GN Syncbot [Thu, 27 May 2021 04:42:55 +0000 (04:42 +0000)]
[gn build] Port 0dc7fd1bc167

3 years ago[libcxx][iterator] adds `std::ranges::prev`
Christopher Di Bella [Sun, 16 May 2021 01:39:22 +0000 (01:39 +0000)]
[libcxx][iterator] adds `std::ranges::prev`

Implements part of P0896 'The One Ranges Proposal'.
Implements [range.iter.op.prev].

Depends on D102563.

Differential Revision: https://reviews.llvm.org/D102564

3 years ago[libcxx][iterator] adds `std::ranges::next`
Christopher Di Bella [Sat, 8 May 2021 05:02:43 +0000 (05:02 +0000)]
[libcxx][iterator] adds `std::ranges::next`

Implements part of P0896 'The One Ranges Proposal'.
Implements [range.iter.op.next].

Depends on D101922.

Differential Revision: https://reviews.llvm.org/D102563

3 years agoFix non-global-value-max-name-size not considered by LLParser
Hasyimi Bahrudin [Thu, 27 May 2021 04:01:20 +0000 (04:01 +0000)]
Fix non-global-value-max-name-size not considered by LLParser

`non-global-value-max-name-size` is used by `Value` to cap the length of local value names. However, this flag is not considered by `LLParser`, which leads to an unexpected `use of undefined value` error. The fix is to move the responsibility for capping the length to `ValueSymbolTable`.

The test is the one provided by Mikael in the bug report (https://bugs.llvm.org/show_bug.cgi?id=45899).

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D102707

3 years ago[Fuchsia][CMake] Add missing include path.
Haowei Wu [Thu, 27 May 2021 00:45:03 +0000 (17:45 -0700)]
[Fuchsia][CMake] Add missing include path.

This patch adds include path for missing header files from "sync".
This patch also fixes the build failures caused by scudo.

Differential Revision: https://reviews.llvm.org/D103218

3 years ago[RS4GC] Introduce intrinsics to get base ptr and offset
Yevgeny Rouban [Thu, 27 May 2021 02:01:55 +0000 (09:01 +0700)]
[RS4GC] Introduce intrinsics to get base ptr and offset

Some optimizations may need to get the (base, offset) pair for
any GC pointer. The base can be calculated by generating the
needed instructions, as is done by the
RewriteStatepointsForGC::findBasePointer() function. The offset
can be calculated in the same way. However, to avoid exposing the
base calculation and to keep the offset calculation as simple as
ptrtoint(derived_ptr) - ptrtoint(base_ptr), which is illegal
outside RS4GC, this patch introduces 2 intrinsics:

 @llvm.experimental.gc.get.pointer.base(%derived_ptr)
 @llvm.experimental.gc.get.pointer.offset(%derived_ptr)

These intrinsics are inlined by RS4GC along with generation of
statepoint sequences.

With these new intrinsics the GC parseable lowering for atomic
memcpy intrinsics (6ec2c5e402a724ba99bce82a9cac7a3006d660f4)
could be implemented as a separate pass.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100445
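
The arithmetic behind the (base, offset) pair can be sketched with plain integers standing in for addresses. Everything here is hypothetical and is not LLVM code: `BaseAndOffset`, `splitPointer` and `rebuildPointer` are illustrative names, and the relocation scenario is an assumption about one way such a pair could be used.

```cpp
#include <cstdint>

// Hypothetical model: addresses are plain 64-bit integers.
struct BaseAndOffset {
  std::uint64_t base;
  std::uint64_t offset;
};

// Mirrors ptrtoint(derived_ptr) - ptrtoint(base_ptr) from the description.
constexpr BaseAndOffset splitPointer(std::uint64_t baseAddr,
                                     std::uint64_t derivedAddr) {
  return BaseAndOffset{baseAddr, derivedAddr - baseAddr};
}

// Assumed use case: if the object moves, the derived address can be
// recomputed from the new base and the unchanged offset.
constexpr std::uint64_t rebuildPointer(BaseAndOffset p,
                                       std::uint64_t newBase) {
  return newBase + p.offset;
}
```

For example, a derived address of 0x1018 inside an object based at 0x1000 has offset 0x18, and maps to 0x2018 if the object is moved to 0x2000.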

3 years agoThe compiler is crashing when compiling a coroutine intrinsic without
Zahira Ammarguellat [Thu, 20 May 2021 19:37:26 +0000 (12:37 -0700)]
The compiler is crashing when compiling a coroutine intrinsic without
the use of the option fcoroutines-ts. This is a patch to fix this.

Fix for https://bugs.llvm.org/show_bug.cgi?id=50406

3 years agoFix unit test after 324af79dbc6066
Jessica Paquette [Thu, 27 May 2021 00:49:04 +0000 (17:49 -0700)]
Fix unit test after 324af79dbc6066

Needed to add in an extra parameter to calls to `libcall`.

3 years ago[ORC-RT] Add endianness support to the ORC runtime.
Lang Hames [Wed, 26 May 2021 21:00:41 +0000 (14:00 -0700)]
[ORC-RT] Add endianness support to the ORC runtime.

endian.h is a cut-down version of llvm/Support/SwapByteOrder.h. It will be used
in upcoming serialization utilities for the ORC runtime.
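
For readers unfamiliar with byte swapping, the core operation such a header provides can be sketched as below. The function names `byteSwap16` and `byteSwap32` are hypothetical and are not claimed to match the actual contents of endian.h or SwapByteOrder.h.

```cpp
#include <cstdint>

// Reverses the two bytes of a 16-bit value. The operand is promoted to int
// before shifting, so the cast truncates the result back to 16 bits.
constexpr std::uint16_t byteSwap16(std::uint16_t v) {
  return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Reverses the four bytes of a 32-bit value by moving each byte to its
// mirrored position.
constexpr std::uint32_t byteSwap32(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}
```

A serializer that writes a fixed byte order would apply such a swap only when the host byte order differs from the wire format; swapping twice returns the original value.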

3 years ago[GlobalISel] Don't emit lost debug location remarks when legalizing tail calls
Jessica Paquette [Tue, 25 May 2021 23:54:20 +0000 (16:54 -0700)]
[GlobalISel] Don't emit lost debug location remarks when legalizing tail calls

There were a bunch of lost debug location remarks that show up when legalizing
tail calls on AArch64.

This would happen because we drop the return in the block where we emit the
tail call. So, we end up dropping the debug location, which makes the
LostDebugLocObserver report a missing debug location.

Although it's *true* that we lose these debug locations, this isn't
a particularly useful remark. We expect to drop these debug locations when
emitting tail calls. Suppressing remarks in this case is preferable, since the
amount of noise could hide actual debug location related bugs.

To do this, I just plumbed the LostDebugLocObserver through the relevant
LegalizerHelper functions. This is the only case I can think of where we need
the LostDebugLocObserver in the LegalizerHelper. So, rather than storing it
in the LegalizerHelper proper and mucking around with the constructors, I
figured it'd be cleanest to take the simplest path for now.

This clears up ~20 noisy lost debug location remarks on CTMark in AArch64 at
-Os.

Differential Revision: https://reviews.llvm.org/D103128

3 years agoEmit correct location lists with basic block sections.
Sriraman Tallam [Thu, 27 May 2021 00:12:31 +0000 (17:12 -0700)]
Emit correct location lists with basic block sections.

This patch addresses multiple things:

1) It ensures that const_value is emitted when possible with basic block
sections.
2) It emits location lists such that the labels are always within the
section boundary.
3) It fixes a bug when the parameter is first used in a non-entry block
which is in a different section from the entry block.

Differential Revision: https://reviews.llvm.org/D85085

3 years ago[AArch64][GlobalISel] Legalize non-power-of-2 vector elements for G_STORE.
Amara Emerson [Wed, 26 May 2021 23:32:42 +0000 (16:32 -0700)]
[AArch64][GlobalISel] Legalize non-power-of-2 vector elements for G_STORE.

The rules were already there; they just needed re-ordering so the odd case
didn't bail out too early.

3 years agoRevert "[scudo] Build scudo_standalone on Android and Fuchsia."
Mitch Phillips [Wed, 26 May 2021 23:51:43 +0000 (16:51 -0700)]
Revert "[scudo] Build scudo_standalone on Android and Fuchsia."

This reverts commit 2fe987e6bacea8884a397041c13a38e8ba97c2d6.

Broke the Android buildbots. Turns out a couple more tweaks are
necessary to turn them back on.

3 years ago[MLIR] Add support for empty IVs to affine.parallel
Frank Laub [Fri, 21 May 2021 01:52:53 +0000 (01:52 +0000)]
[MLIR] Add support for empty IVs to affine.parallel

Allow support for specifying empty IVs in an `affine.parallel`.

For example:

```
affine.parallel () = () to () {
  affine.yield
}
```

Reviewed By: bondhugula, jbruestle

Differential Revision: https://reviews.llvm.org/D102895

3 years ago[Hexagon] Restore handling of expanding shuffles
Krzysztof Parzyszek [Sat, 22 May 2021 18:36:41 +0000 (13:36 -0500)]
[Hexagon] Restore handling of expanding shuffles

Fixed bugs, added testcases.  The byte-unpack is actually recognized by
the DAG combiner, but the halfword-unpack is not.

3 years ago[tests] Add some basic coverage of multiple exit unrolling
Philip Reames [Wed, 26 May 2021 22:51:16 +0000 (15:51 -0700)]
[tests] Add some basic coverage of multiple exit unrolling

3 years ago[scudo] Build scudo_standalone on Android and Fuchsia.
Mitch Phillips [Wed, 26 May 2021 22:29:26 +0000 (15:29 -0700)]
[scudo] Build scudo_standalone on Android and Fuchsia.

This should be fine now, and is necessary for D102543.

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D103200

3 years ago[mlir] Add n-D vector lowering to LLVM for cast ops
harsh-nod [Wed, 26 May 2021 22:18:32 +0000 (15:18 -0700)]
[mlir] Add n-D vector lowering to LLVM for cast ops

The casting ops (sitofp, uitofp, fptosi, fptoui) lowering currently does
not handle n-D vectors. This patch fixes that.

Differential Revision: https://reviews.llvm.org/D103207

3 years agoRevert "Refactor mutation strategies into a standalone library"
Matt Morehouse [Wed, 26 May 2021 22:14:37 +0000 (15:14 -0700)]
Revert "Refactor mutation strategies into a standalone library"

This reverts commit c4a41cd77c15c2905ac74beeec09f8343a65a549 due to
buildbot failure.

3 years ago[mlir][python] Provide "all passes" registration module in Python
Aart Bik [Wed, 26 May 2021 19:44:33 +0000 (12:44 -0700)]
[mlir][python] Provide "all passes" registration module in Python

Currently, passes are registered on a per-dialect basis, which
provides the smallest footprint obviously. But for prototyping
and experimentation, a convenience "all passes" module is provided,
which registers all known MLIR passes in one run.

Usage in Python:

import mlir.all_passes_registration

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D103130

3 years ago[lldb] Fix gnu_libstdcpp's update methods
Walter Erquinigo [Wed, 26 May 2021 21:30:48 +0000 (14:30 -0700)]
[lldb] Fix gnu_libstdcpp's update methods

The variable.rst documentation says:

```
If it returns a value, and that value is True, LLDB will be allowed to cache the children and the children count it previously obtained, and will not return to the provider class to ask.  If nothing, None, or anything other than True is returned, LLDB will discard the cached information and ask. Regardless, whenever necessary LLDB will call update.
```

However, several update methods in gnu_libstdcpp.py were returning True,
which made lldb unaware of any changes in the corresponding objects.
This problem was visible by lldb-vscode in the following way:

- If a breakpoint is hit and there's a vector with the contents {1, 2},
  it'll be displayed correctly.
- Then the user steps, and at the next stop the program has changed the
  vector to {1, 2, 3}.
- frame var then incorrectly displays {1, 2}, due to the caching caused
  by the update method.

It's worth mentioning that none of libcxx.py's update methods return True. Same for LibCxxVector.cpp, which returns false.

Added a very simple test that fails without this fix.
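
The caching rule quoted above can be illustrated with a minimal sketch that runs without LLDB. Everything here is a hypothetical stand-in: `FakeVector` plays the debuggee's vector, `FakeFrontEnd` models only the documented "keep the cache if update returns True" rule, and `VectorProvider` takes an `update_result` argument purely for demonstration (a real provider's constructor receives `valobj` and `internal_dict`).

```python
class FakeVector:
    """Hypothetical stand-in for the debuggee's std::vector (not an LLDB type)."""

    def __init__(self, items):
        self.items = list(items)


class VectorProvider:
    """Shaped like a synthetic children provider; `update` is the key part."""

    def __init__(self, valobj, update_result):
        self.valobj = valobj
        self.update_result = update_result  # demo-only knob, not part of LLDB
        self.children = []

    def update(self):
        # Re-read the children from the underlying object.
        self.children = list(self.valobj.items)
        # Returning True tells the caller it may keep its cached children.
        return self.update_result

    def num_children(self):
        return len(self.children)

    def get_child_at_index(self, index):
        return self.children[index]


class FakeFrontEnd:
    """Hypothetical model of the documented caching rule, not LLDB's real code."""

    def __init__(self, provider):
        self.provider = provider
        self.cache = None

    def show(self):
        keep_cache = self.provider.update() is True
        if self.cache is None or not keep_cache:
            count = self.provider.num_children()
            self.cache = [self.provider.get_child_at_index(i) for i in range(count)]
        return self.cache


vec = FakeVector([1, 2])
stale = FakeFrontEnd(VectorProvider(vec, update_result=True))
fresh = FakeFrontEnd(VectorProvider(vec, update_result=False))
stale.show()
fresh.show()
vec.items.append(3)  # the "program" modifies the vector between stops
```

Under this model, `stale.show()` keeps reporting the children it cached at the first stop, while `fresh.show()` picks up the appended element.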

Differential Revision: https://reviews.llvm.org/D103209

3 years ago[libFuzzer] Add missing FuzzerBuiltinsMsvc.h include.
Matt Morehouse [Wed, 26 May 2021 21:38:27 +0000 (14:38 -0700)]
[libFuzzer] Add missing FuzzerBuiltinsMsvc.h include.

Should fix the Windows build.

3 years ago[libcxx][nfc] Fix the ASAN bots: update expected.pass.cpp.
zoecarver [Wed, 26 May 2021 19:00:03 +0000 (12:00 -0700)]
[libcxx][nfc] Fix the ASAN bots: update expected.pass.cpp.

Ensures that `get_return_object`'s return type is the same as the return type for the function calling `co_return`. Otherwise, we try to construct an object, then free it, then return it.

Differential Revision: https://reviews.llvm.org/D103196

3 years ago[flang][docs] Initial documentation for the Fortran LLVM Test Suite.
naromero77 [Wed, 26 May 2021 20:54:16 +0000 (15:54 -0500)]
[flang][docs] Initial documentation for the Fortran LLVM Test Suite.

Describes how to run the Fortran LLVM Test Suite, specifically the external SPEC CPU 2017 Fortran tests.

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D102877

3 years ago[AArch64] Support llvm-mc/llvm-objdump -M no-aliases
Fangrui Song [Wed, 26 May 2021 20:35:31 +0000 (13:35 -0700)]
[AArch64] Support llvm-mc/llvm-objdump -M no-aliases

This enables the no-aliases forms of many instructions.

Depends on D103004

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D103005

3 years ago[libcxx][docs] Take mutex for common_iterator, common_view, and empty_view.
zoecarver [Wed, 26 May 2021 20:28:32 +0000 (13:28 -0700)]
[libcxx][docs] Take mutex for common_iterator, common_view, and empty_view.

3 years agoRefactor mutation strategies into a standalone library
Aaron Green [Tue, 25 May 2021 19:04:12 +0000 (12:04 -0700)]
Refactor mutation strategies into a standalone library

This change introduces libMutagen/libclang_rt.mutagen.a as a subset of libFuzzer/libclang_rt.fuzzer.a. This library contains only the fuzzing strategies used by libFuzzer to produce new test inputs from provided inputs, dictionaries, and SanitizerCoverage feedback.

Most of this change is simply moving sections of code to one side or the other of the library boundary. The only meaningful new code is:

* The Mutagen.h interface and its implementation in Mutagen.cpp.
* The following methods in MutagenDispatcher.cpp:
  * UseCmp
  * UseMemmem
  * SetCustomMutator
  * SetCustomCrossOver
  * LateInitialize (similar to the MutationDispatcher's original constructor)
  * Mutate_AddWordFromTORC (uses callbacks instead of accessing TPC directly)
  * StartMutationSequence
  * MutationSequence
  * DictionaryEntrySequence
  * RecommendDictionary
  * RecommendDictionaryEntry
* FuzzerMutate.cpp (which now just sets callbacks and handles printing)
* MutagenUnittest.cpp (which adds tests of Mutagen.h)

A note on performance: This change was tested with 100 passes of test/fuzzer/LargeTest.cpp with 1000 runs per pass, both with and without the change. The running time distribution was qualitatively similar both with and without the change, and the average difference was within 30 microseconds (2.240 ms/run vs 2.212 ms/run, respectively). Both times were much higher than observed with the fully optimized system clang (~0.38 ms/run), most likely due to the combination of CMake "dev mode" settings (e.g. CMAKE_BUILD_TYPE="Debug", LLVM_ENABLE_LTO=OFF, etc.). The difference between the two versions built similarly seems to be "in the noise" and suggests no meaningful performance degradation.
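
As a quick check of the figures quoted above, the arithmetic can be reproduced as follows. The variable names are illustrative only, and the reading of "respectively" (2.240 ms/run with the change, 2.212 ms/run without) is an assumption based on the order used in the sentence.

```python
# Per-run averages quoted in the note above (milliseconds).
with_change_ms = 2.240
without_change_ms = 2.212

# Absolute difference, converted from milliseconds to microseconds.
diff_us = (with_change_ms - without_change_ms) * 1000.0

# Difference relative to the baseline without the change.
relative = (with_change_ms - without_change_ms) / without_change_ms

print(f"{diff_us:.0f} us/run, {relative:.1%} of the baseline")
```

The difference comes to about 28 microseconds per run, roughly 1.3% of the baseline, which is consistent with the "within 30 microseconds" claim.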

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D102447

3 years ago[llvm-readobj] Optimize printing stack sizes to linear time.
Rahman Lavaee [Wed, 26 May 2021 20:12:36 +0000 (13:12 -0700)]
[llvm-readobj] Optimize printing stack sizes to linear time.

Currently, each function name lookup is a linear iteration over all symbols defined in the object file which makes the total running time quadratic.

This patch optimizes the function name lookup by populating an **address to index** map upon the first function name lookup, which is then used to look up each function name in O(1).

**impact**: For the clang binary built with `-fstack-size-section`, this improves the running time of `llvm-readobj --stack-size` from 7 minutes to 0.25 seconds.
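
The idea can be sketched in a few lines of Python. This is not the llvm-readobj code: `Symbol`, `lookup_linear`, and `SymbolIndex` are hypothetical names, and keeping the first symbol seen for a duplicated address is an arbitrary choice made for the illustration.

```python
from typing import NamedTuple


class Symbol(NamedTuple):
    name: str
    address: int


def lookup_linear(symbols, address):
    """Scan every symbol on each lookup: O(n) per query, O(n^2) overall."""
    for sym in symbols:
        if sym.address == address:
            return sym.name
    return None


class SymbolIndex:
    """Builds an address -> index map lazily, on the first lookup."""

    def __init__(self, symbols):
        self._symbols = symbols
        self._addr_to_index = None  # populated on first use

    def lookup(self, address):
        if self._addr_to_index is None:
            self._addr_to_index = {}
            for i, sym in enumerate(self._symbols):
                # Keep the first symbol seen for an address (arbitrary here).
                self._addr_to_index.setdefault(sym.address, i)
        i = self._addr_to_index.get(address)
        return None if i is None else self._symbols[i].name


syms = [Symbol("foo", 0x1000), Symbol("bar", 0x2000), Symbol("baz", 0x2000)]
index = SymbolIndex(syms)
```

The map is built once in a single pass over the symbols, so n lookups cost O(n) in total (expected, for a hash map) instead of O(n^2).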

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D103072

3 years ago[RISCV] Use X0 as destination of inserted vsetvli when possible.
Craig Topper [Wed, 26 May 2021 18:51:32 +0000 (11:51 -0700)]
[RISCV] Use X0 as destination of inserted vsetvli when possible.

We aren't going to connect the result to anything so we might
as well avoid allocating a register.

Reviewed By: frasercrmck, HsiangKai

Differential Revision: https://reviews.llvm.org/D102031

3 years ago[RISCV][NFC] Fix some whitespace nits in MC test RUN lines
Jessica Clarke [Wed, 26 May 2021 20:03:18 +0000 (21:03 +0100)]
[RISCV][NFC] Fix some whitespace nits in MC test RUN lines

3 years agoUpdate documentation for InlineModel features.
Jacob Hegna [Wed, 26 May 2021 19:13:21 +0000 (12:13 -0700)]
Update documentation for InlineModel features.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D103193

3 years ago[libc++] Add a job testing on GCC 11
Louis Dionne [Tue, 25 May 2021 21:35:35 +0000 (17:35 -0400)]
[libc++] Add a job testing on GCC 11

I'm adding the job as a soft-fail for now, but once all the tests have
been fixed to work on it, we'll switch over from GCC 10 to GCC 11 and
remove the soft-fail.

Differential Revision: https://reviews.llvm.org/D103116

3 years ago[pstl] Workaround more errors in the test suite
Louis Dionne [Wed, 26 May 2021 19:44:52 +0000 (15:44 -0400)]
[pstl] Workaround more errors in the test suite

3 years ago[CostModel][AArch64] Add floating point arithmetic tests. NFC.
Sjoerd Meijer [Wed, 26 May 2021 19:15:29 +0000 (20:15 +0100)]
[CostModel][AArch64] Add floating point arithmetic tests. NFC.

3 years ago[DebugInstrRef][1/3] Track PHI values through register allocation
Jeremy Morse [Wed, 26 May 2021 18:53:33 +0000 (19:53 +0100)]
[DebugInstrRef][1/3] Track PHI values through register allocation

This patch introduces "DBG_PHI" instructions, a marker of where a PHI
instruction used to be, before PHI elimination. Under the instruction
referencing model, we want to know where every value in the function is
defined -- and a PHI, even if implicit, is such a place.

Just like instruction numbers, we can use this to identify a value to be
used as a variable value, but we don't need to know what instruction
defines that value, for example:

bb1:
   DBG_PHI $rax, 1
   [... more insts ... ]
bb2:
   DBG_INSTR_REF 1, 0, !1234, !DIExpression()

This specifies that on entry to bb1, whatever value is in $rax is known
as value number one -- and the later DBG_INSTR_REF marks the position
where variable !1234 should take on value number one.

PHI locations are stored in MachineFunction for the duration of the
regalloc phase in the DebugPHIPositions map. The map is populated by
PHIElimination, and then flushed back into the instruction stream by
VirtRegRewriter. A small amount of maintenance is needed in
LiveDebugVariables to account for registers being split, but only for
individual positions, not for entire ranges of blocks.

Differential Revision: https://reviews.llvm.org/D86812

3 years ago[pstl] Fix -Wundef errors in the test suite
Louis Dionne [Wed, 26 May 2021 19:24:31 +0000 (15:24 -0400)]
[pstl] Fix -Wundef errors in the test suite

3 years ago[libomptarget][nfc][amdgpu] Factor out setting upper bounds
Jon Chesterfield [Wed, 26 May 2021 18:57:48 +0000 (19:57 +0100)]
[libomptarget][nfc][amdgpu] Factor out setting upper bounds

Refactor suggested in D103037 to help avoid similar copy-paste errors.
Change is mechanical. Some parts of this would be more robust with unsigned.

Reviewed By: dhruvachak

Differential Revision: https://reviews.llvm.org/D103090