platform/upstream/llvm.git
3 years agofix up test from D102742
Nick Desaulniers [Mon, 24 May 2021 19:06:49 +0000 (12:06 -0700)]
fix up test from D102742

In D102742, I mistakenly put the split file designator above a bunch of
CHECK lines, which unintentionally removed the CHECKs from actually
being verified.

This can be verified by observing:
<build dir>/test/CodeGen/X86/Output/stack-protector-3.ll.tmp/main.ll

3 years agoSurface clone APIs in CAPI
George [Mon, 24 May 2021 18:52:41 +0000 (11:52 -0700)]
Surface clone APIs in CAPI

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D102987

3 years ago[gn build] Port b510e4cf1b96
LLVM GN Syncbot [Mon, 24 May 2021 18:48:17 +0000 (18:48 +0000)]
[gn build] Port b510e4cf1b96

3 years ago[RISCV] Add a vsetvli insert pass that can be extended to be aware of incoming VL...
Craig Topper [Mon, 24 May 2021 17:25:27 +0000 (10:25 -0700)]
[RISCV] Add a vsetvli insert pass that can be extended to be aware of incoming VL/VTYPE from other basic blocks.

This is a replacement for D101938 for inserting vsetvli
instructions where needed. This new version changes how
we track the information in such a way that we can extend
it to be aware of VL/VTYPE changes in other blocks. Given
how much it changes the previous patch, I've decided to
abandon the previous patch and post this from scratch.

For now the pass consists of a single phase that assumes
the incoming state from other basic blocks is unknown. A
follow up patch will extend this with a phase to collect
information about how VL/VTYPE change in each block and
a second phase to propagate this information to the entire
function. This will be used by a third phase to do the
vsetvli insertion.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D102737

3 years ago[gn build] Port a64ebb863727
LLVM GN Syncbot [Mon, 24 May 2021 18:36:50 +0000 (18:36 +0000)]
[gn build] Port a64ebb863727

3 years ago[WebAssembly] Add NullifyDebugValueLists pass
Heejin Ahn [Sun, 23 May 2021 09:09:17 +0000 (02:09 -0700)]
[WebAssembly] Add NullifyDebugValueLists pass

`WebAssemblyDebugValueManager` does not currently handle
`DBG_VALUE_LIST`, which is a recent addition to LLVM. We tried to
nullify them within the constructor of `WebAssemblyDebugValueManager` in
D102589, but it made the class error-prone to use because it deletes
instructions within the constructor and thus invalidates existing
iterators within the BB, so the user of the class should take special
care not to use invalidated iterators. This actually caused a bug in
ExplicitLocals pass.

Instead of trying to fix ExplicitLocals pass to make the iterator usage
correct, which is possible but error-prone, this adds
NullifyDebugValueLists pass that nullifies all `DBG_VALUE_LIST`
instructions before we run WebAssembly specific passes in the backend.
We can remove this pass after we implement handlers for
`DBG_VALUE_LIST`s in `WebAssemblyDebugValueManager` and elsewhere.

Fixes https://github.com/emscripten-core/emscripten/issues/14255.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D102999

3 years ago[dfsan] Add function that prints origin stack trace to buffer
George Balatsouras [Fri, 21 May 2021 17:56:45 +0000 (10:56 -0700)]
[dfsan] Add function that prints origin stack trace to buffer

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D102451

3 years ago[CUDA] Work around compatibility issue with libstdc++ 11.1.0
Artem Belevich [Fri, 21 May 2021 17:53:28 +0000 (10:53 -0700)]
[CUDA] Work around compatibility issue with libstdc++ 11.1.0

libstdc++ redeclares __failed_assertion multiple times, which results in the
function being declared with conflicting sets of attributes when we include <complex>
with __host__ __device__ attributes force-applied to all functions.

In order to work around the issue, we rename __failed_assertion within the
region with forced attributes.

See https://bugs.llvm.org/show_bug.cgi?id=50383 for the details.

Differential Revision: https://reviews.llvm.org/D102936

3 years agoEnable MLIR Python bindings for TOSA.
Stella Laurenzo [Mon, 24 May 2021 16:41:38 +0000 (16:41 +0000)]
Enable MLIR Python bindings for TOSA.

Differential Revision: https://reviews.llvm.org/D103035

3 years ago[lldb] Add missing mutex guards to TargetList::CreateTarget
Raphael Isemann [Mon, 24 May 2021 17:16:40 +0000 (19:16 +0200)]
[lldb] Add missing mutex guards to TargetList::CreateTarget

TestMultipleTargets is randomly failing on the bots. The reason for that is that
the test is calling `SBDebugger::CreateTarget` from multiple threads.
`TargetList::CreateTarget` is curiously missing the guard that all of its other
member functions have, so all the threads in the test end up changing the
internal TargetList state at the same time and end up corrupting it.
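
The race described above can be illustrated with a small Python stand-in. This is only a sketch: the `TargetList` class, `create_target`, and `create_from_threads` below are hypothetical names and do not mirror LLDB's C++ code. The point is that the whole read-modify-write of the shared list happens under one lock.

```python
import threading


class TargetList:
    """Hypothetical stand-in for a list of targets shared between threads."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._targets = []

    def create_target(self, name: str) -> int:
        # Hold the lock for the entire read-modify-write of the shared state.
        with self._mutex:
            index = len(self._targets)
            self._targets.append(name)
            return index


def create_from_threads(num_threads: int = 8, per_thread: int = 100) -> TargetList:
    """Create targets from several threads at once, as the failing test did."""
    targets = TargetList()

    def worker(tid: int) -> None:
        for i in range(per_thread):
            targets.create_target(f"t{tid}-{i}")

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return targets
```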

Reviewed By: vsk, JDevlieghere

Differential Revision: https://reviews.llvm.org/D103020

3 years agoRevert "[NFC] remove explicit default value for strboolattr attribute in tests"
serge-sans-paille [Mon, 24 May 2021 17:43:40 +0000 (19:43 +0200)]
Revert "[NFC] remove explicit default value for strboolattr attribute in tests"

This reverts commit bda6e5bee04c75b1f1332b4fd1ac4e8ef6c3c247.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance

3 years ago[NFC] remove explicit default value for strboolattr attribute in tests
serge-sans-paille [Sun, 23 May 2021 11:19:23 +0000 (13:19 +0200)]
[NFC] remove explicit default value for strboolattr attribute in tests

Since d6de1e1a71406c75a4ea4d5a2fe84289f07ea3a1, having no attribute is equivalent to
setting the attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080

3 years ago[X86] Call insertDAGNode on trunc/zext created in tryShiftAmountMod.
Craig Topper [Mon, 24 May 2021 17:19:09 +0000 (10:19 -0700)]
[X86] Call insertDAGNode on trunc/zext created in tryShiftAmountMod.

This puts the new nodes in the proper place in the topologically
sorted list of nodes.

Fixes PR50431, which was introduced recently in D101944.

3 years ago[gn build] Port 095e91c9737b
LLVM GN Syncbot [Mon, 24 May 2021 17:18:43 +0000 (17:18 +0000)]
[gn build] Port 095e91c9737b

3 years ago[NFC][scudo] Small test cleanup
Vitaly Buka [Sun, 23 May 2021 22:49:43 +0000 (15:49 -0700)]
[NFC][scudo] Small test cleanup

Fixing issues raised on D102979 review.

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D102994

3 years ago[Remarks] Add analysis remarks for memset/memcpy/memmove lengths
Jon Roelofs [Mon, 24 May 2021 16:49:32 +0000 (09:49 -0700)]
[Remarks] Add analysis remarks for memset/memcpy/memmove lengths

Re-landing now that the crasher this patch previously uncovered has been fixed
in: https://reviews.llvm.org/D102935

Differential revision: https://reviews.llvm.org/D102452

3 years ago[X86][Costmodel] getMaskedMemoryOpCost(): don't scalarize non-power-of-two vectors...
Roman Lebedev [Mon, 24 May 2021 17:09:04 +0000 (20:09 +0300)]
[X86][Costmodel] getMaskedMemoryOpCost(): don't scalarize non-power-of-two vectors with legal element type

This follows in steps of similar `getMemoryOpCost()` changes, D100099/D100684.

Intel SDM, `VPMASKMOV — Conditional SIMD Integer Packed Loads and Stores`:
```
Faults occur only due to mask-bit required memory accesses that caused the faults. Faults will not occur due to
referencing any memory location if the corresponding mask bit for that memory location is 0. For example, no
faults will be detected if the mask bits are all zero.
```
I.e., if mask is all-zeros, any address is fine.

Masked load/store's prime use case is, e.g., tail masking the loop remainder,
where for the last iteration only the first few elements of a vector exist.

Similarly, I don't see why we must scalarize non-power-of-two vectors
if the element type is something we can masked-store/load.
We simply need to legalize it, widen the mask, and be done with it.
And we already count the cost of widening the mask.
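
To make the masking semantics concrete, here is a minimal Python sketch of a masked load. It is only a model: `masked_load` and its parameters are hypothetical, and an `IndexError` stands in for a hardware fault. Lanes whose mask bit is clear never touch memory, so an all-zero mask is safe with any base address.

```python
def masked_load(memory, base, mask, passthru=0):
    """Load one element per enabled lane; disabled lanes never access memory."""
    result = []
    for lane, enabled in enumerate(mask):
        if enabled:
            # Only enabled lanes index into memory, so only they can "fault".
            result.append(memory[base + lane])
        else:
            result.append(passthru)
    return result


memory = [10, 20, 30, 40, 50]
# Tail of a loop: a 4-lane vector at offset 3, but only two elements remain.
tail = masked_load(memory, 3, [True, True, False, False])
# All-zero mask: the base address is far out of bounds, yet nothing is accessed.
none = masked_load(memory, 1000, [False, False, False, False])
```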

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D102990

3 years ago[RISCV] Optimize getVLENFactoredAmount function.
luxufan [Mon, 24 May 2021 16:51:04 +0000 (09:51 -0700)]
[RISCV] Optimize getVLENFactoredAmount function.

If `isPowerOf2_32(NumOfVReg - 1)` or `isPowerOf2_32(NumOfVReg + 1)` holds for the local variable `NumOfVReg`, the ADDI and MUL instructions can be replaced with SLLI and ADD (or SUB) instructions.
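
The arithmetic behind this can be sketched in Python. The sketch illustrates the identity only, not the RISC-V code, and the helper names `is_power_of_2` and `scale_by_shift` are hypothetical: multiplying by `2**k + 1` is a shift plus an add, and multiplying by `2**k - 1` is a shift minus the original value.

```python
def is_power_of_2(v: int) -> bool:
    return v > 0 and (v & (v - 1)) == 0


def scale_by_shift(x: int, n: int):
    """Return x * n using a shift and one add/sub, or None if n does not qualify."""
    if is_power_of_2(n - 1):
        k = (n - 1).bit_length() - 1
        return (x << k) + x  # shift left, then add
    if is_power_of_2(n + 1):
        k = (n + 1).bit_length() - 1
        return (x << k) - x  # shift left, then subtract
    return None
```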

Based on original patch by StephenFan.

Reviewed By: frasercrmck, StephenFan

Differential Revision: https://reviews.llvm.org/D100577

3 years ago[mlir][doc] Fix links and references in top level docs directory
Markus Böck [Mon, 24 May 2021 16:40:39 +0000 (18:40 +0200)]
[mlir][doc] Fix links and references in top level docs directory

This is the fourth and final patch in a series of patches fixing markdown links and references inside the mlir documentation. This patch combined with the other three should fix almost every broken link on mlir.llvm.org as far as I can tell.

This patch in particular addresses all Markdown files in the top level docs directory.

Differential Revision: https://reviews.llvm.org/D103032

3 years ago[Remarks] Look through inttoptr/ptrtoint for -ftrivial-auto-var-init remarks.
Jon Roelofs [Mon, 24 May 2021 16:19:31 +0000 (09:19 -0700)]
[Remarks] Look through inttoptr/ptrtoint for -ftrivial-auto-var-init remarks.

The crasher is a related problem that @aemerson found broke spec2k6/403.gcc
when I landed https://reviews.llvm.org/D102452. It has been reduced & modified
to reproduce without that patch.

Differential revision: https://reviews.llvm.org/D102935

3 years agoCoroSplit: Replace ad-hoc implementation of reachability with API from CFG.h
Adrian Prantl [Mon, 24 May 2021 16:06:00 +0000 (09:06 -0700)]
CoroSplit: Replace ad-hoc implementation of reachability with API from CFG.h

The current ad-hoc implementation used to determine whether a basic
block is unreachable doesn't work correctly in the general case (for
example it won't detect successors of unreachable blocks as
unreachable). This patch replaces it with the correct API that uses a
DominatorTree to answer the question correctly and quickly.
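
As a rough illustration of why a purely local check falls short, the Python sketch below compares a "has no predecessors" test with a traversal from the entry block. Both are assumptions made for illustration: the local check is a hypothetical stand-in rather than the code that was removed, and the traversal is a plain graph walk rather than the DominatorTree-based API the patch uses.

```python
def blocks_without_predecessors(cfg, entry):
    """Local check: flags only blocks that nothing branches to."""
    pred_count = {block: 0 for block in cfg}
    for succs in cfg.values():
        for succ in succs:
            pred_count[succ] += 1
    return {b for b in cfg if b != entry and pred_count[b] == 0}


def unreachable_blocks(cfg, entry):
    """Global check: everything not visited by a walk from the entry block."""
    seen = set()
    stack = [entry]
    while stack:
        block = stack.pop()
        if block in seen:
            continue
        seen.add(block)
        stack.extend(cfg[block])
    return set(cfg) - seen


# "dead.succ" has a predecessor ("dead"), but that predecessor is unreachable.
cfg = {
    "entry": ["exit"],
    "dead": ["dead.succ"],
    "dead.succ": ["exit"],
    "exit": [],
}
```

Here the local check reports only `dead`, while the traversal also reports `dead.succ`.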

rdar://77181156

Differential Revision: https://reviews.llvm.org/D102963

3 years ago[llvm] Revert align attr test in test/Bitcode/attribute-3.3.ll
Steven Wu [Mon, 24 May 2021 16:13:34 +0000 (09:13 -0700)]
[llvm] Revert align attr test in test/Bitcode/attribute-3.3.ll

Revert the testcase changed in D87304 now that the upgrader can correctly
handle the align attribute.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102880

3 years ago[mlir][tosa] Align tensor rank specifications with current spec
Suraj Sudhir [Mon, 24 May 2021 15:47:24 +0000 (15:47 +0000)]
[mlir][tosa] Align tensor rank specifications with current spec

Deconstrains several TOSA operators to align with the current TOSA spec, including all the elementwise ops.
Note: some more ops are under consideration for further cleanup; they will follow once the spec has been updated.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D102958

3 years ago[scudo] Separate Fuchsia & Default SizeClassMap
Kostya Kortchinsky [Wed, 19 May 2021 16:10:30 +0000 (09:10 -0700)]
[scudo] Separate Fuchsia & Default SizeClassMap

The Fuchsia allocator config was using the default size class map.

This CL gives Fuchsia its own size class map and changes a couple of
things in the default one:
- make `SizeDelta` configurable in `Config` for a fixed size class map
  as it currently is for a table size class map;
- switch `SizeDelta` to 0 for the default config, which allows for
  power-of-2 size classes and overall better page filling;
- increase the max number of cached pointers to 14 in the default,
  this makes the transfer batch 64/128 bytes on 32/64-bit platforms,
  which is cache-line friendly (previous size was 48/96 bytes).
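
The batch sizes quoted in the last item can be checked with a little arithmetic. The sketch below rests on an assumption that is not stated in this message: that a transfer batch holds the pointer array plus two word-sized bookkeeping fields. Under that assumption 14 pointers give 64/128 bytes, and the earlier 48/96 bytes would correspond to 10 pointers (an inferred value). The function name is hypothetical.

```python
def transfer_batch_bytes(num_cached_pointers: int, word_bytes: int,
                         header_words: int = 2) -> int:
    # header_words = 2 is an assumption chosen to reproduce the quoted sizes.
    return (num_cached_pointers + header_words) * word_bytes


new_sizes = (transfer_batch_bytes(14, 4), transfer_batch_bytes(14, 8))
old_sizes = (transfer_batch_bytes(10, 4), transfer_batch_bytes(10, 8))
```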

The Fuchsia size class map remains untouched for now. This doesn't
impact Android, which uses the table size class map.

Differential Revision: https://reviews.llvm.org/D102783

3 years ago[CVP] Add additional test for phi common val transform (NFC)
Nikita Popov [Mon, 24 May 2021 15:28:38 +0000 (17:28 +0200)]
[CVP] Add additional test for phi common val transform (NFC)

3 years ago[LoopUnroll] Add additional trip multiple test (NFC)
Nikita Popov [Mon, 24 May 2021 13:16:06 +0000 (15:16 +0200)]
[LoopUnroll] Add additional trip multiple test (NFC)

This uses a trip multiple on a (unique) non-latch exit.

3 years ago[LoopUnroll] Regenerate test checks (NFC)
Nikita Popov [Mon, 24 May 2021 10:20:47 +0000 (12:20 +0200)]
[LoopUnroll] Regenerate test checks (NFC)

3 years agoRemark was added to clang tooling Diagnostic
Ivan Murashko [Mon, 24 May 2021 15:21:44 +0000 (11:21 -0400)]
Remark was added to clang tooling Diagnostic

The diff adds Remark to Diagnostic::Level for clang tooling. That makes
Remark diagnostic level ready to use in clang-tidy checks: the
clang-diagnostic-module-import becomes visible as a part of the change.

3 years ago[CostModel][X86] Add missing SSE41 v2iX sext/zext costs
Simon Pilgrim [Mon, 24 May 2021 14:53:31 +0000 (15:53 +0100)]
[CostModel][X86] Add missing SSE41 v2iX sext/zext costs

Also fix existing v4i8->v4i16 sext cost to match the equivalents

3 years ago[libc++][doc] Update format paper status.
Mark de Wever [Mon, 24 May 2021 14:44:22 +0000 (16:44 +0200)]
[libc++][doc] Update format paper status.

- Fixes paper number P1862 -> P1868. (The title was correct.)
- Marks P1868 as in progress.
- Marks P1892 as in progress.
- Marks LWG-3327 as nothing to do, since the wording change doesn't
  impact the code. (Also updated on the general C++20 status page.)

3 years ago[NVPTX] Fix lowering of frem for negative values
thomasraoux [Mon, 24 May 2021 14:36:29 +0000 (07:36 -0700)]
[NVPTX] Fix lowering of frem for negative values

To match fmod, the frem result must have the sign of the dividend. The previous
implementation produced the wrong sign for negative numbers. For example,
frem(-16, 7) returned 5 instead of -2. We should just use ftrunc instead of
floor when lowering to get the right behavior.
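
The sign difference can be reproduced with a short Python sketch. It shows the two formulas only, not the NVPTX lowering, and the helper names `frem_via_floor` and `frem_via_trunc` are hypothetical.

```python
import math


def frem_via_floor(x: float, y: float) -> float:
    # Quotient rounded toward negative infinity: result takes the divisor's sign.
    return x - y * math.floor(x / y)


def frem_via_trunc(x: float, y: float) -> float:
    # Quotient rounded toward zero: result takes the dividend's sign, like fmod.
    return x - y * math.trunc(x / y)


print(frem_via_floor(-16.0, 7.0))  # 5.0
print(frem_via_trunc(-16.0, 7.0))  # -2.0
print(math.fmod(-16.0, 7.0))       # -2.0
```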

Differential Revision: https://reviews.llvm.org/D102528

3 years ago[CostModel][X86] Regenerate sse-itoi.ll test checks
Simon Pilgrim [Mon, 24 May 2021 14:40:42 +0000 (15:40 +0100)]
[CostModel][X86] Regenerate sse-itoi.ll test checks

3 years ago[ConstProp] propagate poison from vector reduction element(s) to result
Sanjay Patel [Mon, 24 May 2021 14:15:08 +0000 (10:15 -0400)]
[ConstProp] propagate poison from vector reduction element(s) to result

This follows from the underlying logic for binops and min/max.
Although it does not appear that we handle this for min/max
intrinsics currently.
https://alive2.llvm.org/ce/z/Kq9Xnh
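
A toy model of the folding rule, written in Python purely for illustration: the `POISON` sentinel and `reduce_add` are hypothetical and are not LLVM APIs. If any element of the constant vector is poison, the folded reduction is poison as well.

```python
POISON = object()  # hypothetical sentinel standing in for a poison value


def reduce_add(elements):
    """Fold an add-reduction over constant elements, propagating poison."""
    if any(e is POISON for e in elements):
        return POISON
    return sum(elements)
```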

3 years ago[ConstProp] add tests for vector reductions with poison elements; NFC
Sanjay Patel [Mon, 24 May 2021 13:55:36 +0000 (09:55 -0400)]
[ConstProp] add tests for vector reductions with poison elements; NFC

3 years ago[VPlan] Add first VPlan version of sinkScalarOperands.
Florian Hahn [Mon, 24 May 2021 13:14:08 +0000 (14:14 +0100)]
[VPlan] Add first VPlan version of sinkScalarOperands.

This patch adds a first VPlan-based implementation of sinking of scalar
operands.

The current version traverses a VPlan once and processes all operands of
a predicated REPLICATE recipe. If one of those operands can be sunk,
it is moved to the block containing the predicated REPLICATE recipe.
Processing then continues with the operands of the sunk recipe.

The initial version does not re-process candidates after other recipes
have been sunk. It also cannot partially sink induction increments at
the moment. The VPlan only contains WIDEN-INDUCTION recipes and if the
induction is used for example in a GEP, only the first lane is used and
in the lowered IR the adds for the other lanes can be sunk into the
predicated blocks.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D100258

3 years ago[lldb] Readd deleted variable in the sample test
Raphael Isemann [Mon, 24 May 2021 14:24:45 +0000 (16:24 +0200)]
[lldb] Readd deleted variable in the sample test

In D102771 I wanted to make `test_var` global to demonstrate a no-launch test,
but the old variable is still needed for another test. This just creates the
global var with a different name to demonstrate the no-launch functionality.

3 years ago[lldb] Introduce createTestTarget for creating a valid target in API tests
Raphael Isemann [Mon, 24 May 2021 14:01:48 +0000 (16:01 +0200)]
[lldb] Introduce createTestTarget for creating a valid target in API tests

At the moment nearly every test calls something similar to
`self.dbg.CreateTarget(self.getBuildArtifact("a.out"))` and then sometimes
checks if the created target is actually valid with something like
`self.assertTrue(target.IsValid(), "some useless text")`.

Besides being really verbose, the error messages generated by this pattern
only indicate that the target failed to be created, but not why.

This patch introduces a helper function `createTestTarget` to our Test class
that creates the target with the much more verbose `CreateTarget` overload that
gives us back an SBError (with a fancy error). If the target couldn't be created
the function prints out the SBError that LLDB returned and asserts for us. It
also defaults to the "a.out" build artifact path that nearly all tests are using
to avoid hardcoding "a.out" in every test.

I converted a bunch of tests to the new function but I'll do the rest of the
test suite as follow ups.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D102771

3 years ago[lldb] Reland "Fix UB in half2float" to fix the ubsan bot.
Raphael Isemann [Mon, 24 May 2021 13:01:15 +0000 (15:01 +0200)]
[lldb] Reland "Fix UB in half2float" to fix the ubsan bot.

This relands part of the UB fix in 4b074b49be206306330076b9fa40632ef1960823.
The original commit also added some additional tests that uncovered some
other issues (see D102845). I landed all the passing tests in
48780527dd6820698f3537f5ebf76499030ee349 and this patch is now just fixing
the UB in half2float. See D102846 for a proposed rewrite of the function.

Original commit message:

  The added DumpDataExtractorTest uncovered that this is lshifting a negative
  integer which upsets ubsan and breaks the sanitizer bot. This patch just
  changes the variable we shift to be unsigned.

3 years ago[OpenCL][Docs] Minor update to OpenCL 3.0
Anastasia Stulova [Mon, 24 May 2021 13:18:56 +0000 (14:18 +0100)]
[OpenCL][Docs] Minor update to OpenCL 3.0

3 years ago[CostModel][X86] Improve accuracy of vector non-uniform shift costs on XOP/AVX2 targets
Simon Pilgrim [Mon, 24 May 2021 12:59:15 +0000 (13:59 +0100)]
[CostModel][X86] Improve accuracy of vector non-uniform shift costs on XOP/AVX2 targets

By llvm-mca analysis, Haswell/Broadwell has the worst non-uniform vector shift recip-throughput cost of the AVX2 targets, at 2 for both 128-bit and 256-bit vectors. XOP-capable targets have better 128-bit vector shifts, so improve the fallback in those cases.

3 years ago[VectorCombine] Fix load extract scalarization tests with assumes.
Florian Hahn [Mon, 24 May 2021 10:44:13 +0000 (11:44 +0100)]
[VectorCombine] Fix load extract scalarization tests with assumes.

The input IR for @load_extract_idx_var_i64_known_valid_by_assume
and @load_extract_idx_var_i64_not_known_valid_by_assume_after_load
has been swapped.

This patch fixes the test so that @load_extract_idx_var_i64_known_valid_by_assume
has the assume before the load and the other test has it after.

3 years ago[VPlan] Add mayReadOrWriteMemory & friends.
Florian Hahn [Fri, 14 May 2021 22:18:04 +0000 (23:18 +0100)]
[VPlan] Add mayReadOrWriteMemory & friends.

This patch adds initial implementation of mayReadOrWriteMemory,
mayReadFromMemory and mayWriteToMemory to VPRecipeBase.

Used by D100258.

3 years ago[OpenCL] Fix test by adding SPIR triple
Anastasia Stulova [Mon, 24 May 2021 12:03:32 +0000 (13:03 +0100)]
[OpenCL] Fix test by adding SPIR triple

3 years ago[AArch64][SVE] Add fixed length codegen for FP_ROUND/FP_EXTEND
Bradley Smith [Tue, 11 May 2021 15:39:36 +0000 (16:39 +0100)]
[AArch64][SVE] Add fixed length codegen for FP_ROUND/FP_EXTEND

Depends on D102498

Differential Revision: https://reviews.llvm.org/D102607

3 years ago[AArch64][SVE] Improve codegen for fixed length vector concat
Bradley Smith [Fri, 14 May 2021 11:22:52 +0000 (12:22 +0100)]
[AArch64][SVE] Improve codegen for fixed length vector concat

Differential Revision: https://reviews.llvm.org/D102498

3 years ago[OpenCL] Add clang extension for bit-fields.
Anastasia Stulova [Mon, 24 May 2021 11:38:02 +0000 (12:38 +0100)]
[OpenCL] Add clang extension for bit-fields.

Allow use of bit-fields as a clang extension
in OpenCL. The extension can be enabled using
pragma directives.

This fixes PR45339!

Differential Revision: https://reviews.llvm.org/D101843

3 years ago[ARM] Allow findLoopPreheader to return headers with multiple loop successors
David Green [Mon, 24 May 2021 11:22:15 +0000 (12:22 +0100)]
[ARM] Allow findLoopPreheader to return headers with multiple loop successors

The findLoopPreheader function will currently not find a preheader if it
branches to multiple different loop headers. This patch adds an option
to relax that, allowing ARMLowOverheadLoops to process more loops
successfully. This helps with WhileLoopStart setup instructions that can
branch/fall through to the low overhead loop and branch to a separate
loop from the same preheader (but I don't believe it is possible for
both loops to be low overhead loops).

Differential Revision: https://reviews.llvm.org/D102747

3 years agoRecommit "[VectorCombine] Scalarize vector load/extract."
Florian Hahn [Mon, 24 May 2021 09:11:38 +0000 (10:11 +0100)]
Recommit "[VectorCombine] Scalarize vector load/extract."

This reverts commit 94d54155e2f38b56171811757044a3e6f643c14b.

This fixes a sanitizer failure by moving scalarizeLoadExtract(I)
before foldSingleElementStore(I), which may remove instructions.

3 years ago[ARM] Ensure WLS preheader blocks have branches during memcpy lowering
David Green [Mon, 24 May 2021 10:26:45 +0000 (11:26 +0100)]
[ARM] Ensure WLS preheader blocks have branches during memcpy lowering

This makes sure that the blocks created for lowering memcpy to loops end
up with branches, even if they fall through to the successor. Otherwise
IfCvt is getting confused with unanalyzable branches and creating
invalid block layouts.

The extra branches should be removed as the tail predicated loop is
finalized in almost all cases.

3 years ago[ARM] Fix inline memcpy trip count sequence
David Green [Mon, 24 May 2021 10:01:58 +0000 (11:01 +0100)]
[ARM] Fix inline memcpy trip count sequence

The trip count for a memcpy/memset will be n/16 rounded up to the
nearest integer, i.e. (n+15)>>4. The old code also included a BIC to
clear one of the bits, which does not seem correct. This removes the
extra BIC.
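
The rounding identity is easy to check with a small Python sketch (the function name `trip_count` is hypothetical). For non-negative `n`, adding 15 before shifting right by 4 gives the ceiling of `n / 16`.

```python
def trip_count(n: int) -> int:
    """Number of 16-byte iterations needed to cover n bytes (n >= 0)."""
    return (n + 15) >> 4


for n in (0, 1, 15, 16, 17, 32, 33):
    print(n, trip_count(n))
```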

Note that ideally this would never actually be generated, as in the
creation of a tail predicated loop we will DCE that setup code, letting
the WLSTP perform the trip count calculation. So this doesn't usually
come up in testing (and apparently the ARMLowOverheadLoops pass does not
do any sort of validation on the tripcount). Only if the generation of
the WLSTP fails will it use the incorrect BIC instructions.

Differential Revision: https://reviews.llvm.org/D102629

3 years ago[MLIR] Drop old cmake var names
Uday Bondhugula [Mon, 24 May 2021 03:20:17 +0000 (08:50 +0530)]
[MLIR] Drop old cmake var names

Drop old cmake variable names that were kept around so that zorg
buildbot could be migrated, which has now happened (D102977). D102976
had fixed the inconsistent names.

Differential Revision: https://reviews.llvm.org/D102997

3 years ago[RISCV] Prevent store combining from infinitely looping
Fraser Cormack [Fri, 21 May 2021 12:00:19 +0000 (13:00 +0100)]
[RISCV] Prevent store combining from infinitely looping

RVV code generation does not successfully custom-lower BUILD_VECTOR in all
cases. When it resorts to default expansion it may, on occasion, be expanded to
scalar stores through the stack. Unfortunately these stores may then be picked
up by the post-legalization DAGCombiner which merges them again. The merged
store uses a BUILD_VECTOR which is then expanded, and so on.

This patch addresses the issue by overriding the `mergeStoresAfterLegalization`
hook. A lack of granularity in this method (being passed the scalar type) means
we opt out in almost all cases when RVV fixed-length vector support is enabled.
The only exceptions to this rule are mask vectors, which are always either
custom-lowered or are expanded to a load from a constant pool.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D102913

3 years ago[debuginfo-tests] Stop using installed LLDB and remove redundancy
James Henderson [Tue, 18 May 2021 09:57:32 +0000 (10:57 +0100)]
[debuginfo-tests] Stop using installed LLDB and remove redundancy

The removed code just replicated what use_llvm_tool does, plus looked
for an installed LLDB on the PATH to use. In a monorepo world, it seems
likely that if people want to run the tests that require LLDB, they
should enable and build LLDB itself. If users really want to use the
installed LLDB executable, they can specify the path to the executable
as an environment variable "LLDB".

See the discussion in https://reviews.llvm.org/D95339#2638619 for
more details.

Reviewed by: jmorse, aprantl

Differential Revision: https://reviews.llvm.org/D102680

3 years ago[NFCI][LoopIdiom] 'left-shift until bittest': assert that BaseX is loop-invariant
Roman Lebedev [Mon, 24 May 2021 09:13:05 +0000 (12:13 +0300)]
[NFCI][LoopIdiom] 'left-shift until bittest': assert that BaseX is loop-invariant

Given that BaseX is an incoming value when coming from the preheader,
it *should* be loop-invariant, but let's just document this assumption.

3 years ago[LoopIdiom] 'logical right shift until zero': the value must be loop-invariant
Roman Lebedev [Mon, 24 May 2021 09:08:30 +0000 (12:08 +0300)]
[LoopIdiom] 'logical right shift until zero': the value must be loop-invariant

As per the reproducer provided by Mikael Holmén in post-commit review.

3 years agoflang: include limits
Thorsten Schütt [Mon, 24 May 2021 09:11:52 +0000 (11:11 +0200)]
flang: include limits

3 years agoRevert "[VectorCombine] Scalarize vector load/extract."
Florian Hahn [Mon, 24 May 2021 09:11:00 +0000 (10:11 +0100)]
Revert "[VectorCombine] Scalarize vector load/extract."

This reverts commit 86497785d540e59eaca24bed4219ddec183cbc9b.

One of the tests causes an ASAN failure.
https://lab.llvm.org/buildbot/#/builders/5/builds/7927/steps/12/logs/stdio

3 years ago[CostModel][X86] Improve accuracy of vXi64 MUL costs on AVX2/AVX512 targets
Simon Pilgrim [Sun, 23 May 2021 21:50:45 +0000 (22:50 +0100)]
[CostModel][X86] Improve accuracy of vXi64 MUL costs on AVX2/AVX512 targets

By llvm-mca analysis, Haswell/Broadwell has the worst v4i64 recip-throughput cost of the AVX2 targets at 6 (vs the currently used cost of 8). Similarly, SkylakeServer (our only AVX512 target model) implements PMULLQ with an average cost of 1.5 (rounded up to 2.0), and the PMULUDQ sequence (without AVX512DQ) at a cost of 6.

3 years ago[AMDGPU][Libomptarget] Remove global KernelNameMap
Pushpinder Singh [Fri, 21 May 2021 08:03:26 +0000 (08:03 +0000)]
[AMDGPU][Libomptarget] Remove global KernelNameMap

KernelNameMap contains entries like "key.kd" => key which clearly
could be replaced by simple logic of removing suffix from the key.
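
The "simple logic" amounts to stripping a fixed suffix. A minimal Python sketch of the idea follows; the function name is hypothetical and the actual change is in the C++ plugin.

```python
def kernel_name_from_descriptor(symbol: str, suffix: str = ".kd") -> str:
    """Derive the kernel name from its descriptor symbol by dropping the suffix."""
    if symbol.endswith(suffix):
        return symbol[: -len(suffix)]
    return symbol
```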

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D102691

3 years ago[Debug-Info]update section name to match AIX behaviour; nfc
Chen Zheng [Mon, 24 May 2021 08:32:23 +0000 (04:32 -0400)]
[Debug-Info]update section name to match AIX behaviour; nfc

3 years ago[VectorCombine] Scalarize vector load/extract.
Florian Hahn [Mon, 24 May 2021 08:19:40 +0000 (09:19 +0100)]
[VectorCombine] Scalarize vector load/extract.

This patch adds a new combine that tries to scalarize chains of
`extractelement (load %ptr), %idx` to `load (gep %ptr, %idx)`. This is
profitable when extracting only a few elements out of a large vector.

At the moment, `store (extractelement (load %ptr), %idx), %ptr`
operations on large vectors result in huge code in the backend.

This can easily be triggered by using the matrix extension, e.g.
https://clang.godbolt.org/z/qsccPdPf4

This should complement D98240.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D100273

3 years ago[analyzer] Correctly propagate ConstructionContextLayer thru ParenExpr
Tomasz Kamiński [Mon, 24 May 2021 08:16:52 +0000 (10:16 +0200)]
[analyzer] Correctly propagate ConstructionContextLayer thru ParenExpr

Previously, information about `ConstructionContextLayer` was not
propagated through `ParenExpr`, causing an expression like:

  Var c = (createVar());

to produce an unrelated temporary for the `createVar()` result and conjure
a new symbol for the value of `c` in C++17 mode.

Reviewed By: steakhal

Patch By: tomasz-kaminski-sonarsource!

Differential Revision: https://reviews.llvm.org/D102835

3 years ago[Attributor] Introduce a helper to deal with constant type mismatches
Johannes Doerfert [Mon, 10 May 2021 01:02:18 +0000 (20:02 -0500)]
[Attributor] Introduce a helper to deal with constant type mismatches

If we simplify values we sometimes end up with type mismatches. If the
value is a constant we can often cast it though to still allow
propagation. The logic is now put into a helper and it replaces some
ad hoc things we did before.

This also introduces the AA namespace for abstract attribute related
functions and types.

3 years ago[Attributor] Teach AAIsDead about undef values
Johannes Doerfert [Sat, 8 May 2021 04:05:40 +0000 (23:05 -0500)]
[Attributor] Teach AAIsDead about undef values

Not only if the branch or switch condition is dead but also if it is
assumed `undef` we can delay AAIsDead exploration.

3 years ago[Attributor] Deal with address spaces gracefully
Johannes Doerfert [Thu, 6 May 2021 15:21:16 +0000 (10:21 -0500)]
[Attributor] Deal with address spaces gracefully

When we do value propagation we need to cast address spaces properly.

3 years ago[Attributor] Be more careful to not disturb the CG outside the SCC
Johannes Doerfert [Sun, 16 May 2021 01:19:11 +0000 (20:19 -0500)]
[Attributor] Be more careful to not disturb the CG outside the SCC

We have seen various problems when the call graph was not updated or
the update did not succeed because it involved functions outside the
SCC. This patch adds assertions and checks to avoid accidentally
changing something outside the SCC that would impact the call graph.
It also prevents us from reanalyzing functions outside the current
SCC which could cause problems on its own. Note that the transformations
we do might cause the CG to be "more precise" but the original one would
always be a super set of the most precise one. Since the call graph is
by nature an approximation, it is good enough to have a super set of all
call edges.

3 years ago[MLIR] [Python] Add Operation.parent
John Demme [Mon, 24 May 2021 03:37:55 +0000 (20:37 -0700)]
[MLIR] [Python] Add Operation.parent

Attribute to get the parent operation of an operation.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D102981

3 years ago[lld][MachO] Fix code formatting
Alexander Shaposhnikov [Mon, 24 May 2021 03:35:55 +0000 (20:35 -0700)]
[lld][MachO] Fix code formatting

Apply clang-format -style=llvm to InputFile.cpp. NFC.

Test plan: make check-all

3 years ago[MLIR] Make MLIR cmake variable names consistent
Uday Bondhugula [Sat, 22 May 2021 14:25:38 +0000 (19:55 +0530)]
[MLIR] Make MLIR cmake variable names consistent

Fix inconsistent MLIR CMake variable names. Consistently name them as
MLIR_ENABLE_<feature>.

Eg: MLIR_CUDA_RUNNER_ENABLED -> MLIR_ENABLE_CUDA_RUNNER

MLIR follows (or has mostly followed) the convention of naming
cmake enabling variables in the form MLIR_ENABLE_... etc. Using a
convention here is easy and also important for convenience. A counter
pattern was started with variables named MLIR_..._ENABLED. This led to a
sequence of related counter patterns: MLIR_CUDA_RUNNER_ENABLED,
MLIR_ROCM_RUNNER_ENABLED, etc. From a naming standpoint, the imperative
form is more meaningful. Additional discussion at:
https://llvm.discourse.group/t/mlir-cmake-enable-variable-naming-convention/3520

Switch all inconsistent ones to the ENABLE form. Keep the couple of old
mappings needed until buildbot config is migrated.
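
As an illustration of the naming rule only (this is not part of the CMake
change itself), the mapping from the old MLIR_..._ENABLED pattern to the
MLIR_ENABLE_... form can be sketched in Python. The helper name
`to_enable_form` is hypothetical.

```python
import re

# Matches the counter pattern MLIR_<feature>_ENABLED described above.
_OLD_FORM = re.compile(r"^MLIR_(?P<feature>[A-Z0-9_]+)_ENABLED$")


def to_enable_form(name):
    """Return the MLIR_ENABLE_<feature> spelling of an old-style name.

    Hypothetical helper: names that do not match the old pattern are
    returned unchanged.
    """
    match = _OLD_FORM.match(name)
    if match is None:
        return name
    return "MLIR_ENABLE_" + match.group("feature")


print(to_enable_form("MLIR_CUDA_RUNNER_ENABLED"))
```

Names already in the ENABLE form pass through untouched, so the helper can
be applied to a mixed list of variables.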

Differential Revision: https://reviews.llvm.org/D102976

3 years ago[mlir] Normalize dynamic memrefs with a map of tiled-layout.
Haruki Imai [Mon, 24 May 2021 03:04:45 +0000 (08:34 +0530)]
[mlir] Normalize dynamic memrefs with a map of tiled-layout.

Steps for normalizing dynamic memrefs for tiled layout map
1. Check if original map is tiled layout. Only tiled layout is supported.
2. Create normalized memrefType. Dimensions that include dynamic dimensions
   in the map output will be dynamic dimensions.
3. Create new maps to calculate each dimension size of new memref.
   In tiled layout, the dimension size can be calculated by replacing
    "floordiv <tile size>" with "ceildiv <tile size>" and
    "mod <tile size>" with "<tile size>".
4. Create AffineApplyOp to apply the new maps. The output of AffineApplyOp is
   dynamicSizes for new AllocOp.
5. Add the new dynamic sizes in new AllocOp.
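
The size computation in steps 2 to 4 can be sketched in Python. This is a
minimal model, not the MLIR implementation: the tiled layout map is assumed
to be a list of hypothetical `(op, dim, tile)` tuples where `op` is
"floordiv" or "mod", and `None` stands for a dynamic dimension.

```python
def ceildiv(a, b):
    """Ceiling division for positive tile sizes."""
    return -(-a // b)


def result_size(op, size, tile):
    # Step 3: "floordiv <tile>" becomes "ceildiv <tile>",
    # and "mod <tile>" becomes "<tile>".
    return ceildiv(size, tile) if op == "floordiv" else tile


def normalized_type_shape(shape, tiled_map):
    """Step 2: outputs that involve a dynamic input dim stay dynamic (None)."""
    return [
        None if shape[dim] is None else result_size(op, shape[dim], tile)
        for op, dim, tile in tiled_map
    ]


def dynamic_sizes(shape, runtime_shape, tiled_map):
    """Steps 3-5: runtime sizes for the dynamic result dims only."""
    return [
        result_size(op, runtime_shape[dim], tile)
        for op, dim, tile in tiled_map
        if shape[dim] is None
    ]


# (d0, d1) -> (d0 floordiv 32, d1 floordiv 64, d0 mod 32, d1 mod 64)
tiled = [("floordiv", 0, 32), ("floordiv", 1, 64), ("mod", 0, 32), ("mod", 1, 64)]
print(normalized_type_shape([None, 128], tiled))
print(dynamic_sizes([None, 128], [100, 128], tiled))
```

In the patch the dynamic sizes are produced by AffineApplyOp at runtime; the
list returned by `dynamic_sizes` plays the role of the operands passed to the
new AllocOp.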

This patch also sets the MemRefsNormalizable trait in CastOp and DimOp since
they are used with dynamic memrefs.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D97655

3 years ago[Attributor][FIX] Account for undef in the constant value lattice
Johannes Doerfert [Sat, 8 May 2021 04:45:07 +0000 (23:45 -0500)]
[Attributor][FIX] Account for undef in the constant value lattice

The constant value lattice looks like this

```
  <None>
     |
  <undef>
  /  |   \
... <0>  ...
 \   |   /
 <unknown>
```
We did not account for the undef and assumed a value meant we could not
change anymore. Now we actually check if we have the same value as
before, which will signal CHANGED to the users when we go from undef to
a specific constant.

This fixes, among other things, the bug exposed by @ipccp4 in
`value-simplify.ll`.
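
A minimal Python sketch of the lattice above, assuming a simplified model
rather than the Attributor's actual classes (`Kind`, `ConstLattice`, and
`merge` are hypothetical names). It reports a change by comparing the state
before and after the merge, so moving from undef to a specific constant is
signalled as CHANGED.

```python
from enum import Enum


class Kind(Enum):
    NONE = 0     # top: no value seen yet
    UNDEF = 1
    CONST = 2
    UNKNOWN = 3  # bottom: conflicting values


class ConstLattice:
    def __init__(self):
        self.kind = Kind.NONE
        self.value = None

    def state(self):
        return (self.kind, self.value)

    def merge(self, kind, value=None):
        """Merge an incoming value; return True (CHANGED) if the state moved."""
        before = self.state()
        if self.kind is Kind.UNKNOWN or kind is Kind.NONE:
            pass  # already at the bottom, or nothing new to merge
        elif kind is Kind.UNKNOWN:
            self.kind, self.value = Kind.UNKNOWN, None
        elif self.kind in (Kind.NONE, Kind.UNDEF):
            # undef is not final: it can still become a specific constant
            if kind is Kind.CONST or self.kind is Kind.NONE:
                self.kind, self.value = kind, value
        elif kind is Kind.CONST and value != self.value:
            self.kind, self.value = Kind.UNKNOWN, None
        # Compare against the previous state instead of assuming that
        # "having a value" means no further change is possible.
        return self.state() != before


lat = ConstLattice()
print(lat.merge(Kind.UNDEF), lat.merge(Kind.CONST, 0), lat.merge(Kind.CONST, 0))
```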

3 years ago[Attributor][FIX] Ensure we replace undef if we see the first "real" value
Johannes Doerfert [Thu, 6 May 2021 21:51:37 +0000 (16:51 -0500)]
[Attributor][FIX] Ensure we replace undef if we see the first "real" value

The state of AAPotentialValues tracks if undef is contained. It should
fold undef into the first non-undef value. However we missed a case
before. There was also a shadowing definition of two variables that
caused trouble. The test exposes both problems.
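
The intended folding behaviour can be sketched in Python as follows. This is
an assumed, simplified model of a potential-values state, not the
AAPotentialValues code; the class and method names are hypothetical.

```python
class PotentialValues:
    """Toy state: a set of concrete values plus an 'undef contained' flag."""

    def __init__(self):
        self.values = set()
        self.undef_contained = False

    def add_undef(self):
        # undef only needs tracking while no real value is known
        if not self.values:
            self.undef_contained = True

    def add(self, value):
        # The first real value absorbs a previously seen undef.
        self.values.add(value)
        self.undef_contained = False

    def union(self, other):
        for value in other.values:
            self.add(value)
        if other.undef_contained:
            self.add_undef()


state = PotentialValues()
state.add_undef()
state.add(7)
print(state.values, state.undef_contained)
```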

3 years ago[Attributor][NFC] Precommit test case with branch on undef
Johannes Doerfert [Fri, 21 May 2021 17:43:15 +0000 (12:43 -0500)]
[Attributor][NFC] Precommit test case with branch on undef

This test exposes a bug in the module pass as it simplifies ipccp4 to
unreachable, which is unfortunately wrong.

3 years ago[Attributor][NFC] Add helpful debug outputs
Johannes Doerfert [Sun, 16 May 2021 01:16:39 +0000 (20:16 -0500)]
[Attributor][NFC] Add helpful debug outputs

3 years ago[Attributor][NFC] Clang format the Attributor source files
Johannes Doerfert [Thu, 6 May 2021 16:27:06 +0000 (11:27 -0500)]
[Attributor][NFC] Clang format the Attributor source files

3 years ago[Attributor][NFC] Rerun update_test_checks script on Attributor tests
Johannes Doerfert [Sun, 16 May 2021 01:32:59 +0000 (20:32 -0500)]
[Attributor][NFC] Rerun update_test_checks script on Attributor tests

3 years ago[Debug-Info] handle DW_TAG_rvalue_reference_type at strict DWARF.
Chen Zheng [Thu, 20 May 2021 09:52:34 +0000 (05:52 -0400)]
[Debug-Info] handle DW_TAG_rvalue_reference_type at strict DWARF.

When -gstrict-dwarf is specified, generate DW_TAG_rvalue_reference_type
at DWARF 4 or above
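
The version check can be sketched in Python. This is a simplified model with
a hypothetical function name; it assumes that without -gstrict-dwarf the tag
is emitted regardless of the DWARF version, and it does not model what is
emitted in place of the tag otherwise.

```python
# DW_TAG_rvalue_reference_type was introduced in DWARF 4.
DW_TAG_rvalue_reference_type = 0x42
RVALUE_REFERENCE_MIN_VERSION = 4


def may_emit_rvalue_reference_tag(dwarf_version, strict_dwarf):
    """Strict mode only allows tags defined by the requested DWARF version."""
    if not strict_dwarf:
        return True
    return dwarf_version >= RVALUE_REFERENCE_MIN_VERSION


for version in (3, 4, 5):
    print(version, may_emit_rvalue_reference_tag(version, strict_dwarf=True))
```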

Reviewed By: dblaikie, aprantl

Differential Revision: https://reviews.llvm.org/D100630

3 years ago[NFC] Removing leftover debug code
Fady Ghanim [Sun, 23 May 2021 23:13:09 +0000 (19:13 -0400)]
[NFC] Removing leftover debug code

Removing a missed value::dump() used to debug during development of
OMPBuilder atomic.

3 years ago[AArch64] Delete unneeded fixup_aarch64_ldr_pcrel_imm19 VK_GOT special case
Fangrui Song [Sun, 23 May 2021 22:20:56 +0000 (15:20 -0700)]
[AArch64] Delete unneeded fixup_aarch64_ldr_pcrel_imm19 VK_GOT special case

An AArch64 VK_GOT fixup must have a symbol. MCAssembler::evaluateFixup considers
such a fixup not resolved. The code path cannot trigger.

3 years ago[OpenMP][OMPIRBuilder]Adding support for `omp atomic`
Fady Ghanim [Thu, 6 May 2021 21:23:28 +0000 (17:23 -0400)]
[OpenMP][OMPIRBuilder]Adding support for `omp atomic`

This patch adds support for generating `omp atomic` for all different
atomic clauses

3 years ago[NFC][scudo] Enforce header size alignment
Vitaly Buka [Sun, 23 May 2021 21:12:49 +0000 (14:12 -0700)]
[NFC][scudo] Enforce header size alignment

As-is it should not change the struct size, but it will
help to keep the correct size if more fields are added.

3 years ago[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFil...
Philipp Krones [Sun, 23 May 2021 21:15:23 +0000 (14:15 -0700)]
[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo

This makes it possible for targets to define their own MCObjectFileInfo.
This MCObjectFileInfo is then used to determine things like section alignment.

This is a follow up to D101462 and prepares for the RISCV backend defining the
text section alignment depending on the enabled extensions.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D101921

3 years ago[libc++] use more early returns for consistency
Joerg Sonnenberger [Sun, 23 May 2021 20:55:45 +0000 (22:55 +0200)]
[libc++] use more early returns for consistency

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D96983

3 years ago[LoopUnroll] Add test for partial unrolling against non-latch exit (NFC)
Nikita Popov [Sun, 23 May 2021 21:08:32 +0000 (23:08 +0200)]
[LoopUnroll] Add test for partial unrolling against non-latch exit (NFC)

This test case would get miscompiled by the current version of
D102982, because unrolling does not respect the PreserveCondBr
flag for partial unrolling.

3 years ago[IR] Add a Location to BlockArgument
Chris Lattner [Sun, 23 May 2021 21:08:31 +0000 (14:08 -0700)]
[IR] Add a Location to BlockArgument

This adds the ability to specify a location when creating BlockArguments.
Notably Value::getLoc() will return this correctly, which makes diagnostics
more precise (e.g. the example in test-legalize-type-conversion.mlir).

This is currently optional to avoid breaking any existing code - if
absent, the BlockArgument defaults to using the location of its enclosing
operation (preserving existing behavior).
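
The optional-location fallback can be modelled in a few lines of Python.
This is a toy sketch with hypothetical classes and string locations, not the
MLIR C++ API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operation:
    loc: str


@dataclass
class BlockArgument:
    owner_op: Operation
    explicit_loc: Optional[str] = None

    def get_loc(self):
        # Fall back to the enclosing operation's location when none was given.
        if self.explicit_loc is not None:
            return self.explicit_loc
        return self.owner_op.loc


func = Operation(loc="file.mlir:3:1")
print(BlockArgument(func).get_loc())
print(BlockArgument(func, "file.mlir:3:14").get_loc())
```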

The bulk of this change is plumbing location tracking through the parser
and printer to make sure it can round trip (in -mlir-print-debuginfo
mode).  This is complete for generic operations, but requires manual
adoption for custom ops.

I added support for function-like ops to round trip their argument
locations - they print correctly, but when parsing the locations are
dropped on the floor.  I intend to fix this, but it will require more
invasive plumbing through "function_like_impl" stuff so I think it
best to split it out to its own patch.

This is a reapply of the patch here: https://reviews.llvm.org/D102567
with an additional change: we now never defer block argument locations,
guaranteeing that we can round trip correctly.

This isn't required in all cases, but allows us to hill climb here and
works around unrelated bugs like https://bugs.llvm.org/show_bug.cgi?id=50451

Differential Revision: https://reviews.llvm.org/D102991

3 years ago[AArch64][MC] Remove unneeded "in .xxx directive" from diagnostics
Fangrui Song [Sun, 23 May 2021 20:58:16 +0000 (13:58 -0700)]
[AArch64][MC] Remove unneeded "in .xxx directive" from diagnostics

The prevailing style does not add the message. The directive name is not useful
because the next line replicates the error line which includes the directive.

3 years ago[SPARC] recognize the "rd %pc, reg" special form
Joerg Sonnenberger [Sun, 23 May 2021 20:52:59 +0000 (22:52 +0200)]
[SPARC] recognize the "rd %pc, reg" special form

Differential Revision: https://reviews.llvm.org/D96312

3 years agoNFC: cleaned up and renamed scalable-vf-analysis.ll -> scalable-vectorization.ll
Sander de Smalen [Sun, 23 May 2021 13:41:13 +0000 (14:41 +0100)]
NFC: cleaned up and renamed scalable-vf-analysis.ll -> scalable-vectorization.ll

* Remove unnecessary loop hints.
* Use a RUN line with '-scalable-vectorization=preferred' instead of 'on'
  for the maximize-bandwidth behaviour. This prepares the test for enabling
  scalable vectorization; with a forced instruction-cost of 1, 'on' will
  always favour a fixed-width VF, whereas with 'preferred'
  we can check that the maximize-bandwidth option in combination with
  scalable-vectorization=preferred actually picks a scalable VF.
* Rename the test to scalable-vectorization.ll, because a follow-up patch
  will test more than just analysis.

3 years ago[NFC][X86][Costmodel] Add tests with masked loads/stores w/non-power-of-two...
Roman Lebedev [Sun, 23 May 2021 18:42:30 +0000 (21:42 +0300)]
[NFC][X86][Costmodel] Add tests with masked loads/stores w/non-power-of-two vectors

3 years ago[AArch64] Use \t in AsmStreamer to match the prevailing style
Fangrui Song [Sun, 23 May 2021 18:35:42 +0000 (11:35 -0700)]
[AArch64] Use \t in AsmStreamer to match the prevailing style

3 years ago[mlir][doc] Fix links and indentation of mlir::ModuleOp description
Markus Böck [Sun, 23 May 2021 18:00:44 +0000 (20:00 +0200)]
[mlir][doc] Fix links and indentation of mlir::ModuleOp description

All lines after the first are currently indented by one char further to the left than the first line. This leads to the first character of each sentence being cut from the resulting Markdown file after compilation. The text also contains 3 references to sections of other markdown files. One was missing the file, while the other two had outdated files, leading to 404 errors in the documentation.

Differential Revision: https://reviews.llvm.org/D102983

3 years agoFix bugs URL for PR relocations
Simon Pilgrim [Sun, 23 May 2021 16:19:36 +0000 (17:19 +0100)]
Fix bugs URL for PR relocations

The PR works from llvm.org, not bugs.llvm.org

3 years ago[CostModel][X86] Align v2i64 MUL costs on SSE42+ targets with worst case
Simon Pilgrim [Sun, 23 May 2021 09:29:34 +0000 (10:29 +0100)]
[CostModel][X86] Align v2i64 MUL costs on SSE42+ targets with worst case

Based on llvm-mca analysis: the worst case is sandybridge (which seems to match nehalem for this SSE sequence), compared against btver2 and bdver2.

3 years ago[gn build] (semi-manually) port 0bccdf82f705
Nico Weber [Sun, 23 May 2021 14:01:06 +0000 (10:01 -0400)]
[gn build] (semi-manually) port 0bccdf82f705

3 years ago[InstSimplify] add more tests for rem-mul-div; NFC
Sanjay Patel [Sun, 23 May 2021 13:42:35 +0000 (09:42 -0400)]
[InstSimplify] add more tests for rem-mul-div; NFC

See D102864 for discussion.

3 years ago[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass
maekawatoshiki [Sun, 23 May 2021 13:32:01 +0000 (22:32 +0900)]
[LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass

This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D99149

3 years ago[LoopUnroll] Add test for unrollable non-latch multi-exit (NFC)
Nikita Popov [Sun, 23 May 2021 08:49:11 +0000 (10:49 +0200)]
[LoopUnroll] Add test for unrollable non-latch multi-exit (NFC)

This test case requires unrolling against a non-latch exit in
a multiple-exit loop with an exiting latch. It's not covered by
existing heuristics or the extension in D102635.

3 years ago[ARM] Add extra debug messages for gather/scatter lowering. NFC
David Green [Sun, 23 May 2021 07:52:13 +0000 (08:52 +0100)]
[ARM] Add extra debug messages for gather/scatter lowering. NFC

3 years ago[NFC][scudo] Replace size_t with uptr
Vitaly Buka [Sun, 23 May 2021 05:55:53 +0000 (22:55 -0700)]
[NFC][scudo] Replace size_t with uptr

3 years ago[NFC][scudo] Add releasePagesToOS test
Vitaly Buka [Sun, 23 May 2021 05:30:03 +0000 (22:30 -0700)]
[NFC][scudo] Add releasePagesToOS test