platform/upstream/llvm.git
3 years ago[InstCombine] matchBSwapOrBitReverse - remove pattern matching early-out. NFCI.
Simon Pilgrim [Sat, 20 Feb 2021 12:25:58 +0000 (12:25 +0000)]
[InstCombine] matchBSwapOrBitReverse - remove pattern matching early-out. NFCI.

recognizeBSwapOrBitReverseIdiom + collectBitParts have pattern matching to bail out early if a bswap/bitreverse pattern isn't possible - we should be able to rely on this instead without any notable change in compile time.

This is part of a cleanup towards letting matchBSwapOrBitReverse/recognizeBSwapOrBitReverseIdiom use 'root' instructions that aren't ORs (FSHL/FSHRs in particular, which can be prematurely created).

Differential Revision: https://reviews.llvm.org/D97056

3 years ago[libc++] Fix the build for AppleClang.
Mark de Wever [Sat, 20 Feb 2021 12:52:40 +0000 (13:52 +0100)]
[libc++] Fix the build for AppleClang.

Forgot to add some parts of D93593; this should disable the tests on
Apple. Seems Louis was right ;-)

3 years ago[RISCV] Pre-commit test case for D97055. NFC.
Fraser Cormack [Sat, 20 Feb 2021 12:34:27 +0000 (12:34 +0000)]
[RISCV] Pre-commit test case for D97055. NFC.

This adds a test which unnecessarily spills mask registers.

3 years ago[X86][SSE] Use llvm min/max intrinsics instead of (deprecated) sse intrinsics. NFCI.
Simon Pilgrim [Sat, 20 Feb 2021 12:17:46 +0000 (12:17 +0000)]
[X86][SSE] Use llvm min/max intrinsics instead of (deprecated) sse intrinsics. NFCI.

These are auto-upgraded to the equivalent llvm variants now.

3 years ago[X86][SSE] vector-compare-combines.ll - use llvm min/max intrinsics instead of (depre...
Simon Pilgrim [Sat, 20 Feb 2021 12:16:54 +0000 (12:16 +0000)]
[X86][SSE] vector-compare-combines.ll - use llvm min/max intrinsics instead of (deprecated) sse intrinsics. NFCI.

These are auto-upgraded to the equivalent llvm variants now.

3 years ago[X86][AVX] Remove AVX2 min/max intrinsics tests
Simon Pilgrim [Sat, 20 Feb 2021 12:13:06 +0000 (12:13 +0000)]
[X86][AVX] Remove AVX2 min/max intrinsics tests

These are now autoupgraded to the llvm equivalents and the tests already moved to avx2-intrinsics-x86-upgrade.ll

3 years ago[X86][SSE] Remove SSE41 min/max intrinsics tests
Simon Pilgrim [Sat, 20 Feb 2021 12:11:50 +0000 (12:11 +0000)]
[X86][SSE] Remove SSE41 min/max intrinsics tests

These are now autoupgraded to the llvm equivalents and the tests already moved to sse41-intrinsics-x86-upgrade.ll

3 years ago[X86][SSE2] Remove SSE2 min/max intrinsics tests
Simon Pilgrim [Sat, 20 Feb 2021 12:10:58 +0000 (12:10 +0000)]
[X86][SSE2] Remove SSE2 min/max intrinsics tests

These are now autoupgraded to the llvm equivalents and the tests already moved to sse2-intrinsics-x86-upgrade.ll

3 years ago[X86] KnownBits - use llvm min/max intrinsics instead of (deprecated) sse intrinsics...
Simon Pilgrim [Sat, 20 Feb 2021 12:07:02 +0000 (12:07 +0000)]
[X86] KnownBits - use llvm min/max intrinsics instead of (deprecated) sse intrinsics. NFCI.

These are auto-upgraded to the equivalent llvm variants now.

3 years ago[DAG] foldSubToUSubSat - fold sub(a,trunc(umin(zext(a),b))) -> usubsat(a,trunc(umin...
Simon Pilgrim [Fri, 19 Feb 2021 18:56:20 +0000 (18:56 +0000)]
[DAG] foldSubToUSubSat - fold sub(a,trunc(umin(zext(a),b))) -> usubsat(a,trunc(umin(b,SatLimit)))

This moves the last custom x86 USUBSAT fold to generic DAGCombine.

Completes PR40111

Differential Revision: https://reviews.llvm.org/D96703
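
The fold above can be checked exhaustively on small bit widths. The following is a minimal Python sketch, not the DAGCombine code: the helper names are hypothetical, and `sat_limit` is assumed to be the maximum unsigned value of the narrow type.

```python
def trunc(x, bits):
    """Keep only the low `bits` bits of x."""
    return x & ((1 << bits) - 1)

def usubsat(a, b):
    """Unsigned saturating subtract: clamps at 0 instead of wrapping."""
    return max(a - b, 0)

def original(a, b, narrow):
    # sub(a, trunc(umin(zext(a), b))); a is narrow, b is wide.
    return trunc(a - trunc(min(a, b), narrow), narrow)

def folded(a, b, narrow):
    # usubsat(a, trunc(umin(b, SatLimit)))
    sat_limit = (1 << narrow) - 1  # assumption: max unsigned narrow value
    return usubsat(a, trunc(min(b, sat_limit), narrow))

def naive_fold(a, b, narrow):
    # Same as folded() but without the clamp, to show why it is needed.
    return usubsat(a, trunc(b, narrow))

def check(narrow=4, wide=8):
    """Compare both forms for every (a, b) pair at the given widths."""
    return all(original(a, b, narrow) == folded(a, b, narrow)
               for a in range(1 << narrow)
               for b in range(1 << wide))
```

Without the `umin(b, SatLimit)` clamp, truncating a large `b` could yield a small value and break the equivalence; for example, `naive_fold(5, 16, 4)` gives 5 while the original expression gives 0.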

3 years ago[ConstantRangeTest] Make exhaustive testing more principled (NFC)
Nikita Popov [Wed, 23 Sep 2020 18:43:59 +0000 (20:43 +0200)]
[ConstantRangeTest] Make exhaustive testing more principled (NFC)

The current infrastructure for exhaustive ConstantRange testing is
somewhat confusing in what exactly it tests and currently cannot even
be used for operations that produce smallest-size results, rather than
signed/unsigned envelopes.

This patch makes the testing more principled by collecting the exact
set of results of an operation into a bit set and then comparing it
against the range approximation by:

 * Checking conservative correctness: All elements in the set must be
   in the range.
 * Checking optimality under a given preference function: None of the
   (slack-free) ranges that can be constructed from the set are
   preferred over the computed range.

Implemented preference functions are:

 * PreferSmallest: Smallest range regardless of signed/unsigned wrapping
   behavior. Probably what we would call "optimal" without further
   qualification.
 * PreferSmallestUnsigned/Signed: Smallest range that has no
   unsigned/signed wrapping. We use this if our calculation is precise
   only up to signed/unsigned envelope.
 * PreferSmallestNonFullUnsigned/Signed: Smallest range that has no
   unsigned/signed wrapping -- but preferring a smaller wrapping range
   over a (non-wrapping) full range. We use this if we have a fully
   precise calculation but apply a sign preference to the result
   (union/intersection). Even with a sign preference, returning a
   wrapping range is still "strictly better" than returning a full one.

This also addresses PR49273 by replacing the fragile manual range
construction logic in testBinarySetOperationExhaustive() with generic
code that isn't specialized to the particular form of ranges that set
operations can produce.

Differential Revision: https://reviews.llvm.org/D88356
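
The two checks described above can be sketched roughly as follows. This is an illustrative Python model, not the ConstantRangeTest code: a range is represented as a hypothetical `(lo, size)` pair over a `bits`-wide wrapping domain, and only the PreferSmallest preference is modeled.

```python
def contains(rng, x, bits):
    """True if x lies in the wrapping range starting at lo with `size` elements."""
    lo, size = rng
    return (x - lo) % (1 << bits) < size

def slack_free_ranges(values, bits):
    """For each element taken as the lower bound, the tightest wrapping
    range that covers every element. Assumes `values` is non-empty."""
    modulus = 1 << bits
    ordered = sorted(values)
    ranges = []
    for i, lo in enumerate(ordered):
        last = ordered[i - 1]  # cyclic predecessor of lo
        ranges.append((lo, (last - lo) % modulus + 1))
    return ranges

def is_conservative(rng, values, bits):
    """Conservative correctness: every exact result is inside the range."""
    return all(contains(rng, v, bits) for v in values)

def is_optimal_smallest(rng, values, bits):
    """PreferSmallest: no slack-free candidate is strictly smaller."""
    return not any(size < rng[1]
                   for _, size in slack_free_ranges(values, bits))
```

For the 4-bit result set `{1, 2, 14}`, the wrapping range `(14, 5)` passes both checks, while the non-wrapping envelope `(1, 14)` is conservative but not optimal under this preference.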

3 years ago[Sanitizers][NFC] Fix typo
Luís Marques [Sat, 20 Feb 2021 10:54:00 +0000 (10:54 +0000)]
[Sanitizers][NFC] Fix typo

3 years ago[lit] Add --xfail and --filter-out (inverse of --filter)
David Zarzycki [Tue, 16 Feb 2021 11:16:10 +0000 (06:16 -0500)]
[lit] Add --xfail and --filter-out (inverse of --filter)

In semi-automated environments, XFAILing or filtering out known regressions without actually committing changes or temporarily modifying the test suite can be quite useful.

Reviewed By: yln

Differential Revision: https://reviews.llvm.org/D96662

3 years agoUpdate BPFAdjustOpt.cpp to accept select form of or as well
Juneyoung Lee [Sat, 20 Feb 2021 09:22:38 +0000 (18:22 +0900)]
Update BPFAdjustOpt.cpp to accept select form of or as well

This is a minor pattern-match update to BPFAdjustOpt.cpp to accept
not only 'or i1 a, b' but also 'select i1 a, i1 true, i1 b'.
This resolves a regression introduced when SimplifyCFG started creating
the select form of and/or instead (https://reviews.llvm.org/D95026).
This is a small change, and currently such a select form either isn't created
or doesn't reach the late pipeline (because InstCombine eagerly
folds it into and/or i1), so I chose to commit without a review process.

3 years ago[AArch64][GlobalISel] Add selection support for G_VECREDUCE of <2 x i32>
Amara Emerson [Sat, 20 Feb 2021 08:38:17 +0000 (00:38 -0800)]
[AArch64][GlobalISel] Add selection support for G_VECREDUCE of <2 x i32>

This selects to a pairwise add and a subreg copy.

3 years ago[libcxx] [test] Remove two unnecessary files/variables in a test
Martin Storsjö [Fri, 19 Feb 2021 21:10:11 +0000 (23:10 +0200)]
[libcxx] [test] Remove two unnecessary files/variables in a test

These don't seem to have any function in the test.

The non_regular_file one seems to have been added in
0f8c8f59df057a85d6d49913ec9877c6d597785b, without any apparent
purpose there.

Differential Revision: https://reviews.llvm.org/D97083

3 years ago[libcxx] Rename a method in PathParser for clarity. NFC.
Martin Storsjö [Fri, 8 Jan 2021 22:20:35 +0000 (00:20 +0200)]
[libcxx] Rename a method in PathParser for clarity. NFC.

Differential Revision: https://reviews.llvm.org/D97081

3 years ago[libc++] Fixes _LIBCPP_HAS_NO_CONCEPTS
Mark de Wever [Sat, 20 Feb 2021 08:13:16 +0000 (09:13 +0100)]
[libc++] Fixes _LIBCPP_HAS_NO_CONCEPTS

Previously, the define was in a GCC-specific part. Now it's available for all
compilers. The patch had its CI run in D93593.

3 years ago[InstCombine] Add more tests to nonnull-select.ll (NFC)
Juneyoung Lee [Sat, 20 Feb 2021 07:59:52 +0000 (16:59 +0900)]
[InstCombine] Add more tests to nonnull-select.ll (NFC)

3 years ago[CodeGen] Use range-based for loops (NFC)
Kazu Hirata [Sat, 20 Feb 2021 06:44:14 +0000 (22:44 -0800)]
[CodeGen] Use range-based for loops (NFC)

3 years ago[TableGen] Use ListSeparator (NFC)
Kazu Hirata [Sat, 20 Feb 2021 06:44:12 +0000 (22:44 -0800)]
[TableGen] Use ListSeparator (NFC)

3 years agoFixed failing test
Dávid Bolvanský [Sat, 20 Feb 2021 06:11:42 +0000 (07:11 +0100)]
Fixed failing test

3 years agoReduce the number of attributes attached to each function
Dávid Bolvanský [Sat, 20 Feb 2021 05:57:47 +0000 (06:57 +0100)]
Reduce the number of attributes attached to each function

This takes advantage of the implicit default behavior to reduce the number of
attributes.

3 years agoReland "[Libcalls, Attrs] Annotate libcalls with noundef"
Dávid Bolvanský [Sat, 20 Feb 2021 03:19:03 +0000 (04:19 +0100)]
Reland "[Libcalls, Attrs] Annotate libcalls with noundef"

Fixed Clang tests.

3 years ago[ValueTracking] Improve impliesPoison
Juneyoung Lee [Thu, 18 Feb 2021 03:38:40 +0000 (12:38 +0900)]
[ValueTracking] Improve impliesPoison

This patch improves ValueTracking's impliesPoison(V1, V2) to do this reasoning:

```
  %res = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
  %overflow = extractvalue { i64, i1 } %res, 1
  %mul      = extractvalue { i64, i1 } %res, 0

; If %mul is poison, %overflow is also poison, and vice versa.
```

This improvement leads to supporting this optimization under `-instcombine-unsafe-select-transform=0`:

```
define i1 @test2_logical(i64 %a, i64 %b, i64* %ptr) {
; CHECK-LABEL: @test2_logical(
; CHECK-NEXT:    [[MUL:%.*]] = mul i64 [[A:%.*]], [[B:%.*]]
; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne i64 [[A]], 0
; CHECK-NEXT:    [[TMP2:%.*]] = icmp ne i64 [[B]], 0
; CHECK-NEXT:    [[OVERFLOW_1:%.*]] = and i1 [[TMP1]], [[TMP2]]
; CHECK-NEXT:    [[NEG:%.*]] = sub i64 0, [[MUL]]
; CHECK-NEXT:    store i64 [[NEG]], i64* [[PTR:%.*]], align 8
; CHECK-NEXT:    ret i1 [[OVERFLOW_1]]
;

  %res = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
  %overflow = extractvalue { i64, i1 } %res, 1
  %mul = extractvalue { i64, i1 } %res, 0
  %cmp = icmp ne i64 %mul, 0
  %overflow.1 = select i1 %overflow, i1 true, i1 %cmp
  %neg = sub i64 0, %mul
  store i64 %neg, i64* %ptr, align 8
  ret i1 %overflow.1
}
```

Previously, this didn't happen because the flag prevented `select i1 %overflow, i1 true, i1 %cmp` from being `or i1 %overflow, %cmp`.
Note that the select -> or conversion happens only when `impliesPoison(%cmp, %overflow)` returns true.
This improvement allows `impliesPoison` to do the reasoning.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D96929
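
Why the select -> or conversion needs `impliesPoison` can be illustrated with a toy model of poison. This is a hedged Python sketch: `POISON` and the helper functions are hypothetical stand-ins for IR semantics, not LLVM APIs.

```python
POISON = object()  # hypothetical sentinel standing in for an IR poison value

def or_i1(a, b):
    """Bitwise `or i1`: poison in either operand poisons the result."""
    if a is POISON or b is POISON:
        return POISON
    return a or b

def select_i1(cond, t, f):
    """`select`: only the condition and the chosen arm matter."""
    if cond is POISON:
        return POISON
    return t if cond else f

def logical_or(a, b):
    """The select form of or: select i1 a, i1 true, i1 b."""
    return select_i1(a, True, b)

def same_on(inputs):
    """True if both forms agree on every (a, b) pair in `inputs`."""
    return all(or_i1(a, b) is logical_or(a, b) for a, b in inputs)
```

In this model the two forms differ only when `a` is true and `b` is poison: the select form returns true, the `or` form returns poison. If `b` being poison implies `a` is poison too (the `impliesPoison(%cmp, %overflow)` condition above), that case cannot occur and the conversion is safe.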

3 years ago[mlir][sparse] convert function pass to module pass
Aart Bik [Sat, 20 Feb 2021 01:58:08 +0000 (17:58 -0800)]
[mlir][sparse] convert function pass to module pass

Rationale:
Touching function level information can only be done within a module pass.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97102

3 years ago[SampleFDO] Skip PreLink ICP for better profile quality of MonoLTO PostLink
Wenlei He [Fri, 19 Feb 2021 04:11:58 +0000 (20:11 -0800)]
[SampleFDO] Skip PreLink ICP for better profile quality of MonoLTO PostLink

For ThinLTO, PreLink ICP is skipped to favor better profile annotation during LTO PostLink. This change applies the same tweak for MonoLTO. Note that PreLink ICP not only makes PostLink profile annotation harder, it is also uncoordinated with PostLink ICP so duplicated ICP could happen.

Differential Revision: https://reviews.llvm.org/D97028

3 years agoRevert "[Libcalls, Attrs] Annotate libcalls with noundef"
Dávid Bolvanský [Sat, 20 Feb 2021 03:17:44 +0000 (04:17 +0100)]
Revert "[Libcalls, Attrs] Annotate libcalls with noundef"

This reverts commit 33b0c63775ce58014c55e285671e3315104a6076. Bots are failing. Some Clang tests need to be updated too.

3 years ago[RISCV] Teach our custom vector load/store intrinsic isel code to propagate memory...
Craig Topper [Sat, 20 Feb 2021 02:56:08 +0000 (18:56 -0800)]
[RISCV] Teach our custom vector load/store intrinsic isel code to propagate memory operands if we have them.

We don't currently create memory operands for these intrinsics,
but there was a suggestion of using the indexed load/store
intrinsics to implement isel for scalable vector gather/scatter.
That may propagate the memory operand from the gather/scatter
ISD nodes.

3 years ago[Libcalls, Attrs] Annotate libcalls with noundef
Dávid Bolvanský [Sat, 20 Feb 2021 03:08:50 +0000 (04:08 +0100)]
[Libcalls, Attrs] Annotate libcalls with noundef

I think we can use the same logic here as for nonnull.

strlen(X) - X must be noundef => valid pointer.

For libcalls with a size argument, we add noundef only if the size is known and greater than 0, so the pointers must be noundef (valid ones).

Reviewed By: jdoerfert, aqjune

Differential Revision: https://reviews.llvm.org/D95122

3 years agoRevert "[BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblem...
Dávid Bolvanský [Sat, 20 Feb 2021 02:58:53 +0000 (03:58 +0100)]
Revert "[BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblemem_or_argmemonly"

This reverts commit 05d891a19e45687090edcfccfbad334911659eb0.

3 years ago[BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblemem_or_ar...
Dávid Bolvanský [Wed, 20 Jan 2021 00:27:25 +0000 (01:27 +0100)]
[BuildLibcalls] Mark some libcalls with inaccessiblememonly and inaccessiblemem_or_argmemonly

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94850

3 years ago[CodeGen] Fix two dots between text section name and symbol name
Pan, Tao [Sat, 20 Feb 2021 02:15:06 +0000 (10:15 +0800)]
[CodeGen] Fix two dots between text section name and symbol name

There is a trailing dot in the text section name if it has a prefix; don't add
a repeated dot when joining the text section name and the symbol name.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D96327
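
The idea of the fix can be sketched as a small string helper. This is an illustrative Python sketch only: the function name and the example section names are assumptions, not the CodeGen implementation.

```python
def join_section_and_symbol(section, symbol):
    """Join a text section name and a symbol name with exactly one dot."""
    if section.endswith("."):
        # A prefixed section name already carries its trailing dot.
        return section + symbol
    return section + "." + symbol
```

With this shape, both `join_section_and_symbol(".text.hot.", "foo")` and `join_section_and_symbol(".text.hot", "foo")` yield `.text.hot.foo` rather than a name containing two consecutive dots.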

3 years agoRevert "Implement -bundle_loader"
Vitaly Buka [Sat, 20 Feb 2021 00:04:49 +0000 (16:04 -0800)]
Revert "Implement -bundle_loader"

D95913 passes null pointer into memcpy

This reverts commit 1a0afcf518717f61d45a1cdc6ad1a6436ec663b1.

3 years ago[ValueTypes] Assert if changeVectorElementType is called on a simple type with an...
Craig Topper [Sat, 20 Feb 2021 01:19:25 +0000 (17:19 -0800)]
[ValueTypes] Assert if changeVectorElementType is called on a simple type with an extended element type.

Previously we would use the extended implementation, but
the extended implementation requires the vector type to be extended
so that we can access the LLVMContext. In theory we could
detect this case and use the context from the element type instead,
but since I know of no cases hitting this in practice today
I've done the simplest thing.

Also add asserts to several extended EVT functions that assume
LLVMTy is non-null.

Follow-up from the discussion in D97036

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D97070

3 years ago[dfsan] Add utils that get/set origins
Jianzhou Zhao [Fri, 19 Feb 2021 21:32:37 +0000 (21:32 +0000)]
[dfsan] Add utils that get/set origins

This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D97087

3 years agoDifferent fix for gcc bug
Jacques Pienaar [Sat, 20 Feb 2021 00:41:00 +0000 (16:41 -0800)]
Different fix for gcc bug

Was still running into

from definition of 'template<class T> struct llvm::DenseMapInfo'
[-fpermissive]
 template <typename T> struct DenseMapInfo;
                               ^

3 years ago[flang][fir] Update flang test tool support classes.
Eric Schweitz [Fri, 19 Feb 2021 19:05:41 +0000 (11:05 -0800)]
[flang][fir] Update flang test tool support classes.

This updates the various classes that support the compilation of
Fortran. These classes are shared by the test tools.

Authors: Eric Schweitz, Sameeran Joshi, et al.

Differential Revision: https://reviews.llvm.org/D97073

3 years agoRevert "Revert "Fix MLIR Toy tutorial JIT example and add a test to cover it""
Mehdi Amini [Fri, 19 Feb 2021 23:53:13 +0000 (23:53 +0000)]
Revert "Revert "Fix MLIR Toy tutorial JIT example and add a test to cover it""

This reverts commit f36060417ad3e247900dfcb07a2476a9d92ee2d2 and
reapplies commit ae15b1e7ad71e4bfde1b031dd5e6b0bbb3b88a42.

JIT test must be annotated to not run on Windows.

3 years ago[SystemZ/z/OS] Add XPLINK 64-bit calling convention to tablegen.
Yusra Syeda [Fri, 19 Feb 2021 22:44:10 +0000 (17:44 -0500)]
[SystemZ/z/OS] Add XPLINK 64-bit calling convention to tablegen.

This commit adds the initial changes to the SystemZ target
description for the XPLINK 64-bit calling convention on z/OS.
Additions include:

 - a new predicate IsTargetXPLINK64
 - different register allocation order
 - generation of nopr after a call

Reviewed-by: uweigand
Differential Revision: https://reviews.llvm.org/D96887

3 years ago[Coverage] Normalize compilation dir as well
Petr Hosek [Thu, 18 Feb 2021 23:01:05 +0000 (15:01 -0800)]
[Coverage] Normalize compilation dir as well

This matches debug info behavior.

Differential Revision: https://reviews.llvm.org/D97001

3 years ago[libc++][nfc] Only test if pair is_assignable after C++03.
zoecarver [Fri, 19 Feb 2021 23:12:19 +0000 (15:12 -0800)]
[libc++][nfc] Only test if pair is_assignable after C++03.

In C++03 libc++ uses a different set of constructors which aren't
constrained, so these tests won't work. This should fix the bots.

Refs: 82c4701.

3 years ago[libcxx] Enable filesystem by default for mingw targets
Martin Storsjö [Wed, 4 Nov 2020 22:13:22 +0000 (00:13 +0200)]
[libcxx] Enable filesystem by default for mingw targets

This feature can be built successfully for windows now. However,
the helper functions for __int128_t aren't available in MSVC
configurations, so don't enable it by default there yet. (See
https://reviews.llvm.org/D91139 for discussion on how to proceed
with things in MSVC environments.)

Differential Revision: https://reviews.llvm.org/D97075

3 years ago[ValueTracking] Add a two argument form of safeCtxI [NFC]
Philip Reames [Fri, 19 Feb 2021 22:51:53 +0000 (14:51 -0800)]
[ValueTracking] Add a two argument form of safeCtxI [NFC]

The existing implementation was relying on order of evaluation to achieve a particular result.  This got really confusing when wanting to change the handling for arguments in a later patch.

3 years ago[AArch64] Adding Neon Polynomial vadd Intrinsics
Christopher Tetreault [Fri, 19 Feb 2021 22:46:36 +0000 (14:46 -0800)]
[AArch64] Adding Neon Polynomial vadd Intrinsics

This patch adds the following intrinsics:
            vadd_p8
            vadd_p16
            vadd_p64
            vaddq_p8
            vaddq_p16
            vaddq_p64
            vaddq_p128

Reviewed By: t.p.northover, DavidSpickett, ctetreau

Differential Revision: https://reviews.llvm.org/D96825

3 years ago[AArch64][GlobalISel] Make G_VECREDUCE_ADD of <2 x s32> legal.
Amara Emerson [Fri, 19 Feb 2021 22:27:08 +0000 (14:27 -0800)]
[AArch64][GlobalISel] Make G_VECREDUCE_ADD of <2 x s32> legal.

3 years agoRevert "Fix MLIR Toy tutorial JIT example and add a test to cover it"
Stella Stamenova [Fri, 19 Feb 2021 21:38:43 +0000 (13:38 -0800)]
Revert "Fix MLIR Toy tutorial JIT example and add a test to cover it"

This reverts commit ae15b1e7ad71e4bfde1b031dd5e6b0bbb3b88a42.

This commit caused failures on the mlir windows buildbot

3 years ago[dfsan] Add origin address calculation
Jianzhou Zhao [Fri, 19 Feb 2021 17:50:02 +0000 (17:50 +0000)]
[dfsan] Add origin address calculation

This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D97065

3 years ago[libc++][nfc] SFINAE on pair/tuple assignment operators: LWG 2729.
zoecarver [Fri, 19 Feb 2021 21:24:30 +0000 (13:24 -0800)]
[libc++][nfc] SFINAE on pair/tuple assignment operators: LWG 2729.

This patch ensures that SFINAE is used to delete assignment operators in pair and tuple based on issue 2729.

Differential Review: https://reviews.llvm.org/D62454

3 years ago[RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC
Craig Topper [Fri, 19 Feb 2021 21:23:08 +0000 (13:23 -0800)]
[RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC

3 years agoAdd datalayout to test added in 7e3183d73
Philip Reames [Fri, 19 Feb 2021 21:09:34 +0000 (13:09 -0800)]
Add datalayout to test added in 7e3183d73

Realized after pushing that this would probably fail on bots for targets other than x86-64.

3 years ago[lldb] Rename {stop,run}_vote to report_{stop,run}_vote
Dave Lee [Wed, 17 Feb 2021 23:09:50 +0000 (15:09 -0800)]
[lldb] Rename {stop,run}_vote to report_{stop,run}_vote

Rename `stop_vote` and `run_vote` to `report_stop_vote` and `report_run_vote`
respectively. These variables are limited to logic involving (event) reporting only.
This naming is intended to make their context more clear.

Differential Revision: https://reviews.llvm.org/D96917

3 years agoAdd test triggered by review discussion on D97077
Philip Reames [Fri, 19 Feb 2021 21:03:31 +0000 (13:03 -0800)]
Add test triggered by review discussion on D97077

3 years agoPatch by @wecing (Chenguang Wang).
Tim Shen [Fri, 19 Feb 2021 20:19:34 +0000 (12:19 -0800)]
Patch by @wecing (Chenguang Wang).

The current getFoldedSizeOf() implementation uses naive recursion, which
could be really slow when the input structure type is too complex.

This issue was first brought up in
http://llvm.org/bugs/show_bug.cgi?id=8281; this change fixes it by
adding memoization.

Differential Revision: https://reviews.llvm.org/D6594
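
The effect of memoization on a heavily shared type can be sketched as follows. This is a simplified Python model under stated assumptions: scalar types are plain byte counts, struct types are tuples of fields, padding and alignment are ignored, and `folded_size` is a hypothetical stand-in for getFoldedSizeOf().

```python
def folded_size(ty, cache=None):
    """Size of a type: an int is a scalar size, a tuple is a struct of fields."""
    if cache is None:
        cache = {}
    if isinstance(ty, int):
        return ty
    # Key on object identity: hashing a deeply nested tuple would itself
    # walk the whole structure and reintroduce the exponential cost.
    key = id(ty)
    if key not in cache:
        cache[key] = sum(folded_size(field, cache) for field in ty)
    return cache[key]

def nested(depth, leaf=4):
    """Build a struct whose two fields are both the previous struct."""
    ty = leaf
    for _ in range(depth):
        ty = (ty, ty)
    return ty
```

A type built by `nested(40)` has 2**40 leaf fields when expanded, so naive recursion is impractical, while the memoized version computes each distinct struct once and visits only 40 of them.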

3 years ago[mlir] Add math polynomial approximation pass
Eugene Zhulenev [Fri, 19 Feb 2021 00:24:56 +0000 (16:24 -0800)]
[mlir] Add math polynomial approximation pass

This gives ~30x speedup compared to expanding Tanh into exp operations:

```
name                  old cpu/op  new cpu/op  delta
BM_mlir_Tanh_f32/10    253ns ± 3%    55ns ± 7%  -78.35%  (p=0.000 n=44+41)
BM_mlir_Tanh_f32/100  2.21µs ± 4%  0.14µs ± 8%  -93.85%  (p=0.000 n=48+49)
BM_mlir_Tanh_f32/1k   22.6µs ± 4%   0.7µs ± 5%  -96.68%  (p=0.000 n=32+42)
BM_mlir_Tanh_f32/10k   225µs ± 5%     7µs ± 6%  -96.88%  (p=0.000 n=49+55)

name                  old time/op             new time/op             delta
BM_mlir_Tanh_f32/10    259ns ± 1%               56ns ± 2%  -78.31%        (p=0.000 n=41+39)
BM_mlir_Tanh_f32/100  2.27µs ± 1%             0.14µs ± 5%  -93.89%        (p=0.000 n=46+49)
BM_mlir_Tanh_f32/1k   22.9µs ± 1%              0.8µs ± 4%  -96.67%        (p=0.000 n=30+42)
BM_mlir_Tanh_f32/10k   230µs ± 0%                7µs ± 3%  -96.88%        (p=0.000 n=37+55)
```

This approximation is based on the Eigen::generic_fast_tanh function.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96739

3 years ago[clang] Emit type metadata on available_externally vtables for WPD
Teresa Johnson [Wed, 17 Feb 2021 03:44:58 +0000 (19:44 -0800)]
[clang] Emit type metadata on available_externally vtables for WPD

When WPD is enabled, via WholeProgramVTables, emit type metadata for
available_externally vtables. Additionally, add the vtables to the
llvm.compiler.used global so that they are not prematurely eliminated
(before *LTO analysis).

This is needed to avoid devirtualizing calls to a function overriding a
class defined in a header file but with a strong definition in a shared
library. Without type metadata on the available_externally vtables from
the header, the WPD analysis never sees what a derived class is
overriding. Even if the available_externally base class functions are
pure virtual, because shared library definitions are already treated
conservatively (committed patches D91583, D96721, and D96722) we will
not devirtualize, which would be unsafe since the library might contain
overrides that aren't visible to the LTO unit.

An example is std::error_category, which is overridden in LLVM
and causing failures after a self build with WPD enabled, because
libstdc++ contains hidden overrides of the virtual base class methods.

Differential Revision: https://reviews.llvm.org/D96919

3 years ago[msan] Set cmpxchg shadow precisely
Jianzhou Zhao [Fri, 19 Feb 2021 04:37:49 +0000 (04:37 +0000)]
[msan] Set cmpxchg shadow precisely

In terms of https://llvm.org/docs/LangRef.html#cmpxchg-instruction,
the return type of cmpxchg is a pair {ty, i1}, while I think we
only wanted to set the shadow for the address 0th op, and it has type
ty.

Reviewed-by: eugenis
Differential Revision: https://reviews.llvm.org/D97029

3 years agoprecommit test cleanup for D97077
Philip Reames [Fri, 19 Feb 2021 20:19:31 +0000 (12:19 -0800)]
precommit test cleanup for D97077

3 years ago[flang][fir][NFC] run clang-format
Eric Schweitz [Fri, 19 Feb 2021 20:05:26 +0000 (12:05 -0800)]
[flang][fir][NFC] run clang-format

cleanup post-merge

3 years ago[Verifier] remove dead code for saturating intrinsics; NFC
Sanjay Patel [Fri, 19 Feb 2021 19:56:20 +0000 (14:56 -0500)]
[Verifier] remove dead code for saturating intrinsics; NFC

Test coverage shows that we assert with the string from the
tablegen defs file for these intrinsics, so these cases
should never be live.

3 years ago[Verifier] add tests for saturating intrinsics; NFC
Sanjay Patel [Fri, 19 Feb 2021 19:48:12 +0000 (14:48 -0500)]
[Verifier] add tests for saturating intrinsics; NFC

As noted in D96904, we don't have direct tests for these malformed ops.

3 years ago[libcxx] Make generic_*string return paths with forward slashes on windows
Martin Storsjö [Mon, 9 Nov 2020 09:45:13 +0000 (11:45 +0200)]
[libcxx] Make generic_*string return paths with forward slashes on windows

This matches what MS STL returns; in std::filesystem, forward slashes
are considered generic dir separators that are valid on all platforms.

Differential Revision: https://reviews.llvm.org/D91181
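
As an analogy only (this is Python's `pathlib`, not libc++), the same distinction between a path's native form and its generic form can be seen here; the example path is arbitrary.

```python
from pathlib import PureWindowsPath

def native_and_generic(path_text):
    """Return (native form, generic form) of a Windows-style path."""
    p = PureWindowsPath(path_text)
    # str() uses the preferred (backslash) separator, while as_posix()
    # uses forward slashes, which play the role of generic separators.
    return str(p), p.as_posix()
```

Calling `native_and_generic("C:/foo/bar")` returns the backslash form first and the forward-slash form second, which mirrors what `generic_string()` is meant to return on Windows after this change.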

3 years ago[elfabi] Fix a bug when .dynsym contains no non-local symbol
Haowei Wu [Thu, 18 Feb 2021 04:10:44 +0000 (20:10 -0800)]
[elfabi] Fix a bug when .dynsym contains no non-local symbol

This patch fixes a bug that occurred when elfabi was supplied with a tbe file
that contains no non-local symbols. Before this patch, it wrote 0 to
sh_info of the .dynsym section, making the ELF stub file invalid.

Differential Revision: https://reviews.llvm.org/D96930
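
For context, the ELF specification defines sh_info of a symbol table section as one greater than the index of the last local symbol. A minimal Python sketch of that rule follows; the function name and input shape are hypothetical, and this is not the elfabi code.

```python
STB_LOCAL, STB_GLOBAL = 0, 1  # ELF symbol binding values

def dynsym_sh_info(bindings):
    """sh_info for .dynsym.

    `bindings` lists the binding of every symbol after the mandatory null
    symbol at index 0, with local symbols ordered before non-local ones.
    """
    num_locals = sum(1 for b in bindings if b == STB_LOCAL)
    # The null symbol at index 0 is itself local, hence the leading 1.
    return 1 + num_locals
```

Under this rule, even a table with no symbols beyond the null entry gets `sh_info == 1`, never 0.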

3 years ago[libcxx] Fix LWG 2875: shared_ptr::shared_ptr(Y*, D, […]) constructors should be...
zoecarver [Fri, 19 Feb 2021 19:10:36 +0000 (11:10 -0800)]
[libcxx] Fix LWG 2875: shared_ptr::shared_ptr(Y*, D, […]) constructors should be constrained.

Fixes LWG issue 2875.

Differential Revision: https://reviews.llvm.org/D81414

3 years ago[libcxx] Have lexically_normal return the path with preferred separators
Martin Storsjö [Thu, 5 Nov 2020 21:09:15 +0000 (23:09 +0200)]
[libcxx] Have lexically_normal return the path with preferred separators

Differential Revision: https://reviews.llvm.org/D91179

3 years ago[Analysis][LoopVectorize] do not form reductions of pointers
Sanjay Patel [Fri, 19 Feb 2021 14:06:05 +0000 (09:06 -0500)]
[Analysis][LoopVectorize] do not form reductions of pointers

This is a fix for https://llvm.org/PR49215 either before/after
we make a verifier enhancement for vector reductions with D96904.

I'm not sure what the current thinking is for pointer math/logic
in IR. We allow icmp on pointer values. Therefore, we match min/max
patterns, so without this patch, the vectorizer could form a vector
reduction from that sequence.

But the LangRef definitions for min/max and vector reduction
intrinsics do not allow pointer types:
https://llvm.org/docs/LangRef.html#llvm-smax-intrinsic
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-umax-intrinsic

So we would crash/assert at some point - either in IR verification,
in the cost model, or in codegen. If we do want to allow this kind
of transform, we will need to update the LangRef and all of those
parts of the compiler.

Differential Revision: https://reviews.llvm.org/D97047

3 years ago[Polly] Fix test after D96534.
Michael Kruse [Fri, 19 Feb 2021 18:47:52 +0000 (12:47 -0600)]
[Polly] Fix test after D96534.

3 years ago[RISCV] Use inheritance to reduce some repeated code in tablegen. NFC
Craig Topper [Fri, 19 Feb 2021 18:36:26 +0000 (10:36 -0800)]
[RISCV] Use inheritance to reduce some repeated code in tablegen. NFC

The VLX and VSX searchable tables share the same format so we
can have a common base class for them.

3 years ago[X86] Regenerate 2007-06-28-X86-64-isel.ll
Simon Pilgrim [Fri, 19 Feb 2021 18:34:57 +0000 (18:34 +0000)]
[X86] Regenerate 2007-06-28-X86-64-isel.ll

3 years ago[X86] Remove unused intrinsic declaration
Simon Pilgrim [Fri, 19 Feb 2021 18:25:11 +0000 (18:25 +0000)]
[X86] Remove unused intrinsic declaration

3 years ago[X86] Regenerate 2011-12-06-AVXVectorExtractCombine.ll
Simon Pilgrim [Fri, 19 Feb 2021 18:23:21 +0000 (18:23 +0000)]
[X86] Regenerate 2011-12-06-AVXVectorExtractCombine.ll

3 years ago[RISCV] Remove unneeded indexed segment load/store vector pseudo instruction.
Craig Topper [Fri, 19 Feb 2021 18:28:45 +0000 (10:28 -0800)]
[RISCV] Remove unneeded indexed segment load/store vector pseudo instruction.

We had more combinations of data and index lmuls than we needed.

Also add some asserts to verify that the IndexVT and data VT have
the same element count when we isel these pseudo instructions.

3 years ago[RISCV] Use custom isel for vector indexed load/store intrinsics.
Craig Topper [Fri, 19 Feb 2021 18:08:43 +0000 (10:08 -0800)]
[RISCV] Use custom isel for vector indexed load/store intrinsics.

There are many legal combinations of index and data VTs supported
for these intrinsics. This results in a lot of isel patterns in
RISCVGenDAGISel.inc.

By adding a separate table similar to what we use for segment
load/stores, we can more efficiently manually select these
intrinsics. We should also be able to reuse this table for scalable
vector gather/scatter.

This reduces the llc binary size by ~56K.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D97033

3 years ago[RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics.
Craig Topper [Fri, 19 Feb 2021 18:00:13 +0000 (10:00 -0800)]
[RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics.

Just like we do for isel patterns, we need to call selectVLOp
to prevent 0 from being selected to X0 by the default isel.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97021

3 years ago[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64
Craig Topper [Fri, 19 Feb 2021 17:55:45 +0000 (09:55 -0800)]
[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64

We previously used isel patterns for this, but that used quite
a bit of space in the isel table due to OR being associative
and commutative. It also wouldn't handle shifts/ands being in
reversed order.

This generalizes the shift/and matching from GREVI to
take the expected mask table as input so we can reuse it for
SHFLI.

There is no SHFLIW instruction, but we can promote a 32-bit
SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't
set, a 64-bit SHFLI will preserve 33 sign bits if the input had
at least 33 sign bits. ComputeNumSignBits has been updated to
account for that to avoid sext.w in the tests.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96661

3 years ago[SampleFDO] Add PromotedInsns to prevent repeated ICP.
Wei Mi [Fri, 19 Feb 2021 17:57:18 +0000 (09:57 -0800)]
[SampleFDO] Add PromotedInsns to prevent repeated ICP.

In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9,
we used a 0-count value profile to record which targets have been promoted
and prevent repeated ICP for the same target, so we deleted PromotedInsns.
However, I found that the implementation in that patch has some shortcomings
that must be fixed, otherwise there will still be repeated ICP. So I am adding
PromotedInsns back temporarily. Will remove it after I get a thorough fix.

3 years ago[CUDA] fix builtin constraints for PTX 7.2
Artem Belevich [Fri, 19 Feb 2021 17:32:10 +0000 (09:32 -0800)]
[CUDA] fix builtin constraints for PTX 7.2

This fixes build issues w/ CUDA-11 introduced by https://reviews.llvm.org/D95974

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D97009

3 years ago[Sanitizer][NFC] Fix typo
Luís Marques [Fri, 19 Feb 2021 17:46:02 +0000 (17:46 +0000)]
[Sanitizer][NFC] Fix typo

3 years ago[AArch64][GlobalISel] Run redundant_sext_inreg in the post-legalizer combiner
Jessica Paquette [Wed, 17 Feb 2021 23:16:51 +0000 (15:16 -0800)]
[AArch64][GlobalISel] Run redundant_sext_inreg in the post-legalizer combiner

This is to ensure that we can eliminate G_ASSERT_SEXT.

In a follow-up patch, I'm going to make CallLowering emit G_ASSERT_SEXT for
signext parameters.

Differential Revision: https://reviews.llvm.org/D96913

3 years ago[mlir] Add folding of tensor.cast -> subtensor_insert
Nicolas Vasilache [Fri, 19 Feb 2021 17:04:12 +0000 (17:04 +0000)]
[mlir] Add folding of tensor.cast -> subtensor_insert

Differential Revision: https://reviews.llvm.org/D97059

3 years ago[MLIR] Delete unused functions getCollapsedInitTensor and getExpandedInitTensor
Geoffrey Martin-Noble [Fri, 19 Feb 2021 01:31:50 +0000 (17:31 -0800)]
[MLIR] Delete unused functions getCollapsedInitTensor and getExpandedInitTensor

These are unused since
https://reviews.llvm.org/rG81264dfbe80df08668a325a61613b64243b99c01

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D97014

3 years ago[LV] Fold single-use variable into assert. NFC.
Benjamin Kramer [Fri, 19 Feb 2021 17:11:39 +0000 (18:11 +0100)]
[LV] Fold single-use variable into assert. NFC.

3 years ago[MemCopyOpt] Enable MemorySSA by default
Nikita Popov [Sun, 10 Jan 2021 09:52:01 +0000 (10:52 +0100)]
[MemCopyOpt] Enable MemorySSA by default

This enables use of MemorySSA instead of MemDep in MemCpyOpt. To
allow this without significant compile-time impact, the MemCpyOpt
pass is moved directly before DSE (in the cases where this was not
already the case), which allows us to reuse the existing MemorySSA
analysis.

Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt
can also perform simple optimizations across basic blocks.

Differential Revision: https://reviews.llvm.org/D94376

3 years agoHwasan InitPrctl check for error using internal_iserror
Matthew Malcomson [Fri, 19 Feb 2021 16:19:37 +0000 (16:19 +0000)]
Hwasan InitPrctl check for error using internal_iserror

When adding this function in https://reviews.llvm.org/D68794 I did not
notice that internal_prctl has the API of the syscall to prctl rather
than the API of the glibc (posix) wrapper.

This means that the error return value is not necessarily -1 and that
errno is not set by the call.

For InitPrctl this means that the checks do not catch running on a
kernel *without* the required ABI (not caught since I only tested that this
function correctly enables the ABI when it exists).
This commit updates the two calls which check for an error condition to
use internal_iserror. That function sets a provided integer to an
equivalent errno value and returns a boolean to indicate success or not.
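
The convention described above can be sketched as follows. This is a minimal
illustration, not the sanitizer implementation: `isRawSyscallError` is a
hypothetical name, and it assumes the Linux raw-syscall convention in which a
failing call returns `-errno`, a value in the range [-4095, -1]. Returning
true on failure is also an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the sanitizer API). A raw Linux syscall does not
// return -1 and set errno on failure; it returns -errno directly, which is
// always in the range [-4095, -1].
inline bool isRawSyscallError(std::intptr_t ret, int *errnoOut = nullptr) {
  if (ret < 0 && ret >= -4095) {
    if (errnoOut)
      *errnoOut = static_cast<int>(-ret); // recover the errno-style code
    return true;                          // true means "the call failed"
  }
  return false;
}
```

Comparing the return value against -1, as a caller of the glibc wrapper
would, misses every failure whose code is not 1, which is why the original
checks did not fire.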

Tested by running on a kernel that has this ABI and on one that does
not. Verified that running on the kernel without this ABI the current
code prints the provided error message and does not attempt to run the
program. Verified that running on the kernel with this ABI the current
code does not print an error message and turns on the ABI.
This was done on an x86 kernel (where the ABI does not exist), an AArch64
kernel without this ABI, and an AArch64 kernel with this ABI.

In order to keep running the testsuite on kernels that do not provide
this new ABI, we add another option to the HWASAN_OPTIONS environment
variable. This option determines whether the library kills the process
if it fails to enable the relaxed syscall ABI.
This new flag is `fail_without_syscall_abi`.
The check-hwasan testsuite results do not change with this patch on
x86, on AArch64 without a kernel supporting this ABI, or on AArch64
with a kernel supporting this ABI.

Differential Revision: https://reviews.llvm.org/D96964

3 years ago[SCEV] Use both known bits and sign bits when computing range of SCEV unknowns
Philip Reames [Fri, 19 Feb 2021 16:27:46 +0000 (08:27 -0800)]
[SCEV] Use both known bits and sign bits when computing range of SCEV unknowns

When computing a range for a SCEVUnknown, today we use computeKnownBits for unsigned ranges, and computeNumSignBits for signed ranges. This means we miss opportunities to improve range results.

One common missed pattern is that we have a signed range of a value which CKB can determine is positive, but CNSB doesn't convey that information. The current range includes the negative part, and is thus double the size.
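
As a rough illustration of that pattern, the sketch below works on 8-bit
values with plain integers instead of LLVM's KnownBits and ConstantRange
classes. All names here (`Range`, `signedRangeFromSignBits`,
`signedRangeFromKnownBits`, `intersect`) are hypothetical and exist only for
this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Closed interval [lo, hi] over the signed interpretation of an 8-bit value.
struct Range {
  int lo;
  int hi;
};

// Interpret the low 8 bits as a two's complement number.
inline int asSigned8(unsigned bits) {
  bits &= 0xFFu;
  return bits >= 0x80u ? static_cast<int>(bits) - 256 : static_cast<int>(bits);
}

// Range implied by N identical leading bits (1 <= N <= 8): the value fits
// in 9 - N bits as a signed number.
inline Range signedRangeFromSignBits(unsigned numSignBits) {
  int half = 1 << (8 - numSignBits);
  return {-half, half - 1};
}

// Range implied by known-zero and known-one bit masks.
inline Range signedRangeFromKnownBits(std::uint8_t knownZero,
                                      std::uint8_t knownOne) {
  // Smallest value: sign bit set unless known zero, other unknown bits clear.
  unsigned minBits = knownOne | (~knownZero & 0x80u);
  // Largest value: sign bit clear unless known one, other unknown bits set.
  unsigned maxBits =
      (~static_cast<unsigned>(knownZero) & 0x7Fu) | (knownOne & 0x80u);
  return {asSigned8(minBits), asSigned8(maxBits)};
}

// Using both sources of information means intersecting the two ranges.
inline Range intersect(Range a, Range b) {
  return {std::max(a.lo, b.lo), std::min(a.hi, b.hi)};
}
```

With only the sign bit known to be zero and a single known sign bit, the
sign-bit range alone is [-128, 127], while the intersection is [0, 127],
half the size.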

Per the removed comment, the original concern which delayed using both (after some code merging years back) was a compile time concern. CTMark results (provided by Nikita, thanks!) showed a geomean impact of about 0.1%. This doesn't seem large enough to avoid higher quality results.

Differential Revision: https://reviews.llvm.org/D96534

3 years ago[libc++] Turn off clang-format for auto-generated version header. NFC.
Marek Kurdej [Fri, 19 Feb 2021 13:10:12 +0000 (14:10 +0100)]
[libc++] Turn off clang-format for auto-generated version header. NFC.

3 years ago[OpenMP] Fix nvptx CUDA_VERSION conversion
Joel E. Denny [Fri, 19 Feb 2021 15:59:52 +0000 (10:59 -0500)]
[OpenMP] Fix nvptx CUDA_VERSION conversion

As mentioned in PR#49250, without this patch, ptxas for CUDA 9.1 fails
in the following two tests:

- openmp/libomptarget/test/mapping/lambda_mapping.cpp
- openmp/libomptarget/test/offloading/bug49021.cpp

The error looks like:

```
ptxas /tmp/lambda_mapping-081ea9.s, line 828; error   : Not a name of any known instruction: 'activemask'
```

The problem is that our cmake script converts CUDA version strings
incorrectly: 9.1 becomes 9100, but it should be 9010, as shown in
`getCudaVersion` in `clang/lib/Driver/ToolChains/Cuda.cpp`.  Thus,
`openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu`
inadvertently enables `activemask` because it apparently becomes
available in 9.2.  This patch fixes the conversion.
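
The intended encoding can be sketched as below. This is an illustration
rather than the code in the cmake script or in `getCudaVersion`: the function
names are hypothetical, and the 9020 threshold simply encodes the statement
above that `activemask` apparently becomes available in 9.2.

```cpp
#include <cassert>

// Hypothetical helper: encode a CUDA version the way the CUDA_VERSION macro
// does, major * 1000 + minor * 10, so 9.1 becomes 9010.
constexpr int cudaVersionNumber(int major, int minor) {
  return major * 1000 + minor * 10;
}

// Hypothetical feature check, assuming activemask requires CUDA 9.2.
constexpr bool hasActivemask(int cudaVersion) {
  return cudaVersion >= cudaVersionNumber(9, 2);
}

static_assert(cudaVersionNumber(9, 1) == 9010, "9.1 encodes as 9010");
static_assert(!hasActivemask(cudaVersionNumber(9, 1)), "9.1 is below 9.2");
// The mis-converted value 9100 compares as newer than 9.2 (9020).
static_assert(hasActivemask(9100), "9100 wrongly passes the 9.2 check");
```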

This patch does not fix the other two tests in PR#49250.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D97012

3 years ago[OpenMP] Fix always,from and delete for data absent at exit
Joel E. Denny [Fri, 19 Feb 2021 15:59:36 +0000 (10:59 -0500)]
[OpenMP] Fix always,from and delete for data absent at exit

Without this patch, there's a runtime error for those map types at
exit from an "omp target data" or at "omp target exit data", but the
spec says the list item should be ignored.

This patch tests that fix in data_absent_at_exit.c, and it also
improves other testing for data that is not fully present at exit.

Reviewed By: grokos, RaviNarayanaswamy

Differential Revision: https://reviews.llvm.org/D96999

3 years ago[NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit
Mircea Trofin [Wed, 17 Feb 2021 21:32:26 +0000 (13:32 -0800)]
[NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit

VirtRegAuxInfo is an extensibility point, so the register allocator's
decision on which implementation to use should be communicated to the
other users - namely, LiveRangeEdit.

Differential Revision: https://reviews.llvm.org/D96898

3 years agoMake fixed-abi default for AMD HSA OS
madhur13490 [Wed, 10 Feb 2021 16:11:36 +0000 (16:11 +0000)]
Make fixed-abi default for AMD HSA OS

fixed-abi uses pre-defined and predictable
SGPR/VGPRs for passing arguments. This patch makes
this scheme the default when HSA OS is specified in the triple.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96340

3 years ago[ARM] Correct vector predicate type in MVE getCmpSelInstrCost
David Green [Fri, 19 Feb 2021 14:43:51 +0000 (14:43 +0000)]
[ARM] Correct vector predicate type in MVE getCmpSelInstrCost

3 years ago[AMDGPU] Add some GFX9 test coverage. NFC.
Jay Foad [Fri, 19 Feb 2021 14:38:26 +0000 (14:38 +0000)]
[AMDGPU] Add some GFX9 test coverage. NFC.

3 years ago[DAG] visitTRUNCATE - attempt to truncate USUBSAT
Simon Pilgrim [Fri, 19 Feb 2021 14:24:57 +0000 (14:24 +0000)]
[DAG] visitTRUNCATE - attempt to truncate USUBSAT

Fold trunc(usubsat(zext(x),y)) -> usubsat(x,trunc(umin(y,satlimit)))
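
A brute-force check of this identity for one concrete pair of widths (i8 for
x, i16 for y) might look like the sketch below. It assumes that satlimit is
the maximum value of the narrow type (0xFF here). The helper names are
hypothetical and this is not the DAGCombiner code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Unsigned saturating subtraction: clamps at zero instead of wrapping.
inline std::uint16_t usubsat16(std::uint16_t a, std::uint16_t b) {
  return a > b ? static_cast<std::uint16_t>(a - b) : 0;
}

inline std::uint8_t usubsat8(std::uint8_t a, std::uint8_t b) {
  return a > b ? static_cast<std::uint8_t>(a - b) : 0;
}

// Exhaustively compare both sides of the fold for every i8 x and i16 y.
inline bool foldHoldsForAllInputs() {
  for (unsigned x = 0; x <= 0xFFu; ++x) {
    for (unsigned y = 0; y <= 0xFFFFu; ++y) {
      // Left side: trunc(usubsat(zext(x), y)), computed in 16 bits.
      std::uint8_t wide = static_cast<std::uint8_t>(usubsat16(
          static_cast<std::uint16_t>(x), static_cast<std::uint16_t>(y)));
      // Right side: usubsat(x, trunc(umin(y, satlimit))), computed in 8 bits.
      std::uint8_t narrow = usubsat8(
          static_cast<std::uint8_t>(x),
          static_cast<std::uint8_t>(std::min(y, 0xFFu)));
      if (wide != narrow)
        return false;
    }
  }
  return true;
}
```

The umin is what makes truncating y safe: any y above 0xFF already
saturates the result to zero, and clamping it to 0xFF does the same in the
narrow type.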

3 years ago[mlir][Linalg] NFC - Expose more options to the CodegenStrategy
Nicolas Vasilache [Fri, 19 Feb 2021 14:00:18 +0000 (14:00 +0000)]
[mlir][Linalg] NFC - Expose more options to the CodegenStrategy

3 years ago[llvm-dwarfdump][locstats] Unify handling of inlined vars with no loc
Djordje Todorovic [Mon, 8 Feb 2021 08:21:39 +0000 (00:21 -0800)]
[llvm-dwarfdump][locstats] Unify handling of inlined vars with no loc

The presence or absence of an inline variable (as well as formal
parameter) with only an abstract_origin ref (without DW_AT_location)
should not change the location coverage.

It means, for both:

DW_TAG_inlined_subroutine
  DW_AT_abstract_origin (0x0000004e "f")
  DW_AT_low_pc  (0x0000000000000010)
  DW_AT_high_pc (0x0000000000000013)
  DW_TAG_formal_parameter
    DW_AT_abstract_origin       (0x0000005a "b")

and,

DW_TAG_inlined_subroutine
   DW_AT_abstract_origin (0x0000004e "f")
   DW_AT_low_pc  (0x0000000000000010)
   DW_AT_high_pc (0x0000000000000013)

we should report 0% location coverage. If we add DW_AT_location,
for both cases the coverage should be improved.

Differential Revision: https://reviews.llvm.org/D96045

3 years ago[lldb/Commands] Fix help text typo for 'breakpoint set' -a|--address.
Jan Kratochvil [Fri, 19 Feb 2021 13:33:42 +0000 (14:33 +0100)]
[lldb/Commands] Fix help text typo for 'breakpoint set' -a|--address.

3 years agoRevert "[ARM] Expand the range of allowed post-incs in load/store optimizer"
David Green [Fri, 19 Feb 2021 13:15:10 +0000 (13:15 +0000)]
Revert "[ARM] Expand the range of allowed post-incs in load/store optimizer"

This reverts commit 3b34b06fc5908b4f7dc720c0655d5756bd8e2a28 as runtime
errors were reported.

3 years ago[LV] Remove VPCallback.
Florian Hahn [Fri, 19 Feb 2021 12:50:41 +0000 (12:50 +0000)]
[LV] Remove VPCallback.

Now that all state for generated instructions is managed directly in
VPTransformState, VPCallback is no longer needed. This patch updates the
last use of `getOrCreateScalarValue` to instead manage the value
directly in VPTransformState and removes VPCallback.

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D95383

3 years ago[clangd] Expose absoluteParent helper
Kadir Cetinkaya [Mon, 15 Feb 2021 15:41:17 +0000 (16:41 +0100)]
[clangd] Expose absoluteParent helper

Will be used in other components that need ancestor traversal.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D96123

3 years ago[X86][SSE] Add tests for trunc(usubsat()) patterns.
Simon Pilgrim [Fri, 19 Feb 2021 12:21:02 +0000 (12:21 +0000)]
[X86][SSE] Add tests for trunc(usubsat()) patterns.