platform/upstream/llvm.git
3 years ago[AArch64] Adding Neon Polynomial vadd Intrinsics
Christopher Tetreault [Fri, 19 Feb 2021 22:46:36 +0000 (14:46 -0800)]
[AArch64] Adding Neon Polynomial vadd Intrinsics

This patch adds the following intrinsics:
            vadd_p8
            vadd_p16
            vadd_p64
            vaddq_p8
            vaddq_p16
            vaddq_p64
            vaddq_p128
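
For context: polynomial addition over GF(2) has no carries between bit
positions, so adding two polynomial lanes amounts to a bitwise XOR. Below is
a minimal Python sketch of that lane-wise behavior, assuming eight 8-bit lanes
as in vadd_p8. The helper name `poly_add_lanes` is hypothetical and is not
part of this patch.

```python
def poly_add_lanes(a, b, lane_bits=8):
    """Model lane-wise polynomial addition: XOR each pair of lanes."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of lanes")
    mask = (1 << lane_bits) - 1
    return [(x ^ y) & mask for x, y in zip(a, b)]


# Two illustrative 8-lane vectors of 8-bit polynomial values.
a = [0x01, 0x03, 0xFF, 0x80, 0x00, 0x55, 0xAA, 0x0F]
b = [0x01, 0x05, 0x0F, 0x80, 0x7F, 0x55, 0x55, 0xF0]
result = poly_add_lanes(a, b)
```

Adding a vector to itself yields all zeros, which is a quick sanity check
for any implementation of these intrinsics.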

Reviewed By: t.p.northover, DavidSpickett, ctetreau

Differential Revision: https://reviews.llvm.org/D96825

3 years ago[AArch64][GlobalISel] Make G_VECREDUCE_ADD of <2 x s32> legal.
Amara Emerson [Fri, 19 Feb 2021 22:27:08 +0000 (14:27 -0800)]
[AArch64][GlobalISel] Make G_VECREDUCE_ADD of <2 x s32> legal.

3 years agoRevert "Fix MLIR Toy tutorial JIT example and add a test to cover it"
Stella Stamenova [Fri, 19 Feb 2021 21:38:43 +0000 (13:38 -0800)]
Revert "Fix MLIR Toy tutorial JIT example and add a test to cover it"

This reverts commit ae15b1e7ad71e4bfde1b031dd5e6b0bbb3b88a42.

This commit caused failures on the mlir windows buildbot

3 years ago[dfsan] Add origin address calculation
Jianzhou Zhao [Fri, 19 Feb 2021 17:50:02 +0000 (17:50 +0000)]
[dfsan] Add origin address calculation

This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D97065

3 years ago[libc++][nfc] SFINAE on pair/tuple assignment operators: LWG 2729.
zoecarver [Fri, 19 Feb 2021 21:24:30 +0000 (13:24 -0800)]
[libc++][nfc] SFINAE on pair/tuple assignment operators: LWG 2729.

This patch ensures that SFINAE is used to delete assignment operators in pair and tuple based on issue 2729.

Differential Revision: https://reviews.llvm.org/D62454

3 years ago[RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC
Craig Topper [Fri, 19 Feb 2021 21:23:08 +0000 (13:23 -0800)]
[RISCV] Remove VPatILoad and VPatIStore multiclasses that are no longer used. NFC

3 years agoAdd datalayout to test added in 7e3183d73
Philip Reames [Fri, 19 Feb 2021 21:09:34 +0000 (13:09 -0800)]
Add datalayout to test added in 7e3183d73

Realized after pushing that this would probably fail on bots for targets other than x86-64.

3 years ago[lldb] Rename {stop,run}_vote to report_{stop,run}_vote
Dave Lee [Wed, 17 Feb 2021 23:09:50 +0000 (15:09 -0800)]
[lldb] Rename {stop,run}_vote to report_{stop,run}_vote

Rename `stop_vote` and `run_vote` to `report_stop_vote` and `report_run_vote`
respectively. These variables are limited to logic involving (event) reporting only.
This naming is intended to make their context more clear.

Differential Revision: https://reviews.llvm.org/D96917

3 years agoAdd test triggered by review discussion on D97077
Philip Reames [Fri, 19 Feb 2021 21:03:31 +0000 (13:03 -0800)]
Add test triggered by review discussion on D97077

3 years agoPatch by @wecing (Chenguang Wang).
Tim Shen [Fri, 19 Feb 2021 20:19:34 +0000 (12:19 -0800)]
Patch by @wecing (Chenguang Wang).

The current getFoldedSizeOf() implementation uses naive recursion, which
could be really slow when the input structure type is too complex.

This issue was first brought up in
http://llvm.org/bugs/show_bug.cgi?id=8281; this change fixes it by
adding memoization.
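
The effect of memoization on a recursive size computation can be sketched
as follows. This is a simplified Python model, not the getFoldedSizeOf()
code: types are modeled as ints (leaf sizes) or tuples of member types, and
the name `folded_size_of` is hypothetical.

```python
def folded_size_of(ty, cache=None):
    """Size of a nested type: an int is a leaf size, a tuple lists members."""
    if cache is None:
        cache = {}
    if isinstance(ty, int):
        return ty
    # Key on object identity so a shared subtype is computed only once;
    # hashing the nested tuple itself would walk the whole structure again.
    key = id(ty)
    if key not in cache:
        cache[key] = sum(folded_size_of(member, cache) for member in ty)
    return cache[key]


# Each level reuses the previous type twice, so naive recursion would
# visit 2**40 leaves while the memoized version visits 40 distinct nodes.
ty = 4
for _ in range(40):
    ty = (ty, ty)
size = folded_size_of(ty)
```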

Differential Revision: https://reviews.llvm.org/D6594

3 years ago[mlir] Add math polynomial approximation pass
Eugene Zhulenev [Fri, 19 Feb 2021 00:24:56 +0000 (16:24 -0800)]
[mlir] Add math polynomial approximation pass

This gives ~30x speedup compared to expanding Tanh into exp operations:

```
name                  old cpu/op  new cpu/op  delta
BM_mlir_Tanh_f32/10    253ns ± 3%    55ns ± 7%  -78.35%  (p=0.000 n=44+41)
BM_mlir_Tanh_f32/100  2.21µs ± 4%  0.14µs ± 8%  -93.85%  (p=0.000 n=48+49)
BM_mlir_Tanh_f32/1k   22.6µs ± 4%   0.7µs ± 5%  -96.68%  (p=0.000 n=32+42)
BM_mlir_Tanh_f32/10k   225µs ± 5%     7µs ± 6%  -96.88%  (p=0.000 n=49+55)

name                  old time/op             new time/op             delta
BM_mlir_Tanh_f32/10    259ns ± 1%               56ns ± 2%  -78.31%        (p=0.000 n=41+39)
BM_mlir_Tanh_f32/100  2.27µs ± 1%             0.14µs ± 5%  -93.89%        (p=0.000 n=46+49)
BM_mlir_Tanh_f32/1k   22.9µs ± 1%              0.8µs ± 4%  -96.67%        (p=0.000 n=30+42)
BM_mlir_Tanh_f32/10k   230µs ± 0%                7µs ± 3%  -96.88%        (p=0.000 n=37+55)
```
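
As a sanity check on the "~30x" figure, a delta percentage from the table
above can be converted into an old/new ratio. This is a small Python sketch;
the helper name `speedup_from_delta` is hypothetical.

```python
def speedup_from_delta(delta_percent):
    """Convert a benchmark delta such as -96.88 (%) into an old/new ratio."""
    remaining = 1.0 + delta_percent / 100.0  # fraction of the old time left
    if remaining <= 0.0:
        raise ValueError("delta must be greater than -100%")
    return 1.0 / remaining


small = speedup_from_delta(-78.35)  # BM_mlir_Tanh_f32/10, cpu/op
large = speedup_from_delta(-96.88)  # BM_mlir_Tanh_f32/10k, cpu/op
```

The smallest input gains roughly 4.6x and the largest roughly 32x, so the
"~30x" claim applies to the larger problem sizes.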

This approximation is based on the Eigen::generic_fast_tanh function.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D96739

3 years ago[clang] Emit type metadata on available_externally vtables for WPD
Teresa Johnson [Wed, 17 Feb 2021 03:44:58 +0000 (19:44 -0800)]
[clang] Emit type metadata on available_externally vtables for WPD

When WPD is enabled, via WholeProgramVTables, emit type metadata for
available_externally vtables. Additionally, add the vtables to the
llvm.compiler.used global so that they are not prematurely eliminated
(before *LTO analysis).

This is needed to avoid devirtualizing calls to a function overriding a
class defined in a header file but with a strong definition in a shared
library. Without type metadata on the available_externally vtables from
the header, the WPD analysis never sees what a derived class is
overriding. Even if the available_externally base class functions are
pure virtual, because shared library definitions are already treated
conservatively (committed patches D91583, D96721, and D96722) we will
not devirtualize, which would be unsafe since the library might contain
overrides that aren't visible to the LTO unit.

An example is std::error_category, which is overridden in LLVM
and causing failures after a self build with WPD enabled, because
libstdc++ contains hidden overrides of the virtual base class methods.

Differential Revision: https://reviews.llvm.org/D96919

3 years ago[msan] Set cmpxchg shadow precisely
Jianzhou Zhao [Fri, 19 Feb 2021 04:37:49 +0000 (04:37 +0000)]
[msan] Set cmpxchg shadow precisely

According to https://llvm.org/docs/LangRef.html#cmpxchg-instruction,
the return type of cmpxchg is a pair {ty, i1}, while I think we
only wanted to set the shadow for the address 0th op, and it has type
ty.

Reviewed-by: eugenis
Differential Revision: https://reviews.llvm.org/D97029

3 years agoprecommit test cleanup for D97077
Philip Reames [Fri, 19 Feb 2021 20:19:31 +0000 (12:19 -0800)]
precommit test cleanup for D97077

3 years ago[flang][fir][NFC] run clang-format
Eric Schweitz [Fri, 19 Feb 2021 20:05:26 +0000 (12:05 -0800)]
[flang][fir][NFC] run clang-format

cleanup post-merge

3 years ago[Verifier] remove dead code for saturating intrinsics; NFC
Sanjay Patel [Fri, 19 Feb 2021 19:56:20 +0000 (14:56 -0500)]
[Verifier] remove dead code for saturating intrinsics; NFC

Test coverage shows that we assert with the string from the
tablegen defs file for these intrinsics, so these cases
should never be live.

3 years ago[Verifier] add tests for saturating intrinsics; NFC
Sanjay Patel [Fri, 19 Feb 2021 19:48:12 +0000 (14:48 -0500)]
[Verifier] add tests for saturating intrinsics; NFC

As noted in D96904, we don't have direct tests for these malformed ops.

3 years ago[libcxx] Make generic_*string return paths with forward slashes on windows
Martin Storsjö [Mon, 9 Nov 2020 09:45:13 +0000 (11:45 +0200)]
[libcxx] Make generic_*string return paths with forward slashes on windows

This matches what MS STL returns; in std::filesystem, forward slashes
are considered generic dir separators that are valid on all platforms.

Differential Revision: https://reviews.llvm.org/D91181

3 years ago[elfabi] Fix a bug when .dynsym contains no non-local symbol
Haowei Wu [Thu, 18 Feb 2021 04:10:44 +0000 (20:10 -0800)]
[elfabi] Fix a bug when .dynsym contains no non-local symbol

This patch fixes a bug that occurred when elfabi was supplied with a tbe
file that contains no non-local symbols. Before this patch, it wrote 0 to
sh_info of the .dynsym section, making the ELF stub file invalid.
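
For reference, the ELF gABI describes sh_info of a symbol table section as
one greater than the index of the last local symbol, and the null symbol at
index 0 has local binding. A minimal Python sketch of that rule follows. It
models the specification only and is not the elfabi code; the function name
is hypothetical.

```python
STB_LOCAL = 0
STB_GLOBAL = 1


def symtab_sh_info(bindings):
    """Return one greater than the index of the last STB_LOCAL symbol."""
    if not bindings or bindings[0] != STB_LOCAL:
        raise ValueError("index 0 must be the local null symbol")
    last_local = max(i for i, b in enumerate(bindings) if b == STB_LOCAL)
    return last_local + 1


# A .dynsym holding only the null symbol still has a non-zero sh_info.
only_null = symtab_sh_info([STB_LOCAL])
with_globals = symtab_sh_info([STB_LOCAL, STB_GLOBAL, STB_GLOBAL])
```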

Differential Revision: https://reviews.llvm.org/D96930

3 years ago[libcxx] Fix LWG 2875: shared_ptr::shared_ptr(Y*, D, […]) constructors should be...
zoecarver [Fri, 19 Feb 2021 19:10:36 +0000 (11:10 -0800)]
[libcxx] Fix LWG 2875: shared_ptr::shared_ptr(Y*, D, […]) constructors should be constrained.

Fixes LWG issue 2875.

Differential Revision: https://reviews.llvm.org/D81414

3 years ago[libcxx] Have lexically_normal return the path with preferred separators
Martin Storsjö [Thu, 5 Nov 2020 21:09:15 +0000 (23:09 +0200)]
[libcxx] Have lexically_normal return the path with preferred separators

Differential Revision: https://reviews.llvm.org/D91179

3 years ago[Analysis][LoopVectorize] do not form reductions of pointers
Sanjay Patel [Fri, 19 Feb 2021 14:06:05 +0000 (09:06 -0500)]
[Analysis][LoopVectorize] do not form reductions of pointers

This is a fix for https://llvm.org/PR49215 either before/after
we make a verifier enhancement for vector reductions with D96904.

I'm not sure what the current thinking is for pointer math/logic
in IR. We allow icmp on pointer values. Therefore, we match min/max
patterns, so without this patch, the vectorizer could form a vector
reduction from that sequence.

But the LangRef definitions for min/max and vector reduction
intrinsics do not allow pointer types:
https://llvm.org/docs/LangRef.html#llvm-smax-intrinsic
https://llvm.org/docs/LangRef.html#llvm-vector-reduce-umax-intrinsic

So we would crash/assert at some point - either in IR verification,
in the cost model, or in codegen. If we do want to allow this kind
of transform, we will need to update the LangRef and all of those
parts of the compiler.

Differential Revision: https://reviews.llvm.org/D97047

3 years ago[Polly] Fix test after D96534.
Michael Kruse [Fri, 19 Feb 2021 18:47:52 +0000 (12:47 -0600)]
[Polly] Fix test after D96534.

3 years ago[RISCV] Use inheritance to reduce some repeated code in tablegen. NFC
Craig Topper [Fri, 19 Feb 2021 18:36:26 +0000 (10:36 -0800)]
[RISCV] Use inheritance to reduce some repeated code in tablegen. NFC

The VLX and VSX searchable tables share the same format, so we
can have a common base class for them.

3 years ago[X86] Regenerate 2007-06-28-X86-64-isel.ll
Simon Pilgrim [Fri, 19 Feb 2021 18:34:57 +0000 (18:34 +0000)]
[X86] Regenerate 2007-06-28-X86-64-isel.ll

3 years ago[X86] Remove unused intrinsic declaration
Simon Pilgrim [Fri, 19 Feb 2021 18:25:11 +0000 (18:25 +0000)]
[X86] Remove unused intrinsic declaration

3 years ago[X86] Regenerate 2011-12-06-AVXVectorExtractCombine.ll
Simon Pilgrim [Fri, 19 Feb 2021 18:23:21 +0000 (18:23 +0000)]
[X86] Regenerate 2011-12-06-AVXVectorExtractCombine.ll

3 years ago[RISCV] Remove unneeded indexed segment load/store vector pseudo instruction.
Craig Topper [Fri, 19 Feb 2021 18:28:45 +0000 (10:28 -0800)]
[RISCV] Remove unneeded indexed segment load/store vector pseudo instruction.

We had more combinations of data and index lmuls than we needed.

Also add some asserts to verify that the IndexVT and data VT have
the same element count when we isel these pseudo instructions.

3 years ago[RISCV] Use custom isel for vector indexed load/store intrinsics.
Craig Topper [Fri, 19 Feb 2021 18:08:43 +0000 (10:08 -0800)]
[RISCV] Use custom isel for vector indexed load/store intrinsics.

There are many legal combinations of index and data VTs supported
for these intrinsics. This results in a lot of isel patterns in
RISCVGenDAGISel.inc.

By adding a separate table similar to what we use for segment
load/stores, we can more efficiently manually select these
intrinsics. We should also be able to reuse this table for scalable
vector gather/scatter.

This reduces the llc binary size by ~56K.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D97033

3 years ago[RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics.
Craig Topper [Fri, 19 Feb 2021 18:00:13 +0000 (10:00 -0800)]
[RISCV] Prevent selecting a 0 VL to X0 for the segment load/store intrinsics.

Just like we do for isel patterns, we need to call selectVLOp
to prevent 0 from being selected to X0 by the default isel.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97021

3 years ago[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64
Craig Topper [Fri, 19 Feb 2021 17:55:45 +0000 (09:55 -0800)]
[RISCV] Move SHFLI matching to DAG combine. Add 32-bit support for RV64

We previously used isel patterns for this, but that used quite
a bit of space in the isel table due to OR being associative
and commutative. It also wouldn't handle shifts/ands being in
reversed order.

This generalizes the shift/and matching from GREVI to
take the expected mask table as input so we can reuse it for
SHFLI.

There is no SHFLIW instruction, but we can promote a 32-bit
SHFLI to i64 on RV64. As long as bit 4 of the control bit isn't
set, a 64-bit SHFLI will preserve 33 sign bits if the input had
at least 33 sign bits. ComputeNumSignBits has been updated to
account for that to avoid sext.w in the tests.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D96661

3 years ago[SampleFDO] Add PromotedInsns to prevent repeated ICP.
Wei Mi [Fri, 19 Feb 2021 17:57:18 +0000 (09:57 -0800)]
[SampleFDO] Add PromotedInsns to prevent repeated ICP.

In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9,
we used a 0-count value profile to memorize which target has been promoted
and prevent repeated ICP for the same target, so we deleted PromotedInsns.
However, I found that the implementation in that patch has some shortcomings
that must be fixed, otherwise there will still be repeated ICP. So I am adding
PromotedInsns back temporarily and will remove it once I have a thorough fix.

3 years ago[CUDA] fix builtin constraints for PTX 7.2
Artem Belevich [Fri, 19 Feb 2021 17:32:10 +0000 (09:32 -0800)]
[CUDA] fix builtin constraints for PTX 7.2

This fixes build issues w/ CUDA-11 introduced by https://reviews.llvm.org/D95974

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D97009

3 years ago[Sanitizer][NFC] Fix typo
Luís Marques [Fri, 19 Feb 2021 17:46:02 +0000 (17:46 +0000)]
[Sanitizer][NFC] Fix typo

3 years ago[AArch64][GlobalISel] Run redundant_sext_inreg in the post-legalizer combiner
Jessica Paquette [Wed, 17 Feb 2021 23:16:51 +0000 (15:16 -0800)]
[AArch64][GlobalISel] Run redundant_sext_inreg in the post-legalizer combiner

This is to ensure that we can eliminate G_ASSERT_SEXT.

In a follow-up patch, I'm going to make CallLowering emit G_ASSERT_SEXT for
signext parameters.

Differential Revision: https://reviews.llvm.org/D96913

3 years ago[mlir] Add folding of tensor.cast -> subtensor_insert
Nicolas Vasilache [Fri, 19 Feb 2021 17:04:12 +0000 (17:04 +0000)]
[mlir] Add folding of tensor.cast -> subtensor_insert

Differential Revision: https://reviews.llvm.org/D97059

3 years ago[MLIR] Delete unused functions getCollapsedInitTensor and getExpandedInitTensor
Geoffrey Martin-Noble [Fri, 19 Feb 2021 01:31:50 +0000 (17:31 -0800)]
[MLIR] Delete unused functions getCollapsedInitTensor and getExpandedInitTensor

These are unused since
https://reviews.llvm.org/rG81264dfbe80df08668a325a61613b64243b99c01

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D97014

3 years ago[LV] Fold single-use variable into assert. NFC.
Benjamin Kramer [Fri, 19 Feb 2021 17:11:39 +0000 (18:11 +0100)]
[LV] Fold single-use variable into assert. NFC.

3 years ago[MemCopyOpt] Enable MemorySSA by default
Nikita Popov [Sun, 10 Jan 2021 09:52:01 +0000 (10:52 +0100)]
[MemCopyOpt] Enable MemorySSA by default

This enables use of MemorySSA instead of MemDep in MemCpyOpt. To
allow this without significant compile-time impact, the MemCpyOpt
pass is moved directly before DSE (in the cases where this was not
already the case), which allows us to reuse the existing MemorySSA
analysis.

Unlike the MemDep-based implementation, the MemorySSA-based MemCpyOpt
can also perform simple optimizations across basic blocks.

Differential Revision: https://reviews.llvm.org/D94376

3 years agoHwasan InitPrctl check for error using internal_iserror
Matthew Malcomson [Fri, 19 Feb 2021 16:19:37 +0000 (16:19 +0000)]
Hwasan InitPrctl check for error using internal_iserror

When adding this function in https://reviews.llvm.org/D68794 I did not
notice that internal_prctl has the API of the syscall to prctl rather
than the API of the glibc (posix) wrapper.

This means that the error return value is not necessarily -1 and that
errno is not set by the call.

For InitPrctl this means that the checks do not catch running on a
kernel *without* the required ABI (this was not caught since I only tested
that this function correctly enables the ABI when it exists).
This commit updates the two calls which check for an error condition to
use internal_iserror. That function sets a provided integer to an
equivalent errno value and returns a boolean to indicate success or not.
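
The difference between the two APIs can be illustrated with a small model.
A raw Linux syscall reports failure by returning a value in [-4095, -1]
(that is, -errno) instead of returning -1 and setting errno. The Python
sketch below assumes a 64-bit word; the name `is_syscall_error` is
hypothetical, and the sketch returns True when an error is detected.

```python
WORD_MASK = (1 << 64) - 1  # assumption: 64-bit unsigned machine word
MAX_ERRNO = 4095


def is_syscall_error(retval):
    """Return (is_error, errno) for a raw syscall return value."""
    retval &= WORD_MASK  # reinterpret the value as an unsigned word
    if retval >= (WORD_MASK + 1) - MAX_ERRNO:  # i.e. retval in [-4095, -1]
        return True, (WORD_MASK + 1) - retval
    return False, 0


EINVAL = 22
failed = is_syscall_error(-EINVAL)  # raw return value of -22
succeeded = is_syscall_error(0)
```

A check written as `retval == -1` would miss the failing case above, which
is the kind of problem this commit describes.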

Tested by running on a kernel that has this ABI and on one that does
not. Verified that running on the kernel without this ABI the current
code prints the provided error message and does not attempt to run the
program. Verified that running on the kernel with this ABI the current
code does not print an error message and turns on the ABI.
This was done on an x86 kernel (where the ABI does not exist), an AArch64
kernel without this ABI, and an AArch64 kernel with this ABI.

In order to keep running the testsuite on kernels that do not provide
this new ABI we add another option to the HWASAN_OPTIONS environment
variable, this option determines whether the library kills the process
if it fails to enable the relaxed syscall ABI or not.
This new flag is `fail_without_syscall_abi`.
The check-hwasan testsuite results do not change with this patch on
any of x86, AArch64 without a kernel supporting this ABI, and AArch64
with a kernel supporting this ABI.

Differential Revision: https://reviews.llvm.org/D96964

3 years ago[SCEV] Use both known bits and sign bits when computing range of SCEV unknowns
Philip Reames [Fri, 19 Feb 2021 16:27:46 +0000 (08:27 -0800)]
[SCEV] Use both known bits and sign bits when computing range of SCEV unknowns

When computing a range for a SCEVUnknown, today we use computeKnownBits for unsigned ranges, and computeNumSignBits for signed ranges. This means we miss opportunities to improve range results.

One common missed pattern is that we have a signed range of a value which CKB can determine is positive, but CNSB doesn't convey that information. The current range includes the negative part, and is thus double the size.
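
The missed pattern can be sketched with plain integer intervals. This is a
simplified Python model assuming an 8-bit value and non-wrapping ranges;
LLVM's ConstantRange also handles wrapped ranges, which this sketch does
not. All function names are hypothetical.

```python
def unsigned_range_from_known_bits(known_zero, known_one, width):
    """Unsigned bounds: unknown bits all clear (min) or all set (max)."""
    mask = (1 << width) - 1
    return known_one, ~known_zero & mask


def signed_range_from_sign_bits(num_sign_bits, width):
    """Signed bounds for a value whose top num_sign_bits bits are equal."""
    magnitude_bits = width - num_sign_bits
    return -(1 << magnitude_bits), (1 << magnitude_bits) - 1


def intersect(a, b):
    """Intersect two closed, non-wrapping integer intervals."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    if lo > hi:
        raise ValueError("ranges do not overlap")
    return lo, hi


WIDTH = 8
# Known bits prove the top bit is zero; sign-bit counting proves only one
# sign bit, so on its own it yields the full signed range.
from_known_bits = unsigned_range_from_known_bits(0x80, 0x00, WIDTH)
from_sign_bits = signed_range_from_sign_bits(1, WIDTH)
combined = intersect(from_known_bits, from_sign_bits)
```

Here the sign-bit range alone is [-128, 127], while combining it with the
known-bits result narrows it to [0, 127], half the size.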

Per the removed comment, the original concern which delayed using both (after some code merging years back) was a compile time concern. CTMark results (provided by Nikita, thanks!) showed a geomean impact of about 0.1%. This doesn't seem large enough to avoid higher quality results.

Differential Revision: https://reviews.llvm.org/D96534

3 years ago[libc++] Turn off clang-format for auto-generated version header. NFC.
Marek Kurdej [Fri, 19 Feb 2021 13:10:12 +0000 (14:10 +0100)]
[libc++] Turn off clang-format for auto-generated version header. NFC.

3 years ago[OpenMP] Fix nvptx CUDA_VERSION conversion
Joel E. Denny [Fri, 19 Feb 2021 15:59:52 +0000 (10:59 -0500)]
[OpenMP] Fix nvptx CUDA_VERSION conversion

As mentioned in PR#49250, without this patch, ptxas for CUDA 9.1 fails
in the following two tests:

- openmp/libomptarget/test/mapping/lambda_mapping.cpp
- openmp/libomptarget/test/offloading/bug49021.cpp

The error looks like:

```
ptxas /tmp/lambda_mapping-081ea9.s, line 828; error   : Not a name of any known instruction: 'activemask'
```

The problem is that our cmake script converts CUDA version strings
incorrectly: 9.1 becomes 9100, but it should be 9010, as shown in
`getCudaVersion` in `clang/lib/Driver/ToolChains/Cuda.cpp`.  Thus,
`openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu`
inadvertently enables `activemask` because it apparently becomes
available in 9.2.  This patch fixes the conversion.
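
The intended encoding, inferred from the 9.1 -> 9010 example above, is
major * 1000 + minor * 10. Below is a minimal Python sketch of that
conversion. It is not the cmake code from this patch, and the function name
is hypothetical.

```python
def cuda_version_number(version):
    """Convert a "major.minor" string into major * 1000 + minor * 10."""
    major_text, _, minor_text = version.partition(".")
    major = int(major_text)
    minor = int(minor_text) if minor_text else 0
    return major * 1000 + minor * 10


v91 = cuda_version_number("9.1")
v92 = cuda_version_number("9.2")
```

With the incorrect value 9100, a comparison such as `version >= 9020`
succeeds for CUDA 9.1, which matches the failure described above.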

This patch does not fix the other two tests in PR#49250.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D97012

3 years ago[OpenMP] Fix always,from and delete for data absent at exit
Joel E. Denny [Fri, 19 Feb 2021 15:59:36 +0000 (10:59 -0500)]
[OpenMP] Fix always,from and delete for data absent at exit

Without this patch, there's a runtime error for those map types at
exit from an "omp target data" or at "omp target exit data", but the
spec says the list item should be ignored.

This patch tests that fix in data_absent_at_exit.c, and it also
improves other testing for data that is not fully present at exit.

Reviewed By: grokos, RaviNarayanaswamy

Differential Revision: https://reviews.llvm.org/D96999

3 years ago[NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit
Mircea Trofin [Wed, 17 Feb 2021 21:32:26 +0000 (13:32 -0800)]
[NFC][Regalloc] Share the VirtRegAuxInfo object with LiveRangeEdit

VirtRegAuxInfo is an extensibility point, so the register allocator's
decision on which implementation to use should be communicated to the
other users - namely, LiveRangeEdit.

Differential Revision: https://reviews.llvm.org/D96898

3 years agoMake fixed-abi default for AMD HSA OS
madhur13490 [Wed, 10 Feb 2021 16:11:36 +0000 (16:11 +0000)]
Make fixed-abi default for AMD HSA OS

fixed-abi uses pre-defined and predictable
SGPR/VGPRs for passing arguments. This patch makes
this scheme default when HSA OS is specified in triple.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96340

3 years ago[ARM] Correct vector predicate type in MVE getCmpSelInstrCost
David Green [Fri, 19 Feb 2021 14:43:51 +0000 (14:43 +0000)]
[ARM] Correct vector predicate type in MVE getCmpSelInstrCost

3 years ago[AMDGPU] Add some GFX9 test coverage. NFC.
Jay Foad [Fri, 19 Feb 2021 14:38:26 +0000 (14:38 +0000)]
[AMDGPU] Add some GFX9 test coverage. NFC.

3 years ago[DAG] visitTRUNCATE - attempt to truncate USUBSAT
Simon Pilgrim [Fri, 19 Feb 2021 14:24:57 +0000 (14:24 +0000)]
[DAG] visitTRUNCATE - attempt to truncate USUBSAT

Fold trunc(usubsat(zext(x),y)) -> usubsat(x,trunc(umin(y,satlimit)))
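
The fold can be checked exhaustively on small bit widths. The Python sketch
below assumes x has a narrow type, y has a type twice as wide, and satlimit
is the largest value of the narrow type. The helper names are hypothetical.

```python
def usubsat(a, b):
    """Unsigned saturating subtraction: clamp at zero instead of wrapping."""
    return a - b if a > b else 0


def check_fold(narrow_bits):
    """Compare both sides for every x (narrow type) and y (wide type)."""
    wide_bits = 2 * narrow_bits
    narrow_mask = (1 << narrow_bits) - 1
    sat_limit = narrow_mask  # largest value of the narrow type
    for x in range(1 << narrow_bits):
        for y in range(1 << wide_bits):
            # trunc(usubsat(zext(x), y))
            before = usubsat(x, y) & narrow_mask
            # usubsat(x, trunc(umin(y, satlimit)))
            after = usubsat(x, min(y, sat_limit) & narrow_mask)
            if before != after:
                return False
    return True


holds = check_fold(4)  # 4-bit x, 8-bit y: 4096 combinations
```

Clamping y with umin before truncating matters: without it, a large y could
truncate to a small value and produce a non-zero result.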

3 years ago[mlir][Linalg] NFC - Expose more options to the CodegenStrategy
Nicolas Vasilache [Fri, 19 Feb 2021 14:00:18 +0000 (14:00 +0000)]
[mlir][Linalg] NFC - Expose more options to the CodegenStrategy

3 years ago[llvm-dwarfdump][locstats] Unify handling of inlined vars with no loc
Djordje Todorovic [Mon, 8 Feb 2021 08:21:39 +0000 (00:21 -0800)]
[llvm-dwarfdump][locstats] Unify handling of inlined vars with no loc

The presence or absence of an inline variable (as well as formal
parameter) with only an abstract_origin ref (without DW_AT_location)
should not change the location coverage.

It means, for both:

DW_TAG_inlined_subroutine
  DW_AT_abstract_origin (0x0000004e "f")
  DW_AT_low_pc  (0x0000000000000010)
  DW_AT_high_pc (0x0000000000000013)
  DW_TAG_formal_parameter
    DW_AT_abstract_origin       (0x0000005a "b")

and,

DW_TAG_inlined_subroutine
   DW_AT_abstract_origin (0x0000004e "f")
   DW_AT_low_pc  (0x0000000000000010)
   DW_AT_high_pc (0x0000000000000013)

we should report 0% location coverage. If we add DW_AT_location,
for both cases the coverage should be improved.

Differential Revision: https://reviews.llvm.org/D96045

3 years ago[lldb/Commands] Fix help text typo for 'breakpoint set' -a|--address.
Jan Kratochvil [Fri, 19 Feb 2021 13:33:42 +0000 (14:33 +0100)]
[lldb/Commands] Fix help text typo for 'breakpoint set' -a|--address.

3 years agoRevert "[ARM] Expand the range of allowed post-incs in load/store optimizer"
David Green [Fri, 19 Feb 2021 13:15:10 +0000 (13:15 +0000)]
Revert "[ARM] Expand the range of allowed post-incs in load/store optimizer"

This reverts commit 3b34b06fc5908b4f7dc720c0655d5756bd8e2a28 as runtime
errors were reported.

3 years ago[LV] Remove VPCallback.
Florian Hahn [Fri, 19 Feb 2021 12:50:41 +0000 (12:50 +0000)]
[LV] Remove VPCallback.

Now that all state for generated instructions is managed directly in
VPTransformState, VPCallback is no longer needed. This patch updates the
last use of `getOrCreateScalarValue` to instead manage the value
directly in VPTransformState and removes VPCallback.

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D95383

3 years ago[clangd] Expose absoluteParent helper
Kadir Cetinkaya [Mon, 15 Feb 2021 15:41:17 +0000 (16:41 +0100)]
[clangd] Expose absoluteParent helper

Will be used in other components that need ancestor traversal.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D96123

3 years ago[X86][SSE] Add tests for trunc(usubsat()) patterns.
Simon Pilgrim [Fri, 19 Feb 2021 12:21:02 +0000 (12:21 +0000)]
[X86][SSE] Add tests for trunc(usubsat()) patterns.

3 years ago[gn build] Port 1a2b3536efef
Nico Weber [Fri, 19 Feb 2021 12:21:56 +0000 (07:21 -0500)]
[gn build] Port 1a2b3536efef

3 years ago[RISCV] Address some clang-tidy warnings. NFCI.
Fraser Cormack [Fri, 19 Feb 2021 12:09:25 +0000 (12:09 +0000)]
[RISCV] Address some clang-tidy warnings. NFCI.

3 years ago[LLD] Fix tests after D96993
Nikita Popov [Fri, 19 Feb 2021 12:06:45 +0000 (13:06 +0100)]
[LLD] Fix tests after D96993

We now need mustprogress to eliminate these calls. The code doesn't
really make sense, but that's not the point of the test...

3 years ago[mlir][nfc] Fix indentation in LinalgOps.td.
Alexander Belyaev [Fri, 19 Feb 2021 12:02:58 +0000 (13:02 +0100)]
[mlir][nfc] Fix indentation in LinalgOps.td.

3 years ago[OPENMP][AMDGCN] Improvements to print_kernel_trace (bit mask)
Ron Lieberman [Thu, 18 Feb 2021 22:10:40 +0000 (17:10 -0500)]
[OPENMP][AMDGCN] Improvements to print_kernel_trace (bit mask)

Allow bit masking to select various trace features.
  bit 0 => Launch tracing           (stderr)
  bit 1 => timing of runtime        (stdout)
  bit 2 => detailed launch tracing  (stderr)
  bit 3 => timing goes to stdout instead of stderr

  example: LIBOMPTARGET_KERNEL_TRACE=7     does it all
           LIBOMPTARGET_KERNEL_TRACE=5     Launch + details
           LIBOMPTARGET_KERNEL_TRACE=2     timings + launch to stderr
           LIBOMPTARGET_KERNEL_TRACE=10    timings + launch to stdout
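
Decoding a trace value into its individual bits can be sketched as follows.
This is an illustrative Python model using the bit descriptions listed
above, not the runtime's implementation; `decode_kernel_trace` is a
hypothetical name.

```python
TRACE_BITS = {
    0: "launch tracing",
    1: "timing of runtime",
    2: "detailed launch tracing",
    3: "timing goes to stdout instead of stderr",
}


def decode_kernel_trace(value):
    """Describe the bits set in a LIBOMPTARGET_KERNEL_TRACE value."""
    return [name for bit, name in TRACE_BITS.items() if value & (1 << bit)]


all_basic = decode_kernel_trace(7)       # bits 0, 1 and 2
launch_details = decode_kernel_trace(5)  # bits 0 and 2
timing_stdout = decode_kernel_trace(10)  # bits 1 and 3
```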

Differential Revision: https://reviews.llvm.org/D96998

3 years ago[AMDGPU] WQM/WWM: Fix marking of partial definitions
Carl Ritson [Fri, 19 Feb 2021 11:12:03 +0000 (20:12 +0900)]
[AMDGPU] WQM/WWM: Fix marking of partial definitions

Track lanes when processing definitions for marking WQM/WWM.
If all lanes have been defined then marking can stop.
This prevents marking unnecessary instructions as WQM/WWM.

In particular this fixes a bug where values passing through
V_SET_INACTIVE would be marked as requiring WWM.

Reviewed By: piotr

Differential Revision: https://reviews.llvm.org/D95503

3 years ago[DCE] Don't remove non-willreturn calls
Nikita Popov [Thu, 18 Feb 2021 21:29:19 +0000 (22:29 +0100)]
[DCE] Don't remove non-willreturn calls

In both ADCE and BDCE (via DemandedBits) we should not remove
instructions that are not guaranteed to return. This issue was
pointed out by fhahn in the recent llvm-dev thread.

Differential Revision: https://reviews.llvm.org/D96993

3 years ago[flang][driver] Add debug measure-parse-tree and pre-fir-tree options
Faris Rehman [Wed, 17 Feb 2021 18:53:05 +0000 (18:53 +0000)]
[flang][driver] Add debug measure-parse-tree and pre-fir-tree options

Add the following options:
* -fdebug-measure-parse-tree
* -fdebug-pre-fir-tree

Summary of changes:
- Add 2 new frontend actions: DebugMeasureParseTreeAction and DebugPreFIRTreeAction
- Add MeasurementVisitor to FrontendActions.h
- Make reportFatalSemanticErrors return true if there are any fatal errors
- Port most of the `-fdebug-pre-fir-tree` tests to use the new driver if built, otherwise use f18.

Differential Revision: https://reviews.llvm.org/D96884

3 years agoRemove unnecessary "using namespace llvm" inside "namespace llvm". NFCI.
Simon Pilgrim [Fri, 19 Feb 2021 11:15:16 +0000 (11:15 +0000)]
Remove unnecessary "using namespace llvm" inside "namespace llvm". NFCI.

3 years ago[X86][AVX] getFauxShuffleMask - decode VBROADCAST(EXTRACT_VECTOR_ELT(V,0))
Simon Pilgrim [Fri, 19 Feb 2021 11:02:38 +0000 (11:02 +0000)]
[X86][AVX] getFauxShuffleMask - decode VBROADCAST(EXTRACT_VECTOR_ELT(V,0))

Handle the case where we're broadcasting a scalar extracted from another vector.

3 years ago[IR] Move willReturn() to Instruction
Nikita Popov [Thu, 18 Feb 2021 21:15:17 +0000 (22:15 +0100)]
[IR] Move willReturn() to Instruction

This moves the willReturn() helper from CallBase to Instruction,
so that it can be used in a more generic manner. This will make
it easier to fix additional passes (ADCE and BDCE), and will give
us one place to change if additional instructions should become
non-willreturn (e.g. there has been talk about handling volatile
operations this way).

I have also included the IntrinsicInst workaround directly in
here, so that it gets applied consistently. (As such this change
is not entirely NFC -- FuncAttrs will now use this as well.)

Differential Revision: https://reviews.llvm.org/D96992

3 years ago[BasicAA] Add simple depth limit to avoid stack overflow (PR49151)
Nikita Popov [Thu, 18 Feb 2021 22:13:33 +0000 (23:13 +0100)]
[BasicAA] Add simple depth limit to avoid stack overflow (PR49151)

This is a simpler variant of D96647. It just adds a straightforward
depth limit with a high cutoff, without introducing complex logic
for BatchAA consistency. It accepts that we may cache a sub-optimal
result if the depth limit is hit.

Eventually this should be more fully addressed by D96647 or similar,
but in the meantime this avoids stack overflows in a cheap way.
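
The general idea of a depth cutoff with a conservative fallback can be
sketched as follows. This is a generic Python illustration, not the BasicAA
code: the chain-of-nodes model, the names, and the cutoff value of 64 are
all assumptions made for the example.

```python
MAY_ALIAS = "MayAlias"
NO_ALIAS = "NoAlias"


def alias_query(node, depth=0, max_depth=64):
    """Follow a chain of nested queries, giving up when it gets too deep."""
    if depth >= max_depth:
        return MAY_ALIAS  # conservative answer; possibly sub-optimal
    if node is None:
        return NO_ALIAS  # reached the end of the chain
    return alias_query(node["next"], depth + 1, max_depth)


def make_chain(length):
    """Build a linked chain of the given length without recursion."""
    node = None
    for _ in range(length):
        node = {"next": node}
    return node


shallow = alias_query(make_chain(10))
deep = alias_query(make_chain(10_000))
```

The deep query returns the conservative answer instead of recursing 10,000
levels, trading precision for bounded stack usage.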

Differential Revision: https://reviews.llvm.org/D96996

3 years ago[mlir] Add a TensorLoadToMemref canonicalization
Nicolas Vasilache [Fri, 19 Feb 2021 09:33:56 +0000 (09:33 +0000)]
[mlir] Add a TensorLoadToMemref canonicalization

A folder of `tensor_load + tensor_to_memref` exists but it only applies when
source and destination memref types are the same.

This revision adds a canonicalization of `tensor_load + tensor_to_memref` to
`memref_cast` for cases where type mismatches prevent the folding from kicking in.

Differential Revision: https://reviews.llvm.org/D97038

3 years ago[docs] Fix the GlobalISel/GenericOpcode.rst
Djordje Todorovic [Fri, 19 Feb 2021 09:28:38 +0000 (10:28 +0100)]
[docs] Fix the GlobalISel/GenericOpcode.rst

This causes the docs build to fail.
Introduced with D96890.

3 years ago[X86] Fix a codegen crash in getSetCCResultType
Wang, Pengfei [Fri, 19 Feb 2021 08:43:30 +0000 (16:43 +0800)]
[X86] Fix a codegen crash in getSetCCResultType

This patch fixes some crashes coming from
X86ISelLowering::getSetCCResultType, which would occasionally return
an EVT constructed from an invalid MVT, which has a null Type pointer.

This patch refers to D95434.

Differential Revision: https://reviews.llvm.org/D97036

3 years ago[AArch64] Add some missing Neoverse features
Sjoerd Meijer [Thu, 18 Feb 2021 14:26:09 +0000 (14:26 +0000)]
[AArch64] Add some missing Neoverse features

This enables AES fusion and the post RA scheduler for the Neoverse cores.
While we are at it, also enable them for the A55, which we had missed earlier.

Differential Revision: https://reviews.llvm.org/D96866

3 years ago[llvm-exegesis] Ignore instructions using custom inserter
Qiu Chaofan [Fri, 19 Feb 2021 09:04:27 +0000 (17:04 +0800)]
[llvm-exegesis] Ignore instructions using custom inserter

Some instructions defined in table-gen files set the usesCustomInserter
bit, which means they have to be lowered by target code and are not actually
valid instructions at the MC level. So we should treat them like pseudo
instructions.

Reviewed By: gchatelet

Differential Revision: https://reviews.llvm.org/D94898

3 years ago[llvm-exegesis] [PowerPC] Add basic LIT test
Qiu Chaofan [Fri, 19 Feb 2021 08:52:45 +0000 (16:52 +0800)]
[llvm-exegesis] [PowerPC] Add basic LIT test

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D94897

3 years ago[debuginfo-tests] Recommit test sret.cpp
OCHyams [Fri, 19 Feb 2021 08:44:32 +0000 (08:44 +0000)]
[debuginfo-tests] Recommit test sret.cpp

This test was accidentally removed when the directory structure was shuffled
around for dexter in f78c236efda8.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D96968

3 years ago[NPM][LTO] Do not enable MemorySSA with LoopFullUnrollPass
David Green [Fri, 19 Feb 2021 08:35:11 +0000 (08:35 +0000)]
[NPM][LTO] Do not enable MemorySSA with LoopFullUnrollPass

As with the standard opt pipeline, we disable the MemorySSA dependency
in the LTO LPM pipeline as not all passes preserve MemorySSA.

3 years ago[mlir] Better support for rank-reducing subview / subtensor type inference.
Nicolas Vasilache [Thu, 18 Feb 2021 22:03:02 +0000 (22:03 +0000)]
[mlir] Better support for rank-reducing subview / subtensor type inference.

Differential Revision: https://reviews.llvm.org/D96995

3 years agoReland "[Debugify] Make the debugify aware of the original (-g) Debug Info"
Djordje Todorovic [Thu, 18 Feb 2021 17:49:44 +0000 (09:49 -0800)]
Reland "[Debugify] Make the debugify aware of the original (-g) Debug Info"

    As discussed on the RFC [0], I am sharing the set of patches that
    enables checking of original Debug Info metadata preservation in
    optimizations. The proof-of-concept/proposal can be found at [1].

    The implementation from [1] was full of duplicated code,
    so this set of patches tries to merge this approach into the existing
    debugify utility.

    For example, the utility pass in the original-debuginfo-check
    mode could be invoked as follows:

      $ opt -verify-debuginfo-preserve -pass-to-test sample.ll

    Since this is a very early stage of the implementation,
    there is room for improvements such as:
      - Add support for the new pass manager
      - Add support for metadata other than DILocations and DISubprograms

    [0] https://groups.google.com/forum/#!msg/llvm-dev/QOyF-38YPlE/G213uiuwCAAJ
    [1] https://github.com/djolertrk/llvm-di-checker

    Differential Revision: https://reviews.llvm.org/D82545

The test that was failing is now forced to use the old PM.

3 years agoLanguageRuntime can provide an UnwindPlan for special occasions
Jason Molenda [Fri, 19 Feb 2021 07:20:15 +0000 (23:20 -0800)]
LanguageRuntime can provide an UnwindPlan for special occasions

Add a facility in the LanguageRuntime to provide a special
UnwindPlan based on the register values in a RegisterContext,
instead of using the return-pc to find a function and use its
normal UnwindPlans.

Needed when the runtime has special stack frames that we want
to show the user, but that aren't actually on the real stack.
Specifically for Swift asynchronous functions.

With feedback from Greg Clayton, Jonas Devlieghere, Dave Lee

<rdar://problem/70398009>

Differential Revision: https://reviews.llvm.org/D96839

3 years ago[mlir][sparse] assert fail on mismatch between rank and annotations array
Aart Bik [Fri, 19 Feb 2021 06:01:39 +0000 (22:01 -0800)]
[mlir][sparse] assert fail on mismatch between rank and annotations array

Rationale:
Providing the wrong number of sparse/dense annotations was silently
ignored or caused unrelated crashes. This minor change verifies that
the provided number matches the rank.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D97034

3 years ago[RISCV] Prune unneeded indexed load/store pseudo instructions.
Craig Topper [Fri, 19 Feb 2021 07:00:18 +0000 (23:00 -0800)]
[RISCV] Prune unneeded indexed load/store pseudo instructions.

We were creating more combinations of value and index lmul than
we needed.

I've copied the loop structure used here from VPseudoAMOEI with
all data sew values instead of just 32/64.

Something similar can be done for segment loads/stores.

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D97008

3 years ago[CodeGen] Use range-based for loops (NFC)
Kazu Hirata [Fri, 19 Feb 2021 06:46:43 +0000 (22:46 -0800)]
[CodeGen] Use range-based for loops (NFC)
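
As a rough illustration of this kind of NFC cleanup, the sketch below shows the same loop written with explicit iterators and with a range-based for loop. The function names and the use of std::vector are assumptions made for the example; they are not taken from the CodeGen change itself.

```cpp
#include <vector>

// Hypothetical helper: iterator-style loop, as such code often looked before.
inline int sumWithIterators(const std::vector<int> &Values) {
  int Sum = 0;
  for (std::vector<int>::const_iterator I = Values.begin(), E = Values.end();
       I != E; ++I)
    Sum += *I;
  return Sum;
}

// Hypothetical helper: the same traversal as a range-based for loop.
// Behaviour is unchanged; only the loop header is simpler.
inline int sumWithRangeFor(const std::vector<int> &Values) {
  int Sum = 0;
  for (int V : Values)
    Sum += V;
  return Sum;
}
```

Both helpers visit every element once and in the same order, which is why a rewrite like this can be marked NFC.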

3 years ago[Support] Use static_assert instead of assert (NFC)
Kazu Hirata [Fri, 19 Feb 2021 06:46:41 +0000 (22:46 -0800)]
[Support] Use static_assert instead of assert (NFC)

Identified with misc-static-assert.
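
A minimal sketch of the kind of change this check suggests, assuming a hypothetical helper that is not taken from llvm/Support: a condition that depends only on compile-time constants moves from assert to static_assert, while a condition on a run-time value stays a plain assert.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only.
inline std::uint32_t lowHalf(std::uint64_t Value) {
  // Before: assert(sizeof(std::uint64_t) == 2 * sizeof(std::uint32_t));
  // The condition is a compile-time constant, so the compiler can check it,
  // and the check is not compiled out when NDEBUG is defined.
  static_assert(sizeof(std::uint64_t) == 2 * sizeof(std::uint32_t),
                "expected a 64-bit value to split into two 32-bit halves");

  // This condition depends on a run-time value, so it remains an assert.
  assert(Value != 0 && "hypothetical precondition for illustration");

  return static_cast<std::uint32_t>(Value & 0xFFFFFFFFu);
}
```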

3 years ago[TableGen] Use ListSeparator (NFC)
Kazu Hirata [Fri, 19 Feb 2021 06:46:39 +0000 (22:46 -0800)]
[TableGen] Use ListSeparator (NFC)

3 years ago[mlir] Load dynamic libraries in JitRunner from absolute paths so that GDB can find...
Christian Sigg [Thu, 18 Feb 2021 21:11:20 +0000 (22:11 +0100)]
[mlir] Load dynamic libraries in JitRunner from absolute paths so that GDB can find the symbol tables.

Reviewed By: mehdi_amini, ftynse

Differential Revision: https://reviews.llvm.org/D96759

3 years ago[FPEnv][AArch64] Implement lowering of llvm.set.rounding
Serge Pavlov [Tue, 16 Feb 2021 13:14:21 +0000 (20:14 +0700)]
[FPEnv][AArch64] Implement lowering of llvm.set.rounding

Differential Revision: https://reviews.llvm.org/D96836

3 years ago[libc++] shared_ptr deleter requirements (LWG 2802).
zoecarver [Fri, 19 Feb 2021 05:31:07 +0000 (21:31 -0800)]
[libc++] shared_ptr deleter requirements (LWG 2802).

This patch implements LWG 2802, which requires _Deleter to have a call operator and be move constructible. Based on D62233.
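
The two requirements can be sketched with a small user-defined deleter. CountingDeleter and deleterCallsAfterScopeExit are hypothetical names made up for this example; they do not appear in the patch or in libc++.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical deleter: it has a call operator taking the stored pointer and
// is move constructible, which are the two properties named above.
struct CountingDeleter {
  int *Calls; // counter owned by the caller
  void operator()(int *P) const {
    ++*Calls;
    delete P;
  }
};

static_assert(std::is_move_constructible<CountingDeleter>::value,
              "the deleter must be move constructible");

// Returns how often the deleter ran once the only owner went out of scope.
inline int deleterCallsAfterScopeExit() {
  int Calls = 0;
  {
    std::shared_ptr<int> P(new int(42), CountingDeleter{&Calls});
  } // last owner destroyed here, so the deleter is invoked
  return Calls;
}
```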

Refs PR37637.

Differential Revision: https://reviews.llvm.org/D62274

3 years agoMark 2534 as Complete.
zoecarver [Fri, 19 Feb 2021 05:28:49 +0000 (21:28 -0800)]
Mark 2534 as Complete.

c90dee1 fixed LWG 1203 which suppresses LWG 2534 as well.

Refs D62889.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D96885

3 years ago[HIP] Support device sanitizer
Yaxun (Sam) Liu [Tue, 16 Feb 2021 18:43:03 +0000 (13:43 -0500)]
[HIP] Support device sanitizer

Add option -fgpu-sanitize to enable sanitizer for AMDGPU target.

Since it is experimental, it is off by default.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D96835

3 years ago[ORC] Print CPU feature string in JITTargetMachineBuilder debugging output.
Lang Hames [Fri, 19 Feb 2021 03:36:15 +0000 (14:36 +1100)]
[ORC] Print CPU feature string in JITTargetMachineBuilder debugging output.

3 years ago[RISCV] Remove redundant test cases for index segment store (8/8).
Hsiangkai Wang [Fri, 19 Feb 2021 03:10:01 +0000 (11:10 +0800)]
[RISCV] Remove redundant test cases for index segment store (8/8).

Differential Revision: https://reviews.llvm.org/D97026

3 years ago[RISCV] Remove redundant test cases for index segment store (7/8).
Hsiangkai Wang [Fri, 19 Feb 2021 03:09:55 +0000 (11:09 +0800)]
[RISCV] Remove redundant test cases for index segment store (7/8).

Differential Revision: https://reviews.llvm.org/D97025

3 years ago[RISCV] Remove redundant test cases for index segment store (6/8).
Hsiangkai Wang [Fri, 19 Feb 2021 03:09:49 +0000 (11:09 +0800)]
[RISCV] Remove redundant test cases for index segment store (6/8).

Differential Revision: https://reviews.llvm.org/D97024

3 years ago[RISCV] Remove redundant test cases for index segment store (5/8).
Hsiangkai Wang [Fri, 19 Feb 2021 03:08:01 +0000 (11:08 +0800)]
[RISCV] Remove redundant test cases for index segment store (5/8).

Differential Revision: https://reviews.llvm.org/D97023

3 years ago[RISCV] Remove redundant test cases for index segment load (4/8).
Hsiangkai Wang [Fri, 19 Feb 2021 03:07:52 +0000 (11:07 +0800)]
[RISCV] Remove redundant test cases for index segment load (4/8).

3 years ago[RISCV] Remove redundant test cases for index segment load (3/8).
Hsiangkai Wang [Fri, 19 Feb 2021 03:07:42 +0000 (11:07 +0800)]
[RISCV] Remove redundant test cases for index segment load (3/8).

Differential Revision: https://reviews.llvm.org/D97022

3 years ago[RISCV] Remove redundant test cases for index segment load (2/8).
Hsiangkai Wang [Fri, 19 Feb 2021 03:07:36 +0000 (11:07 +0800)]
[RISCV] Remove redundant test cases for index segment load (2/8).

3 years ago[RISCV] Remove redundant test cases for index segment load (1/8).
Hsiangkai Wang [Fri, 19 Feb 2021 03:07:25 +0000 (11:07 +0800)]
[RISCV] Remove redundant test cases for index segment load (1/8).

Differential Revision: https://reviews.llvm.org/D97020

3 years ago[Coroutine] Relax CoroElide musttail check
Xun Li [Fri, 19 Feb 2021 03:36:11 +0000 (19:36 -0800)]
[Coroutine] Relax CoroElide musttail check

As discussed in D94834, we don't really need to do complicated analysis. It's safe to just drop the tail call attribute.

Differential Revision: https://reviews.llvm.org/D96926

3 years ago[RISCV] Split zvlsseg searchable table into 4 separate tables. Index by properties...
Craig Topper [Fri, 19 Feb 2021 03:00:48 +0000 (19:00 -0800)]
[RISCV] Split zvlsseg searchable table into 4 separate tables. Index by properties rather than intrinsic ID.

The intrinsic ID is a 32-bit value, which made each row of the table
4-byte aligned. The remaining fields used 5 bytes. This meant 3 bytes
of padding per row.

This patch breaks the table into 4 separate tables and indexes them
by properties we know about the intrinsic: NF, masked,
strided, ordered, etc. The indexed load/store tables have no
padding in their rows now.
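
The padding arithmetic above can be sketched as follows. The struct and field names are assumptions for illustration and do not match the actual table rows in the RISC-V backend; the sizes hold on typical targets where a 32-bit integer is 4-byte aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical row keyed by a 32-bit intrinsic ID.
struct RowKeyedByID {
  std::uint32_t IntrinsicID; // forces 4-byte alignment of the whole row
  std::uint8_t Fields[5];    // stands in for the remaining 5 bytes of fields
};

// Hypothetical row made only of byte-sized fields.
struct RowKeyedByProperties {
  std::uint8_t Fields[5]; // 1-byte alignment, so no tail padding is needed
};

// Bytes in a row that carry no payload.
constexpr std::size_t paddingBytes(std::size_t RowSize,
                                   std::size_t PayloadSize) {
  return RowSize - PayloadSize;
}
```

With 4 + 5 = 9 payload bytes and 4-byte alignment, the first row is rounded up to 12 bytes, which gives the 3 bytes of padding per row mentioned above; the second row needs none.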

Altogether, this reduces the size of the llc binary by ~28K.

I'm considering adding similar tables for isel of non-segment
load/store as well to cut down the size of the isel table and
probably improve our isel performance. Those tables would need to be
indexed from intrinsics, IR loads/stores, gathers/scatters, and
RISCVISD opcodes. So having a table that can be indexed without using
intrinsic ID is more flexible.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D96894