platform/upstream/llvm.git
3 years ago[Coroutine] Update promise object's final layout index
Yuanfang Chen [Wed, 13 Jan 2021 01:42:10 +0000 (17:42 -0800)]
[Coroutine] Update promise object's final layout index

The promise is a header field, but it is not guaranteed to be the third
field of the frame because of `performOptimizedStructLayout`.

Reviewed By: lxfind

Differential Revision: https://reviews.llvm.org/D94137

3 years ago[X86][AMX] Prohibit pointer cast on load.
Luo, Yuanke [Sun, 10 Jan 2021 06:06:18 +0000 (14:06 +0800)]
[X86][AMX] Prohibit pointer cast on load.

The load/store instructions are transformed to AMX intrinsics in the
AMX type lowering pass. Prohibiting the pointer cast makes that pass
happy.

Differential Revision: https://reviews.llvm.org/D94372

3 years ago[mlir][Python] Add checking process before create an AffineMap from a permutation.
zhanghb97 [Tue, 12 Jan 2021 13:40:27 +0000 (21:40 +0800)]
[mlir][Python] Add checking process before create an AffineMap from a permutation.

An invalid permutation will trigger a C++ assertion when attempting to create an AffineMap from the permutation.
This patch adds an `isPermutation` function to check the given permutation before creating the AffineMap.
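
As a rough illustration of the kind of check described above, here is a minimal Python sketch. It is not the code from the patch: the name `is_permutation` and the exact validity rule (each integer from 0 to n-1 appears exactly once) are assumptions made for the example.

```python
def is_permutation(values):
    """Return True if `values` holds each integer 0..n-1 exactly once.

    Hypothetical helper for illustration; the real check lives in the
    MLIR Python bindings and may differ in detail.
    """
    n = len(values)
    seen = [False] * n
    for v in values:
        # Reject out-of-range entries and repeated entries.
        if not 0 <= v < n or seen[v]:
            return False
        seen[v] = True
    return True
```

Running such a check first lets the binding raise a normal error for input like `[0, 0, 1]` instead of reaching the C++ assertion.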

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D94492

3 years ago[gn build] (manually) port 79f99ba65d96
Nico Weber [Wed, 13 Jan 2021 01:30:56 +0000 (20:30 -0500)]
[gn build] (manually) port 79f99ba65d96

3 years ago[MSan] Tweak CopyOrigin
Jianzhou Zhao [Tue, 12 Jan 2021 21:49:59 +0000 (21:49 +0000)]
[MSan] Tweak CopyOrigin

There could be some misalignment when copying origins that are not aligned.

I believe unaligned memcpy is rare, so these cases do not matter much
in practice.

1) About the change at line 50

Let dst be (void*)5,
then d=5, beg=4
so we need to write 3 (4+4-5) bytes from 5 to 7.

2) About the change around line 77.

Let dst be (void*)5,
because of lines 50-55, the bytes from 5-7 were already written.
So the aligned copy is from 8.
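
The arithmetic in the two cases above can be replayed with a small Python sketch. This is only an illustration of the worked example, not the MSan runtime code: the function name and the 4-byte origin granularity constant are assumptions read off the numbers in the message.

```python
ORIGIN_ALIGN = 4  # assumption: one origin per 4-byte aligned word, as in the example


def leading_unaligned_span(dst):
    """Return (beg, leading_bytes, aligned_start) for a destination address.

    beg            aligned-down address of the word containing dst
    leading_bytes  bytes from dst up to the next aligned boundary
    aligned_start  first address handled by the aligned copy
    """
    beg = dst & ~(ORIGIN_ALIGN - 1)
    if beg == dst:
        # Already aligned: nothing to write before the aligned copy.
        return beg, 0, dst
    leading_bytes = beg + ORIGIN_ALIGN - dst
    return beg, leading_bytes, beg + ORIGIN_ALIGN
```

For `dst = 5` this gives `beg = 4`, three leading bytes (addresses 5 to 7), and an aligned copy that starts at 8, matching the description.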

Reviewed-by: eugenis
Differential Revision: https://reviews.llvm.org/D94552

3 years ago[DAGCombiner] Fold BRCOND(FREEZE(COND)) to BRCOND(COND)
Juneyoung Lee [Wed, 13 Jan 2021 00:33:21 +0000 (09:33 +0900)]
[DAGCombiner] Fold BRCOND(FREEZE(COND)) to BRCOND(COND)

This patch resolves the suboptimal codegen described in http://llvm.org/pr47873 .
When CodeGenPrepare lowers select into a conditional branch, a freeze instruction is inserted.
It is then translated to `BRCOND(FREEZE(SETCC))` in SelDag.
The `FREEZE` between `SETCC` and `BRCOND` was causing suboptimal code generation, however.
This patch adds `BRCOND(FREEZE(cond))` -> `BRCOND(cond)` fold to DAGCombiner to remove the `FREEZE`.

To make this optimization sound, `BRCOND(UNDEF)` simply should nondeterministically jump to the branch or not, rather than raising UB.
It wasn't clear what happens when the condition was undef according to the comments in ISDOpcodes.h, however.
I updated the comments of `BRCOND` to make it explicit (as well as `BR_CC`, which is also a conditional branch instruction).

Note that it diverges from the semantics of `br` instruction in IR, which is explicitly UB.
Since the UB semantics was necessary to explain optimizations that use branching conditions, and SelDag doesn't seem to have such optimization, I think this divergence is okay.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D92015

3 years ago[LangRef] State that a nocapture pointer cannot be returned
Juneyoung Lee [Mon, 11 Jan 2021 05:42:08 +0000 (14:42 +0900)]
[LangRef] State that a nocapture pointer cannot be returned

This is a small patch stating that a nocapture pointer cannot be returned.

Discussed in D93189.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94386

3 years ago[libc][NFC] Use more specific comparison macros in LdExpTest.h.
Siva Chandra Reddy [Wed, 13 Jan 2021 00:11:28 +0000 (16:11 -0800)]
[libc][NFC] Use more specific comparison macros in LdExpTest.h.

3 years ago[libc] add isascii and toascii implementations
Michael Jones [Tue, 12 Jan 2021 22:37:56 +0000 (22:37 +0000)]
[libc] add isascii and toascii implementations

adding both at once since these are trivial functions.
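
For readers unfamiliar with these two functions, the following Python sketch shows their conventional behavior: `isascii` tests whether a value fits in 7 bits, and `toascii` clears everything above the low 7 bits. It models the semantics only and is not the libc source from this patch.

```python
def isascii(c):
    """Return 1 if c is in the 7-bit ASCII range 0..127, else 0."""
    return 1 if (c & ~0x7F) == 0 else 0


def toascii(c):
    """Keep only the low 7 bits of c."""
    return c & 0x7F
```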

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D94558

3 years ago[AMDGPU] Add _e64 suffix to VOP3 Insts
Joe Nash [Thu, 7 Jan 2021 18:56:02 +0000 (13:56 -0500)]
[AMDGPU] Add _e64 suffix to VOP3 Insts

Previously, instructions which could be expressed as VOP3 in addition to
another encoding had a _e64 suffix on the tablegen record name, while those
only available as VOP3 did not. With this patch, all VOP3s will have the
_e64 suffix. The assembly does not change, only the MIR.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D94341

Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423

3 years agoDelete unused function (was breaking the -Werror build)
David Blaikie [Tue, 12 Jan 2021 23:29:44 +0000 (15:29 -0800)]
Delete unused function (was breaking the -Werror build)

3 years ago[NFC] Disallow unused prefixes under MC/AMDGPU
Mircea Trofin [Tue, 12 Jan 2021 22:31:58 +0000 (14:31 -0800)]
[NFC] Disallow unused prefixes under MC/AMDGPU

This patches the remaining tests and updates lit.local.cfg to block future
such cases (until we flip FileCheck's flag).

Differential Revision: https://reviews.llvm.org/D94556

3 years ago[Sanitizer][Darwin] Fix test for macOS 11+ point releases
Julian Lettner [Tue, 12 Jan 2021 23:01:18 +0000 (15:01 -0800)]
[Sanitizer][Darwin] Fix test for macOS 11+ point releases

This test wrongly asserted that the minor version is always 0 when
running on macOS 11 and above.

3 years ago[MIPatternMatch] Add matcher for G_PTR_ADD
Jessica Paquette [Fri, 8 Jan 2021 23:06:13 +0000 (15:06 -0800)]
[MIPatternMatch] Add matcher for G_PTR_ADD

Add a matcher which recognizes G_PTR_ADD and add a test.

Differential Revision: https://reviews.llvm.org/D94348

3 years agoAdd sample-profile-suffix-elision-policy attribute with -funique-internal-linkage...
Hongtao Yu [Wed, 23 Dec 2020 06:43:22 +0000 (22:43 -0800)]
Add sample-profile-suffix-elision-policy attribute with -funique-internal-linkage-names.

Add the sample-profile-suffix-elision-policy attribute to functions whose linkage names are uniquified, so that their unique name suffix won't be trimmed when applying AutoFDO profiles.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D94455

3 years ago[ELF][NFCI] small cleanup to OutputSections.h
Bob Haarman [Tue, 12 Jan 2021 02:08:01 +0000 (02:08 +0000)]
[ELF][NFCI] small cleanup to OutputSections.h

OutputSections.h used to close the lld::elf namespace only to
immediately open it again. This change merges both parts into
one.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D94538

3 years ago[RISCV] Remove '.mask' from vcompress intrinsic name. NFC
Craig Topper [Tue, 12 Jan 2021 22:37:28 +0000 (14:37 -0800)]
[RISCV] Remove '.mask' from vcompress intrinsic name. NFC

It has a mask argument, but isn't a masked instruction. It doesn't
use the mask policy or the v0.t syntax.

3 years ago[ADT][NFC] Use empty base optimisation in BumpPtrAllocatorImpl
Nathan James [Tue, 12 Jan 2021 22:43:48 +0000 (22:43 +0000)]
[ADT][NFC] Use empty base optimisation in BumpPtrAllocatorImpl

Most uses of this class just use the default MallocAllocator.
As this contains no fields, we can use the empty base optimisation for BumpPtrAllocatorImpl and save 8 bytes of padding for most use cases.

This prevents using a class that is marked as `final` as the `AllocatorT` template argument.
If one must use an allocator that has been marked as `final`, the simplest way around this is a proxy class.
The class should have all the methods that `AllocatorBase` expects and should forward the calls to your own allocator instance.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D94439

3 years ago[NFC] Disallow unused prefixes in MC/AMDGPU
Mircea Trofin [Tue, 12 Jan 2021 22:06:30 +0000 (14:06 -0800)]
[NFC] Disallow unused prefixes in MC/AMDGPU

1 out of 2 patches.

Differential Revision: https://reviews.llvm.org/D94553

3 years ago[Driver] Fix assertion failure when -fprofile-generate -fcs-profile-generate are...
Fangrui Song [Tue, 12 Jan 2021 22:19:55 +0000 (14:19 -0800)]
[Driver] Fix assertion failure when -fprofile-generate -fcs-profile-generate are used together

If conflicting `-fprofile-generate -fcs-profile-generate` are used together,
there is currently an assertion failure. Fix the failure.

Also add some driver tests.

Reviewed By: xur

Differential Revision: https://reviews.llvm.org/D94463

3 years agoAMDGPU: Remove wrapper only call limitation
Matt Arsenault [Wed, 6 Jan 2021 19:04:19 +0000 (14:04 -0500)]
AMDGPU: Remove wrapper only call limitation

This seems to only have overridden cold handling, which we probably
shouldn't do. As far as I can tell the wrapper library functions are
still inlined as appropriate.

3 years ago[OpenMP] Fixed a typo in openmp/CMakeLists.txt
Shilei Tian [Tue, 12 Jan 2021 22:00:49 +0000 (17:00 -0500)]
[OpenMP] Fixed a typo in openmp/CMakeLists.txt

3 years ago[libcxx] Avoid overflows in the windows __libcpp_steady_clock_now()
Martin Storsjö [Thu, 17 Dec 2020 13:40:06 +0000 (15:40 +0200)]
[libcxx] Avoid overflows in the windows __libcpp_steady_clock_now()

As freq.QuadPart can be in the range of 10000000 to 19200000,
the multiplication before division makes the calculation overflow
and wrap to negative values every 16-30 minutes.

Instead count the whole seconds separately before adding the
scaled fractional seconds.
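
The overflow and the fix can be illustrated with a short Python sketch that emulates signed 64-bit wraparound. The helper names are hypothetical and the code only models the arithmetic described above; it is not the libc++ implementation.

```python
NANOS_PER_SEC = 1_000_000_000


def wrap_int64(x):
    """Emulate two's-complement wraparound of a signed 64-bit integer."""
    x &= (1 << 64) - 1
    return x - (1 << 64) if x >= (1 << 63) else x


def trunc_div(a, b):
    """Integer division that truncates toward zero, as in C++."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def naive_now_ns(counter, freq):
    """Multiply before dividing: the product can exceed 64 bits and wrap."""
    return trunc_div(wrap_int64(counter * NANOS_PER_SEC), freq)


def split_now_ns(counter, freq):
    """Count whole seconds first, then add the scaled fractional part."""
    seconds = counter // freq
    fraction = counter % freq
    # fraction < freq, so fraction * NANOS_PER_SEC stays well inside 64 bits
    # for frequencies in the range quoted above.
    return seconds * NANOS_PER_SEC + fraction * NANOS_PER_SEC // freq
```

With a 10 MHz counter that has been running for 1000 seconds, `naive_now_ns` goes negative while `split_now_ns` returns the expected value; for small counter values the two agree.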

Add a testcase for steady_clock to check that the values returned for
now() compare as bigger than the zero time origin; this
corresponds to a testcase in Qt [1] [2] (that failed spuriously
due to this).

[1] https://bugreports.qt.io/browse/QTBUG-89539
[2] https://code.qt.io/cgit/qt/qtbase.git/tree/tests/auto/corelib/kernel/qdeadlinetimer/tst_qdeadlinetimer.cpp?id=f8de5e54022b8b7471131b7ad55c83b69b2684c0#n569

Differential Revision: https://reviews.llvm.org/D93456

3 years ago[AArch64] [Windows] Properly add :lo12: reloc specifiers when generating assembly
Martin Storsjö [Fri, 11 Dec 2020 10:42:07 +0000 (12:42 +0200)]
[AArch64] [Windows] Properly add :lo12: reloc specifiers when generating assembly

This makes sure that assembly output actually can be assembled.

Set the correct MCExpr relocations specifier VK_PAGEOFF - and also
set VK_PAGE consistently even though it's not visible in the assembly
output.

Differential Revision: https://reviews.llvm.org/D94365

3 years ago[OpenMP] Fixed the link error that cannot find static data member
Shilei Tian [Tue, 12 Jan 2021 21:48:19 +0000 (16:48 -0500)]
[OpenMP] Fixed the link error that cannot find static data member

A constant static data member can be defined inside the class without a
separate definition after the class in C++17. Although this is a C++17 feature,
Clang can still handle it even without the C++17 flag. Unfortunately, GCC cannot.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D94541

3 years ago[Inliner] Change inline remark format and update ReplayInlineAdvisor to use it
modimo [Tue, 12 Jan 2021 21:19:30 +0000 (13:19 -0800)]
[Inliner] Change inline remark format and update ReplayInlineAdvisor to use it

This change modifies the source location formatting from:
LineNumber.Discriminator
to:
LineNumber:ColumnNumber.Discriminator
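
To make the new layout concrete, here is a small Python sketch that parses just the location component. The function name and the strict all-three-fields pattern are assumptions for illustration; the actual remark text and the ReplayInlineAdvisor parser contain more than this.

```python
import re

# Matches only the new layout: LineNumber:ColumnNumber.Discriminator
_LOCATION = re.compile(r"(\d+):(\d+)\.(\d+)")


def parse_location(text):
    """Return (line, column, discriminator), or None if text is not in the new layout."""
    match = _LOCATION.fullmatch(text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())
```

A string in the old `LineNumber.Discriminator` layout, such as `12.3`, is rejected by this sketch because it has no column.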

The motivation here is to enhance location information for inline replay that currently exists for the SampleProfile inliner. This will be leveraged further in inline replay for the CGSCC inliner in the related diff.

The ReplayInlineAdvisor is also modified to read the new format and now takes into account the callee for greater accuracy.

Testing:
ninja check-llvm

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D94333

3 years ago[mlir] Update LLVM dialect type documentation
Alex Zinenko [Tue, 12 Jan 2021 11:07:12 +0000 (12:07 +0100)]
[mlir] Update LLVM dialect type documentation

Recent commits reconfigured LLVM dialect types to use built-in types whenever
possible. Update the documentation accordingly.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D94485

3 years ago[InstCombine] Handle logical and/or in assume optimization
Nikita Popov [Tue, 12 Jan 2021 21:35:19 +0000 (22:35 +0100)]
[InstCombine] Handle logical and/or in assume optimization

assume(a && b) can be converted to assume(a); assume(b) even if
the condition is logical. Same for assume(!(a || b)).
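
The splitting rule can be sketched on a toy expression tree in Python. The tuple encoding and the function name are invented for this example and say nothing about how InstCombine represents or traverses conditions; the sketch only shows which separate assumptions a compound condition yields.

```python
def split_assume(cond):
    """Split an assumed condition into a list of simpler assumed conditions.

    Conditions are variable names (str) or tuples:
    ("and", a, b), ("or", a, b), ("not", x). Hypothetical encoding.
    """
    if isinstance(cond, tuple):
        if cond[0] == "and":
            # assume(a && b)  ->  assume(a); assume(b)
            return split_assume(cond[1]) + split_assume(cond[2])
        if cond[0] == "not" and isinstance(cond[1], tuple) and cond[1][0] == "or":
            # assume(!(a || b))  ->  assume(!a); assume(!b)
            _, a, b = cond[1]
            return split_assume(("not", a)) + split_assume(("not", b))
    return [cond]
```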

3 years ago[RISCV] Legalize select when Zbt extension available
Michael Munday [Tue, 12 Jan 2021 21:22:34 +0000 (21:22 +0000)]
[RISCV] Legalize select when Zbt extension available

The custom expansion of select operations in the RISC-V backend
interferes with the matching of cmov instructions. Legalizing
select when the Zbt extension is available solves that problem.

Reviewed By: lenary, craig.topper

Differential Revision: https://reviews.llvm.org/D93767

3 years ago[InstCombine] Add tests for logical and/or poison implication (NFC)
Nikita Popov [Tue, 12 Jan 2021 21:15:54 +0000 (22:15 +0100)]
[InstCombine] Add tests for logical and/or poison implication (NFC)

These tests cover some cases where we can fold select to and/or
based on poison implication logic.

3 years ago[RISCV] Add double test cases to vfmerge-rv32.ll. NFC
Craig Topper [Tue, 12 Jan 2021 21:08:58 +0000 (13:08 -0800)]
[RISCV] Add double test cases to vfmerge-rv32.ll. NFC

3 years ago[SLP] reduce code duplication while processing reductions; NFC
Sanjay Patel [Tue, 12 Jan 2021 20:07:01 +0000 (15:07 -0500)]
[SLP] reduce code duplication while processing reductions; NFC

3 years ago[SLP] rename variable to improve readability; NFC
Sanjay Patel [Tue, 12 Jan 2021 19:55:09 +0000 (14:55 -0500)]
[SLP] rename variable to improve readability; NFC

The OperationData in the 2nd block (visiting the operands)
is completely independent of the 1st block.

3 years ago[SLP] reduce code duplication in processing reductions; NFC
Sanjay Patel [Tue, 12 Jan 2021 18:53:18 +0000 (13:53 -0500)]
[SLP] reduce code duplication in processing reductions; NFC

3 years ago[SLP] reduce code duplication while matching reductions; NFC
Sanjay Patel [Tue, 12 Jan 2021 18:45:32 +0000 (13:45 -0500)]
[SLP] reduce code duplication while matching reductions; NFC

3 years ago[LV] Weaken spuriously strong assert in LoopVersioning
Philip Reames [Tue, 12 Jan 2021 20:54:07 +0000 (12:54 -0800)]
[LV] Weaken spuriously strong assert in LoopVersioning

LoopVectorize uses some utilities on LoopVersioning, but doesn't actually use it for, you know, versioning.  As a result, the precondition LoopVersioning expects is too strong for this user.  At the moment, LoopVectorize supports any loop with a unique exit block, so check the same precondition here.

Really, the whole class structure here is a mess.  We should separate the actual versioning from the metadata updates, but that's a bigger problem.

3 years ago[InstCombine] Duplicate tests for logical and/or (NFC)
Nikita Popov [Tue, 12 Jan 2021 19:54:23 +0000 (20:54 +0100)]
[InstCombine] Duplicate tests for logical and/or (NFC)

This replicates existing and/or tests to also test variants using
select. This should help us get a more accurate view on which
optimizations we're missing if we disable the select -> and/or
fold.

3 years agoFix for crash in __builtin_return_address in template context.
Sunil Srivastava [Tue, 12 Jan 2021 20:37:18 +0000 (12:37 -0800)]
Fix for crash in __builtin_return_address in template context.

The check for argument value needs to be guarded by !isValueDependent().

Differential Revision: https://reviews.llvm.org/D94438

3 years ago[LV] Relax assumption that LCSSA implies single entry
Philip Reames [Tue, 12 Jan 2021 20:32:24 +0000 (12:32 -0800)]
[LV] Relax assumption that LCSSA implies single entry

This relates to the ongoing effort to support vectorization of multiple exit loops (see D93317).

The previous code assumed that LCSSA phis were always single entry before the vectorizer ran. This was correct, but only because the vectorizer allowed only a single exiting edge. There's nothing in the definition of LCSSA which requires single entry phis.

A common case where this comes up is with a loop with multiple exiting blocks which all reach a common exit block. (e.g. see the test updates)

Differential Revision: https://reviews.llvm.org/D93725

3 years ago[InstCombine] Regenerate test checks (NFC)
Nikita Popov [Tue, 12 Jan 2021 20:21:22 +0000 (21:21 +0100)]
[InstCombine] Regenerate test checks (NFC)

3 years ago[clang-tidy] Add test for Transformer-based checks with diagnostics.
Yitzhak Mandelbaum [Mon, 11 Jan 2021 22:28:17 +0000 (22:28 +0000)]
[clang-tidy] Add test for Transformer-based checks with diagnostics.

Adds a test that checks the diagnostic output of the tidy.

Differential Revision: https://reviews.llvm.org/D94453

3 years ago[IR] move nomerge attribute from function declaration/definition to callsites
Zequan Wu [Tue, 12 Jan 2021 19:22:31 +0000 (11:22 -0800)]
[IR] move nomerge attribute from function declaration/definition to callsites

Move the nomerge attribute from function declarations/definitions to callsites to
allow virtual function calls to carry the attribute.

Differential Revision: https://reviews.llvm.org/D94537

3 years ago[FunctionAttrs] Derive willreturn for fns with `readonly` & `mustprogress`.
Florian Hahn [Tue, 12 Jan 2021 19:55:17 +0000 (19:55 +0000)]
[FunctionAttrs] Derive willreturn for fns with `readonly` & `mustprogress`.

Similar to D94125, derive `willreturn` for functions that are `readonly` and
`mustprogress` in FunctionAttrs.

To quote the reasoning from D94125:

    Since D86233 we have `mustprogress` which, in combination with
    `readonly`, implies `willreturn`. The idea is that every side-effect
    has to be modeled as a "write". Consequently, `readonly` means there
    is no side-effect, and `mustprogress` guarantees that we cannot "loop"
    forever without side-effect.

Reviewed By: jdoerfert, nikic

Differential Revision: https://reviews.llvm.org/D94502

3 years ago[clang][aarch64] Precondition isHomogeneousAggregate on isCXX14Aggregate
David Truby [Thu, 3 Dec 2020 11:25:57 +0000 (11:25 +0000)]
[clang][aarch64] Precondition isHomogeneousAggregate on isCXX14Aggregate

MSVC on WoA64 includes isCXX14Aggregate in its definition. This is the de facto
specification on that platform, so match MSVC's behaviour.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=47611

Co-authored-by: Peter Waller <peter.waller@arm.com>
Differential Revision: https://reviews.llvm.org/D92751

3 years ago[libomptarget][amdgpu][nfc] Fix build on centos
Jon Chesterfield [Tue, 12 Jan 2021 19:40:02 +0000 (19:40 +0000)]
[libomptarget][amdgpu][nfc] Fix build on centos

rtl.cpp replaced 224 with a #define from elf.h, but that
doesn't work on a centos 7 build machine with an old elf.h

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D94528

3 years ago[OpenMP] Fixed include directories for OpenMP when building OpenMP with LLVM_ENABLE_R...
Shilei Tian [Tue, 12 Jan 2021 19:32:27 +0000 (14:32 -0500)]
[OpenMP] Fixed include directories for OpenMP when building OpenMP with LLVM_ENABLE_RUNTIMES

Some LLVM headers are generated by CMake. Before installation,
LLVM's headers are spread across several locations: some are in
`${LLVM_SRC_ROOT}/llvm/include/llvm`, and some are in
`${LLVM_BINARY_ROOT}/include/llvm`. After installation, they're all in
`${LLVM_INSTALLATION_ROOT}/include/llvm`.

OpenMP now depends on LLVM headers. Some headers depend on headers generated
by CMake. When building OpenMP along with LLVM, i.e. via `LLVM_ENABLE_RUNTIMES`,
we need to tell OpenMP where it can find those headers, especially those that
have not yet been copied or installed.

Reviewed By: jdoerfert, jhuber6

Differential Revision: https://reviews.llvm.org/D94534

3 years ago[InstSimplify] Don't fold gep p, -p to null
Nikita Popov [Thu, 24 Dec 2020 16:04:40 +0000 (17:04 +0100)]
[InstSimplify] Don't fold gep p, -p to null

This is a partial fix for https://bugs.llvm.org/show_bug.cgi?id=44403.
Folding gep p, q-p to q is only legal if p and q have the same
provenance. This fold should probably be guarded by something like
getUnderlyingObject(p) == getUnderlyingObject(q).

This patch is a partial fix that removes the special handling for
gep p, 0-p, which will fold to a null pointer, which would certainly
not pass an underlying object check (unless p is also null, in which
case this would fold trivially anyway). Folding to a null pointer
is particularly problematic due to the special handling it receives
in many places, making end-to-end miscompiles more likely.

Differential Revision: https://reviews.llvm.org/D93820

3 years ago[libcxx] Port to OpenBSD
Brad Smith [Tue, 12 Jan 2021 19:16:15 +0000 (14:16 -0500)]
[libcxx] Port to OpenBSD

Add initial OpenBSD support.

Reviewed By: ldionne

Differential Revision: https://reviews.llvm.org/D94205

3 years ago[libc++] Add a missing `<_Compare>` template argument.
Arthur O'Dwyer [Fri, 18 Dec 2020 20:11:51 +0000 (15:11 -0500)]
[libc++] Add a missing `<_Compare>` template argument.

Sometimes `_Compare` is an lvalue reference type, so letting it be
deduced is pretty much always wrong. (Well, less efficient than
it could be, anyway.)

Differential Revision: https://reviews.llvm.org/D93562

3 years ago[FunctionAttrs] Precommit tests for willreturn inference.
Florian Hahn [Mon, 11 Jan 2021 16:33:22 +0000 (16:33 +0000)]
[FunctionAttrs] Precommit tests for willreturn inference.

Tests for D94502.

3 years ago[RISCV] Use vmerge.vim for llvm.riscv.vfmerge with a 0.0 scalar operand.
Craig Topper [Tue, 12 Jan 2021 18:52:53 +0000 (10:52 -0800)]
[RISCV] Use vmerge.vim for llvm.riscv.vfmerge with a 0.0 scalar operand.

We can use a 0 immediate to avoid needing to materialize 0 into
an FPR first.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D94459

3 years ago[NewPM] Run non-trivial loop unswitching under -O2/3/s/z
Arthur Eubanks [Mon, 11 Jan 2021 21:50:52 +0000 (13:50 -0800)]
[NewPM] Run non-trivial loop unswitching under -O2/3/s/z

Fixes https://bugs.llvm.org/show_bug.cgi?id=48715.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D94448

3 years ago[clangd] Avoid recursion in TargetFinder::add()
Nathan Ridge [Mon, 11 Jan 2021 01:41:50 +0000 (20:41 -0500)]
[clangd] Avoid recursion in TargetFinder::add()

Fixes https://github.com/clangd/clangd/issues/633

Differential Revision: https://reviews.llvm.org/D94382

3 years ago[LegalizeDAG][RISCV][PowerPC][AMDGPU][WebAssembly] Improve expansion of SETONE/SETUEQ...
Craig Topper [Tue, 12 Jan 2021 17:52:00 +0000 (09:52 -0800)]
[LegalizeDAG][RISCV][PowerPC][AMDGPU][WebAssembly] Improve expansion of SETONE/SETUEQ on targets without SETO/SETUO.

If SETO/SETUO aren't legal, they'll be expanded and we'll end up
with 3 comparisons.

SETONE is equivalent to (SETOGT || SETOLT)
so if one of those operations is supported use that expansion. We
don't need both since we can commute the operands to make the other.

SETUEQ can be implemented with !(SETOGT || SETOLT) or (SETULE && SETUGE).
I've only implemented the first because it didn't look like most of the
affected targets had legal SETULE/SETUGE.
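
The two expansions implemented here can be checked against the definitions of the predicates with a short Python sketch. Python's `<` and `>` on floats return False when either operand is NaN, which matches ordered comparisons, so they stand in for SETOLT and SETOGT. The function names are made up for the example.

```python
import math


def setone_expanded(a, b):
    """SETONE as (SETOGT || SETOLT)."""
    return a > b or a < b


def setueq_expanded(a, b):
    """SETUEQ as !(SETOGT || SETOLT)."""
    return not (a > b or a < b)


def setone_reference(a, b):
    """Ordered and not equal: neither operand is NaN and a != b."""
    ordered = not (math.isnan(a) or math.isnan(b))
    return ordered and a != b


def setueq_reference(a, b):
    """Unordered or equal: some operand is NaN, or a == b."""
    unordered = math.isnan(a) or math.isnan(b)
    return unordered or a == b
```

Comparing the expanded and reference versions over a few values including NaN and infinity shows they agree, using two ordered comparisons instead of the three that result from expanding SETO/SETUO.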

Reviewed By: frasercrmck, tlively, nemanjai

Differential Revision: https://reviews.llvm.org/D94450

3 years ago[Flang][openmp][openacc] Extend CheckNoBranching to handle branching provided by...
sameeran joshi [Thu, 17 Dec 2020 08:58:03 +0000 (14:28 +0530)]
[Flang][openmp][openacc] Extend CheckNoBranching to handle branching provided by LabelEnforce.

`CheckNoBranching` currently handles illegal branching out only for constructs
with `Parser::Name` in them.
Extend it to handle illegal branching out caused by `Parser::Label` based statements.
This patch could possibly solve one of the issues (typically branching out) mentioned in D92735.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D93447

3 years ago[instCombine] Add (A ^ B) | ~(A | B) -> ~(A & B)
Dávid Bolvanský [Tue, 12 Jan 2021 18:28:01 +0000 (19:28 +0100)]
[instCombine] Add (A ^ B) | ~(A | B) -> ~(A & B)

define i32 @src(i32 %x, i32 %y) {
%0:
  %xor = xor i32 %y, %x
  %or = or i32 %y, %x
  %neg = xor i32 %or, 4294967295
  %or1 = or i32 %xor, %neg
  ret i32 %or1
}
=>
define i32 @tgt(i32 %x, i32 %y) {
%0:
  %and = and i32 %x, %y
  %neg = xor i32 %and, 4294967295
  ret i32 %neg
}
Transformation seems to be correct!

https://alive2.llvm.org/ce/z/Cvca4a
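
As an independent sanity check of the identity, the following Python sketch compares both sides exhaustively on 8-bit values. It is a brute-force illustration rather than a proof for i32 (the Alive2 link above covers that); the `bits` parameter and function names are assumptions. In the IR, `xor` with 4294967295 is a bitwise NOT, which the sketch models with `~` and a mask.

```python
def src(x, y, bits=8):
    """(x ^ y) | ~(x | y), truncated to `bits` bits."""
    mask = (1 << bits) - 1
    return ((x ^ y) | (~(x | y) & mask)) & mask


def tgt(x, y, bits=8):
    """~(x & y), truncated to `bits` bits."""
    mask = (1 << bits) - 1
    return ~(x & y) & mask


def identity_holds(bits=8):
    """Check src == tgt for every pair of `bits`-wide values."""
    limit = 1 << bits
    return all(src(x, y, bits) == tgt(x, y, bits)
               for x in range(limit) for y in range(limit))
```

Bit by bit, the only input pair for which either side is 0 is (1, 1), which is why the two expressions coincide.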

3 years ago[Tests] Add tests for new InstCombine OR transformation, NFC
Dávid Bolvanský [Tue, 12 Jan 2021 17:56:49 +0000 (18:56 +0100)]
[Tests] Add tests for new InstCombine OR transformation, NFC

3 years ago[llvm] [cmake] Remove obsolete /usr/local hack for *BSD
Michał Górny [Tue, 12 Jan 2021 17:16:57 +0000 (18:16 +0100)]
[llvm] [cmake] Remove obsolete /usr/local hack for *BSD

Remove the hack adding /usr/local paths on FreeBSD and DragonFlyBSD.
It does not seem to be necessary today, and it breaks cross builds.

Differential Revision: https://reviews.llvm.org/D94491

3 years agoReturn false from __has_declspec_attribute() if not explicitly enabled
Timm Bäder [Tue, 12 Jan 2021 18:18:13 +0000 (13:18 -0500)]
Return false from __has_declspec_attribute() if not explicitly enabled

Currently, projects can check for __has_declspec_attribute() and use
it accordingly, but the check for __has_declspec_attribute will return
true even if declspec attributes are not enabled for the target.

This changes Clang to instead return false when declspec attributes are
not supported for the target.

3 years ago[doc] Place sha256 in lld/README.md into backticks
Emil Engler [Thu, 7 Jan 2021 02:28:54 +0000 (18:28 -0800)]
[doc] Place sha256 in lld/README.md into backticks

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D93984

3 years agoAdd -ansi option to CompileOnly group
Timm Bäder [Tue, 12 Jan 2021 18:15:21 +0000 (13:15 -0500)]
Add -ansi option to CompileOnly group

-ansi is documented as being the "same as -std=c89", but there are
differences when it is passed to a link step.

Adding -ansi to said group makes sense since it's supposed to be an
alias for -std=c89 and resolves this inconsistency.

3 years ago[SVE][NFC] Regenerate a few CodeGen tests
Cullen Rhodes [Tue, 12 Jan 2021 17:48:52 +0000 (17:48 +0000)]
[SVE][NFC] Regenerate a few CodeGen tests

Regenerated using llvm/utils/update_llc_test_checks.py as part of
D94504, committing separately to reduce the diff for D94504.

3 years ago[AMDGPU] Regenerate umax crash test
Simon Pilgrim [Tue, 12 Jan 2021 18:01:41 +0000 (18:01 +0000)]
[AMDGPU] Regenerate umax crash test

3 years agoFix typo in diagnostic message
Akira Hatanaka [Tue, 12 Jan 2021 17:56:06 +0000 (09:56 -0800)]
Fix typo in diagnostic message

rdar://66684531

3 years ago[X86] Regenerate sdiv_fix_sat.ll + udiv_fix_sat.ll tests
Simon Pilgrim [Tue, 12 Jan 2021 17:24:34 +0000 (17:24 +0000)]
[X86] Regenerate sdiv_fix_sat.ll + udiv_fix_sat.ll tests

Adding missing libcall PLT qualifiers

3 years ago[MLIR] Disallow `sym_visibility`, `sym_name` and `type` attributes in the parsed...
Rahul Joshi [Thu, 7 Jan 2021 00:32:59 +0000 (16:32 -0800)]
[MLIR] Disallow `sym_visibility`, `sym_name` and `type` attributes in the parsed attribute dictionary.

Differential Revision: https://reviews.llvm.org/D94200

3 years ago[PowerPC][NFCI] PassSubtarget to ASMWriter
Jinsong Ji [Tue, 12 Jan 2021 15:56:58 +0000 (15:56 +0000)]
[PowerPC][NFCI] PassSubtarget to ASMWriter

Subtarget feature bits are needed to change the instprinter's behavior based
on feature bits.

Most of the other popular targets were updated back in 2015,
in https://reviews.llvm.org/rGb46d0234a6969;
we should update PowerPC too.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D94449

3 years ago[mlir][spirv] NFC: split deserialization into multiple source files
Lei Zhang [Tue, 12 Jan 2021 16:11:45 +0000 (11:11 -0500)]
[mlir][spirv] NFC: split deserialization into multiple source files

This avoids large source files and gives a better structure. It also
allows leveraging compilation parallelism.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D94360

3 years ago[libc++] [C++2b] [P1048] Add is_scoped_enum and is_scoped_enum_v.
Marek Kurdej [Tue, 12 Jan 2021 16:06:58 +0000 (17:06 +0100)]
[libc++] [C++2b] [P1048] Add is_scoped_enum and is_scoped_enum_v.

* https://wg21.link/p1048

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D94409

3 years ago[mlir] Fix for LIT tests
Vladislav Vinogradov [Tue, 12 Jan 2021 16:06:06 +0000 (17:06 +0100)]
[mlir] Fix for LIT tests

Add `MLIR_SPIRV_CPU_RUNNER_ENABLED` to `llvm_canonicalize_cmake_booleans`.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D94407

3 years ago[mlir][CAPI] Fix inline function declaration
Vladislav Vinogradov [Tue, 12 Jan 2021 16:02:56 +0000 (17:02 +0100)]
[mlir][CAPI] Fix inline function declaration

Add the `static` keyword; otherwise the build fails with a linker error in some cases.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D94496

3 years agoReland "[mlir][linalg] Support parsing attributes in named op spec"
Lei Zhang [Mon, 11 Jan 2021 13:50:00 +0000 (08:50 -0500)]
Reland "[mlir][linalg] Support parsing attributes in named op spec"

With this, now we can specify a list of attributes on named ops
generated from the spec. The format is defined as

```
attr-id ::= bare-id (`?`)?
attr-typedef ::= type (`[` `]`)?
attr-def ::= attr-id `:` attr-typedef

tc-attr-def ::= `attr` `(` attr-def-list `)`
tc-def ::= `def` bare-id
  `(`tensor-def-list`)` `->` `(` tensor-def-list`)`
  (tc-attr-def)?
```

For example,

```
ods_def<SomeCppOp>
def some_op(...) -> (...)
attr(
  f32_attr: f32,
  i32_attr: i32,
  array_attr : f32[],
  optional_attr? : f32
)
```

where `?` means optional attribute and `[]` means array type.
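
To show how a single `attr-def` from the grammar above breaks down, here is a minimal Python sketch. It is not the parser from the patch: the `AttrDef` record, the function name, and the simplified pattern used for `type` (a plain identifier such as `f32`) are assumptions for illustration.

```python
import re
from typing import NamedTuple


class AttrDef(NamedTuple):
    name: str        # bare-id without the trailing `?`
    optional: bool   # True when the attr-id ends in `?`
    type: str        # element type, e.g. "f32"
    is_array: bool   # True when the type is followed by `[]`


# attr-def ::= attr-id `:` attr-typedef
_ATTR_DEF = re.compile(
    r"\s*([A-Za-z_][A-Za-z0-9_$.]*)\s*(\?)?"    # attr-id ::= bare-id (`?`)?
    r"\s*:\s*"
    r"([A-Za-z_][A-Za-z0-9_]*)\s*(\[\s*\])?\s*"  # attr-typedef ::= type (`[` `]`)?
)


def parse_attr_def(text):
    """Parse one attr-def such as 'array_attr : f32[]'."""
    match = _ATTR_DEF.fullmatch(text)
    if match is None:
        raise ValueError(f"not an attr-def: {text!r}")
    name, optional, type_name, array = match.groups()
    return AttrDef(name, optional is not None, type_name, array is not None)
```

Applied to the entries in the example, `optional_attr? : f32` yields an optional scalar attribute and `array_attr : f32[]` yields a required array attribute.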

Reviewed By: hanchung, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D94240

3 years ago[PowerPC] Add support for embedded devices with EFPU2
Nemanja Ivanovic [Tue, 12 Jan 2021 15:46:11 +0000 (09:46 -0600)]
[PowerPC] Add support for embedded devices with EFPU2

PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision
hardware floating point instructions. The single precision instructions efs*
and evfs* are identical to the spe float instructions while efd* and evfd*
instructions trigger a not implemented exception.

This patch introduces a new command line option -mefpu2 which leads to
single-hardware / double-software code generation.

[1] Core reference:
  https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf

Differential revision: https://reviews.llvm.org/D92935

3 years ago[SLP] Add test case showing a bug when dealing with padded types
Bjorn Pettersson [Tue, 12 Jan 2021 15:28:16 +0000 (16:28 +0100)]
[SLP] Add test case showing a bug when dealing with padded types

We shouldn't vectorize stores of non-packed types (i.e. types that
have padding between consecutive variables in a scalar layout,
but are packed in a vector layout).

The problem was detected as a miscompile in a downstream test case.

This is a pre-commit of a test case for the fix in D94446.

3 years ago[mlir][spirv] NFC: place ops in the proper file for their categories
Lei Zhang [Mon, 11 Jan 2021 14:58:31 +0000 (09:58 -0500)]
[mlir][spirv] NFC: place ops in the proper file for their categories

This commit moves dangling ops in the main ops.td file to the proper
file matching their categories. This leaves ops.td purely including
all category files.

Differential Revision: https://reviews.llvm.org/D94413

3 years ago[VE] Update VELIntrinsic tests
Kazushi (Jam) Marukawa [Tue, 12 Jan 2021 12:36:55 +0000 (21:36 +0900)]
[VE] Update VELIntrinsic tests

Update comment and style of regression tests for VELIntrinsic

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D94490

3 years ago[X86] Improved lowering for saturating float to int.
Bevin Hansson [Tue, 12 Jan 2021 14:40:36 +0000 (15:40 +0100)]
[X86] Improved lowering for saturating float to int.

Adapted from D54696 by @nikic.

This patch improves lowering of saturating float to
int conversions, FP_TO_[SU]INT_SAT, for X86.
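A scalar reference for what a saturating conversion computes can make the lowering easier to reason about. The sketch below is not the X86 lowering from this patch; it is a plain C++ model, with hypothetical function names, of the usual saturating semantics: NaN becomes 0 and out-of-range inputs clamp to the nearest representable bound.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical reference model of a saturating float -> i32 conversion.
inline int32_t fpToSIntSat32(float X) {
  if (std::isnan(X))
    return 0; // NaN saturates to zero
  // 2^31 is exactly representable as a float, INT32_MAX is not.
  if (X >= 2147483648.0f)
    return std::numeric_limits<int32_t>::max();
  if (X <= -2147483648.0f)
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(X); // in range: truncates toward zero
}

// Hypothetical reference model of a saturating float -> u32 conversion.
inline uint32_t fpToUIntSat32(float X) {
  if (std::isnan(X) || X <= 0.0f)
    return 0; // NaN and all non-positive inputs give zero
  if (X >= 4294967296.0f)
    return std::numeric_limits<uint32_t>::max();
  return static_cast<uint32_t>(X);
}
```

The range checks come first because a plain `static_cast` from an out-of-range floating-point value is undefined behavior in C++.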

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D86079

3 years ago[mlir][openacc] Use TableGen information for default enum
Valentin Clement [Tue, 12 Jan 2021 14:42:25 +0000 (09:42 -0500)]
[mlir][openacc] Use TableGen information for default enum

Use TableGen and information in ACC.td for the Default enum in the OpenACC dialect.
This patch generalizes what was done for OpenMP for directives.

Follow up patch after D93576

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D93710

3 years ago[TableGen] Improve error message for semicolon after braced body.
Paul C. Anagnostopoulos [Mon, 11 Jan 2021 14:46:27 +0000 (09:46 -0500)]
[TableGen] Improve error message for semicolon after braced body.

Add a test for this message.

Differential Revision: https://reviews.llvm.org/D94412

3 years ago[mlir][Linalg] NFC - Refactor fusion APIs
Nicolas Vasilache [Tue, 12 Jan 2021 14:01:59 +0000 (14:01 +0000)]
[mlir][Linalg] NFC - Refactor fusion APIs

This revision uniformizes fusion APIs to allow passing OpOperand and OpResult, and adds a finer level of control over fusion.

Differential Revision: https://reviews.llvm.org/D94493

3 years ago[X86][SSE] getFauxShuffleMask - handle PACKSS(SRAI(),SRAI()) shuffle patterns.
Simon Pilgrim [Tue, 12 Jan 2021 14:07:53 +0000 (14:07 +0000)]
[X86][SSE] getFauxShuffleMask - handle PACKSS(SRAI(),SRAI()) shuffle patterns.

We can't easily treat ASHR as a faux shuffle, but if it was just feeding a PACKSS then it was likely being used as sign-extension for a truncation, so just peek through and adjust the mask accordingly.

3 years ago[X86][SSE] combineSubToSubus - add v16i32 handling on pre-AVX512BW targets.
Simon Pilgrim [Tue, 12 Jan 2021 13:43:56 +0000 (13:43 +0000)]
[X86][SSE] combineSubToSubus - add v16i32 handling on pre-AVX512BW targets.

v16i32 -> v16i16/v8i16 truncation is now good enough using PACKSS/PACKUS + shuffle combining that it's no longer necessary to early-out on pre-AVX512BW targets.

This was noticed while looking at completing PR40111 and moving combineSubToSubus to DAGCombine entirely.

3 years ago[Fixed Point] Add codegen for conversion between fixed-point and floating point.
Bevin Hansson [Mon, 11 Jan 2021 21:46:42 +0000 (22:46 +0100)]
[Fixed Point] Add codegen for conversion between fixed-point and floating point.

The patch adds the required methods to FixedPointBuilder
for converting between fixed-point and floating point,
and uses them from Clang.
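As a rough illustration of the arithmetic behind such a conversion (not the code emitted by `FixedPointBuilder`), a fixed-point value can be modelled as a raw integer with `Scale` fractional bits. The helper names are hypothetical, truncation toward zero is an assumption, and overflow and saturation are deliberately ignored.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical model: value == Raw / 2^Scale.
inline int32_t floatToFixed(double X, unsigned Scale) {
  assert(Scale < 31 && "scale must leave room for the integer part");
  // ldexp(X, Scale) computes X * 2^Scale; truncation toward zero is assumed.
  return static_cast<int32_t>(std::trunc(std::ldexp(X, static_cast<int>(Scale))));
}

inline double fixedToFloat(int32_t Raw, unsigned Scale) {
  assert(Scale < 31 && "scale must leave room for the integer part");
  // Dividing by 2^Scale is exact for a 32-bit raw value held in a double.
  return std::ldexp(static_cast<double>(Raw), -static_cast<int>(Scale));
}
```

For example, with 8 fractional bits the value 1.5 is stored as the raw integer 384, and converting back yields 1.5 exactly.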

This depends on D54749.

Reviewed By: leonardchan

Differential Revision: https://reviews.llvm.org/D86632

3 years ago[X86][SSE] combineSubToSubus - remove SSE2 early-out.
Simon Pilgrim [Tue, 12 Jan 2021 11:50:09 +0000 (11:50 +0000)]
[X86][SSE] combineSubToSubus - remove SSE2 early-out.

SSE2 truncation codegen has improved over the past few years (mainly due to better shuffle lowering/combining and computeKnownBits) - it's no longer necessary to early-out from v8i32/v8i64 truncations.

This was noticed while looking at completing PR40111 and moving combineSubToSubus to DAGCombine entirely.

3 years ago[RISCV] Improve scalable-vector shift tests (NFC)
Fraser Cormack [Fri, 8 Jan 2021 17:14:08 +0000 (17:14 +0000)]
[RISCV] Improve scalable-vector shift tests (NFC)

All i8/i16 and several i32 tests were testing immediate shift amounts
which exceeded the bits in the vector elements, creating poison values.
Amend the tests to test well-behaved shift amounts.

3 years agoChange the LLVM_ATTRIBUTE_DEPRECATED macro to use C++14 attribute.
Christian Sigg [Thu, 7 Jan 2021 08:41:36 +0000 (09:41 +0100)]
Change the LLVM_ATTRIBUTE_DEPRECATED macro to use C++14 attribute.

C++14 attributes are superior because they can be applied to functions with inline definition and the syntax is cleaner.
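A minimal sketch of the standard attribute on a function with an inline definition, which is the case called out above. The function names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical replacement API.
inline int area(int Width, int Height) { return Width * Height; }

// The C++14 attribute sits directly on the inline definition and carries
// a custom message; callers of oldArea() get a deprecation warning.
[[deprecated("use area() instead")]] inline int oldArea(int Width, int Height) {
  return area(Width, Height);
}
```

Code that still calls `oldArea` compiles, but the compiler reports the message given in the attribute.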

I intend to convert all uses and then remove the macro.

One issue that might hold back switching uses to C++14 attributes is that
clang-format does not put long attributes on separate lines and formatted code will look like:

```
template <typename T>
[[deprecated("blah blah")]] void
    foooooooooooooooooooooooooooo() {
  ...
}
```

Putting long attributes on a separate line would be prettier.
See https://stackoverflow.com/questions/45740466/clang-format-setting-to-control-c-attributes

AttributeMacros probably won't help because it can't match the custom message.
https://clang.llvm.org/docs/ClangFormatStyleOptions.html

Reviewed By: rriddle, MaskRay

Differential Revision: https://reviews.llvm.org/D94219

3 years agoRevert "[Test] Add failing test for PR48725"
Nico Weber [Tue, 12 Jan 2021 11:30:32 +0000 (06:30 -0500)]
Revert "[Test] Add failing test for PR48725"

This reverts commit e8287cb2b2923af9da72fd953e2ec5495c33861a.
Test unexpectedly passes on mac, see comment 2 on PR48725.

3 years ago[obj2yaml] - Don't crash when an object has an empty symbol table.
Georgii Rymar [Tue, 22 Dec 2020 14:36:16 +0000 (17:36 +0300)]
[obj2yaml] - Don't crash when an object has an empty symbol table.

Currently we crash when we have an object with SHT_SYMTAB/SHT_DYNSYM sections
of size 0.

With this patch we dump them properly instead of crashing.

Differential revision: https://reviews.llvm.org/D93697

3 years ago[obj2yaml,yaml2obj] - Fix issues with creating/dumping group sections.
Georgii Rymar [Mon, 28 Dec 2020 09:20:51 +0000 (12:20 +0300)]
[obj2yaml,yaml2obj] - Fix issues with creating/dumping group sections.

We have the following issues related to group sections:
1) yaml2obj is unable to set the custom `sh_entsize` value, because the `EntSize`
   key is currently ignored.
2) obj2yaml is unable to dump a group section whose `sh_entsize != 4`.
3) obj2yaml always dumps the "EntSize" for group sections, though
   usually we are trying to omit dumping default values when dumping keys.
   I.e. we should not print the "EntSize" key when `sh_entsize` == 4.

This patch fixes (1),(3) and adds the test case to document the behavior of (2).
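The rule in (3), omitting a key when it holds its default value, can be sketched as a small predicate. This is a hypothetical helper, not obj2yaml's actual code; the default of 4 for group sections is taken from the description above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: should a dumper emit the "EntSize" key for a group section?
inline bool shouldDumpGroupEntSize(uint64_t ShEntSize) {
  constexpr uint64_t DefaultGroupEntSize = 4; // default stated in the commit message
  // Only non-default values carry information worth printing.
  return ShEntSize != DefaultGroupEntSize;
}
```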

Differential revision: https://reviews.llvm.org/D93854

3 years ago[AMDGPU][GlobalISel] Remove some duplicate RUN lines
Jay Foad [Wed, 26 Aug 2020 13:08:14 +0000 (14:08 +0100)]
[AMDGPU][GlobalISel] Remove some duplicate RUN lines

Differential Revision: https://reviews.llvm.org/D86618

3 years ago[SlotIndexes] Fix and simplify basic block splitting
Jay Foad [Fri, 8 Jan 2021 13:40:29 +0000 (13:40 +0000)]
[SlotIndexes] Fix and simplify basic block splitting

Remove the InsertionPoint argument from SlotIndexes::insertMBBInMaps
because it was confusing: what does it mean to insert a new block
between two instructions, in the middle of an existing block?

Instead, support the case that MachineBasicBlock::splitAt really needs,
where the new block contains some instructions that are already in the
maps because they have been moved there from the tail of the previous
block.

In all other use cases the new block is empty.

Based on work by Carl Ritson!

Differential Revision: https://reviews.llvm.org/D94311

3 years ago[clang][AST] Get rid of an alignment hack in DeclObjC.h [NFCI]
Mikhail Maltsev [Tue, 12 Jan 2021 10:22:35 +0000 (10:22 +0000)]
[clang][AST] Get rid of an alignment hack in DeclObjC.h [NFCI]

This code currently uses a union object to increase the
alignment of the type ObjCTypeParamList. The original intent of this
trick was to be able to use the expression `this + 1` to access the
beginning of a tail-allocated array of `ObjCTypeParamDecl *` pointers.

The code has since been refactored and uses `llvm::TrailingObjects` to
manage the tail-allocated array. This template takes care of
alignment, so the hack is no longer necessary.

This patch removes the union so that the `SourceRange` class can be
used directly instead of being re-implemented with raw representations
of source locations.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D94224

3 years ago[llvm-readobj] - One more attempt to fix BB.
Georgii Rymar [Tue, 12 Jan 2021 10:17:23 +0000 (13:17 +0300)]
[llvm-readobj] - One more attempt to fix BB.

Add `this->` for `W`, which is a member of `ObjDumper`.

An example of error:
readobj/ELFDumper.cpp:738:13: error: use of undeclared identifier 'W'
    assert(&W.getOStream() == &llvm::fouts());
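This kind of error usually comes from two-phase name lookup: an unqualified name is not looked up in a base class that depends on a template parameter. The sketch below uses hypothetical stand-in classes and assumes, as the error suggests, that `W` is reached through such a dependent base.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; only the inheritance shape matters.
struct Writer {
  std::string Out;
};

struct DumperRoot { // non-template root that owns W
  Writer W;
};

template <typename T> struct MidDumper : DumperRoot {};

template <typename T> struct LeafDumper : MidDumper<T> {
  void print(const std::string &S) {
    // A bare `W` is not found here because MidDumper<T> is a dependent base;
    // `this->W` defers the lookup until the template is instantiated.
    this->W.Out += S;
  }
};
```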

3 years ago[mlir][openmp][NFCI] Rename `continuationIP` to `continuationBlock`
Sourabh Singh Tomar [Tue, 12 Jan 2021 10:13:33 +0000 (15:43 +0530)]
[mlir][openmp][NFCI] Rename `continuationIP` to `continuationBlock`

Argument is a `Block` not a `point`.

3 years ago[llvm-readobj] - An attempt to fix BB.
Georgii Rymar [Tue, 12 Jan 2021 10:09:49 +0000 (13:09 +0300)]
[llvm-readobj] - An attempt to fix BB.

This adds the `template` keyword for 'getAsArrayRef' calls.

An example of error:
/b/1/openmp-gcc-x86_64-linux-debian/llvm.src/llvm/tools/llvm-readobj/ELFDumper.cpp:4491:50: error: use 'template' keyword to treat 'getAsArrayRef' as a dependent template name
    for (const Elf_Rel &Rel : this->DynRelRegion.getAsArrayRef<Elf_Rel>())
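The same situation can be reproduced in a few lines. In the sketch below, all names are hypothetical: `this->R` is a type-dependent expression, so the compiler cannot know that `getAs` is a template unless the `template` keyword says so.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical region type with a member function template.
struct Region {
  std::vector<int> Data;
  template <typename U> std::vector<U> getAs() const {
    return std::vector<U>(Data.begin(), Data.end());
  }
};

template <typename T> struct Holder {
  Region R;
};

template <typename T> struct Reader : Holder<T> {
  std::size_t count() const {
    // Without `template`, `<` would be parsed as a less-than operator.
    return this->R.template getAs<long>().size();
  }
};
```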

3 years ago[llvm-readobj] - Add 'override' to fix build bots.
Georgii Rymar [Tue, 12 Jan 2021 10:01:15 +0000 (13:01 +0300)]
[llvm-readobj] - Add 'override' to fix build bots.

This should fix bots after landing D93900.

An example of error is:

/home/worker/2.0.1/lldb-x86_64-debian/llvm-project/llvm/tools/llvm-readobj/ELFDumper.cpp:883:8: warning: 'printSectionMapping' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
  void printSectionMapping() {}

3 years ago[llvm-readelf/obj] - Change the design structure of ELF dumper. NFCI.
Georgii Rymar [Tue, 29 Dec 2020 13:03:37 +0000 (16:03 +0300)]
[llvm-readelf/obj] - Change the design structure of ELF dumper. NFCI.

This refactors the design of `ELFDumper.cpp`.
The current design of ELF dumper is far from ideal.

Currently most overridden functions (inherited from `ObjDumper`) in `ELFDumper` just forward to
the functions of `ELFDumperStyle` (which can be either `GNUStyle` or `LLVMStyle`).
A concrete implementation may be in any of `ELFDumper`/`DumperStyle`/`GNUStyle`/`LLVMStyle`.

This patch reorganizes the classes by introducing `GNUStyleELFDumper`/`LLVMStyleELFDumper`
which inherit from `ELFDumper`. The implementations are moved:

`DumperStyle` -> `ELFDumper`
`GNUStyle` -> `GNUStyleELFDumper`
`LLVMStyle` -> `LLVMStyleELFDumper`

With that we can avoid having a lot of redirection calls and helper methods.
The number of code lines changes from 7142 to 6922 (reduced by ~3%) and the
code overall looks cleaner.
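The shape of the change can be sketched with a pair of toy hierarchies. Every class and method name below is hypothetical and only mirrors the structure described above: a dumper that forwards to a separate style object versus style-specific dumper subclasses.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Before: the dumper owns a style object and forwards each call to it.
struct StyleBefore {
  virtual ~StyleBefore() = default;
  virtual std::string printHeader() = 0;
};
struct GNUStyleBefore : StyleBefore {
  std::string printHeader() override { return "GNU header"; }
};
struct DumperBefore {
  std::unique_ptr<StyleBefore> Style = std::make_unique<GNUStyleBefore>();
  std::string printHeader() { return Style->printHeader(); } // pure forwarding
};

// After: each style is itself a dumper subclass, so the forwarding layer goes away.
struct DumperAfter {
  virtual ~DumperAfter() = default;
  virtual std::string printHeader() = 0;
};
struct GNUStyleDumperAfter : DumperAfter {
  std::string printHeader() override { return "GNU header"; }
};
struct LLVMStyleDumperAfter : DumperAfter {
  std::string printHeader() override { return "LLVM header"; }
};
```

In the second form, shared helpers can live in the common base and be called directly by either style, without redirection methods.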

Differential revision: https://reviews.llvm.org/D93900

3 years ago[WebAssembly] Remove more unnecessary brs in CFGStackify
Heejin Ahn [Tue, 29 Dec 2020 03:48:44 +0000 (19:48 -0800)]
[WebAssembly] Remove more unnecessary brs in CFGStackify

After placing markers, we removed some unnecessary branches, but only
in the simplest case. This patch removes more unnecessary branches.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D94047

3 years ago[Test] Add failing test for PR48725
Max Kazantsev [Tue, 12 Jan 2021 09:04:12 +0000 (16:04 +0700)]
[Test] Add failing test for PR48725

3 years ago[mlir] use built-in vector types instead of LLVM dialect types when possible
Alex Zinenko [Mon, 11 Jan 2021 12:58:05 +0000 (13:58 +0100)]
[mlir] use built-in vector types instead of LLVM dialect types when possible

Continue the convergence between LLVM dialect and built-in types by using the
built-in vector type whenever possible, that is for fixed vectors of built-in
integers and built-in floats. LLVM dialect vector type is still in use for
pointers, less frequent floating point types that do not have a built-in
equivalent, and scalable vectors. However, the top-level `LLVMVectorType` class
has been removed in favor of free functions capable of inspecting both built-in
and LLVM dialect vector types: `LLVM::getVectorElementType`,
`LLVM::getNumVectorElements` and `LLVM::getFixedVectorType`. Additional work is
necessary to design and implement the extensions to built-in types so as to
remove the `LLVMFixedVectorType` entirely.

Note that the default output format for the built-in vectors does not have
whitespace around the `x` separator, e.g., `vector<4xf32>` as opposed to the
LLVM dialect vector type format that does, e.g., `!llvm.vec<4 x fp128>`. This
required changing the FileCheck patterns in several tests.

Reviewed By: mehdi_amini, silvas

Differential Revision: https://reviews.llvm.org/D94405