zhanghb97 [Tue, 12 Jan 2021 13:40:27 +0000 (21:40 +0800)]
[mlir][Python] Add checking process before create an AffineMap from a permutation.
An invalid permutation will trigger a C++ assertion when attempting to create an AffineMap from the permutation.
This patch adds an `isPermutation` function to check the given permutation before creating the AffineMap.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94492
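A minimal sketch of the kind of validation described above, written in plain Python rather than taken from the binding code. The name `is_permutation` is hypothetical and only mirrors the `isPermutation` helper the patch mentions; the real check may differ in detail.

```python
def is_permutation(values):
    """Return True if `values` holds each of 0..len(values)-1 exactly once."""
    seen = [False] * len(values)
    for v in values:
        # Reject non-integers, out-of-range entries, and duplicates.
        if not isinstance(v, int) or v < 0 or v >= len(values) or seen[v]:
            return False
        seen[v] = True
    return True


print(is_permutation([1, 2, 0]))
print(is_permutation([1, 1, 0]))
```

Running a check like this before constructing the map lets the caller raise an ordinary error instead of hitting the C++ assertion.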
Nico Weber [Wed, 13 Jan 2021 01:30:56 +0000 (20:30 -0500)]
[gn build] (manually) port 79f99ba65d96
Jianzhou Zhao [Tue, 12 Jan 2021 21:49:59 +0000 (21:49 +0000)]
[MSan] Tweak CopyOrigin
There could be some misalignment when copying origins that are not aligned.
I believe unaligned memcpy is rare, so these cases do not matter much
in practice.
1) About the change at line 50
Let dst be (void*)5,
then d=5, beg=4
so we need to write 3 (4+4-5) bytes from 5 to 7.
2) About the change around line 77.
Let dst be (void*)5,
because of lines 50-55, the bytes from 5-7 were already written.
So the aligned copy is from 8.
Reviewed-by: eugenis
Differential Revision: https://reviews.llvm.org/D94552
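The arithmetic in the two cases above can be replayed with a small sketch. This is not the MSan code: the helper name `unaligned_head` and the constant `ORIGIN_SIZE` are hypothetical, and the 4-byte granule is an assumption taken from the `4+4-5` example.

```python
ORIGIN_SIZE = 4  # assumption: 4-byte origin granules, as the example implies


def unaligned_head(dst):
    """Return (beg, head_bytes, aligned_start) for a destination address."""
    beg = dst & ~(ORIGIN_SIZE - 1)        # round down to a 4-byte boundary
    if beg == dst:
        return beg, 0, dst                # already aligned: no head to write
    head_bytes = beg + ORIGIN_SIZE - dst  # bytes from dst up to the next boundary
    return beg, head_bytes, beg + ORIGIN_SIZE


print(unaligned_head(5))
```

For `dst = 5` this gives `beg = 4`, three head bytes (addresses 5 to 7), and an aligned copy starting at 8, matching the description.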
Juneyoung Lee [Wed, 13 Jan 2021 00:33:21 +0000 (09:33 +0900)]
[DAGCombiner] Fold BRCOND(FREEZE(COND)) to BRCOND(COND)
This patch resolves the suboptimal codegen described in http://llvm.org/pr47873 .
When CodeGenPrepare lowers select into a conditional branch, a freeze instruction is inserted.
It is then translated to `BRCOND(FREEZE(SETCC))` in SelDag.
The `FREEZE` in the middle of `SETCC` and `BRCOND` was causing suboptimal code generation, however.
This patch adds `BRCOND(FREEZE(cond))` -> `BRCOND(cond)` fold to DAGCombiner to remove the `FREEZE`.
To make this optimization sound, `BRCOND(UNDEF)` simply should nondeterministically jump to the branch or not, rather than raising UB.
It wasn't clear what happens when the condition was undef according to the comments in ISDOpcodes.h, however.
I updated the comments of `BRCOND` to make it explicit (as well as `BR_CC`, which is also a conditional branch instruction).
Note that it diverges from the semantics of `br` instruction in IR, which is explicitly UB.
Since the UB semantics was necessary to explain optimizations that use branching conditions, and SelDag doesn't seem to have such optimization, I think this divergence is okay.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D92015
Juneyoung Lee [Mon, 11 Jan 2021 05:42:08 +0000 (14:42 +0900)]
[LangRef] State that a nocapture pointer cannot be returned
This is a small patch stating that a nocapture pointer cannot be returned.
Discussed in D93189.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D94386
Siva Chandra Reddy [Wed, 13 Jan 2021 00:11:28 +0000 (16:11 -0800)]
[libc][NFC] Use more specific comparison macros in LdExpTest.h.
Michael Jones [Tue, 12 Jan 2021 22:37:56 +0000 (22:37 +0000)]
[libc] add isascii and toascii implementations
adding both at once since these are trivial functions.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D94558
Joe Nash [Thu, 7 Jan 2021 18:56:02 +0000 (13:56 -0500)]
[AMDGPU] Add _e64 suffix to VOP3 Insts
Previously, instructions which could be expressed as VOP3 in addition to
another encoding had a _e64 suffix on the tablegen record name, while those
only available as VOP3 did not. With this patch, all VOP3s will have the
_e64 suffix.
The assembly does not change, only the mir.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D94341
Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423
David Blaikie [Tue, 12 Jan 2021 23:29:44 +0000 (15:29 -0800)]
Delete unused function (was breaking the -Werror build)
Mircea Trofin [Tue, 12 Jan 2021 22:31:58 +0000 (14:31 -0800)]
[NFC] Disallow unused prefixes under MC/AMDGPU
This patches remaining tests, and patches lit.local.cfg to block future
such cases (until we flip FileCheck's flag)
Differential Revision: https://reviews.llvm.org/D94556
Julian Lettner [Tue, 12 Jan 2021 23:01:18 +0000 (15:01 -0800)]
[Sanitizer][Darwin] Fix test for macOS 11+ point releases
This test wrongly asserted that the minor version is always 0 when
running on macOS 11 and above.
Jessica Paquette [Fri, 8 Jan 2021 23:06:13 +0000 (15:06 -0800)]
[MIPatternMatch] Add matcher for G_PTR_ADD
Add a matcher which recognizes G_PTR_ADD and add a test.
Differential Revision: https://reviews.llvm.org/D94348
Hongtao Yu [Wed, 23 Dec 2020 06:43:22 +0000 (22:43 -0800)]
Add sample-profile-suffix-elision-policy attribute with -funique-internal-linkage-names.
Adding sample-profile-suffix-elision-policy attribute to functions whose linkage names are uniquified so that their unique name suffix won't be trimmed when applying AutoFDO profiles.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D94455
Bob Haarman [Tue, 12 Jan 2021 02:08:01 +0000 (02:08 +0000)]
[ELF][NFCI] small cleanup to OutputSections.h
OutputSections.h used to close the lld::elf namespace only to
immediately open it again. This change merges both parts into
one.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D94538
Craig Topper [Tue, 12 Jan 2021 22:37:28 +0000 (14:37 -0800)]
[RISCV] Remove '.mask' from vcompress intrinsic name. NFC
It has a mask argument, but isn't a masked instruction. It doesn't
use the mask policy or the v0.t syntax.
Nathan James [Tue, 12 Jan 2021 22:43:48 +0000 (22:43 +0000)]
[ADT][NFC] Use empty base optimisation in BumpPtrAllocatorImpl
Most uses of this class just use the default MallocAllocator.
As this contains no fields, we can use the empty base optimisation for BumpPtrAllocatorImpl and save 8 bytes of padding for most use cases.
This prevents using a class that is marked as `final` as the `AllocatorT` template argument.
If one must use an allocator that has been marked as `final`, the simplest way around this is a proxy class.
The class should have all the methods that `AllocatorBase` expects and should forward the calls to your own allocator instance.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D94439
Mircea Trofin [Tue, 12 Jan 2021 22:06:30 +0000 (14:06 -0800)]
[NFC] Disallow unused prefixes in MC/AMDGPU
1 out of 2 patches.
Differential Revision: https://reviews.llvm.org/D94553
Fangrui Song [Tue, 12 Jan 2021 22:19:55 +0000 (14:19 -0800)]
[Driver] Fix assertion failure when -fprofile-generate -fcs-profile-generate are used together
If conflicting `-fprofile-generate -fcs-profile-generate` are used together,
there is currently an assertion failure. Fix the failure.
Also add some driver tests.
Reviewed By: xur
Differential Revision: https://reviews.llvm.org/D94463
Matt Arsenault [Wed, 6 Jan 2021 19:04:19 +0000 (14:04 -0500)]
AMDGPU: Remove wrapper only call limitation
This seems to only have overridden cold handling, which we probably
shouldn't do. As far as I can tell the wrapper library functions are
still inlined as appropriate.
Shilei Tian [Tue, 12 Jan 2021 22:00:49 +0000 (17:00 -0500)]
[OpenMP] Fixed a typo in openmp/CMakeLists.txt
Martin Storsjö [Thu, 17 Dec 2020 13:40:06 +0000 (15:40 +0200)]
[libcxx] Avoid overflows in the windows __libcpp_steady_clock_now()
As freq.QuadValue can be in the range of 10000000 to 19200000,
the multiplication before division makes the calculation overflow
and wrap to negative values every 16-30 minutes.
Instead count the whole seconds separately before adding the
scaled fractional seconds.
Add a testcase for steady_clock to check that the values returned for
now() compare as bigger than the zero time origin; this
corresponds to a testcase in Qt [1] [2] (that failed spuriously
due to this).
[1] https://bugreports.qt.io/browse/QTBUG-89539
[2] https://code.qt.io/cgit/qt/qtbase.git/tree/tests/auto/corelib/kernel/qdeadlinetimer/tst_qdeadlinetimer.cpp?id=f8de5e54022b8b7471131b7ad55c83b69b2684c0#n569
Differential Revision: https://reviews.llvm.org/D93456
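A sketch of the overflow and of the split calculation, under stated assumptions: Python integers do not overflow, so the 64-bit limit is modelled by comparing against `INT64_MAX`, and the function names are hypothetical rather than taken from libc++.

```python
INT64_MAX = 2**63 - 1
NANOS_PER_SEC = 1_000_000_000


def naive_overflows(counter):
    """True if counter * 1e9 no longer fits in a signed 64-bit integer."""
    return counter * NANOS_PER_SEC > INT64_MAX


def split_nanos(counter, freq):
    """Count whole seconds first, then add the scaled fractional seconds."""
    seconds = counter // freq
    fraction = counter % freq  # always < freq, so fraction * 1e9 stays small
    return seconds * NANOS_PER_SEC + fraction * NANOS_PER_SEC // freq


# 20 minutes of ticks at a 10 MHz counter frequency.
ticks = 20 * 60 * 10_000_000
print(naive_overflows(ticks))
print(split_nanos(ticks, 10_000_000))
```

Because the fractional part is always smaller than the frequency, its product with 10^9 stays far below the 64-bit limit for the frequencies quoted above.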
Martin Storsjö [Fri, 11 Dec 2020 10:42:07 +0000 (12:42 +0200)]
[AArch64] [Windows] Properly add :lo12: reloc specifiers when generating assembly
This makes sure that assembly output actually can be assembled.
Set the correct MCExpr relocations specifier VK_PAGEOFF - and also
set VK_PAGE consistently even though it's not visible in the assembly
output.
Differential Revision: https://reviews.llvm.org/D94365
Shilei Tian [Tue, 12 Jan 2021 21:48:19 +0000 (16:48 -0500)]
[OpenMP] Fixed the link error that cannot find static data member
A constant static data member can be defined in the class without a separate
definition after the class in C++17. Although this is a C++17 feature, Clang
handles it even without the C++17 flag. Unfortunately, GCC does not.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D94541
modimo [Tue, 12 Jan 2021 21:19:30 +0000 (13:19 -0800)]
[Inliner] Change inline remark format and update ReplayInlineAdvisor to use it
This change modifies the source location formatting from:
LineNumber.Discriminator
to:
LineNumber:ColumnNumber.Discriminator
The motivation here is to enhance location information for inline replay that currently exists for the SampleProfile inliner. This will be leveraged further in inline replay for the CGSCC inliner in the related diff.
The ReplayInlineAdvisor is also modified to read the new format and now takes into account the callee for greater accuracy.
Testing:
ninja check-llvm
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D94333
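For illustration only, the new `LineNumber:ColumnNumber.Discriminator` layout could be produced and read back as below. Both helper names are hypothetical, and the sketch assumes all three fields are always present, which the commit message does not state.

```python
def format_loc(line, column, discriminator):
    """Render a location as LineNumber:ColumnNumber.Discriminator."""
    return f"{line}:{column}.{discriminator}"


def parse_loc(text):
    """Split LineNumber:ColumnNumber.Discriminator back into three integers."""
    line, rest = text.split(":", 1)
    column, discriminator = rest.split(".", 1)
    return int(line), int(column), int(discriminator)


print(format_loc(12, 7, 2))
print(parse_loc("12:7.2"))
```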
Alex Zinenko [Tue, 12 Jan 2021 11:07:12 +0000 (12:07 +0100)]
[mlir] Update LLVM dialect type documentation
Recent commits reconfigured LLVM dialect types to use built-in types whenever
possible. Update the documentation accordingly.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94485
Nikita Popov [Tue, 12 Jan 2021 21:35:19 +0000 (22:35 +0100)]
[InstCombine] Handle logical and/or in assume optimization
assume(a && b) can be converted to assume(a); assume(b) even if
the condition is logical. Same for assume(!(a || b)).
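A truth-table sketch of why the split is valid. It models the logical forms as selects (`select a, b, false` and `select a, true, b`) over plain booleans and deliberately ignores poison, which is the reason IR distinguishes logical from bitwise and/or in the first place.

```python
from itertools import product


def logical_and(a, b):
    # Models `select a, b, false`.
    return b if a else False


def logical_or(a, b):
    # Models `select a, true, b`.
    return True if a else b


for a, b in product([False, True], repeat=2):
    if logical_and(a, b):
        # assume(a && b) holding means assume(a) and assume(b) both hold.
        assert a and b
    if not logical_or(a, b):
        # assume(!(a || b)) holding means assume(!a) and assume(!b) both hold.
        assert (not a) and (not b)
```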
Michael Munday [Tue, 12 Jan 2021 21:22:34 +0000 (21:22 +0000)]
[RISCV] Legalize select when Zbt extension available
The custom expansion of select operations in the RISC-V backend
interferes with the matching of cmov instructions. Legalizing
select when the Zbt extension is available solves that problem.
Reviewed By: lenary, craig.topper
Differential Revision: https://reviews.llvm.org/D93767
Nikita Popov [Tue, 12 Jan 2021 21:15:54 +0000 (22:15 +0100)]
[InstCombine] Add tests for logical and/or poison implication (NFC)
These tests cover some cases where we can fold select to and/or
based on poison implication logic.
Craig Topper [Tue, 12 Jan 2021 21:08:58 +0000 (13:08 -0800)]
[RISCV] Add double test cases to vfmerge-rv32.ll. NFC
Sanjay Patel [Tue, 12 Jan 2021 20:07:01 +0000 (15:07 -0500)]
[SLP] reduce code duplication while processing reductions; NFC
Sanjay Patel [Tue, 12 Jan 2021 19:55:09 +0000 (14:55 -0500)]
[SLP] rename variable to improve readability; NFC
The OperationData in the 2nd block (visiting the operands)
is completely independent of the 1st block.
Sanjay Patel [Tue, 12 Jan 2021 18:53:18 +0000 (13:53 -0500)]
[SLP] reduce code duplication in processing reductions; NFC
Sanjay Patel [Tue, 12 Jan 2021 18:45:32 +0000 (13:45 -0500)]
[SLP] reduce code duplication while matching reductions; NFC
Philip Reames [Tue, 12 Jan 2021 20:54:07 +0000 (12:54 -0800)]
[LV] Weaken spuriously strong assert in LoopVersioning
LoopVectorize uses some utilities on LoopVersioning, but doesn't actually use it for, you know, versioning. As a result, the precondition LoopVersioning expects is too strong for this user. At the moment, LoopVectorize supports any loop with a unique exit block, so check the same precondition here.
Really, the whole class structure here is a mess. We should separate the actual versioning from the metadata updates, but that's a bigger problem.
Nikita Popov [Tue, 12 Jan 2021 19:54:23 +0000 (20:54 +0100)]
[InstCombine] Duplicate tests for logical and/or (NFC)
This replicates existing and/or tests to also test variants using
select. This should help us get a more accurate view on which
optimizations we're missing if we disable the select -> and/or
fold.
Sunil Srivastava [Tue, 12 Jan 2021 20:37:18 +0000 (12:37 -0800)]
Fix for crash in __builtin_return_address in template context.
The check for argument value needs to be guarded by !isValueDependent().
Differential Revision: https://reviews.llvm.org/D94438
Philip Reames [Tue, 12 Jan 2021 20:32:24 +0000 (12:32 -0800)]
[LV] Relax assumption that LCSSA implies single entry
This relates to the ongoing effort to support vectorization of multiple exit loops (see D93317).
The previous code assumed that LCSSA phis were always single entry before the vectorizer ran. This was correct, but only because the vectorizer allowed only a single exiting edge. There's nothing in the definition of LCSSA which requires single entry phis.
A common case where this comes up is with a loop with multiple exiting blocks which all reach a common exit block. (e.g. see the test updates)
Differential Revision: https://reviews.llvm.org/D93725
Nikita Popov [Tue, 12 Jan 2021 20:21:22 +0000 (21:21 +0100)]
[InstCombine] Regenerate test checks (NFC)
Yitzhak Mandelbaum [Mon, 11 Jan 2021 22:28:17 +0000 (22:28 +0000)]
[clang-tidy] Add test for Transformer-based checks with diagnostics.
Adds a test that checks the diagnostic output of the tidy.
Differential Revision: https://reviews.llvm.org/D94453
Zequan Wu [Tue, 12 Jan 2021 19:22:31 +0000 (11:22 -0800)]
[IR] move nomerge attribute from function declaration/definition to callsites
Move nomerge attribute from function declaration/definition to callsites to
allow virtual function calls to attach the attribute.
Differential Revision: https://reviews.llvm.org/D94537
Florian Hahn [Tue, 12 Jan 2021 19:55:17 +0000 (19:55 +0000)]
[FunctionAttrs] Derive willreturn for fns with `readonly` & `mustprogress`.
Similar to D94125, derive `willreturn` for functions that are `readonly` and
`mustprogress` in FunctionAttrs.
To quote the reasoning from D94125:
Since D86233 we have `mustprogress` which, in combination with
`readonly`, implies `willreturn`. The idea is that every side-effect
has to be modeled as a "write". Consequently, `readonly` means there
is no side-effect, and `mustprogress` guarantees that we cannot "loop"
forever without side-effect.
Reviewed By: jdoerfert, nikic
Differential Revision: https://reviews.llvm.org/D94502
David Truby [Thu, 3 Dec 2020 11:25:57 +0000 (11:25 +0000)]
[clang][aarch64] Precondition isHomogeneousAggregate on isCXX14Aggregate
MSVC on WoA64 includes isCXX14Aggregate in its definition. This is the de facto
specification on that platform, so match MSVC's behaviour.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=47611
Co-authored-by: Peter Waller <peter.waller@arm.com>
Differential Revision: https://reviews.llvm.org/D92751
Jon Chesterfield [Tue, 12 Jan 2021 19:40:02 +0000 (19:40 +0000)]
[libomptarget][amdgpu][nfc] Fix build on centos
rtl.cpp replaced 224 with a #define from elf.h, but that
doesn't work on a centos 7 build machine with an old elf.h
Reviewed By: ronlieb
Differential Revision: https://reviews.llvm.org/D94528
Shilei Tian [Tue, 12 Jan 2021 19:32:27 +0000 (14:32 -0500)]
[OpenMP] Fixed include directories for OpenMP when building OpenMP with LLVM_ENABLE_RUNTIMES
Some LLVM headers are generated by CMake. Before the installation,
LLVM's headers are distributed everywhere, some of which are in
`${LLVM_SRC_ROOT}/llvm/include/llvm`, and some are in
`${LLVM_BINARY_ROOT}/include/llvm`. After installation, they're all in
`${LLVM_INSTALLATION_ROOT}/include/llvm`.
OpenMP now depends on LLVM headers. Some headers depend on headers generated
by CMake. When building OpenMP along with LLVM, a.k.a via `LLVM_ENABLE_RUNTIMES`,
we need to tell OpenMP where it can find those headers, especially those that
have not yet been copied/installed.
Reviewed By: jdoerfert, jhuber6
Differential Revision: https://reviews.llvm.org/D94534
Nikita Popov [Thu, 24 Dec 2020 16:04:40 +0000 (17:04 +0100)]
[InstSimplify] Don't fold gep p, -p to null
This is a partial fix for https://bugs.llvm.org/show_bug.cgi?id=44403.
Folding gep p, q-p to q is only legal if p and q have the same
provenance. This fold should probably be guarded by something like
getUnderlyingObject(p) == getUnderlyingObject(q).
This patch is a partial fix that removes the special handling for
gep p, 0-p, which will fold to a null pointer, which would certainly
not pass an underlying object check (unless p is also null, in which
case this would fold trivially anyway). Folding to a null pointer
is particularly problematic due to the special handling it receives
in many places, making end-to-end miscompiles more likely.
Differential Revision: https://reviews.llvm.org/D93820
Brad Smith [Tue, 12 Jan 2021 19:16:15 +0000 (14:16 -0500)]
[libcxx] Port to OpenBSD
Add initial OpenBSD support.
Reviewed By: ldionne
Differential Revision: https://reviews.llvm.org/D94205
Arthur O'Dwyer [Fri, 18 Dec 2020 20:11:51 +0000 (15:11 -0500)]
[libc++] Add a missing `<_Compare>` template argument.
Sometimes `_Compare` is an lvalue reference type, so letting it be
deduced is pretty much always wrong. (Well, less efficient than
it could be, anyway.)
Differential Revision: https://reviews.llvm.org/D93562
Florian Hahn [Mon, 11 Jan 2021 16:33:22 +0000 (16:33 +0000)]
[FunctionAttrs] Precommit tests for willreturn inference.
Tests for D94502.
Craig Topper [Tue, 12 Jan 2021 18:52:53 +0000 (10:52 -0800)]
[RISCV] Use vmerge.vim for llvm.riscv.vfmerge with a 0.0 scalar operand.
We can use a 0 immediate to avoid needing to materialize 0 into
an FPR first.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D94459
Arthur Eubanks [Mon, 11 Jan 2021 21:50:52 +0000 (13:50 -0800)]
[NewPM] Run non-trivial loop unswitching under -O2/3/s/z
Fixes https://bugs.llvm.org/show_bug.cgi?id=48715.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D94448
Nathan Ridge [Mon, 11 Jan 2021 01:41:50 +0000 (20:41 -0500)]
[clangd] Avoid recursion in TargetFinder::add()
Fixes https://github.com/clangd/clangd/issues/633
Differential Revision: https://reviews.llvm.org/D94382
Craig Topper [Tue, 12 Jan 2021 17:52:00 +0000 (09:52 -0800)]
[LegalizeDAG][RISCV][PowerPC][AMDGPU][WebAssembly] Improve expansion of SETONE/SETUEQ on targets without SETO/SETUO.
If SETO/SETUO aren't legal, they'll be expanded and we'll end up
with 3 comparisons.
SETONE is equivalent to (SETOGT || SETOLT)
so if one of those operations is supported use that expansion. We
don't need both since we can commute the operands to make the other.
SETUEQ can be implemented with !(SETOGT || SETOLT) or (SETULE && SETUGE).
I've only implemented the first because it didn't look like most of the
affected targets had legal SETULE/SETUGE.
Reviewed By: frasercrmck, tlively, nemanjai
Differential Revision: https://reviews.llvm.org/D94450
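The two equivalences used above can be checked on a handful of values, including NaN. This is a sketch in Python floats, not the DAG legalizer: the function names are hypothetical, `setone` is modelled as "ordered and not equal", and `setueq` as "unordered or equal".

```python
import math


def setone(a, b):
    """Ordered and not equal."""
    return not (math.isnan(a) or math.isnan(b)) and a != b


def setueq(a, b):
    """Unordered or equal."""
    return math.isnan(a) or math.isnan(b) or a == b


def setone_expanded(a, b):
    # SETOGT || SETOLT: both comparisons are false when either input is NaN.
    return (a > b) or (a < b)


def setueq_expanded(a, b):
    # !(SETOGT || SETOLT)
    return not ((a > b) or (a < b))


values = [math.nan, -math.inf, -1.0, 0.0, 1.0, math.inf]
for a in values:
    for b in values:
        assert setone(a, b) == setone_expanded(a, b)
        assert setueq(a, b) == setueq_expanded(a, b)
```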
sameeran joshi [Thu, 17 Dec 2020 08:58:03 +0000 (14:28 +0530)]
[Flang][openmp][openacc] Extend CheckNoBranching to handle branching provided by LabelEnforce.
`CheckNoBranching` is currently handling only illegal branching out for constructs
with `Parser::Name` in them.
Extend the same for handling illegal branching out caused by `Parser::Label` based statements.
This patch could possibly solve one of the issues (typically branching out) mentioned in D92735.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D93447
Dávid Bolvanský [Tue, 12 Jan 2021 18:28:01 +0000 (19:28 +0100)]
[instCombine] Add (A ^ B) | ~(A | B) -> ~(A & B)
define i32 @src(i32 %x, i32 %y) {
%0:
%xor = xor i32 %y, %x
%or = or i32 %y, %x
%neg = xor i32 %or, 4294967295
%or1 = or i32 %xor, %neg
ret i32 %or1
}
=>
define i32 @tgt(i32 %x, i32 %y) {
%0:
%and = and i32 %x, %y
%neg = xor i32 %and, 4294967295
ret i32 %neg
}
Transformation seems to be correct!
https://alive2.llvm.org/ce/z/Cvca4a
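Independently of the Alive2 proof, the identity can be spot-checked by brute force. The sketch below uses 8-bit values instead of i32 to keep the search small (an assumption for speed; the identity is bitwise, so the width does not matter), and the names `src` and `tgt` simply mirror the functions above.

```python
MASK = 0xFF  # 8-bit model; each bit position behaves independently


def src(x, y):
    """(x ^ y) | ~(x | y), truncated to 8 bits."""
    return ((x ^ y) | (~(x | y) & MASK)) & MASK


def tgt(x, y):
    """~(x & y), truncated to 8 bits."""
    return ~(x & y) & MASK


assert all(src(x, y) == tgt(x, y) for x in range(256) for y in range(256))
```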
Dávid Bolvanský [Tue, 12 Jan 2021 17:56:49 +0000 (18:56 +0100)]
[Tests] Add tests for new InstCombine OR transformation, NFC
Michał Górny [Tue, 12 Jan 2021 17:16:57 +0000 (18:16 +0100)]
[llvm] [cmake] Remove obsolete /usr/local hack for *BSD
Remove the hack adding /usr/local paths on FreeBSD and DragonFlyBSD.
It does not seem to be necessary today, and it breaks cross builds.
Differential Revision: https://reviews.llvm.org/D94491
Timm Bäder [Tue, 12 Jan 2021 18:18:13 +0000 (13:18 -0500)]
Return false from __has_declspec_attribute() if not explicitly enabled
Currently, projects can check for __has_declspec_attribute() and use
it accordingly, but the check for __has_declspec_attribute will return
true even if declspec attributes are not enabled for the target.
This changes Clang to instead return false when declspec attributes are
not supported for the target.
Emil Engler [Thu, 7 Jan 2021 02:28:54 +0000 (18:28 -0800)]
[doc] Place sha256 in lld/README.md into backticks
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D93984
Timm Bäder [Tue, 12 Jan 2021 18:15:21 +0000 (13:15 -0500)]
Add -ansi option to CompileOnly group
-ansi is documented as being the "same as -std=c89", but there are
differences when passing it to a link.
Adding -ansi to said group makes sense since it's supposed to be an
alias for -std=c89 and resolves this inconsistency.
Cullen Rhodes [Tue, 12 Jan 2021 17:48:52 +0000 (17:48 +0000)]
[SVE][NFC] Regenerate a few CodeGen tests
Regenerated using llvm/utils/update_llc_test_checks.py as part of
D94504, committing separately to reduce the diff for D94504.
Simon Pilgrim [Tue, 12 Jan 2021 18:01:41 +0000 (18:01 +0000)]
[AMDGPU] Regenerate umax crash test
Akira Hatanaka [Tue, 12 Jan 2021 17:56:06 +0000 (09:56 -0800)]
Fix typo in diagnostic message
rdar://66684531
Simon Pilgrim [Tue, 12 Jan 2021 17:24:34 +0000 (17:24 +0000)]
[X86] Regenerate sdiv_fix_sat.ll + udiv_fix_sat.ll tests
Adding missing libcall PLT qualifiers
Rahul Joshi [Thu, 7 Jan 2021 00:32:59 +0000 (16:32 -0800)]
[MLIR] Disallow `sym_visibility`, `sym_name` and `type` attributes in the parsed attribute dictionary.
Differential Revision: https://reviews.llvm.org/D94200
Jinsong Ji [Tue, 12 Jan 2021 15:56:58 +0000 (15:56 +0000)]
[PowerPC][NFCI] PassSubtarget to ASMWriter
Subtarget feature bits are needed to change instprinter's behavior based
on feature bits.
Most of the other popular targets were updated back in 2015,
in https://reviews.llvm.org/rGb46d0234a6969;
we should update it too.
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D94449
Lei Zhang [Tue, 12 Jan 2021 16:11:45 +0000 (11:11 -0500)]
[mlir][spirv] NFC: split deserialization into multiple source files
This avoids large source files and gives a better structure. It also
allows leveraging compilation parallelism.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D94360
Marek Kurdej [Tue, 12 Jan 2021 16:06:58 +0000 (17:06 +0100)]
[libc++] [C++2b] [P1048] Add is_scoped_enum and is_scoped_enum_v.
* https://wg21.link/p1048
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D94409
Vladislav Vinogradov [Tue, 12 Jan 2021 16:06:06 +0000 (17:06 +0100)]
[mlir] Fix for LIT tests
Add `MLIR_SPIRV_CPU_RUNNER_ENABLED` to `llvm_canonicalize_cmake_booleans`.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D94407
Vladislav Vinogradov [Tue, 12 Jan 2021 16:02:56 +0000 (17:02 +0100)]
[mlir][CAPI] Fix inline function declaration
Add the `static` keyword, otherwise the build fails with a linker error in some cases.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D94496
Lei Zhang [Mon, 11 Jan 2021 13:50:00 +0000 (08:50 -0500)]
Reland "[mlir][linalg] Support parsing attributes in named op spec"
With this, now we can specify a list of attributes on named ops
generated from the spec. The format is defined as
```
attr-id ::= bare-id (`?`)?
attr-typedef ::= type (`[` `]`)?
attr-def ::= attr-id `:` attr-typedef
tc-attr-def ::= `attr` `(` attr-def-list `)`
tc-def ::= `def` bare-id
`(`tensor-def-list`)` `->` `(` tensor-def-list`)`
(tc-attr-def)?
```
For example,
```
ods_def<SomeCppOp>
def some_op(...) -> (...)
attr(
f32_attr: f32,
i32_attr: i32,
array_attr : f32[],
optional_attr? : f32
)
```
where `?` means optional attribute and `[]` means array type.
Reviewed By: hanchung, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D94240
Nemanja Ivanovic [Tue, 12 Jan 2021 15:46:11 +0000 (09:46 -0600)]
[PowerPC] Add support for embedded devices with EFPU2
PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision
hardware floating point instructions. The single precision instructions efs*
and evfs* are identical to the spe float instructions while efd* and evfd*
instructions trigger a not implemented exception.
This patch introduces a new command line option -mefpu2 which leads to
single-hardware / double-software code generation.
[1] Core reference:
https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf
Differential revision: https://reviews.llvm.org/D92935
Bjorn Pettersson [Tue, 12 Jan 2021 15:28:16 +0000 (16:28 +0100)]
[SLP] Add test case showing a bug when dealing with padded types
We shouldn't vectorize stores of non-packed types (i.e. types that
have padding between consecutive variables in a scalar layout,
but are packed in a vector layout).
The problem was detected as a miscompile in a downstream test case.
This is a pre-commit of a test case for the fix in D94446.
Lei Zhang [Mon, 11 Jan 2021 14:58:31 +0000 (09:58 -0500)]
[mlir][spirv] NFC: place ops in the proper file for their categories
This commit moves dangling ops in the main ops.td file to the proper
file matching their categories. This leaves ops.td purely including
all category files.
Differential Revision: https://reviews.llvm.org/D94413
Kazushi (Jam) Marukawa [Tue, 12 Jan 2021 12:36:55 +0000 (21:36 +0900)]
[VE] Update VELIntrinsic tests
Update comment and style of regression tests for VELIntrinsic
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D94490
Bevin Hansson [Tue, 12 Jan 2021 14:40:36 +0000 (15:40 +0100)]
[X86] Improved lowering for saturating float to int.
Adapted from D54696 by @nikic.
This patch improves lowering of saturating float to
int conversions, FP_TO_[SU]INT_SAT, for X86.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D86079
Valentin Clement [Tue, 12 Jan 2021 14:42:25 +0000 (09:42 -0500)]
[mlir][openacc] Use TableGen information for default enum
Use TableGen and information in ACC.td for the Default enum in the OpenACC dialect.
This patch generalizes what was done for OpenMP for directives.
Follow up patch after D93576
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D93710
Paul C. Anagnostopoulos [Mon, 11 Jan 2021 14:46:27 +0000 (09:46 -0500)]
[TableGen] Improve error message for semicolon after braced body.
Add a test for this message.
Differential Revision: https://reviews.llvm.org/D94412
Nicolas Vasilache [Tue, 12 Jan 2021 14:01:59 +0000 (14:01 +0000)]
[mlir][Linalg] NFC - Refactor fusion APIs
This revision uniformizes fusion APIs to allow passing OpOperand, OpResult and adds a finer level of control over fusion.
Differential Revision: https://reviews.llvm.org/D94493
Simon Pilgrim [Tue, 12 Jan 2021 14:07:53 +0000 (14:07 +0000)]
[X86][SSE] getFauxShuffleMask - handle PACKSS(SRAI(),SRAI()) shuffle patterns.
We can't easily treat ASHR as a faux shuffle, but if it was just feeding a PACKSS then it was likely being used as sign-extension for a truncation, so just peek through and adjust the mask accordingly.
Simon Pilgrim [Tue, 12 Jan 2021 13:43:56 +0000 (13:43 +0000)]
[X86][SSE] combineSubToSubus - add v16i32 handling on pre-AVX512BW targets.
v16i32 -> v16i16/v8i16 truncation is now good enough using PACKSS/PACKUS + shuffle combining that it's no longer necessary to early-out on pre-AVX512BW targets.
This was noticed while looking at completing PR40111 and moving combineSubToSubus to DAGCombine entirely.
Bevin Hansson [Mon, 11 Jan 2021 21:46:42 +0000 (22:46 +0100)]
[Fixed Point] Add codegen for conversion between fixed-point and floating point.
The patch adds the required methods to FixedPointBuilder
for converting between fixed-point and floating point,
and uses them from Clang.
This depends on D54749.
Reviewed By: leonardchan
Differential Revision: https://reviews.llvm.org/D86632
Simon Pilgrim [Tue, 12 Jan 2021 11:50:09 +0000 (11:50 +0000)]
[X86][SSE] combineSubToSubus - remove SSE2 early-out.
SSE2 truncation codegen has improved over the past few years (mainly due to better shuffle lowering/combining and computeKnownBits), so it's no longer necessary to early-out from v8i32/v8i64 truncations.
This was noticed while looking at completing PR40111 and moving combineSubToSubus to DAGCombine entirely.
Fraser Cormack [Fri, 8 Jan 2021 17:14:08 +0000 (17:14 +0000)]
[RISCV] Improve scalable-vector shift tests (NFC)
All i8/i16 and several i32 tests were testing immediate shift amounts
which exceeded the bits in the vector elements, creating poison values.
Amend the tests to test well-behaved shift amounts.
Christian Sigg [Thu, 7 Jan 2021 08:41:36 +0000 (09:41 +0100)]
Change the LLVM_ATTRIBUTE_DEPRECATED macro to use C++14 attribute.
C++14 attributes are superior because they can be applied to functions with inline definition and the syntax is cleaner.
I intend to convert all uses and then remove the macro.
One issue that might hold back switching uses to C++14 attributes is that
clang-format does not put long attributes on separate lines and formatted code will look like:
```
template <typename T>
[[deprecated("blah blah")]] void
foooooooooooooooooooooooooooo() {
...
}
```
Putting long attributes on a separate line would be prettier.
See https://stackoverflow.com/questions/45740466/clang-format-setting-to-control-c-attributes
AttributeMacros probably won't help because it can't match the custom message.
https://clang.llvm.org/docs/ClangFormatStyleOptions.html
Reviewed By: rriddle, MaskRay
Differential Revision: https://reviews.llvm.org/D94219
Nico Weber [Tue, 12 Jan 2021 11:30:32 +0000 (06:30 -0500)]
Revert "[Test] Add failing test for PR48725"
This reverts commit e8287cb2b2923af9da72fd953e2ec5495c33861a.
Test unexpectedly passes on mac, see comment 2 on PR48725.
Georgii Rymar [Tue, 22 Dec 2020 14:36:16 +0000 (17:36 +0300)]
[obj2yaml] - Don't crash when an object has an empty symbol table.
Currently we crash when we have an object with SHT_SYMTAB/SHT_DYNSYM sections
of size 0.
With this patch instead of the crash we start to dump them properly.
Differential revision: https://reviews.llvm.org/D93697
Georgii Rymar [Mon, 28 Dec 2020 09:20:51 +0000 (12:20 +0300)]
[obj2yaml,yaml2obj] - Fix issues with creating/dumping group sections.
We have the following issues related to group sections:
1) yaml2obj is unable to set the custom `sh_entsize` value, because the `EntSize`
key is currently ignored.
2) obj2yaml is unable to dump a group section whose `sh_entsize != 4`.
3) obj2yaml always dumps the "EntSize" for group sections, though
usually we are trying to omit dumping default values when dumping keys.
I.e. we should not print the "EntSize" key when `sh_entsize` == 4.
This patch fixes (1),(3) and adds the test case to document the behavior of (2).
Differential revision: https://reviews.llvm.org/D93854
Jay Foad [Wed, 26 Aug 2020 13:08:14 +0000 (14:08 +0100)]
[AMDGPU][GlobalISel] Remove some duplicate RUN lines
Differential Revision: https://reviews.llvm.org/D86618
Jay Foad [Fri, 8 Jan 2021 13:40:29 +0000 (13:40 +0000)]
[SlotIndexes] Fix and simplify basic block splitting
Remove the InsertionPoint argument from SlotIndexes::insertMBBInMaps
because it was confusing: what does it mean to insert a new block
between two instructions, in the middle of an existing block?
Instead, support the case that MachineBasicBlock::splitAt really needs,
where the new block contains some instructions that are already in the
maps because they have been moved there from the tail of the previous
block.
In all other use cases the new block is empty.
Based on work by Carl Ritson!
Differential Revision: https://reviews.llvm.org/D94311
Mikhail Maltsev [Tue, 12 Jan 2021 10:22:35 +0000 (10:22 +0000)]
[clang][AST] Get rid of an alignment hack in DeclObjC.h [NFCI]
This code currently uses a union object to increase the
alignment of the type ObjCTypeParamList. The original intent of this
trick was to be able to use the expression `this + 1` to access the
beginning of a tail-allocated array of `ObjCTypeParamDecl *` pointers.
The code has since been refactored and uses `llvm::TrailingObjects` to
manage the tail-allocated array. This template takes care of
alignment, so the hack is no longer necessary.
This patch removes the union so that the `SourceRange` class can be
used directly instead of being re-implemented with raw representations
of source locations.
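To illustrate why `this + 1` only works when the header type is aligned at least as strictly as the trailing elements, here is a small sketch of the offset arithmetic. It is not how `llvm::TrailingObjects` is implemented; `roundUpTo`, `Header` and `trailingPointerOffset` are hypothetical names made up for this example.

```cpp
#include <cstddef>

// Hypothetical helper: rounds Offset up to the next multiple of Align.
// Align is assumed to be a power of two.
constexpr std::size_t roundUpTo(std::size_t Offset, std::size_t Align) {
  return (Offset + Align - 1) & ~(Align - 1);
}

// Hypothetical header type standing in for a class that is followed in
// memory by a tail-allocated array of pointers.
struct Header {
  unsigned Begin;
  unsigned End;
  unsigned NumParams;
};

// Byte offset, measured from the start of the header, at which a trailing
// array of pointers can begin while being correctly aligned. This can be
// larger than sizeof(Header), which is where `this + 1` would point.
constexpr std::size_t trailingPointerOffset() {
  return roundUpTo(sizeof(Header), alignof(void *));
}
```

When `sizeof(Header)` is not a multiple of `alignof(void *)`, the computed offset differs from `sizeof(Header)`, which is the gap the union trick used to paper over by raising the alignment of the header itself.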
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D94224
Georgii Rymar [Tue, 12 Jan 2021 10:17:23 +0000 (13:17 +0300)]
[llvm-readobj] - One more attempt to fix BB.
Add `this->` for `W`, which is a member of `ObjDumper`.
An example of error:
readobj/ELFDumper.cpp:738:13: error: use of undeclared identifier 'W'
assert(&W.getOStream() == &llvm::fouts());
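A minimal sketch of the language rule behind this kind of error, assuming the member is reached through a dependent base class. `Base`, `Mid` and `Leaf` are hypothetical stand-ins, not the real dumper classes.

```cpp
#include <string>

// Hypothetical non-template base that owns the member `W`.
struct Base {
  std::string W = "writer";
};

// Hypothetical class template deriving from the non-template base.
template <typename T> struct Mid : Base {};

template <typename T> struct Leaf : Mid<T> {
  // `Mid<T>` is a dependent base class, so an unqualified `W` is not looked
  // up in it when the template is parsed. `this->W` makes the name dependent
  // and defers the lookup until instantiation.
  const std::string &writer() const { return this->W; }
};
```

Some compilers accept the unqualified name as an extension, which is one way such code can build locally and still fail on a build bot.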
Sourabh Singh Tomar [Tue, 12 Jan 2021 10:13:33 +0000 (15:43 +0530)]
[mlir][openmp][NFCI] Rename `continuationIP` to `continuationBlock`
The argument is a `Block`, not a `point`.
Georgii Rymar [Tue, 12 Jan 2021 10:09:49 +0000 (13:09 +0300)]
[llvm-readobj] - An attempt to fix BB.
This adds the `template` keyword for 'getAsArrayRef' calls.
An example of error:
/b/1/openmp-gcc-x86_64-linux-debian/llvm.src/llvm/tools/llvm-readobj/ELFDumper.cpp:4491:50: error: use 'template' keyword to treat 'getAsArrayRef' as a dependent template name
for (const Elf_Rel &Rel : this->DynRelRegion.getAsArrayRef<Elf_Rel>())
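The rule behind this diagnostic can be reproduced in a few lines. The sketch below uses hypothetical types (`Region`, `Holder`, `Dumper`, `getAsVector`) rather than the llvm-readobj ones; only the shape of the call mirrors the line quoted above.

```cpp
#include <vector>

// Hypothetical stand-in for a type with a member function template.
struct Region {
  std::vector<int> Data{1, 2, 3};
  template <typename U> std::vector<U> getAsVector() const {
    return std::vector<U>(Data.begin(), Data.end());
  }
};

// Hypothetical base template that stores the region.
template <typename T> struct Holder {
  T R;
};

template <typename T> struct Dumper : Holder<T> {
  long sum() const {
    long Total = 0;
    // `this->R` has a dependent type, so the `template` keyword is needed
    // for `getAsVector<long>` to be parsed as a template-id instead of a
    // `<` comparison.
    for (long V : this->R.template getAsVector<long>())
      Total += V;
    return Total;
  }
};
```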
Georgii Rymar [Tue, 12 Jan 2021 10:01:15 +0000 (13:01 +0300)]
[llvm-readobj] - Add 'override' to fix build bots.
This should fix bots after landing D93900.
An example of error is:
/home/worker/2.0.1/lldb-x86_64-debian/llvm-project/llvm/tools/llvm-readobj/ELFDumper.cpp:883:8: warning: 'printSectionMapping' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
void printSectionMapping() {}
Georgii Rymar [Tue, 29 Dec 2020 13:03:37 +0000 (16:03 +0300)]
[llvm-readelf/obj] - Change the design structure of ELF dumper. NFCI.
This refactors the design of `ELFDumper.cpp`.
The current design of ELF dumper is far from ideal.
Currently most overridden functions (inherited from `ObjDumper`) in `ELFDumper` just forward to
the functions of `ELFDumperStyle` (which can be either `GNUStyle` or `LLVMStyle`).
A concrete implementation may be in any of `ELFDumper`/`DumperStyle`/`GNUStyle`/`LLVMStyle`.
This patch reorganizes the classes by introducing `GNUStyleELFDumper`/`LLVMStyleELFDumper`
which inherit from `ELFDumper`. The implementations are moved:
`DumperStyle` -> `ELFDumper`
`GNUStyle` -> `GNUStyleELFDumper`
`LLVMStyle` -> `LLVMStyleELFDumper`
With that we can avoid having a lot of redirection calls and helper methods.
The number of code lines changes from 7142 to 6922 (reduced by ~3%) and the
code overall looks cleaner.
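The reorganized hierarchy can be pictured roughly as follows. This is a heavily simplified sketch: the real classes are templates over the ELF type and have many more members, and the method names, returned strings and the `createELFDumper` factory here are hypothetical placeholders chosen only to show where the code lives after the move.

```cpp
#include <memory>
#include <string>

// Simplified stand-in for the common dumper interface.
struct ObjDumper {
  virtual ~ObjDumper() = default;
  virtual std::string printFileHeaders() = 0;
};

// Shared logic lives directly in ELFDumper, so the style-specific classes
// call it without going through a separate style object.
struct ELFDumper : ObjDumper {
  std::string sharedPart() const { return "headers"; } // placeholder text
};

struct GNUStyleELFDumper : ELFDumper {
  std::string printFileHeaders() override { return "GNU " + sharedPart(); }
};

struct LLVMStyleELFDumper : ELFDumper {
  std::string printFileHeaders() override { return "LLVM " + sharedPart(); }
};

// Hypothetical factory: the output style is chosen once, at construction,
// instead of every call being forwarded to a style object.
std::unique_ptr<ObjDumper> createELFDumper(bool GNUStyle) {
  if (GNUStyle)
    return std::make_unique<GNUStyleELFDumper>();
  return std::make_unique<LLVMStyleELFDumper>();
}
```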
Differential revision: https://reviews.llvm.org/D93900
Heejin Ahn [Tue, 29 Dec 2020 03:48:44 +0000 (19:48 -0800)]
[WebAssembly] Remove more unnecessary brs in CFGStackify
After placing markers, we removed some unnecessary branches, but only in
the simplest case. This patch removes more of them.
Reviewed By: dschuff, tlively
Differential Revision: https://reviews.llvm.org/D94047
Max Kazantsev [Tue, 12 Jan 2021 09:04:12 +0000 (16:04 +0700)]
[Test] Add failing test for PR48725
Alex Zinenko [Mon, 11 Jan 2021 12:58:05 +0000 (13:58 +0100)]
[mlir] use built-in vector types instead of LLVM dialect types when possible
Continue the convergence between LLVM dialect and built-in types by using the
built-in vector type whenever possible, that is for fixed vectors of built-in
integers and built-in floats. LLVM dialect vector type is still in use for
pointers, less frequent floating point types that do not have a built-in
equivalent, and scalable vectors. However, the top-level `LLVMVectorType` class
has been removed in favor of free functions capable of inspecting both built-in
and LLVM dialect vector types: `LLVM::getVectorElementType`,
`LLVM::getNumVectorElements` and `LLVM::getFixedVectorType`. Additional work is
necessary to design and implement the extensions to built-in types so as to
remove the `LLVMFixedVectorType` entirely.
Note that the default output format for the built-in vectors does not have
whitespace around the `x` separator, e.g., `vector<4xf32>` as opposed to the
LLVM dialect vector type format that does, e.g., `!llvm.vec<4 x fp128>`. This
required changing the FileCheck patterns in several tests.
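The difference between the two textual forms is small enough to show with string formatting alone. The helpers below are hypothetical and have nothing to do with the MLIR printer; they only reproduce the two spellings quoted above.

```cpp
#include <string>

// Hypothetical helper: built-in vector syntax, no whitespace around `x`.
std::string builtinVectorSyntax(unsigned NumElements, const std::string &Elem) {
  return "vector<" + std::to_string(NumElements) + "x" + Elem + ">";
}

// Hypothetical helper: LLVM dialect vector syntax, with whitespace around `x`.
std::string llvmDialectVectorSyntax(unsigned NumElements,
                                    const std::string &Elem) {
  return "!llvm.vec<" + std::to_string(NumElements) + " x " + Elem + ">";
}
```

A FileCheck pattern written for one spelling will not match the other, which is why tests that switched to built-in vectors needed their patterns updated.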
Reviewed By: mehdi_amini, silvas
Differential Revision: https://reviews.llvm.org/D94405
Jan Svoboda [Tue, 12 Jan 2021 08:09:06 +0000 (09:09 +0100)]
[clang][cli] Remove -f[no-]trapping-math from -cc1 command line
This patch removes the -f[no-]trapping-math flags from the -cc1 command line. These flags are ignored in the command line parser and their semantics are fully handled by -ffp-exception-mode.
This patch does not remove -f[no-]trapping-math from the driver command line. The driver flags are being used and do affect compilation.
Reviewed By: dexonsmith, SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D93395
Sebastian Neubauer [Mon, 11 Jan 2021 13:20:36 +0000 (14:20 +0100)]
[AMDGPU] Fix failing assert with scratch ST mode
In ST mode, flat scratch instructions have neither an sgpr nor a vgpr
for the address. This led to an assertion failure when inserting hard clauses.
Differential Revision: https://reviews.llvm.org/D94406