Sam Parker [Mon, 28 Sep 2020 13:44:51 +0000 (14:44 +0100)]
[NFC][ARM] Factor out some logic for LoLoops.
Create a DCE function that accepts an instruction.
Jay Foad [Mon, 28 Sep 2020 13:34:23 +0000 (14:34 +0100)]
[AMDGPU] Reformat SITargetLowering::isSDNodeSourceOfDivergence. NFC.
Georgii Rymar [Mon, 28 Sep 2020 11:43:19 +0000 (14:43 +0300)]
[llvm-readobj/elf] - Fix the PREL31 relocation computation used for dumping arm32 unwind info (-u).
This is a part of https://bugs.llvm.org/show_bug.cgi?id=47581.
We have the following computation:
```
(1) uint64_t Location = Address & 0x7fffffff;
(2) if (Location & 0x04000000)
(3) Location |= (uint64_t) ~0x7fffffff;
(4) return Location + Place;
```
At line 2 there is a typo. The constant should be `0x40000000`,
not `0x04000000`, because the intention here is to sign extend the `Location`,
which is a 31-bit signed value.
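As a rough, standalone sketch of the corrected computation (the helper name `decodePREL31` and its parameter types are assumptions made for illustration, not the code from the patch):

```cpp
#include <cstdint>

// Hypothetical helper mirroring the computation quoted above, with the
// corrected sign-bit test. The name and parameter types are assumptions.
uint64_t decodePREL31(uint32_t Address, uint32_t Place) {
  // Keep the low 31 bits: this is the PREL31 field.
  uint64_t Location = Address & 0x7fffffff;
  // Bit 30 is the sign bit of a 31-bit value; extend it through bit 63.
  if (Location & 0x40000000)
    Location |= (uint64_t) ~0x7fffffff;
  // The offset is relative to Place; the sum wraps modulo 2^64.
  return Location + Place;
}
```

With the old `0x04000000` test, an input such as `0x40000000` has the tested bit clear, so it would not be sign extended at all.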
Differential revision: https://reviews.llvm.org/D88407
Alexander Kornienko [Mon, 28 Sep 2020 12:58:27 +0000 (14:58 +0200)]
[clang-tidy] IncludeInserter: allow <> in header name
This adds a pair of overloads for create(MainFile)?IncludeInsertion methods that
use the presence of the <> in the file name to control whether the #include
directive will use angle brackets or quotes. Motivating examples:
https://reviews.llvm.org/D82089#inline-789412 and
https://github.com/llvm/llvm-project/blob/master/clang-tools-extra/clang-tidy/modernize/MakeSmartPtrCheck.cpp#L433
The overloads with the IsAngled parameter can be removed after the users are
updated.
Update usages of createIncludeInsertion.
Update (almost all) usages of createMainFileIncludeInsertion.
Reviewed By: hokein
Differential Revision: https://reviews.llvm.org/D85666
Haojian Wu [Mon, 28 Sep 2020 13:08:28 +0000 (15:08 +0200)]
[clang] Don't emit "no member" diagnostic if the lookup fails on an invalid record decl.
The "no member" diagnostic is likely bogus.
Reviewed By: sammccall, #libc
Differential Revision: https://reviews.llvm.org/D86765
Sjoerd Meijer [Mon, 28 Sep 2020 13:01:23 +0000 (14:01 +0100)]
[ARM][MVE] Enable tail-predication by default
We have been running tests/benchmarks downstream with tail-predication enabled
for some time now and this behaves as expected: we are not aware of any
correctness issues, and this performs better across the board than with
tail-predication disabled. Time to flip the switch!
Differential Revision: https://reviews.llvm.org/D88093
Simon Pilgrim [Mon, 28 Sep 2020 12:35:54 +0000 (13:35 +0100)]
[InstCombine] matchRotate - allow undef in uniform constant rotation amounts (PR46895)
An extension to D87452, we can safely permit undefs in the uniform/splat detection
https://alive2.llvm.org/ce/z/nT-ptN
Differential Revision: https://reviews.llvm.org/D88402
Florian Hahn [Mon, 28 Sep 2020 10:59:50 +0000 (11:59 +0100)]
[SCEV] Also use info from assumes in applyLoopGuards.
Similar to collecting information from branches guarding a loop, we can
also collect information from assumes dominating the loop header.
Fixes PR47247.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D87854
Daniel Kiss [Mon, 28 Sep 2020 11:57:21 +0000 (13:57 +0200)]
[AArch64] Generate .note.gnu.property based on module flags.
The module flags are derived exclusively from the compiler flag `-mbranch-protection`.
The note is generated based on the module flags accordingly.
After this change, a compile unit without functions won't have
the .note.gnu.property if the compiler flag is not present [1].
[1] https://bugs.llvm.org/show_bug.cgi?id=46480
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D80791
Simon Pilgrim [Mon, 28 Sep 2020 11:53:32 +0000 (12:53 +0100)]
[X86] Flip isShuffleEquivalent argument order to match isTargetShuffleEquivalent
A while ago, we converted isShuffleEquivalent/isTargetShuffleEquivalent to both use IsElementEquivalent internally.
This allows us to make the shuffle args optional like isTargetShuffleEquivalent and update foldShuffleOfHorizOp to use isShuffleEquivalent (which it should, as it's using an ISD::VECTOR_SHUFFLE mask).
Simon Pilgrim [Mon, 28 Sep 2020 11:20:55 +0000 (12:20 +0100)]
[X86] Simplify broadcast mask detection with isUndefOrEqual helper.
Add an additional isUndefOrEqual variant that matches an entire mask, not just a single value.
LLVM GN Syncbot [Mon, 28 Sep 2020 11:38:04 +0000 (11:38 +0000)]
[gn build] Port 018066d9475
Tadeo Kondrak [Mon, 28 Sep 2020 11:18:24 +0000 (13:18 +0200)]
[clangd] Add a tweak for filling in enumerators of a switch statement.
Add a tweak that populates an empty switch statement of an enumeration type with all of the enumerators of that type.
Before:
```
enum Color { RED, GREEN, BLUE };
void f(Color color) {
switch (color) {}
}
```
After:
```
enum Color { RED, GREEN, BLUE };
void f(Color color) {
switch (color) {
case RED:
case GREEN:
case BLUE:
break;
}
}
```
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D88383
Raphael Isemann [Mon, 28 Sep 2020 10:46:24 +0000 (12:46 +0200)]
[lldb][NFC] Minor cleanup in CxxModuleHandler::tryInstantiateStdTemplate
Using llvm::None and `contains` instead of `find`.
Qiu Chaofan [Mon, 28 Sep 2020 10:16:25 +0000 (18:16 +0800)]
[PowerPC] Clean-up mayRaiseFPException bits
According to the POWER ISA, floating point instructions altering exception
bits in FPSCR should be 'may raise FP exception' (excluding those that
read or write the whole FPSCR directly, like mffs/mtfsf). We need to
model FPSCR well in future patches to handle the special case properly.
Instructions added mayRaiseFPException:
- fre(s)/frsqrte(s)
- fmadd(s)/fmsub(s)/fnmadd(s)/fnmsub(s)
- xscmpoqp/xscmpuqp/xscmpeqdp/xscmpgedp/xscmpgtdp
- xscvdphp/xscvhpdp/xvcvhpsp/xvcvsphp/xsrqpxp
- xsmaxcdp/xsmincdp/xsmaxjdp/xsminjdp
Instructions removed mayRaiseFPException:
- xstdivdp/xvtdiv(d|s)p/xstsqrtdp/xvtsqrt(d|s)p
- xsabsdp/xsnabsdp/xvabs(d|s)p/xvnabs(d|s)p
- xsnegdp/xscpsgndp/xvneg(d|s)p/xvcpsgn(d|s)p
- xvcvsxwdp/xvcvuxwdp
- xscvdpspn/xscvspdpn
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D87738
Jay Foad [Thu, 24 Sep 2020 16:02:30 +0000 (17:02 +0100)]
[AMDGPU] Add bfi immediate pattern
Differential Revision: https://reviews.llvm.org/D88246
Jay Foad [Thu, 24 Sep 2020 15:52:41 +0000 (16:52 +0100)]
[AMDGPU] Make bfi patterns divergence-aware
This tends to increase code size but more importantly it reduces vgpr
usage, and could avoid costly readfirstlanes if the result needs to be
in an sgpr.
Differential Revision: https://reviews.llvm.org/D88245
Jay Foad [Thu, 24 Sep 2020 15:19:09 +0000 (16:19 +0100)]
[AMDGPU] Split R600 and GCN bfi patterns
This is in preparation for making the GCN patterns divergence-aware.
NFC.
Differential Revision: https://reviews.llvm.org/D88244
Simon Pilgrim [Mon, 28 Sep 2020 08:55:25 +0000 (09:55 +0100)]
[InstCombine] Add tests for vector rotate by constants with undefs.
Georgii Rymar [Wed, 23 Sep 2020 15:00:32 +0000 (18:00 +0300)]
[yaml2obj][obj2yaml] - Add support for the SHT_ARM_EXIDX section.
This adds support for SHT_ARM_EXIDX sections to the obj2yaml/yaml2obj tools.
SHT_ARM_EXIDX is an ARM-specific index table filled with entries.
Each entry consists of two 4-byte values (words).
(https://developer.arm.com/documentation/ihi0038/c/?lang=en#index-table-entries)
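To make the entry layout concrete, here is a minimal sketch of one index table entry as two 4-byte words. The struct name, the field names and the `numEntries` helper are illustrative assumptions, not the types used by the tools:

```cpp
#include <cstdint>

// Hypothetical layout of one index table entry: two 4-byte words.
// The field names are illustrative assumptions, not taken from the patch.
struct ExidxEntry {
  uint32_t Offset; // first word
  uint32_t Value;  // second word
};

static_assert(sizeof(ExidxEntry) == 8, "an entry is two 4-byte words");

// Number of whole entries in a section of the given size in bytes.
constexpr uint64_t numEntries(uint64_t SectionSize) {
  return SectionSize / sizeof(ExidxEntry);
}
```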
Differential revision: https://reviews.llvm.org/D88228
Raphael Isemann [Mon, 28 Sep 2020 08:20:37 +0000 (10:20 +0200)]
[lldb] Reference STL types in import-std-module tests
With the recent patches to the ASTImporter that improve template type importing
(D87444), most of the import-std-module tests can now finally import the
type of the STL container they are testing. This patch removes most of the casts
that were added to simplify types to something the ASTImporter can import
(for example, std::vector<int>::size_type was cast to `size_t` until now).
Also adds the missing tests that require referencing the container type (for
example simply printing the whole container) as here we couldn't use a casting
workaround.
The only casts that remain are in the forward_list tests that reference
the iterator and the stack test. Both tests are still failing to import the
respective container type correctly (or crash while trying to import).
Georgii Rymar [Fri, 25 Sep 2020 11:57:00 +0000 (14:57 +0300)]
[obj2yaml][yaml2obj] - Stop recognizing SHT_MIPS_ABIFLAGS on non-MIPS targets.
Currently we are always recognizing the `SHT_MIPS_ABIFLAGS` section,
even on non-MIPS targets.
The problem with doing this is briefly discussed in D88228 which does the same for `SHT_ARM_EXIDX`:
"The problem is that `SHT_ARM_EXIDX` shares the value with `SHT_X86_64_UNWIND (0x70000001U)`.
We might have other machine specific conflicts, e.g.
`SHT_ARM_ATTRIBUTES` vs `SHT_MSP430_ATTRIBUTES` vs `SHT_RISCV_ATTRIBUTES (0x70000003U)`."
I think we should only recognize target specific sections when the machine type
matches. I.e. `SHT_MIPS_*` should be recognized only on `MIPS`, `SHT_ARM_*`
only on `ARM` etc.
This patch stops recognizing `SHT_MIPS_ABIFLAGS` on `non-MIPS` targets.
Note: I had to update `ScalarEnumerationTraits<ELFYAML::MIPS_ISA>::enumeration`, because
otherwise the test crashes by calling `llvm_unreachable`.
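The idea of recognizing machine-specific section types only when the machine type matches can be sketched roughly as follows. The function `sectionTypeName` is a hypothetical stand-in, not the obj2yaml/yaml2obj code; the numeric constants are the standard ELF values:

```cpp
#include <cstdint>
#include <string>

// Standard ELF machine and section type values.
constexpr uint16_t EM_MIPS = 8, EM_ARM = 40, EM_X86_64 = 62;
constexpr uint32_t SHT_ARM_EXIDX = 0x70000001U;
constexpr uint32_t SHT_X86_64_UNWIND = 0x70000001U; // same value as above
constexpr uint32_t SHT_MIPS_ABIFLAGS = 0x7000002aU;

// Hypothetical lookup: a machine-specific type is only named when the
// machine matches, so shared numeric values cannot be confused.
std::string sectionTypeName(uint16_t Machine, uint32_t Type) {
  switch (Machine) {
  case EM_ARM:
    if (Type == SHT_ARM_EXIDX)
      return "SHT_ARM_EXIDX";
    break;
  case EM_X86_64:
    if (Type == SHT_X86_64_UNWIND)
      return "SHT_X86_64_UNWIND";
    break;
  case EM_MIPS:
    if (Type == SHT_MIPS_ABIFLAGS)
      return "SHT_MIPS_ABIFLAGS";
    break;
  }
  return "unknown";
}
```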
Differential revision: https://reviews.llvm.org/D88294
Benjamin Kramer [Mon, 28 Sep 2020 08:26:51 +0000 (10:26 +0200)]
[Coroutines] Remove unused includes. NFC.
Sjoerd Meijer [Thu, 24 Sep 2020 12:43:52 +0000 (13:43 +0100)]
[ARM][MVE] tail-predication: overflow checks for elementcount, cont'd
This is a reimplementation of the overflow checks for the elementcount,
i.e. the 2nd argument of intrinsic get.active.lane.mask. The element
count is lowered in each iteration of the tail-predicated loop, and
we must prove that this expression doesn't overflow.
Many thanks to Eli Friedman and Sam Parker for all their help with
this work.
Differential Revision: https://reviews.llvm.org/D88086
David Green [Mon, 28 Sep 2020 08:14:40 +0000 (09:14 +0100)]
[ARM] Expand cannotInsertWDLSTPBetween to the last instruction
9d9a11c7be037 added this check for predicatable instructions between the
D/WLSTP and the loop's start, but it was missing the last instruction in
the block. Change it to use some iterators instead.
Differential Revision: https://reviews.llvm.org/D88354
Raphael Isemann [Mon, 28 Sep 2020 08:09:45 +0000 (10:09 +0200)]
[lldb] Remove nothreadallow from SWIG's __str__ wrappers to work around a Python>=3.7 crash
Usually when we enter a SWIG wrapper function from Python, SWIG automatically
adds a `Py_BEGIN_ALLOW_THREADS`/`Py_END_ALLOW_THREADS` around the call to the SB
API C++ function. This will ensure that Python's GIL is released when we enter
LLDB and locked again when we return to the wrapper code.
D51569 changed this behaviour but only for the generated `__str__` wrappers. The
added `nothreadallow` disables the injection of the GIL release/re-acquire code
and the GIL is now kept locked when entering LLDB and is expected to be still
locked when returning from the LLDB implementation. The main reason for that was
that back when D51569 landed the wrapper itself created a Python string. These
days it just creates a std::string and SWIG itself takes care of getting the GIL
and creating the Python string from the std::string, so that workaround isn't
necessary anymore.
This patch just removes `nothreadallow` so that our `__str__` functions now
behave like all other wrapper functions in that they release the GIL when
calling into the SB API implementation.
The motivation here is actually to work around another potential bug in LLDB.
When one calls into the LLDB SB API while holding the GIL and that call causes
LLDB to interpret some Python script via `ScriptInterpreterPython`, then the GIL
will be unlocked when the control flow returns from the SB API. In the case of
the `__str__` wrapper this would cause the next call to a Python function
requiring the GIL to fail (as SWIG will not try to reacquire the GIL as it
isn't aware that LLDB removed it).
The reason for this unexpected GIL release seems to be a workaround for recent
Python versions:
```
// The only case we should go further and acquire the GIL: it is unlocked.
if (PyGILState_Check())
return;
```
The early-exit here causes `InitializePythonRAII::m_was_already_initialized` to
always be false, which in turn causes `InitializePythonRAII`'s destructor to
always directly unlock the GIL via `PyEval_SaveThread`. I'm investigating how to
properly fix this bug in a follow up patch, but for now this straightforward
patch seems to be enough to unblock my other patches (and it also has the
benefit of removing this workaround).
The test for this is just a simple test for `std::deque` which has a synthetic
child provider implemented as a Python script. Inspecting the deque object will
cause `expect_expr` to create a string error message by calling
`str(deque_object)`. Printing the ValueObject causes the Python script for the
synthetic children to execute which then triggers the bug described above where
the GIL ends up being unlocked.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D88302
Chuanqi Xu [Mon, 28 Sep 2020 07:40:35 +0000 (15:40 +0800)]
[Coroutines] Reuse storage for local variables with non-overlapping lifetimes
Bug 45566 shows that the process of building the coroutine frame does not
consider that the lifetimes of different local variables may not overlap,
which means the compiler could generate a smaller frame.
This patch calculates the lifetime range of each alloca with the
StackLifetime class. It then builds non-overlapping sets of allocas whose
lifetime ranges do not overlap. We use the largest type in a
non-overlapping set as the field type in the frame. In the insertSpills
process, if the type of the field differs from that of the alloca,
we cast the pointer to the field type to a pointer to the alloca type.
Since the lifetime ranges of the allocas in one non-overlapping set do not
overlap each other, it should be ok to reuse the storage space
in the frame.
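The grouping idea can be illustrated with a simplified greedy sketch. This is not the patch's implementation: lifetimes are modelled as plain half-open intervals instead of StackLifetime results, and the names `Alloca`, `Slot` and `assignSlots` are assumptions made for the example:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// A local variable that is live over the half-open range [Start, End).
struct Alloca {
  int Start;
  int End;
  std::size_t Size;
};

// One frame field shared by allocas whose lifetimes never overlap.
struct Slot {
  std::vector<Alloca> Members;
  std::size_t Size = 0; // size of the largest member
};

static bool overlaps(const Alloca &A, const Alloca &B) {
  return A.Start < B.End && B.Start < A.End;
}

// Greedy grouping: put each alloca into the first slot where it overlaps
// with no existing member, otherwise open a new slot.
std::vector<Slot> assignSlots(std::vector<Alloca> Allocas) {
  // Visit larger allocas first so a slot's size is set by its first member.
  std::stable_sort(Allocas.begin(), Allocas.end(),
                   [](const Alloca &A, const Alloca &B) {
                     return A.Size > B.Size;
                   });
  std::vector<Slot> Slots;
  for (const Alloca &A : Allocas) {
    auto It = std::find_if(Slots.begin(), Slots.end(), [&](const Slot &S) {
      return std::none_of(S.Members.begin(), S.Members.end(),
                          [&](const Alloca &M) { return overlaps(A, M); });
    });
    if (It == Slots.end()) {
      Slots.emplace_back();
      It = Slots.end() - 1;
    }
    It->Members.push_back(A);
    It->Size = std::max(It->Size, A.Size);
  }
  return Slots;
}
```

For three allocas of sizes 8, 4 and 16 where only the first two have disjoint lifetimes, this yields two slots of 16 and 8 bytes instead of three separate fields.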
Test plan: check-llvm, check-clang, cppcoro, folly
Reviewers: junparser, lxfind, modocache
Differential Revision: https://reviews.llvm.org/D87596
David Sherwood [Fri, 11 Sep 2020 14:17:08 +0000 (15:17 +0100)]
[SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy
After some recent upstream discussion we decided that it was best
to avoid having the / operator for both ElementCount and TypeSize,
since this could give the impression that these classes can be used
in the same way as basic integer types. However, division
for scalable types is a bit odd because we are only dividing the
minimum quantity by a value, as opposed to something like:
(MinSize * Vscale) / SomeValue
This is why when performing division it's important the caller
first establishes whether the operation makes sense, perhaps by
calling isKnownMultipleOf() prior to division. The caller must now
explicitly call divideCoefficientBy() on the class to perform the
operation.
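A minimal sketch of the intended calling pattern, using a simplified stand-in class rather than the real ElementCount/TypeSize (the `ScalableQuantity` type and its fields are assumptions; only the two method names come from the text above):

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a scalable quantity: MinValue * vscale when
// Scalable is true, MinValue otherwise. Not the real LLVM class.
struct ScalableQuantity {
  uint64_t MinValue;
  bool Scalable;

  // True if the minimum quantity is an exact multiple of RHS.
  bool isKnownMultipleOf(uint64_t RHS) const {
    assert(RHS != 0 && "cannot test against zero");
    return MinValue % RHS == 0;
  }

  // Divides only the coefficient (the minimum quantity); the scalable
  // property is carried over unchanged.
  ScalableQuantity divideCoefficientBy(uint64_t RHS) const {
    assert(RHS != 0 && "division by zero");
    return {MinValue / RHS, Scalable};
  }
};
```

A caller would typically check `isKnownMultipleOf(N)` first and only then call `divideCoefficientBy(N)`, so that the explicit name documents that just the minimum quantity is divided.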
Differential Revision: https://reviews.llvm.org/D87700
Kai Luo [Mon, 28 Sep 2020 06:11:40 +0000 (06:11 +0000)]
[PowerPC] Add tests for `select` patterns. NFC.
Arthur Eubanks [Mon, 28 Sep 2020 05:41:56 +0000 (22:41 -0700)]
Revert "Reland [CodeGen] emit CG profile for COFF object file"
This reverts commit 506b6170cb513f1cb6e93a3b690c758f9ded18ac.
This still causes link errors, see https://crbug.com/1130780.
Max Kazantsev [Mon, 28 Sep 2020 05:04:20 +0000 (12:04 +0700)]
[Test] Add tests where we can replace condition with invariants
Richard Smith [Wed, 2 Sep 2020 22:04:41 +0000 (15:04 -0700)]
Add profiling support for APValues.
For C++20 P0732R2; unused so far. Will be used and tested by a follow-on
commit.
Richard Smith [Wed, 2 Sep 2020 21:42:37 +0000 (14:42 -0700)]
Canonicalize declaration pointers when forming APValues.
References to different declarations of the same entity aren't different
values, so shouldn't have different representations.
Recommit of e6393ee813178e9d3306b8e3c6949a4f32f8a2cb with fixed handling
for weak declarations. We now look for attributes on the most recent
declaration when determining whether a declaration is weak. (Second
recommit with further fixes for mishandling of weak declarations. Our
behavior here is fundamentally unsound -- see PR47663 -- but this
approach attempts to not make things worse.)
Valentin Clement [Mon, 28 Sep 2020 01:27:49 +0000 (21:27 -0400)]
[mlir][openacc] Add if, deviceptr operands and default attribute
Add operands to represent if and deviceptr. Default clause is represented with
an attribute.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88331
Valentin Clement [Mon, 28 Sep 2020 01:20:35 +0000 (21:20 -0400)]
[mlir][openacc] Switch to assembly format for acc.data
This patch removes the printer/parser for the acc.data operation since its syntax
fits nicely with the assembly format. It reduces the maintenance for this op.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88330
Valentin Clement [Mon, 28 Sep 2020 00:27:54 +0000 (20:27 -0400)]
[mlir][openacc] Remove detach and delete operands from acc.data
This patch removes the detach and delete operands. Those operands represent the detach
and delete clauses that will appear in another operation, acc.exit_data.
Reviewed By: kiranktp, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88326
Dávid Bolvanský [Sun, 27 Sep 2020 19:32:32 +0000 (21:32 +0200)]
[BuildLibCalls] Add noalias for strcat and stpcpy
strcat:
destination and source shall not overlap. (http://www.cplusplus.com/reference/cstring/strcat/)
stpcpy:
The strings may not overlap, and the destination string dest must be large enough to receive the copy. (https://man7.org/linux/man-pages/man3/stpcpy.3.html)
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D88335
Joseph Huber [Sun, 27 Sep 2020 19:35:47 +0000 (15:35 -0400)]
[OpenMP] Add Missing _static Directory for OpenMP Documentation
Summary:
Adding a missing directory needed for generating Sphinx documentation without
errors. The directory currently contains a placeholder image just to populate
the directory.
Nikita Popov [Sun, 27 Sep 2020 18:44:25 +0000 (20:44 +0200)]
[CVP] Remove unnecessary block splits in tests (NFC)
These are no longer necessary since D69686.
Nikita Popov [Sun, 30 Aug 2020 14:20:12 +0000 (16:20 +0200)]
[LVI][CVP] Use block value when simplifying icmps
Add a flag to getPredicateAt() that allows making use of the block
value. This allows us to take into account range information from
the current block, rather than only information that is threaded
over edges, making the icmp simplification in CVP a lot more
powerful.
I'm not changing getPredicateAt() to use the block value
unconditionally to avoid any impact on the JumpThreading pass,
which is somewhat picky about LVI query order.
Most test changes here are just icmps that now get dropped (while
previously only a result used in a return was replaced). The three
tests in icmp.ll show some representative improvements. Some of
the folds this enables have been covered by IPSCCP in the meantime,
but LVI can reason about some cases which are hard to support in
IPSCCP, such as in test_br_cmp_with_offset.
The compile-time cost of doing this is fairly minimal, with
a ~0.05% CTMark regression for ReleaseThinLTO:
https://llvm-compile-time-tracker.com/compare.php?from=709d03f8af4da4204849a70f01798e7cebba2e32&to=6236fd503761f43c99f4537121e057a01056f185&stat=instructions
This is because the block values will typically already be queried
and cached by other CVP optimizations anyway.
Differential Revision: https://reviews.llvm.org/D69686
Fangrui Song [Sun, 27 Sep 2020 18:12:13 +0000 (11:12 -0700)]
[NewPM] Port ConstraintElimination to the new pass manager
If -enable-constraint-elimination is specified, add it to the -O2/-O3 pipeline.
(-O1 uses a separate function now.)
Reviewed By: fhahn, aeubanks
Differential Revision: https://reviews.llvm.org/D88365
Benjamin Kramer [Sun, 27 Sep 2020 17:10:53 +0000 (19:10 +0200)]
[InstCombine] Simplify code. NFCI.
Nikita Popov [Sun, 27 Sep 2020 16:56:10 +0000 (18:56 +0200)]
[CVP] Make srem test more robust (NFC)
D69686 will be able to determine that the icmp is always false.
As this is not the purpose of the test, use a different modulus
that doesn't trivialize the condition.
Nikita Popov [Sun, 27 Sep 2020 16:21:19 +0000 (18:21 +0200)]
[LVI] Clarify getValueAt/getValueInBlock doc comments (NFC)
The lattice value returned by getValueInBlock() holds at the start
of the block, not at the end. Also make it clearer what the
difference between getValueInBlock() and getValueAt() is.
Nikita Popov [Sun, 27 Sep 2020 15:41:39 +0000 (17:41 +0200)]
[LVI] Require context instruction in external API (NFCI)
Require CxtI in getConstant() and getConstantRange() APIs.
Accordingly drop the BB parameter, as it is implied by
CxtI->getParent().
This makes sure we don't forget to pass the context instruction,
and makes the API contract clearer (also clean up the comments to
that effect -- the value holds at the context instruction, not
the end of the block).
Nikita Popov [Sun, 27 Sep 2020 15:49:37 +0000 (17:49 +0200)]
[CVP] Pass context instruction when narrowing div/rem
This fold was the only place not passing the context instruction.
The tests worked around that fact by introducing a basic block split,
which is now no longer necessary.
Simon Pilgrim [Sun, 27 Sep 2020 14:59:53 +0000 (15:59 +0100)]
[X86] Add some basic i128 udiv test coverage
Simon Pilgrim [Sun, 27 Sep 2020 14:58:20 +0000 (15:58 +0100)]
[X86] Regenerate i128 sdiv tests and add i686 coverage.
To hopefully help improve the codegen delta in D87976
Sanjay Patel [Sun, 27 Sep 2020 13:52:56 +0000 (09:52 -0400)]
[CostModel] add cl option to check size and latency costs; NFC
This is a setting used by SimplifyCFG, LoopUnroll, and InlineCost,
but there is apparently no direct test coverage for any of those
cost model values.
Sanjay Patel [Sun, 27 Sep 2020 12:24:03 +0000 (08:24 -0400)]
[ValueTracking] enhance isKnownNeverInfinity to understand sitofp
As discussed in D87877, instcombine already has this fold,
but it was missing from the more general ValueTracking logic.
https://alive2.llvm.org/ce/z/PumYZP
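A small illustration of why a signed integer to floating point conversion can be treated as never infinite, assuming the integer's magnitude fits within the float's finite range (true for 64-bit integers and `float`, since 2^63 is far below the largest finite `float`). The helper name is hypothetical:

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical check: converting a 64-bit signed integer to float gives a
// finite value, because |V| <= 2^63 is well inside float's finite range.
bool convertsToFinite(int64_t V) {
  return std::isfinite(static_cast<float>(V));
}
```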
Sanjay Patel [Sat, 26 Sep 2020 14:53:29 +0000 (10:53 -0400)]
[InstSimplify] add tests for fcmp with casted op; NFC
This shows missing analysis in ValueTracking's isKnownNeverInfinity().
Aaron Ballman [Sun, 27 Sep 2020 12:30:41 +0000 (08:30 -0400)]
Typo fix; NFC
Alexey Lapshin [Wed, 23 Sep 2020 17:04:52 +0000 (20:04 +0300)]
[llvm-objcopy][NFC] refactor error handling. part 2.
Remove usages of special error reporting functions (error(),
reportError()). This patch is extracted from D87987.
Errors are reported as Expected<>/Error returning values.
This part is for COFF subfolder of llvm-objcopy.
Testing: check-all.
Differential Revision: https://reviews.llvm.org/D88213
Tatsuo Nomura [Sun, 27 Sep 2020 10:15:34 +0000 (12:15 +0200)]
Fix MIPS and MIPS64 ABI to use ConstString in their register info arrays.
RegInfoBasedABI::GetRegisterInfoByName was failing because mips/mips64 ABIs
don't use ConstString in their register info array.
Reviewed By: #lldb, teemperor
Differential Revision: https://reviews.llvm.org/D88375
Amara Emerson [Sun, 27 Sep 2020 08:45:09 +0000 (01:45 -0700)]
[AArch64][GlobalISel] Promote scalar G_SHL constant shift amounts to s64.
This was supposed to be done in the first place as is currently the case for
G_ASHR and G_LSHR but was forgotten when the original shift legalization
overhaul was done last year.
This was exposed because we started falling back on s32 = s32, s64 SHLs
due to a recent combiner change.
Gives a very minor (0.1%) code size -O0 improvement on consumer-typeset.
Nikita Popov [Sat, 12 Sep 2020 21:10:15 +0000 (23:10 +0200)]
[Legalize][X86] Improve nnan fmin/fmax vector reduction
Use +/-Inf or +/-Largest as neutral element for nnan fmin/fmax
reductions. This avoids dropping any FMF flags. Preserving the
nnan flag in particular is important to get a good lowering on X86.
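The notion of a neutral element can be sketched with scalar code: starting an fmin reduction from +Inf (or an fmax reduction from -Inf) does not change the result for NaN-free inputs. This is only an illustration of the arithmetic, with hypothetical function names, not the legalizer code:

```cpp
#include <cmath>
#include <limits>
#include <vector>

// fmin reduction seeded with +Inf, the neutral element for fmin when no
// NaNs are present.
float reduceFMin(const std::vector<float> &V) {
  float Acc = std::numeric_limits<float>::infinity();
  for (float X : V)
    Acc = std::fmin(Acc, X);
  return Acc;
}

// fmax reduction seeded with -Inf, the neutral element for fmax when no
// NaNs are present.
float reduceFMax(const std::vector<float> &V) {
  float Acc = -std::numeric_limits<float>::infinity();
  for (float X : V)
    Acc = std::fmax(Acc, X);
  return Acc;
}
```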
Differential Revision: https://reviews.llvm.org/D87586
Amara Emerson [Sun, 27 Sep 2020 08:22:55 +0000 (01:22 -0700)]
[AArch64][GlobalISel] Use the look-through constant helper for the shift s32->s64 custom legalization.
Almost NFC, except it catches more cases and gives a 0.1% CTMark -O0 size win.
Fangrui Song [Sun, 27 Sep 2020 08:04:30 +0000 (01:04 -0700)]
[DivRemPairs] Use DenseMapBase::find instead of operator[]. NFC
Craig Topper [Sun, 27 Sep 2020 06:26:40 +0000 (23:26 -0700)]
[X86] Add more test cases to inline-asm-flag-output.ll. NFC
These are tests to make sure we are able to use the flag directly
in a conditional branch after the inline asm.
sunshaoce [Sun, 27 Sep 2020 05:40:50 +0000 (05:40 +0000)]
Update Kaleidoscope: Change headers
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D88141
Chen Zheng [Sat, 26 Sep 2020 12:30:48 +0000 (08:30 -0400)]
[Machinesink] add one more profitable loop related pattern
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D86925
Robert Widmann [Sat, 26 Sep 2020 23:32:38 +0000 (17:32 -0600)]
[LLVM-C] Turn a ShuffleVector Constant Into a Getter.
It is not a good idea to expose raw constants in the LLVM C API. Replace this with an explicit getter.
Differential Revision: https://reviews.llvm.org/D88367
Fangrui Song [Sat, 26 Sep 2020 22:57:09 +0000 (15:57 -0700)]
Internalize functions from various tools. NFC
And internalize some classes if I noticed them:)
Amy Kwan [Fri, 25 Sep 2020 17:58:16 +0000 (12:58 -0500)]
[NFC][PowerPC] Change PPCSubTarget (introduced from D87671) to Subtarget
D87671 introduced PPCSubTarget in PPCISelDAGToDAG. This should have been
Subtarget instead. This patch changes PPCSubTarget into Subtarget.
Aaron Puchert [Sat, 26 Sep 2020 22:46:24 +0000 (00:46 +0200)]
Fix sphinx warnings in AttributeReference, NFC
The previous attempt in d34c8c70 didn't help (the problem was missing
indentation), and another issue was introduced by a51d51a0.
Fangrui Song [Sat, 26 Sep 2020 22:04:39 +0000 (15:04 -0700)]
[ConstraintElimination] Internalize function/class and delete an implied condition. NFC
Delete an implied condition (E.NumIn <= CB.NumIn)
Simon Pilgrim [Sat, 26 Sep 2020 21:07:51 +0000 (22:07 +0100)]
[X86] Add 64-bit target tests
Russell Yanofsky [Sat, 26 Sep 2020 20:11:43 +0000 (22:11 +0200)]
Thread safety analysis: Improve documentation for ASSERT_CAPABILITY
Previous description didn't actually state the effect the attribute has on
thread safety analysis (causing analysis to assume the capability is held).
Previous description was also ambiguous about (or slightly overstated) the
noreturn assumption made by thread safety analysis, implying the assumption had
to be true about the function's behavior in general, and not just its behavior
in places where it's used. Stating the assumption specifically should avoid a
perceived need to disable thread safety analysis in places where only asserting
that a specific capability is held would be better.
Reviewed By: aaronpuchert, vasild
Differential Revision: https://reviews.llvm.org/D87629
Riccardo Bertossa [Sat, 26 Sep 2020 19:41:20 +0000 (12:41 -0700)]
[flang] SAVE statement should not apply to nested scoping units
According to 8.6.14, a SAVE statement applies only to the scoping unit
in which it appears, which excludes nested scoping units. For example, if
the SAVE statement is found in a MODULE, the functions contained in that
module should not inherit the SAVE attribute. I think that the code was
applying the attribute to nested units, which made the following source fail:
```
MODULE pippo
SAVE
CONTAINS
PURE FUNCTION fft_stick_index( )
IMPLICIT NONE
INTEGER :: fft_stick_index
INTEGER :: mc !error: A pure subprogram may not have a variable with the SAVE attribute
END FUNCTION
END MODULE
```
Differential Revision: https://reviews.llvm.org/D88279
Simon Pilgrim [Sat, 26 Sep 2020 19:08:43 +0000 (20:08 +0100)]
[InstCombine] Add basic vector test coverage for icmp_eq/ne zero combines
Florian Hahn [Sat, 26 Sep 2020 16:56:15 +0000 (17:56 +0100)]
Revert "[DSE] Switch to MemorySSA-backed DSE by default."
There appears to be a mis-compile with MemorySSA-backed DSE in
combination with llvm.lifetime.end. It currently appears like
DSE is doing the right thing and the llvm.lifetime.end markers
are incorrect. The reverted patch uncovers the mis-compile.
This patch temporarily switches back to the legacy DSE
implementation, while we investigate.
This reverts commit 9d172c8e9c845a36b61dc12c27de8acdbef8b247.
Nico Weber [Sat, 26 Sep 2020 16:42:50 +0000 (12:42 -0400)]
[gn build] update TODO
Jacques Pienaar [Sat, 26 Sep 2020 16:18:35 +0000 (09:18 -0700)]
[mlir] Fix capitalization typo
Was testing on case insensitive config :-/
Jacques Pienaar [Sat, 26 Sep 2020 16:02:35 +0000 (09:02 -0700)]
[mlir] Updates to generate dialect rather than op docs
Jacques Pienaar [Sat, 26 Sep 2020 15:47:25 +0000 (08:47 -0700)]
[mlir] Fix passes.md's naming & add missing
Simon Pilgrim [Sat, 26 Sep 2020 14:49:19 +0000 (15:49 +0100)]
[X86] Cleanup check-prefixes for vector-mul.ll tests
The codegen for many x86/x64 SSE tests is the same, so avoid duplication.
Simon Pilgrim [Sat, 26 Sep 2020 13:31:17 +0000 (14:31 +0100)]
[DAG] Fold vector mul(x,0)/mul(x,1) to a clearing mask
If we're multiplying all elements of a vector by '0' or '1' then we can more efficiently perform this as a clearing mask (that is likely to further simplify to a shuffle blend).
This was noticed when reviewing D87502 but seems to help idiv/irem by constant cases even more as '0'/'1' values are often used for 'passthrough' cases.
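The equivalence behind this fold can be checked lane by lane in plain C++: multiplying by 0 or 1 gives the same result as ANDing with zero or all-ones. The types and function names here are made up for the illustration and do not reflect the DAG combine itself:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>;

// Reference: lane-wise multiply by a vector whose lanes are all 0 or 1.
Vec4 mulLanes(const Vec4 &X, const Vec4 &M) {
  Vec4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = X[I] * M[I];
  return R;
}

// Clearing mask: AND with all-ones where the multiplier is 1 and with
// zero where the multiplier is 0.
Vec4 clearLanes(const Vec4 &X, const Vec4 &M) {
  Vec4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = X[I] & (M[I] ? 0xffffffffu : 0u);
  return R;
}
```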
Differential Revision: https://reviews.llvm.org/D88225
Simon Pilgrim [Fri, 25 Sep 2020 21:33:15 +0000 (22:33 +0100)]
MachineCSE.cpp - use auto const& iterators in for-range loops to avoid copies. NFCI.
Serge Pavlov [Sat, 26 Sep 2020 13:23:42 +0000 (20:23 +0700)]
Run test on particular target only
The test `AST/const-fpfeatures-diag.c` requires setting strict FP
semantics, so it fails on targets where support for such semantics
is limited.
Paul C. Anagnostopoulos [Thu, 24 Sep 2020 15:58:07 +0000 (11:58 -0400)]
[TableGen] Add/edit Doxygen comments to match "TableGen Backend Developer's Guide."
Florian Hahn [Sat, 26 Sep 2020 10:03:25 +0000 (11:03 +0100)]
[DSE] Unify & fix mem terminator location checks.
When looking for memory defs killed by memory terminators the code
currently incorrectly ignores the size argument of llvm.lifetime.end.
This patch updates the code to use isMemTerminator and updates
isMemTerminator to use isOverwrite() to make sure locations that are
outside the range marked as dead by llvm.lifetime.end are not
considered. Note that isOverwrite is only used for llvm.lifetime.end,
because free-like functions make the whole underlying object dead.
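The range check this relies on can be sketched as a simple containment test: a store is only made dead by llvm.lifetime.end if its bytes lie entirely inside the range marked dead. The `ByteRange` type and `fullyCovers` function are hypothetical and far simpler than isOverwrite():

```cpp
#include <cstdint>

// A byte range [Offset, Offset + Size) within one underlying object.
struct ByteRange {
  uint64_t Offset;
  uint64_t Size;
};

// True if every byte of Store lies inside the Dead range.
bool fullyCovers(const ByteRange &Dead, const ByteRange &Store) {
  return Dead.Offset <= Store.Offset &&
         Store.Offset + Store.Size <= Dead.Offset + Dead.Size;
}
```

For example, with the first 4 bytes marked dead, a 4-byte store at offset 0 is covered while a 4-byte store at offset 4 is not and must be kept.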
Florian Hahn [Sat, 26 Sep 2020 09:13:12 +0000 (10:13 +0100)]
[DSE] Add tests with lifetime.end that only mark parts of the obj as dead.
llvm.lifetime.end accepts a size parameter to limit the size of the
location marked as dead. Add a few tests with stores to locations after
the part that has been marked as dead.
Serge Pavlov [Thu, 17 Sep 2020 07:10:07 +0000 (14:10 +0700)]
[FPEnv] Evaluate constant expressions under non-default rounding modes
The change implements evaluation of constant floating point expressions
under non-default rounding modes. The main objective was to support
evaluation of global variable initializers, where constant rounding mode
may be specified by `#pragma STDC FENV_ROUND`.
Differential Revision: https://reviews.llvm.org/D87822
Tyker [Sat, 26 Sep 2020 10:31:12 +0000 (12:31 +0200)]
[LoopDelete][Assume] Allow deleting loops with assumes
This prevents very poor optimization caused by a single assume, like https://godbolt.org/z/EK3oMh
baseline flags: -O3
patched flags: -O3 -mllvm --enable-knowledge-retention
Before the patch
```
Metric: compile_time
Program baseline patched diff
test-suite :: CTMark/tramp3d-v4/tramp3d-v4.test 20.72 29.74 43.5%
test-suite :: CTMark/Bullet/bullet.test 24.39 24.91 2.2%
test-suite :: CTMark/7zip/7zip-benchmark.test 37.39 38.06 1.8%
test-suite :: CTMark/kimwitu++/kc.test 11.76 11.94 1.5%
test-suite :: CTMark/sqlite3/sqlite3.test 12.94 12.91 -0.3%
test-suite :: CTMark/SPASS/SPASS.test 11.72 11.70 -0.2%
test-suite :: CTMark/lencod/lencod.test 16.12 16.10 -0.1%
test-suite :: CTMark/ClamAV/clamscan.test 13.31 13.30 -0.1%
test-suite :: CTMark/mafft/pairlocalalign.test 9.12 9.12 -0.1%
test-suite :: CTMark/consumer-typeset/consumer-typeset.test 9.34 9.34 -0.1%
Geomean difference 4.2%
Metric: compiler_Kinst_count
Program baseline patched diff
test-suite :: CTMark/tramp3d-v4/tramp3d-v4.test 107576069.87 172886418.90 60.7%
test-suite :: CTMark/Bullet/bullet.test 123291865.66 125457117.96 1.8%
test-suite :: CTMark/kimwitu++/kc.test 56347884.64 57298544.14 1.7%
test-suite :: CTMark/7zip/7zip-benchmark.test 180637699.58 183341656.57 1.5%
test-suite :: CTMark/sqlite3/sqlite3.test 66723788.85 66664692.80 -0.1%
test-suite :: CTMark/ClamAV/clamscan.test 69581500.56 69597863.92 0.0%
test-suite :: CTMark/lencod/lencod.test 94236501.48 94216545.32 -0.0%
test-suite :: CTMark/SPASS/SPASS.test 58516756.95 58505089.07 -0.0%
test-suite :: CTMark/consumer-typeset/consumer-typeset.test 48832815.53 48841989.39 0.0%
test-suite :: CTMark/mafft/pairlocalalign.test 49682720.53 49686324.34 0.0%
Geomean difference 5.4%
```
After the patch
```
Metric: compile_time
Program baseline patched diff
test-suite :: CTMark/tramp3d-v4/tramp3d-v4.test 20.70 22.40 8.2%
test-suite :: CTMark/7zip/7zip-benchmark.test 37.13 38.05 2.5%
test-suite :: CTMark/Bullet/bullet.test 24.25 24.83 2.4%
test-suite :: CTMark/kimwitu++/kc.test 11.69 11.94 2.2%
test-suite :: CTMark/ClamAV/clamscan.test 13.19 13.36 1.3%
test-suite :: CTMark/lencod/lencod.test 16.02 16.19 1.1%
test-suite :: CTMark/consumer-typeset/consumer-typeset.test 9.29 9.36 0.7%
test-suite :: CTMark/SPASS/SPASS.test 11.64 11.73 0.7%
test-suite :: CTMark/mafft/pairlocalalign.test 9.10 9.15 0.5%
test-suite :: CTMark/sqlite3/sqlite3.test 12.95 12.96 0.0%
Geomean difference 1.9%
Metric: compiler_Kinst_count
Program baseline patched diff
test-suite :: CTMark/tramp3d-v4/tramp3d-v4.test 107590933.61 114044834.72 6.0%
test-suite :: CTMark/kimwitu++/kc.test 56344526.77 57235806.29 1.6%
test-suite :: CTMark/Bullet/bullet.test 123291285.10 125128334.97 1.5%
test-suite :: CTMark/7zip/7zip-benchmark.test 180641540.10 183155706.39 1.4%
test-suite :: CTMark/sqlite3/sqlite3.test 66725619.22 66668713.92 -0.1%
test-suite :: CTMark/SPASS/SPASS.test 58509029.85 58478704.75 -0.1%
test-suite :: CTMark/consumer-typeset/consumer-typeset.test 48843711.23 48826894.68 -0.0%
test-suite :: CTMark/lencod/lencod.test 94233305.79 94207544.23 -0.0%
test-suite :: CTMark/ClamAV/clamscan.test 69587887.66 69603549.90 0.0%
test-suite :: CTMark/mafft/pairlocalalign.test 49686968.65 49689291.04 0.0%
Geomean difference 1.0%
```
Reviewed By: jdoerfert, efriedma
Differential Revision: https://reviews.llvm.org/D86816
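The "Geomean difference" rows can be cross-checked with a short calculation. The sketch below assumes the figure is the geometric mean of the per-benchmark patched/baseline ratios, minus one; the commit message does not say how the reporting tool computes it, and `geomeanDiffPercent` and `CompileTimeBefore` are names made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Geometric mean of patched/baseline ratios, reported as a percentage change.
// Each row is {baseline, patched}.
double geomeanDiffPercent(const std::vector<std::pair<double, double>> &Rows) {
  assert(!Rows.empty() && "need at least one benchmark row");
  double LogSum = 0.0;
  for (const auto &Row : Rows)
    LogSum += std::log(Row.second / Row.first);
  return (std::exp(LogSum / static_cast<double>(Rows.size())) - 1.0) * 100.0;
}

// compile_time rows copied from the "Before the patch" table.
const std::vector<std::pair<double, double>> CompileTimeBefore = {
    {20.72, 29.74}, {24.39, 24.91}, {37.39, 38.06}, {11.76, 11.94},
    {12.94, 12.91}, {11.72, 11.70}, {16.12, 16.10}, {13.31, 13.30},
    {9.12, 9.12},   {9.34, 9.34}};
```

Under that assumption, `geomeanDiffPercent(CompileTimeBefore)` comes out at roughly 4.2%, in line with the figure reported for the first compile_time table.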
Simon Atanasyan [Thu, 24 Sep 2020 21:01:07 +0000 (00:01 +0300)]
[CodeGen] Do not call `emitGlobalConstantLargeInt` for constants that require 8 bytes to store
This is a fix for PR47630. The regression was caused by D78011: after
that change, the code calls `emitGlobalConstantLargeInt` even
for constants that require eight bytes to store.
Differential revision: https://reviews.llvm.org/D88261
Qiu Chaofan [Sat, 26 Sep 2020 05:46:40 +0000 (13:46 +0800)]
[SelectionDAG] Add guard to automatically insert flags
This is like FastMathFlagGuard in IR. Since values are created through
the SelectionDAG instance, the guard is tied to SelectionDAG. Once a
FlagInserter is created in the current scope, every value created by
getNode gets its flags when no Flags argument is provided.
In this patch, I applied it to the floating-point folding part of the
DAG combiner and removed the Flags passed to getNode to show its effect.
Other places in the DAG combiner, and helper methods similar to getNode,
also need this; they can be done in follow-up patches.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D87361
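The guard idea can be sketched with a toy builder. Nothing below is the real SelectionDAG API: `Flags`, `Node`, `Builder` and `FlagGuard` are hypothetical stand-ins that only illustrate the scope-based behaviour described above.

```cpp
#include <cassert>

// Toy stand-ins; none of these are the real SelectionDAG types.
struct Flags {
  bool NoNaNs = false;
  bool NoInfs = false;
};
struct Node {
  Flags F;
};

class Builder {
  const Flags *Default = nullptr; // flags installed by the active guard, if any
  friend class FlagGuard;

public:
  // Without explicit flags, pick up whatever the current guard provides.
  Node getNode() const { return Node{Default ? *Default : Flags{}}; }
  // Explicitly passed flags always win over the guard.
  Node getNode(Flags F) const { return Node{F}; }
};

// RAII guard: installs default flags on construction and restores the
// previous state on destruction, so nested guards unwind correctly.
class FlagGuard {
  Builder &B;
  Flags Mine;
  const Flags *Saved;

public:
  FlagGuard(Builder &Owner, Flags F) : B(Owner), Mine(F), Saved(Owner.Default) {
    B.Default = &Mine;
  }
  ~FlagGuard() { B.Default = Saved; }
  FlagGuard(const FlagGuard &) = delete;
  FlagGuard &operator=(const FlagGuard &) = delete;
};
```

While a `FlagGuard` is alive, `getNode()` returns nodes carrying the guard's flags; once it goes out of scope, nodes get default flags again. Tying the lifetime to a scope is what removes the need to thread a flags argument through every call.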
Dmitry Antipov [Sat, 26 Sep 2020 05:52:08 +0000 (08:52 +0300)]
[Driver] Fix formatting as suggested by clang-format (NFC)
Dmitry Antipov [Sat, 26 Sep 2020 05:44:08 +0000 (08:44 +0300)]
[Driver] Perform Linux distribution detection only once
Differential Revision: https://reviews.llvm.org/D87187
Fangrui Song [Sat, 26 Sep 2020 03:31:31 +0000 (20:31 -0700)]
[bindings/go] Fix TestAttributes after D88241
John Demme [Sat, 26 Sep 2020 02:18:54 +0000 (02:18 +0000)]
Common code preparation for tblgen-types patch
Clean up and add methods that https://reviews.llvm.org/D86904 requires. Split out to reduce review load.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D88267
Shilei Tian [Sat, 26 Sep 2020 02:10:22 +0000 (22:10 -0400)]
[Clang][OpenMP] Added support for nowait target in CodeGen via regular task
Previously, for a nowait target, CodeGen emitted a call to `__tgt_target_nowait` and similar functions. However, in the OpenMP RTL these functions directly call the non-nowait versions, so nowait does not work as expected.
The OpenMP specification says a target is actually a target task, which is an untied and detachable task. It is natural to generate a task for a nowait target. However, an OpenMP task must be within a parallel region; otherwise the task is executed immediately. As a result, if we directly wrap the region into a regular task, a `target nowait` outside of a parallel region is still synchronous.
In D77609, I added support for unshackled tasks to the OpenMP RTL. Basically, an unshackled task is a task that is not bound to any parallel region, so every nowait target will be transformed into an unshackled task. To distinguish it from a regular task, a new flag bit is set for an unshackled task. The RTL uses this flag in later processing.
All target tasks are allocated via `__kmpc_omp_target_task_alloc`, and in the current `libomptarget`, `__kmpc_omp_target_task_alloc` just calls `__kmpc_omp_task_alloc`. Therefore, we can modify the flag in `__kmpc_omp_target_task_alloc` so that we don't need to modify the FE too much. Users who want to opt out of the feature just need to use an RTL without support for unshackled threads.
As a result, in this patch, the `target nowait` region is simply wrapped into a regular task. Later, once the RTL supports unshackled tasks, the wrapped tasks can be executed by unshackled threads without changes in the FE.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D78075
Amara Emerson [Sat, 26 Sep 2020 00:38:10 +0000 (17:38 -0700)]
[AArch64][GlobalISel] If a G_BUILD_VECTOR's operands are all G_CONSTANT, assign it to the gpr bank.
Even if the type is s8/s16, assigning to gpr is preferable with constants because
in the worst case we can select via a constant pool load, and without cross-bank copies
to the FPR bank more patterns can be imported later.
Arthur Eubanks [Thu, 17 Sep 2020 17:36:39 +0000 (10:36 -0700)]
[LowerTypeTests][NewPM] Add constructor that uses command line flags
This matches the legacy PM pass by having one constructor use command
line flags, and the other use parameters to the pass.
This fixes all tests under Transforms/LowerTypeTests using NPM.
Reviewed By: ychen, pcc
Differential Revision: https://reviews.llvm.org/D87845
Amara Emerson [Sat, 26 Sep 2020 00:21:03 +0000 (17:21 -0700)]
[AArch64][GlobalISel] Add a few more vector type combinations for shift selection.
Fangrui Song [Sat, 26 Sep 2020 00:33:12 +0000 (17:33 -0700)]
[lldb/bindings] Fix -Wformat after D88123
Evandro Menezes [Fri, 25 Sep 2020 23:07:12 +0000 (18:07 -0500)]
[RISCV] Update driver tests
Add the RISC-V Bullet core to the driver tests.
Michael Collison [Fri, 25 Sep 2020 22:59:08 +0000 (17:59 -0500)]
[RISCV] Scheduler description for Bullet
Add the pipeline model for the RISC-V Bullet micro architecture.
Co-authored-by: Evandro Menezes <evandro.menezes@sifive.com>
Alexander Shaposhnikov [Fri, 25 Sep 2020 23:02:23 +0000 (16:02 -0700)]
[Object][MachO] Refine the interface of Slice
This patch performs a minor cleanup of the class Slice:
static methods and constructors which take a pointer but assume that
it's not null now take the argument by reference.
NFC.
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D88320
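The kind of cleanup described here can be illustrated with a generic before/after pair. This is not the Slice interface: `ObjectFile`, `cpuTypeOfPtr` and `cpuTypeOfRef` are hypothetical names used only to show the pointer-to-reference change.

```cpp
#include <cstdint>

// Hypothetical stand-in for an object file; not the real MachO class.
struct ObjectFile {
  uint32_t CPUType;
};

// Before: a pointer parameter that is silently assumed to be non-null.
uint32_t cpuTypeOfPtr(const ObjectFile *O) { return O->CPUType; }

// After: a reference makes the non-null requirement part of the signature,
// so callers cannot pass nullptr by accident.
uint32_t cpuTypeOfRef(const ObjectFile &O) { return O.CPUType; }
```

Both functions return the same value for a valid object, which is consistent with the change being NFC; only the contract visible to callers differs.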
Craig Topper [Thu, 24 Sep 2020 22:01:54 +0000 (15:01 -0700)]
[IR] Improve the description for Constant::isNormalFP to list all things that are not normal instead of just denormal. NFC
Evandro Menezes [Fri, 25 Sep 2020 20:44:44 +0000 (15:44 -0500)]
[RISCV] Fix formatting (NFC)