Johannes Doerfert [Fri, 28 Aug 2020 01:34:07 +0000 (20:34 -0500)]
[Attributor][FIX] Properly return changed if the IR was modified
Deleting or replacing anything is certainly a modification. This caused
a later assertion in IPSCCP when compiling 400.perlbench with the new PM.
I'm not sure how to test this.
Max Kazantsev [Tue, 8 Sep 2020 04:14:36 +0000 (11:14 +0700)]
[Test] Auto-generated checks for some IndVarSimplify tests
Zequan Wu [Tue, 8 Sep 2020 03:55:05 +0000 (20:55 -0700)]
[Sema] fix /gr warning test case
Qiu Chaofan [Tue, 8 Sep 2020 03:03:09 +0000 (11:03 +0800)]
[PowerPC] Implement instruction clustering for stores
On Power10, it's profitable to schedule some stores with adjacent target
addresses together. This patch implements this feature.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D86754
Alexander Shaposhnikov [Tue, 8 Sep 2020 01:29:48 +0000 (18:29 -0700)]
[llvm-objcopy] Consolidate and unify version tests
In this diff the tests which verify version printing functionality are refactored.
Since they are not specific to a particular format we move them into tool-version.test
and slightly unify (similarly to tool-name.test and tool-help-message.test).
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D87211
Kiran Kumar T P [Tue, 8 Sep 2020 01:22:07 +0000 (06:52 +0530)]
[flang][OpenMP] Enhance parser support for atomic construct to OpenMP 5.0
Summary:
This patch enhances parser support for atomic construct to OpenMP 5.0.
2.17.7 atomic -> ATOMIC [clause [,]] atomic-clause [[,] clause] |
ATOMIC [clause]
clause -> memory-order-clause | HINT(hint-expression)
memory-order-clause -> SEQ_CST | ACQ_REL | RELEASE | ACQUIRE | RELAXED
atomic-clause -> READ | WRITE | UPDATE | CAPTURE
The patch includes code changes and testcase modifications.
Reviewed By: DavidTruby, kiranchandramohan, sameeranjoshi
Differential Revision: https://reviews.llvm.org/D82931
Craig Topper [Tue, 8 Sep 2020 00:57:39 +0000 (17:57 -0700)]
[builtins] Inline __paritysi2 into __paritydi2 and inline __paritydi2 into __parityti2.
No point in making __parityti2 go through 2 calls to get to
__paritysi2.
Reviewed By: MaskRay, efriedma
Differential Revision: https://reviews.llvm.org/D87218
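To illustrate the idea behind this change, here is a minimal Python sketch (not the compiler-rt source) in which the 128-bit parity folds its input down and finishes the 32-bit reduction inline, rather than calling through a 64-bit and then a 32-bit helper. The function name `parity128` and the use of the `0x6996` nibble lookup are assumptions made for illustration.

```python
def parity128(x: int) -> int:
    """Return 1 if the low 128 bits of x contain an odd number of set bits."""
    x &= (1 << 128) - 1
    # Fold 128 -> 64 -> 32 bits; XOR preserves the parity of the bit count.
    x = (x ^ (x >> 64)) & ((1 << 64) - 1)
    x = (x ^ (x >> 32)) & 0xFFFFFFFF
    # Inlined 32-bit step: fold down to a nibble, then look it up.
    x ^= x >> 16
    x ^= x >> 8
    x ^= x >> 4
    # Bit i of 0x6996 is the parity of the 4-bit value i.
    return (0x6996 >> (x & 0xF)) & 1
```

The whole reduction lives in one function, so no intermediate calls are needed for the widest type.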
Mehdi Amini [Tue, 8 Sep 2020 00:56:10 +0000 (00:56 +0000)]
Update SVG images to be properly cropped (NFC)
Mehdi Amini [Tue, 8 Sep 2020 00:06:37 +0000 (00:06 +0000)]
Add a doc/tutorial on traversing the IR
Reviewed By: stephenneuendorffer
Differential Revision: https://reviews.llvm.org/D87221
Mehdi Amini [Mon, 7 Sep 2020 23:58:54 +0000 (23:58 +0000)]
Add documentation for getDependentDialects() in the PassManagement infra docs
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D87181
Zequan Wu [Fri, 21 Aug 2020 20:42:20 +0000 (13:42 -0700)]
[Sema][MSVC] warn at dynamic_cast when /GR- is given
Differential Revision: https://reviews.llvm.org/D86369
Florian Hahn [Mon, 7 Sep 2020 21:52:10 +0000 (22:52 +0100)]
[DSE,MemorySSA] Add an early check for read clobbers to traversal.
Depending on the benchmark, this early exit can save a substantial
amount of compile-time:
http://llvm-compile-time-tracker.com/compare.php?from=505f2d817aa8e07ba98e5fd4a8f6ff0666f89df1&to=eb4e441147f9b4b7a5fcbbc57428cadbe9e01f10&stat=instructions
Fangrui Song [Mon, 7 Sep 2020 21:44:53 +0000 (14:44 -0700)]
[asan][test] Use --image-base for Linux/asan_prelink_test.cpp if ld is LLD
LLD supports -Ttext but with the option there is still a PT_LOAD at address zero
and thus the Linux kernel will map it to a different address and the test will fail.
Use --image-base instead.
Roman Lebedev [Mon, 7 Sep 2020 20:54:06 +0000 (23:54 +0300)]
Reland [SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline
This was reverted in 503deec2183d466dad64b763bab4e15fd8804239
because it caused gigantic increase (3x) in branch mispredictions
in certain benchmarks on certain CPU's,
see https://reviews.llvm.org/D84108#2227365.
It has since been investigated and here are the results:
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20200907/827578.html
> It's an amazingly severe regression, but it's also all due to branch
> mispredicts (about 3x without this). The code layout looks ok so there's
> probably something else to deal with. I'm not sure there's anything we can
> reasonably do so we'll just have to take the hit for now and wait for
> another code reorganization to make the branch predictor a bit more happy :)
>
> Thanks for giving us some time to investigate and feel free to recommit
> whenever you'd like.
>
> -eric
So let's just reland this.
Original commit message:
I've been looking at missed vectorizations in one codebase.
One particular thing that stands out is that some of the loops
reach vectorizer in a rather mangled form, with weird PHI's,
and some of the loops aren't even in a rotated form.
After taking a more detailed look, that happened because
the loop's headers were too big by then. It is evident that
SimplifyCFG's common code hoisting transform is at fault there,
because the pattern it handles is precisely the unrotated
loop basic block structure.
Surprisingly, `SimplifyCFGOpt::HoistThenElseCodeToIf()` is enabled
by default, and is always run, unlike its friend, common code sinking
transform, `SinkCommonCodeFromPredecessors()`, which is not enabled
by default and is only run once very late in the pipeline.
I'm proposing to harmonize this, and disable common code hoisting
until //late// in pipeline. Definition of //late// may vary,
here currently i've picked the same one as for code sinking,
but i suppose we could enable it as soon as right after
loop rotation happens.
Experimentation shows that this does indeed unsurprisingly help,
more loops got rotated, although other issues remain elsewhere.
Now, this undoubtedly seriously shakes phase ordering.
This will undoubtedly be a mixed bag in terms of both compile- and
run- time performance, codesize. Since we no longer aggressively
hoist+deduplicate common code, we don't pay the price of said hoisting
(which wasn't big). That may allow more loops to be rotated,
so we pay that price. That, in turn, may enable all the transforms
that require canonical (rotated) loop form, including but not limited to
vectorization, so we pay that too. And in general, no deduplication means
more [duplicate] instructions going through the optimizations. But there's still
late hoisting, some of them will be caught late.
As per benchmarks i've run {F12360204}, this is mostly within the noise,
there are some small improvements, some small regressions.
One big regression i saw i fixed in rG8d487668d09fb0e4e54f36207f07c1480ffabbfd, but i'm sure
this will expose many more pre-existing missed optimizations, as usual :S
llvm-compile-time-tracker.com thoughts on this:
http://llvm-compile-time-tracker.com/compare.php?from=e40315d2b4ed1e38962a8f33ff151693ed4ada63&to=c8289c0ecbf235da9fb0e3bc052e3c0d6bff5cf9&stat=instructions
* this does regress compile-time by +0.5% geomean (unsurprisingly)
* size impact varies; for ThinLTO it's actually an improvement
The largest fallout appears to be in GVN's load partial redundancy
elimination, it spends *much* more time in
`MemoryDependenceResults::getNonLocalPointerDependency()`.
Non-local `MemoryDependenceResults` is widely-known to be, uh, costly.
There does not appear to be a proper solution to this issue,
other than silencing the compile-time performance regression
by tuning cut-off thresholds in `MemoryDependenceResults`,
at the cost of potentially regressing run-time performance.
D84609 attempts to move in that direction, but the path is unclear
and is going to take some time.
If we look at stats before/after diffs, some excerpts:
* RawSpeed (the target) {F12360200}
* -14 (-73.68%) loops not rotated due to the header size (yay)
* -272 (-0.67%) `"Number of live out of a loop variables"` - good for vectorizer
* -3937 (-64.19%) common instructions hoisted
* +561 (+0.06%) x86 asm instructions
* -2 basic blocks
* +2418 (+0.11%) IR instructions
* vanilla test-suite + RawSpeed + darktable {F12360201}
* -36396 (-65.29%) common instructions hoisted
* +1676 (+0.02%) x86 asm instructions
* +662 (+0.06%) basic blocks
* +4395 (+0.04%) IR instructions
It is likely to be sub-optimal when optimizing for code size,
so one might want to tune the pipeline by enabling sinking/hoisting
when optimizing for size.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D84108
This reverts commit 503deec2183d466dad64b763bab4e15fd8804239.
Nikita Popov [Mon, 7 Sep 2020 19:07:02 +0000 (21:07 +0200)]
[KnownBits] Avoid some copies (NFC)
These lambdas don't need copies, use const reference.
Nikita Popov [Sat, 5 Sep 2020 08:27:23 +0000 (10:27 +0200)]
[SCCP] Compute ranges for supported intrinsics
For intrinsics supported by ConstantRange, compute the result range
based on the argument ranges. We do this independently of whether
some or all of the input ranges are full, as we can often still
constrain the result in some way.
Differential Revision: https://reviews.llvm.org/D87183
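As a rough illustration of why a result range can be useful even when an input range is full, here is a simplified Python sketch. It models ranges as inclusive, non-wrapping unsigned intervals `(lo, hi)`; this is an assumption for illustration only and is far simpler than LLVM's ConstantRange, which also handles wrapped ranges. The helper names are hypothetical.

```python
FULL_U8 = (0, 255)  # full range of an unsigned 8-bit value

def umin_range(a, b):
    """Range of umin(x, y) for x in a and y in b (inclusive bounds)."""
    return (min(a[0], b[0]), min(a[1], b[1]))

def umax_range(a, b):
    """Range of umax(x, y) for x in a and y in b (inclusive bounds)."""
    return (max(a[0], b[0]), max(a[1], b[1]))
```

With one argument completely unknown, `umin_range(FULL_U8, (3, 10))` still bounds the result from above, and `umax_range(FULL_U8, (3, 10))` still bounds it from below.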
Craig Topper [Mon, 7 Sep 2020 19:23:15 +0000 (12:23 -0700)]
[SelectionDAG][X86][ARM] Teach ExpandIntRes_ABS to use sra+add+xor expansion when ADDCARRY is supported.
Rather than using SELECT instructions, use SRA, UADDO/ADDCARRY and
XORs to expand ABS. This is the multi-part version of the sequence
we use in LegalizeDAG.
It's also the same as the Custom sequence uses for i64 on 32-bit
and i128 on 64-bit. So we can remove the X86 customization.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D87215
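The multi-part sequence described above can be sketched in Python as follows. This is an illustrative model, not the SelectionDAG code: it assumes a 64-bit value split into two 32-bit halves, and the function names are hypothetical. The sign mask comes from an arithmetic shift of the high half, the add propagates a carry from the low half to the high half, and the final XOR with the mask completes the negation for negative inputs.

```python
HALF = 32
HALF_MASK = (1 << HALF) - 1

def expand_abs_two_parts(lo: int, hi: int):
    """ABS of a 64-bit value given as two 32-bit halves (lo, hi)."""
    # SRA of the high half by 31: all ones if negative, else zero.
    sign = HALF_MASK if hi >> (HALF - 1) else 0
    # UADDO on the low half: sum plus carry-out.
    total = lo + sign
    lo_sum, carry = total & HALF_MASK, total >> HALF
    # ADDCARRY on the high half consumes the carry.
    hi_sum = (hi + sign + carry) & HALF_MASK
    # XOR both halves with the sign mask.
    return lo_sum ^ sign, hi_sum ^ sign

def abs_i64(x: int) -> int:
    """Apply the two-part expansion to a Python int treated as i64."""
    bits = x & ((1 << 64) - 1)
    lo, hi = expand_abs_two_parts(bits & HALF_MASK, bits >> HALF)
    return (hi << HALF) | lo
```

As with any two's-complement ABS, the minimum value maps to itself: `abs_i64(-(1 << 63))` returns the bit pattern `1 << 63`.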
Sanjay Patel [Mon, 7 Sep 2020 19:47:57 +0000 (15:47 -0400)]
[InstCombine] improve fold of pointer differences
This was supposed to be an NFC cleanup, but there's
a real logic difference (did not drop 'nsw') visible
in some tests in addition to an efficiency improvement.
This is because in the case where we have 2 GEPs,
the code was *always* swapping the operands and
negating the result. But if we have 2 GEPs, we
should *never* need swapping/negation AFAICT.
This is part of improving flags propagation noticed
with PR47430.
Sanjay Patel [Mon, 7 Sep 2020 19:26:43 +0000 (15:26 -0400)]
[InstCombine] add ptr difference tests; NFC
Craig Topper [Mon, 7 Sep 2020 17:59:57 +0000 (10:59 -0700)]
[X86] Use the same sequence for i128 ISD::ABS on 64-bit targets as we use for i64 on 32-bit targets.
Differential Revision: https://reviews.llvm.org/D87214
Craig Topper [Mon, 7 Sep 2020 17:41:05 +0000 (10:41 -0700)]
[X86] Pre-commit new test case for D87214. NFC
Sanjay Patel [Mon, 7 Sep 2020 18:11:06 +0000 (14:11 -0400)]
[DAGCombiner] allow more store merging for non-i8 truncated ops
This is a follow-up suggested in D86420 - if we have a pair of stores
in inverted order for the target endian, we can rotate the source
bits into place.
The "be_i64_to_i16_order" test shows a limitation of the current
function (which might be avoided if we integrate this function with
the other cases in mergeConsecutiveStores). In the earlier
"be_i64_to_i16" test, we skip the first 2 stores because we do not
match the full set as consecutive or rotate-able, but then we reach
the last 2 stores and see that they are an inverted pair of 16-bit
stores. The "be_i64_to_i16_order" test alters the program order of
the stores, so we miss matching the sub-pattern.
Differential Revision: https://reviews.llvm.org/D87112
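The rotate idea can be checked with a small Python model. The sketch assumes a little-endian target and a 32-bit value stored as two 16-bit halves in the inverted order (high half at the lower address); the function names are hypothetical and the model only compares the resulting memory bytes.

```python
def two_inverted_i16_stores_le(x: int) -> bytes:
    """Memory image after storing the high half first, then the low half."""
    lo = x & 0xFFFF
    hi = (x >> 16) & 0xFFFF
    return hi.to_bytes(2, "little") + lo.to_bytes(2, "little")

def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def one_rotated_i32_store_le(x: int) -> bytes:
    """Memory image after a single 32-bit store of the rotated value."""
    return rotl32(x, 16).to_bytes(4, "little")
```

Both functions produce the same four bytes for any input, which is why the pair of stores can be merged into one store of the rotated source.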
Eric Astor [Mon, 7 Sep 2020 18:00:05 +0000 (14:00 -0400)]
[ms] [llvm-ml] Allow use of locally-defined variables in expressions
MASM allows variables defined by equate statements to be used in expressions.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D86946
Eric Astor [Mon, 7 Sep 2020 17:58:55 +0000 (13:58 -0400)]
[ms] [llvm-ml] Fix STRUCT field alignment
MASM aligns fields to the _minimum_ of the STRUCT alignment value and the size of the next field.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D86945
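A minimal Python sketch of the rule stated above, assuming every field is a scalar whose size is a power of two (nested structures and arrays are not modelled). The function name `field_offsets` is hypothetical and this is not the llvm-ml implementation.

```python
def field_offsets(field_sizes, struct_alignment):
    """Return the byte offset of each field under the min() alignment rule."""
    offsets = []
    offset = 0
    for size in field_sizes:
        # Each field is aligned to min(STRUCT alignment, field size).
        align = min(struct_alignment, size)
        offset = (offset + align - 1) // align * align  # round up
        offsets.append(offset)
        offset += size
    return offsets
```

For fields of sizes 1, 4, 2 and 8 with a STRUCT alignment of 4, the 8-byte field is only aligned to 4, so it lands at offset 12 rather than 16.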
Eric Astor [Mon, 7 Sep 2020 17:57:06 +0000 (13:57 -0400)]
[ms] [llvm-ml] Add support for bitwise named operators (AND, NOT, OR) in MASM
Add support for expressions of the form '1 or 2', etc.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D86944
Simon Pilgrim [Mon, 7 Sep 2020 17:35:06 +0000 (18:35 +0100)]
VPlan.h - remove unnecessary forward declarations. NFCI.
Already defined in includes.
Simon Pilgrim [Mon, 7 Sep 2020 17:15:26 +0000 (18:15 +0100)]
MipsISelLowering.h - remove CCState/CCValAssign forward declarations. NFCI.
These are already defined in the CallingConvLower.h include.
Simon Pilgrim [Mon, 7 Sep 2020 16:50:58 +0000 (17:50 +0100)]
BTFDebug.h - reduce MachineInstr.h include to forward declaration. NFCI.
Simon Pilgrim [Mon, 7 Sep 2020 16:09:42 +0000 (17:09 +0100)]
LeonPasses.h - remove unnecessary includes. NFCI.
Reduce to forward declarations and move includes to LeonPasses.cpp where necessary.
Simon Pilgrim [Mon, 7 Sep 2020 15:56:57 +0000 (16:56 +0100)]
LeonPasses.h - remove orphan function declarations. NFCI.
The implementations no longer exist.
Sanjay Patel [Mon, 7 Sep 2020 16:37:59 +0000 (12:37 -0400)]
[InstCombine] improve folds for icmp with multiply operands (PR47432)
Check for no overflow along with an odd constant before
we lose information by converting to bitwise logic.
https://rise4fun.com/Alive/2Xl
Pre: C1 != 0
%mx = mul nsw i8 %x, C1
%my = mul nsw i8 %y, C1
%r = icmp eq i8 %mx, %my
=>
%r = icmp eq i8 %x, %y
Name: nuw ne
Pre: C1 != 0
%mx = mul nuw i8 %x, C1
%my = mul nuw i8 %y, C1
%r = icmp ne i8 %mx, %my
=>
%r = icmp ne i8 %x, %y
Name: odd ne
Pre: C1 % 2 != 0
%mx = mul i8 %x, C1
%my = mul i8 %y, C1
%r = icmp ne i8 %mx, %my
=>
%r = icmp ne i8 %x, %y
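The three proofs above can be cross-checked by brute force for i8. The Python sketch below is an independent sanity check, not part of the patch; the helper names are hypothetical. It relies on the observation that comparing the products with eq/ne is equivalent to comparing the operands exactly when multiplication by C1 is injective over the inputs that do not produce poison.

```python
BITS = 8
MOD = 1 << BITS

def to_signed(v: int) -> int:
    """Interpret the low 8 bits of v as a signed i8."""
    v &= MOD - 1
    return v - MOD if v >= MOD // 2 else v

def mul_is_injective(c1: int, flag: str) -> bool:
    """True if x -> x * c1 never maps two distinct non-poison x to one result."""
    seen = set()
    for x in range(MOD):
        if flag == "nsw":
            product = to_signed(x) * to_signed(c1)
            if not -(MOD // 2) <= product < MOD // 2:
                continue  # signed overflow: poison, not constrained
        elif flag == "nuw":
            product = x * c1
            if product >= MOD:
                continue  # unsigned overflow: poison, not constrained
        else:
            product = x * c1  # plain wrapping multiply
        result = product & (MOD - 1)
        if result in seen:
            return False
        seen.add(result)
    return True
```

An even constant without overflow flags fails the check (for example `0 * 2` and `128 * 2` both wrap to 0), which matches the need for the `C1 % 2 != 0` precondition in the flag-free case.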
Sanjay Patel [Mon, 7 Sep 2020 15:40:59 +0000 (11:40 -0400)]
[InstCombine] move/add tests for icmp with mul operands; NFC
alex-t [Mon, 7 Sep 2020 15:57:27 +0000 (18:57 +0300)]
[AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block
optimizeEndCF removes an EXEC-restoring instruction when it is the only instruction in its block other than the branch to the single successor, and that successor contains an EXEC mask restoring instruction that was lowered from an END_CF belonging to an IF_ELSE.
As a result of this optimization we get a basic block whose only instruction is a branch to the single successor.
If control flow can reach such an empty block from S_CBRANCH_EXECZ/EXECNZ, spill/reload instructions inserted later by the register allocator may be placed under the exec == 0 condition and never execute.
Removing the empty block solves the problem.
This change requires further work to re-implement LIS updates. Currently, LIS is always nullptr in this pass. To enable it we need another patch to fix many places across the codegen.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D86634
Momchil Velikov [Mon, 7 Sep 2020 15:16:52 +0000 (16:16 +0100)]
Reduce the number of memory allocations when displaying
a warning about clobbering reserved registers (NFC).
Also address some minor inefficiencies and style issues.
Differential Revision: https://reviews.llvm.org/D86088
Gabor Marton [Mon, 7 Sep 2020 15:15:15 +0000 (17:15 +0200)]
[analyzer][StdLibraryFunctionsChecker] Have proper weak dependencies
We want the generic StdLibraryFunctionsChecker to report only if there
are no specific checkers that would handle the argument constraint for a
function.
Note, the assumptions are still evaluated, even if the argument
constraint checker is set to not report. This means that the assumptions
made in the generic StdLibraryFunctionsChecker should be an
over-approximation of the assumptions made in the specific checkers. But
most importantly, the assumptions should not contradict.
Differential Revision: https://reviews.llvm.org/D87240
Richard Barton [Mon, 7 Sep 2020 15:33:55 +0000 (16:33 +0100)]
[flang] Spelling and format edits to README.txt. NFC.
Gabor Marton [Thu, 23 Jul 2020 14:57:16 +0000 (16:57 +0200)]
[analyzer][StdLibraryFunctionsChecker] Add POSIX pthread handling functions
Differential Revision: https://reviews.llvm.org/D84415
Richard Barton [Mon, 7 Sep 2020 15:31:12 +0000 (16:31 +0100)]
[flang] Fix link to old repo location in doxygen mainpage. NFC.
Simon Pilgrim [Mon, 7 Sep 2020 15:39:42 +0000 (16:39 +0100)]
AntiDepBreaker.h - remove unnecessary ScheduleDAG.h include. NFCI.
Simon Pilgrim [Mon, 7 Sep 2020 15:17:31 +0000 (16:17 +0100)]
[Sparc] Add reduced funnel shift test case for PR47303
Simon Pilgrim [Mon, 7 Sep 2020 15:11:40 +0000 (16:11 +0100)]
[X86][SSE] Don't use LowerVSETCCWithSUBUS for unsigned compare with +ve operands (PR47448)
We already simplify the unsigned comparisons if we've found the operands are non-negative, but we were still calling LowerVSETCCWithSUBUS which resulted in the PR47448 regressions.
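The simplification mentioned above presumably rests on a simple fact: when both operands have a clear sign bit, an unsigned comparison gives the same answer as a signed one. The Python sketch below checks that fact exhaustively for 8-bit values; the helper names are hypothetical and it says nothing about the actual X86 lowering code.

```python
BITS = 8

def as_signed(v: int) -> int:
    """Interpret an 8-bit pattern as a signed value."""
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v

def unsigned_gt_matches_signed_gt() -> bool:
    """Check x u> y == x s> y for every pair with both sign bits clear."""
    non_negative = range(1 << (BITS - 1))
    return all(
        (x > y) == (as_signed(x) > as_signed(y))
        for x in non_negative
        for y in non_negative
    )
```

The equivalence breaks as soon as a sign bit is set: `0x80 > 1` holds as an unsigned comparison, while `as_signed(0x80) > as_signed(1)` does not.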
Simon Pilgrim [Mon, 7 Sep 2020 14:53:15 +0000 (15:53 +0100)]
[X86][SSE] Add test cases for PR47448
Simon Pilgrim [Mon, 7 Sep 2020 14:31:54 +0000 (15:31 +0100)]
[X86] Replace UpgradeX86AddSubSatIntrinsics with UpgradeX86BinaryIntrinsics generic helper. NFCI.
Feed the Intrinsic::ID value directly instead of via the IsSigned/IsAddition bool flags.
Sanjay Patel [Mon, 7 Sep 2020 14:26:42 +0000 (10:26 -0400)]
[InstCombine] erase instructions leading up to unreachable
Normal dead code elimination ignores assume intrinsics, so we fail to
delete assumes that are not meaningful (and potentially worse if they
cause conflicts with other assumptions).
The motivating example in https://llvm.org/PR47416 suggests that we
might have problems upstream from here (difference between C and C++),
but this should be a cheap way to make sure we remove more dead code.
Differential Revision: https://reviews.llvm.org/D87149
Frederik Gossen [Mon, 7 Sep 2020 13:58:01 +0000 (13:58 +0000)]
[MLIR][Shape] Merge `shape` to `std`/`scf` lowerings.
Merge the two lowering passes because they are not useful by themselves. The new
pass lowers to `std` and `scf` is considered an auxiliary dialect.
See also
https://llvm.discourse.group/t/conversions-with-multiple-target-dialects/1541/12
Differential Revision: https://reviews.llvm.org/D86779
Sjoerd Meijer [Mon, 7 Sep 2020 13:51:39 +0000 (14:51 +0100)]
Follow up of rG5f1cad4d296a, slightly reduced test case. NFC.
Simon Pilgrim [Mon, 7 Sep 2020 14:07:26 +0000 (15:07 +0100)]
[X86] Auto upgrade SSE/AVX PABS intrinsics to generic Intrinsic::abs
Minor followup to D87101, we were expanding this to a neg+icmp+select pattern like we were in CGBuiltin
Simon Pilgrim [Mon, 7 Sep 2020 13:16:38 +0000 (14:16 +0100)]
[X86][SSE] Use llvm.abs.* vector intrinsics instead of old (deprecated) SSE/AVX intrinsics for combine tests
This also allows us to extend testing to SSE2+ targets
Alex Zinenko [Fri, 4 Sep 2020 08:00:52 +0000 (10:00 +0200)]
[mlir] Take ValueRange instead of ArrayRef<Value> in StructuredIndexed
This was likely overlooked when ValueRange was first introduced. There is no
reason why StructuredIndexed needs specifically an ArrayRef so use ValueRange
for better type compatibility with the rest of the APIs.
Reviewed By: nicolasvasilache, mehdi_amini
Differential Revision: https://reviews.llvm.org/D87127
Esme-Yi [Mon, 7 Sep 2020 13:14:00 +0000 (13:14 +0000)]
[NFC][PowerPC] Add tests in constants-i64.ll.
Georgii Rymar [Mon, 7 Sep 2020 12:52:51 +0000 (15:52 +0300)]
[llvm-readobj] - Remove code duplication when printing dynamic relocations. NFCI.
LLVM style code can be simplified to avoid the duplication of logic
related to printing dynamic relocations.
Differential revision: https://reviews.llvm.org/D87089
Daniel Muñoz [Mon, 7 Sep 2020 13:00:31 +0000 (16:00 +0300)]
[KillTheDoctor/CMake] Add missing keyword PRIVATE in target_link_libraries
Add PRIVATE keyword in target_link_libraries to prevent CMake Error on Windows.
While trying to compile llvm/clang on Windows, the following CMake error occurred. The reason is a missing PUBLIC/PRIVATE/INTERFACE keyword in target_link_libraries.
`
CMake Error at utils/KillTheDoctor/CMakeLists.txt:5 (target_link_libraries):
The keyword signature for target_link_libraries has already been used with
the target "KillTheDoctor". All uses of target_link_libraries with a
target must be either all-keyword or all-plain.
The uses of the keyword signature are here:
* cmake/modules/AddLLVM.cmake:771 (target_link_libraries)
`
Reviewed By: tambre
Differential Revision: https://reviews.llvm.org/D87203
Simon Pilgrim [Mon, 7 Sep 2020 12:53:35 +0000 (13:53 +0100)]
[X86][SSE] Move llvm.x86.ssse3.pabs.*.128 intrinsics to ssse3-intrinsics-x86-upgrade.ll
These have been auto upgraded for some time so this is just a tidyup.
Simon Pilgrim [Mon, 7 Sep 2020 12:44:35 +0000 (13:44 +0100)]
[X86] Update SSE/AVX ABS intrinsics to emit llvm.abs.* (PR46851)
We're now getting close to having the necessary analysis/combines etc. for the new generic llvm.abs.* intrinsics.
This patch updates the SSE/AVX ABS vector intrinsics to emit the generic equivalents instead of the icmp+sub+select code pattern.
Differential Revision: https://reviews.llvm.org/D87101
LLVM GN Syncbot [Mon, 7 Sep 2020 12:51:23 +0000 (12:51 +0000)]
[gn build] Port 23f700c785a
Raphael Isemann [Mon, 7 Sep 2020 12:50:13 +0000 (14:50 +0200)]
Revert "[clang] Prevent that Decl::dump on a CXXRecordDecl deserialises further declarations."
This reverts commit 0478720157f6413fad7595b8eff9c70d2d99b637. This probably
doesn't work when forcing deserialising while dumping (which the ASTDumper
optionally supports).
David Truby [Mon, 7 Sep 2020 12:37:05 +0000 (13:37 +0100)]
Revert "[MLIR][Shape] Merge `shape` to `std`/`scf` lowerings."
This reverts commit 15acdd75439b402e993ebe0dbf8eb02e9b88bbdc.
Georgii Rymar [Mon, 7 Sep 2020 12:30:38 +0000 (15:30 +0300)]
[llvm-readobj/elf] - Generalize the code for printing dynamic relocations. NFCI.
Currently we have 2 large `printDynamicRelocations` methods that
have a very similar code for GNU/LLVM styles.
This patch removes the duplication and renames them to `printDynamicReloc`
for consistency.
Differential revision: https://reviews.llvm.org/D87087
Simon Pilgrim [Mon, 7 Sep 2020 12:20:34 +0000 (13:20 +0100)]
MachineStableHash.h - remove MachineInstr.h include. NFC.
Use forward declarations and move the include to MachineStableHash.cpp
Simon Pilgrim [Mon, 7 Sep 2020 12:19:00 +0000 (13:19 +0100)]
[X86] Replace EmitX86AddSubSatExpr with EmitX86BinaryIntrinsic generic helper. NFCI.
Feed the Intrinsic::ID value directly instead of via the IsSigned/IsAddition bool flags.
Simon Wallis [Mon, 7 Sep 2020 12:21:27 +0000 (13:21 +0100)]
[SelectionDAG] memcpy expansion of const volatile struct ignores const zero
In getMemcpyLoadsAndStores(), a memcpy where the source is a zero constant is expanded to a MemOp::Set instead of a MemOp::Copy, even when the memcpy is volatile.
This is incorrect.
The fix is to add a check for volatile, and expand to MemOp::Copy in the volatile case.
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D87134
Sanjay Patel [Mon, 7 Sep 2020 11:03:48 +0000 (07:03 -0400)]
[InstCombine] add test with more unreachable insts; NFC
Goes with D87149
Sanjay Patel [Mon, 7 Sep 2020 10:53:40 +0000 (06:53 -0400)]
[InstCombine] give a name to an intermediate value for easier tracking; NFC
As noted in PR47430, we probably want to conditionally include 'nsw'
here anyway, so we are going to need to fill out the optional args.
Nicolas Vasilache [Mon, 7 Sep 2020 10:05:57 +0000 (06:05 -0400)]
[MLIR] Fix Win test due to partial order of CHECK directives
Differential Revision: https://reviews.llvm.org/D87230
Frederik Gossen [Mon, 7 Sep 2020 12:09:43 +0000 (12:09 +0000)]
[MLIR][Shape] Merge `shape` to `std`/`scf` lowerings.
Merge the two lowering passes because they are not useful by themselves. The new
pass lowers to `std` and `scf` is considered an auxiliary dialect.
See also
https://llvm.discourse.group/t/conversions-with-multiple-target-dialects/1541/12
Differential Revision: https://reviews.llvm.org/D86779
Simon Pilgrim [Mon, 7 Sep 2020 12:10:55 +0000 (13:10 +0100)]
LegalizeTypes.h - remove orphan SplitVSETCC declaration. NFCI.
The implementation no longer exists
Georgii Rymar [Fri, 4 Sep 2020 12:25:36 +0000 (15:25 +0300)]
[llvm-readobj/elf] - Introduce Relocation<ELFT> helper.
It removes templating for Elf_Rel[a] handling that we
introduced earlier and introduces a helper class instead.
It was briefly discussed in D87087, which showed
why having templates is probably not ideal for the generalization
of dumpers code.
Differential revision: https://reviews.llvm.org/D87141
Simon Pilgrim [Mon, 7 Sep 2020 11:56:27 +0000 (12:56 +0100)]
X86AvoidStoreForwardingBlocks.cpp - use unsigned for Opcode values. NFCI.
Fixes clang-tidy cppcoreguidelines-narrowing-conversions warnings.
Simon Pilgrim [Mon, 7 Sep 2020 11:50:32 +0000 (12:50 +0100)]
[X86][AVX] Use lowerShuffleWithPERMV in shuffle combining to support non-VLX targets
lowerShuffleWithPERMV allows us to use the ZMM variants for 128/256-bit variable shuffles on non-VLX AVX512 targets.
This is another step towards shuffle combining between vector widths - we still end up with an annoying regression (combine_vpermilvar_vperm2f128_zero_8f32) but we're going in the right direction.
Xing GUO [Mon, 7 Sep 2020 11:44:46 +0000 (19:44 +0800)]
[obj2yaml] Add support for dumping the .debug_str section.
This patch adds support for dumping the .debug_str section to obj2yaml.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D86867
Frederik Gossen [Mon, 7 Sep 2020 11:41:27 +0000 (11:41 +0000)]
[MLIR][Standard] Add `dynamic_tensor_from_elements` operation
With `dynamic_tensor_from_elements` tensor values of dynamic size can be
created. The body of the operation essentially maps the index space to tensor
elements.
Declare SCF operations in the `scf` namespace to avoid name clash with the new
`std.yield` operation. Resolve ambiguities between `linalg/shape/std/scf.yield`
operations.
Differential Revision: https://reviews.llvm.org/D86276
Sam Parker [Mon, 7 Sep 2020 10:54:05 +0000 (11:54 +0100)]
[SCEV] Refactor isHighCostExpansionHelper
To enable the cost of constants, the helper function has been
reorganised:
- A struct has been introduced to hold SCEV operand information so
that we know the user of the operand, as well as the operand index.
The Worklist now uses this struct instead of a bare SCEV.
- The costing of each SCEV, and collection of its operands, is now
performed in a helper function.
Differential Revision: https://reviews.llvm.org/D86050
LLVM GN Syncbot [Mon, 7 Sep 2020 10:32:22 +0000 (10:32 +0000)]
[gn build] Port 0478720157f
Raphael Isemann [Mon, 7 Sep 2020 09:23:39 +0000 (11:23 +0200)]
[clang] Prevent that Decl::dump on a CXXRecordDecl deserialises further declarations.
Decl::dump is primarily used for debugging to visualise the current state of a
declaration. Usually Decl::dump just displays the current state of the Decl and
doesn't actually change any of its state, however since commit
457226e02a6e8533eaaa864a3fd7c8eeccd2bf58 the method actually started loading
additional declarations from the ExternalASTSource. As a result, calling
Decl::dump during a debugging session now makes permanent changes to the
AST and causes the debugged program run to deviate from the original run.
The change that caused this behaviour is the addition of
`hasConstexprDestructor` (which is called from the TextNodeDumper) which
performs a lookup into the current CXXRecordDecl to find the destructor. All
other similar methods just return their respective bit in the DefinitionData
(which obviously doesn't have such side effects).
This just changes the node printer to emit "unknown_constexpr" in case a
CXXRecordDecl is dumped that could potentially call into the ExternalASTSource
instead of the usually empty string/"constexpr". For CXXRecordDecls that can
safely be dumped the old behaviour is preserved.
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D80878
LLVM GN Syncbot [Mon, 7 Sep 2020 10:25:26 +0000 (10:25 +0000)]
[gn build] Port 81aa66f65f5
Benjamin Kramer [Mon, 7 Sep 2020 10:24:30 +0000 (12:24 +0200)]
[X86] Unbreak the build after 22fa6b20d92e
Simon Pilgrim [Mon, 7 Sep 2020 10:10:40 +0000 (11:10 +0100)]
[X86] getFauxShuffleMask - handle insert_subvector(zero, sub, C)
Directly use SM_SentinelZero elements if we're (widening)inserting into a zero vector.
Simon Pilgrim [Mon, 7 Sep 2020 09:58:53 +0000 (10:58 +0100)]
[X86][AVX] Add extra vperm2f128+vpermilvar combine coverage
The existing test /should/ reduce to a vmovaps (concat xmm with zero upper).
Simon Pilgrim [Mon, 7 Sep 2020 09:49:29 +0000 (10:49 +0100)]
[X86] Use Register instead of unsigned. NFCI.
Fixes llvm-prefer-register-over-unsigned clang-tidy warnings.
Esme-Yi [Mon, 7 Sep 2020 09:45:47 +0000 (09:45 +0000)]
[NFC][PowerPC] Add tests for `mul` with big constants.
Simon Pilgrim [Mon, 7 Sep 2020 09:30:53 +0000 (10:30 +0100)]
[X86] Use Register instead of unsigned. NFCI.
Fixes llvm-prefer-register-over-unsigned clang-tidy warnings.
Simon Pilgrim [Mon, 7 Sep 2020 09:28:01 +0000 (10:28 +0100)]
[X86] Use Register instead of unsigned. NFCI.
Fixes llvm-prefer-register-over-unsigned clang-tidy warning.
Eduardo Caldas [Mon, 31 Aug 2020 16:03:31 +0000 (16:03 +0000)]
[Ignore Expressions][NFC] Refactor to better use `IgnoreExpr.h` and nits
This change groups
* Rename: `ignoreParenBaseCasts` -> `IgnoreParenBaseCasts` for uniformity
* Rename: `IgnoreConversionOperator` -> `IgnoreConversionOperatorSingleStep` for uniformity
* Inline `IgnoreNoopCastsSingleStep` into a lambda inside `IgnoreNoopCasts`
* Refactor `IgnoreUnlessSpelledInSource` to make adequate use of `IgnoreExprNodes`
Differential Revision: https://reviews.llvm.org/D86880
Eduardo Caldas [Fri, 28 Aug 2020 11:52:54 +0000 (11:52 +0000)]
Extract infrastructure to ignore intermediate expressions into `clang/AST/IgnoreExpr.h`
Rationale:
This allows users to use `IgnoreExprNodes` and `Ignore*SingleStep` outside of
`clang/AST/Expr.cpp`.
Minor:
Rename `IgnoreImp...SingleStep` into `IgnoreImplicit...SingleStep`.
Differential Revision: https://reviews.llvm.org/D86778
Nicolas Vasilache [Fri, 4 Sep 2020 15:43:00 +0000 (11:43 -0400)]
[mlir][Vector] Revisit VectorToSCF.
Vector to SCF conversion still had issues due to the interaction with the natural alignment derived by the LLVM data layout. One traditional workaround is to allocate aligned. However, this does not always work for vector sizes that are non-powers of 2.
This revision implements a more portable mechanism where the intermediate allocation is always a memref of elemental vector type. AllocOp is extended to use the natural LLVM DataLayout alignment for non-scalar types, when the alignment is not specified in the first place.
An integration test is added that exercises the transfer to scf.for + scalar lowering with a 5x5 transposition.
Differential Revision: https://reviews.llvm.org/D87150
Pushpinder Singh [Thu, 3 Sep 2020 11:57:46 +0000 (07:57 -0400)]
[OpenMP][AMDGPU] Use DS_Max_Warp_Number instead of WARPSIZE
The size of worker_rootS should have been DS_Max_Warp_Number.
This reduces memory usage by deviceRTL on AMDGPU from around 2.3GB
to around 770MB.
Reviewed By: JonChesterfield, jdoerfert
Differential Revision: https://reviews.llvm.org/D87084
Alex Richardson [Mon, 7 Sep 2020 08:29:56 +0000 (09:29 +0100)]
[clang-format] Correctly parse function declarations with TypenameMacros
When using the always break after return type setting:
Before:
SomeType funcdecl(LIST(uint64_t));
After:
SomeType
funcdecl(LIST(uint64_t));
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D87007
Alex Richardson [Mon, 7 Sep 2020 08:29:40 +0000 (09:29 +0100)]
[clang-format] Parse __underlying_type(T) as a type
Before: MACRO(__underlying_type(A) * a);
After: MACRO(__underlying_type(A) *a);
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D86960
Alex Richardson [Tue, 1 Sep 2020 17:09:07 +0000 (18:09 +0100)]
[clang-format] Fix formatting of _Atomic() qualifier
Before: _Atomic(uint64_t) * a;
After: _Atomic(uint64_t) *a;
This treats _Atomic the same as the TypenameMacros and decltype. It
also allows some cleanup by removing checks whether the token before a
paren is kw_decltype and instead checking for TT_TypeDeclarationParen.
While touching this code, also extend the decltype test cases to check
for typeof() and _Atomic(T).
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D86959
Alex Richardson [Mon, 7 Sep 2020 08:26:47 +0000 (09:26 +0100)]
[clang-format] Check that */& after typename macros are pointers/references
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D86950
Alex Richardson [Mon, 7 Sep 2020 08:26:16 +0000 (09:26 +0100)]
[clang-format] Handle typename macros inside cast expressions
Before: x = (STACK_OF(uint64_t)) & a;
After: x = (STACK_OF(uint64_t))&a;
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D86930
Alex Richardson [Mon, 7 Sep 2020 08:26:05 +0000 (09:26 +0100)]
[clang-format] Allow configuring list of macros that map to attributes
This adds an `AttributeMacros` configuration option that causes certain
identifiers to be parsed like a __attribute__((foo)) annotation.
This is motivated by our CHERI C/C++ fork which adds a __capability
qualifier for pointer/reference. Without this change clang-format parses
many type declarations as multiplications/bitwise-and instead.
I initially considered adding "__capability" as a new clang-format keyword,
but having a list of macros that should be treated as attributes is more
flexible since it can be used e.g. for static analyzer annotations or other language
extensions.
Example: std::vector<foo * __capability> -> std::vector<foo *__capability>
Depends on D86775 (to apply cleanly)
Reviewed By: MyDeveloperDay, jrtc27
Differential Revision: https://reviews.llvm.org/D86782
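As a rough illustration of the kind of declaration this option affects, the sketch below uses a hypothetical empty macro `CAP_QUAL` standing in for a qualifier such as `__capability`. Assuming a `.clang-format` entry like `AttributeMacros: [CAP_QUAL]`, the `*` before the macro should be formatted as a pointer declarator rather than a multiplication. The macro, struct, and function names are made up for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an attribute-like qualifier macro (e.g. CHERI's
// __capability). It expands to nothing so the example builds anywhere.
#define CAP_QUAL

struct Foo {
  int value;
};

// Both declarations below place the macro after '*'. With CAP_QUAL listed in
// AttributeMacros, the expected formatting is "Foo *CAP_QUAL", not
// "Foo * CAP_QUAL".
inline int firstValue(const std::vector<Foo *CAP_QUAL> &items) {
  assert(!items.empty() && "expects at least one element");
  Foo *CAP_QUAL first = items.front();
  return first->value;
}
```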
Sam Parker [Mon, 7 Sep 2020 08:08:07 +0000 (09:08 +0100)]
[SimplifyCFG] Consider cost of combining predicates.
Modify FoldBranchToCommonDest to consider the cost of inserting
instructions when attempting to combine predicates to fold blocks.
The threshold can be controlled via a new option:
-simplifycfg-branch-fold-threshold which defaults to '2' to allow
the insertion of a not and another logical operator.
Differential Revision: https://reviews.llvm.org/D86526
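The transformation being costed can be pictured at the source level. This is only a sketch, not the pass itself: `beforeFold` has two branches to a common destination, `afterFold` combines the predicates with one `not` and one `or`, and `worthFolding` is a hypothetical helper that mimics a threshold of 2 extra instructions. The exact comparison LLVM performs is an assumption here.

```cpp
#include <cassert>

// Two conditional branches that share a destination (returning 1).
inline int beforeFold(bool a, bool b) {
  if (!a)
    return 1;
  if (b)
    return 1;
  return 0;
}

// The same logic with the predicates combined into a single branch. The
// combination costs two extra instructions: a 'not' and an 'or'.
inline int afterFold(bool a, bool b) {
  bool notA = !a;
  bool cond = notA | b; // non-short-circuit, like an i1 'or'
  return cond ? 1 : 0;
}

// Hypothetical cost gate: fold only if the number of inserted instructions
// does not exceed the threshold (default 2, as described in the commit).
inline bool worthFolding(unsigned extraInstrs, unsigned threshold = 2) {
  return extraInstrs <= threshold;
}
```

Under this model a fold that needs a `not` plus one logical operator is accepted at the default threshold, while anything needing a third instruction is rejected.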
Jay Foad [Thu, 27 Aug 2020 13:26:38 +0000 (14:26 +0100)]
[GlobalISel] Extend not_cmp_fold to work on conditional expressions
Differential Revision: https://reviews.llvm.org/D86709
Sam Parker [Tue, 25 Aug 2020 11:17:24 +0000 (12:17 +0100)]
[ARM][CostModel] CodeSize costs for i1 arith ops
When optimising for size, make the cost of i1 logical operations
relatively expensive so that optimisations don't try to combine
predicates.
Differential Revision: https://reviews.llvm.org/D86525
Xing GUO [Mon, 7 Sep 2020 08:16:38 +0000 (16:16 +0800)]
[DWARFYAML] Make the debug_addr section optional.
This patch makes the debug_addr section optional. When an empty
debug_addr section is specified, yaml2obj only emits a section header
for it.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D87205
Raphael Isemann [Mon, 7 Sep 2020 08:14:22 +0000 (10:14 +0200)]
Add BinaryFormat/ELFRelocs/CSKY.def to LLVM modulemap
Jay Foad [Wed, 2 Sep 2020 15:01:48 +0000 (16:01 +0100)]
[KnownBits] Implement accurate unsigned and signed max and min
Use the new implementation in ValueTracking, SelectionDAG and
GlobalISel.
Differential Revision: https://reviews.llvm.org/D87034
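For readers unfamiliar with known-bits reasoning, here is a deliberately simplified sketch of an unsigned max on 32-bit values. It is not the `llvm::KnownBits` implementation and is less precise than what the commit describes: it only handles the case where one operand provably dominates, and otherwise keeps the bits on which both operands agree. The `Known` struct and function names are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified known-bits record for a 32-bit value (illustrative only).
struct Known {
  uint32_t zero; // bits known to be 0
  uint32_t one;  // bits known to be 1
};

// Smallest value consistent with the known bits: all unknown bits are 0.
inline uint32_t minValue(Known k) { return k.one; }

// Largest value consistent with the known bits: all unknown bits are 1.
inline uint32_t maxValue(Known k) { return ~k.zero; }

// Conservative known bits of umax(lhs, rhs).
inline Known knownUMax(Known lhs, Known rhs) {
  assert((lhs.zero & lhs.one) == 0 && (rhs.zero & rhs.one) == 0 &&
         "a bit cannot be known to be both 0 and 1");
  // If one side is provably >= the other, the result is that side.
  if (minValue(lhs) >= maxValue(rhs))
    return lhs;
  if (minValue(rhs) >= maxValue(lhs))
    return rhs;
  // Otherwise the result is one of the two operands, so any bit known
  // identically in both is known in the result.
  return Known{lhs.zero & rhs.zero, lhs.one & rhs.one};
}
```

The same intersection idea applies to unsigned min and, with signed range bounds in place of `minValue`/`maxValue`, to the signed variants.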
Kristina Bessonova [Mon, 7 Sep 2020 08:03:32 +0000 (10:03 +0200)]
[cmake] Fix build of attribute plugin example on Windows
It seems '${cmake_2_8_12_PRIVATE}' was removed a long time ago, so this should
just be the PRIVATE keyword.
Reviewed By: john.brawn
Differential Revision: https://reviews.llvm.org/D86091
Raul Tambre [Sat, 5 Sep 2020 15:05:14 +0000 (18:05 +0300)]
[CMake][TableGen] Remove dead CMake version checks
LLVM requires CMake 3.13.4, so remove version checks that are dead code.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D87190