Nicholas Guy [Fri, 15 Jan 2021 14:28:23 +0000 (14:28 +0000)]
Reland "[AArch64] Attempt to sink mul operands"
This relands
dda60035e9f0769c8907cdf6561489e0435c2275,
which was reverted by
dbaa6a1858a42f72b683f700d3bd7a9632f7a518
Nicholas Guy [Fri, 15 Jan 2021 14:26:51 +0000 (14:26 +0000)]
[AArch64] Further restricts when a dup(*ext) can be rearranged
In most cases, the dup(*ext) pattern can be rearranged to perform
the extension on the vector side, allowing for further vector-specific
optimisations to be made. However the initial checks for this conversion
were insufficient, allowing invalid encodings to be attempted (causing
compilation to fail).
Differential Revision: https://reviews.llvm.org/D94778
Andy Wingo [Fri, 27 Nov 2020 08:19:46 +0000 (09:19 +0100)]
[WebAssembly] MC layer writes table symbols to object files
Now that the linker handles table symbols, we can allow the frontend to
produce them.
Depends on D91870.
Differential Revision: https://reviews.llvm.org/D92215
Andy Wingo [Thu, 14 Jan 2021 09:15:56 +0000 (10:15 +0100)]
[WebAssembly] Add support for table linking to wasm-ld
This patch adds support to wasm-ld for linking multiple table references
together, in a manner similar to wasm globals. The indirect function
table is synthesized as needed.
To manage the transitional period in which the compiler doesn't yet
produce TABLE_NUMBER relocations and doesn't residualize table symbols,
the linker will detect object files which have table imports or
definitions, but no table symbols. In that case it will synthesize
symbols for the defined and imported tables.
As a change, relocatable objects are now written with table symbols,
which can cause symbol renumbering in some of the tests. If no object
file requires an indirect function table, none will be written to the
file. Note that for legacy ObjFile inputs, this test is conservative: as
we don't have relocs for each use of the indirect function table, we
just assume that any incoming indirect function table should be
propagated to the output.
Differential Revision: https://reviews.llvm.org/D91870
Simon Pilgrim [Mon, 18 Jan 2021 15:54:05 +0000 (15:54 +0000)]
[X86][AVX] IsElementEquivalent - add matchShuffleWithUNPCK + VBROADCAST/VBROADCAST_LOAD handling
Specify LHS/RHS operands in matchShuffleWithUNPCK's calls to isTargetShuffleEquivalent, and handle VBROADCAST/VBROADCAST_LOAD matching in IsElementEquivalent
Dmitry Preobrazhensky [Mon, 18 Jan 2021 15:38:32 +0000 (18:38 +0300)]
Fix for sanitizer issue in 55c557a
Florian Hahn [Mon, 18 Jan 2021 15:18:17 +0000 (15:18 +0000)]
[LoopRotate] Precommit test for prepare-for-lto handling.
Precommit test for D94232.
Dmitry Preobrazhensky [Mon, 18 Jan 2021 12:16:46 +0000 (15:16 +0300)]
[AMDGPU][MC] Refactored parsing of dpp ctrl
Summary of changes:
- simplified code to improve maintainability;
- replaced lex() with higher level parser functions;
- improved error handling.
Reviewers: rampitec
Differential Revision: https://reviews.llvm.org/D94777
Sanjay Patel [Mon, 18 Jan 2021 14:28:21 +0000 (09:28 -0500)]
[SLP] rename reduction query for min/max ops; NFC
This will avoid confusion once we start matching
min/max intrinsics. All of these hacks to accommodate
cmp+sel idioms should disappear once we canonicalize
to min/max intrinsics.
Sanjay Patel [Mon, 18 Jan 2021 13:57:09 +0000 (08:57 -0500)]
[SLP] reduce opcode API dependency in reduction cost calc; NFC
The icmp opcode is now hard-coded in the cost model call.
This will make it easier to eventually remove all opcode
queries for min/max patterns as we transition to intrinsics.
Djordje Todorovic [Mon, 18 Jan 2021 14:26:40 +0000 (15:26 +0100)]
[CSInfo][MIPS] Update CSInfo in delay slot filler
In MipsDelaySlotFiller, when replacing old call-branch with
the compact branch instruction, an assertion is caused by erasing
the old call with unhandled CSInfo.
The problem was reported in PR48695.
This patch fixes it by moving call site info from the old call
instruction to its replacement.
Patch by Nikola Tesic
Differential revision: https://reviews.llvm.org/D94685
Sean Fertile [Fri, 15 Jan 2021 21:36:50 +0000 (16:36 -0500)]
[PowerPC][AIX]Do not emit xxspltd mnemonic on AIX.
A bug in the system assembler can assemble the xxspltd extended
mnemonic into the wrong instruction (extracting the wrong element).
Emit the full xxpermdi with all operands to work around the problem.
Differential Revision: https://reviews.llvm.org/D94419
Florian Hahn [Mon, 18 Jan 2021 13:40:21 +0000 (13:40 +0000)]
[InferAttrs] Mark some library functions as willreturn.
This patch marks some library functions as willreturn. On the first pass, I
excluded most functions that interact with streams/the filesystem.
Along with willreturn, it also adds nounwind to a set of math functions.
There probably are a few additional attributes we can add for those, but
that should be done separately.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D94684
Caroline Concatto [Mon, 4 Jan 2021 15:21:21 +0000 (15:21 +0000)]
[NFC]Migrate VectorCombine.cpp to use InstructionCost
This patch changes these functions:
vectorizeLoadInsert
isExtractExtractCheap
foldExtractedCmps
scalarizeBinopOrCmp
getShuffleExtract
foldBitcastShuf
to use the class InstructionCost when calling TTI.get<something>Cost().
This patch is part of a series of patches to use InstructionCost instead of
unsigned/int for the cost model functions.
See this thread for context:
http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html
See this patch for the introduction of the type:
https://reviews.llvm.org/D91174
P.S.: This patch adds the test || !NewCost.isValid(), because we want to
return false when:
!NewCost.isValid() && !OldCost.isValid() -> the cost of the transform is expensive
and
!NewCost.isValid() && OldCost.isValid()
Therefore, for simplification, we only add the test for !NewCost.isValid().
Differential Revision: https://reviews.llvm.org/D94069
Kai Nacke [Thu, 14 Jan 2021 13:04:39 +0000 (08:04 -0500)]
[Doc] Fix example in codegen doc.
The attributes in the example are placed wrong:
They belong after the type, not after the parameter name.
Reviewed by: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D94683
Denis Antrushin [Thu, 14 Jan 2021 19:35:18 +0000 (22:35 +0300)]
[Statepoint] Handle `undef` operands in statepoint.
Currently when spilling statepoint register operands in FixupStatepoints
we do not pay attention that it might be `undef`. We just generate a
spill, which may lead to verifier error because we have a use without def.
To handle it, let FixupStatepoints ignore `undef` register operands
completely and change them to some constant value when generating
stack map. Use same value as used by ISel for this purpose (0xFEFEFEFE).
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D94703
Abhina Sreeskantharajan [Mon, 18 Jan 2021 12:14:12 +0000 (07:14 -0500)]
[SystemZ][z/OS] Fix No such file or directory expression error matching in lit tests
On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match successfully.
```
EDC5129I No such file or directory.
```
Reviewed By: muiez
Differential Revision: https://reviews.llvm.org/D94239
Dmitry Preobrazhensky [Mon, 18 Jan 2021 11:54:34 +0000 (14:54 +0300)]
[AMDGPU][MC][GFX10] Improved dpp8 errors handling
Reviewers: rampitec
Differential Revision: https://reviews.llvm.org/D94756
Shilei Tian [Mon, 18 Jan 2021 11:57:52 +0000 (06:57 -0500)]
Revert "[OpenMP] Added the support for hidden helper task in RTL"
This reverts commit
ed939f853da1f2266f00ea087f778fda88848f73.
Florian Hahn [Mon, 18 Jan 2021 10:34:21 +0000 (10:34 +0000)]
[VectorUtils] Do not try to add indices matching tombstone/empty values.
Keys matching the tombstone/empty special values cannot be inserted in a
DenseMap. Under some circumstances, LV tries to add members to an
interleave group that match the special values. Skip adding such
members. This is unlikely to have any impact in practice, because
interleave groups with such indices are very likely to not be
vectorized, due to gaps.
This issue has been surfaced by fuzzing, see
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11638
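As a rough illustration of the skip described above, the sketch below models a map that reserves two key values as sentinels. The names `EmptyKey`, `TombstoneKey`, `insertChecked` and `tryInsertMember` are hypothetical, `std::map` stands in for the real container, and the choice of the extreme `int32_t` values as sentinels is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>

// Assumed sentinel values for this sketch: the two extreme int32_t values.
constexpr int32_t EmptyKey = std::numeric_limits<int32_t>::max();
constexpr int32_t TombstoneKey = std::numeric_limits<int32_t>::min();

// Models a container that refuses reserved keys by asserting.
void insertChecked(std::map<int32_t, int> &Members, int32_t Key, int Id) {
  assert(Key != EmptyKey && Key != TombstoneKey && "reserved key inserted");
  Members[Key] = Id;
}

// Skips indices that collide with a sentinel instead of tripping the
// assertion. Returns true if the member was recorded.
bool tryInsertMember(std::map<int32_t, int> &Members, int32_t Index, int Id) {
  if (Index == EmptyKey || Index == TombstoneKey)
    return false;
  insertChecked(Members, Index, Id);
  return true;
}
```

The caller can treat a `false` return as "this member is not part of the group", which matches the intent of skipping rather than failing.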
Tres Popp [Mon, 18 Jan 2021 11:01:27 +0000 (12:01 +0100)]
Revert "[PowerPC] support register pressure reduction in machine combiner."
This reverts commit
26a396c4ef481cb159bba631982841736a125a9c.
See https://reviews.llvm.org/D92071 for a description of the issue.
Vladislav Vinogradov [Mon, 18 Jan 2021 10:54:06 +0000 (11:54 +0100)]
[mlir] Fix cross-compilation (Linalg ODS gen)
Use cross-compilation approach for `mlir-linalg-ods-gen` application
similar to TblGen tools.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D94598
Simon Pilgrim [Mon, 18 Jan 2021 10:29:08 +0000 (10:29 +0000)]
[DAG] SimplifyDemandedBits - use KnownBits comparisons to remove ISD::UMIN/UMAX ops
Use the KnownBits icmp comparisons to determine when an ISD::UMIN/UMAX op is unnecessary because one operand is known to be ULT/ULE or UGT/UGE the other.
Differential Revision: https://reviews.llvm.org/D94532
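The idea can be sketched on 8-bit values as follows. This is a simplified model, not the LLVM KnownBits class: `KnownBits8` and `knownULE` are hypothetical names, and only the "definitely less-or-equal" case is modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit model: Zero has a bit set where the value is known to
// be 0, One has a bit set where the value is known to be 1.
struct KnownBits8 {
  uint8_t Zero;
  uint8_t One;

  // Smallest possible value: every unknown bit is 0.
  uint8_t minValue() const { return One; }
  // Largest possible value: every unknown bit is 1.
  uint8_t maxValue() const { return static_cast<uint8_t>(~Zero); }
};

// True if every possible value of A is <= every possible value of B.
// In that case umin(A, B) is just A and umax(A, B) is just B.
bool knownULE(const KnownBits8 &A, const KnownBits8 &B) {
  assert((A.Zero & A.One) == 0 && (B.Zero & B.One) == 0 &&
         "a bit cannot be known 0 and known 1 at once");
  return A.maxValue() <= B.minValue();
}
```

For example, if A has its high nibble known zero (at most 0x0F) and B has bit 4 known one (at least 0x10), the min/max node can be dropped.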
Fraser Cormack [Fri, 15 Jan 2021 17:04:52 +0000 (17:04 +0000)]
[RISCV] Add scalable vector truncate patterns
Original patch by @rogfer01.
This patch supports vector truncates, which on RVV must be done in a
series of instructions truncating by one power-of-two at a time. This is
done through custom-lowering and a custom node to avoid LLVM
re-combining the split TRUNCATE nodes.
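As a small sketch of the "one power-of-two at a time" rule, the hypothetical helper below lists the element widths a truncate would pass through. It only models the sequence of widths, not the actual lowering or the custom node.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: element widths visited when narrowing from SrcBits
// to DstBits by halving at each step. Both widths are assumed to be powers
// of two, with SrcBits > DstBits.
std::vector<unsigned> truncateSteps(unsigned SrcBits, unsigned DstBits) {
  assert(DstBits > 0 && SrcBits > DstBits && "must narrow");
  assert((SrcBits & (SrcBits - 1)) == 0 && (DstBits & (DstBits - 1)) == 0 &&
         "widths must be powers of two");
  std::vector<unsigned> Steps;
  for (unsigned Width = SrcBits / 2; Width >= DstBits; Width /= 2)
    Steps.push_back(Width);
  return Steps;
}
```

Under these assumptions, truncating i64 elements to i8 takes three steps (32, 16, 8), while i16 to i8 takes one.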
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Fraser Cormack <fraser@codeplay.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94796
Simon Pilgrim [Fri, 15 Jan 2021 18:25:16 +0000 (18:25 +0000)]
[X86][SSE] isHorizontalBinOp - reuse any existing horizontal ops.
If we already have similar horizontal ops using the same args, then match that, even if we are on a target with slow horizontal ops.
Raphael Isemann [Mon, 18 Jan 2021 10:07:26 +0000 (11:07 +0100)]
[lldb][docs] Use inline literals for code/paths instead of rendering it with the default role
Right now we're using the 'content' role as default which will just render
these things as cursive (which isn't really useful for code examples). It also
prevents us from assigning a more useful default role in the future.
Björn Schäpers [Mon, 18 Jan 2021 09:58:20 +0000 (10:58 +0100)]
[clang-format] Fix documentation of
bcc1dee600
That was an oversight.
Differential Revision: https://reviews.llvm.org/D93776
Georgii Rymar [Fri, 15 Jan 2021 11:21:00 +0000 (14:21 +0300)]
[Object, llvm-readelf] - Move the API for retrieving symbol versions to ELF.h
`ELFDumper.cpp` implements the functionality for retrieving symbol versions.
It is used for dumping versioned symbols.
This helps to implement https://bugs.llvm.org/show_bug.cgi?id=48670 ("make llvm-nm -D print version names"):
we can move out and reuse the code from `ELFDumper.cpp`.
This is what this patch does: it moves the related functionality to `ELFFile<ELFT>`.
Differential revision: https://reviews.llvm.org/D94771
Raphael Isemann [Mon, 18 Jan 2021 09:47:16 +0000 (10:47 +0100)]
[lldb][docs] Resolve the remaining sphinx formatter warnings in the SB API docs
With this patch there should no longer be any warnings when generating the
SB API sphinx docs.
Craig Topper [Mon, 18 Jan 2021 07:46:43 +0000 (23:46 -0800)]
[RISCV] Use tail agnostic policy for instructions with tied defs if the use operand is IMPLICIT_DEF.
The vcompress intrinsic is defined such that it requires a tail
undisturbed policy. This patch makes it so we can use the tail
agnostic policy if the user has passed vundefined to the dest
operand.
We need to do something similar for masked policy, but we need
annotation of which instructions use the mask policy first.
Not sure if this is sufficient for scheduling or if we'll need to
select different pseudos that don't have a tied def.
Reviewed By: evandro
Differential Revision: https://reviews.llvm.org/D94566
Craig Topper [Mon, 18 Jan 2021 07:29:43 +0000 (23:29 -0800)]
[IR] Allow scalable vectors in structs to support intrinsics returning multiple values.
RISC-V would like to use a struct of scalable vectors to return multiple
values from intrinsics. This would also be needed for target independent
intrinsics like llvm.sadd.overflow.
This patch removes the existing restriction for this. I've modified
StructType::isSized to consider a struct containing scalable vectors
as unsized so the verifier won't allow loads/stores/allocas of these
structs.
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D94142
Björn Schäpers [Wed, 23 Dec 2020 21:03:39 +0000 (22:03 +0100)]
[clang-format] Add StatementAttributeLikeMacros option
This allows ignoring, for example, Qt's emit when
AlignConsecutiveDeclarations is set; otherwise it is parsed as a type
and it results in some misformatting:
unsigned char MyChar = 'x';
emit signal(MyChar);
Differential Revision: https://reviews.llvm.org/D93776
Chen Zheng [Mon, 18 Jan 2021 04:53:33 +0000 (23:53 -0500)]
[PowerPC] support register pressure reduction in machine combiner.
Reassociating some patterns to generate more fma instructions to
reduce register pressure.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D92071
Lang Hames [Mon, 18 Jan 2021 04:27:27 +0000 (15:27 +1100)]
[JITLink][ELF] New ELF skip-debug-sections test requires asserts.
This should fix the failures on Release mode testers.
Philip Reames [Mon, 18 Jan 2021 04:29:13 +0000 (20:29 -0800)]
[test] pre commit a couple more tests for vectorizing multiple exit loops
Philip Reames [Mon, 18 Jan 2021 04:03:03 +0000 (20:03 -0800)]
[test] Autogen a loop vectorizer test to make future changes visible
Qiu Chaofan [Mon, 18 Jan 2021 03:56:11 +0000 (11:56 +0800)]
[Legalizer] Promote result type in expanding FP_TO_XINT
This patch promotes the result integer type of FP_TO_XINT during expansion,
which fixes a crash in the conversion from ppc_fp128 to i1.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D92473
Qiu Chaofan [Mon, 18 Jan 2021 03:44:00 +0000 (11:44 +0800)]
[PowerPC] [NFC] Add AIX triple to some regression tests
As part of the effort to improve AIX support, regression test coverage
misses quite a lot for AIX subtarget. This patch adds AIX triple to
those that don't need extra changes, and we can cover more cases in following
commits.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D94159
Juneyoung Lee [Mon, 18 Jan 2021 02:12:36 +0000 (11:12 +0900)]
[InstCombine] more tests for D94861 (NFC)
Lang Hames [Mon, 18 Jan 2021 00:39:32 +0000 (11:39 +1100)]
[JITLink][ELF] Skip DWARF sections in ELF objects.
This matches current JITLink/MachO behavior and avoids processing currently
unsupported relocations.
Fangrui Song [Mon, 18 Jan 2021 01:19:29 +0000 (17:19 -0800)]
Makefile.rules: Make HOST_OS/OS simply expanded variable to avoid excess uname -s invocations
This decreases the number of runs from 18 to 1.
Chen Zheng [Mon, 18 Jan 2021 00:56:42 +0000 (19:56 -0500)]
[NFC] [TargetRegisterInfo] add one use check to lookThruCopyLike.
add one use check to lookThruCopyLike.
The root node is safe to be deleted if we are sure that every
definition in the copy chain only has one use.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D92069
Chandler Carruth [Mon, 18 Jan 2021 00:17:07 +0000 (16:17 -0800)]
Fix openmp CMake build on non-Linux AArch64 systems.
This just checks for `/proc/cpuinfo` existing before reading it.
Tested on an ARM macOS machine.
Fangrui Song [Sun, 17 Jan 2021 21:16:38 +0000 (13:16 -0800)]
Makefile.rules: Delete GCC 4.6 workaround
5.1 is the minimum supported version.
Pavel Labath [Sun, 17 Jan 2021 19:18:55 +0000 (20:18 +0100)]
[lldb] Skip TestPlatformProcessConnect on windows and darwin
The test fails (for different reasons) on these platforms. Skip for now.
Nikita Popov [Sun, 17 Jan 2021 19:03:22 +0000 (20:03 +0100)]
[ValueTracking] Fix isSafeToSpeculativelyExecute for sdiv (PR48778)
The != -1 check does not work correctly for all bitwidths. Use
isAllOnesValue() instead.
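One way such a check can go wrong is sketched below, assuming a fixed-width integer stored zero-extended in a 64-bit word. `FixedInt` and its members are hypothetical stand-ins, not the LLVM types; the point is only that a comparison against a 64-bit -1 never matches a narrower all-ones value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a Bits-wide integer stored zero-extended in 64 bits.
struct FixedInt {
  unsigned Bits;  // width in bits, 1..64
  uint64_t Value; // only the low Bits bits are significant

  static uint64_t maskFor(unsigned Bits) {
    assert(Bits >= 1 && Bits <= 64 && "unsupported width");
    return Bits == 64 ? ~uint64_t(0) : ((uint64_t(1) << Bits) - 1);
  }
  static FixedInt get(unsigned Bits, uint64_t Raw) {
    return FixedInt{Bits, Raw & maskFor(Bits)};
  }

  // Naive check: compares against a 64-bit -1, so it only matches when
  // Bits == 64.
  bool naiveIsMinusOne() const { return Value == static_cast<uint64_t>(-1); }
  // Width-aware check: all significant bits are set.
  bool isAllOnes() const { return Value == maskFor(Bits); }
};
```

With this model an 8-bit 0xFF (that is, -1 as i8) passes `isAllOnes()` but fails the naive comparison, so a divisor of -1 would go unnoticed.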
Nikita Popov [Sun, 17 Jan 2021 18:57:59 +0000 (19:57 +0100)]
[SimplifyCFG] Add test for PR48778 (NFC)
The sdiv is incorrectly speculated.
Stephen Kelly [Sun, 17 Jan 2021 18:25:00 +0000 (18:25 +0000)]
NFC: Minor cleanup of function calls
Kazu Hirata [Sun, 17 Jan 2021 18:39:48 +0000 (10:39 -0800)]
[TableGen] Drop redundant const from return types (NFC)
Identified with readability-const-return-type.
Kazu Hirata [Sun, 17 Jan 2021 18:39:47 +0000 (10:39 -0800)]
[IRBuilder] "Zero"-initialize SmallVector (NFC)
Kazu Hirata [Sun, 17 Jan 2021 18:39:45 +0000 (10:39 -0800)]
[llvm] Use llvm::sort (NFC)
Raphael Isemann [Sun, 17 Jan 2021 16:40:54 +0000 (17:40 +0100)]
[lldb][docs] Fix some RST formatting errors related to code examples.
Mostly just making sure the indentation is right (SBDebugger had 0 spaces
as it was still plain text, the others had too much indentation or other
minor issues).
Dávid Bolvanský [Sun, 17 Jan 2021 16:06:06 +0000 (17:06 +0100)]
[InstCombine] Transform abs pattern using multiplication to abs intrinsic (PR45691)
```
unsigned r(int v)
{
  return (1 | -(v < 0)) * v;
}
```
`r` is equivalent to `abs(v)`.
```
define <4 x i8> @src(<4 x i8> %0) {
%1:
%2 = ashr <4 x i8> %0, { 31, undef, 31, 31 }
%3 = or <4 x i8> %2, { 1, 1, 1, undef }
%4 = mul nsw <4 x i8> %3, %0
ret <4 x i8> %4
}
=>
define <4 x i8> @tgt(<4 x i8> %0) {
%1:
%2 = icmp slt <4 x i8> %0, { 0, 0, 0, 0 }
%3 = sub nsw <4 x i8> { 0, 0, 0, 0 }, %0
%4 = select <4 x i1> %2, <4 x i8> %3, <4 x i8> %0
ret <4 x i8> %4
}
Transformation seems to be correct!
```
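The source-level idiom can also be spot-checked directly. The sketch below uses a hypothetical `absViaMul` helper returning `int`, and excludes `INT_MIN`, where the multiplication would overflow.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>

// (v < 0) is 0 or 1, so -(v < 0) is 0 or -1, and 1 | that is +1 or -1.
// Multiplying by v therefore yields the absolute value.
int absViaMul(int V) {
  assert(V != INT_MIN && "negating INT_MIN would overflow");
  return (1 | -(V < 0)) * V;
}
```

A quick loop comparing `absViaMul(v)` with `std::abs(v)` over a small range is enough to see the two agree.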
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D94874
Dávid Bolvanský [Sun, 17 Jan 2021 14:30:51 +0000 (15:30 +0100)]
[Tests] Add test for PR45691
Raphael Isemann [Fri, 15 Jan 2021 18:49:51 +0000 (19:49 +0100)]
[lldb][docs] Cleanup the Python doc strings for SB API classes
The first line of the doc string ends up on the SB API class summary at
the root page of the Python API web page of LLDB. Currently many of the
descriptions are missing or are several lines which makes the table really
hard to read.
This just adds the missing docstrings where possible and fixes the formatting
where necessary.
Nikita Popov [Sun, 17 Jan 2021 14:57:53 +0000 (15:57 +0100)]
[InstSimplify] Fold x*C1/C2 <= x (PR48744)
We can fold x*C1/C2 <= x to true if C1 <= C2. This is valid even
if the multiplication is not nuw: https://alive2.llvm.org/ce/z/vULors
The multiplication or division can be replaced by shifts. We don't
handle the case where both are shifts, as that should get folded
away by InstCombine.
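The claim can be checked exhaustively on a small type. The sketch below assumes unsigned 8-bit arithmetic with a wrapping multiply and an unsigned divide; `mulDivWrap` and `foldHoldsForAllX` are hypothetical helper names. The wrapped product is never larger than the true product, which is why the bound survives overflow.

```cpp
#include <cassert>
#include <cstdint>

// x*C1/C2 on 8-bit values: the product wraps modulo 256 before dividing.
uint8_t mulDivWrap(uint8_t X, uint8_t C1, uint8_t C2) {
  assert(C2 != 0 && "division by zero");
  uint8_t Prod = static_cast<uint8_t>(X * C1);
  return static_cast<uint8_t>(Prod / C2);
}

// True if x*C1/C2 <= x holds for every 8-bit x.
bool foldHoldsForAllX(uint8_t C1, uint8_t C2) {
  for (unsigned X = 0; X <= 255; ++X)
    if (mulDivWrap(static_cast<uint8_t>(X), C1, C2) > X)
      return false;
  return true;
}
```

Looping over all pairs with C1 <= C2 confirms the fold for this width, while a pair such as C1 = 2, C2 = 1 shows the condition is needed.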
Nikita Popov [Sun, 17 Jan 2021 14:58:37 +0000 (15:58 +0100)]
[InstSimplify] Add tests for x*C1/C2<=x (NFC)
Tests for PR48744.
Utkarsh Saxena [Sun, 17 Jan 2021 14:26:40 +0000 (15:26 +0100)]
[clangd] Use !empty() instead of size()>0
Utkarsh Saxena [Sun, 17 Jan 2021 14:13:01 +0000 (15:13 +0100)]
[clangd] Use empty() instead of size()>0
mydeveloperday [Sun, 17 Jan 2021 11:13:50 +0000 (11:13 +0000)]
[clang-format] PR48594 BraceWrapping: SplitEmptyRecord ignored for templates
https://bugs.llvm.org/show_bug.cgi?id=48594
Empty or small templates were not being treated the same way as small classes especially when SplitEmptyRecord was set to true
This revision aims to help this by identifying a case when we should try not to merge the lines together
Reviewed By: curdeius, JohelEGP
Differential Revision: https://reviews.llvm.org/D93839
Raphael Isemann [Fri, 15 Jan 2021 12:24:24 +0000 (13:24 +0100)]
Reland [lldb][docs] Use sphinx instead of epydoc to generate LLDB's Python reference
The build server should now have the missing dependencies.
Original summary:
Currently LLDB uses epydoc to generate the Python API reference for the website.
epydoc however has been unmaintained for more than a decade and no longer works with
Python 3. Also whatever setup we had once for generating the documentation on
the website server no longer seems to work, so the current website documentation
has been stale for more than a year.
This patch replaces epydoc with sphinx and its automodapi plugin that can
generate Python API references. LLVM already uses sphinx for the rest of the
documentation, so this way we are more consistent with the rest of LLVM. The
only new dependency is the automodapi plugin for sphinx.
This patch effectively does the following things:
* Remove the epydoc code.
* Make a new dummy Python API page in our website that just calls the Sphinx
command for generating the API documentation.
* Add a mock _lldb module that is only used when generating the Python API.
This way we don't have to build all of LLDB to generate the API reference.
Some notes:
* The long list of skips is necessary due to boilerplate functions that SWIG
is generating. Sadly automodapi is not really scriptable from what I can see,
so we have to blacklist this stuff manually.
* The .gitignore change is needed because automodapi wants a subfolder of our
documentation directory to place generated documentation files there. The path
is also what is used on the website, so we can't really work around this
(without copying the whole `docs` dir somewhere else when we build).
* We have to use environment variables to pass our build path to our sphinx
configuration. Sphinx doesn't support passing variables onto that script.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D94489
mydeveloperday [Sun, 17 Jan 2021 11:07:31 +0000 (11:07 +0000)]
[clang-format] Revert
e9e6e3b34a8e
Reverting {D92753} due to issues with #pragma indentation in #ifdef/endif structure
Nikita Popov [Tue, 10 Nov 2020 19:43:46 +0000 (20:43 +0100)]
Reapply [BasicAA] Handle recursive queries more efficiently
There are no changes relative to the original commit. However, an issue
this exposed in BasicAA assumption tracking has been fixed in the
previous commit.
-----
An alias query currently works out roughly like this:
* Look up location pair in cache.
* Perform BasicAA logic (including cache lookup and insertion...)
  * Perform a recursive query using BestAAResults.
    * Look up location pair in cache (and thus do not recurse into BasicAA)
    * Query all the other AA providers.
* Query all the other AA providers.
This is a lot of unnecessary work, all ultimately caused by the
BestAAResults query at the end of aliasCheck(). The reason we perform
it, is that aliasCheck() is getting called recursively, and we of
course want those recursive queries to also make use of other AA
providers, not just BasicAA. We can solve this by making the recursive
queries directly use BestAAResults (which will check both BasicAA
and other providers), rather than recursing into aliasCheck().
There are some tradeoffs:
* We can no longer pass through the precomputed underlying object
to aliasCheck(). This is not a major concern, because nowadays
getUnderlyingObject() is quite cheap.
* Results from other AA providers are no longer cached inside
BasicAA. The way this worked was already a bit iffy, in that a
result could be cached, but if it was MayAlias, we'd still end
up re-querying other providers anyway. If we want to cache
non-BasicAA results, we should do that in a more principled manner.
In any case, despite those tradeoffs, this works out to be a decent
compile-time improvement. I think it also simplifies the mental model
of how BasicAA works. It took me quite a while to fully understand
how these things interact.
Differential Revision: https://reviews.llvm.org/D90094
Nikita Popov [Sat, 16 Jan 2021 20:47:01 +0000 (21:47 +0100)]
[BasicAA] Move assumption tracking into AAQI
D91936 placed the tracking for the assumptions into BasicAA.
However, when recursing over phis, we may use fresh AAQI instances.
In this case AssumptionBasedResults from an inner AAQI can result
in a removal of an element from the outer AAQI.
To avoid this, move the tracking into AAQI. This generally makes
more sense, as the NoAlias assumptions themselves are also stored
in AAQI.
The test case only produces an assertion failure with D90094
reapplied. I think the issue exists independently of that change
as well, but I wasn't able to come up with a reproducer.
Fangrui Song [Sun, 17 Jan 2021 08:02:13 +0000 (00:02 -0800)]
[ELF] Support R_PPC_ADDR24 (ba foo; bla foo)
Kazushi (Jam) Marukawa [Sat, 26 Dec 2020 13:50:17 +0000 (22:50 +0900)]
[VE] Support VE in libunwind
Modify libunwind to support SjLj exception handling routines for VE.
In order to do that, we need to implement not only SjLj exception
handling routines but also a Registers_ve class. This implementation
of Registers_ve is incomplete. We will work on it later when we need
backtrace in libunwind.
Reviewed By: #libunwind, compnerd
Differential Revision: https://reviews.llvm.org/D94591
Craig Topper [Sun, 17 Jan 2021 05:18:52 +0000 (21:18 -0800)]
[RISCV] Remove an extra map lookup from RISCVCompressInstEmitter. NFC
When we looked up the map to see if the entry already existed,
this created the new entry for us. So save a reference to it so
we can use it to update the entry instead of looking it up again.
Also remove unnecessary StringRef constructors around string
literals on calls to this function.
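The single-lookup pattern can be sketched with `std::map`, whose `operator[]` also creates a value-initialized entry on a miss. The function name and the use of 0 as "not yet assigned" are assumptions for this example, not taken from the emitter.

```cpp
#include <cassert>
#include <map>
#include <string>

// Returns the index assigned to Name, assigning the next free one on first
// use. operator[] already inserts a zero entry when Name is missing, so we
// keep the reference and update through it instead of looking up again.
unsigned getOrAssignIndex(std::map<std::string, unsigned> &Indices,
                          const std::string &Name) {
  assert(!Name.empty() && "names are expected to be non-empty");
  unsigned &Entry = Indices[Name];
  if (Entry == 0) // 0 means "freshly created"; real indices start at 1.
    Entry = static_cast<unsigned>(Indices.size());
  return Entry;
}
```

Because the new entry is already counted in `size()`, the first name gets index 1, the second gets 2, and repeated names return their existing index.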
Craig Topper [Sun, 17 Jan 2021 05:09:37 +0000 (21:09 -0800)]
[RISCV] Few more minor cleanups to RISCVCompressInstEmitter. NFC
-Use StringRef instead of std::string.
-Const correct a parameter.
-Don't call StringRef::data() before printing. Just pass the StringRef.
Craig Topper [Sun, 17 Jan 2021 04:59:48 +0000 (20:59 -0800)]
[RISCV] Simplify mergeCondAndCode in RISCVCompressInstEmitter.cpp. NFC
Instead of forming a std::string and returning it to pass into another
raw_ostream, just pass the raw_ostream as a parameter.
Take StringRef as arguments instead of raw_string_ostream references,
making the caller responsible for converting to strings. Use
StringRef operations instead of std::string::substr.
Craig Topper [Sun, 17 Jan 2021 04:23:41 +0000 (20:23 -0800)]
[RISCV] Replace dyn_casts that are only checked by an assert with a cast. NFC
Craig Topper [Sat, 16 Jan 2021 08:03:35 +0000 (00:03 -0800)]
[RISCV] Remove unneeded StringRef to std::string conversions in RISCVCompressInstEmitter. NFC
Stop concatenating std::string before streaming into a raw_ostream.
Just stream the pieces.
Remove some new lines from asserts. Remove std::string concatenation
from an assert. assert strings aren't really evaluated like this at
runtime. An assertion failure will just print exactly what's between
the parentheses in the source.
Fangrui Song [Sun, 17 Jan 2021 00:39:54 +0000 (16:39 -0800)]
[X86] Default to -x86-pad-for-align=false to drop assembler difference with or w/o -g
Fix PR48742: the D75203 assembler optimization locates MCRelaxableFragment's
within two MCSymbol's and relaxes some MCRelaxableFragment's to reduce the size
of a MCAlignFragment. A -g build has more MCSymbol's and therefore may have
different assembler output (e.g. a MCRelaxableFragment (jmp) may have 5 bytes
with -O1 while 2 bytes with -O1 -g).
`.p2align 4, 0x90` is common due to loops. For a larger program, with a
lot of temporary labels, the assembly output difference is more or less
inevitable. The cost seems to outweigh the benefits so we default to
-x86-pad-for-align=false until the heuristic is improved.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D94542
Nikita Popov [Sat, 16 Jan 2021 11:41:35 +0000 (12:41 +0100)]
[InstCombine] Replace one-use select operand based on condition
InstCombine already performs a fold where X == Y ? f(X) : Z is
transformed to X == Y ? f(Y) : Z if f(Y) simplifies. However,
if f(X) only has one use, then we can always directly replace the
use inside the instruction. To actually be profitable, limit it to
the case where Y is a non-expr constant.
This could be further extended to replace uses further up a one-use
instruction chain, but for now this only looks one level up.
Among other things, this also subsumes D94860.
Differential Revision: https://reviews.llvm.org/D94862
Roman Lebedev [Sat, 16 Jan 2021 18:42:40 +0000 (21:42 +0300)]
[SimplifyCFG] markAliveBlocks(): catchswitch: preserve PostDomTree
When removing catchpad's from catchswitch, if that removes a successor,
we need to record that in DomTreeUpdater.
This fixes PostDomTree preservation failure in an existing test.
This appears to be the single issue that i see in my current test coverage.
David Green [Sat, 16 Jan 2021 22:19:35 +0000 (22:19 +0000)]
[ARM] Align blocks that are not fallthrough targets
If the previous block in a function does not fall through, the nops added to
align a block will never be executed. This means we can freely (except for
codesize) align more branches. This happens in constantislandspass (as
it cannot happen later) and only happens at aggressive optimization
levels as it does increase codesize.
Differential Revision: https://reviews.llvm.org/D94394
David Green [Sat, 16 Jan 2021 18:41:11 +0000 (18:41 +0000)]
[ARM] Test for aligned blocks. NFC
Dávid Bolvanský [Sat, 16 Jan 2021 21:48:23 +0000 (22:48 +0100)]
[NFC] Removed extra text in comments
Aart Bik [Sat, 16 Jan 2021 03:49:01 +0000 (19:49 -0800)]
[mlir][sparse] improved sparse runtime support library
Added the ability to read (an extended version of) the FROSTT
file format, so that we can now read in sparse tensors of arbitrary
rank. Generalized the API to deal with more than two dimensions.
Also added the ability to sort the indices of sparse tensors
lexicographically. This is an important step towards supporting
auto gen of initialization code, since sparse storage formats
are easier to initialize if the indices are sorted. Since most
external formats don't enforce such properties, it is convenient
to have this ability in our runtime support library.
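A minimal sketch of lexicographic index sorting follows, assuming each coordinate is stored as a vector of indices of the same rank. `Coord` and `sortLexicographically` are hypothetical names; the real library also has to keep values paired with their indices, which is omitted here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One coordinate of a rank-N tensor element, e.g. {i, j, k} for rank 3.
using Coord = std::vector<uint64_t>;

// Sorts coordinates lexicographically: first by dimension 0, then by
// dimension 1 on ties, and so on. std::vector's operator< already compares
// element by element in that order.
void sortLexicographically(std::vector<Coord> &Indices) {
  for (const Coord &C : Indices)
    assert(C.size() == Indices.front().size() &&
           "all coordinates must share one rank");
  std::sort(Indices.begin(), Indices.end());
}
```

After sorting, elements that share leading indices are adjacent, which is the property a compressed storage format typically relies on during initialization.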
Lastly, the re-entrant problem of the original implementation
is fixed by passing an opaque object around (rather than having
a single static variable, ugh!).
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D94852
Shilei Tian [Sat, 16 Jan 2021 19:12:38 +0000 (14:12 -0500)]
[OpenMP] Added the support for hidden helper task in RTL
The basic design is to create an outer-most parallel team. It is not a regular team because it is only created when the first hidden helper task is encountered, and is only responsible for the execution of hidden helper tasks. We first use `pthread_create` to create a new thread; let's call it the initial and also the main thread of the hidden helper team. This initial thread then initializes a new root, just like what RTL does in initialization. After that, it directly calls `__kmpc_fork_call`, as if the initial thread had encountered a parallel region. In the wrapped function for this team, the main thread (the initial thread that we create via `pthread_create` on Linux) waits on a condition variable. The condition variable can only be signaled when RTL is being destroyed. The other worker threads just do nothing. The main thread needs to wait there because, in the current implementation, once the main thread finishes the wrapped function of this team, it starts to free the team, which is not what we want.
Two environment variables, `LIBOMP_NUM_HIDDEN_HELPER_THREADS` and `LIBOMP_USE_HIDDEN_HELPER_TASK`, are also set to configure the number of threads and enable/disable this feature. By default, the number of hidden helper threads is 8.
Here are some open issues to be discussed:
1. The main thread goes to sleep when the initialization is finished. As Andrey mentioned, we might need it to be woken from time to time to do some work. What kind of update/check should be put here?
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D77609
Sanjay Patel [Sat, 16 Jan 2021 18:51:55 +0000 (13:51 -0500)]
[SLP] remove opcode field from reduction data class
This is NFC-intended and another step towards supporting
intrinsics as reduction candidates.
The remaining bits of the OperationData class do not make
much sense as-is, so I will try to improve that, but I'm
trying to take minimal steps because it's still not clear
how this was intended to work.
Sanjay Patel [Sat, 16 Jan 2021 18:18:05 +0000 (13:18 -0500)]
[SLP] fix typos; NFC
Sanjay Patel [Sat, 16 Jan 2021 16:56:36 +0000 (11:56 -0500)]
[SLP] remove unnecessary use of 'OperationData'
This is another NFC-intended patch to allow matching
intrinsics (example: maxnum) as candidates for reductions.
It's possible that the loop/if logic can be reduced now,
but it's still difficult to understand how this all works.
Dávid Bolvanský [Sat, 16 Jan 2021 18:40:29 +0000 (19:40 +0100)]
[InstSimplify] Handle commutativity for 'and' and 'outer or' for (~A & B) | ~(A | B) --> ~A
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D94870
David Green [Sat, 16 Jan 2021 18:30:21 +0000 (18:30 +0000)]
[ARM] Add low overhead loops terminators to AnalyzeBranch
This treats low overhead loop branches the same as jump tables and
indirect branches in analyzeBranch - they cannot be analyzed but the
direct branches on the end of the block may be removed. This helps
remove the unnecessary branches earlier, which can help produce better
codegen (and change block layout in a number of cases).
Differential Revision: https://reviews.llvm.org/D94392
David Green [Sat, 16 Jan 2021 18:01:30 +0000 (18:01 +0000)]
[ARM] Remove LLC tests from transform/hardware loop tests.
We now have a lot of llc tests for hardware loops in CodeGen, which test
a larger variety of loops and are easier to maintain. This removes the
llc from mixed llc/opt tests.
Dávid Bolvanský [Sat, 16 Jan 2021 17:52:51 +0000 (18:52 +0100)]
[InstSimplify] Precommit new testcases; NFC
Kazu Hirata [Sat, 16 Jan 2021 17:40:54 +0000 (09:40 -0800)]
[llvm] Use *::empty (NFC)
Kazu Hirata [Sat, 16 Jan 2021 17:40:53 +0000 (09:40 -0800)]
[llvm] Construct SmallVector with iterator ranges (NFC)
Kazu Hirata [Sat, 16 Jan 2021 17:40:51 +0000 (09:40 -0800)]
[StringExtras] Fix comment typos (NFC)
Florian Hahn [Sat, 16 Jan 2021 16:28:05 +0000 (16:28 +0000)]
[LTO] Remove options to disable inlining, vectorization & GVNLoadPRE.
This patch removes some ancient options as a clean-up before moving
code-gen to use LTOBackend in D94487.
I think it would be preferable to remove those ancient options, because
1. There are no corresponding options in LTOBackend based tools,
2. There are no unit tests for them,
3. They are not passed through by Clang,
4. At least for GVNLoadPRE, users could just use GVN's `enable-load-pre`.
Alternatively we could add support for those options to lto::Config &
co, but I think it would be better to remove them, unless they are
actually used in practice.
Reviewed By: steven_wu, tejohnson
Differential Revision: https://reviews.llvm.org/D94783
Dávid Bolvanský [Sat, 16 Jan 2021 15:31:02 +0000 (16:31 +0100)]
[InstSimplify] Update comments, remove redundant tests
Hsiangkai Wang [Fri, 15 Jan 2021 03:27:11 +0000 (11:27 +0800)]
[RISCV] Correct alignment settings for vector registers.
According to "9. Vector Memory Alignment Constraints" in the V
specification, vector memory accesses are aligned to the size of the
element. Our current implementation supports ELEN up to 64, so we can
assume that the alignment of vector registers is 64.
Differential Revision: https://reviews.llvm.org/D94751
Dávid Bolvanský [Sat, 16 Jan 2021 14:43:07 +0000 (15:43 +0100)]
[InstSimplify] Add (~A & B) | ~(A | B) --> ~A
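The fold follows from De Morgan's law: ~(A | B) is ~A & ~B, so the left-hand side is ~A & (B | ~B), which is ~A. A small standalone check, not part of the patch, can confirm this exhaustively for 8-bit values; `foldHoldsForAllI8` is a hypothetical name used only here.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if (~A & B) | ~(A | B) equals ~A for every pair of 8-bit
// values. The identity is bitwise, so 8 bits cover every per-bit case.
bool foldHoldsForAllI8() {
  for (unsigned I = 0; I < 256; ++I) {
    for (unsigned J = 0; J < 256; ++J) {
      const uint8_t A = static_cast<uint8_t>(I);
      const uint8_t B = static_cast<uint8_t>(J);
      // Operands promote to int, so truncate the results back to 8 bits.
      const uint8_t Lhs = static_cast<uint8_t>((~A & B) | ~(A | B));
      const uint8_t Rhs = static_cast<uint8_t>(~A);
      if (Lhs != Rhs)
        return false;
    }
  }
  return true;
}
```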
Dávid Bolvanský [Sat, 16 Jan 2021 14:04:54 +0000 (15:04 +0100)]
[Tests] Added tests for new instcombine or simplification; NFC
James Player [Sat, 16 Jan 2021 14:34:20 +0000 (09:34 -0500)]
Fix llvm::Optional build breaks in MSVC using std::is_trivially_copyable
Current code breaks this version of MSVC due to a mismatch between `std::is_trivially_copyable` and `llvm::is_trivially_copyable` for `std::pair` instantiations. Hence I was attempting to use `std::is_trivially_copyable` to set `llvm::is_trivially_copyable<T>::value`.
I spent some time root causing an `llvm::Optional` build error on MSVC 16.8.3 related to the change described above:
```
62>C:\src\ocg_llvm\llvm-project\llvm\include\llvm/ADT/BreadthFirstIterator.h(96,12): error C2280: 'llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>>::operator =(const llvm::Optional<std::pair<std::pair<unsigned int,llvm::Graph<4>::NodeSubset> *,llvm::Optional<llvm::Graph<4>::ChildIterator>>> &)': attempting to reference a deleted function (compiling source file C:\src\ocg_llvm\llvm-project\llvm\unittests\ADT\BreadthFirstIteratorTest.cpp)
...
```
The "trivial" specialization of `optional_detail::OptionalStorage` assumes that the value type is trivially copy constructible and trivially copy assignable. The specialization is invoked based on a check of `is_trivially_copyable` alone, which does not imply both `is_trivially_copy_assignable` and `is_trivially_copy_constructible` are true.
[[ https://en.cppreference.com/w/cpp/named_req/TriviallyCopyable | According to the spec ]], a deleted assignment operator does not make `is_trivially_copyable` false. So I think all these properties need to be checked explicitly in order to specialize `OptionalStorage` to the "trivial" version:
```
/// Storage for any type.
template <typename T, bool = std::is_trivially_copy_constructible<T>::value
&& std::is_trivially_copy_assignable<T>::value>
class OptionalStorage {
```
The above fixed my build break in MSVC, but I think we need to explicitly check `is_trivially_copy_constructible` too, since the copy constructor might be deleted. Ideally we would also move over to `std::is_trivially_copyable` instead of the `llvm` namespace version.
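The distinction can be seen in a small standalone sketch. It is not the `llvm::Optional` code: `NoCopyAssign` and `StorageKind` are hypothetical names, and `StorageKind` only mirrors the shape of the defaulted template argument shown above. The sketch deliberately asserts nothing about `is_trivially_copyable` itself, since that is the trait on which implementations were observed to disagree.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type with a trivial copy constructor but a deleted copy
// assignment operator.
struct NoCopyAssign {
  int Value;
  NoCopyAssign &operator=(const NoCopyAssign &) = delete;
};

// Select the "trivial" variant only when both copy operations are trivial,
// mirroring the condition used for OptionalStorage in the snippet above.
template <typename T,
          bool = std::is_trivially_copy_constructible<T>::value &&
                 std::is_trivially_copy_assignable<T>::value>
struct StorageKind {
  static constexpr bool IsTrivial = false;
};

template <typename T> struct StorageKind<T, true> {
  static constexpr bool IsTrivial = true;
};

// The two traits disagree for NoCopyAssign, so a check of a single
// "copyable" property cannot stand in for both.
static_assert(std::is_trivially_copy_constructible<NoCopyAssign>::value,
              "copy construction is trivial");
static_assert(!std::is_trivially_copy_assignable<NoCopyAssign>::value,
              "copy assignment is deleted");
```

With this condition, `StorageKind<int>` picks the trivial variant, while `StorageKind<NoCopyAssign>` falls back to the general one and never tries to use the deleted assignment operator.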
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D93510
Stephen Kelly [Tue, 5 Jan 2021 23:04:31 +0000 (23:04 +0000)]
[ASTMatchers] Add support for CXXRewrittenBinaryOperator
Differential Revision: https://reviews.llvm.org/D94130
Stephen Kelly [Tue, 5 Jan 2021 01:33:13 +0000 (01:33 +0000)]
[ASTMatchers] Add binaryOperation matcher
This is a simple utility which allows matching on binaryOperator and
cxxOperatorCallExpr. It can also be extended to support
cxxRewrittenBinaryOperator.
Add generic support for MapAnyOfMatchers to auto-marshalling functions.
Differential Revision: https://reviews.llvm.org/D94129
Bjorn Pettersson [Fri, 15 Jan 2021 09:35:56 +0000 (10:35 +0100)]
[LegalizeDAG] Handle NeedInvert when expanding BR_CC
This is a follow-up fix to commit
03c8d6a0c4bd0016bdfd1e5.
Seems like we now end up with NeedInvert being set in the result
from LegalizeSetCCCondCode more often than in the past, so we
need to handle NeedInvert when expanding BR_CC.
Not sure how to deal with the "Tmp4.getNode()" case properly,
but current assumption is that that code path isn't impacted
by the changes in
03c8d6a0c4bd0016bdfd1e5 so we can simply move
the old assert into the if-branch and only handle NeedInvert in the
else-branch.
I think that the test case added here, for PowerPC, might also have
failed before commit
03c8d6a0c4bd0016bdfd1e5. But we started
to hit the assert more often downstream after merging that
commit.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D94762
Stephen Kelly [Sat, 2 Jan 2021 00:01:03 +0000 (00:01 +0000)]
[ASTMatchers] Make cxxOperatorCallExpr matchers API-compatible with n-ary operators
This makes them composable with mapAnyOf().
Differential Revision: https://reviews.llvm.org/D94128
Stephen Kelly [Fri, 1 Jan 2021 23:18:43 +0000 (23:18 +0000)]
[ASTMatchers] Add mapAnyOf matcher
Make it possible to compose a matcher for different base nodes.
This accepts one or more node matcher functors and zero or more
matchers, composing the latter into the former.
This allows composing of matchers where the same inner matcher name is
used for the same concept, but with a different node functor. Currently,
there is a limitation that the nodes must be in the same "clade", so
while
mapAnyOf(ifStmt, forStmt).with(hasBody(stmt()))
can be used, functionDecl cannot be added to the tuple.
It is possible to use this in clang-query, but it will require changes
to the QueryParser, so is deferred to a future review.
Differential Revision: https://reviews.llvm.org/D94127