David Sherwood [Fri, 18 Sep 2020 07:39:31 +0000 (08:39 +0100)]
[SVE] Make EVT::getScalarSizeInBits and others consistent with Type::getScalarSizeInBits
An existing function Type::getScalarSizeInBits returns a uint64_t
instead of a TypeSize class because the caller is requesting a
scalar size, which cannot be scalable. This patch makes other
similar functions requesting a scalar size consistent with that,
thereby eliminating more than 1000 implicit TypeSize -> uint64_t
casts.
Differential revision: https://reviews.llvm.org/D87889
Raphael Isemann [Wed, 23 Sep 2020 08:03:42 +0000 (10:03 +0200)]
Revert "[libc++] Implement LWG1203"
This reverts commit
fdc41e11f9687a50c97e2a59663bf2d541ff5489. It causes the
libcxx/modules/stds_include.sh.cpp test to fail with:
libcxx/include/ostream:1039:45: error: no template named 'enable_if_t'; did you mean 'enable_if'?
template <class _Stream, class _Tp, class = enable_if_t<
Still investigating what's causing this and reverting in the meantime to get
the bots green again.
David Sherwood [Wed, 9 Sep 2020 13:03:07 +0000 (14:03 +0100)]
[SVE] Fix InstCombinerImpl::PromoteCastOfAllocation for scalable vectors
In this patch I've fixed some warnings that arose from the implicit
cast of TypeSize -> uint64_t. I tried writing a variety of different
cases to show how this optimisation might work for scalable vectors
and found:
1. The optimisation does not work for cases where the cast type
is scalable and the allocated type is not. This because we need to
know how many times the cast type fits into the allocated type.
2. If we pass all the various checks for the case when the allocated
type is scalable and the cast type is not, then when creating the
new alloca we have to take vscale into account. This leads to
sub-optimal IR that is worse than the original IR.
3. For the remaining case when both the alloca and cast types are
scalable it is hard to find examples where the optimisation would
kick in, except for simple bitcasts, because we typically fail the
ABI alignment checks.
For now I've changed the code to bail out if only one of the alloca
and cast types is scalable. This means we continue to support the
existing cases where both types are fixed, and also the specific case
when both types are scalable with the same size and alignment, for
example a simple bitcast of an alloca to another type.
I've added tests that show we don't attempt to promote the alloca,
except for simple bitcasts:
Transforms/InstCombine/AArch64/sve-cast-of-alloc.ll
Differential revision: https://reviews.llvm.org/D87378
Piotr Sobczak [Thu, 17 Sep 2020 12:21:23 +0000 (14:21 +0200)]
[AMDGPU] Fix merging m0 inits
Fix incorrect merges of m0 inits in loops.
It was assumed that if a clobbering instruction appears in
the same block as an init and the clobbering instruction
does not dominate the init then it does not interfere with
init.
This does not work in the presence of loops, where in this
scenario, the clobbering instruction does interfere with
the init in another iteration.
To fix this, do not check for block equality and defer the
decision to the predecessor check.
Differential Revision: https://reviews.llvm.org/D87882
MaheshRavishankar [Tue, 22 Sep 2020 22:58:24 +0000 (15:58 -0700)]
[mlir][Linalg] Add pattern to fold linalg.tensor_reshape that add unit extent dims.
A sequence of two reshapes such that one of them is just adding unit
extent dims can be folded to a single reshape.
Differential Revision: https://reviews.llvm.org/D88057
Anatoly Parshintsev [Wed, 23 Sep 2020 06:43:13 +0000 (23:43 -0700)]
[RISCV][ASAN] implementation of ThreadSelf for riscv64
[6/11] patch series to port ASAN for riscv64
Depends On D87574
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D87575
Vitaly Buka [Wed, 23 Sep 2020 06:37:16 +0000 (23:37 -0700)]
[NFC] Reformat preprocessor directives
Vitaly Buka [Wed, 23 Sep 2020 06:29:33 +0000 (23:29 -0700)]
Revert "[RISCV][ASAN] implementation of ThreadSelf for riscv64"
Merged two unrelated commits
This reverts commit
00f6ebef6e347e0d24a8f940fe43656719e88cb8.
Albion Fung [Wed, 23 Sep 2020 06:17:59 +0000 (01:17 -0500)]
[PowerPC] Implementation of 128-bit Binary Vector Mod and Sign Extend builtins
This patch implements 128-bit Binary Vector Mod and Sign Extend builtins for PowerPC10.
Differential: https://reviews.llvm.org/D87394#inline-815858
Martin Storsjö [Wed, 23 Sep 2020 05:52:16 +0000 (08:52 +0300)]
[CVP] Remove a redundant trailing semicolon, fixing GCC warnings. NFC.
Martin Storsjö [Wed, 23 Sep 2020 05:51:34 +0000 (08:51 +0300)]
[InstCombine] Add parentheses in assert to silence GCC warning. NFC.
Martin Storsjö [Wed, 9 Sep 2020 07:34:15 +0000 (10:34 +0300)]
[MC] [Win64EH] Try to generate packed unwind info where possible
In practice, this only gives modest savings (for a 6.5 MB DLL with
230 KB xdata, the xdata sections shrinks by around 2.5 KB); to
gain more, the frame lowering would need to be tweaked to more often
generate frame layouts that match the canonical layouts that can
be written in packed form.
Differential Revision: https://reviews.llvm.org/D87371
Mehdi Amini [Wed, 23 Sep 2020 05:50:05 +0000 (05:50 +0000)]
Add a dump() method on the pass manager for debugging purpose (NFC)
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88008
Anatoly Parshintsev [Wed, 23 Sep 2020 05:27:40 +0000 (22:27 -0700)]
[RISCV][ASAN] implementation of ThreadSelf for riscv64
[6/11] patch series to port ASAN for riscv64
Depends On D87574
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D87575
Alexey Baturo [Wed, 23 Sep 2020 05:21:05 +0000 (22:21 -0700)]
[RISCV][ASAN] implementation for vfork interceptor for riscv64
[5/11] patch series to port ASAN for riscv64
Depends On D87573
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D87574
Anatoly Parshintsev [Wed, 23 Sep 2020 05:10:13 +0000 (22:10 -0700)]
[RISCV][ASAN] implementation of clone interceptor for riscv64
[4/11] patch series to port ASAN for riscv64
Depends On D87572
Reviewed By: eugenis, vitalybuka
Differential Revision: https://reviews.llvm.org/D87573
Anatoly Parshintsev [Wed, 23 Sep 2020 04:58:25 +0000 (21:58 -0700)]
[RISCV][ASAN] implementation of internal syscalls wrappers for riscv64
implements glibc-like wrappers over Linux syscalls.
[3/11] patch series to port ASAN for riscv64
Depends On D87998
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D87572
Jan Korous [Wed, 23 Sep 2020 04:21:08 +0000 (21:21 -0700)]
[Analyzer][WebKit] Use tri-state types for relevant predicates
Some of the predicates can't always be decided - for example when a type
definition isn't available. At the same time it's necessary to let
client code decide what to do about such cases - specifically we can't
just use true or false values as there are callees with
conflicting strategies how to handle this.
This is a speculative fix for PR47276.
Differential Revision: https://reviews.llvm.org/D88133
Anatoly Parshintsev [Wed, 23 Sep 2020 04:30:09 +0000 (21:30 -0700)]
[RISCV][ASAN] updated platform macros to simplify detection of RISCV64 platform
[2/11] patch series to port ASAN for riscv64
Depends On D87997
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D87998
Fangrui Song [Wed, 23 Sep 2020 04:36:59 +0000 (21:36 -0700)]
[sanitizers] Remove the message queue with IPC_RMID after D82897
Nemanja Ivanovic [Wed, 23 Sep 2020 04:18:33 +0000 (23:18 -0500)]
[Sanitizers] Fix test case that doesn't clean up after itself
Commit https://reviews.llvm.org/rG144e57fc9535 added this test
case that creates message queues but does not remove them. The
message queues subsequently build up on the machine until the
system wide limit is reached. This has caused failures for a
number of bots running on a couple of big PPC machines.
This patch just adds the missing cleanup.
Greg McGary [Wed, 23 Sep 2020 03:42:12 +0000 (20:42 -0700)]
[lld-maco] fix build breakage
Teresa Johnson [Sat, 19 Sep 2020 17:09:28 +0000 (10:09 -0700)]
[ThinLTO] Avoid temporaries when loading global decl attachment metadata
When performing ThinLTO importing, the metadata loader attempts to lazy
load, by building an index. However, module level global decl attachment
metadata was being parsed early while building the index, since the
associated (module level) global values aren't materialized on demand.
This results in the creation of forward reference temporary metadatas,
which are expensive.
Normally, these module level global values don't have much attached
metadata. However, in the case of -fwhole-program-vtables (e.g. for
whole program devirtualization), the vtables may have many attached type
metadatas. This was resulting in very slow performance when performing
ThinLTO importing with the default lazy loading.
This patch restructures the handling of these global decl attachment
records, delaying their parsing until after the lazy loading index has
been built. Then the parser can use the interface that loads from the
index, which resolves forward references immediately instead of creating
expensive temporaries.
For one ThinLTO backend that imports from modules containing huge
numbers of vtables and associated types, I measured the following
compile times for the metadata materialization during function
importing, rounded to nearest second:
No -fwhole-program-vtables:
Lazy loading on (head): 1s
Lazy loading off (head): 3s
Lazy loading on (patch): 1s
With -fwhole-program-vtables:
Lazy loading on (head): 440s
Lazy loading off (head): 4s
Lazy loading on (patch): 2s
Differential Revision: https://reviews.llvm.org/D87970
Greg McGary [Sun, 13 Sep 2020 03:45:00 +0000 (20:45 -0700)]
[lld-macho] In the context of relocs, s/target/referent/ for sections & symbols
The word "target" is overloaded, so lighten its load by using another word to denote the symbol or section to which a reloc points. While more stilted than "target", "referent" is rather less pompous than "designatum" or "denotatum". :P
Along the way, make a few neighboring variable names more descriptive.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D87584
Arthur Eubanks [Wed, 23 Sep 2020 02:30:03 +0000 (19:30 -0700)]
[test][NewPM] Clean up ScalarEvolution tests to work under NPM
Bing1 Yu [Wed, 23 Sep 2020 02:13:03 +0000 (10:13 +0800)]
[CostModel][X86] add CostModel for SK_Select(v8f64, v8i64, v16f32, v16i32, v32i16, v64i8)
add CostModel for SK_Select(v8f64, v8i64, v16f32, v16i32, v32i16, v64i8)
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D87884
Arthur Eubanks [Wed, 23 Sep 2020 02:26:30 +0000 (19:26 -0700)]
[test][NewPM] Fix update-scev.ll under NPM
Andrew Litteken [Wed, 23 Sep 2020 02:02:34 +0000 (21:02 -0500)]
Revert "[IRSim] Adding IRSimilarityCandidate that contains a region of IRInstructionData."
This reverts commit
4944bb190fed8861d4d043eaf45e3c1e12aa2dc5.
Arthur Eubanks [Fri, 18 Sep 2020 20:52:11 +0000 (13:52 -0700)]
[NewPM] Pin tests with -debug-pass to legacy PM
-debug-pass is a legacy PM only option.
Some tests checks that the pass returned that it made a change,
which is not relevant to the NPM, since passes return PreservedAnalyses.
Some tests check that passes are freed at the proper time, which is also
not relevant to the NPM.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D87945
Xing GUO [Wed, 23 Sep 2020 00:41:46 +0000 (08:41 +0800)]
[DWARFYAML][test] Simplify __debug_pubnames/types tests. NFC.
This patch stripped unneeded sections from the test case.
Reviewed By: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D88073
Leonard Chan [Wed, 23 Sep 2020 00:40:53 +0000 (17:40 -0700)]
Revert "Canonicalize declaration pointers when forming APValues."
This reverts commit
905b9ca26c94fa86339451a528cedde5004fc1bb.
Reverting because this strips `weak` attributes off function
declarations, leading to the linker error we see at
https://ci.chromium.org/p/fuchsia/builders/ci/clang_toolchain.fuchsia-arm64-debug-subbuild/
b8868932035091473008.
See https://reviews.llvm.org/rG905b9ca26c94 for reproducer details.
Fangrui Song [Mon, 21 Sep 2020 16:43:23 +0000 (09:43 -0700)]
[EHStreamer] Ensure CallSiteEntry::{BeginLabel,EndLabel} are non-null. NFC
... to simplify the code a bit.
Reviewed By: rahmanl
Differential Revision: https://reviews.llvm.org/D87999
Greg McGary [Mon, 21 Sep 2020 18:04:13 +0000 (11:04 -0700)]
[lld-macho] handle option -headerpad_max_install_names
Differential Revision: https://reviews.llvm.org/D88064
Yuanfang Chen [Tue, 22 Sep 2020 23:51:23 +0000 (16:51 -0700)]
[Clang] Fix a typo in implicit-int-float-conversion.c
Andrew Litteken [Tue, 15 Sep 2020 22:30:31 +0000 (17:30 -0500)]
[IRSim] Adding IRSimilarityCandidate that contains a region of IRInstructionData.
The IRSimilarityCandidate is a container to hold a region of
IRInstructions and offer interfaces for the starting instruction, ending
instruction, parent function, length. It also assigns a global value
number for each unique instance of a value in the region.
It also contains an interface to compare two IRSimilarity as to whether
they have the same sequence of similar instructions.
Tests for whether the instructions are similar are found in
unittests/Analysis/IRSimilarityIdentifierTest.cpp.
Differential Revision: https://reviews.llvm.org/D86970
antonio-cortes-perez [Tue, 22 Sep 2020 23:39:50 +0000 (23:39 +0000)]
[NFC][docs] Fix link.
The rendered html was (no hyperlink was generated):
(see Getting Started <GettingStarted.html#git-pre-push-hook>)
Now, it is (with proper hyperlink):
(see Git pre-push hook)
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D88116
Lang Hames [Tue, 22 Sep 2020 23:04:37 +0000 (16:04 -0700)]
[ORC][examples] Add missing library dependencies.
Walter Erquinigo [Tue, 22 Sep 2020 21:49:16 +0000 (14:49 -0700)]
[trace] avoid using <regex>
Easy fix based on the feedback by maskray on
https://reviews.llvm.org/D85705.
Hubert Tong [Tue, 22 Sep 2020 22:52:47 +0000 (18:52 -0400)]
[InstCombine][NFC][tests] Add ninf base value case to pow-sqrt.ll
Hubert Tong [Tue, 22 Sep 2020 22:49:55 +0000 (18:49 -0400)]
[InstCombine] Fix errno bug in pow expansion to sqrt
A conversion from `pow` to `sqrt` shall not call an `errno`-setting
`sqrt` with -//infinity//: the `sqrt` will set `EDOM` where the `pow`
call need not.
This patch avoids the erroneous (pun not intended) transformation by
applying the restrictions discussed in the thread for
https://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html.
The existing tests are updated (depending on emphasis in the checks for
library calls, avoidance of overlap, and overall coverage):
- to add `ninf`, retaining the intended library call,
- to use the intrinsic, retaining the use of `select`, or
- to expect the replacement to not occur.
The following is tested:
- The pow intrinsic folds to a `select` instruction to
handle -//infinity//.
- The pow library call folds, with `ninf`, to `sqrt` without the
`select` instruction associated with handling -//infinity//.
- The pow library call does not fold to `sqrt` without `ninf`.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D87877
Alexey Bataev [Tue, 22 Sep 2020 20:50:21 +0000 (16:50 -0400)]
[SLP]Fix coding style, NFC.
Louis Dionne [Mon, 21 Sep 2020 14:54:16 +0000 (10:54 -0400)]
[libc++] NFC: Reindent the feature test macro generation script
Each feature-test macro is now a clear block indentation-wise.
Louis Dionne [Mon, 21 Sep 2020 14:36:37 +0000 (10:36 -0400)]
[libc++] NFC: Collocate C++20 removed members of std::allocator
Philip Reames [Tue, 22 Sep 2020 21:23:16 +0000 (14:23 -0700)]
[AArch64] Teach analyzeBranch to remove branch equivelent to fallthrough
The motivation here is that MachineBlockPlacement relies on analyzeBranch to remove branches to fallthrough blocks when the branch is not fully analyzeable. With the introduction of the FAULTING_OP psuedo for implicit null checking (see D87861), this case becomes important. Note that it's hard to otherwise exercise this path as BranchFolding handle's any fully analyzeable branch sequence without using this interface.
p.s. For anyone who saw my comment in the original review, what I thought was an issue in BranchFolding originally turned out to simply be a bug in my patch. (Now fixed.)
Differential Revision: https://reviews.llvm.org/D88035
Michael Liao [Tue, 22 Sep 2020 21:33:38 +0000 (17:33 -0400)]
Fix build due to renaming in LoopInfo.
Louis Dionne [Tue, 22 Sep 2020 19:46:37 +0000 (15:46 -0400)]
[libc++] Implement LWG1203
Libc++ had an issue where nonsensical code like
decltype(std::stringstream{} << std::vector<int>{});
would compile, as long as you kept the expression inside decltype in
an unevaluated operand. This turned out to be that we didn't implement
LWG1203, which clarifies what we should do in that case.
rdar://
58769296
Fangrui Song [Tue, 22 Sep 2020 21:07:39 +0000 (14:07 -0700)]
Change LoopInfo::empty to isInnermost after D82895
Stefanos Baziotis [Tue, 22 Sep 2020 20:59:34 +0000 (23:59 +0300)]
Small fixes for "[LoopInfo] empty() -> isInnermost(), add isOutermost()"
Reid Kleckner [Tue, 22 Sep 2020 20:45:22 +0000 (13:45 -0700)]
Revert "[CodeGen] emit CG profile for COFF object file"
This reverts commit
91aed9bf975f1e4346cc8f4bdefc98436386ced2, it is
causing link errors.
Stefanos Baziotis [Tue, 22 Sep 2020 20:28:00 +0000 (23:28 +0300)]
[LoopInfo] empty() -> isInnermost(), add isOutermost()
Differential Revision: https://reviews.llvm.org/D82895
Congzhe Cao [Tue, 22 Sep 2020 20:21:27 +0000 (16:21 -0400)]
[AArch64] Avoid pairing loads with same result reg
When pairing ldr instructions to an ldp instruction, we cannot pair two ldr
destination registers where one is a sub or super register of the other.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D86906
Mircea Trofin [Wed, 16 Sep 2020 19:08:15 +0000 (12:08 -0700)]
[ThinLTO] Option to bypass function importing.
This completes the circle, complementing -lto-embed-bitcode
(specifically, post-merge-pre-opt). Using -thinlto-assume-merged skips
function importing. The index file is still needed for the other data it
contains.
Differential Revision: https://reviews.llvm.org/D87949
David Tenty [Thu, 3 Sep 2020 15:40:13 +0000 (11:40 -0400)]
[compiler-rt][AIX] Add CMake support for 32-bit Power builds
This patch enables support for building compiler-rt builtins for 32-bit
Power arch on AIX. For now, we leave out the specialized ppc builtin
implementations for 128-bit long double and friends since those will
need some special handling for AIX.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D87383
Fangrui Song [Tue, 22 Sep 2020 20:08:12 +0000 (13:08 -0700)]
[lldb][test] Remove accidental import pdb in
783dc7dc7ed7487d0782c2feb8854df949b98e69
Paul C. Anagnostopoulos [Tue, 22 Sep 2020 19:55:51 +0000 (15:55 -0400)]
Two patches to fix the broken build.
One to fix a C++ compiler warning.
One to allow Sphinx to find a new document.
Sriraman Tallam [Tue, 22 Sep 2020 19:12:21 +0000 (12:12 -0700)]
Revert "The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled."
This reverts commit
6950db36d33d85d18e3241ab6c87494c05ebe0fb.
Michael Kruse [Tue, 22 Sep 2020 19:20:37 +0000 (14:20 -0500)]
[flang][msvc] Explicitly reference "this" inside closure. NFC.
The Microsoft compiler seems to have difficulties to decide between a const/non-const method of a captured object context in a closure. The error message is:
```
symbol.cpp(261): error C2668: 'Fortran::semantics::Symbol::detailsIf': ambiguous call to overloaded function
symbol.h(535): note: could be 'const D *Fortran::semantics::Symbol::detailsIf<Fortran::semantics::DerivedTypeDetails>(void) const'
symbol.h(534): note: or 'D *Fortran::semantics::Symbol::detailsIf<Fortran::semantics::DerivedTypeDetails>(void)'
symbol.cpp(261): note: while trying to match the argument list '()'
```
Explicitly using the this-pointer resolves this problem.
This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>.
Reviewed By: DavidTruby
Differential Revision: https://reviews.llvm.org/D88052
Michael Kruse [Tue, 22 Sep 2020 19:19:40 +0000 (14:19 -0500)]
[flang][msvc] Add explicit function template argument to applyLamda. NFC.
Like in D87961, msvc has difficulties deducing the template argument. The error message is:
```
expr-parsers.cpp(383): error C2672: 'applyLambda': no matching overloaded function found
```
Explicitly pass the first template argument to help it.
This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>.
Reviewed By: DavidTruby
Differential Revision: https://reviews.llvm.org/D88001
Michael Kruse [Tue, 22 Sep 2020 19:17:46 +0000 (14:17 -0500)]
[flang][msvc] Add explicit function template argument to applyFunction. NFC.
Msvc has difficulties deducing the template argument here. The error message is:
```
basic-parsers.h(790,12): error C2672: 'applyFunction': no matching overloaded function found
```
Explicitly pass the first template argument to help it.
This patch is part of the series to make flang compilable with MS Visual Studio <http://lists.llvm.org/pipermail/flang-dev/2020-July/000448.html>.
Reviewed By: DavidTruby
Differential Revision: https://reviews.llvm.org/D87961
Raphael Isemann [Tue, 22 Sep 2020 19:13:44 +0000 (21:13 +0200)]
Revert "[lldb] XFAIL TestMemoryHistory on Linux"
This reverts commit
7518006d75accd21325747430d6bced66b2c5ada.
This test apparently works on the Swift CI ubuntu bot, so it shouldn't be
XFAIL'd on Linux.
Mehdi Amini [Tue, 22 Sep 2020 00:51:27 +0000 (00:51 +0000)]
Implement a new kind of Pass: dynamic pass pipeline
Instead of performing a transformation, such pass yields a new pass pipeline
to run on the currently visited operation.
This feature can be used for example to implement a sub-pipeline that
would run only on an operation with specific attributes. Another example
would be to compute a cost model and dynamic schedule a pipeline based
on the result of this analysis.
Discussion: https://llvm.discourse.group/t/rfc-dynamic-pass-pipeline/1637
Recommit after fixing an ASAN issue: the callback lambda needs to be
allocated to a temporary to have its lifetime extended to the end of the
current block instead of just the current call expression.
Reviewed By: silvas
Differential Revision: https://reviews.llvm.org/D86392
Roman Lebedev [Tue, 22 Sep 2020 13:33:18 +0000 (16:33 +0300)]
[CVP] Narrow SDiv/SRem to the smallest power-of-2 that's sufficient to contain its operands
This is practically identical to what we already do for UDiv/URem:
https://rise4fun.com/Alive/04K
Name: narrow udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv i8 %t0, %t1
%r = zext i8 %t2 to i16
Name: narrow exact udiv
Pre: C0 u<= 255 && C1 u<= 255
%r = udiv exact i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = udiv exact i8 %t0, %t1
%r = zext i8 %t2 to i16
Name: narrow urem
Pre: C0 u<= 255 && C1 u<= 255
%r = urem i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = urem i8 %t0, %t1
%r = zext i8 %t2 to i16
... only here we need to look for 'min signed bits', not 'active bits',
and there's an UB to be aware of:
https://rise4fun.com/Alive/KG86
https://rise4fun.com/Alive/LwR
Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv i16 C0, C1
=>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv i9 %t0, %t1
%r = sext i9 %t2 to i16
Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = sdiv exact i16 C0, C1
=>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = sdiv exact i9 %t0, %t1
%r = sext i9 %t2 to i16
Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128
%r = srem i16 C0, C1
=>
%t0 = trunc i16 C0 to i9
%t1 = trunc i16 C1 to i9
%t2 = srem i9 %t0, %t1
%r = sext i9 %t2 to i16
Name: narrow sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv i8 %t0, %t1
%r = sext i8 %t2 to i16
Name: narrow exact sdiv
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = sdiv exact i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = sdiv exact i8 %t0, %t1
%r = sext i8 %t2 to i16
Name: narrow srem
Pre: C0 <= 127 && C1 <= 127 && C0 >= -128 && C1 >= -128 && !(C0 == -128 && C1 == -1)
%r = srem i16 C0, C1
=>
%t0 = trunc i16 C0 to i8
%t1 = trunc i16 C1 to i8
%t2 = srem i8 %t0, %t1
%r = sext i8 %t2 to i16
The ConstantRangeTest.losslessSignedTruncationSignext test sanity-checks
the logic, that we can losslessly truncate ConstantRange to
`getMinSignedBits()` and signext it back, and it will be identical
to the original CR.
On vanilla llvm test-suite + RawSpeed, this fires 1262 times,
while the same fold for UDiv/URem only fires 384 times. Sic!
Additionally, this causes +606.18% (+1079) extra cases of
aggressive-instcombine.NumDAGsReduced, and +473.14% (+1145)
of aggressive-instcombine.NumInstrsReduced folds.
Roman Lebedev [Tue, 22 Sep 2020 14:33:39 +0000 (17:33 +0300)]
[NFC][CVP] Add tests for SDiv/SRem narrowing
Roman Lebedev [Tue, 22 Sep 2020 13:21:19 +0000 (16:21 +0300)]
[NFC][CVP] Give a better name STATISTIC() counting udiv i16 -> udiv i8 xforms
Roman Lebedev [Tue, 22 Sep 2020 12:51:25 +0000 (15:51 +0300)]
[ConstantRange] Introduce getMinSignedBits() method
Similar to the ConstantRange::getActiveBits(), and to similarly-named
methods in APInt, returns the bitwidth needed to represent
the given signed constant range
Roman Lebedev [Tue, 22 Sep 2020 18:34:31 +0000 (21:34 +0300)]
[NFC][APInt] Refactor getMinSignedBits() in terms of getNumSignBits()
This is fully identical to the old implementation, just easier to read.
Roman Lebedev [Tue, 22 Sep 2020 12:34:45 +0000 (15:34 +0300)]
[NFC][CVP] processUDivOrURem(): refactor to use ConstantRange::getActiveBits()
As an exhaustive test shows, this logic is fully identical to the old
implementation, with exception of the case where both of the operands
had empty ranges:
```
TEST_F(ConstantRangeTest, CVP_UDiv) {
unsigned Bits = 4;
EnumerateConstantRanges(Bits, [&](const ConstantRange &CR0) {
if(CR0.isEmptySet())
return;
EnumerateConstantRanges(Bits, [&](const ConstantRange &CR1) {
if(CR0.isEmptySet())
return;
unsigned MaxActiveBits = 0;
for (const ConstantRange &CR : {CR0, CR1})
MaxActiveBits = std::max(MaxActiveBits, CR.getActiveBits());
ConstantRange OperandRange(Bits, /*isFullSet=*/false);
for (const ConstantRange &CR : {CR0, CR1})
OperandRange = OperandRange.unionWith(CR);
unsigned NewWidth = OperandRange.getUnsignedMax().getActiveBits();
EXPECT_EQ(MaxActiveBits, NewWidth) << CR0 << " " << CR1;
});
});
}
```
Roman Lebedev [Tue, 22 Sep 2020 12:17:24 +0000 (15:17 +0300)]
[ConstantRange] Introduce getActiveBits() method
Much like APInt::getActiveBits(), computes how many bits are needed
to be able to represent every value in this constant range,
treating the values as unsigned.
Roman Lebedev [Tue, 22 Sep 2020 08:50:25 +0000 (11:50 +0300)]
[ConstantRange] binaryXor(): special-case binary complement case - the result is precise
Use the fact that `~X` is equivalent to `-1 - X`, which gives us
fully-precise answer, and we only need to special-handle the wrapped case.
This fires ~16k times for vanilla llvm test-suite + RawSpeed.
Roman Lebedev [Tue, 22 Sep 2020 07:37:15 +0000 (10:37 +0300)]
[CVP] Enhance SRem -> URem fold to work not just on non-negative operands
This is a continuation of
8d487668d09fb0e4e54f36207f07c1480ffabbfd,
the logic is pretty much identical for SRem:
Name: pos pos
Pre: C0 >= 0 && C1 >= 0
%r = srem i8 C0, C1
=>
%r = urem i8 C0, C1
Name: pos neg
Pre: C0 >= 0 && C1 <= 0
%r = srem i8 C0, C1
=>
%r = urem i8 C0, -C1
Name: neg pos
Pre: C0 <= 0 && C1 >= 0
%r = srem i8 C0, C1
=>
%t0 = urem i8 -C0, C1
%r = sub i8 0, %t0
Name: neg neg
Pre: C0 <= 0 && C1 <= 0
%r = srem i8 C0, C1
=>
%t0 = urem i8 -C0, -C1
%r = sub i8 0, %t0
https://rise4fun.com/Alive/Vd6
Now, this new logic does not result in any new catches
as of vanilla llvm test-suite + RawSpeed.
but it should be virtually compile-time free,
and it may be important to be consistent in their handling,
because if we had a pair of sdiv-srem, and only converted one of them,
-divrempairs will no longer see them as a pair,
and thus not "merge" them.
Roman Lebedev [Mon, 21 Sep 2020 20:57:13 +0000 (23:57 +0300)]
[NFC][CVP] Add tests for srem with potentially different sigdness domains
Arthur Eubanks [Tue, 22 Sep 2020 18:33:38 +0000 (11:33 -0700)]
[test][NewPM] Pin do-nothing-intrinsic.ll to legacy PM
It tests CallGraph infra around the legacy PM which isn't relevant in NPM.
Arthur Eubanks [Tue, 22 Sep 2020 18:30:05 +0000 (11:30 -0700)]
[LoopInfo][NewPM] Fix tests in Analysis/LoopInfo under NPM
Jonas Devlieghere [Tue, 22 Sep 2020 18:28:00 +0000 (11:28 -0700)]
[lldb] Skip TestMiniDumpUUID with reproducers
The modules not getting orphaned is wreaking havoc when the UUIDs match
between tests.
Jonas Devlieghere [Tue, 22 Sep 2020 18:06:09 +0000 (11:06 -0700)]
[lldb] Skip test_common_completion_process_pid_and_name with reproducers
This test launches a subprocess which will have a different PID during
capture and replay.
Hubert Tong [Tue, 22 Sep 2020 18:04:13 +0000 (14:04 -0400)]
[InstCombine] For pow(x, +/-0.5), stop falling into pow(x, 1.5), etc. case
The current code for handling pow(x, y) where y is an integer plus 0.5
is not explicitly guarded against attempting to transform the case where
abs(y) is exactly 0.5.
The latter case is meant to be handled by `replacePowWithSqrt`. Indeed,
if the pow(x, integer+0.5) case proceeds past a certain point, it will
hit an assertion by attempting to form pow(x, 0) using `getPow`.
This patch adds an explicit check to prevent attempting the
pow(x, integer+0.5) transformation on pow(x, +/-0.5) as suggested during
the review of D87877. This has the effect of retaining the shrinking of
`pow` to `powf` when the `sqrt` libcall cannot be formed.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D88066
Hubert Tong [Tue, 22 Sep 2020 18:01:58 +0000 (14:01 -0400)]
[NFC] Replace tabs with spaces in PPCInstrPrefix.td
Hubert Tong [Tue, 22 Sep 2020 17:59:00 +0000 (13:59 -0400)]
[test][MC] Rehabilitate llvm/test/MC/COFF/bigobj.py
The subject test was not actually running. This patch adds the
relevant suffix to the list of lit case filename extensions for the
enclosing directory.
Minor adjustments are also made to deal with bit rot.
Reviewed By: daltenty
Differential Revision: https://reviews.llvm.org/D87122
Haojian Wu [Tue, 22 Sep 2020 18:20:42 +0000 (20:20 +0200)]
[clang] Fix a typo-correction crash
We leave a dangling TypoExpr when typo-correction is performed
successfully in `checkArgsForPlaceholders`, which leads a crash in the
later TypoCorrection.
This code was added in https://github.com/llvm/llvm-project/commit/
1586782767938df3a20f7abc4d8335c48b100bc4,
and it didn't seem to have enough test coverage.
The fix is to remove this part, and no failuer tests.
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D87815
LLVM GN Syncbot [Tue, 22 Sep 2020 18:07:36 +0000 (18:07 +0000)]
[gn build] Port
8a64689e264
LLVM GN Syncbot [Tue, 22 Sep 2020 18:07:35 +0000 (18:07 +0000)]
[gn build] Port
848d66fafd2
Jan Korous [Thu, 6 Aug 2020 19:07:47 +0000 (11:07 -0800)]
[Analyzer][WebKit] UncountedLocalVarsChecker
Differential Review: https://reviews.llvm.org/D83259
Paul C. Anagnostopoulos [Mon, 21 Sep 2020 17:56:06 +0000 (13:56 -0400)]
Version 0.5 of the new "TableGen Backend Developer's Guide."
Files modified to take comments into account.
MLIR documentation updated for new TableGen documentation files.
Mircea Trofin [Mon, 21 Sep 2020 23:27:09 +0000 (16:27 -0700)]
[NFC][regalloc] Simplify/conform to style guide indvars in Greedy
Differential Revision: https://reviews.llvm.org/D88055
Sam McCall [Tue, 1 Sep 2020 18:22:59 +0000 (20:22 +0200)]
[ASTMatchers] Avoid recursion in ancestor matching to save stack space.
A recent change increased the stack size of memoizedMatchesAncestorOfRecursively
leading to stack overflows on real code involving large fold expressions.
It's not totally unreasonable to choke on very deep ASTs, but as common
infrastructure it's be nice if ASTMatchFinder is more robust.
(It already uses data recursion for the regular "downward" traversal.)
Differential Revision: https://reviews.llvm.org/D86964
Eduardo Caldas [Thu, 17 Sep 2020 15:31:30 +0000 (15:31 +0000)]
[SyntaxTree] Test the List API
Differential Revision: https://reviews.llvm.org/D87839
Jacques Pienaar [Tue, 22 Sep 2020 17:04:21 +0000 (10:04 -0700)]
[mlir][ods] Make OpBuilder and OperationState optional
The OpBuilder is required to start with OpBuilder and OperationState, so remove
the need for the user to specify it. To make it simpler to update callers,
retain the legacy behavior for now and skip injecting OpBuilder/OperationState
when params start with OpBuilder.
Related to bug 47442.
Differential Revision: https://reviews.llvm.org/D88050
Kazuaki Ishizaki [Tue, 22 Sep 2020 17:01:54 +0000 (02:01 +0900)]
[mlir] NFC: fix trivial typos under include directory
Reviewed By: mravishankar, jpienaar
Differential Revision: https://reviews.llvm.org/D88040
Amy Kwan [Tue, 22 Sep 2020 14:41:16 +0000 (09:41 -0500)]
[PowerPC] Implement Vector String Isolate Builtins in Clang/LLVM
This patch implements the vector string isolate (predicate and non-predicate
versions) builtins. The predicate builtins are custom selected within PPCISelDAGToDAG.
Differential Revision: https://reviews.llvm.org/D87671
Amy Kwan [Tue, 22 Sep 2020 14:40:45 +0000 (09:40 -0500)]
[PowerPC] Implement the 128-bit Vector Divide Extended Builtins in Clang/LLVM
This patch implements the 128-bit vector divide extended builtins in Clang/LLVM.
These builtins map to the vdivesq and vdiveuq instructions respectively.
Differential Revision: https://reviews.llvm.org/D87729
Simon Pilgrim [Tue, 22 Sep 2020 16:17:20 +0000 (17:17 +0100)]
[DAG] Remove DAGTypeLegalizer::GenWidenVectorTruncStores (PR42046)
Just scalarize trunc stores - GenWidenVectorTruncStores does the same thing but is flawed (PR42046) and unused.
Differential Revision: https://reviews.llvm.org/D87708
Alexandre Ganea [Tue, 22 Sep 2020 16:05:11 +0000 (12:05 -0400)]
Silence 'warning: unused variable' when compiling with Clang 10.0
Matt Morehouse [Tue, 22 Sep 2020 16:08:49 +0000 (09:08 -0700)]
[sanitizer_common] Add debug print to sysmsg.c
Greg McGary [Tue, 22 Sep 2020 02:02:12 +0000 (19:02 -0700)]
[lld-macho] Make lld::getInteger() tolerate leading "0x"/"0X" when base is 16
ld64 is cool with leading `0x` for hex command-line args, and we should be also.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D88065
Hamilton Tobon Mosquera [Tue, 22 Sep 2020 15:45:43 +0000 (10:45 -0500)]
[OpenMPOpt] Refactored "issue" and "wait" declarations for data map runtime call.
Refactored __tgt_target_data_begin_mapper_<issue|wait> to receive the handle as an input/output argument.
This given the compiler warning of returning the handle as copy.
Differential Revision: https://reviews.llvm.org/D88029
Saleem Abdulrasool [Thu, 10 Sep 2020 21:15:37 +0000 (21:15 +0000)]
Sema: introduce `__attribute__((__swift_name__))`
This introduces the new `swift_name` attribute that allows annotating
APIs with an alternate spelling for Swift. This is used as part of the
importing mechanism to allow interfaces to be imported with a new name
into Swift. It takes a parameter which is the Swift function name.
This parameter is validated to check if it matches the possible
transformed signature in Swift.
This is based on the work of the original changes in
https://github.com/llvm/llvm-project-staging/commit/
8afaf3aad2af43cfedca7a24cd817848c4e95c0c
Differential Revision: https://reviews.llvm.org/D87534
Reviewed By: Aaron Ballman, Dmitri Gribenko
Arthur Eubanks [Tue, 22 Sep 2020 15:27:46 +0000 (08:27 -0700)]
[DI][ASan][NewPM] Fix some DebugInfo ASan tests under NPM
Alexandre Ganea [Tue, 22 Sep 2020 15:24:36 +0000 (11:24 -0400)]
[ThinLTO] Re-order modules for optimal multi-threaded processing
Re-use an optimizition from the old LTO API (used by ld64).
This sorts modules in ascending order, based on bitcode size, so that larger modules are processed first. This allows for smaller modules to be process last, and better fill free threads 'slots', and thusly allow for better multi-thread load balancing.
In our case (on dual Intel Xeon Gold 6140, Windows 10 version 2004, two-stage build), this saves 15 sec when linking `clang.exe` with LLD & `-flto=thin`, `/opt:lldltojobs=all`, no ThinLTO cache, -DLLVM_INTEGRATED_CRT_ALLOC=d:\git\rpmalloc.
Before patch: 102 sec
After patch: 85 sec
Inspired by the work done by David Callahan in D60495.
Differential Revision: https://reviews.llvm.org/D87966
Kostya Kortchinsky [Mon, 21 Sep 2020 21:09:37 +0000 (14:09 -0700)]
[scudo][standalone] Remove the pthread key from the shared TSD
https://reviews.llvm.org/D87420 removed the uses of the pthread key,
but the key itself was left in the shared TSD registry. It is created
on registry initialization, and destroyed on registry teardown.
There is really no use for it now, so we can just remove it.
Differential Revision: https://reviews.llvm.org/D88046
Arthur Eubanks [Tue, 22 Sep 2020 15:20:11 +0000 (08:20 -0700)]
[GVNSink][NewPM] Add GVNSinkPass to PassRegistry.def