Simon Pilgrim [Tue, 29 Sep 2020 09:56:00 +0000 (10:56 +0100)]
[InstCombine] Add trunc(lshr(sext(x),c)) non-uniform vector tests
Florian Hahn [Tue, 29 Sep 2020 09:38:44 +0000 (10:38 +0100)]
[LoopDeletion] Forget loop before setting values to undef
After D71539, we need to forget the loop before setting the incoming
values of phi nodes in exit blocks, because we are looking through those
phi nodes now and the SCEV expression could depend on the loop phi. If
we update the phi nodes before forgetting the loop, we miss those users
during invalidation.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D88167
Max Kazantsev [Tue, 29 Sep 2020 08:32:53 +0000 (15:32 +0700)]
[SCEV][NFC] Introduce isBasicBlockEntryGuardedByCond
Currently, we have the `isLoopEntryGuardedByCond` method in SCEV, which
checks that some fact is true if we enter the loop. This is just a
particular case of the more general concept `isBasicBlockEntryGuardedByCond`
applied to the given loop's header. The logic of this code is largely
independent of the given loop and only cares about the code above it.
This patch makes this generalization. Now we can query it for any block,
and `isLoopEntryGuardedByCond` is just a particular case.
Differential Revision: https://reviews.llvm.org/D87828
Reviewed By: fhahn
Tres Popp [Tue, 29 Sep 2020 08:24:54 +0000 (10:24 +0200)]
Revert "OpaquePtr: Add type to sret attribute"
This reverts commit
55c4ff91bd820d72014f63dcf7f3d5a0d3397986.
Issues were introduced as discussed in https://reviews.llvm.org/D88241
where this change made previous bugs in the linker and BitCodeWriter
visible.
Serguei Katkov [Thu, 24 Sep 2020 17:45:15 +0000 (00:45 +0700)]
[IsKnownNonZero] Handle the case with non-constant phi nodes
Handle the case when all inputs of a phi are proven to be non-zero.
Constants are checked at the beginning of this method, before the check for recursion depth,
so the constant case is a special case of the non-constant phi handling.
Recursion depth is already handled by the function.
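As a rough illustration of the rule described above (not the actual ValueTracking code), a phi can be treated as non-zero when every incoming value is proven non-zero, with a depth cap bounding the recursion. The toy `Const`, `Unknown` and `Phi` classes and the `MAX_DEPTH` value are assumptions made up for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Union

MAX_DEPTH = 6  # hypothetical recursion cap, chosen only for this sketch


@dataclass
class Const:
    value: int


@dataclass
class Unknown:
    """A value about which nothing is known."""


@dataclass
class Phi:
    incoming: List["Value"] = field(default_factory=list)


Value = Union[Const, Unknown, Phi]


def is_known_non_zero(v: Value, depth: int = 0) -> bool:
    # Constants are answered first, before the depth check.
    if isinstance(v, Const):
        return v.value != 0
    # Give up conservatively once the recursion budget is spent.
    if depth >= MAX_DEPTH:
        return False
    if isinstance(v, Phi):
        # A phi is non-zero only if every incoming value is non-zero.
        return bool(v.incoming) and all(
            is_known_non_zero(i, depth + 1) for i in v.incoming
        )
    return False
```

A phi that feeds itself (a loop-carried value) simply runs out of depth and is reported as "not known", which is the conservative answer.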
Reviewers: aqjune, nikic, efriedma
Reviewed By: nikic
Subscribers: dantrushin, hiraditya, jdoerfert, llvm-commits
Differential Revision: https://reviews.llvm.org/D88276
Florian Hahn [Tue, 29 Sep 2020 08:18:19 +0000 (09:18 +0100)]
Revert "Recommit "[SCCP] Do not replace deref'able ptr with un-deref'able one.""
Looks like there is still another remaining issue:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-msan/builds/22273/steps/build%20libcxx%2Fmsan/logs/stdio
This reverts commit
86a20d9e34f5a9989da72097f23f3b0a44157e73.
Florian Hahn [Mon, 28 Sep 2020 15:08:30 +0000 (16:08 +0100)]
Recommit "[SCCP] Do not replace deref'able ptr with un-deref'able one."
This version includes a small fix allowing function pointers to be
unconditionally replaced for now.
This reverts commit
4c5e4aa89b11ec3253258b8df5125833773d1b1e.
Sam Parker [Tue, 29 Sep 2020 07:41:53 +0000 (08:41 +0100)]
[NFC][ARM] Comments and lambdas
Add some comments in LowOverheadLoops and make some lambda variables
explicit arguments instead of capturing.
Ellis Hoag [Tue, 29 Sep 2020 06:25:24 +0000 (02:25 -0400)]
This reduces code duplication between CGObjCMac.cpp and Mangle.cpp
for generating the mangled name of an Objective-C method.
This has no intended functionality change.
https://reviews.llvm.org/D88329
Dmitry Antipov [Tue, 29 Sep 2020 03:32:51 +0000 (06:32 +0300)]
[Driver] Filter out <libdir>/gcc and <libdir>/gcc-cross if they do not exist
Differential Revision: https://reviews.llvm.org/D87901
Craig Topper [Tue, 29 Sep 2020 05:52:31 +0000 (22:52 -0700)]
[X86] Add computeKnownBits support for PEXT.
The number of zeros in the mask provides a lower bound on the number
of leading zeros in the result.
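A small software model makes the bound easy to check. This is only a sketch of the PEXT semantics, not the X86 backend code; the helper names `pext`, `popcount` and `known_leading_zeros` are invented for the example.

```python
def popcount(x: int) -> int:
    return bin(x).count("1")


def pext(src: int, mask: int, width: int = 32) -> int:
    """Gather the bits of src selected by mask into the low bits of the result."""
    result, out_pos = 0, 0
    for i in range(width):
        if (mask >> i) & 1:
            result |= ((src >> i) & 1) << out_pos
            out_pos += 1
    return result


def known_leading_zeros(mask: int, width: int = 32) -> int:
    """Lower bound on leading zeros of pext(_, mask): the number of zero bits in mask."""
    return width - popcount(mask & ((1 << width) - 1))
```

The result can have at most popcount(mask) significant bits, so every zero bit in the mask contributes one guaranteed leading zero.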
Craig Topper [Tue, 29 Sep 2020 05:46:27 +0000 (22:46 -0700)]
[X86] Add known bits test for PEXT. NFC
Johannes Doerfert [Tue, 29 Sep 2020 05:36:45 +0000 (00:36 -0500)]
Revert "[OpenMP][FIX] Verify compatible types for declare variant calls"
This reverts commit
c942095790decf525a445f3bd68fb9bcc9aa43c6.
One of the tests broke, revert to investigate.
Arthur Eubanks [Fri, 25 Sep 2020 22:21:54 +0000 (15:21 -0700)]
[Docs][NewPM] Add note about required passes
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D88342
Max Kazantsev [Tue, 29 Sep 2020 04:37:17 +0000 (11:37 +0700)]
[NFC] Use assert instead of checking the guaranteed condition
From the preconditions it is known that either A dominates B or
B dominates A. So if A does not dominate B, we do not really need
to check that B dominates A; an assert should be enough. This should
save some compile time.
Max Kazantsev [Tue, 29 Sep 2020 04:34:15 +0000 (11:34 +0700)]
[IndVars] Remove exiting conditions that are trivially true/false
When removing exiting loop conditions, we only consider checks for
which we know the exact exit count. We could also eliminate checks for
which the condition is always true/false.
Differential Revision: https://reviews.llvm.org/D87344
Reviewed By: lebedev.ri, reames
Johannes Doerfert [Sun, 27 Sep 2020 20:52:52 +0000 (15:52 -0500)]
[OpenMP][FIX] Verify compatible types for declare variant calls
Especially for templates we need to check at some point if the base
function matches the specialization we might call instead. Previously, this
led to the replacement of `std::sqrt(int(2))` calls with one that
converts the argument to a `std::complex<int>`, clearly not the desired
behavior.
Reported as PR47655
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D88384
Kiran Kumar T P [Tue, 29 Sep 2020 04:11:46 +0000 (09:41 +0530)]
[MLIR][OpenMP] Removed the ambiguity in flush op assembly syntax
Summary:
========
Bugzilla Ticket No: Bug 46884 [https://bugs.llvm.org/show_bug.cgi?id=46884]
Flush op assembly syntax was ambiguous.
Consider the test case below:
the flush operation does not have any arguments,
but the next statement's token, i.e. "%2", is read as the argument of the flush operation, and the translator then issues an error.
***************************************************************
$ cat -n flush.mlir
1 llvm.func @_QQmain(%arg0: !llvm.i32) {
2 %0 = llvm.mlir.constant(1 : i64) : !llvm.i64
3 %1 = llvm.alloca %0 x !llvm.i32 {in_type = i32, name = "a"} : (!llvm.i64) -> !llvm.ptr<i32>
4 omp.flush
5 %2 = llvm.load %1 : !llvm.ptr<i32>
6 llvm.return
7 }
$ mlir-translate -mlir-to-llvmir flush.mlir
flush.mlir:5:6: error: expected ':'
%2 = llvm.load %1 : !llvm.ptr<i32>
^
***************************************************************
Solution:
=========
Introduced begin ( `(` ) and end ( `)` ) tokens to determine the begin and end of variadic arguments.
The patch includes code changes and testcase modifications.
Reviewed By: Valentin Clement, Mehdi AMINI
Differential Revision: https://reviews.llvm.org/D88376
Yonghong Song [Tue, 29 Sep 2020 03:15:05 +0000 (20:15 -0700)]
BPF: explicitly specify bpfel triple for certain tests
Commit
54d9f743c8b0 ("BPF: move AbstractMemberAccess and
PreserveDIType passes to EP_EarlyAsPossible") changed most
CORE tests to run opt followed by llc, and opt requires
the target triple to be specified in the IR.
There are a few tests where little endian and big endian
report different results. For the little endian versions of
these tests, "target triple = "bpf"" will produce wrong results
if the test is executed on a big endian machine, e.g. a
PowerPC big endian machine, since target "bpf" represents
host endianness and will resolve to "bpfeb".
The buildbot reported such failures when building and running
on a PowerPC big endian machine.
To fix the issue, use "target triple = "bpfel"" instead.
Yaxun (Sam) Liu [Sun, 27 Sep 2020 03:29:57 +0000 (23:29 -0400)]
[HIP] Return non-zero value for invalid target ID
This is part of https://reviews.llvm.org/D60620
Yaxun (Sam) Liu [Tue, 29 Sep 2020 02:39:21 +0000 (22:39 -0400)]
Recommit "[HIP] Change default --gpu-max-threads-per-block value to 1024"
Recommit
04abbb3a78186aa92809866b43217c32cba90b71
Amara Emerson [Mon, 28 Sep 2020 16:46:26 +0000 (09:46 -0700)]
[AArch64][GlobalISel] Scalarize <2 x s64> G_MUL since we don't have native support for it.
Differential Revision: https://reviews.llvm.org/D88437
Yaxun (Sam) Liu [Mon, 28 Sep 2020 16:07:06 +0000 (12:07 -0400)]
Skip -fPIE for AMDGPU and HIP toolchain
AMDGPU toolchain does not support -fPIE, therefore skip it if specified by driver.
Differential Revision: https://reviews.llvm.org/D88425
Valentin Clement [Tue, 29 Sep 2020 01:22:07 +0000 (21:22 -0400)]
[mlir][openacc] Add acc.data operation verifier
Add a basic verifier for the data operation following the restriction from the standard.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D88334
Nathan Ridge [Mon, 7 Sep 2020 06:28:46 +0000 (02:28 -0400)]
[clangd] When finding refs for a renaming alias, do not return refs to underlying decls
Fixes https://github.com/clangd/clangd/issues/515
Differential Revision: https://reviews.llvm.org/D87225
Mehdi Amini [Mon, 28 Sep 2020 22:16:12 +0000 (22:16 +0000)]
Remove dependency from LLVM Dialect on the OpenMP dialect
The OmpDialect is in practice optional during translation to LLVM IR: the code tolerates
a "nullptr" when the dialect is not present / needed.
The dependency still exists on the export to LLVMIR.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D88351
LLVM GN Syncbot [Tue, 29 Sep 2020 00:24:06 +0000 (00:24 +0000)]
[gn build] Port
54d9f743c8b
Richard Smith [Tue, 29 Sep 2020 00:21:42 +0000 (17:21 -0700)]
Ensure that we don't compute linkage for an anonymous class too early if
it has a member whose name is the same as a builtin.
Fixes a regression from the introduction of BuiltinAttr.
Jan Korous [Tue, 29 Sep 2020 00:19:31 +0000 (17:19 -0700)]
[clang] Update warning-wall.c test
Follow-up to
1e86d637eb4f:
[clang] Selectively ena/disa-ble format-insufficient-args warning
Ruiling Song [Tue, 15 Sep 2020 00:06:57 +0000 (08:06 +0800)]
[RegisterCoalescer] Pass Undefs to extendToIndices()
When extending the subranges, the reaching def may be an undef. When
extending such a subrange, it will try to search for the reaching
def first. If the reaching def is an undef and we did not provide 'Undefs',
findReachingDefs() will fail with the message:
"Use of $noreg does not have a corresponding definition on every path:
LLVM ERROR: Use not jointly dominated by defs."
So we call computeSubRangeUndefs() and pass the result to extendToIndices().
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D87744
Zahira Ammarguellat [Mon, 28 Sep 2020 23:54:40 +0000 (16:54 -0700)]
BuildVectorType with a dependent (array) type is crashing the compiler - Fix for PR-47542
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D88150
Yonghong Song [Thu, 3 Sep 2020 05:56:41 +0000 (22:56 -0700)]
BPF: move AbstractMemberAccess and PreserveDIType passes to EP_EarlyAsPossible
Move AbstractMemberAccess and PreserveDIType passes as early as
possible, right after clang code generation.
Currently, the compiler may transform the following code
p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
a = llvm.bpf.builtin.preserve_field_info(p2, EXIST);
if (a) {
p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
bpf_probe_read(buf, buf_size, p2);
}
to
p1 = llvm.bpf.builtin.preserve.struct.access(base, 0, 0);
p2 = llvm.bpf.builtin.preserve.struct.access(p1, 1, 2);
a = llvm.bpf.builtin.preserve_field_info(p2, EXIST);
if (a) {
bpf_probe_read(buf, buf_size, p2);
}
and eventually assembly code looks like
reloc_exist = 1;
reloc_member_offset = 10; //calculate member offset from base
p2 = base + reloc_member_offset;
if (reloc_exist) {
bpf_probe_read(buf, buf_size, p2);
}
If, during libbpf relocation resolution, reloc_exist is actually
resolved to 0 (does not exist), the reloc_member_offset relocation cannot
be resolved and will be patched with an illegal instruction.
This will cause a verifier failure.
This patch attempts to address this issue by doing chaining
analysis and replacing chains with special globals right
after clang code gen. This removes the CSE possibility
described above. The IR typically looks like
%6 = load @llvm.sk_buff:0:50$0:0:0:2:0
%7 = bitcast %struct.sk_buff* %2 to i8*
%8 = getelementptr i8, i8* %7, %6
for a particular address computation relocation.
But this transformation has another consequence, code sinking
may happen like below:
PHI = <possibly different @preserve_*_access_globals>
%7 = bitcast %struct.sk_buff* %2 to i8*
%8 = getelementptr i8, i8* %7, %6
For such cases, we will not be able to generate relocations since
multiple relocations are merged into one.
This patch introduces a passthrough builtin
to prevent such optimization. Inline assembly appears to have more
impact on optimization, e.g., inlining. Using a passthrough builtin has
less impact on optimizations.
A new IR pass is introduced at the beginning of target-dependent
IR optimization, which does:
- report fatal error if any reloc global in PHI nodes
- remove all bpf passthrough builtin functions
Changes for existing CORE tests:
- for clang tests, add "-Xclang -disable-llvm-passes" flags to
avoid builtin->reloc_global transformation so the test is still
able to check correctness for clang generated IR.
- for llvm CodeGen/BPF tests, add "opt -O2 <ir_file> | llvm-dis" command
before "llc" command since "opt" is needed to call newly-placed
builtin->reloc_global transformation. Add target triple in the IR
file since "opt" requires it.
- Since target triple is added in IR file, if a test may produce
different results for different endianness, two tests will be
created, one for bpfeb and another for bpfel, e.g., some tests
for relocation of lshift/rshift of bitfields.
- field-reloc-bitfield-1.ll has different relocations compared to
the old code. This is because for the structure in the test,
the new code returns struct layout alignment 4 while the old code
returns 8. Align 8 is more precise and permits a double load. With align 4,
the new mechanism uses a 4-byte load, thus generating different
relocations.
- test intrinsic-transforms.ll is removed. It was used to test
CSE on intrinsics so that we do not lose metadata. Now that metadata is attached
to a global and not an instruction, it won't get lost with CSE.
Differential Revision: https://reviews.llvm.org/D87153
David Tenty [Thu, 3 Sep 2020 22:34:57 +0000 (18:34 -0400)]
[clang][driver][AIX] Set compiler-rt as default rtlib
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D88182
ogiroux [Mon, 28 Sep 2020 23:34:41 +0000 (16:34 -0700)]
Attempt to clear some msan errors in the libcxx atomic tests.
Diego Caballero [Mon, 28 Sep 2020 23:15:13 +0000 (16:15 -0700)]
[mlir][Affine][VectorOps] Fix super vectorizer utility (D85869)
Adding missing code that should have been part of "D85869: Utility to
vectorize loop nest using strategy."
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D88346
Kostya Kortchinsky [Mon, 28 Sep 2020 20:07:33 +0000 (13:07 -0700)]
[scudo][standalone] Remove unused atomic_compare_exchange_weak
`atomic_compare_exchange_weak` is unused in Scudo, and its associated
test is actually wrong since the weak variant is allowed to fail
spuriously (thanks Roland).
This led to flakes such as:
```
[ RUN ] ScudoAtomicTest.AtomicCompareExchangeTest
../../zircon/third_party/scudo/src/tests/atomic_test.cpp:98: Failure: Expected atomic_compare_exchange_weak(reinterpret_cast<T *>(&V), &OldVal, NewVal, memory_order_relaxed) is true.
Expected: true
Which is: 01
Actual : atomic_compare_exchange_weak(reinterpret_cast<T *>(&V), &OldVal, NewVal, memory_order_relaxed)
Which is: 00
../../zircon/third_party/scudo/src/tests/atomic_test.cpp:100: Failure: Expected atomic_compare_exchange_weak( reinterpret_cast<T *>(&V), &OldVal, NewVal, memory_order_relaxed) is false.
Expected: false
Which is: 00
Actual : atomic_compare_exchange_weak( reinterpret_cast<T *>(&V), &OldVal, NewVal, memory_order_relaxed)
Which is: 01
../../zircon/third_party/scudo/src/tests/atomic_test.cpp:101: Failure: Expected OldVal == NewVal.
Expected: NewVal
Which is: 24
Actual : OldVal
Which is: 42
[ FAILED ] ScudoAtomicTest.AtomicCompareExchangeTest (0 ms)
[----------] 2 tests from ScudoAtomicTest (1 ms total)
```
So I am removing this. If someone ever needs the weak variant, feel
free to add it back with a test that is not as terrible. This test was
initially ported from sanitizer_common, but their weak version calls
the strong version, so it works for them.
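For context, the usual way to cope with spurious failures is to call the weak variant in a retry loop rather than asserting on a single call. Python has no atomic compare-exchange, so the sketch below uses a toy single-threaded `Cell` class whose spurious failures are injected by a counter; both `Cell` and `store_if_equal` are hypothetical and are not part of Scudo.

```python
class Cell:
    """Toy model of an atomic cell whose weak CAS may fail spuriously."""

    def __init__(self, value, spurious_failures=0):
        self.value = value
        self._spurious = spurious_failures

    def compare_exchange_weak(self, expected, desired):
        """Return (success, observed). May fail even when value == expected."""
        if self._spurious > 0:
            self._spurious -= 1
            return False, self.value
        if self.value == expected:
            self.value = desired
            return True, expected
        return False, self.value


def store_if_equal(cell, expected, desired):
    """Retry the weak CAS until it succeeds or a real mismatch is observed."""
    while True:
        ok, observed = cell.compare_exchange_weak(expected, desired)
        if ok:
            return True
        if observed != expected:
            return False
        # Spurious failure: the value still matches, so try again.
```

A test that expects one weak call to succeed, as the removed test did, breaks as soon as a single spurious failure occurs; the loop above tolerates any finite number of them.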
Differential Revision: https://reviews.llvm.org/D88443
Jan Korous [Wed, 16 Sep 2020 16:52:51 +0000 (09:52 -0700)]
[clang] Selectively ena/disa-ble format-insufficient-args warning
Differential Revision: https://reviews.llvm.org/D87176
Mehdi Amini [Mon, 28 Sep 2020 20:46:22 +0000 (20:46 +0000)]
Guard `find_library(tensorflow_c_api ...)` by checking for TENSORFLOW_C_LIB_PATH to be set by the user
Also have CMake fail if the user provides a TENSORFLOW_C_LIB_PATH but
we can't find TensorFlow at this path.
At the moment the CMake script tries to figure out whether TensorFlow is
available on the system and enables support for it. It is in general
not desirable to customize build features this way; instead it is
preferable to let the user opt in explicitly to the features they want
to enable. This is in line with other optional external dependencies
like Z3.
There are a few reasons for this, among others:
- reproducibility: making features "magically" enabled based on whether
we find a package on the system or not makes it harder to handle bug
reports from users.
- user control: users can't have TensorFlow on the system and build LLVM
without TensorFlow right now. They would also suddenly distribute LLVM
with a different set of features unknowingly just because their build
machine environment changed subtly.
Right now this is motivated by a user reporting build failures on their system:
.../mesa-git/llvm-git/src/llvm-project/llvm/lib/Analysis/TFUtils.cpp:23:10: fatal error: tensorflow/c/c_api.h: No such file or directory
23 | #include "tensorflow/c/c_api.h"
| ^~~~~~
It looks like we detected TensorFlow at configure time but couldn't set all the paths correctly.
Differential Revision: https://reviews.llvm.org/D88371
Philip Reames [Mon, 28 Sep 2020 22:08:25 +0000 (15:08 -0700)]
[CVP] Allow two transforms in one invocation
For a call site which had both constant deopt operands and nonnull arguments, we were missing the opportunity to recognize the latter by bailing early.
This is somewhat of a speculative fix. Months ago, I'd had a private report of performance and compile time regressions from the deopt operand folding. I never received a test case. However, the only possibility I see was that after that change CVP missed the nonnull fold, and we end up with a pass ordering/missed simplification issue. So, since it's a real issue, fix it and hope.
Fangrui Song [Mon, 28 Sep 2020 22:05:09 +0000 (15:05 -0700)]
[EHStreamer] Simplify sharedTypeIDs with std::mismatch
(Note that EHStreamer.cpp is largely under-tested. The only test checking the prefix sharing is CodeGen/WebAssembly/eh-lsda.ll)
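The commit itself uses C++ `std::mismatch`; as a loose Python analogue of the same idea, the length of the prefix shared by two type-id lists can be computed by walking both in lockstep until the first difference. The function name and the sample lists are made up for illustration.

```python
from itertools import takewhile


def shared_prefix_len(a, b):
    """Number of leading positions at which a and b hold equal elements."""
    return sum(1 for _ in takewhile(lambda pair: pair[0] == pair[1], zip(a, b)))


# Two hypothetical type-id lists that agree on their first two entries.
print(shared_prefix_len([1, 2, 3], [1, 2, 4]))
```

`zip` stops at the shorter list, which matches the bounded behaviour of the two-range form of `std::mismatch`.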
Sean Silva [Thu, 24 Sep 2020 20:03:30 +0000 (13:03 -0700)]
[mlir][shape] Make conversion passes more consistent.
- use select-ops to make the lowering simpler
- change style of FileCheck variables names to be consistent
- change some variable names in the code to be more explicit
Differential Revision: https://reviews.llvm.org/D88258
Petr Hosek [Mon, 28 Sep 2020 21:18:55 +0000 (14:18 -0700)]
[libcxx] Don't pass -s to libtool
This flag is the default in libtool on Darwin, and it's not supported
by llvm-libtool-darwin causing a build failure.
Differential Revision: https://reviews.llvm.org/D88449
Louis Dionne [Mon, 28 Sep 2020 21:29:52 +0000 (17:29 -0400)]
[libc++] Fix constexpr dynamic allocation on GCC 10
We're technically not allowed by the Standard to call ::operator new in
constexpr functions like __libcpp_allocate. Clang doesn't seem to complain
about it, but GCC does.
Craig Topper [Mon, 28 Sep 2020 21:20:20 +0000 (14:20 -0700)]
[X86] Add support for calling SimplifyDemandedBits on the input of PDEP with a constant mask.
We can do several optimizations for PDEP using computeKnownBits and SimplifyDemandedBits:
- If the MSBs of the output aren't demanded, those MSBs of the mask input aren't demanded either. We need to keep the most significant demanded bit of the mask and any mask bits before it.
- The number of possible ones in the mask determines how many bits of the lsbs of the other operand are demanded. Any bits of the mask we don't demand by the previous rule should not be counted.
- The result will have zeros in any position where the mask is zero.
- Since non-mask input bits can only be output in the original position or a higher bit position, the result will have at least as many trailing zeroes as the non-mask input.
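The last three properties can be checked against a small software model of PDEP. This is a sketch for illustration only, not the X86 lowering code; `pdep`, `popcount` and `trailing_zeros` are helper names invented here.

```python
def popcount(x: int) -> int:
    return bin(x).count("1")


def trailing_zeros(x: int, width: int) -> int:
    """Count trailing zero bits; an all-zero value counts as `width`."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1


def pdep(src: int, mask: int, width: int = 32) -> int:
    """Scatter the low bits of src to the positions of the set bits in mask."""
    result, k = 0, 0
    for i in range(width):
        if (mask >> i) & 1:
            result |= ((src >> k) & 1) << i
            k += 1
    return result


def check_properties(src: int, mask: int, width: int = 8) -> bool:
    r = pdep(src, mask, width)
    full = (1 << width) - 1
    low = (1 << popcount(mask)) - 1
    return (
        # Only the low popcount(mask) bits of src matter.
        r == pdep(src & low, mask, width)
        # The result is zero wherever the mask is zero.
        and r & ~mask & full == 0
        # Bits only move to the same or a higher position.
        and trailing_zeros(r, width) >= trailing_zeros(src, width)
    )
```

Running `check_properties` over all 8-bit `src`/`mask` pairs is a cheap way to convince oneself of the rules before relying on them at wider widths.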
Differential Revision: https://reviews.llvm.org/D87883
Craig Topper [Mon, 28 Sep 2020 21:14:14 +0000 (14:14 -0700)]
[X86] Add tests for D87883. NFC
Amara Emerson [Sat, 26 Sep 2020 17:02:39 +0000 (10:02 -0700)]
[GlobalISel] Add support for lowering of vector G_SELECT and use for AArch64.
The lowering is a port of the SDAG expansion.
Differential Revision: https://reviews.llvm.org/D88364
David Tenty [Thu, 20 Aug 2020 22:24:11 +0000 (18:24 -0400)]
[CMake][AIX] Limit tools in external project build
This is a follow on to D85329 which disabled some llvm tools in the
runtimes build due to XCOFF64 limitations. This change disables them
in other external project builds as well, when no list of tools is
specified in the arguments.
Reviewed By: hubert.reinterpretcast, stevewan
Differential Revision: https://reviews.llvm.org/D88310
Nico Weber [Mon, 28 Sep 2020 20:57:48 +0000 (16:57 -0400)]
[gn build] Re-run CompletionModelCodegen when input json files change
Aaron Ballman [Mon, 28 Sep 2020 20:49:15 +0000 (16:49 -0400)]
Fix a think-o with the numerical suffixes in the docs for init_priority.
Jonas Devlieghere [Mon, 28 Sep 2020 20:50:22 +0000 (13:50 -0700)]
[lldb] Add print_function import
Amara Emerson [Mon, 28 Sep 2020 20:42:56 +0000 (13:42 -0700)]
Revert "Revert "[AArch64][GlobalISel] Add selection support for <8 x s16> G_INSERT_VECTOR_ELT with GPR scalar.""
This isn't a real issue with the codegen, it's a previously known bug in clang which
causes non-deterministic failures due to garbage bits in undef registers being
used in saturating instructions.
I'm disabling the result checking for the test until this issue is resolved.
This reverts commit
6c8168324b5329c94fe7e8f9a1619802091b9bec.
Aart Bik [Mon, 28 Sep 2020 19:56:10 +0000 (12:56 -0700)]
[mlir] [VectorOps] Relaxed restrictions on vector.reduction types even more
Recently, restrictions on vector reductions were made more relaxed by
accepting any width signless integer and floating-point. This CL relaxes
the restriction even more by including unsigned and signed integers.
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D88442
Craig Topper [Mon, 28 Sep 2020 19:32:34 +0000 (12:32 -0700)]
[X86] Use inlineasm flag output for the _bittest* intrinsics.
Instead of explicitly emitting a setc in the inline asm instructions,
we can use flag output. This allows the backend to use the flag
directly if it is needed by a branch. Previously we needed a test
instruction to convert the register back to a flag.
If the flag can't be used directly, the backend will emit a setcc.
Differential Revision: https://reviews.llvm.org/D87888
Simon Pilgrim [Mon, 28 Sep 2020 20:31:55 +0000 (21:31 +0100)]
[InstCombine] Regenerate cast tests. NFC.
Eric Astor [Mon, 28 Sep 2020 20:11:44 +0000 (16:11 -0400)]
[COFF] Aliases resolve directly to defined external targets
Avoid introducing unnecessary indirection for weak-external references.
We only need to introduce ".weak.<SYMBOL>.default" when referencing a
symbol that is defined, but not external.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D88305
Louis Dionne [Mon, 28 Sep 2020 19:47:49 +0000 (15:47 -0400)]
[libc++] Replace uses of __libcpp_allocate by std::allocator<>
Both are equivalent, however std::allocator can appear in constant
expressions and is higher level.
Louis Dionne [Mon, 28 Sep 2020 18:45:48 +0000 (14:45 -0400)]
[libc++] Add UNSUPPORTED markup to atomic test in single-threaded mode
Louis Dionne [Mon, 28 Sep 2020 18:28:45 +0000 (14:28 -0400)]
[libc++] Fix heap UaF issue in coroutine test
This wasn't being flagged by older versions of ASAN, but it is now.
Benjamin Kramer [Mon, 28 Sep 2020 20:06:34 +0000 (22:06 +0200)]
[wasm] Move WasmTraits.h to BinaryFormat
There's no dependency on Object in there and this avoids a cyclic
dependency between libMC and libObject.
Sanjay Patel [Mon, 28 Sep 2020 19:54:11 +0000 (15:54 -0400)]
[CostModel] remove hack for intrinsic cost based on cost type
This hack seems to only have been necessary because of the
constructor bug noted in
33125cffd.
Once again, it's hard to prove NFC, but that's the hope...
Jason Molenda [Mon, 28 Sep 2020 19:42:16 +0000 (12:42 -0700)]
Once we've found a firmware binary and loaded it, don't search more
Add the flag in ProcessMachCore::DoLoadCore that stops additional
searches for the binaries when we have an LC_NOTE identifying the
firmware/standalone binary as the correct one & we have loaded it
successfully.
Jonas Devlieghere [Mon, 28 Sep 2020 19:48:22 +0000 (12:48 -0700)]
[lldb] Enable markdown support for documentation
This enables support for writing LLDB documentation in markdown in
addition to reStructured text. We already had documentation written in
markdown (StructuredDataPlugins and DarwinLog) which will now also be
available on the website.
Baptiste Saleil [Mon, 28 Sep 2020 19:12:14 +0000 (14:12 -0500)]
[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types
This patch legalizes the v256i1 and v512i1 types that will be used for MMA.
It implements loads and stores of these types.
v256i1 is a pair of VSX registers, so for this type, we load/store the two
underlying registers. v512i1 is used for MMA accumulators. So in addition to
loading and storing the 4 associated VSX registers, we generate instructions to
prime (copy the VSX registers to the accumulator) after loading and unprime
(copy the accumulator back to the VSX registers) before storing.
This patch also adds the UACC register class that is necessary to implement the
loads and stores. This class represents accumulators in their unprimed form and
allows the distinction between primed and unprimed accumulators to avoid invalid
copies of the VSX registers associated with primed accumulators.
Differential Revision: https://reviews.llvm.org/D84968
Sanjay Patel [Mon, 28 Sep 2020 19:23:36 +0000 (15:23 -0400)]
[CostModel] fill in arguments as part of intrinsic attribute constructor
This appears to be an error of code duplication - instead of
one constructor variant calling another, we have N similar
but not identical versions.
I think this is 'NFC' based on the current callers, but it's
hard to tell or guess the intent in all cases.
Paweł Bylica [Mon, 28 Sep 2020 18:47:43 +0000 (20:47 +0200)]
[python][tests] Fix string comparison with "is"
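For readers unfamiliar with the pitfall: `is` compares object identity while `==` compares contents, so a string built at run time may be equal to a literal without being the same object. The snippet below is a generic sketch; the `is_release` helper and the "Release" value are hypothetical and not taken from the fixed tests.

```python
def is_release(build_type: str) -> bool:
    # Compare string contents with ==, never with `is`.
    return build_type == "Release"


expected = "Release"
built = "".join(["Rel", "ease"])  # equal contents, assembled at run time

print(built == expected)   # content comparison
print(built is expected)   # identity comparison; the result is implementation-dependent
print(is_release(built))
```

Whether the identity comparison happens to hold depends on interpreter-level string interning, which is why tests relying on it can pass on one Python version and fail on another.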
Peter Collingbourne [Fri, 25 Sep 2020 20:36:30 +0000 (13:36 -0700)]
scudo: Re-order Allocator fields for improved performance. NFCI.
Move smaller and frequently-accessed fields near the beginning
of the data structure in order to improve locality and reduce
the number of instructions required to form an access to those
fields. With this change I measured a ~5% performance improvement on
BM_malloc_sql_trace_default on aarch64 Android devices (Pixel 4 and
DragonBoard 845c).
Differential Revision: https://reviews.llvm.org/D88350
Aart Bik [Mon, 28 Sep 2020 17:41:53 +0000 (10:41 -0700)]
[mlir] [VectorOps] changes to printing support for integers
(1) simplify integer printing logic by always using 64-bit print
(2) add index support (since vector<16xindex> is planned to be added)
(3) adjust naming convention print_x -> printX
Reviewed By: bkramer
Differential Revision: https://reviews.llvm.org/D88436
Jon Roelofs [Mon, 28 Sep 2020 18:30:07 +0000 (11:30 -0700)]
[AArch64] reuse another map iterator. NFC
Amara Emerson [Mon, 28 Sep 2020 17:59:08 +0000 (10:59 -0700)]
Revert "[AArch64][GlobalISel] Add selection support for <8 x s16> G_INSERT_VECTOR_ELT with GPR scalar."
This reverts commit
b5e87c9ef2243ecd65e0ef87a1bf303c0c26db04 as it seems to have
broken a bot.
Utkarsh Saxena [Mon, 28 Sep 2020 17:19:51 +0000 (19:19 +0200)]
[clangd] Rename evaluate() to evaluateHeuristics()
Since we have 2 scoring functions (heuristics and decision forest),
rename the existing evaluate() function to be more descriptive of the
heuristics being evaluated in it.
Differential Revision: https://reviews.llvm.org/D88431
Dominic Chen [Sat, 26 Sep 2020 22:04:31 +0000 (18:04 -0400)]
[AddressSanitizer] Copy type metadata to prevent miscompilation
When ASan and e.g. Dead Virtual Function Elimination are enabled, the
latter will rely on type metadata to determine if certain virtual calls can be
removed. However, ASan currently does not copy type metadata, which can cause
virtual function calls to be incorrectly removed.
Differential Revision: https://reviews.llvm.org/D88368
Simon Pilgrim [Mon, 28 Sep 2020 17:50:06 +0000 (18:50 +0100)]
[InstCombine] Add trunc(shr(trunc(x),c)) non-uniform vector tests
Heejin Ahn [Mon, 28 Sep 2020 02:04:54 +0000 (19:04 -0700)]
[WebAssembly] Use wasm::Signature in ObjectWriter (NFC)
There are two `WasmSignature` structs, one in
include/llvm/BinaryFormat/Wasm.h and the other in
lib/MC/WasmObjectWriter.cpp. I don't know why they got separated in this
way in the first place, but it seems we can unify them to use the one in
Wasm.h for all cases.
Reviewed By: dschuff, sbc100
Differential Revision: https://reviews.llvm.org/D88428
Jessica Paquette [Wed, 23 Sep 2020 18:28:10 +0000 (11:28 -0700)]
[AArch64][GlobalISel] Infer whether G_PHI is going to be a FPR in regbankselect
Some instructions (G_LOAD, G_SELECT, G_UNMERGE_VALUES) check if their uses
will define/use FPRs (using `onlyUsesFP` and `onlyDefinesFP`).
The register bank of a use isn't necessarily known when an instruction asks for
this.
Teach `hasFPConstraints` to look at the instructions feeding into a G_PHI when
its destination bank is unknown. If any of them are FPR, assume the entire
G_PHI will also be assigned a FPR.
Since a phi can have many inputs, and those inputs can in turn be phis,
restrict the search depth to a very low number.
Also improve the docs for `hasFPConstraints` and friends a little.
This is a 0.3% code size improvement on CTMark/Bullet at -O3, and a 0.2% code
size improvement at CTMark/pairlocalalign at -O3.
Differential Revision: https://reviews.llvm.org/D88177
Sanjay Patel [Mon, 28 Sep 2020 16:28:23 +0000 (12:28 -0400)]
[CostModel] move early exit for free intrinsics
This should be NFC unless some target was expecting that
some form of cttz/ctlz/memcpy is free in terms of size/latency
but not free in throughput cost.
Sanjay Patel [Mon, 28 Sep 2020 14:11:08 +0000 (10:11 -0400)]
[CostModel] split handling of intrinsics from other calls
This should be close to NFC (no-functional-change), but I
can't completely rule out that some call on some target
travels down a different path. There's an especially large
amount of code spaghetti in this part of the cost model.
The goal is to clean up the intrinsic cost handling so
we can canonicalize to the new min/max intrinsics without
causing regressions.
Jessica Paquette [Fri, 11 Sep 2020 00:15:28 +0000 (17:15 -0700)]
[AArch64][GlobalISel] Support shifted register form in emitTST
Support emitting ANDSXrs and ANDSWrs in `emitTST`. Update opt-fold-compare.mir
to show that it works.
Differential Revision: https://reviews.llvm.org/D87530
Jessica Paquette [Fri, 18 Sep 2020 17:46:48 +0000 (10:46 -0700)]
[GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y)
When we see this:
```
%and = G_AND %x, %y
%xor = G_XOR %and, %y
```
Produce this:
```
%not = G_XOR %x, -1
%new_and = G_AND %not, %y
```
as long as we are guaranteed to eliminate the original G_AND.
Also matches all commuted forms. E.g.
```
%and = G_AND %y, %x
%xor = G_XOR %y, %and
```
will be matched as well.
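The identity behind this combine can be checked with a small standalone sketch. This is plain C++ on 8-bit integers, not GlobalISel code, and the helper names (`xorOfAnd`, `andOfNot`, `checkCombineOnAllBytes`) are hypothetical and chosen only for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Original form: (xor (and x, y), y).
constexpr uint8_t xorOfAnd(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((X & Y) ^ Y);
}

// Combined form: (and (not x), y), with "not x" written as x ^ -1 (all ones)
// to mirror the G_XOR %x, -1 in the MIR above.
constexpr uint8_t andOfNot(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((X ^ 0xFF) & Y);
}

// Asserts that both forms agree on every pair of 8-bit operands, and that the
// commuted form (xor y, (and y, x)) gives the same result.
void checkCombineOnAllBytes() {
  for (unsigned X = 0; X < 256; ++X) {
    for (unsigned Y = 0; Y < 256; ++Y) {
      const uint8_t A = static_cast<uint8_t>(X);
      const uint8_t B = static_cast<uint8_t>(Y);
      assert(xorOfAnd(A, B) == andOfNot(A, B));
      assert(static_cast<uint8_t>(B ^ (B & A)) == andOfNot(A, B));
    }
  }
}
```

Because both forms are purely bitwise, agreement on 8-bit values implies agreement at any width: where a bit of `y` is 0 both sides give 0, and where it is 1 both sides give the inverted bit of `x`.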
Differential Revision: https://reviews.llvm.org/D88104
Simon Pilgrim [Mon, 28 Sep 2020 16:40:48 +0000 (17:40 +0100)]
[InstCombine] Add basic trunc(shr(trunc(x),c)) tests
Helps improve the minor regressions noticed on D88316
Utkarsh Saxena [Tue, 22 Sep 2020 05:56:08 +0000 (07:56 +0200)]
[clangd] Use Decision Forest to score code completions.
By default, clangd scores a code completion item using a heuristics model.
Scoring can instead be done by a Decision Forest model by passing
`--ranking_model=decision_forest` to clangd.
Features omitted from the model:
- `NameMatch` is excluded because the final score must be multiplicative in `NameMatch` to allow rescoring by the editor.
- `NeedsFixIts` is excluded because generating a dataset that needs fix-its is non-trivial.
There are multiple ways (heuristics) to combine the above two features with the prediction of the DF:
- `NeedsFixIts` is used as is with a penalty of `0.5`.
Various alternatives for combining NameMatch `N` and the Decision Forest prediction `P`:
- N * scale(P, 0, 1): Linearly scale the output of the model to the range [0, 1].
- N * a^P:
  - More natural: the prediction of each Decision Tree can be considered a multiplicative boost (like NameMatch).
  - Ordering is independent of the absolute value of P. The relative score of two items is proportional to `a^{difference in model prediction score}`. A higher `a` gives more weight to the model output relative to the NameMatch score.
Baseline MRR = 0.619
MRR for various combinations:
N * P = 0.6346, advantage%=2.5768
N * 1.1^P = 0.6600, advantage%=6.6853
N * **1.2**^P = 0.6669, advantage%=**7.8005**
N * **1.3**^P = 0.6668, advantage%=**7.7795**
N * **1.4**^P = 0.6659, advantage%=**7.6270**
N * 1.5^P = 0.6646, advantage%=7.4200
N * 1.6^P = 0.6636, advantage%=7.2671
N * 1.7^P = 0.6629, advantage%=7.1450
N * 2^P = 0.6612, advantage%=6.8673
N * 2.5^P = 0.6598, advantage%=6.6491
N * 3^P = 0.6590, advantage%=6.5242
N * scaled[0, 1] = 0.6465, advantage%=4.5054
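A minimal sketch of the `N * a^P` combination, assuming nothing about clangd's actual implementation: the function name `combineScore` and its parameter names are hypothetical, and the base `a` is passed explicitly because this message does not say which value was finally chosen.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not clangd's API): combines the NameMatch score N with
// the decision-forest prediction P as N * Base^P.
double combineScore(double NameMatch, double Prediction, double Base) {
  assert(Base > 0.0 && "the base of the exponential boost must be positive");
  return NameMatch * std::pow(Base, Prediction);
}
```

For two items with the same NameMatch score, the ratio of their combined scores is `Base^(P1 - P2)`, so adding a constant to every prediction leaves the ranking unchanged.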
Differential Revision: https://reviews.llvm.org/D88281
Stella Laurenzo [Mon, 28 Sep 2020 14:28:04 +0000 (07:28 -0700)]
Add FunctionType to MLIR C and Python bindings.
Differential Revision: https://reviews.llvm.org/D88416
Jon Roelofs [Mon, 28 Sep 2020 16:44:57 +0000 (09:44 -0700)]
[AArch64] Reuse map iterator instead of double lookup. NFC
Mikhail Maltsev [Mon, 28 Sep 2020 16:46:03 +0000 (17:46 +0100)]
[unittests] Preserve LD_LIBRARY_PATH in crash recovery test
We need to preserve the LD_LIBRARY_PATH environment variable when
spawning a child process (certain setups rely on non-standard paths
for e.g. libstdc++). In order to achieve this, set
LLVM_CRC_UNIXCRCRETURNCODE in the parent process instead of creating
the child's environment from scratch.
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D88308
Vedant Kumar [Fri, 25 Sep 2020 20:09:47 +0000 (13:09 -0700)]
[ubsan] nullability-arg: Fix crash on C++ member pointers
Extend -fsanitize=nullability-arg to handle call sites which accept C++
member pointers.
rdar://62476022
Differential Revision: https://reviews.llvm.org/D88336
Utkarsh Saxena [Tue, 22 Sep 2020 05:56:08 +0000 (07:56 +0200)]
[clangd] Add a trained DecisionForest for code completion.
Replaces the dummy CodeCompletion model with a trained DecisionForest
model.
The features.json needs to be manually curated specifying the features
to be used. This is a one-time cost and does not change if the model
changes until we decide to add/remove features.
Differential Revision: https://reviews.llvm.org/D88071
Jonas Devlieghere [Mon, 28 Sep 2020 16:04:32 +0000 (09:04 -0700)]
Revert "Add the ability to write target stop-hooks using the ScriptInterpreter."
This temporarily reverts commit b65966cff65bfb66de59621347ffd97238d3f645
while Jim figures out why the test is failing on the bots.
Michael Liao [Mon, 28 Sep 2020 14:51:17 +0000 (10:51 -0400)]
[clang][codegen] Annotate `correctly-rounded-divide-sqrt-fp-math` fn-attr for OpenCL only.
- `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option,
and `correctly-rounded-divide-sqrt-fp-math` should be added only for
OpenCL.
Differential revision: https://reviews.llvm.org/D88303
Jay Foad [Mon, 28 Sep 2020 15:19:23 +0000 (16:19 +0100)]
[AMDGPU] Reformat AMDGPUTargetLowering::isSDNodeAlwaysUniform. NFC.
Sam Parker [Fri, 25 Sep 2020 08:36:40 +0000 (09:36 +0100)]
[ARM][LowOverheadLoops] Cleanup and re-arrange
Rename and reorganise how we decide where to put the LoopStart
instruction.
Tres Popp [Mon, 28 Sep 2020 15:03:35 +0000 (17:03 +0200)]
[llvm] Fix unused variable in non-debug configurations
Meera Nakrani [Mon, 28 Sep 2020 14:50:19 +0000 (14:50 +0000)]
[ARM] Added more patterns to generate SSAT/USAT with shift
Added patterns to generate an SSAT or USAT with shift for
SSAT/USAT instructions that are matched from IR patterns.
Differential Revision: https://reviews.llvm.org/D88145
Cameron McInally [Mon, 28 Sep 2020 13:57:00 +0000 (08:57 -0500)]
[SVE] Lower fixed length VECREDUCE_[UMAX|UMIN] to Scalable
Essentially the same as the signed variants from D88259. Also includes a clean up of the lowering function.
Differential Revision: https://reviews.llvm.org/D88317
Juneyoung Lee [Sat, 26 Sep 2020 14:56:30 +0000 (23:56 +0900)]
[ValueTracking] Fix analyses to update CxtI to be phi's incoming edges' terminators
It was mentioned in D88276 that when a phi node is visited, the terminators of its incoming edges should be used for CtxI.
This patch makes two functions (ComputeNumSignBitsImpl, isGuaranteedNotToBeUndefOrPoison) do so.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D88360
Paul C. Anagnostopoulos [Fri, 25 Sep 2020 17:23:07 +0000 (13:23 -0400)]
[TableGen] Improved messages in PseudoLoweringEmitter.
Simon Pilgrim [Mon, 28 Sep 2020 14:12:23 +0000 (15:12 +0100)]
[InstCombine] matchRotate - force splat of uniform constant rotation amounts (PR46895)
Fixes minor bug in D88402 where we were using the original shift constant (with undefs) instead of one with the splat values (re)splatted to all elements.
Sam Parker [Mon, 28 Sep 2020 13:44:51 +0000 (14:44 +0100)]
[NFC][ARM] Factor out some logic for LoLoops.
Create a DCE function that accepts an instruction.
Jay Foad [Mon, 28 Sep 2020 13:34:23 +0000 (14:34 +0100)]
[AMDGPU] Reformat SITargetLowering::isSDNodeSourceOfDivergence. NFC.
Georgii Rymar [Mon, 28 Sep 2020 11:43:19 +0000 (14:43 +0300)]
[llvm-readobj/elf] - Fix the PREL31 relocation computation used for dumping arm32 unwind info (-u).
This is a part of https://bugs.llvm.org/show_bug.cgi?id=47581.
We have the following computation:
```
(1) uint64_t Location = Address & 0x7fffffff;
(2) if (Location & 0x04000000)
(3) Location |= (uint64_t) ~0x7fffffff;
(4) return Location + Place;
```
At line 2 there is a typo. The constant should be `0x40000000`,
not `0x04000000`, because the intention here is to sign extend `Location`,
which is a 31-bit signed value.
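A standalone sketch of the corrected computation, for illustration only: the function name `resolvePREL31` is hypothetical and this is not the llvm-readobj code itself. Bit 30 is the sign bit of the 31-bit field, so that is the bit to test before filling the upper bits with ones.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extends the low 31 bits of Address and adds the
// result to Place, wrapping modulo 2^64.
constexpr uint64_t resolvePREL31(uint64_t Address, uint64_t Place) {
  uint64_t Location = Address & 0x7fffffff; // keep the 31-bit field
  if (Location & 0x40000000)                // bit 30 is the sign bit
    Location |= ~uint64_t{0x7fffffff};      // fill bits 31..63 with ones
  return Location + Place;
}
```

With the old constant `0x04000000`, an input such as `0x40000000` (bit 30 set, bit 26 clear) would be treated as a positive offset rather than as -0x40000000.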
Differential revision: https://reviews.llvm.org/D88407
Alexander Kornienko [Mon, 28 Sep 2020 12:58:27 +0000 (14:58 +0200)]
[clang-tidy] IncludeInserter: allow <> in header name
This adds a pair of overloads for create(MainFile)?IncludeInsertion methods that
use the presence of the <> in the file name to control whether the #include
directive will use angle brackets or quotes. Motivating examples:
https://reviews.llvm.org/D82089#inline-789412 and
https://github.com/llvm/llvm-project/blob/master/clang-tools-extra/clang-tidy/modernize/MakeSmartPtrCheck.cpp#L433
The overloads with the IsAngled parameter can be removed after the users are
updated.
Update usages of createIncludeInsertion.
Update (almost all) usages of createMainFileIncludeInsertion.
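The idea of letting the header name itself select the include style can be sketched as follows. This is a standalone illustration, not the clang-tidy IncludeInserter API; `formatIncludeDirective` is a hypothetical name, and treating a name as angled only when it both starts with `<` and ends with `>` is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: emits an #include directive, using angle brackets when
// the header name is already wrapped in <> and quotes otherwise.
std::string formatIncludeDirective(const std::string &Header) {
  assert(!Header.empty() && "header name must not be empty");
  const bool IsAngled =
      Header.size() >= 2 && Header.front() == '<' && Header.back() == '>';
  if (IsAngled)
    return "#include " + Header;
  return "#include \"" + Header + "\"";
}
```

A caller would pass `"<memory>"` to get an angled include and `"foo/bar.h"` to get a quoted one, without a separate IsAngled argument.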
Reviewed By: hokein
Differential Revision: https://reviews.llvm.org/D85666
Haojian Wu [Mon, 28 Sep 2020 13:08:28 +0000 (15:08 +0200)]
[clang] Don't emit "no member" diagnostic if the lookup fails on an invalid record decl.
The "no member" diagnostic is likely bogus.
Reviewed By: sammccall, #libc
Differential Revision: https://reviews.llvm.org/D86765