Vitaly Buka [Wed, 28 Apr 2021 01:33:58 +0000 (18:33 -0700)]
[scudo] Enable arm32 arch
Dávid Bolvanský [Tue, 27 Apr 2021 21:19:44 +0000 (23:19 +0200)]
[DSE] Eliminate zero memset after calloc
Solves PR11896
As noted, this can be improved further (calloc -> malloc) in some cases. But for now, this is the first step.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101391
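As a rough illustration of the source-level pattern this change is about, consider the sketch below. The helper name `makeZeroedBuffer` is hypothetical and not taken from the patch; whether the `memset` is actually removed depends on the optimization pipeline in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not from the patch): calloc already returns
// zero-initialized memory, so the memset below only stores zeros over zeros.
unsigned char *makeZeroedBuffer(std::size_t n) {
  assert(n > 0 && "calloc(0, ...) is implementation-defined; avoid it here");
  auto *p = static_cast<unsigned char *>(std::calloc(n, 1));
  if (!p)
    return nullptr;
  std::memset(p, 0, n); // redundant zero store: a candidate for elimination
  return p;
}
```

The observable result is the same with or without the `memset`, which is what makes the store removable.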
Hongtao Yu [Wed, 28 Apr 2021 00:16:29 +0000 (17:16 -0700)]
[CSSPGO] Fix an AV caused by a block that has only pseudo instructions.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D101415
Greg McGary [Tue, 27 Apr 2021 19:22:44 +0000 (12:22 -0700)]
[lld-macho][NFC] define more strings in section_names:: and segment_names::
As preparation for a subsequent diff that implements builtin section renaming, define more `constexpr` strings in namespaces `lld::macho::segment_names` and `lld::macho::section_names`, and use them to replace string literals.
Differential Revision: https://reviews.llvm.org/D101393
Ahmed Taei [Mon, 26 Apr 2021 19:35:12 +0000 (12:35 -0700)]
Handle the case of tiling and padding a subset of the dimensions
This is useful in cases such as tile-distribute-and-pad where not all
dims are tiled.
Differential Revision: https://reviews.llvm.org/D101319
Han Zhu [Tue, 9 Feb 2021 01:24:25 +0000 (17:24 -0800)]
[loop-idiom] Hoist loop memcpys to loop preheader
For a simple loop like:
```
struct S {
int x;
int y;
char b;
};
unsigned foo(S* __restrict__ a, S* b, int n) {
for (int i = 0; i < n; i++)
a[i] = b[i];
return sizeof(a[0]);
}
```
We could eliminate the loop and convert it to a single large memcpy of 12*n bytes. Currently this is not handled. Output of `opt -loop-idiom -S < memcpy_before.ll`:
```
%struct.S = type { i32, i32, i8 }
define dso_local i32 @_Z3fooP1SS0_i(%struct.S* noalias nocapture %a, %struct.S* nocapture readonly %b, i32 %n) local_unnamed_addr {
entry:
%cmp7 = icmp sgt i32 %n, 0
br i1 %cmp7, label %for.body.preheader, label %for.cond.cleanup
for.body.preheader: ; preds = %entry
br label %for.body
for.cond.cleanup.loopexit: ; preds = %for.body
br label %for.cond.cleanup
for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry
ret i32 12
for.body: ; preds = %for.body, %for.body.preheader
%i.08 = phi i32 [ %inc, %for.body ], [ 0, %for.body.preheader ]
%idxprom = zext i32 %i.08 to i64
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %b, i64 %idxprom
%arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %a, i64 %idxprom
%0 = bitcast %struct.S* %arrayidx2 to i8*
%1 = bitcast %struct.S* %arrayidx to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* nonnull align 4 dereferenceable(12) %0, i8* nonnull align 4 dereferenceable(12) %1, i64 12, i1 false)
%inc = add nuw nsw i32 %i.08, 1
%cmp = icmp slt i32 %inc, %n
br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit
}
; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0
attributes #0 = { argmemonly nofree nosync nounwind willreturn }
```
The loop idiom pass currently only handles load and store instructions. Since struct S is too big to fit in a register, the loop body contains a memcpy intrinsic.
With this change, re-running `opt -loop-idiom -S < memcpy_before.ll` hoists the loop memcpy to the loop preheader. For this trivial case, the remaining loop is dead and will be removed by another pass.
```
%struct.S = type { i32, i32, i8 }
define dso_local i32 @_Z3fooP1SS0_i(%struct.S* noalias nocapture %a, %struct.S* nocapture readonly %b, i32 %n) local_unnamed_addr {
entry:
%a1 = bitcast %struct.S* %a to i8*
%b2 = bitcast %struct.S* %b to i8*
%cmp7 = icmp sgt i32 %n, 0
br i1 %cmp7, label %for.body.preheader, label %for.cond.cleanup
for.body.preheader: ; preds = %entry
%0 = zext i32 %n to i64
%1 = mul nuw nsw i64 %0, 12
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %a1, i8* align 4 %b2, i64 %1, i1 false)
br label %for.body
for.cond.cleanup.loopexit: ; preds = %for.body
br label %for.cond.cleanup
for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry
ret i32 12
for.body: ; preds = %for.body, %for.body.preheader
%i.08 = phi i32 [ %inc, %for.body ], [ 0, %for.body.preheader ]
%idxprom = zext i32 %i.08 to i64
%arrayidx = getelementptr inbounds %struct.S, %struct.S* %b, i64 %idxprom
%arrayidx2 = getelementptr inbounds %struct.S, %struct.S* %a, i64 %idxprom
%2 = bitcast %struct.S* %arrayidx2 to i8*
%3 = bitcast %struct.S* %arrayidx to i8*
%inc = add nuw nsw i32 %i.08, 1
%cmp = icmp slt i32 %inc, %n
br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit
}
; Function Attrs: argmemonly nofree nosync nounwind willreturn
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* noalias nocapture writeonly, i8* noalias nocapture readonly, i64, i1 immarg) #0
attributes #0 = { argmemonly nofree nosync nounwind willreturn }
```
Reviewed By: zino
Differential Revision: https://reviews.llvm.org/D97667
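The equivalence the transformation relies on can be sketched at the source level as follows. The function names `copyLoop` and `copyBulk` are hypothetical, and the sketch assumes the two buffers do not overlap (the role `__restrict__` plays in the example above); `sizeof(S)` is 12 on common ABIs but that is not guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct S {
  int x;
  int y;
  char b;
};

// Element-wise copy, as written in the original loop.
void copyLoop(S *a, const S *b, int n) {
  for (int i = 0; i < n; i++)
    a[i] = b[i];
}

// One bulk copy of n * sizeof(S) bytes, guarded by the same n > 0 check
// that protects the loop preheader in the IR.
void copyBulk(S *a, const S *b, int n) {
  if (n > 0) {
    assert(a != nullptr && b != nullptr);
    std::memcpy(a, b, static_cast<std::size_t>(n) * sizeof(S));
  }
}
```

Both functions leave every field of the destination equal to the corresponding field of the source, which is why the per-iteration memcpy can be replaced by a single one.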
David Tenty [Tue, 27 Apr 2021 23:36:45 +0000 (19:36 -0400)]
[AIX] Add %pluginext and update tests to use proper pluginext
As a follow-on to D96282, since bugpoint passes is built as a module, the proper file extension to use is LLVM_PLUGIN_EXT rather than SHLIBEXT. Using SHLIBEXT causes the tests to load a non-existent file on AIX. We also adjust the PluginsTest unittest to use LLVM_PLUGIN_EXT for similar reasons.
This change should hopefully make little difference to other platforms, since generally `SHLIBEXT=LTDL_SHLIB_EXT=CMAKE_SHARED_LIBRARY_SUFFIX` and `LLVM_PLUGIN_EXT=CMAKE_SHARED_LIBRARY_SUFFIX` on every platform except AIX.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D101412
Rob Suderman [Sat, 24 Apr 2021 02:24:49 +0000 (19:24 -0700)]
[tosa][mlir] Fix FullyConnected to correctly order dimensions
MatMul and FullyConnected have transposed dimensions for the weights.
Also, removed unneeded tensor reshape for bias.
Differential Revision: https://reviews.llvm.org/D101220
Rob Suderman [Sat, 24 Apr 2021 05:00:06 +0000 (22:00 -0700)]
[mlir][tosa] Add tosa.negate lowerings for quantized cases
Quantized negation can be performed using operations on a wider bit width.
The minimal sufficient bit width is picked to perform the operation.
Differential Revision: https://reviews.llvm.org/D101225
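To see why a wider intermediate type is needed, here is a minimal sketch for the int8 case. It assumes the usual quantized-negate form `out = clamp(outZp - (in - inZp))`; the function name, parameter names, and the choice of a 32-bit intermediate are illustrative assumptions, not the lowering from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical int8 quantized negate. The intermediate value can reach
// roughly +/-383, which does not fit in 8 bits, so the arithmetic is done
// in a wider type and clamped back to the int8 range afterwards.
int8_t negateQuantized(int8_t in, int32_t inZp, int32_t outZp) {
  assert(inZp >= -128 && inZp <= 127);
  assert(outZp >= -128 && outZp <= 127);
  int32_t wide = outZp - (static_cast<int32_t>(in) - inZp);
  wide = std::max<int32_t>(-128, std::min<int32_t>(127, wide));
  return static_cast<int8_t>(wide);
}
```

For example, negating -128 with both zero points at 0 yields 128 before clamping, which is the overflow case the wider type avoids.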
Jim Radford [Sat, 3 Apr 2021 06:51:17 +0000 (23:51 -0700)]
[CMake][llvm] avoid conflict w/ (and use when available) new builtin check_linker_flag
Match the API for the new check_linker_flag and use it directly when
available, leaving the old code as a fallback.
Differential Revision: https://reviews.llvm.org/D100901
Alexander Shaposhnikov [Tue, 27 Apr 2021 23:18:12 +0000 (16:18 -0700)]
Revert "[llvm-objcopy][MachO] Add support for LC_THREAD/LC_UNIXTHREAD"
This reverts commit
4dfddf715b94857998601aa79c25e4f327d44dfa
since it breaks some build bots (e.g. clang-ppc64be-linux)
Heejin Ahn [Tue, 27 Apr 2021 21:59:12 +0000 (14:59 -0700)]
[WebAssembly] Error when wasm EH is used with Emscripten EH/SjLj
- Error out when both Emscripten EH and wasm EH are used together, i.e.,
both `-enable-emscripten-cxx-exceptions` and `-exception-model=wasm`
are given together. This will not happen if you use Emscripten, but
this can happen when you call `llc` manually with the wrong set of
arguments.
- Currently we don't yet support using wasm EH with Emscripten SjLj.
Unlike `-enable-emscripten-cxx-exceptions` which is turned on only
when you use `emcc -s DISABLE_EXCEPTION_CATCHING=0`,
`-enable-emscripten-sjlj` is turned on by Emscripten by default. So we
error out only when it is turned on and `setjmp` or `longjmp` is
actually used.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D101403
Petr Hosek [Tue, 27 Apr 2021 21:26:55 +0000 (14:26 -0700)]
[Driver] Add -print-multiarch
This is useful, for example, in the runtimes build, which currently tries to
guess the correct triple for placing libraries in the multiarch
layout. Using this flag, the build system can get the correct triple
directly by querying Clang.
Differential Revision: https://reviews.llvm.org/D101400
Alexander Shaposhnikov [Tue, 27 Apr 2021 22:54:28 +0000 (15:54 -0700)]
[llvm-objcopy][MachO] Add support for LC_THREAD/LC_UNIXTHREAD
Add support for LC_THREAD/LC_UNIXTHREAD
(these load commands can be copied over without any modifications).
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D101384
Joseph Huber [Tue, 27 Apr 2021 22:02:05 +0000 (18:02 -0400)]
[OpenMP] Remove legacy pass manager run lines
Summary:
Two tests in OpenMPOpt currently fail using the legacy pass manager. Remove
these run lines to prevent tests from failing.
Jez Ng [Tue, 27 Apr 2021 22:02:18 +0000 (18:02 -0400)]
[lld-macho] Don't put an antivirus test file in reproduce.s
It appears that some antivirii do not recognize that "this is a
test": https://reviews.llvm.org/D101218#2720676
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D101402
Jez Ng [Mon, 26 Apr 2021 05:23:32 +0000 (01:23 -0400)]
[lld-macho] std::sort -> llvm::sort
Craig Topper [Tue, 27 Apr 2021 21:38:40 +0000 (14:38 -0700)]
[SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT
Previously we used an i32 constant to store the saturation width, but i32 isn't
legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the
type legalizer.
This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes
it opaque to the type legalizer.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101262
Craig Topper [Tue, 27 Apr 2021 19:48:44 +0000 (12:48 -0700)]
[RISCV] Select 5 bit immediate for VSETIVLI during isel rather than peepholing in the custom inserter.
This adds a special operand type that is allowed to be either
an immediate or register. By giving it a unique operand type the
machine verifier will ignore it.
This perturbs a lot of tests but mostly it is just slightly different
instruction orders. Something bad did happen to some min/max reduction
tests. We're spilling vector registers when we weren't before.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D101246
Reid Kleckner [Tue, 27 Apr 2021 21:32:04 +0000 (00:32 +0300)]
[NFC][SimplifyCFG] Precommit SimplifyCFG tests from D29428
Roman Lebedev [Tue, 27 Apr 2021 21:24:44 +0000 (00:24 +0300)]
[NFC][SimplifyCFG] Autogenerate check lines in few more tests
River Riddle [Tue, 27 Apr 2021 21:27:08 +0000 (14:27 -0700)]
[mlir] Fix bug in ForwardDataFlowAnalysis solver
Explicitly check for uninitialized to prevent crashes in edge cases where the derived analysis creates a lattice element for a value that hasn't been visited yet.
Arthur Eubanks [Tue, 27 Apr 2021 19:35:25 +0000 (12:35 -0700)]
[ConstFold] Use const-folded operands in more places
Previously we were const folding operands but not passing them.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101394
Michael Kruse [Tue, 27 Apr 2021 21:27:32 +0000 (16:27 -0500)]
[OpenMP][CMake] Pass --cuda-path to regression tests.
The OpenMP runtime can be compiled using a CUDA installed at non-default
location with the -DCUDA_TOOLKIT_ROOT_DIR setting. However, check-openmp
will fail afterwards because Clang needs to know where to find the CUDA
headers.
Fix by passing --cuda-path to Clang using the value of
CUDA_TOOLKIT_ROOT_DIR which has been determined by CMake. Also set
LD_LIBRARY_PATH such that it can find the CUDA runtime when executing.
This will ensure that the regression tests do not depend on the current
environment, but use the environment they were configured for.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D101266
Dávid Bolvanský [Tue, 27 Apr 2021 19:25:32 +0000 (21:25 +0200)]
[DSE] Added testcases for 11896, NFC
Han Zhu [Tue, 30 Mar 2021 19:56:19 +0000 (12:56 -0700)]
[loop-idiom][NFC] Extract processLoopStoreOfLoopLoad into a helper function
Differential Revision: https://reviews.llvm.org/D100979
Nikita Popov [Fri, 23 Apr 2021 20:02:27 +0000 (22:02 +0200)]
[SCEV] Handle uge/ugt predicates in applyLoopGuards()
These can be handled the same way as ule/ult, just using umax
instead of umin. This is useful in cases where the umax prevents
the upper bound from overflowing.
Differential Revision: https://reviews.llvm.org/D101196
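The reasoning behind using umax can be sketched with plain unsigned integers. The function names below are hypothetical and this is not the SCEV implementation; it only shows that, on the guarded path, replacing a value by the corresponding umax does not change it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// On a path guarded by "x uge c", x can be replaced by umax(x, c)
// without changing its value.
uint32_t rewriteUnderUgeGuard(uint32_t x, uint32_t c) {
  assert(x >= c && "only meaningful on the guarded path");
  return std::max<uint32_t>(x, c);
}

// On a path guarded by "x ugt c", x can be replaced by umax(x, c + 1).
// c + 1 cannot wrap here because x > c implies c is below the maximum.
uint32_t rewriteUnderUgtGuard(uint32_t x, uint32_t c) {
  assert(x > c && "only meaningful on the guarded path");
  return std::max<uint32_t>(x, c + 1);
}
```

The rewritten expression carries the lower bound explicitly, so later range reasoning can use it.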
Alexey Bataev [Tue, 27 Apr 2021 20:38:30 +0000 (13:38 -0700)]
[SLP]Add a test for possibly vectorized tiny tree, NFC.
Dmitry Vyukov [Tue, 27 Apr 2021 18:19:28 +0000 (20:19 +0200)]
tsan: fix build with COMPILER_RT_TSAN_DEBUG_OUTPUT
COMPILER_RT_TSAN_DEBUG_OUTPUT enables TSAN_COLLECT_STATS,
which changes layout of runtime structs (some structs contain
stats when the option is enabled).
It's not OK to build runtime with the define, but tests without it.
The error is detected by build_consistency_stats/nostats.
Fix this by defining TSAN_COLLECT_STATS for tests to match the runtime.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101386
Dmitry Vyukov [Fri, 23 Apr 2021 11:10:39 +0000 (13:10 +0200)]
tsan: refactor fork handling
Commit
efd254b6362 ("tsan: fix deadlock in pthread_atfork callbacks")
fixed another deadlock related to atfork handling.
But builders with DCHECKs enabled reported failures of
pthread_atfork_deadlock2.c and pthread_atfork_deadlock3.c tests
related to the fact that we hold runtime locks on interceptor exit:
https://lab.llvm.org/buildbot/#/builders/70/builds/6727
This issue is somewhat inherent to the current approach,
we indeed execute user code (atfork callbacks) with runtime lock held.
Refactor fork handling to not run user code (atfork callbacks)
with runtime locks held. This change does this by installing
our own atfork callbacks during runtime initialization.
Atfork callbacks run in LIFO order, so the expectation is that
our callbacks run last, right before the actual fork.
This way we lock runtime mutexes around fork, but not around
user callbacks.
Extend tests to also install after fork callbacks just to cover
more scenarios. Some tests also started reporting real races
that we previously suppressed.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101385
Samuel Thibault [Tue, 27 Apr 2021 20:36:12 +0000 (13:36 -0700)]
Hurd: Clean up Debian multiarch /usr/include/<triplet>
This is a follow-up of
35dd6470de84 for the Hurd case, to avoid the
duplication of the i386-gnu path, already provided by
Hurd::getMultiarchTriple.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101324
Nikita Popov [Tue, 27 Apr 2021 20:33:17 +0000 (22:33 +0200)]
[SCEV] Improve loop guard tests (NFC)
Invert the branch order to make the predicate more obvious.
Add tests with two predicates, to show that rewrites are
combined.
Fangrui Song [Tue, 27 Apr 2021 20:31:37 +0000 (13:31 -0700)]
Gnu: Replace a GCCInstallation.isValid() check with assert
Adrian Prantl [Tue, 27 Apr 2021 20:23:48 +0000 (13:23 -0700)]
Update testcase for D101333.
Samuel Thibault [Tue, 27 Apr 2021 20:19:17 +0000 (13:19 -0700)]
hurd: Clean up test
- Mark Windows as unsupported to drop the backslash-handling code
- Upgrade to current gcc 10 version
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101347
Arthur Eubanks [Tue, 27 Apr 2021 20:06:32 +0000 (13:06 -0700)]
[test] Fix some func-attrs tests under the legacy PM
The new PM doesn't visit declarations in CGSCC passes. These tests
aren't testing that detail, so just run them against the new PM.
Samuel Thibault [Tue, 27 Apr 2021 20:04:41 +0000 (13:04 -0700)]
hurd: Detect libstdc++ include paths on Debian Hurd i386
This is a follow-up of
e92d2b80c6c9 ("[Driver] Detect libstdc++ include
paths for native gcc (-m32 and -m64) on Debian i386") for the Debian Hurd
case, which has the same multiarch name reduction from i686 to i386.
i386-linux-gnu is actually Linux-only, so this moves the code of that commit
to Linux.cpp, and adds the same to Hurd.cpp
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101331
Peter Steinfeld [Mon, 26 Apr 2021 22:20:22 +0000 (15:20 -0700)]
[flang] Handle structure constructors with forward references to PDTs
We were not correctly handling structure constructors that had forward
references to parameterized derived types. I harvested the code that checks
for forward references that was used during analysis of function call
expressions and called it from there and also called it during the
analysis of structure constructors.
I also added a test that will produce an internal error without this change.
Differential Revision: https://reviews.llvm.org/D101330
Samuel Thibault [Tue, 27 Apr 2021 19:41:18 +0000 (12:41 -0700)]
hurd: Fix i386 search path
f26341840253 ("[Driver] Gnu.cpp: remove obsoleted i386 triple detection
from end-of-life distribution versions") dropped the i686-gnu gcc path, but
GNU/Hurd's gcc is actually using it, and not i386.
This fixes the gcc path and updates the tests to reflect it.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101317
Evgenii Stepanov [Tue, 27 Apr 2021 19:31:34 +0000 (12:31 -0700)]
Revert "tsan: fix deadlock in pthread_atfork callbacks"
Tests fail on debug builders. See the forward fix in
https://reviews.llvm.org/D101385.
This reverts commit
efd254b63621de9ce750eddf9e8135154099d261.
Sanjay Patel [Tue, 27 Apr 2021 19:14:29 +0000 (15:14 -0400)]
[InstCombine] fold clamp to 2 values from min/max intrinsics
The "select" versions of these folds are also missing and can
cause infinite loops as shown in:
https://llvm.org/PR48900
...but it seems easier to match these as max/min as a first fix.
https://alive2.llvm.org/ce/z/wv-_dT
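The idea can be illustrated on plain integers: clamping to a range of exactly two adjacent values can only produce one of those two values, so the min/max pair is equivalent to a single compare-and-select. The function names below are hypothetical and the sketch is not the InstCombine code.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Clamp x to the two-value range [lo, lo + 1] using min/max.
int clampViaMinMax(int x, int lo) {
  assert(lo < INT_MAX && "lo + 1 must not overflow");
  return std::min(std::max(x, lo), lo + 1);
}

// The same result expressed as a select between the two constants.
int clampViaSelect(int x, int lo) {
  assert(lo < INT_MAX && "lo + 1 must not overflow");
  return x > lo ? lo + 1 : lo;
}
```

For every `x`, both functions return `lo` when `x <= lo` and `lo + 1` otherwise.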
Sanjay Patel [Tue, 27 Apr 2021 18:43:09 +0000 (14:43 -0400)]
[InstCombine] add tests for clamp patterns using min/max intrinsics; NFC
Andy Kaylor [Tue, 27 Apr 2021 19:15:26 +0000 (12:15 -0700)]
[Dependence Analysis] Fix ExactSIV producing wrong analysis
Patch by Artem Radzikhovskyy!
Symptom: the ExactSIV test produced an incorrect analysis of dependencies (see the LIT tests).
Bug: at the end of the algorithm, when determining the dependence direction, the original author forgot to divide the intermediate results by the gcd and round the result toward zero.
Although this bug can be fixed with significantly fewer changes, I opted to write the code in a way that reflects the original algorithm that Banerjee proposed, for easier reference in the future. Surprisingly, this results in shorter code and fewer quotient and max/min calculations.
Changes Summary:
- Fixed findGCD to return valid x and y so that they match the function description, where: ax - by = gcd(a,b)
- Fixed the ExactSIV test to produce proper results
- Documented the extension of Banerjee's algorithm that the original code author introduced. Banerjee's original algorithm only tested whether Dst depends on Src; the extension also allows us to test whether Src depends on Dst, in one pass.
- The ExactRDIV test worked fine. Since it uses findGCD(), it needed to be updated. Since the ExactRDIV test has very few changes from the core algorithm of ExactSIV, I modified the test to have a format consistent with ExactSIV.
- Updated the LIT tests to test for correct values.
Differential Revision: https://reviews.llvm.org/D100331
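The identity `ax - by = gcd(a,b)` mentioned for findGCD can be sketched with the extended Euclidean algorithm on plain 64-bit integers. The type and function names below are hypothetical; the real findGCD has a different signature and operates on LLVM's own integer types.

```cpp
#include <cassert>
#include <cstdint>

struct GcdResult {
  int64_t g; // gcd(a, b)
  int64_t x; // coefficient of a
  int64_t y; // coefficient of b
};

// Classic extended Euclid: returns g, x, y with a*x + b*y == g.
static GcdResult extendedGcd(int64_t a, int64_t b) {
  if (b == 0)
    return {a, 1, 0};
  GcdResult r = extendedGcd(b, a % b);
  // b*r.x + (a % b)*r.y == g  =>  a*r.y + b*(r.x - (a / b)*r.y) == g
  return {r.g, r.y, r.x - (a / b) * r.y};
}

// Same result in the "difference" form a*x - b*y == g, obtained by
// negating the second coefficient.
GcdResult gcdDifferenceForm(int64_t a, int64_t b) {
  assert(a > 0 && b > 0 && "sketch only covers positive inputs");
  GcdResult r = extendedGcd(a, b);
  return {r.g, r.x, -r.y};
}
```

For instance, `gcdDifferenceForm(35, 15)` gives g = 5 with x = 1 and y = 2, since 35*1 - 15*2 = 5.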
Jay Foad [Tue, 27 Apr 2021 16:03:22 +0000 (17:03 +0100)]
[AMDGPU] GCNHazardRecognizer: ignore all meta instructions
This is hopefully NFC, but should be more robust in ignoring all
instructions that should be ignored, instead of just some of them.
Differential Revision: https://reviews.llvm.org/D101372
Evgenii Stepanov [Tue, 27 Apr 2021 19:06:21 +0000 (12:06 -0700)]
Fix -Wunused-but-set-variable warning in msan_test.cpp
Roman Lebedev [Tue, 27 Apr 2021 18:28:53 +0000 (21:28 +0300)]
[NFC][SimplifyCFG] Autogenerate check lines in many test files
These are potentially being affected by an upcoming patch.
David Green [Tue, 27 Apr 2021 18:33:24 +0000 (19:33 +0100)]
[ARM] Recognize VIDUP from BUILDVECTORs of additions
This adds a pattern to recognize VIDUP from BUILD_VECTOR of incrementing
adds. This can come up from either geps or adds, and came up recently in
D100550. We are just looking for a BUILD_VECTOR where each lane is an
add of the first lane with N*i, where i is the lane and N is one of 1,
2, 4, or 8, supported by the VIDUP instruction.
Differential Revision: https://reviews.llvm.org/D101263
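The lane pattern being matched can be sketched on concrete values. This is a simplification: the actual pattern works on BUILD_VECTOR operands in the DAG rather than on known integers, and the function name below is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the step N (1, 2, 4 or 8) if lanes[i] == lanes[0] + N * i for
// every lane i, and 0 if the lanes do not form such a sequence.
// Unsigned arithmetic is used so that wrap-around is well defined.
uint32_t matchIncrementingLanes(const std::vector<uint32_t> &lanes) {
  assert(lanes.size() >= 2 && "need at least two lanes to infer a step");
  uint32_t step = lanes[1] - lanes[0];
  if (step != 1 && step != 2 && step != 4 && step != 8)
    return 0;
  for (std::size_t i = 0; i < lanes.size(); ++i)
    if (lanes[i] != lanes[0] + step * static_cast<uint32_t>(i))
      return 0;
  return step;
}
```

A vector such as {10, 12, 14, 16} matches with N = 2, while {0, 3, 6, 9} does not because 3 is not a supported increment.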
David Green [Tue, 27 Apr 2021 08:37:31 +0000 (09:37 +0100)]
[ARM] Additional VIDUP tests. NFC
Alexey Bataev [Wed, 21 Apr 2021 16:59:58 +0000 (09:59 -0700)]
[COST][X86]Improve cost model for reverse shuffle v32i16/v64i8 in AVX512F.
Improved cost model for reverse shuffle on AVX512F for types
v32i16/v64i8.
Differential Revision: https://reviews.llvm.org/D100974
Roman Lebedev [Tue, 27 Apr 2021 18:09:20 +0000 (21:09 +0300)]
[NFC][Verifier] Fixup token PHINode test cases
It would still pass in non-assert build,
but with asserts it would now crash.
I haven't checked, but hopefully `not`'s `--crash` argument
should be enough to support both paths.
Jessica Clarke [Tue, 27 Apr 2021 18:03:57 +0000 (19:03 +0100)]
[ELF][MIPS] Emit dynamic relocations for PIC non-preemptible static TLS
This is the same problem as
127176e59eb9, but for static TLS rather than
dynamic TLS. Although we know the symbol will be the one in our own TLS
segment, and thus the offset of it within that, we don't know where in
the static TLS block our data will be allocated and thus we must emit a
dynamic relocation for this case.
Reviewed By: MaskRay, atanasyan
Differential Revision: https://reviews.llvm.org/D101381
Jessica Clarke [Tue, 27 Apr 2021 17:57:47 +0000 (18:57 +0100)]
[ELF][MIPS] Don't emit dynamic relocations for PIE non-preemptible TLS
Whilst not wrong (unless using static PIE where the relocations are
likely not implemented by the runtime), this is inefficient, as the TLS
module indices and offsets are independent of the executable's load
address.
Reviewed By: MaskRay, atanasyan
Differential Revision: https://reviews.llvm.org/D101382
Ahmed Bougacha [Tue, 20 Apr 2021 15:59:51 +0000 (08:59 -0700)]
[docs] Replace Apple representative to security group.
Differential Revision: https://reviews.llvm.org/D100864
Roman Lebedev [Tue, 27 Apr 2021 17:52:44 +0000 (20:52 +0300)]
[NFC][IR] PHINode: ... and assert in another ctor too
Roman Lebedev [Tue, 27 Apr 2021 17:40:15 +0000 (20:40 +0300)]
[NFC][IR] PHINode: assert we aren't trying to create token-typed PHI
The Verifier will complain, but by then it may be too late:
we might never reach it because
we already crashed with some bogus bug.
It is best to catch this the moment it happens.
Anirudh Prasad [Tue, 27 Apr 2021 17:37:16 +0000 (13:37 -0400)]
[SystemZ][z/OS] Remove register prefixes when printing out the register.
- This patch is the first part in enforcing prefix-less registers for the HLASM dialect in z/OS
- This patch removes the "%[r|f|v]" prefix while printing registers
- To achieve this, the `AssemblerDialect` field of MAI was used
- There is also a bit of refactoring done to ensure code repetition is reduced.
- Currently the LLVM assembler for SystemZ/z/OS accepts both prefixed registers and prefix-less registers. A subsequent follow-up patch will restrict the SystemZAsmParser to only accept prefix-less registers.
Crediting @kianm as an author as well.
Reviewed By: uweigand, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D101308
Craig Topper [Tue, 27 Apr 2021 16:21:28 +0000 (09:21 -0700)]
[TableGen] Add predicate checks to isel patterns for default HwMode.
As discussed in D100691 and based on D100889.
I removed the ModeChecks cache which provides little value. Reduced
from three loops to two. Used ArrayRef to pass the Predicate to
AppendPattern to avoid needing to construct a vector for single
mode. Used SmallVector to avoid heap allocation constructing
DefaultCheck for the in-tree targets that use it.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D101240
Vitaly Buka [Tue, 27 Apr 2021 17:45:53 +0000 (10:45 -0700)]
[NFC][lsan] Another attempt to fix arm bot
Adrian Prantl [Tue, 27 Apr 2021 17:38:51 +0000 (10:38 -0700)]
Also display the underlying error message when displaying a fixit
When the user running LLDB with default settings sees the fixit
notification it means that the auto-applied fixit didn't work. This
patch shows the underlying error message instead of just the fixit to
make it easier to understand what the error in the expression was.
Differential Revision: https://reviews.llvm.org/D101333
Michał Górny [Sat, 24 Apr 2021 20:36:20 +0000 (22:36 +0200)]
[lldb] [gdb-remote] Report QPassSignals and qXfer via extensions API
Remove hardcoded platform list for QPassSignals, qXfer:auxv:read
and qXfer:libraries-svr4:read and instead query the process plugin
via the GetSupportedExtensions() API.
Differential Revision: https://reviews.llvm.org/D101241
Petr Hosek [Tue, 27 Apr 2021 09:15:37 +0000 (02:15 -0700)]
[Driver] Fix tests failing in per-target multiarch layout
These failures were revealed by
b4537c3f51bc6c011ddd9c10b80043ac4ce16a01.
Differential Revision: https://reviews.llvm.org/D101348
Nick Desaulniers [Tue, 27 Apr 2021 16:58:42 +0000 (09:58 -0700)]
[CodeGenOptions] make StackProtectorGuardOffset signed
GCC supports negative values for -mstack-protector-guard-offset=, so this
should be a signed value. Pre-req to D100919.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101325
LLVM GN Syncbot [Tue, 27 Apr 2021 16:56:33 +0000 (16:56 +0000)]
[gn build] Port
241c2da4064c
Victor Huang [Thu, 22 Apr 2021 16:00:28 +0000 (11:00 -0500)]
[AIX][Power10] Restrict prefixed instructions from crossing the 64byte boundary
This patch adds support for restricting prefixed instructions from
crossing the 64 byte boundary:
- Add the infrastructure to register a custom XCOFF streamer
- Add a custom XCOFF streamer for PowerPC to allow us to
intercept instructions as they are being emitted and align all 8 byte
instructions to a 64 byte boundary if required by adding a 4 byte nop.
Reviewed By: stefanp
Differential Revision: https://reviews.llvm.org/D101107
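The padding decision can be sketched as simple offset arithmetic. This assumes instructions are emitted at 4-byte-aligned offsets, so an 8-byte prefixed instruction crosses a 64-byte boundary only when it would start 4 bytes before one; the function name is hypothetical and this is not the streamer code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Returns how many bytes of padding (0 or 4, i.e. one nop) to emit before
// an 8-byte prefixed instruction that would otherwise start at `offset`.
uint64_t paddingBeforePrefixed(uint64_t offset) {
  assert(offset % 4 == 0 && "instructions are assumed 4-byte aligned");
  // Starting at byte 60 of a 64-byte block, the instruction would occupy
  // bytes 60..67 and straddle the boundary.
  return (offset % 64 == 60) ? 4 : 0;
}
```

An instruction at offset 56 fits in bytes 56..63 and needs no padding, while one at offset 60 is pushed to offset 64 by a single 4-byte nop.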
Nico Weber [Tue, 27 Apr 2021 16:31:09 +0000 (12:31 -0400)]
[llvm, clang] Remove stdlib includes from .h files without `std::`
Found files not containing `std::` with:
INCL="algorithm|array|list|map|memory|queue|set|string|utility|vector|unordered_map|unordered_set"
git ls-files llvm/include/llvm | grep '\.h$' | xargs grep -L std:: | \
xargs grep -El "#include <($INCL)>$" > to_process.txt
git ls-files clang/include/clang | grep '\.h$' | xargs grep -L std:: | \
xargs grep -El "#include <($INCL)>$" >> to_process.txt
Then removed these headers from those files with
INCL_ESCAPED="$(echo $INCL|sed 's/|/\\|/g')"
cat to_process.txt | xargs sed -i "/^#include <\($INCL_ESCAPED\)>$/d"
cat to_process.txt | xargs sed -i '/^$/N;/^\n$/D'
No behavior change.
Differential Revision: https://reviews.llvm.org/D101378
Christian Kühnel [Wed, 21 Apr 2021 09:33:44 +0000 (11:33 +0200)]
[doc] added documentation for pre-merge testing
fixes https://github.com/google/llvm-premerge-checks/issues/275
Differential Revision: https://reviews.llvm.org/D100936
David Sherwood [Tue, 27 Apr 2021 14:46:03 +0000 (15:46 +0100)]
Revert "[LoopVectorize] Simplify scalar cost calculation in getInstructionCost"
This reverts commit
4afeda9157cffd2daa83f8075d73f1e11ea34c81.
Asher Mancinelli [Mon, 19 Apr 2021 14:33:25 +0000 (07:33 -0700)]
[flang] Add format test to GTest suite
Reviewed by: awarzynski
Differential Revision: https://reviews.llvm.org/D100765
Simon Pilgrim [Tue, 27 Apr 2021 14:39:06 +0000 (15:39 +0100)]
Revert rG9b7a0a50355d5 - Revert "[X86] Add support for reusing ZF etc. from locked XADD instructions (PR20841)"
Still causing some sanitizer buildbot failures.
David Sherwood [Wed, 10 Mar 2021 08:34:19 +0000 (08:34 +0000)]
[LoopVectorize] Simplify scalar cost calculation in getInstructionCost
This patch simplifies the calculation of certain costs in
getInstructionCost when isScalarAfterVectorization() returns a true value.
There are a few places where we multiply a cost by a number N, i.e.
unsigned N = isScalarAfterVectorization(I, VF) ? VF.getKnownMinValue() : 1;
return N * TTI.getArithmeticInstrCost(...
After some investigation it seems that there are only these cases that occur
in practice:
1. VF is a scalar, in which case N = 1.
2. VF is a vector. We can only get here if: a) the instruction is a
GEP/bitcast/PHI with scalar uses, or b) this is an update to an induction
variable that remains scalar.
I have changed the code so that N is assumed to always be 1. For GEPs
the cost is always 0, since this is calculated later on as part of the
load/store cost. PHI nodes are costed separately and were never previously
multiplied by VF. For all other cases I have added an assert that none of
the users needs scalarising, which didn't fire in any unit tests.
Only one test required fixing and I believe the original cost for the scalar
add instruction to have been wrong, since only one copy remains after
vectorisation.
I have also added a new test for the case when a pointer PHI feeds directly
into a store that will be scalarised as we were previously never testing it.
Differential Revision: https://reviews.llvm.org/D99718
David Goldman [Fri, 19 Mar 2021 20:23:15 +0000 (16:23 -0400)]
[clangd] Improve handling of Objective-C protocols in types
Improve support for Objective-C protocols for types/type locs
Differential Revision: https://reviews.llvm.org/D98984
Martin Storsjö [Tue, 6 Apr 2021 08:37:52 +0000 (11:37 +0300)]
[libcxx] [test] Convert a couple of LIBCXX-WINDOWS-FIXME into XFAIL: windows-dll for known bugs
These are caused by inconsistencies regarding always inline in
combination with dllimport. A bug report reference is added next to
each XFAIL line.
Differential Revision: https://reviews.llvm.org/D100789
Martin Storsjö [Tue, 6 Apr 2021 07:44:52 +0000 (10:44 +0300)]
[libcxx] [test] Add a separate 'windows-dll' feature to check for
This allows distinguishing failures in tests that only fail when libcxx
is linked as a DLL, allowing narrowing down XFAILs (avoiding XPASS errors
if not built as a DLL).
If both enable_shared and enable_static are set, the tests link and use
the shared version of the lib.
Differential Revision: https://reviews.llvm.org/D100221
David Goldman [Mon, 26 Apr 2021 21:58:35 +0000 (17:58 -0400)]
[clangd] run clang-format on FindTargetTests.cpp's FindExplicitReferencesTest
Addressing comments in https://reviews.llvm.org/D98984
Differential Revision: https://reviews.llvm.org/D101328
Yaxun (Sam) Liu [Tue, 27 Apr 2021 13:35:20 +0000 (09:35 -0400)]
[HIP] Fix help text for -fgpu-allow-device-init
Add 'experimental' to help text.
Simon Pilgrim [Tue, 27 Apr 2021 14:00:57 +0000 (15:00 +0100)]
[X86] Add support for reusing ZF etc. from locked XADD instructions (PR20841)
XADD has the same EFLAGS behaviour as ADD
Reapplies rG2149aa73f640 (after it was reverted at rG535df472b042) - AFAICT rG029e41ec9800 should ensure we correctly tag the LXADD* ops as load/stores - I haven't been able to repro the sanitizer buildbot fails locally so this is a speculative commit.
Joachim Protze [Tue, 27 Apr 2021 13:50:53 +0000 (15:50 +0200)]
[OpenMP][libomptarget] Separate lit tests for different offloading targets (2/2)
This patch fuses the RUN lines for most libomptarget tests. The previous patch
D101315 created separate test targets for each supported offloading triple.
This patch updates the RUN lines in libomptarget tests to use a generic run
line independent of the offloading target selected for the lit instance.
In cases where no RUN line was defined for a specific offloading target,
the corresponding target is declared as XFAIL. If it turns out that a test
actually supports the target, the XFAIL line can be removed.
Differential Revision: https://reviews.llvm.org/D101326
Gabor Marton [Tue, 27 Apr 2021 12:57:12 +0000 (14:57 +0200)]
[analyzer][StdLibraryFunctionsChecker] Track dependent arguments
When we report an argument constraint violation, we should track those
other arguments that participate in the evaluation of the violation. By
default, we depend only on the argument that is constrained. However,
there are some special cases, like the buffer size constraint, that might
be encoded in other arguments.
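As an illustration of such a dependency (a sketch only, not the checker's implementation), consider a call shaped like `fread(buf, size, count, stream)`: the constraint is on the buffer, but its bound comes from two other arguments. The `ReadCall` struct and `satisfiesBufferSizeConstraint` are hypothetical names introduced here.
```
#include <cstddef>

// Hypothetical model of a call such as fread(buf, size, count, stream).
struct ReadCall {
  std::size_t bufferBytes; // capacity of the buffer passed as argument 0
  std::size_t size;        // element size, argument 1
  std::size_t count;       // element count, argument 2
};

// Returns true when the buffer can hold size * count bytes.
// An overflowing product is treated as a violation.
inline bool satisfiesBufferSizeConstraint(const ReadCall &call) {
  if (call.size != 0 &&
      call.count > static_cast<std::size_t>(-1) / call.size)
    return false;
  return call.bufferBytes >= call.size * call.count;
}
```
A violation of this constraint cannot be explained by the buffer argument alone, which is why the size and count arguments are worth tracking in the report as well.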
Differential Revision: https://reviews.llvm.org/D101358
Jay Foad [Tue, 27 Apr 2021 13:03:42 +0000 (14:03 +0100)]
[AMDGPU] Minor refactoring in AMDGPUUnifyDivergentExitNodes. NFC.
Make unifyReturnBlockSet a member function so we don't have to pass TTI
around as an argument.
Simon Pilgrim [Tue, 27 Apr 2021 13:11:05 +0000 (14:11 +0100)]
[X86] Ensure multiclass ATOMIC_RMW_BINOP is tagged as MayLoad and MayStore
These are RMW ops and should be tagged as both loads and stores.
Frederik Gossen [Tue, 27 Apr 2021 13:02:47 +0000 (15:02 +0200)]
[MLIR] Debug log IR after pattern applications
Like `print-ir-after-all` and `-before-all`, this allows inspecting IR for
debug purposes. While the former allow inspection only between passes, this
change allows following the rewrites that happen within passes.
Differential Revision: https://reviews.llvm.org/D100940
Alexey Bataev [Fri, 23 Apr 2021 19:33:41 +0000 (12:33 -0700)]
[SLP]Improved isGatherShuffledEntry, NFC.
Reworked the isGatherShuffledEntry function, simplified it, and moved
common code to the lambda (it will go away once the non-power-2 patch
lands).
Frederik Gossen [Tue, 27 Apr 2021 12:50:52 +0000 (14:50 +0200)]
[MLIR][Shape] Remove empty extent tensor operands
Empty extent tensor operands were only removed when they were defined as a
constant. Additionally, we can remove them if they are known to be empty by
their type `tensor<0xindex>`.
Differential Revision: https://reviews.llvm.org/D101351
Florian Hahn [Mon, 26 Apr 2021 12:18:34 +0000 (13:18 +0100)]
[LV,LAA] Add test cases with pointer phis in loops.
Pre-commit tests for D101286.
Frederik Gossen [Tue, 27 Apr 2021 12:47:39 +0000 (14:47 +0200)]
[MLIR][Shape] Replace single operand broadcasts with appropriate cast
Differential Revision: https://reviews.llvm.org/D101350
Petar Avramovic [Tue, 27 Apr 2021 12:44:59 +0000 (14:44 +0200)]
AMDGPU/GlobalISel: Fix negative offset folding for buffer_load
Buffer_load does unsigned offset calculations. Don't fold
operands of 32-bit add that are likely to cause unsigned add
overflow (common case is when one of the operands is negative).
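The wrap-around in question can be shown with known values. This is a sketch of the arithmetic only; the actual patch works on operands whose values are generally unknown at compile time, and the helper names below are hypothetical.
```
#include <cstdint>

// Returns true when adding the two operands as 32-bit unsigned values wraps.
inline bool unsignedAddOverflows(std::uint32_t base, std::uint32_t offset) {
  return static_cast<std::uint32_t>(base + offset) < base;
}

// Hypothetical check: folding is only safe if the unsigned add cannot wrap.
// A negative signed offset becomes a very large unsigned value.
inline bool isSafeToFoldOffset(std::uint32_t base, std::int32_t offset) {
  return !unsignedAddOverflows(base, static_cast<std::uint32_t>(offset));
}
```
For example, a base of 16 with an offset of -4 is 16 + 0xFFFFFFFC as unsigned values, which wraps around to 12, so the split operands do not describe the same address calculation that the hardware performs.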
Differential Revision: https://reviews.llvm.org/D91336
Petar Avramovic [Tue, 27 Apr 2021 12:40:26 +0000 (14:40 +0200)]
AMDGPU/GlobalISel: Add test for buffer_load with negative offset
Pre-commit test for D91336.
Florian Hahn [Tue, 27 Apr 2021 12:28:27 +0000 (13:28 +0100)]
[LV] Hoist code to get vector loop latch (NFC).
Address suggestion from D99294.
Sanjay Patel [Tue, 27 Apr 2021 12:02:44 +0000 (08:02 -0400)]
[IndVars] avoid crash in LFTR when assuming an add recurrence
The test is a crasher reduced from:
https://llvm.org/PR49993
linearFunctionTestReplace() assumes that we have an add recurrence,
so check for that as a condition of matching a loop counter.
Differential Revision: https://reviews.llvm.org/D101291
Nicholas Guy [Thu, 4 Mar 2021 14:36:13 +0000 (14:36 +0000)]
[AArch64] Enable runtime unrolling for in-order sched models
Differential Revision: https://reviews.llvm.org/D97947
Anastasia Stulova [Tue, 27 Apr 2021 10:03:26 +0000 (11:03 +0100)]
[C++4OpenCL] Add diagnostics for OpenCL types in templates.
Refactored diagnostics for OpenCL types to allow their
reuse for templates.
Patch by olestrohm (Ole Strohm)!
Differential Revision: https://reviews.llvm.org/D100860
Florian Hahn [Tue, 27 Apr 2021 11:21:27 +0000 (12:21 +0100)]
[VPlan] Use recursive traversal iterator in VPSlotTracker.
This patch simplifies VPSlotTracker by using the recursive traversal
iterator to traverse all blocks in a VPlan in reverse post-order when
numbering VPValues in a plan.
This depends on a fix to RPOT (D100169). It also extends the traversal
unit tests to check RPOT.
Reviewed By: a.elovikov
Differential Revision: https://reviews.llvm.org/D100176
Zarko Todorovski [Mon, 26 Apr 2021 14:50:16 +0000 (10:50 -0400)]
[AIX] Allow safe for 32bit P9 VSX extract and insert pattern matches
In https://reviews.llvm.org/D92789 PPC64 checks were added that disallowed most
VSX pattern matching. We enable some safe ones for 32bit in this patch.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D97503
Dmitry Vyukov [Fri, 23 Apr 2021 11:10:39 +0000 (13:10 +0200)]
tsan: fix deadlock in pthread_atfork callbacks
We take report/thread_registry locks around fork.
This means we cannot report any bugs in atfork handlers.
We resolved this by enabling per-thread ignores around fork.
This resolved some of the cases, but not all.
The added test triggers a race report from a signal handler
called from an atfork callback. We reset per-thread ignores
around signal handlers, so we tried to report it and deadlocked.
But there are more cases: a signal handler can be called
synchronously if the thread sends the signal to itself. Any other
report type would cause deadlocks as well: mutex misuse,
signal handler spoiling errno, etc.
Disable all reports for the duration of fork with
thr->suppress_reports and don't re-enable them around
signal handlers.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101154
Pushpinder Singh [Tue, 27 Apr 2021 10:47:05 +0000 (10:47 +0000)]
Reapply "[AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed"
This reverts commit 93604305bb72201641f31cc50a6e7b2fe65d3af3.
Kadir Cetinkaya [Thu, 22 Apr 2021 14:08:48 +0000 (16:08 +0200)]
[clangd] Dont index deeply nested symbols
This is a fix for some timeouts and OOM problems faced while indexing an
auto-generated file with thousands of nested lambdas.
Differential Revision: https://reviews.llvm.org/D101066
Alexander Belyaev [Tue, 27 Apr 2021 10:27:04 +0000 (12:27 +0200)]
[mlir] Add a pass to tile Linalg ops using `linalg.tiled_loop`.
Differential Revision: https://reviews.llvm.org/D101084
Joachim Protze [Tue, 27 Apr 2021 10:23:18 +0000 (12:23 +0200)]
[OpenMP][libomptarget] Separate lit tests for different offloading targets (1/2)
This patch creates a separate test directory for each offloading target to be
tested. This allows testing multiple architectures in one configuration while
still seeing all failing tests separately. The lit test names include the target
triple, so that it will be easier to spot the failing target.
This patch also allows marking expected-failing tests based on the
target triple, as the currently used triple is added to the lit "features":
```
// XFAIL: nvptx64-nvidia-cuda
```
Differential Revision: https://reviews.llvm.org/D101315
Petar Avramovic [Tue, 27 Apr 2021 10:21:57 +0000 (12:21 +0200)]
AMDGPU/GlobalISel: Remove redundant G_FCANONICALIZE
Add a basic version of isCanonicalized for global-isel, copied from sdag.
Add a post-legalizer combine that deletes G_FCANONICALIZE when its input
is already canonicalized.
Differential Revision: https://reviews.llvm.org/D96605
Marek Kurdej [Tue, 27 Apr 2021 10:22:56 +0000 (12:22 +0200)]
[libc++] Fix set-but-not-used warning. NFC.