Jan Svoboda [Fri, 6 Aug 2021 12:46:41 +0000 (14:46 +0200)]
[clang] Remove misleading assertion in FullSourceLoc
D31709 added an assertion to `FullSourceLoc::hasManager()` that ensured a valid `SourceLocation` is always paired with a `SourceManager`, and a missing `SourceManager` is always paired with an invalid `SourceLocation`.
This appears to be incorrect, since clients never cared about constructing `FullSourceLoc` to uphold that invariant, or always checking `isValid()` before calling `hasManager()`.
The assertion started failing when serializing diagnostics pointing into an explicit module. Explicit modules don't have valid `SourceLocation` for the `import` statement, since they are "imported" from the command-line argument `-fmodule-name=x.pcm`.
This patch removes the assertion, since `FullSourceLoc` was never intended to uphold any kind of invariants between the validity of `SourceLocation` and presence of `SourceManager`.
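For illustration only, the relationship the removed assertion enforced can be sketched with stand-in types. `Loc`, `Manager`, and `FullLoc` below are hypothetical simplifications, not clang's real classes, and `upholdsOldInvariant` is an assumed paraphrase of the old check:

```cpp
// Stand-in types; clang's real SourceLocation/SourceManager are far richer.
struct Loc {
  unsigned ID = 0; // 0 models an invalid location
  constexpr bool isValid() const { return ID != 0; }
};
struct Manager {};

struct FullLoc {
  Loc L;
  const Manager *SrcMgr = nullptr;

  // After the patch: a plain query that assumes nothing about how the
  // presence of a manager relates to the validity of the location.
  constexpr bool hasManager() const { return SrcMgr != nullptr; }

  // What the removed assertion effectively required (assumed paraphrase).
  constexpr bool upholdsOldInvariant() const {
    return hasManager() == L.isValid();
  }
};

// Models a diagnostic pointing into an explicit module: a manager is
// present, but there is no valid location for the import.
constexpr Manager ModuleManager{};
constexpr FullLoc ImportLoc{Loc{}, &ModuleManager};
```

In this sketch `ImportLoc` has a manager but an invalid location, which is exactly the combination the old assertion rejected.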
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D106862
Andrzej Warzynski [Thu, 5 Aug 2021 09:17:37 +0000 (09:17 +0000)]
[flang][docs] Document the `flang` wrapper script
Differential Revision: https://reviews.llvm.org/D107543
Rainer Orth [Fri, 6 Aug 2021 12:04:11 +0000 (14:04 +0200)]
[profile] Only use NT_GNU_BUILD_ID if supported
The Solaris buildbots have been broken for some time by the unconditional
use of `NT_GNU_BUILD_ID`, e.g. Solaris/sparcv9
<https://lab.llvm.org/staging/#/builders/50/builds/4910> and Solaris/amd64
<https://lab.llvm.org/staging/#/builders/101/builds/3751>. Being a GNU
extension, it is not defined in `<sys/elf.h>`. However, providing a
fallback definition doesn't help because the code also relies on
`__ehdr_start`, another unportable GNU extension that most likely never
will be implemented in Solaris `ld`. Besides, there's really no point in
supporting build ids since they aren't used on Solaris at all.
This patch fixes this by making the relevant code conditional on the
definition of `NT_GNU_BUILD_ID`.
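A minimal sketch of this kind of guard, not the actual compiler-rt code: `BuildIdSupported` and `writeBuildId` are hypothetical names, and the use of `<elf.h>` is an assumption that holds on typical Linux systems.

```cpp
// Pull in ELF definitions where a header is available. Which header
// defines the note type varies by platform; <elf.h> is an assumption.
#if defined(__has_include)
#  if __has_include(<elf.h>)
#    include <elf.h>
#  endif
#endif

#ifdef NT_GNU_BUILD_ID
// The platform knows about GNU build ids: the real lookup would go here.
constexpr bool BuildIdSupported = true;
#else
// E.g. Solaris: compile the feature out instead of defining a fallback.
constexpr bool BuildIdSupported = false;
#endif

// Hypothetical entry point: 0 if a build id could be written, -1 if the
// feature is compiled out.
constexpr int writeBuildId() { return BuildIdSupported ? 0 : -1; }
```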
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D107556
Mircea Trofin [Thu, 5 Aug 2021 19:29:27 +0000 (12:29 -0700)]
[NFC][MLGO] Make logging more robust
1) add some self-diagnosis (when asserts are enabled) to check that all
features have the same nr of entries
2) avoid storing pointers to mutable fields because the proto API
contract doesn't actually guarantee those stay fixed even if no further
mutation of the object occurs.
Differential Revision: https://reviews.llvm.org/D107594
Luna Kirkby [Fri, 6 Aug 2021 11:08:17 +0000 (07:08 -0400)]
Split 'qualifier on reference type has no effect' out into a new flag
This introduces a new flag ignored-reference-qualifiers for the
existing "'A' qualifier on reference type B has no effect" diagnostic,
as a child of ignored-qualifiers.
Rationale:
This particular diagnostic is enabled by default, but other parts of
ignored-qualifiers are not. Anecdotally, a user may encounter this
diagnostic in the wild, and, seeing it to be valuable, might try to
raise it to error with -Werror=ignored-qualifiers, whereupon the other
diagnostics the flag covers will also be raised, to the user's surprise
and confusion. By splitting this diagnostic out into a separate flag,
and marking it as a child of ignored-qualifiers, we allow the user more
granular control of the diagnostics they care about, while maintaining
backwards compatibility with existing build scripts.
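A small example of code that is expected to trigger this diagnostic (the alias names are made up for illustration):

```cpp
#include <type_traits>

using IntRef = int &;

// 'const' applied to a reference type through an alias is dropped, which
// is what the "qualifier on reference type has no effect" diagnostic
// points out.
using ConstIntRef = const IntRef;

static_assert(std::is_same<ConstIntRef, int &>::value,
              "the qualifier really is ignored");
```

With the new flag, such a warning could be promoted on its own via -Werror=ignored-reference-qualifiers, leaving the rest of ignored-qualifiers untouched.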
Reshabh Sharma [Fri, 6 Aug 2021 09:56:12 +0000 (15:26 +0530)]
[AMDGPU] Handle functions in llvm's global ctors and dtors list
This patch introduces a new code object metadata field, ".kind"
which is used to add support for init and fini kernels.
HSAStreamer will use function attributes, "device-init" and
"device-fini" to distinguish between init and fini kernels from
the regular kernels and will emit metadata with ".kind" set to
"init" and "fini" respectively.
To reduce the number of init and fini kernels, the ctors and
dtors present in the llvm's global.ctors and global.dtors lists
are called from a single init and fini kernel respectively.
Reviewed by: yaxunl
Differential Revision: https://reviews.llvm.org/D105682
Simon Pilgrim [Fri, 6 Aug 2021 10:21:19 +0000 (11:21 +0100)]
[ARM] Fold insert_subvector to concat_vectors
D107068 fixed the same problem on aarch64 but the arm variant wasn't exposed in existing test coverage.
I've copied the arm64-neon-copy tests (and stripped the intrinsic test from it) for testing on arm neon builds as well.
Simon Pilgrim [Fri, 6 Aug 2021 09:46:22 +0000 (10:46 +0100)]
[X86][AVX] Extract SUBV_BROADCAST constant bits from just the lower subvector range (PR51281)
As reported on PR51281, an internal fuzz test encountered an issue when extracting constant bits from a SUBV_BROADCAST node from a constant pool source larger than the broadcasted subvector width.
The getTargetConstantBitsFromNode was assuming that the Constant would be the same size as the subvector, resulting in the incorrect packing of the per-element bits data.
This patch attempts to solve this by using the SUBV_BROADCAST node to determine the subvector width, and then ensuring we extract only the lowest bits from Constant of that subvector bitsize.
Differential Revision: https://reviews.llvm.org/D107158
Alexander Belyaev [Fri, 6 Aug 2021 09:28:11 +0000 (11:28 +0200)]
[linalg] Expose `rewriteAsPaddedOp` function.
Differential Revision: https://reviews.llvm.org/D107629
Justas Janickas [Thu, 5 Aug 2021 11:42:53 +0000 (12:42 +0100)]
[C++4OpenCL] Introduces __remove_address_space utility
This change provides a way to conveniently declare types that have
address space qualifiers removed.
Since OpenCL adds address spaces implicitly even when they are not
specified in source, it is useful to allow deriving address space
unqualified types.
Fixes llvm.org/PR45326
Differential Revision: https://reviews.llvm.org/D106785
Stefan Gränitz [Fri, 6 Aug 2021 09:15:54 +0000 (11:15 +0200)]
[Orc][examples] Temporarily disable tests for the C API due to failures on sanitizer bots
These tests were added while the OrcV2Example tests had been disabled:
https://reviews.llvm.org/rGe5d8cfb2f134fcf0235ec1a35eec875a9cd36b21
Failures on sanitizer bots:
https://green.lab.llvm.org/green/job/clang-stage2-cmake-RgSan/7992/testReport/
Cullen Rhodes [Fri, 6 Aug 2021 08:16:07 +0000 (08:16 +0000)]
[AArch64] NFC: drop unnecessary llvm:: namespace prefix on MCInst
Sven van Haastregt [Fri, 6 Aug 2021 09:21:26 +0000 (10:21 +0100)]
[OpenCL][Docs] Adding builtins requires adding to both now
As we are trying to reach parity between opencl-c.h and
-fdeclare-opencl-builtins, ensure the documentation mentions that new
builtins should be added to both.
Reviewed by: Anastasia Stulova
David Sherwood [Fri, 30 Jul 2021 07:41:31 +0000 (08:41 +0100)]
[LoopVectorize] Improve vectorisation of some intrinsics by treating them as uniform
This patch adds more instructions to the Uniforms list, for example certain
intrinsics that are uniform by definition or whose operands are loop invariant.
This list includes:
1. The intrinsics 'experimental.noalias.scope.decl' and 'sideeffect', which
are always uniform by definition.
2. If intrinsics 'lifetime.start', 'lifetime.end' and 'assume' have
loop invariant input operands then these are uniform too.
Also, in VPRecipeBuilder::handleReplication we check if an instruction is
uniform based purely on whether or not the instruction lives in the Uniforms
list. However, there are certain cases where calls to some intrinsics can
be effectively treated as uniform too. Therefore, we now also treat the
following cases as uniform for scalable vectors:
1. If the 'assume' intrinsic's operand is not loop invariant, then we
are free to treat this as uniform anyway since it's only a performance
hint. We will get the benefit for the first lane.
2. When the input pointers for 'lifetime.start' and 'lifetime.end' are loop
variant then for scalable vectors we assume these still ultimately come
from the broadcast of an alloca. We do not support scalable vectorisation
of loops containing alloca instructions, hence the alloca itself would
be invariant. If the pointer does not come from an alloca then the
intrinsic itself has no effect.
I have updated the assume test for fixed width, since we now treat it
as uniform:
Transforms/LoopVectorize/assume.ll
I've also added new scalable vectorisation tests for other intrinsics:
Transforms/LoopVectorize/scalable-assume.ll
Transforms/LoopVectorize/scalable-lifetime.ll
Transforms/LoopVectorize/scalable-noalias-scope-decl.ll
Differential Revision: https://reviews.llvm.org/D107284
Vladislav Vinogradov [Tue, 3 Aug 2021 14:23:31 +0000 (17:23 +0300)]
[mlir] Allow to override type/attr aliases from various hooks
Use new return type for `OpAsmDialectInterface::getAlias`:
* `AliasResult::NoAlias` if an alias was not provided.
* `AliasResult::OverridableAlias` if an alias was provided, but it might be overridden by another hook.
* `AliasResult::FinalAlias` if an alias was provided and it should be used (no other hooks will be checked).
In that case `AsmPrinter` will use either the first alias with `FinalAlias` result or
the last alias with `OverridableAlias` result (it depends on dialect array order).
Used `OverridableAlias` result for `BuiltinOpAsmDialectInterface`.
Use case: provide more informative alias for built-in attributes like `AffineMapAttr`
instead of generic "map<N>".
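The selection rule described above can be sketched as follows. This is a standalone model, not MLIR code: the local `AliasResult` enum mirrors the names in the message, while `pickAlias` and the example arrays are hypothetical.

```cpp
#include <cstddef>

enum class AliasResult { NoAlias, OverridableAlias, FinalAlias };

// Returns the index of the hook whose alias wins, or -1 if no hook
// provided one. Hooks are visited in order; the first FinalAlias stops
// the search, otherwise the last OverridableAlias seen is kept.
constexpr int pickAlias(const AliasResult *Results, std::size_t N) {
  int Chosen = -1;
  for (std::size_t I = 0; I < N; ++I) {
    if (Results[I] == AliasResult::FinalAlias)
      return static_cast<int>(I);
    if (Results[I] == AliasResult::OverridableAlias)
      Chosen = static_cast<int>(I);
  }
  return Chosen;
}

// A builtin-style hook offers an overridable alias; a later hook
// overrides it.
constexpr AliasResult OnlyOverridable[] = {AliasResult::OverridableAlias,
                                           AliasResult::NoAlias,
                                           AliasResult::OverridableAlias};

// A FinalAlias in the middle wins over anything that follows.
constexpr AliasResult WithFinal[] = {AliasResult::OverridableAlias,
                                     AliasResult::FinalAlias,
                                     AliasResult::OverridableAlias};
```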
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D107437
Chuanqi Xu [Fri, 6 Aug 2021 07:10:02 +0000 (15:10 +0800)]
[FuncSpec] Return changed if function is changed by tryToReplaceWithConstant
The function may get changed before specialization by RunSCCPSolver. In other
words, the pass may change the function without any specialization happening.
Add a test and comment to reveal this.
And it may report "not changed" even if the function gets changed by
RunSCCPSolver before the specialization. It looks like a potential bug.
Test Plan: check-all
Reviewed By: https://reviews.llvm.org/D107622
Differential Revision: https://reviews.llvm.org/D107622
Esme-Yi [Fri, 6 Aug 2021 08:54:02 +0000 (08:54 +0000)]
[llvm-readobj][XCOFF] Warn about invalid offset
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D107398
David Sherwood [Fri, 6 Aug 2021 07:33:44 +0000 (08:33 +0100)]
Revert "[LoopVectorize] Add support for replication of more intrinsics with scalable vectors"
This reverts commit
95800da914938129083df2fa0165c1901909c273.
Jay Foad [Tue, 3 Aug 2021 16:11:08 +0000 (17:11 +0100)]
[AMDGPU][GlobalISel] Better legalization of 32-bit ctlz/cttz
Differential Revision: https://reviews.llvm.org/D107474
Jay Foad [Wed, 4 Aug 2021 10:55:29 +0000 (11:55 +0100)]
[AMDGPU][GlobalISel] Improve regbankselect for 64-bit VGPR ctlz_zero_undef/cttz_zero_undef
We can improve on the generic splitting by using ffbh/ffbl, which have a
defined result when the input is zero.
Differential Revision: https://reviews.llvm.org/D107442
Jay Foad [Wed, 4 Aug 2021 08:14:25 +0000 (09:14 +0100)]
[AMDGPU][GlobalISel] Add G_AMDGPU_FFBL_B32
This is the counterpart to G_AMDGPU_FFBH_U32 which already exists. These
instructions have a defined result of -1 when the input is zero.
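To show why a defined result of -1 for zero is convenient, here is a software model. `ffbl` and `cttz32` are illustrative functions, and whether the patches emit exactly this sequence is an assumption:

```cpp
#include <cstdint>

// Software model of a "find first bit low" operation that yields -1
// (all ones) for a zero input, as the message describes.
constexpr std::uint32_t ffbl(std::uint32_t X) {
  if (X == 0)
    return 0xFFFFFFFFu;
  std::uint32_t N = 0;
  while ((X & 1u) == 0) {
    X >>= 1;
    ++N;
  }
  return N;
}

// cttz must return the bit width (32) for zero. Because -1 is the
// largest unsigned value, an unsigned min against 32 is enough.
constexpr std::uint32_t cttz32(std::uint32_t X) {
  return ffbl(X) < 32u ? ffbl(X) : 32u;
}
```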
Differential Revision: https://reviews.llvm.org/D107441
Jay Foad [Wed, 4 Aug 2021 13:37:45 +0000 (14:37 +0100)]
[GlobalISel] Improve legalization of narrow CTTZ
Differential Revision: https://reviews.llvm.org/D107457
Chuanqi Xu [Fri, 6 Aug 2021 08:38:20 +0000 (16:38 +0800)]
[NFC] [FuncSpec] Remove unused variables in isArgumentInteresting
Chuanqi Xu [Fri, 6 Aug 2021 07:38:06 +0000 (15:38 +0800)]
[FuncSpec] Move invariant computation for spec cost out of loop (NFC-ish)
Noticed that the computation for function specialization cost of a
function wouldn't change during the traversal of the arguments for the
function. We could hoist the computation out of the traversal. I
observed a ~1% improvement in compile time for spec2017, but I guess
that figure may not be precise. This should be NFC and fine.
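The shape of the change can be sketched with a toy model. `CostModel` and both helper functions are hypothetical; the call counter exists only to show how often the query runs.

```cpp
// Hypothetical stand-in for an expensive per-function cost query.
struct CostModel {
  int Calls = 0;
  constexpr int functionCost() {
    ++Calls;
    return 42;
  }
};

// Before: the cost is queried again for every argument.
constexpr int callsWhenRecomputed(int NumArgs) {
  CostModel M{};
  int Total = 0;
  for (int I = 0; I < NumArgs; ++I)
    Total += M.functionCost();
  (void)Total;
  return M.Calls;
}

// After: the cost is queried once, before the traversal, and reused.
constexpr int callsWhenHoisted(int NumArgs) {
  CostModel M{};
  const int Cost = M.functionCost();
  int Total = 0;
  for (int I = 0; I < NumArgs; ++I)
    Total += Cost;
  (void)Total;
  return M.Calls;
}
```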
Reviewed By: Sjoerd Meijer
Differential Revision: https://reviews.llvm.org/D107621
Serge Pavlov [Thu, 5 Aug 2021 11:12:18 +0000 (18:12 +0700)]
Introduce intrinsic llvm.isnan
This is a recommit of the patch
16ff91ebccda1128c43ff3cee104e2c603569fb2,
reverted in
0c28a7c990c5218d6aec47c5052a51cba686ec5e because it had
an error in the call to getFastMathFlags (the base type should be
FPMathOperator, not Instruction). The original commit message is duplicated below:
Clang has a builtin function '__builtin_isnan', which implements the C
library function 'isnan'. This function is currently implemented entirely in
clang codegen, which expands the function into a set of IR operations.
There are three mechanisms by which the expansion can be made.
* The most common mechanism is using an unordered comparison made by
instruction 'fcmp uno'. This simple solution is target-independent
and works well in most cases. It is however not suitable if floating
point exceptions are tracked. The corresponding IEEE 754 operation and C
function must never raise an FP exception, even if the argument is a
signaling NaN. Compare instructions usually do not have this
property; they raise the 'invalid' exception in such a case. So this
mechanism is unsuitable when exception behavior is strict. In
particular it could result in unexpected trapping if the argument is an SNaN.
* Another solution was implemented in https://reviews.llvm.org/D95948.
It is used in the cases when raising FP exceptions by 'isnan' is not
allowed. This solution implements 'isnan' using integer operations.
It solves the problem of exceptions, but offers one solution for all
targets, although some targets can do the check more efficiently.
* Solution implemented by https://reviews.llvm.org/D96568 introduced a
hook 'clang::TargetCodeGenInfo::testFPKind', which injects target
specific code into IR. Currently only SystemZ implements this hook, and it
generates a call to a target-specific intrinsic function.
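As an illustration of the integer-based approach, here is one common formulation for IEEE-754 binary32. It is a sketch, not necessarily the exact sequence clang emitted, and the function names are hypothetical:

```cpp
#include <cstdint>
#include <cstring>

// NaN test on the raw bits: exponent bits all ones and a non-zero
// mantissa. Clearing the sign bit and comparing against the bit pattern
// of infinity covers both conditions at once.
constexpr bool isNaNBits(std::uint32_t Bits) {
  return (Bits & 0x7FFFFFFFu) > 0x7F800000u;
}

// No floating-point comparison is executed, so the check itself cannot
// raise an FP exception, even for a signaling NaN.
inline bool isNaNViaBits(float F) {
  static_assert(sizeof(std::uint32_t) == sizeof(float),
                "expects a 32-bit float");
  std::uint32_t Bits = 0;
  std::memcpy(&Bits, &F, sizeof(Bits)); // well-defined bit copy
  return isNaNBits(Bits);
}
```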
Although these mechanisms allow 'isnan' to be implemented efficiently
enough, expanding 'isnan' in clang has drawbacks:
* The operation 'isnan' is hidden behind generic integer operations or
target-specific intrinsics. It complicates analysis and can prevent
some optimizations.
* IR can be created by tools other than clang; in this case the treatment
of 'isnan' has to be duplicated in that tool.
Another issue with the current implementation of 'isnan' comes from the
use of options '-ffast-math' or '-fno-honor-nans'. If such an option is
specified, 'fcmp uno' may be optimized to 'false'. It is a valid
optimization in general, but it results in 'isnan' always returning
'false'. For example, in some libc++ implementations the following code
returns 'false':
std::isnan(std::numeric_limits<float>::quiet_NaN())
The options '-ffast-math' and '-fno-honor-nans' imply that FP operation
operands are never NaNs. This assumption however should not be applied
to the functions that check FP number properties, including 'isnan'. If
such a function returns the expected result instead of actually making
checks, it becomes useless in many cases. The option '-ffast-math' is
often used for performance critical code, as it can speed up execution
at the expense of manual treatment of corner cases. If 'isnan' returns
an assumed result, a user cannot use it in the manual treatment of NaNs
and has to invent replacements, like making the check using integer
operations. There is a discussion in https://reviews.llvm.org/D18513#387418,
which also expresses the opinion that limitations imposed by
'-ffast-math' should be applied only to 'math' functions but not to
'tests'.
To overcome these drawbacks, this change introduces a new IR intrinsic
function 'llvm.isnan', which realizes the check as specified by the IEEE-754
and C standards in a target-agnostic way. During IR transformations it
does not undergo undesirable optimizations. It reaches instruction
selection, where it is lowered in a target-dependent way. The lowering can
vary depending on options like '-ffast-math' or '-ffp-model' so the
resulting code satisfies the requested semantics.
Differential Revision: https://reviews.llvm.org/D104854
Florian Hahn [Fri, 6 Aug 2021 07:12:54 +0000 (08:12 +0100)]
[LV] Move reduction PHI node fixup to VPlan::execute (NFC).
All information to fix-up the reduction phi nodes in the vectorized loop
is available in VPlan now. This patch moves the code to do so, to make
this clearer. Fixing up the loop exit value still relies on other
information and remains outside of VPlan for now.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D100113
Stella Laurenzo [Fri, 6 Aug 2021 04:10:03 +0000 (04:10 +0000)]
[mlir][python] Make a number of imports relative.
Avoiding absolute imports allows the code to be relocatable (which is used for out of tree integrations).
Differential Revision: https://reviews.llvm.org/D107617
Amara Emerson [Fri, 6 Aug 2021 07:06:13 +0000 (00:06 -0700)]
[GlobalISel] Make GLoadStore::getMemSize[InBits]() const.
Christian Kühnel [Fri, 6 Aug 2021 07:00:38 +0000 (07:00 +0000)]
[doc] added links to discord and discourse
Some folks are not aware that we have a Discourse server in addition to the mailing lists and a Discord server in addition to IRC. So I think we should add that.
These were announced on the mailing list a while ago: https://lists.llvm.org/pipermail/llvm-dev/2019-November/136880.html
Differential Revision: https://reviews.llvm.org/D100943
Chuanqi Xu [Fri, 6 Aug 2021 06:41:46 +0000 (14:41 +0800)]
[NFC] [FuncSpec] Update the Todo list for recursive functions
Now the recursive functions may get specialized many times when
`func-specialization-max-iters` increases. See discussion in
https://reviews.llvm.org/D106426 for details.
luxufan [Fri, 6 Aug 2021 06:18:29 +0000 (14:18 +0800)]
[JITLink][RISCV] Add relocation fixup test
This patch adds R_RISCV_HI20, R_RISCV_LO12 and R_RISCV_CALL relocation tests.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D107327
Adrian Kuegel [Thu, 5 Aug 2021 09:30:09 +0000 (11:30 +0200)]
[mlir][MemRef] Fix canonicalization of BufferCast(TensorLoad).
CastOp::areCastCompatible does not check whether casts are definitely compatible.
When going from dynamic to static offset or stride, the canonicalization cannot
know whether it is really cast compatible. In that case, it can only canonicalize
to an alloc plus copy.
Differential Revision: https://reviews.llvm.org/D107545
Amara Emerson [Fri, 6 Aug 2021 06:21:08 +0000 (23:21 -0700)]
Delete copy-ctor of MachineFrameInfo.
I just hit a nasty bug when writing a unit test after calling MF->getFrameInfo()
without declaring the variable as a reference.
Deleting the copy-constructor also showed a place in the ARM backend which was
doing the same thing, albeit it didn't impact correctness there from the looks of it.
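A sketch of the pitfall and of how a deleted copy constructor catches it at compile time. `FrameInfo`, `Function`, and `addObject` are hypothetical stand-ins, not the real MachineFrameInfo API:

```cpp
#include <type_traits>

// Hypothetical stand-in for a large, stateful info object.
class FrameInfo {
public:
  FrameInfo() = default;
  FrameInfo(const FrameInfo &) = delete; // accidental copies fail to compile
  int NumObjects = 0;
};

class Function {
  FrameInfo Info;

public:
  FrameInfo &getFrameInfo() { return Info; }
};

inline int addObject(Function &F) {
  // auto Info = F.getFrameInfo();  // would not compile: copy is deleted
  auto &Info = F.getFrameInfo();    // binds to the function's own object
  return ++Info.NumObjects;
}
```

Without the deleted constructor, the commented-out line would silently modify a temporary copy instead of the function's own frame info.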
Kai Luo [Fri, 6 Aug 2021 06:00:57 +0000 (06:00 +0000)]
[PowerPC] Fix shift amount of xxsldwi when performing vector int_to_double
POC
```
// main.c
#include <stdio.h>
#include <altivec.h>
extern vector double foo(vector int s);
int main() {
vector int s = {0, 1, 0, 4};
vector double vd;
vd = foo(s);
printf("%lf %lf\n", vd[0], vd[1]);
return 0;
}
// poc.c
vector double foo(vector int s) {
int x1 = s[1];
int x3 = s[3];
double d1 = x1;
double d3 = x3;
vector double x = { d1, d3 };
return x;
}
```
Compiled with `poc.c main.c -mcpu=pwr8 -O3` on a BE machine.
Current clang gives
```
4.000000 1.000000
```
while xlc gives
```
1.000000 4.000000
```
Xlc's output should be correct.
Reviewed By: shchenz, #powerpc
Differential Revision: https://reviews.llvm.org/D107428
Martin Storsjö [Fri, 6 Aug 2021 05:51:21 +0000 (08:51 +0300)]
[fuzzer] Fix building on case sensitive mingw platforms
Include windows.h with an all lowercase filename; Windows SDK headers
aren't self consistent so they can't be used in an entirely
case sensitive setting, and mingw headers use all lowercase names
for such headers.
This fixes building after
881faf41909b47376595e8d7bb9c9a109182d20b.
Arthur Eubanks [Fri, 6 Aug 2021 04:18:53 +0000 (21:18 -0700)]
[GCov] Emit memset instead of stores in __llvm_gcov_reset
For a very large module, __llvm_gcov_reset can become very large.
__llvm_gcov_reset previously emitted stores to a bunch of globals in one
huge basic block. MemCpyOpt would turn many of these stores into
memsets, and updating MemorySSA would be extremely slow.
Verified that this makes the compile time of certain files go down
drastically (20min -> 5min).
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107538
Ryan Prichard [Fri, 6 Aug 2021 04:55:23 +0000 (21:55 -0700)]
Replace "CHECK-NOT: #{{.*}}" with same-line positive checks. NFC.
The intent of the negative #{{.*}} checks is to verify that the line
declaring/defining a function has no attribute, but they could restrict
later function declarations instead.
The 2008-09-02-FunctionNotes.ll check had allowed @fn3 to have an
attribute, because there is only a single "define void @fn3()" in the
output.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D107614
Serge Bazanski [Fri, 6 Aug 2021 04:08:09 +0000 (21:08 -0700)]
[Lanai] fix lowering wide returns
This implements LanaiTargetLowering::CanLowerReturn, thereby ensuring
all return values conform to the RetCC and get sret-demoted as
necessary.
A regression test is also added that exercises this functionality.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D107086
Arthur O'Dwyer [Fri, 6 Aug 2021 03:29:53 +0000 (23:29 -0400)]
[libc++] s/_VSTD::_IsSame/_IsSame/. NFCI.
Jinsong Ji [Fri, 6 Aug 2021 02:58:50 +0000 (02:58 +0000)]
[PowerPC] Fix copy/paste error in scalar_to_vector patterns
https://reviews.llvm.org/D100478 refactoring added a copy/paste error
for v8i16 patterns.
Reviewed By: #powerpc, shchenz
Differential Revision: https://reviews.llvm.org/D107609
Jacques Pienaar [Fri, 6 Aug 2021 02:51:48 +0000 (19:51 -0700)]
[mlir] std.call reference function return types in failure
Makes it easier to see type mismatch from failure locally.
Differential Revision: https://reviews.llvm.org/D107288
Vitaly Buka [Fri, 6 Aug 2021 02:50:42 +0000 (19:50 -0700)]
[NFC][sanitizer] clang-format sem related block
Kai Luo [Fri, 6 Aug 2021 02:37:18 +0000 (02:37 +0000)]
[PowerPC] Pre-commit test for D107428. NFC.
Shilei Tian [Fri, 6 Aug 2021 02:31:42 +0000 (22:31 -0400)]
[NFC] Clean up and clang-format openmp/libomptarget/plugins/cuda/src/rtl.cpp
Jason Molenda [Fri, 6 Aug 2021 02:27:55 +0000 (19:27 -0700)]
Revert "[LLDB][GUI] Refactor form drawing using subsurfaces"
Temporarily revert this patch to unbreak the bots/builds
until we can understand what was intended; is_pad() call
isn't defined.
This reverts commit
2b89f40a411cb9717232df61371b24d73ae84cb8.
Matt Jacobson [Fri, 6 Aug 2021 02:12:00 +0000 (10:12 +0800)]
[AVR][clang] Pass '-fno-use-init-array' to cc1 as default
On AVR, '.ctors' is used, not '.init_array'. Make this the default
unless specifically overridden by driver argument.
This matches gcc, and it matches the behavior in (e.g.) the NetBSD
driver (for certain OS variants).
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D107610
Matthias Springer [Fri, 6 Aug 2021 01:28:12 +0000 (10:28 +0900)]
[mlir] Cleanup: Fix warnings in MLIR
Tested with gcc-10. Other compilers may generate additional warnings. This does not fix all warnings. There are a few extra ones in LLVMCore and MLIR.
* `OpEmitter::getAttrNameIndex`: -Wunused-function (function is private and not used anywhere)
* `PrintOpPass` copy constructor: -Wextra ("Base class should be explicitly initialized in the copy constructor")
* `LegalizeForLLVMExport.cpp`: -Woverflow (overflow is expected, silence warning by making the cast explicit)
Differential Revision: https://reviews.llvm.org/D107525
Jessica Paquette [Thu, 5 Aug 2021 21:26:36 +0000 (14:26 -0700)]
[AArch64][GlobalISel] Overhaul G_INSERT legalization
Similar cleanup to G_EXTRACT (
51bd4e874fa51412e7399fe7f863169b4f4829bc).
Also swap the order of clamp/widen to avoid unnecessary complex merges.
Add a bunch of missing testcases to legalize-inserts while we're at it.
Differential Revision: https://reviews.llvm.org/D107601
Jessica Paquette [Thu, 5 Aug 2021 21:48:18 +0000 (14:48 -0700)]
[AArch64][GlobalISel] Widen G_IMPLICIT_DEF and G_FREEZE before clamping
Similar to other cleanup commits which widen instructions before clamping
during legalization. Purpose of this is to avoid weird type breakdowns.
In terms of G_IMPLICIT_DEF, this simplifies legalization for other instructions.
The legalizer has to emit G_IMPLICIT_DEF to legalize certain instructions, so
this can help with emitting merges elsewhere.
Differential Revision: https://reviews.llvm.org/D107604
Sean Fertile [Fri, 14 May 2021 17:55:13 +0000 (13:55 -0400)]
[PowerPC][AIX] Create multiple constant sections.
Fixes an issue where late materialized constants can be more strictly
aligned than their containing csect.
Differential Revision: https://reviews.llvm.org/D103103
Jon Roelofs [Fri, 6 Aug 2021 00:46:33 +0000 (17:46 -0700)]
Revert "[GlobalISel][KnownBits] Implement G_CTPOP"
This reverts commit
ce6eb4f15a159e652bdccf92a9d3da8a972d1596.
It's broken on the windows bots: https://reviews.llvm.org/D107606#2930121
Amara Emerson [Thu, 29 Jul 2021 00:30:06 +0000 (17:30 -0700)]
[GlobalISel] Allow the ArtifactValueFinder to return the best available register on failure.
In some cases, like with inserts, we may have a matching size register already,
but still decide to try to look further. This change adds a CurrentBest
register to the value finder state, and any time a method fails to make progress,
returns that register (which may just be an empty Register).
To facilitate this, add a new entry point to the findValueFromDef() function
which initializes this state.
Also fix the build vector finder to return the current build_vector if all
sources are being requested.
Differential Revision: https://reviews.llvm.org/D107017
Jon Roelofs [Thu, 5 Aug 2021 21:57:44 +0000 (14:57 -0700)]
[GlobalISel][KnownBits] Implement G_CTPOP
Implementation copied almost verbatim from ValueTracking.
Differential revision: https://reviews.llvm.org/D107606
wlei [Thu, 5 Aug 2021 03:20:58 +0000 (20:20 -0700)]
[llvm-profgen] Fix bug of loop scope mismatch
One performance issue happened in profile generation and it turned out the line 525 loop is the bottleneck.
Moving the code outside of loop scope can fix this issue. The run time is improved from 30+mins to ~30s.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D107529
Omar Emara [Thu, 5 Aug 2021 23:50:50 +0000 (16:50 -0700)]
[LLDB][GUI] Refactor form drawing using subsurfaces
This patch adds a new method SubSurface to the Surface class. The method
returns another surface that is a subset of this surface. This is
important to further abstract away drawing from the ncurses objects. For
instance, fields could previously be drawn on subpads only but can now
be drawn on any surface. This is needed to create the file search
dialogs and similar functionalities.
There is an opportunity to refactor window drawing in general using
surfaces, but we shall consider this separately later.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D107182
Ryan Prichard [Thu, 5 Aug 2021 23:35:02 +0000 (16:35 -0700)]
Mark getc_unlocked as unavailable by default
Before D45736, getc_unlocked was available by default, but turned off
for non-Cygwin/non-MinGW Windows. D45736 then added 9 more unlocked
functions, which were unavailable by default, but it also:
* left getc_unlocked enabled by default,
* removed the disabling line for Windows, and
* added code to enable getc_unlocked for GNU, Android, and OSX.
For consistency, make getc_unlocked unavailable by default. Maybe this
was the intent of D45736 anyway.
Reviewed By: MaskRay, efriedma
Differential Revision: https://reviews.llvm.org/D107527
Jessica Paquette [Thu, 5 Aug 2021 18:25:41 +0000 (11:25 -0700)]
[AArch64][GlobalISel] Widen extloads before clamping during legalization
Allows us to avoid awkward type breakdowns on types like s88, like the other
commits.
Differential Revision: https://reviews.llvm.org/D107587
Stanislav Mekhanoshin [Thu, 5 Aug 2021 21:25:18 +0000 (14:25 -0700)]
[AMDGPU] Improve v2i32/v2f32 insertelt patterns
Using REG_SEQUENCE produces better code than INSERT_SUBREG,
we can omit one move instruction in many cases.
Fixes: SWDEV-298028
Differential Revision: https://reviews.llvm.org/D107602
Jinsong Ji [Thu, 5 Aug 2021 22:46:07 +0000 (22:46 +0000)]
[PowerPC] Remove accidentally left checks
Jinsong Ji [Thu, 5 Aug 2021 22:36:13 +0000 (22:36 +0000)]
[PowerPC] Add scalar vector test
Heejin Ahn [Wed, 4 Aug 2021 23:27:51 +0000 (16:27 -0700)]
[WebAssembly] Don't do SjLj transformation when there's only setjmp
When there is a `setjmp` call in a function, we transform every callsite
of `setjmp` to record its information by calling `saveSetjmp` function,
and we also transform every callsite of a function that can longjmp
to check if a longjmp occurred and if so jump to the corresponding
post-setjmp BB. Currently we are doing this for every function that
contains a call to `setjmp`, but if there is no other function call
within that function that can longjmp, this transformation of `setjmp`
callsite and all the preparation of `setjmpTable` in the entry of the
function are not necessary.
This checks if a setjmp-calling function has any other calls that can
longjmp, and if not, skips the function for the SjLj transformation.
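The two situations can be illustrated at the source level. The function names are made up, and this only models the distinction; it does not show the pass itself:

```cpp
#include <cassert>
#include <csetjmp>

// A callee that can longjmp back into its caller's frame.
inline void mayLongjmp(std::jmp_buf &Env) { std::longjmp(Env, 1); }

// setjmp plus a longjmp-capable call: the second return from setjmp is
// reachable, so the transformation is still needed here.
inline int setjmpWithCall() {
  std::jmp_buf Env;
  if (setjmp(Env) != 0)
    return 1; // reached via longjmp
  mayLongjmp(Env);
  return 0; // not reached
}

// setjmp only: nothing can longjmp to Env, so setjmp just returns 0 and
// the function is a candidate for being skipped.
inline int setjmpOnly() {
  std::jmp_buf Env;
  if (setjmp(Env) != 0)
    return 1; // unreachable in practice
  return 0;
}
```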
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D107530
David Green [Thu, 5 Aug 2021 22:23:24 +0000 (23:23 +0100)]
[AArch64] Expand the SVE min/max reduction costs to NEON
This takes the existing SVE costing for the various min/max reduction
intrinsics and expands it to NEON, where I believe it applies equally
well.
In the process it changes the lowering to use min/max cost, as opposed
to summing up the cost of ICmp+Select.
Differential Revision: https://reviews.llvm.org/D106239
Steven Wan [Thu, 5 Aug 2021 22:18:48 +0000 (18:18 -0400)]
[AIX] "aligned" attribute should not decrease type alignment returned by __alignof__
`__alignof__(x)` always returns `ABIAlign` if the "x" is marked `__attribute__((aligned()))`. However, the "aligned" attribute should only increase the alignment of a struct, or struct member, unless it's used together with the "packed" attribute, or used as a part of a typedef, in which case, the "aligned" attribute can both increase and decrease alignment.
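The intended semantics can be sketched with GNU-style attributes. This assumes a GCC- or Clang-compatible compiler; the struct names are illustrative and the concrete values may differ by target:

```cpp
struct Plain {
  int X;
};

// Raising alignment above the natural one takes effect.
struct __attribute__((aligned(16))) Raised {
  int X;
};

// Asking for less than the natural alignment on a struct is not
// expected to lower it.
struct __attribute__((aligned(1))) NotLowered {
  int X;
};

// Combined with "packed", the attribute may lower alignment.
struct __attribute__((packed, aligned(2))) Lowered {
  int X;
};
```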
Reviewed By: sfertile
Differential Revision: https://reviews.llvm.org/D107598
Jessica Paquette [Thu, 5 Aug 2021 22:04:33 +0000 (15:04 -0700)]
[AArch64][GlobalISel] Widen G_BSWAP before clamping
This allows us to avoid odd type breakdowns + allows us to legalize types like
s88 in the first place.
Add some testcases for known legal types + testcases for s4 and s88.
Differential Revision: https://reviews.llvm.org/D107607
Stanislav Mekhanoshin [Thu, 5 Aug 2021 18:59:53 +0000 (11:59 -0700)]
[Thumb2] generate checks in ldr-str-imm12.ll. NFC.
It seems this test no longer checks what was stated in the
comment. Just switch to generated checks.
Differential Revision: https://reviews.llvm.org/D107590
Stanislav Mekhanoshin [Thu, 5 Aug 2021 21:28:32 +0000 (14:28 -0700)]
[AMDGPU] add v2i32 and v2f32 insert_vector_elt tests. NFC.
Jessica Paquette [Wed, 4 Aug 2021 20:29:27 +0000 (13:29 -0700)]
[AArch64][GlobalISel] Overhaul G_EXTRACT legalization
This simplifies our existing G_EXTRACT rules and adds some test coverage. Mostly
changing this because it should make it easier to improve legalization for
instructions which use G_EXTRACT as part of the legalization process.
This also adds support for legalizing some weird types. Similar to other recent
legalizer changes, this changes the order of widening/clamping.
There was some dead code in our existing rules (e.g. the p0 case would never get
hit), so this knocks those out and makes the types we want to handle explicit.
This also removes some checks which, nowadays, are handled by the
MachineVerifier.
Differential Revision: https://reviews.llvm.org/D107505
Vitaly Buka [Wed, 4 Aug 2021 08:00:46 +0000 (01:00 -0700)]
[msan] Don't track origins in signal handlers
Origin::CreateHeapOrigin is not async-signal-safe and can deadlock.
Differential Revision: https://reviews.llvm.org/D107431
Nico Weber [Wed, 4 Aug 2021 11:47:36 +0000 (13:47 +0200)]
[lldb] Stop referencing "host_lib" in cmake files
It hasn't had an effect since https://reviews.llvm.org/rG7b968969db.
No behavior change.
Differential Revision: https://reviews.llvm.org/D107446
Nathan Lanza [Thu, 5 Aug 2021 03:08:08 +0000 (23:08 -0400)]
Clean up instcombine stpcpy test
Deduplicate some code and add an additional test to verify that the
sprintf->stpcpy optimization still works on android21 (which properly
supports it).
This follows up 58481663692b55.
Differential Revision: https://reviews.llvm.org/D107526
Nico Weber [Wed, 4 Aug 2021 11:25:26 +0000 (13:25 +0200)]
[lldb] Remove a few unused .exports files
They used to be referenced from the .xcodeproj files, but those are long gone.
No behavior change.
Differential Revision: https://reviews.llvm.org/D107444
Nico Weber [Thu, 5 Aug 2021 20:09:02 +0000 (22:09 +0200)]
[gn build] manually port 4d293f215dfb (LLVMDiff lib)
Michael Kruse [Thu, 5 Aug 2021 19:51:29 +0000 (14:51 -0500)]
[Polly][test] Add tests for IslMaxOperationsGuard.
Add unittests for IslMaxOperationsGuard and the behaviour of the isl-noexception.h wrapper under exceeded max_operations.
Reviewed By: patacca
Differential Revision: https://reviews.llvm.org/D107401
Michael Kruse [Thu, 5 Aug 2021 19:47:14 +0000 (14:47 -0500)]
[Polly][test] Test difference between isl::stat::ok() and isl::stat::error().
The foreach callback wrapper tests check the return values of isl::stat::ok() and isl::stat::error() separately. However, because the container they iterate over contains just one element, they do not actually test the difference between them.
This patch changes the set being iterated over to contain 2 elements, so that returning isl::stat::ok (continue with the next element) and isl::stat::error (break after the current element) have different effects beyond the return value of the foreach itself.
Reviewed By: patacca
Differential Revision: https://reviews.llvm.org/D107395
Matt Morehouse [Thu, 5 Aug 2021 19:26:47 +0000 (12:26 -0700)]
[libFuzzer] Add missing include on Darwin.
Fangrui Song [Thu, 5 Aug 2021 19:17:50 +0000 (12:17 -0700)]
[clang] Implement -falign-loops=N (N is a power of 2) for non-LTO
GCC supports multiple forms of -falign-loops=.
-falign-loops= is currently ignored in Clang.
This patch implements the simplest but the most useful form where N is a
power of 2.
The underlying implementation uses a `llvm::TargetOptions` option for now.
Bitcode generation ignores this option.
Differential Revision: https://reviews.llvm.org/D106701
Bill Wendling [Tue, 3 Aug 2021 19:49:39 +0000 (12:49 -0700)]
[llvm-diff] Create libLLVMDiff library
Some tools may want to use the LLVM "diff" code. Move the code into a
library for easy use.
No functionality change intended.
Differential Revision: https://reviews.llvm.org/D107392
Jon Roelofs [Thu, 5 Aug 2021 18:52:26 +0000 (11:52 -0700)]
[AArch64][GlobalISel] Legalize ctpop s128
This is re-landing the same patch again, but without the changes to
LegalizerHelper that regressed the Mips test:
test/CodeGen/Mips/GlobalISel/llvm-ir/ctpop.ll
Differential revision: https://reviews.llvm.org/D106494
Matt Morehouse [Thu, 5 Aug 2021 17:38:46 +0000 (10:38 -0700)]
Enable extra coverage counters on Windows
- Enable extra coverage counters on Windows.
- Update extra_counters.test to run on Windows also.
- Update TableLookupTest.cpp to include the required pragma/declspec for the extra coverage counters.
Patch By: MichaelSquires
Reviewed By: morehouse
Differential Revision: https://reviews.llvm.org/D106676
Arthur O'Dwyer [Thu, 5 Aug 2021 17:19:05 +0000 (13:19 -0400)]
[libc++] IWYU to fix complaints when compiling with Modules. NFCI.
Differential Revision: https://reviews.llvm.org/D107583
Chris Jackson [Thu, 5 Aug 2021 14:46:09 +0000 (15:46 +0100)]
[DebugInfo][LSR] Don't cache dbg.value that are already undef
The SCEV-based salvaging method caches dbg.value information pre-LSR so
that salvaging may be attempted post-LSR. If the dbg.value are already
undef pre-LSR then a salvage attempt would be fruitless, so avoid
caching them.
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D107448
Matt Morehouse [Thu, 5 Aug 2021 18:07:53 +0000 (11:07 -0700)]
Revert "[llvm-diff] Create libLLVMDiff library"
This reverts commit 9854f2f30f84123ca78aa3603102e7cef4ec33c8 since it
broke all the builds.
Dimitry Andric [Wed, 4 Aug 2021 18:33:48 +0000 (20:33 +0200)]
sanitizer_common: disable thread safety annotations for googletest
Recently in 0da172b1766e thread safety warnings-as-errors were enabled.
However, googletest is currently not compatible with thread safety
annotations. On FreeBSD, which has the pthread functions marked with
such annotations, this results in errors when building the compiler-rt
tests:
In file included from compiler-rt/lib/interception/tests/interception_test_main.cpp:15:
In file included from llvm/utils/unittest/googletest/include/gtest/gtest.h:62:
In file included from llvm/utils/unittest/googletest/include/gtest/internal/gtest-internal.h:40:
llvm/utils/unittest/googletest/include/gtest/internal/gtest-port.h:1636:3: error: mutex 'mutex_' is still held at the end of function [-Werror,-Wthread-safety-analysis]
}
^
llvm/utils/unittest/googletest/include/gtest/internal/gtest-port.h:1633:32: note: mutex acquired here
GTEST_CHECK_POSIX_SUCCESS_(pthread_mutex_lock(&mutex_));
^
llvm/utils/unittest/googletest/include/gtest/internal/gtest-port.h:1645:32: error: releasing mutex 'mutex_' that was not held [-Werror,-Wthread-safety-analysis]
GTEST_CHECK_POSIX_SUCCESS_(pthread_mutex_unlock(&mutex_));
^
2 errors generated.
At some point googletest will hopefully be made compatible with thread
safety annotations, but for now add corresponding `-Wno-thread-*` flags
to `COMPILER_RT_GTEST_CFLAGS` to silence these warnings-as-errors.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D107491
Geoffrey Martin-Noble [Thu, 5 Aug 2021 17:59:40 +0000 (10:59 -0700)]
[Bazel] Update for 9854f2f30f (Diff library)
Updates the Bazel build for
https://github.com/llvm/llvm-project/commit/9854f2f30f by extracting a
library from llvm-diff. Note that this does not include the new
llvm-livepatch binary, for which the CMake file was added accidentally
and reverted in https://github.com/llvm/llvm-project/commit/fec8f1a008.
Differential Revision: https://reviews.llvm.org/D107586
Hedin Garca [Wed, 4 Aug 2021 13:49:41 +0000 (13:49 +0000)]
[libc] Add diff and perf targets for more math functions
Comparing the run time of math functions from LLVM libc
with the MSVCRT libc:
|function   |perf-LLVM libc              |perf-MSVCRT
|ceilf      |2.36 mins (141491389600 ns) |47.10 sec (47100940100 ns)
|exp2f      |6.37 mins (358441794700 ns) |12.39 mins (719404388300 ns)
|expf       |6.35 mins (381204661800 ns) |6.17 mins (346150163200 ns)
|fabsf      |1.18 mins (78425546600 ns)  |53.75 sec (53745301900 ns)
|floorf     |3.15 mins (164770963800 ns) |45.94 sec (45935988400 ns)
|logbf      |4.38 mins (262508058800 ns) |55.47 sec (55466377700 ns)
|nearbyintf |3.20 mins (167972868000 ns) |9.13 mins (523822963600 ns)
|rintf      |3.20 mins (168001498700 ns) |22.35 mins (1341266448800 ns)
|roundf     |2.35 mins (141151500600 ns) |1.42 mins (85326429800 ns)
|truncf     |2.31 mins (114846424000 ns) |59.41 sec (59414309100 ns)
Evaluating the number of differing results in Windows:
|function |diff
|ceilf |8388606 differing results
|exp2f |213303887 differing results
|expf |193922 differing results
|fabsf |8388606 differing results
|floorf |8388606 differing results
|logbf |0 differing results
|nearbyintf |0 differing results
|rintf |0 differing results
|roundf |0 differing results
|truncf |0 differing results
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D107462
Bill Wendling [Thu, 5 Aug 2021 17:50:38 +0000 (10:50 -0700)]
Remove unintended commit.
Jon Chesterfield [Thu, 5 Aug 2021 17:46:57 +0000 (18:46 +0100)]
[clang] Replace asm with __asm__ in cuda header
`asm` is a GNU extension for C, so at present -fopenmp -std=c99
and similar fail to compile on nvptx (bug 51344).
Changing to `__asm__` or `__asm` works for openmp, all three appear to work
for cuda. Suggesting `__asm__` here as `__asm` is used by MSVC with different
syntax, so this should make for better error diagnostics if the header is
passed to a compiler other than clang.
Reviewed By: tra, emankov
Differential Revision: https://reviews.llvm.org/D107492
Roman Lebedev [Thu, 5 Aug 2021 17:35:40 +0000 (20:35 +0300)]
[NFC][X86] combineX86ShuffleChain(): hoist Mask variable higher up
Having `NewMask` outside of an if and rebinding `BaseMask` `ArrayRef`
to it is confusing. Instead, just move the `Mask` vector higher up,
and change the code that earlier had no access to it but now does
to use `Mask` instead of `BaseMask`.
This has no other intentional changes.
This is a recommit of 35c0848b570214ed2b2d96cca4dd62bb7ae725cd,
that was reverted to simplify reversion of an earlier change.
Roman Lebedev [Thu, 5 Aug 2021 17:30:22 +0000 (20:30 +0300)]
[NFC][Codegen][X86] Add testcase that hanged after D107009
From Benjamin Kramer @ https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20210802/945642.html
Bill Wendling [Tue, 3 Aug 2021 19:49:39 +0000 (12:49 -0700)]
[llvm-diff] Create libLLVMDiff library
Some tools may want to use the LLVM "diff" code. Move the code into a
library for easy use.
No functionality change intended.
Differential Revision: https://reviews.llvm.org/D107392
Fangrui Song [Thu, 5 Aug 2021 17:32:14 +0000 (10:32 -0700)]
[ELF] Support copy relocation on non-default version symbols
Copy relocation on a non-default version symbol is unsupported and can crash at
runtime. Fortunately there is a one-line fix which works for most cases:
ensure `getSymbolsAt` unconditionally returns `ss`.
If two non-default version symbols are defined at the same place and both
are copy relocated, our implementation will copy relocate them into different
addresses. The pointer inequality is very unlikely to be an issue. In GNU ld,
copy relocating version aliases seems to create more pointer inequality
problems than in our implementation.
(In glibc, sys_errlist@GLIBC_2.2.5 sys_errlist@GLIBC_2.3 sys_errlist@GLIBC_2.4
are defined at the same place, but it is unlikely they are all copy relocated in
one executable. Even if so, the variables are read-only and pointer inequality
should not be a problem.)
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D107535
Jonas Devlieghere [Thu, 5 Aug 2021 16:35:37 +0000 (09:35 -0700)]
[lldb] Refactor IRExecutionUnit::FindInSymbols (NFC)
This patch refactors IRExecutionUnit::FindInSymbols. It eliminates a few
potential pitfalls and tries to be more explicit about the state carried
between symbol resolution attempts.
Differential revision: https://reviews.llvm.org/D107206
Jonas Devlieghere [Thu, 5 Aug 2021 16:27:19 +0000 (09:27 -0700)]
[lldb] Use a struct to pass function search options to Module::FindFunction
Rather than passing two booleans around, which is especially error prone
with them being next to each other, use a struct with named fields
instead.
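A minimal sketch of the idea, assuming hypothetical names (`FunctionSearchOptions`, `include_symbols`, `include_inlines`, `searchMask`, `inlinesOnly`) that are for illustration only and are not claimed to match the LLDB API.

```cpp
// Hypothetical options struct replacing two adjacent bool parameters.
struct FunctionSearchOptions {
  bool include_symbols = false;
  bool include_inlines = false;
};

// Hypothetical consumer: encodes the options as a bit mask so the effect
// of each named field is visible (bit 0: symbols, bit 1: inlines).
constexpr int searchMask(const FunctionSearchOptions &options) {
  return (options.include_symbols ? 1 : 0) | (options.include_inlines ? 2 : 0);
}

// Call sites name the field they set, so two flags cannot be swapped
// silently the way positional `f(true, false)` arguments can.
constexpr FunctionSearchOptions inlinesOnly() {
  FunctionSearchOptions options;
  options.include_inlines = true;
  return options;
}
```

With two positional booleans, `f(false, true)` and `f(true, false)` both compile and differ only in argument order. With named fields the intent is stated at the call site.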
Differential revision: https://reviews.llvm.org/D107295
Dan Liew [Thu, 5 Aug 2021 02:24:56 +0000 (19:24 -0700)]
Fix COMPILER_RT_DEBUG build for targets that don't support thread local storage.
022439931f5be77efaf80b44d587666b0c9b13b5 added code that is only enabled
when COMPILER_RT_DEBUG is enabled. This code doesn't build on targets
that don't support thread local storage because the code added uses the
THREADLOCAL macro. Consequently the COMPILER_RT_DEBUG build broke for
some Apple targets (e.g. 32-bit iOS simulators).
```
/Volumes/user_data/dev/llvm/llvm.org/main/src/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:216:8: error: thread-local storage is not supported for the current target
static THREADLOCAL InternalDeadlockDetector deadlock_detector;
^
/Volumes/user_data/dev/llvm/llvm.org/main/src/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h:227:24: note: expanded from macro 'THREADLOCAL'
# define THREADLOCAL __thread
^
1 error generated.
```
To fix this, this patch introduces a `SANITIZER_SUPPORTS_THREADLOCAL`
macro that is `1` iff thread local storage is supported by the current
target. That condition is then added to `SANITIZER_CHECK_DEADLOCKS` to
ensure the code is only enabled when thread local storage is available.
The implementation of `SANITIZER_SUPPORTS_THREADLOCAL` currently assumes
Clang. See `llvm-project/clang/include/clang/Basic/Features.def` for the
definition of the `tls` feature.
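A rough sketch of how such a feature-test macro could be structured. The `DEMO_*` macros and the functions below are hypothetical and are not the compiler-rt definitions; the sketch assumes the `tls` feature is queried through `__has_feature`, and treats compilers without `__has_feature` as not supporting the query.

```cpp
// Fallback for compilers that do not provide __has_feature
// (an assumption of this sketch).
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// 1 if the compiler reports thread-local storage support, else 0.
#if __has_feature(tls)
#define DEMO_SUPPORTS_THREADLOCAL 1
#else
#define DEMO_SUPPORTS_THREADLOCAL 0
#endif

// Stand-in for a debug-build switch.
#define DEMO_DEBUG 1

// The deadlock checks need a thread-local variable, so they are only
// enabled when both conditions hold.
#define DEMO_CHECK_DEADLOCKS (DEMO_DEBUG && DEMO_SUPPORTS_THREADLOCAL)

#if DEMO_CHECK_DEADLOCKS
// Only compiled where thread-local storage is available.
inline int &lockDepth() {
  static __thread int depth = 0;
  return depth;
}
#endif

constexpr bool deadlockChecksEnabled() { return DEMO_CHECK_DEADLOCKS != 0; }
```

On a target without thread-local storage the guarded function is never compiled, so the `__thread` declaration cannot trigger the error shown above.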
rdar://81543007
Differential Revision: https://reviews.llvm.org/D107524
Ramesh Peri [Thu, 5 Aug 2021 17:04:28 +0000 (10:04 -0700)]
[llvm-ar] Fix for handling thin archive with SYM64 and a test case for it
When thin archives are created with a symbol table of type SYM64, none of the tools work, since they cannot read the files properly.
One can reproduce the problem as follows:
1. Take a hello world program and create an archive out of it. The SYM64_THRESHOLD=0 will force the generation of SYM64 symbol table.
clang -c hello.cpp
SYM64_THRESHOLD=0 llvm-ar crsT mylib.a hello.o
2. Now try to use any of the tools on this mylib.a and it will fail.
llvm-nm -M mylib.a
This fix eliminates these failures. A regression test is added in llvm/test/Object/archive-symtab.test
Reviewed By: MaskRay, Ramesh
Differential Revision: https://reviews.llvm.org/D107322
Jon Roelofs [Thu, 5 Aug 2021 16:35:02 +0000 (09:35 -0700)]
Benjamin Kramer [Thu, 5 Aug 2021 16:53:00 +0000 (18:53 +0200)]
Revert "[X86] combineX86ShuffleChain(): canonicalize mask elts picking from splats"
This reverts commits f819e4c7d0f6efef3cc1042cc45582320bf6c0a2 and
35c0848b570214ed2b2d96cca4dd62bb7ae725cd. It triggers an infinite loop during
compilation.
$ cat t.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define void @MaxPoolGradGrad_1.65() local_unnamed_addr #0 {
entry:
%wide.vec78 = load <64 x i32>, <64 x i32>* null, align 16
%strided.vec83 = shufflevector <64 x i32> %wide.vec78, <64 x i32> poison, <8 x i32> <i32 4, i32 12, i32 20, i32 28, i32 36, i32 44, i32 52, i32 60>
%0 = lshr <8 x i32> %strided.vec83, <i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16>
%1 = add <8 x i32> zeroinitializer, %0
%2 = shufflevector <8 x i32> %1, <8 x i32> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
%3 = shufflevector <16 x i32> %2, <16 x i32> undef, <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
%interleaved.vec = shufflevector <32 x i32> undef, <32 x i32> %3, <64 x i32> <i32 0, i32 8, i32 16, i32 24, i32 32, i32 40, i32 48, i32 56, i32 1, i32 9, i32 17, i32 25, i32 33, i32 41, i32 49, i32 57, i32 2, i32 10, i32 18, i32 26, i32 34, i32 42, i32 50, i32 58, i32 3, i32 11, i32 19, i32 27, i32 35, i32 43, i32 51, i32 59, i32 4, i32 12, i32 20, i32 28, i32 36, i32 44, i32 52, i32 60, i32 5, i32 13, i32 21, i32 29, i32 37, i32 45, i32 53, i32 61, i32 6, i32 14, i32 22, i32 30, i32 38, i32 46, i32 54, i32 62, i32 7, i32 15, i32 23, i32 31, i32 39, i32 47, i32 55, i32 63>
store <64 x i32> %interleaved.vec, <64 x i32>* undef, align 16
unreachable
}
$ llc < t.ll -mcpu=skylake
<hang>
Jessica Paquette [Wed, 4 Aug 2021 23:33:40 +0000 (16:33 -0700)]
[AArch64][GlobalISel] Mark v16s8 <- v8s8, v8s8 G_CONCAT_VECTORS as legal
G_CONCAT_VECTORS shows up from time to time when legalizing other instructions.
We actually import patterns for the v16s8 <- v8s8, v8s8 case so marking it
as legal gives us selection for free.
Differential Revision: https://reviews.llvm.org/D107512
Daniele Vettorel [Thu, 5 Aug 2021 16:32:36 +0000 (12:32 -0400)]
Add llvm-stress binary to Bazel build configuration.
The `llvm-stress` binary is currently missing from the Bazel `BUILD` file for llvm. This patch adds it.
Reviewed By: GMNGeoffrey
Differential Revision: https://reviews.llvm.org/D107571
Florian Hahn [Thu, 5 Aug 2021 09:33:29 +0000 (10:33 +0100)]
[SLP] Add additional memory version tests.