Nikolas Klauser [Sat, 4 Feb 2023 23:21:11 +0000 (00:21 +0100)]
[libc++] Move constexpr <cstring> functions into their own headers and remove unused <cstring> includes
Reviewed By: ldionne, Mordante, #libc, #libc_abi
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D143329
Joseph Huber [Mon, 20 Feb 2023 20:30:17 +0000 (14:30 -0600)]
[OpenMP] Update the bug report link for `libomp` assertion failures
Currently we still print the old https://bugs.llvm.org/ bugzilla link.
We should update this to the issues pane for the LLVM github.
Reviewed By: tlwilmar
Differential Revision: https://reviews.llvm.org/D144426
Nikita Popov [Tue, 21 Feb 2023 15:38:42 +0000 (16:38 +0100)]
[InstCombine] Add additional alloca comparison tests (NFC)
Kiran Chandramohan [Tue, 21 Feb 2023 15:16:16 +0000 (15:16 +0000)]
[Flang][OpenMP] Use the ultimate symbol for allocatable check
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D144383
Mariya Podchishchaeva [Tue, 21 Feb 2023 15:07:47 +0000 (10:07 -0500)]
[NFC][clang][Modules] Refine test checks by adding `:`
The test can fail if the working directory where the test was launched
has an `error` substring in its path.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D143734
David Green [Tue, 21 Feb 2023 14:42:53 +0000 (14:42 +0000)]
[AArch64] Add a test for loading into a zerovector. NFC
Nicolas Vasilache [Tue, 21 Feb 2023 14:09:21 +0000 (06:09 -0800)]
[mlir][Linalg][Transform] NFC - Replace assert by more graceful error
Kavitha Natarajan [Tue, 24 Jan 2023 09:06:22 +0000 (14:36 +0530)]
[flang][OpenMP] Add parser support for order clause
Added parser support for OpenMP 5.0 & 5.1 feature
ORDER([order-modifier :]concurrent) clause for all
applicable and supported OpenMP directives.
Reviewed By: kiranchandramohan, abidmalikwaterloo
Differential Revision: https://reviews.llvm.org/D142524
Jessica Del [Mon, 20 Feb 2023 15:51:15 +0000 (16:51 +0100)]
[AMDGPU] MIR-Tests for Multiplication using KBA
These tests show inefficient behavior that will be optimized by a
later change.
By using Known Bits Analysis, we can avoid unnecessary multiplications
or additions with 0.
DianQK [Tue, 21 Feb 2023 12:25:59 +0000 (20:25 +0800)]
[SimplifyCFG] Check if the return instruction causes undefined behavior
This should fix https://github.com/rust-lang/rust/issues/107681.
Returning undef for a noundef return value is undefined behavior.
Example:
```
define noundef i32 @test_ret_noundef(i1 %cond) {
entry:
br i1 %cond, label %bb1, label %bb2
bb1:
br label %bb2
bb2:
%r = phi i32 [ undef, %entry ], [ 1, %bb1 ]
ret i32 %r
}
```
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144319
Nikita Popov [Tue, 21 Feb 2023 10:01:18 +0000 (11:01 +0100)]
[IPSCCP] Remove noundef when zapping return values
When replacing return values with undef, we should also drop the
noundef attribute (and other UB implying attributes).
Differential Revision: https://reviews.llvm.org/D144461
Ammarguellat [Fri, 17 Feb 2023 21:52:19 +0000 (16:52 -0500)]
[NFC] Added a lit test for the arithmetic fence builtin.
Joseph Huber [Mon, 20 Feb 2023 23:03:28 +0000 (17:03 -0600)]
[libomptarget] Remove unused image from global data movement function
This interface function does not actually need the device image type.
It's unused in the function, so it should be able to be safely removed.
The motivation for this is to facilitate downstream porting of the
amd-stg-open RPC module into the nextgen plugin so we can delete the old
plugin entirely. For that to work we need to be able to call this
function at kernel-launch time, which doesn't have the image. Also it's
cleaner.
Reviewed By: jplehr
Differential Revision: https://reviews.llvm.org/D144436
Markus Böck [Tue, 21 Feb 2023 10:05:27 +0000 (11:05 +0100)]
[mlir][OpenACCToLLVM] Add pass option to emit opaque pointers
Part of https://discourse.llvm.org/t/rfc-switching-the-llvm-dialect-and-dialect-lowerings-to-opaque-pointers/68179
This patch simply adds the pass option use-opaque-pointers to instruct the pass to use opaque-pointers instead of typed pointers during conversion.
The pass itself does not actually use any pointers, so it required no changes beyond setting the option in the type converter. The tests have also been converted to use opaque pointers.
Differential Revision: https://reviews.llvm.org/D144462
Florian Hahn [Tue, 21 Feb 2023 13:01:10 +0000 (13:01 +0000)]
[GlobalOpt] Add test with large number of stores with non-null loads.
Kiran Chandramohan [Tue, 21 Feb 2023 12:34:03 +0000 (12:34 +0000)]
[Flang][OpenMP] Add convert to match the argument and result of update Op
Fixes #60873
Reviewed By: peixin
Differential Revision: https://reviews.llvm.org/D144432
Kiran Chandramohan [Tue, 21 Feb 2023 12:26:38 +0000 (12:26 +0000)]
[Flang][WWW] Add link to doxygen page
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D144433
Valentin Clement [Tue, 21 Feb 2023 12:23:44 +0000 (13:23 +0100)]
[flang] Add TODO when trying to do a polymorphic temp in getTempExtAddr
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D144459
Nikolas Klauser [Mon, 20 Feb 2023 15:47:34 +0000 (16:47 +0100)]
[runtimes] Remove add_target_flags* functions and use add_flags* instead
Reviewed By: phosek, #libunwind, #libc, #libc_abi
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D144398
Nikolas Klauser [Mon, 20 Feb 2023 15:26:19 +0000 (16:26 +0100)]
[runtimes] Move common functions from Handle{Libcxx,Libcxxabi,Libunwind}Flags.cmake to runtimes/cmake/Modules/HandleFlags.cmake
Reviewed By: phosek, #libunwind, #libc, #libc_abi
Spies: arichardson, libcxx-commits
Differential Revision: https://reviews.llvm.org/D144395
Manolis Tsamis [Tue, 21 Feb 2023 11:21:35 +0000 (12:21 +0100)]
[RISCV] Add vendor-defined XTheadMemPair (two-GPR Memory Operations) extension
The vendor-defined XTHeadMemPair (no comparable standard extension exists
at the time of writing) extension adds two-GPR load/store pair instructions.
It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.
The current (as of this commit) public documentation for this
extension is available at:
https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf
Support for these instructions has already landed in GNU Binutils:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=6e17ae625570ff8f3c12c8765b8d45d4db8694bd
Depends on D143847
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D144002
Ting Wang [Tue, 21 Feb 2023 11:14:47 +0000 (06:14 -0500)]
[PowerPC][NFC] refactor eligible check for tail call optimization
The check logic for TCO is scattered across two functions,
IsEligibleForTailCallOptimization_64SVR4() and IsEligibleForTailCallOptimization(),
and currently serves only the instruction selection phase.
This patch refactors the existing logic to export an API for TCO
eligibility queries before the instruction selection phase.
Reviewed By: shchenz, nemanjai
Differential Revision: https://reviews.llvm.org/D141673
Florian Hahn [Tue, 21 Feb 2023 11:04:09 +0000 (11:04 +0000)]
[GlobalOpt] Add tests for missed CleanupPointerRootUsers opportunity.
Owen Pan [Mon, 20 Feb 2023 00:05:55 +0000 (16:05 -0800)]
[clang-format][NFC] Clean up nullptr comparison style
For example, use 'Next' instead of 'Next != nullptr',
and '!Next' instead of 'Next == nullptr'.
Differential Revision: https://reviews.llvm.org/D144355
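As a minimal sketch of the style described in the commit above (the `Token` type and both helper functions are hypothetical and not taken from clang-format), pointer tests rely on the implicit conversion to bool instead of an explicit comparison against nullptr:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical token type, used only for illustration.
struct Token {
  Token *Next = nullptr;
};

// Counts the tokens in a chain, testing the pointer directly
// rather than writing `Tok != nullptr`.
std::size_t countTokens(const Token *Tok) {
  std::size_t Count = 0;
  while (Tok) {
    ++Count;
    Tok = Tok->Next;
  }
  return Count;
}

// Returns true if Tok is the last token in its chain,
// written as `!Tok.Next` rather than `Tok.Next == nullptr`.
bool isLast(const Token &Tok) {
  return !Tok.Next;
}
```

Both spellings behave identically; the change is purely one of style consistency.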
Taymon A. Beal [Tue, 21 Feb 2023 10:47:36 +0000 (02:47 -0800)]
[clang-format] Fix handling of TypeScript tuples with optional last member
These were previously incorrectly treated as syntax errors.
Differential Revision: https://reviews.llvm.org/D144317
Florian Hahn [Tue, 21 Feb 2023 10:24:50 +0000 (10:24 +0000)]
[GlobalOpt] Add test with many stores of the initializer only.
LLVM GN Syncbot [Tue, 21 Feb 2023 10:00:16 +0000 (10:00 +0000)]
[gn build] Port 8e68c1204580
Nikita Popov [Tue, 21 Feb 2023 09:53:23 +0000 (10:53 +0100)]
[IPSCCP] Add tests for noundef attribute on zapped returns (NFC)
We replace the return value with undef without dropping the
noundef attribute.
Luke Lau [Mon, 20 Feb 2023 13:26:17 +0000 (13:26 +0000)]
[RISCV] Use a smaller VL when interleaving fixed vectors
Interleaves generated with vwaddu.vv and vwmaccu.vx were using VLs that
were twice the number of elements actually needed in the vector.
This also pulls the interleaving logic out into its own function so it
can be reused by later patches, and adapts it for scalable vectors.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D144386
Luke Lau [Mon, 20 Feb 2023 13:32:50 +0000 (13:32 +0000)]
[RISCV][NFC] Make a note of the operands for RISCVISD::VNSRL_VL
Split out from https://reviews.llvm.org/D144092
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D144387
pvanhout [Fri, 10 Feb 2023 08:32:51 +0000 (09:32 +0100)]
[AMDGPU] Remove function with incompatible features
Adds a new pass that removes functions
if they use features that are not supported on the current GPU.
This change is aimed at preventing crashes when building code at O0 that
uses idioms such as `if (ISA_VERSION >= N) intrinsic_a(); else intrinsic_b();`
where ISA_VERSION is not constexpr, and intrinsic_a is not selectable
on older targets.
This is a pattern that's used all over the ROCm device libs. The main
motive behind this change is to allow code using ROCm device libs
to be built at O0.
Note: the feature checking logic is done ad-hoc in the pass. There is no other
pass that needs (or will need in the foreseeable future) to do similar
feature-checking logic so I did not see a need to generalize the feature
checking logic yet. It can (and should probably) be generalized later and
moved to a TargetInfo-like class or helper file.
Reviewed By: arsenm, Joe_Nash
Differential Revision: https://reviews.llvm.org/D139000
Valentin Clement [Tue, 21 Feb 2023 09:14:00 +0000 (10:14 +0100)]
[flang] Use runtime Assign when rhs is polymorphic
Use the runtime when the lhs or rhs is polymorphic. The runtime
deals better with polymorphic entities and aliasing.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D144418
Valentin Clement [Tue, 21 Feb 2023 09:12:07 +0000 (10:12 +0100)]
[flang] Accept polymorphic scalar in elemental intrinsic lowering
When lowering an elemental intrinsic like MERGE, a scalar
polymorphic entity was not recognized as a scalar. Update the check
so polymorphic entities can be used.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D144417
Ingo Müller [Fri, 17 Feb 2023 11:51:13 +0000 (11:51 +0000)]
[mlir][llvm] Add isVarArg flag to lookupOrCreateFn.
The function is a helper for looking up a function by name or creating
it if it doesn't exist. The arguments allow specifying the signature of
the function, if it needs to be created, but do not expose the varArg
parameter of LLVMFunctionType. This patch adds an optional
parameter with a default value of false such that existing usages
continue to work as before.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D144256
Matthias Springer [Tue, 21 Feb 2023 07:57:16 +0000 (08:57 +0100)]
[mlir][transform] Add transform.get_defining_op op
This op is the inverse of `transform.get_result`.
Differential Revision: https://reviews.llvm.org/D144409
Nikita Popov [Tue, 21 Feb 2023 07:53:22 +0000 (08:53 +0100)]
[InstSimplify][CaptureTracking] Reduce scope of special case
As described in PR54002, the "icmp of load from global" special
case in CaptureTracking is incorrect, as is the InstSimplify code
that makes use of it.
This moves the special case from CaptureTracking into InstSimplify
to limit its scope to the buggy transform only, and not affect
other code using CaptureTracking as well.
Kazu Hirata [Tue, 21 Feb 2023 08:01:43 +0000 (00:01 -0800)]
[X86] Precommit a test
This patch precommits a test for:
https://github.com/llvm/llvm-project/issues/60374
Jessica Del [Mon, 20 Feb 2023 15:51:15 +0000 (16:51 +0100)]
[AMDGPU] MIR-Tests for Multiplication using KBA
These tests show inefficient behavior that will be optimized by a
later change.
By using Known Bits Analysis, we can avoid unnecessary multiplications
or additions with 0.
Alexis Murzeau [Tue, 21 Feb 2023 07:37:00 +0000 (07:37 +0000)]
[clang-tidy] update docs for new hungarian identifier-naming types (unsigned char and void)
Since 37e6a4f9496c8e35efc654d7619a79d6dbb72f99, `void` and
`unsigned char` were added to primitive types for hungarian notation.
This commit adds them to the documentation.
Reviewed By: carlosgalvezp
Differential Revision: https://reviews.llvm.org/D144422
Serge Pavlov [Tue, 21 Feb 2023 06:44:24 +0000 (13:44 +0700)]
[ADT] Alternative way to declare enum type as bitmask
If an enumeration represents a set of bit flags, using the macro
LLVM_MARK_AS_BITMASK_ENUM can make operations with such enumeration more
convenient. It however brings problems if the enumeration is non-scoped.
As the macro adds an item LLVM_BITMASK_LARGEST_ENUMERATOR to the
enumeration type, only one such type may be declared as bitmask. This
problem could be solved by conversion of the enumeration to scoped, but
it requires static_casts in new places and the convenience can be
eliminated.
This change introduces a new macro LLVM_DECLARE_ENUM_AS_BITMASK, which
allows non-invasive conversion of an enumeration into bitmask. It
provides specialization to trait classes, which previously were built
based on presence of LLVM_BITMASK_LARGEST_ENUMERATOR in the enumeration.
The macro must be specified in global or llvm namespace because the
trait classes are declared in llvm namespace.
Differential Revision: https://reviews.llvm.org/D144202
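The trait-based approach described in the commit above can be sketched as follows. This is an illustration of the general technique only, not LLVM's implementation: `IsBitmaskEnum` and `DECLARE_ENUM_AS_BITMASK` are hypothetical names, and the hypothetical macro is simplified relative to whatever parameters the real LLVM_DECLARE_ENUM_AS_BITMASK takes. The point it shows is that two non-scoped enumerations in the same scope can both be marked as bitmasks, because nothing is added to either enumerator list.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical trait: false by default, specialized per enumeration.
template <typename E> struct IsBitmaskEnum : std::false_type {};

// Hypothetical macro that marks an enumeration as a bitmask from outside
// the enumeration. It must be used in the scope where the trait is declared.
#define DECLARE_ENUM_AS_BITMASK(Enum)                                          \
  template <> struct IsBitmaskEnum<Enum> : std::true_type {}

// Bitwise OR, enabled only for enumerations marked via the trait.
template <typename E, typename = std::enable_if_t<IsBitmaskEnum<E>::value>>
constexpr E operator|(E LHS, E RHS) {
  using U = std::underlying_type_t<E>;
  return static_cast<E>(static_cast<U>(LHS) | static_cast<U>(RHS));
}

// Bitwise AND, enabled only for enumerations marked via the trait.
template <typename E, typename = std::enable_if_t<IsBitmaskEnum<E>::value>>
constexpr E operator&(E LHS, E RHS) {
  using U = std::underlying_type_t<E>;
  return static_cast<E>(static_cast<U>(LHS) & static_cast<U>(RHS));
}

// Two non-scoped enumerations in the same scope. Neither gains an extra
// enumerator, so their names cannot clash.
enum Permissions : unsigned { Read = 1, Write = 2, Exec = 4 };
enum Colors : unsigned { Red = 1, Green = 2, Blue = 4 };
DECLARE_ENUM_AS_BITMASK(Permissions);
DECLARE_ENUM_AS_BITMASK(Colors);
```

With these declarations, `Read | Write` has type `Permissions` rather than being promoted to an integer, so no static_cast is needed at the use site.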
Konstantina Mitropoulou [Sat, 18 Feb 2023 00:42:35 +0000 (16:42 -0800)]
[AMDGPU] Add tests for future commit
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D144312
Uday Bondhugula [Tue, 21 Feb 2023 04:45:57 +0000 (10:15 +0530)]
[MLIR][Affine] Fix affine.parallel op domain add
Fix obvious bug in `addAffineParallelOpDomain` that would lead to
incorrect domain constraints for any affine.parallel op with
dimensionality greater than one.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D144349
Uday Bondhugula [Tue, 21 Feb 2023 04:40:15 +0000 (10:10 +0530)]
[MLIR] Add replaceUsesWithIf on Operation
Add replaceUsesWithIf on Operation along the lines of
Value::replaceUsesWithIf. This had been missing on Operation and is
convenient to replace multi-result operations' results conditionally.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D144348
eopXD [Tue, 21 Feb 2023 04:15:56 +0000 (20:15 -0800)]
[TypeSize][NFC] Fix type-o
Signed-off-by: eop Chen <eop.chen@sifive.com>
esmeyi [Tue, 21 Feb 2023 03:25:49 +0000 (22:25 -0500)]
[AIX] Lower some memory intrinsics to millicode functions on AIX
Summary: Currently we lower MEMCPY/MEMMOVE/MEMSET/BZERO to the corresponding libc functions, which in turn call the millicode functions on AIX. We can lower these intrinsics directly to the millicode functions to save one call layer.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D143997
Wang, Xin10 [Tue, 21 Feb 2023 02:16:47 +0000 (10:16 +0800)]
[X86][NFC] Reorganize X86InstrInfo.td
X86InstrInfo.td currently contains many instruction and pattern
definitions that I think should not live there; extract them
and move them to other files.
This makes it clearer that X86InstrInfo.td only defines some
X86-specific properties and does not include detailed instruction
definitions.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D144244
Kazu Hirata [Tue, 21 Feb 2023 01:00:03 +0000 (17:00 -0800)]
[X86] Precommit a test
This is for:
https://github.com/llvm/llvm-project/issues/60854
Kazu Hirata [Tue, 21 Feb 2023 00:38:21 +0000 (16:38 -0800)]
[X86] Improve (select carry C1+1 C1)
Without this patch:
return X < 4 ? 3 : 2;
return X < 9 ? 7 : 6;
are compiled as:
31 c0 xor %eax,%eax
83 ff 04 cmp $0x4,%edi
0f 93 c0 setae %al
83 f0 03 xor $0x3,%eax
31 c0 xor %eax,%eax
83 ff 09 cmp $0x9,%edi
0f 92 c0 setb %al
83 c8 06 or $0x6,%eax
respectively. With this patch, we generate:
31 c0 xor %eax,%eax
83 ff 04 cmp $0x4,%edi
83 d0 02 adc $0x2,%eax
31 c0 xor %eax,%eax
83 ff 09 cmp $0x9,%edi
83 d0 06 adc $0x6,%eax
respectively, saving 3 bytes while reducing the tree height.
This patch recognizes the equivalence of OR and ADD
(if bits do not overlap) and delegates to combineAddOrSubToADCOrSBB
for further processing. The same applies to the equivalence of XOR
and SUB.
Differential Revision: https://reviews.llvm.org/D143838
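The arithmetic behind the commit above can be checked with a small sketch. The function names are hypothetical and this models only the source-level identity, not the backend code: an unsigned `X < 4` is exactly the carry flag after `cmp $0x4`, so selecting between C1+1 and C1 is the same as adding the carry to C1, and OR behaves like ADD whenever the operands share no set bits.

```cpp
#include <cassert>
#include <cstdint>

// Source form: a select between two adjacent constants.
constexpr uint32_t selectForm(uint32_t X) { return X < 4 ? 3 : 2; }

// Carry form: the unsigned compare yields 0 or 1 (the carry flag after
// comparing X with 4), which is added to the smaller constant as `adc` does.
constexpr uint32_t carryForm(uint32_t X) {
  uint32_t Carry = X < 4 ? 1u : 0u;
  return 2 + Carry;
}

// True when A and B have no set bits in common, in which case
// (A | B) == (A + B) because no bit position produces a carry.
constexpr bool haveNoCommonBits(uint32_t A, uint32_t B) {
  return (A & B) == 0;
}
```

For the second example in the message, the constant 6 has bit 0 clear, so `6 | carry` and `6 + carry` agree, which is what allows the OR to be treated as an ADD.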
Luo, Yuanke [Mon, 20 Feb 2023 12:00:37 +0000 (20:00 +0800)]
[X86] Add test case that clobbers base pointer register.
Simon Pilgrim [Mon, 20 Feb 2023 23:31:49 +0000 (23:31 +0000)]
[SLP][X86] minimum-sizes.ll - add AVX512 test coverage
As noticed on D144128, we need better AVX512 coverage for GEP vectorization
Alexandre Ganea [Mon, 20 Feb 2023 22:16:21 +0000 (17:16 -0500)]
[Support] Silence warning with Clang ToT.
This fixes the following warning on Windows with latest Clang:
```
[160/3057] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Signals.cpp.obj
In file included from C:/git/llvm-project/llvm/lib/Support/Signals.cpp:260:
C:/git/llvm-project/llvm/lib/Support/Windows/Signals.inc(834,15): warning: comparison of integers of different signs: 'int' and 'unsigned int' [-Wsign-compare]
if (RetCode == (0xE0000000 | EX_IOERR))
~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~
1 warning generated.
```
Simon Pilgrim [Mon, 20 Feb 2023 23:21:27 +0000 (23:21 +0000)]
[SLP][X86] load-merge.ll - add AVX512 test coverage
As noticed on D144128, we need better AVX512 coverage for GEP vectorization
Brad Smith [Mon, 20 Feb 2023 22:57:15 +0000 (17:57 -0500)]
[PowerPC] Correctly use ELFv2 ABI on all OS's that use the ELFv2 ABI
Add a member function isPPC64ELFv2ABI() to determine what ABI is used on the
64-bit PowerPC big endian operating environment.
Reviewed By: nemanjai, dim, pkubaj
Differential Revision: https://reviews.llvm.org/D144321
Peter Cooper [Mon, 13 Feb 2023 23:38:01 +0000 (15:38 -0800)]
Use modern @got syntax in tsan assembly, instead of old style non_lazy_pointers. NFC
Reviewed By: kubamracek, yln, wrotki, dvyukov
Differential Revision: https://reviews.llvm.org/D143959
Luke Lau [Fri, 10 Feb 2023 01:17:14 +0000 (01:17 +0000)]
[RISCV][NFC] Add test for different LMULs in vectorizer
This is a test for an upcoming patch that proposes to change the default LMUL used by the loop vectorizer from 1 to 2
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D143722
Joseph Huber [Mon, 20 Feb 2023 21:42:28 +0000 (15:42 -0600)]
[libc] Fix GPU include directories not being set properly
Summary:
For some reason, this variable was set after where it was used, causing
weird behaviour with including the standard headers. Fix it.
Sanjay Patel [Mon, 20 Feb 2023 19:39:51 +0000 (14:39 -0500)]
[InstCombine] relax constraint on udiv fold
The pair of div folds was just added with 4966d8ebe1bbe5bd6a4d28.
But as noted in the post-commit review, we don't actually need
the no-remainder requirement for an unsigned division (still
need the no-unsigned-wrap though):
https://alive2.llvm.org/ce/z/qHjK3Q
Felipe de Azevedo Piovezan [Fri, 10 Feb 2023 20:21:46 +0000 (15:21 -0500)]
[debug-info][codegen] Prevent creation of self-referential SP node
The function `CGDebugInfo::EmitFunctionDecl` is supposed to create a
declaration -- never a _definition_ -- of a subprogram. This is made
evident by the fact that the SPFlags never have the "Declaration" bit
set by that function.
However, when `EmitFunctionDecl` calls `DIBuilder::createFunction`, it
still tries to fill the "Declaration" argument by passing it the result
of `getFunctionDeclaration(D)`. This will query an internal cache of
previously created declarations and, for most code paths, we return
nullptr; all is good.
However, as reported in [0], there are pathological cases in which we
attempt to recreate a declaration, so the cache query succeeds,
resulting in a subprogram declaration whose declaration field points to
another declaration. Through a series of RAUWs, the declaration field
ends up pointing to the SP itself. Self-referential MDNodes can't be
`unique`, which causes the verifier to fail (declarations must be
`unique`).
We can argue that the caller should check the cache first, but this is
not a correctness issue (declarations are `unique` anyway). The bug is
that `CGDebugInfo::EmitFunctionDecl` should always pass `nullptr` to the
declaration argument of `DIBuilder::createFunction`, expressing the fact
that declarations don't point to other declarations. AFAICT this is not
something for which any reasonable meaning exists.
This seems a lot like a copy-paste mistake that has survived for ~10
years, since other places in this file have the exact same call almost
token-by-token.
I've tested this by compiling LLVMSupport with and without the patch, O2
and O0, and comparing the dwarfdump of the lib. The dumps are identical
modulo the attributes decl_file/producer/comp_dir.
[0]: https://github.com/llvm/llvm-project/issues/59241
Differential Revision: https://reviews.llvm.org/D143921
Alexis Murzeau [Mon, 20 Feb 2023 19:18:36 +0000 (19:18 +0000)]
[clang-tidy] add primitive types for hungarian identifier-naming (unsigned char and void)
Add `unsigned char` and `void` types to recognized PrimitiveTypes.
Fixes: #60670
Depends on D144037
Reviewed By: carlosgalvezp
Differential Revision: https://reviews.llvm.org/D144041
Alexis Murzeau [Mon, 20 Feb 2023 19:03:10 +0000 (19:03 +0000)]
[clang-tidy] allow tests to use --config-file instead of --config
The previous way to test hungarian notation doesn't check CHECK-FIXES.
This will allow readability-identifier-naming tests of Hungarian
notation to keep the use of an external .clang-tidy file (not embedded
within the .cpp test file) and properly check CHECK-FIXES.
Also, it turns out the hungarian notation tests use the wrong
.clang-tidy file, so fix that too to make these tests ok.
This is a part of a fix for issue https://github.com/llvm/llvm-project/issues/60670.
Reviewed By: carlosgalvezp
Differential Revision: https://reviews.llvm.org/D144037
Sanjay Patel [Mon, 20 Feb 2023 19:04:03 +0000 (14:04 -0500)]
[InstCombine] auto-generate test CHECK lines; NFC
The check line was not enabled until bfb1559fbe2fb656f3,
and then it was excessive, so the test started failing.
Sanjay Patel [Mon, 20 Feb 2023 18:42:04 +0000 (13:42 -0500)]
[InstCombine] distribute div over add with matching mul-by-constant
((X * C2) + C1) / C2 --> X + C1/C2
https://alive2.llvm.org/ce/z/P66io8
https://alive2.llvm.org/ce/z/vghegw
This could be made more general -- the multiplier could be a
multiple of the divisor -- but this is the pattern from
issue #60754.
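A minimal sketch of the identity for the unsigned case, assuming a nonzero divisor and no wrapping in the multiply and add. The function names are hypothetical; the 64-bit intermediate only stands in for the no-wrap requirement, and the alive2 links above remain the authoritative proofs.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: ((X * C2) + C1) / C2.
// Computed in 64 bits so that 32-bit inputs cannot wrap.
// C2 is assumed to be nonzero.
constexpr uint64_t beforeFold(uint32_t X, uint32_t C1, uint32_t C2) {
  return (uint64_t{X} * C2 + C1) / C2;
}

// Right-hand side of the fold: X + C1 / C2.
constexpr uint64_t afterFold(uint32_t X, uint32_t C1, uint32_t C2) {
  return uint64_t{X} + C1 / C2;
}
```

For example, with X = 7, C1 = 10, C2 = 3 both sides evaluate to 10: (21 + 10) / 3 on the left and 7 + 3 on the right.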
Sanjay Patel [Mon, 20 Feb 2023 18:09:17 +0000 (13:09 -0500)]
[InstCombine] add tests for div with muladd operand; NFC
issue #60754
Sanjay Patel [Mon, 20 Feb 2023 16:20:30 +0000 (11:20 -0500)]
[InstCombine] add tests for add with sub-from-constant operand; NFC
Tiwari Abhinav Ashok Kumar [Mon, 20 Feb 2023 18:29:29 +0000 (23:59 +0530)]
[NFC] Fix missing colon in CHECK directives
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D144412
Kazu Hirata [Mon, 20 Feb 2023 18:38:18 +0000 (10:38 -0800)]
[lldb] Use llvm::rotr (NFC)
Ricardo Jesus [Mon, 20 Feb 2023 15:43:59 +0000 (15:43 +0000)]
[AArch64] Add tests for saba (NFC)
Tests in sve-saba.ll currently exhibit inefficient codegen.
Differential Revision: https://reviews.llvm.org/D144399
zhongyunde [Mon, 20 Feb 2023 15:49:13 +0000 (23:49 +0800)]
[InstCombine] canonicalize urem as cmp+select
Fix https://github.com/llvm/llvm-project/issues/60546
Reviewed By: nikic, efriedma, RKSimon, spatel
Differential Revision: https://reviews.llvm.org/D143883
Nikita Popov [Mon, 20 Feb 2023 08:46:54 +0000 (09:46 +0100)]
[InstCombine] Remove early constant fold
InstCombine currently performs a constant folding attempt as part
of the main InstCombine loop, before visiting the instruction.
However, each visit method will also attempt to simplify the
instruction, which will in turn constant fold it. (Additionally,
we also constant fold instructions before the main InstCombine loop
and use a constant folding IR builder, so this is doubly redundant.)
There is one place where InstCombine visit methods currently don't
call into simplification, and that's casts. To be conservative,
I've added an explicit constant folding call there (though it has
no impact on tests).
This makes for a mild compile-time improvement and in particular
mitigates the compile-time regression from enabling load
simplification in be88b5814d9efce131dbc0c8e288907e2e6c89be.
Differential Revision: https://reviews.llvm.org/D144369
zhongyunde [Mon, 20 Feb 2023 12:50:55 +0000 (20:50 +0800)]
[test] precommit some tests for D143883 NFC
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144372
Matthias Springer [Mon, 20 Feb 2023 15:32:43 +0000 (16:32 +0100)]
[mlir][bufferization] Fix crash in EmptyTensorElimination
Differential Revision: https://reviews.llvm.org/D144389
Nikita Popov [Mon, 20 Feb 2023 15:35:58 +0000 (16:35 +0100)]
[InstCombine] Use CaptureTracking in foldAllocaCmp()
foldAllocaCmp() checks whether the alloca is not captured (ignoring
the icmp). Replace the manual implementation of escape analysis
with CaptureTracking.
The primary practical difference is that CaptureTracking handles
nocapture arguments, while foldAllocaCmp() was using a hardcoded
list.
This is basically just the CaptureTracking refactoring from D120371
without the other changes.
Joseph Huber [Fri, 17 Feb 2023 18:06:02 +0000 (12:06 -0600)]
[libc] Fix dependencies for generating the GPU binary file
This patch adjusts the way dependencies are handled in the packaged
version of the GPU libc runtime. Previously the files were not getting
updated properly in the install when they changed.
Depends on D144214
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D144280
Joseph Huber [Thu, 16 Feb 2023 20:46:39 +0000 (14:46 -0600)]
[libc] Support add_object_library for the GPU build
This patch unifies the handling of generating the GPU build targets
between the `add_entrypoint_library` and the `add_object_library`
functions. The `_build_gpu_objects` function will create two targets.
One contains a single object file with several GPU binaries embedded in
it, a so-called fatbinary. The other is a direct compile of the
supported target to be used internally only. This patch pulls out some
of the properties logic so that we can handle both more easily. This
patch also required adding an override `NO_GPU_BUILD` for cases when
we only want to build the source file as normal.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D144214
Ricardo Jesus [Mon, 20 Feb 2023 14:34:29 +0000 (14:34 +0000)]
[AArch64] Add SLP test for abs (NFC)
Differential Revision: https://reviews.llvm.org/D144376
Joseph Huber [Fri, 10 Feb 2023 20:37:20 +0000 (14:37 -0600)]
[Libomptarget] Implement the host memory allocator with fine grained memory
This patch should enable the "Host" allocation using fine-grained
memory. As far as I understand, this is HSA managed memory that is
available to the host, but can be accessed by the device as well.
The original patch that introduced these extensions just stipulated that
it's "non-migratable" memory, which is most likely true because it's
managed by the host but accessible by the device. This should work
sufficiently well for what we expect the "host" allocation to do.
Depends on D143771
Reviewed By: kevinsala
Differential Revision: https://reviews.llvm.org/D143775
Joseph Huber [Fri, 10 Feb 2023 19:13:21 +0000 (13:13 -0600)]
[Libomptarget] Enable the shared allocator for AMDGPU
Currently, the AMDGPU plugin does not support the `TARGET_ALLOC_SHARED`
allocation kind. We used the fine-grained memory allocator for the
"host" alloc when this is most likely not what is intended. Fine-grained
memory can be accessed by all agents, so it should be considered shared.
This patch removes the use of fine-grained memory for the host
allocator. A later patch will add support for this via the
`hsa_amd_memory_lock` method.
Reviewed By: kevinsala
Differential Revision: https://reviews.llvm.org/D143771
Zain Jaffal [Mon, 20 Feb 2023 13:56:00 +0000 (13:56 +0000)]
[ConstraintElimination] Add tests to check for `or` instruction decomposition if a constant operand is < 2^known_zero_bits of the first operand.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D142545
Dmitry Vyukov [Mon, 20 Feb 2023 10:58:15 +0000 (11:58 +0100)]
asan: fix crash in strdup on malloc failure
There are some programs that try to handle all malloc failures.
Let's allow testing of such programs.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D144374
NAKAMURA Takumi [Sun, 19 Feb 2023 22:12:57 +0000 (07:12 +0900)]
Rework "llvm-tblgen: Anonymize some functions.", llvmorg-17-init-2668-gc45e90cf152d
Anonymous namespace should be applied only to class definitions.
Alexey Bataev [Tue, 10 Jan 2023 12:28:21 +0000 (04:28 -0800)]
[SLP]Add shuffling of extractelements to avoid extra costs/data movement.
If the scalar must be extracted and then used in the gather node,
instead we can emit shuffle instruction to avoid those extra
extractelements and vector-to-scalar and back data movement.
Part of D110978
Differential Revision: https://reviews.llvm.org/D141940
David Green [Mon, 20 Feb 2023 14:13:53 +0000 (14:13 +0000)]
[AArch64] More consistently use buildvector for zero and all-ones constants
The AArch64 backend will use legal BUILDVECTORs for zero vectors or all-ones
vectors, so during selection tablegen patterns can rely on the immAllZerosV and
immAllOnesV pattern frags in patterns like vnot. This was not always consistent
though, which this patch attempts to fix by recognizing where constant splat +
insert vector element is used. The main outcome of this will be that full
vector movi v0.2d, #0000000000000000 will be used as opposed to movi d0, #0, as
per https://reviews.llvm.org/D53579. This helps simplify what tablegen will
see, to make pattern matching simpler.
Differential Revision: https://reviews.llvm.org/D144018
Florian Hahn [Mon, 20 Feb 2023 14:11:18 +0000 (14:11 +0000)]
[VPlan] Use usesScalars in shouldPack.
Suggested by @Ayal as follow-up improvement in D143864.
I was unable to find a case where this actually changes generated code,
but it unifies the code by using common infrastructure.
Kerry McLaughlin [Mon, 20 Feb 2023 11:00:47 +0000 (11:00 +0000)]
[SME2][AArch64] Add multi-multi multiply-add long long intrinsics
Adds intrinsics for the following SME2 instructions (2 & 4 vectors):
- smlall
- smlsll
- umlall
- umlsll
- usmlall
NOTE: These intrinsics are still in development and are subject to future changes.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D143277
Aaron Ballman [Mon, 20 Feb 2023 13:35:42 +0000 (08:35 -0500)]
Fix LLVM sphinx build
This fixes the issue found by:
https://lab.llvm.org/buildbot/#/builders/30/builds/32127
Caroline Concatto [Mon, 20 Feb 2023 12:21:38 +0000 (12:21 +0000)]
[IR] Add new intrinsics interleave and deinterleave vectors
This patch adds 2 new intrinsics:
; Interleave two vectors into a wider vector
<vscale x 4 x i64> @llvm.vector.interleave2.nxv2i64(<vscale x 2 x i64> %even, <vscale x 2 x i64> %odd)
; Deinterleave the odd and even lanes from a wider vector
{<vscale x 2 x i64>, <vscale x 2 x i64>} @llvm.vector.deinterleave2.nxv2i64(<vscale x 4 x i64> %vec)
The main motivator for adding these intrinsics is to support vectorization of
complex types using scalable vectors.
The intrinsics are kept simple by only supporting a stride of 2, which makes
them easy to lower and type-legalize. A stride of 2 is sufficient to handle
complex types which only have a real/imaginary component.
The format of the intrinsics matches how `shufflevector` is used in
LoopVectorize. For example:
  using cf = std::complex<float>;
  void foo(cf * dst, int N) {
    for (int i=0; i<N; ++i)
      dst[i] += cf(1.f, 2.f);
  }
For this loop, LoopVectorize:
(1) Loads a wide vector (e.g. <8 x float>)
(2) Extracts odd lanes using shufflevector (leading to <4 x float>)
(3) Extracts even lanes using shufflevector (leading to <4 x float>)
(4) Performs the addition
(5) Interleaves the two <4 x float> vectors into a single <8 x float> using
shufflevector
(6) Stores the wide vector.
In this example, we can 1-1 replace shufflevector in (2) and (3) with the
deinterleave intrinsic, and replace the shufflevector in (5) with the
interleave intrinsic.
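As an illustration only, the lane semantics described above can be modelled on
plain arrays. This is a scalar sketch, not the lowering added by the patch: the
helper names interleave2 and deinterleave2 are borrowed from the intrinsic
names, std::vector stands in for a vector register, and both inputs are
assumed to have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Scalar model of an interleave with stride 2:
// result[2*i] = even[i], result[2*i + 1] = odd[i].
template <typename T>
std::vector<T> interleave2(const std::vector<T> &even,
                           const std::vector<T> &odd) {
  assert(even.size() == odd.size() && "inputs must have the same length");
  std::vector<T> result;
  result.reserve(even.size() * 2);
  for (std::size_t i = 0; i < even.size(); ++i) {
    result.push_back(even[i]);
    result.push_back(odd[i]);
  }
  return result;
}

// Scalar model of a deinterleave with stride 2: the first half of the pair
// holds the even-indexed lanes, the second half the odd-indexed lanes.
template <typename T>
std::pair<std::vector<T>, std::vector<T>>
deinterleave2(const std::vector<T> &vec) {
  assert(vec.size() % 2 == 0 && "input must have an even number of lanes");
  std::pair<std::vector<T>, std::vector<T>> halves;
  for (std::size_t i = 0; i < vec.size(); i += 2) {
    halves.first.push_back(vec[i]);
    halves.second.push_back(vec[i + 1]);
  }
  return halves;
}
```

With real parts {1, 3, 5, 7} and imaginary parts {2, 4, 6, 8}, interleave2
yields the memory layout of four complex values, and deinterleave2 recovers the
two original vectors from it.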
The SelectionDAG nodes might be extended to support higher strides (3, 4, etc)
as well in the future.
Similar to what was done for vector.splice and vector.reverse, the intrinsic
is lowered to a shufflevector when the type is fixed width, so as to benefit from
existing code that was written to recognize/optimize shufflevector patterns.
Note that this approach does not prevent us from adding new intrinsics for other
strides, or adding a more generic shuffle intrinsic in the future. It just solves
the immediate problem of being able to vectorize loops with complex math.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D141924
Max Kazantsev [Mon, 20 Feb 2023 11:38:07 +0000 (18:38 +0700)]
Revert "[AssumptionCache] caches @llvm.experimental.guard's"
This reverts commit f9599bbc7a3f831e1793a549d8a7a19265f3e504.
For some reason it caused a huge compile-time regression in our downstream
workloads. Not sure whether the source of it is in upstream code or not.
Temporarily reverting until investigated.
Differential Revision: https://reviews.llvm.org/D142330
Mel Chen [Mon, 13 Feb 2023 13:28:42 +0000 (05:28 -0800)]
[LV] Harden the test of the minmax with index pattern. (NFC)
- Add test config: -force-vector-width=4 -force-vector-interleave=1
- New test case: The test case returns both the minimum value and the index.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D143905
Florian Hahn [Mon, 20 Feb 2023 10:53:45 +0000 (10:53 +0000)]
[VPlan] Move shouldPack outside of DEBUG ifdef.
This fixes a build failure with assertions disabled.
Simon Tatham [Thu, 16 Feb 2023 15:34:33 +0000 (15:34 +0000)]
[LowerTypeTests] Support generating Armv6-M jump tables. (reland)
[Originally committed as f6ddf7781471b71243fa3c3ae7c93073f95c7dff;
reverted in bbef38352fbade9e014ec97d5991da5dee306da7 due to test
breakage; now relanded with the Arm tests conditioned on
`arm-registered-target`]
The LowerTypeTests pass emits a jump table in the form of an
`inlineasm` IR node containing a string representation of some
assembly. It tests the target triple to see what architecture it
should be generating assembly for. But that's not good enough for
`Triple::thumb`, because the 32-bit PC-relative `b.w` branch
instruction isn't available in all supported architecture versions. In
particular, Armv6-M doesn't support that instruction (although the
similar Armv8-M Baseline does).
Most of this patch is concerned with working out whether the
compilation target is Armv6-M or not, which I'm doing by going through
all the functions in the module, retrieving a TargetTransformInfo for
each one, and querying it via a new method I've added to check its
SubtargetInfo. If any function's TTI indicates that it's targeting an
architecture supporting B.W, then we assume we're also allowed to use
B.W in the jump table.
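The decision rule in the paragraph above can be sketched as follows. This is a
hypothetical model, not the code in the pass: FunctionTarget and
ThumbJumpTableKind are made-up stand-ins for the per-function
TargetTransformInfo query and for the chosen encoding, and the behaviour for a
module with no functions is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for what is learned about each function's subtarget.
struct FunctionTarget {
  bool hasWideBranch; // true if the function's target supports B.W
};

// Hypothetical names for the two Thumb jump table formats.
enum class ThumbJumpTableKind { WideBranch, Armv6MCompatible };

// If any function targets an architecture with B.W, assume the jump table may
// use B.W as well; otherwise fall back to the Armv6-M compatible format.
ThumbJumpTableKind
selectThumbJumpTable(const std::vector<FunctionTarget> &functions) {
  bool anyWide =
      std::any_of(functions.begin(), functions.end(),
                  [](const FunctionTarget &f) { return f.hasWideBranch; });
  return anyWide ? ThumbJumpTableKind::WideBranch
                 : ThumbJumpTableKind::Armv6MCompatible;
}
```

A single function with B.W support is enough to select the wide-branch format
in this model, which mirrors the "any function" wording above.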
The Armv6-M compatible jump table format requires a temporary
register, and therefore also has to use the stack in order to restore
that register.
Another consequence of this change is that jump tables on Arm/Thumb
are no longer always the same size. In particular, on an architecture
that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb
tables are different sizes from //each other//. As a consequence,
``getJumpTableEntrySize`` can no longer base its answer on the target
triple's architecture: it has to take into account the decision that
``selectJumpTableArmEncoding`` made, which meant I had to move that
function to an earlier point in the code and store its answer in the
``LowerTypeTestsModule`` class.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D143576
Florian Hahn [Mon, 20 Feb 2023 10:28:24 +0000 (10:28 +0000)]
[VPlan] Replace AlsoPack field with shouldPack() method (NFC).
There is no need to update the AlsoPack field when creating
VPReplicateRecipes. It can be easily computed based on the VP def-use
chains when it is needed.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D143864
Matt Devereau [Mon, 20 Feb 2023 09:47:53 +0000 (09:47 +0000)]
[InstSimplify] Correct icmp_lshr test to use ult instead of slt
Nicolas Vasilache [Mon, 20 Feb 2023 09:40:18 +0000 (01:40 -0800)]
[mlir][linalg][TransformOps] Connect hoistRedundantVectorTransfers
Connect the hoistRedundantVectorTransfers functionality to the transform
dialect.
Authored-by: Quentin Colombet <quentin.colombet@gmail.com>
Differential Revision: https://reviews.llvm.org/D144260
Nikita Popov [Wed, 15 Feb 2023 11:14:55 +0000 (12:14 +0100)]
[InstCombine] Call simplifyLoadInst()
InstCombine is supposed to be a superset of InstSimplify, but
failed to invoke load simplification.
Unfortunately, this causes a minor compile-time regression, which
will be mitigated in a future commit.
Max Kazantsev [Mon, 20 Feb 2023 09:42:14 +0000 (16:42 +0700)]
[Test] Move test for D143726 to LICM
Seems that it's a more appropriate place to do this transform.
Nikita Popov [Mon, 20 Feb 2023 09:38:22 +0000 (10:38 +0100)]
[InstCombine] Add additional load folding tests (NFC)
These show that we currently fail to call load simplification from
InstCombine.
Matt Devereau [Tue, 31 Jan 2023 13:30:09 +0000 (13:30 +0000)]
[InstSimplify] Simplify icmp between Shl instructions of the same value
  define i1 @compare_vscales() {
    %vscale = call i64 @llvm.vscale.i64()
    %vscalex2 = shl nuw nsw i64 %vscale, 1
    %vscalex4 = shl nuw nsw i64 %vscale, 2
    %cmp = icmp ult i64 %vscalex2, %vscalex4
    ret i1 %cmp
  }
This IR is currently emitted by LLVM. This icmp is redundant as this snippet
can be simplified to true or false as both operands originate from the same
@llvm.vscale.i64() call.
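A small arithmetic sketch of why the comparison is a constant. It assumes a
non-zero value (vscale is positive) and shifts that do not wrap, as the nuw
flags promise. The helper names are hypothetical, and the exact preconditions
checked by the patch are not spelled out here.

```cpp
#include <cassert>
#include <cstdint>

// True when x << shift loses no set bits, i.e. the shift has no unsigned wrap.
bool shlIsNuw(std::uint64_t x, unsigned shift) {
  assert(shift < 64 && "shift amount must be smaller than the bit width");
  return ((x << shift) >> shift) == x;
}

// Models `icmp ult (shl x, a), (shl x, b)` on 64-bit values.
bool shiftedLess(std::uint64_t x, unsigned a, unsigned b) {
  assert(a < 64 && b < 64 && "shift amounts must be smaller than the bit width");
  return (x << a) < (x << b);
}
```

For any non-zero x where both shifts are free of unsigned wrap and a < b,
shiftedLess(x, a, b) is true; for x equal to zero both sides are zero and the
result is false. Either way the answer does not depend on the particular
non-zero value.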
Differential Revision: https://reviews.llvm.org/D142542
Max Kazantsev [Mon, 20 Feb 2023 08:48:05 +0000 (15:48 +0700)]
[SCEV] Canonicalize ext(min/max(x, y)) to min/max(ext(x), ext(y))
I stumbled over this while trying to improve our exit count work. These expressions
are equivalent for complementary signed/unsigned ext and min/max (including umin_seq),
but they are not canonicalized and SCEV cannot recognize them as the same.
The benefit of this canonicalization is that SCEV can prove some new equivalences which
it couldn't prove before because of the different forms. There is one test where the trip
count seems pessimized; I could not directly figure out why, but it seems to be an unrelated issue that we can fix.
Other changes seem neutral or positive to me.
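The equivalence for matching signedness can be checked exhaustively on small
integers. The sketch below uses 8-bit to 32-bit extension as a stand-in for the
general case; the helper names are hypothetical. It also includes one
mismatched pairing (zext with smin), which is assumed here to be outside the
scope of the canonicalization.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// zext(umin(x, y)) == umin(zext(x), zext(y)), with 8-bit inputs widened to 32.
bool zextOfUMinMatches(std::uint8_t x, std::uint8_t y) {
  std::uint32_t extOfMin = static_cast<std::uint32_t>(std::min(x, y));
  std::uint32_t minOfExt = std::min(static_cast<std::uint32_t>(x),
                                    static_cast<std::uint32_t>(y));
  return extOfMin == minOfExt;
}

// sext(smin(x, y)) == smin(sext(x), sext(y)), with 8-bit inputs widened to 32.
bool sextOfSMinMatches(std::int8_t x, std::int8_t y) {
  std::int32_t extOfMin = static_cast<std::int32_t>(std::min(x, y));
  std::int32_t minOfExt = std::min(static_cast<std::int32_t>(x),
                                   static_cast<std::int32_t>(y));
  return extOfMin == minOfExt;
}

// Mismatched pairing: zero-extend the bit pattern, but take a signed minimum.
bool zextOfSMinMatches(std::int8_t x, std::int8_t y) {
  auto zext = [](std::int8_t v) {
    return static_cast<std::int32_t>(static_cast<std::uint8_t>(v));
  };
  return zext(std::min(x, y)) == std::min(zext(x), zext(y));
}
```

The first two helpers return true for every pair of 8-bit inputs, because
zero-extension preserves unsigned order and sign-extension preserves signed
order. The third fails for x = -1, y = 1: the signed minimum is -1, which
zero-extends to 255, while the minimum of the extended values is 1.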
Differential Revision: https://reviews.llvm.org/D141481
Reviewed By: nikic
Kazu Hirata [Mon, 20 Feb 2023 08:58:29 +0000 (00:58 -0800)]
Migrate away from the soft-deprecated functions in APInt.h (NFC)
Note that those functions on the left hand side are soft-deprecated in
favor of those on the right hand side:
getMinSignedBits -> getSignificantBits
getNullValue -> getZero
isNullValue -> isZero
isOneValue -> isOne
Sameer Sahasrabuddhe [Mon, 20 Feb 2023 08:55:37 +0000 (14:25 +0530)]
[llvm][Uniformity] A phi with an undef argument is not always divergent
The uniformity analysis treated an undef argument to a phi as distinct from any
other argument, equivalent to calling PHINode::hasConstantValue() instead of
PHINode::hasConstantOrUndefValue(). Such a phi was reported as divergent. This
is different from the older divergence analysis which treats such a phi as
uniform. Fixed uniformity analysis to match the older behaviour.
The original behaviour was added to DivergenceAnalysis in D19013. But it is not
clear if relying on the undef value is safe. The defined values are not constant
per se; they just happen to be uniform and the non-constant uniform value may
not dominate the PHI.
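The difference between the two queries can be illustrated with a simplified
model. This is not the LLVM implementation: incoming values are modelled as
std::optional<int>, where an empty optional stands for undef and the integer is
just an identity for a defined value, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Empty optional models undef; the int identifies a defined incoming value.
using Incoming = std::optional<int>;

// Strict check: undef counts as a value of its own, so {7, undef, 7} fails.
bool mergesSameValue(const std::vector<Incoming> &incoming) {
  for (std::size_t i = 1; i < incoming.size(); ++i)
    if (incoming[i] != incoming[0])
      return false;
  return true;
}

// Relaxed check: undef is assumed to take the value of the defined inputs,
// so only the defined incoming values have to agree.
bool mergesSameValueIgnoringUndef(const std::vector<Incoming> &incoming) {
  std::optional<int> seen;
  for (const Incoming &value : incoming) {
    if (!value)
      continue; // skip undef
    if (seen && *seen != *value)
      return false;
    seen = value;
  }
  return true;
}
```

A phi with incoming values {7, undef, 7} fails the strict check but passes the
relaxed one, which corresponds to the case that was previously reported as
divergent and is now treated as uniform.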
Reviewed By: ruiling
Differential Revision: https://reviews.llvm.org/D144254