JP Lehr [Tue, 28 Feb 2023 12:15:22 +0000 (07:15 -0500)]
[OpenMP][AMDGPU] More detail in AMDGPU kernel launch info
Makes the info that is printed for kernel launches configurable for
different plugins. Adds all machinery to print the detailed launch
info that the current AMD plugin provides and includes e.g. register
spill counts.
The files msgpack.cpp, msgpack.def, and msgpack.h are copied from the old plugin
and are untouched. The contents of UtilitiesHSA.cpp and .h are copied together from
various files from the old plugin. The code was originally written by
Jon Chesterfield. I updated the function and type names visible to the outside, i.e.
in headers, to respect the LLVM conventions.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D144521
sgokhale [Tue, 28 Feb 2023 12:37:36 +0000 (18:07 +0530)]
[LV] Modify test case for commit 4f9a544
Was observing test failure. Relanding the test
Benjamin Kramer [Tue, 28 Feb 2023 12:22:10 +0000 (13:22 +0100)]
Revert "[mlir][linalg] Vectorize tensor.extract using contiguous loads"
This reverts commit 89b144ece330b363713bec369d2d89dc85f715f5. See
https://reviews.llvm.org/D141998 for a test case where this goes wrong.
Sacha Ballantyne [Tue, 28 Feb 2023 11:56:23 +0000 (11:56 +0000)]
[flang] Change COUNT intrinsic to support different kind logicals
Previously, COUNT cast the mask input to logical<4> before passing it
to the runtime function. This has been changed to allow logicals of different kinds.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D144867
Ivan Kosarev [Tue, 28 Feb 2023 11:50:50 +0000 (11:50 +0000)]
[AMDGPU][NFC] Eliminate the u32imm operand definition.
It is only used to infer the types of offset parameters in isel patterns,
which we can specify directly.
Reviewed By: piotr
Differential Revision: https://reviews.llvm.org/D144890
dbakunevich [Mon, 20 Feb 2023 12:18:03 +0000 (19:18 +0700)]
[SCEV][NFC] Introduce utility function to get power of 2
Add a new utility function to SCEV that raises the number 2
to the desired power.
Authored-by: Dmitry Bakunevich
Differential Revision: https://reviews.llvm.org/D144381
sgokhale [Tue, 28 Feb 2023 12:02:39 +0000 (17:32 +0530)]
[LV] Reland "Update logic for calculating register usage due to invariants"
Previously, while calculating register usage due to invariants, it was assumed that an invariant would always be part of widening
instructions. This resulted in calculating vector register types for vectors which can't be legalized (check the newly added test for more details).
An invariant might not always need a vector register. For example, an invariant might be used only for the iteration check.
This patch checks if the invariant is part of any widening instruction and considers register usage accordingly. Fixes issue 60493.
Differential Revision: https://reviews.llvm.org/D143422
Nicolas Vasilache [Fri, 24 Feb 2023 11:59:08 +0000 (03:59 -0800)]
[mlir][Linalg] Refactor transform.structured.pad to separate out hoisting
Depends on: D144717
Differential Revision: https://reviews.llvm.org/D144856
Nicolas Vasilache [Fri, 24 Feb 2023 11:05:41 +0000 (03:05 -0800)]
[mlir][Linalg] NFC - Apply cleanups to transforms
Depends on: D144656
Differential Revision: https://reviews.llvm.org/D144717
Deniz Evrenci [Tue, 28 Feb 2023 10:36:46 +0000 (11:36 +0100)]
[Clang] Do not emit exception diagnostics from coroutines and coroutine lambdas
All exceptions thrown in coroutine bodies are caught and
unhandled_exception member of the coroutine promise type is called.
In accordance with the existing rules of diagnostics related to
exceptions thrown in functions marked noexcept, even if the promise
type's constructor, get_return_object, or unhandled_exception
throws, diagnostics should not be emitted.
Fixes #48797.
Differential Revision: https://reviews.llvm.org/D144352
Florian Hahn [Tue, 28 Feb 2023 10:40:36 +0000 (11:40 +0100)]
[GlobalOpt] Split CleanupPointerRootUsers test with constant exprs.
Split tests for D144468. Adding a test with an icmp constant expression
stopped CleanupPointerRootUsers from being called. Move it to a
separate test.
Nikita Popov [Tue, 28 Feb 2023 09:17:03 +0000 (10:17 +0100)]
[CHR] Do not fetch BFI without profile summary (NFCI)
Do not compute BFI if PGO is not used. This addresses the
compile-time regression from https://reviews.llvm.org/D144769.
sgokhale [Tue, 28 Feb 2023 10:16:59 +0000 (15:46 +0530)]
Revert "[LV] Update logic for calculating register usage due to invariants"
Observing test failure for llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
This reverts commit d1628266946fdddb44bdad2b3ccf3cd5fc769f42.
luxufan [Tue, 28 Feb 2023 09:59:02 +0000 (17:59 +0800)]
[Local][InstCombine] Handle MD_noundef in combineMetadataCSE
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144942
Max Kazantsev [Tue, 28 Feb 2023 09:16:01 +0000 (16:16 +0700)]
[LICM][NFC] Don't preserve DT and loop analyzes separately
This is already implied by getLoopPassPreservedAnalyses.
Differential Revision: https://reviews.llvm.org/D144860
Reviewed By: nikic, skatkov
Dmitry Makogon [Tue, 28 Feb 2023 09:21:58 +0000 (16:21 +0700)]
[GuardWidening] Rename 'isAvailableAt' -> 'canBeHoistedTo' (NFC)
This better describes what this method does.
Dmitry Makogon [Tue, 28 Feb 2023 08:14:37 +0000 (15:14 +0700)]
[GuardWidening] Make sure widened condition operands are available at insertion point
This fixes a possible issue where we could hoist an instruction up to a widenable
condition intrinsic call without making sure its operands are available at the
hoisting point.
The insertion point finding algorithm changed recently, so this availability
check became necessary.
The verifier would crash after we handled the following special case:
L >u C0 && L >u C1 -> L >u max(C0, C1)
Previously we would insert the new condition right before the widenable condition
branch, where all of L's operands were available.
Now we may choose the widenable condition intrinsic call as the insertion point, and
L's operands may be computed after that call, so we have to make sure that
they are available at the point where we want to insert the new condition.
Differential Revision: https://reviews.llvm.org/D144944
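The special case above rests on an unsigned identity: `L >u C0 && L >u C1` holds exactly when `L >u max(C0, C1)`. A minimal sketch that checks the identity exhaustively over 8-bit values; the function names are illustrative and are not part of GuardWidening:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Original pair of unsigned comparisons: L >u C0 && L >u C1.
inline bool twoChecks(uint8_t l, uint8_t c0, uint8_t c1) {
  return l > c0 && l > c1;
}

// Widened single comparison: L >u max(C0, C1).
inline bool widenedCheck(uint8_t l, uint8_t c0, uint8_t c1) {
  return l > std::max(c0, c1);
}

// Compares both forms for every combination of 8-bit inputs.
inline bool foldIsEquivalent() {
  for (int l = 0; l < 256; ++l)
    for (int c0 = 0; c0 < 256; ++c0)
      for (int c1 = 0; c1 < 256; ++c1) {
        uint8_t L = static_cast<uint8_t>(l);
        uint8_t C0 = static_cast<uint8_t>(c0);
        uint8_t C1 = static_cast<uint8_t>(c1);
        if (twoChecks(L, C0, C1) != widenedCheck(L, C0, C1))
          return false;
      }
  return true;
}
```

The sketch only covers the arithmetic. The bug fixed here concerns where the widened condition is inserted, which a standalone check like this cannot model.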
Haojian Wu [Tue, 28 Feb 2023 09:15:28 +0000 (10:15 +0100)]
[mlir] Fix the missing mlir test BUILD.bazel for 3948f0a0b5e5fecddf315b8de321c6a44ee7ff5c
Haojian Wu [Tue, 28 Feb 2023 08:32:30 +0000 (09:32 +0100)]
Matthias Springer [Mon, 27 Feb 2023 09:37:02 +0000 (10:37 +0100)]
[mlir][Affine][NFC] Improve FlatAffineValueConstraint dump
Improve indentation for better readability.
Before:
```
Domain: 0, Range: 2, Symbols: 2, Locals: 1
5 constraints
(None Value Value Value Local const)
1 1 0 -1 0 0 = 0
0 1 -1 0 0 0 >= 0
0 0 1 -1 2 2 >= 0
0 0 -1 1 -2 -1 >= 0
0 -1 1 0 2 0 >= 0
```
After:
```
Domain: 0, Range: 2, Symbols: 2, Locals: 1
5 constraints
(None Value Value Value Local const)
1 1 0 -1 0 0 = 0
0 1 -1 0 0 0 >= 0
0 0 1 -1 2 2 >= 0
0 0 -1 1 -2 -1 >= 0
0 -1 1 0 2 0 >= 0
```
Differential Revision: https://reviews.llvm.org/D144854
Dmitry Makogon [Tue, 28 Feb 2023 06:30:15 +0000 (13:30 +0700)]
[Test] Add test exposing hoisting bug in GuardWidening (NFC)
It hoists an instruction without making sure its operands are
available at the hoisting point.
Tobias Gysi [Tue, 28 Feb 2023 07:44:57 +0000 (08:44 +0100)]
[mlir][llvm] Rename LLVMOpsInterfaces.td to LLVMInterfaces.td (NFC).
The revision renames LLVMOpsInterfaces.td since the tablegen file
contains both op and type interfaces.
Reviewed By: ftynse, Dinistro
Differential Revision: https://reviews.llvm.org/D144875
Jun Zhang [Sat, 25 Feb 2023 06:23:01 +0000 (14:23 +0800)]
[InstCombine] Fold signbit test of a pow2 or zero
(X & -X) < 0 --> X == MinSignedC
(X & -X) > -1 --> X != MinSignedC
Alive2: https://alive2.llvm.org/ce/z/_J5q3S
Closes: https://github.com/llvm/llvm-project/issues/60957
Signed-off-by: Jun Zhang <jun@junz.org>
Differential Revision: https://reviews.llvm.org/D144777
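Reading the fold as a sign-bit test on `X & -X` (the lowest set bit of X, which is a power of two or zero), the sign bit can only survive when X is the minimum signed value. A small sketch that checks both directions exhaustively for 8-bit integers; the helper names are illustrative and do not come from InstCombine:

```cpp
#include <cassert>
#include <cstdint>

// Lowest set bit of x in 8-bit two's complement: x & -x.
// Unsigned arithmetic is used so the negation wraps instead of overflowing.
inline int8_t lowestSetBit(int8_t x) {
  uint8_t u = static_cast<uint8_t>(x);
  uint8_t r = u & static_cast<uint8_t>(0u - u);
  return static_cast<int8_t>(r);
}

// Returns true if both folds hold for every 8-bit value.
inline bool foldHoldsForAllI8() {
  const int8_t minSigned = INT8_MIN;
  for (int v = -128; v <= 127; ++v) {
    int8_t x = static_cast<int8_t>(v);
    int8_t p = lowestSetBit(x);
    if ((p < 0) != (x == minSigned))
      return false; // (X & -X) < 0  -->  X == MinSignedC
    if ((p > -1) != (x != minSigned))
      return false; // (X & -X) > -1 -->  X != MinSignedC
  }
  return true;
}
```

The Alive2 link in the commit message remains the authoritative proof; this is only a quick sanity check at one bit width.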
Craig Topper [Tue, 28 Feb 2023 06:33:41 +0000 (22:33 -0800)]
[TableGen] Replace a StringMap keyed by Record name with a DenseMap.
We can use the Record* to uniquely identify the Record without using
its name.
LiaoChunyu [Tue, 28 Feb 2023 06:07:25 +0000 (14:07 +0800)]
[RISCV] Enable preferZeroCompareBranch to optimize branch on zero in codegenprepare
Similar to ARM and SystemZ.
Related patches: D101778 (preferZeroCompareBranch)
https://reviews.llvm.org/rG9a9421a461166482465e786a46f8cced63cd2e9f ( == 0 to u< 1)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D142071
Chuanqi Xu [Tue, 28 Feb 2023 06:37:08 +0000 (14:37 +0800)]
[NFC] Add a test about template pack for C++20 Modules
I found the issue in a downstream project, but the case looks fine
upstream. Add the test to make sure that it doesn't happen
upstream.
Sameer Sahasrabuddhe [Tue, 28 Feb 2023 05:59:19 +0000 (11:29 +0530)]
[llvm][Uniformity] provide overloads for Instruction* and Value*
Uniformity analysis is mainly concerned with the uniformity of values. But it is
sometimes useful to ask if an instruction is uniform, for example, if the
instruction is a terminator. On LLVM IR, every Instruction is a Value, so the
queries like isUniform() need to be overloaded so that the most derived class
always wins.
Reviewed By: ruiling
Differential Revision: https://reviews.llvm.org/D144699
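The overloading rule described above can be illustrated with stand-in classes. In this sketch `Value`, `Instruction`, and `UniformityQuery` are hypothetical placeholders, not the LLVM classes; only the inheritance relationship and the overload pair matter:

```cpp
#include <cassert>
#include <string>

// Stand-ins for llvm::Value and llvm::Instruction.
struct Value {
  virtual ~Value() = default;
};
struct Instruction : Value {};

// Hypothetical analysis interface with one overload per static type.
struct UniformityQuery {
  std::string isUniform(const Value *) const { return "Value overload"; }
  std::string isUniform(const Instruction *) const {
    return "Instruction overload";
  }
};
```

Overload resolution works on the static type of the argument: an `Instruction *` binds to the `Instruction` overload because that needs no derived-to-base conversion, while the same object viewed through a `Value *` selects the `Value` overload.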
sgokhale [Mon, 27 Feb 2023 07:50:52 +0000 (13:20 +0530)]
[LV] Update logic for calculating register usage due to invariants
Previously, while calculating register usage due to invariants, it was assumed that an invariant would always be part of widening
instructions. This resulted in calculating vector register types for vectors which can't be legalized (check the newly added test for more details).
An invariant might not always need a vector register. For example, an invariant might be used only for the iteration check.
This patch checks if the invariant is part of any widening instruction and considers register usage accordingly. Fixes issue 60493.
Differential Revision: https://reviews.llvm.org/D143422
Uday Bondhugula [Sun, 26 Feb 2023 14:56:08 +0000 (20:26 +0530)]
[MLIR] Fix bug in addAffineParallelOpDomain upper bound constraint
Fix upper bound constraint addition in addAffineParallelOpDomain; it was
off by one in the case of constants.
Differential Revision: https://reviews.llvm.org/D144836
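An `affine.parallel` upper bound is exclusive, so a constant bound `ub` has to enter the constraint system as `iv <= ub - 1`. The sketch below assumes the off-by-one was of this kind; the helper names and the two-column row layout (coefficient of `iv`, then the constant term) are illustrative and are not the MLIR API:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Row {a, c} encodes the inequality a * iv + c >= 0.
using Inequality = std::array<int64_t, 2>;

// Exclusive constant upper bound: iv < ub  <=>  -iv + (ub - 1) >= 0.
inline Inequality upperBoundRow(int64_t exclusiveUb) {
  return {-1, exclusiveUb - 1};
}

// Evaluates the inequality for a concrete induction variable value.
inline bool satisfies(const Inequality &row, int64_t iv) {
  return row[0] * iv + row[1] >= 0;
}
```

Using `ub` instead of `ub - 1` as the constant term would admit `iv == ub`, one iteration past the end of the domain.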
Chia-hung Duan [Tue, 28 Feb 2023 03:20:38 +0000 (03:20 +0000)]
[scudo] Set name for reserved regions in SizeClassAllocator64
This improves the readability of the address space.
Reviewed By: enh
Differential Revision: https://reviews.llvm.org/D144898
Ben Shi [Fri, 24 Feb 2023 03:11:00 +0000 (11:11 +0800)]
[AVR] Fix incorrect flags of livein registers when spilling them
In AVRFrameLowering::spillCalleeSavedRegisters(), when a 16-bit
livein register is spilled, two PUSH instructions are generated
for the higher and lower 8-bit registers. But these two 8-bit
registers are marked as killed in the two PUSH instructions, so
any future use of them will cause a crash.
This patch fixes the above issue by adding the two sub 8-bit
registers to the livein list.
Fixes https://github.com/llvm/llvm-project/issues/56423
Reviewed By: jacquesguan
Differential Revision: https://reviews.llvm.org/D144720
Arthur Eubanks [Tue, 28 Feb 2023 03:00:37 +0000 (19:00 -0800)]
[IPO] Remove various legacy passes
These are part of the optimization pipeline, of which the legacy pass manager version is deprecated and being removed.
Theodoros Kasampalis [Tue, 28 Feb 2023 02:27:59 +0000 (10:27 +0800)]
[X86] Fix for offsets of CFA directives
`emitPrologue` may insert stack pointer adjustment in tail call optimized functions where the callee argument stack size is bigger than the caller's. In such a case, the adjustment must be taken into account when generating CFA directives.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D143618
Craig Topper [Tue, 28 Feb 2023 02:12:18 +0000 (18:12 -0800)]
[CodeGen] Use LLVM_ATTRIBUTE_UNUSED instead of LLVM_DUMP_METHOD on a raw_ostream operator<<.
LLVM_DUMP_METHOD includes ATTRIBUTE_NOINLINE. operator<< isn't
what we normally consider a dump method so it should be ok to inline.
This fixes a warning from gcc that some other declaration for some
other class was inline but this one is noinline. Seems like a bogus
warning from gcc really.
LLVM GN Syncbot [Tue, 28 Feb 2023 01:47:13 +0000 (01:47 +0000)]
[gn build] Port 58ec6e09abe8
Craig Topper [Tue, 28 Feb 2023 01:39:50 +0000 (17:39 -0800)]
[AArch64] Use isSVESizelessBuiltinType instead of isSizelessBuiltinType in SVE specific code.
isSizelessBuiltinType includes RISC-V vector and WebAssembly reference
types. This code is not applicable to those types.
Johannes Doerfert [Mon, 27 Feb 2023 21:31:05 +0000 (13:31 -0800)]
[OpenMP][FIX] Properly align firstprivate variables
The old code didn't actually align the values, and it added padding even
when none was necessary. This approach will pad entries if necessary
and, similar to the struct case, use the host pointer as guidance.
NOTE: This still does not align them the way the host does, but it's unclear
if the user really should use the alignment bits anyway. For now
this is a reasonable compromise; only if we had host alignment
information (explicitly, not implicitly via the host pointer) could
we do it completely right without wasting lots of resources for
>99% of the cases.
Fixes: https://github.com/llvm/llvm-project/issues/61034
Jakub Kuderski [Tue, 28 Feb 2023 01:25:20 +0000 (20:25 -0500)]
[ADT] Fix definition of `adl_begin`/`adl_end` and `Iter`/`ValueOfRange`
- Make `IterOfRange` and `ValueOfRange` work with types that require
custom `begin`/`end` functions.
- Allow for `adl_begin`/`adl_end` to be used in constant-evaluated
contexts.
- Use SFINAE-friendly trailing return types for `adl_begin`/`adl_end` so that they are usable in template argument deduction.
- Add missing documentation comments.
This is required for future work in https://reviews.llvm.org/D144503.
Reviewed By: dblaikie, zero9178
Differential Revision: https://reviews.llvm.org/D144583
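The points above can be sketched outside of LLVM as follows. This is an illustration of the technique (a `using std::begin` fallback plus ADL, `constexpr`, and trailing return types), not the ADT implementation; the `sketch` and `user` namespaces and the `Triple` type are hypothetical:

```cpp
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

namespace sketch {
namespace adl_detail {
using std::begin;
using std::end;

// The trailing return type turns an ill-formed begin() call into a
// substitution failure instead of a hard error.
template <typename RangeT>
constexpr auto beginImpl(RangeT &&range)
    -> decltype(begin(std::forward<RangeT>(range))) {
  return begin(std::forward<RangeT>(range));
}

template <typename RangeT>
constexpr auto endImpl(RangeT &&range)
    -> decltype(end(std::forward<RangeT>(range))) {
  return end(std::forward<RangeT>(range));
}
} // namespace adl_detail

template <typename RangeT>
constexpr auto adl_begin(RangeT &&range)
    -> decltype(adl_detail::beginImpl(std::forward<RangeT>(range))) {
  return adl_detail::beginImpl(std::forward<RangeT>(range));
}

template <typename RangeT>
constexpr auto adl_end(RangeT &&range)
    -> decltype(adl_detail::endImpl(std::forward<RangeT>(range))) {
  return adl_detail::endImpl(std::forward<RangeT>(range));
}
} // namespace sketch

namespace user {
// A range type with free begin/end functions found only through ADL.
struct Triple {
  int data[3];
};
constexpr const int *begin(const Triple &t) { return t.data; }
constexpr const int *end(const Triple &t) { return t.data + 3; }
} // namespace user

// Usable in a constant-evaluated context.
constexpr user::Triple kTriple{{1, 2, 3}};
static_assert(*sketch::adl_begin(kTriple) == 1, "constexpr adl_begin");
```

The same two functions work for standard containers (through `std::begin`/`std::end`) and for `user::Triple` (through ADL).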
Chris Cotter [Tue, 28 Feb 2023 01:26:45 +0000 (01:26 +0000)]
[NFC] [clang] Forward forwarding reference
Update function bodies to forward forwarding references.
I spotted this while authoring a clang-tidy tool for C++ Core Guideline F.19.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D143877
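A minimal illustration of why a forwarding reference should be forwarded. `Sink` and `relay` are made-up names for this sketch and do not appear in the patch:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Counts which overload receives the argument.
struct Sink {
  int copies = 0;
  int moves = 0;
  void take(const std::string &) { ++copies; }
  void take(std::string &&) { ++moves; }
};

// `value` is a forwarding reference. Inside the body it is an lvalue, so
// without std::forward the const-reference overload would always be chosen.
template <typename T>
void relay(Sink &sink, T &&value) {
  sink.take(std::forward<T>(value));
}
```

With `std::forward`, an lvalue argument reaches the copying overload and an rvalue argument reaches the moving one, which is the behaviour guideline F.19 asks for.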
David Blaikie [Tue, 28 Feb 2023 00:54:24 +0000 (00:54 +0000)]
DebugInfo: Disable ctor homing for types with only deleted (non copy/move) ctors
Such a type is never going to have a ctor home, and may be used for type
punning or other ways of creating objects.
May be a more generally acceptable solution in some cases compared to
attributing with [[clang::standalone_debug]].
Differential Revision: https://reviews.llvm.org/D144931
Noah Goldstein [Tue, 28 Feb 2023 00:22:22 +0000 (18:22 -0600)]
Add tests for replacing `{v}unpck{l|h}pd` -> `{v}shufps`; NFC
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D144442
Noah Goldstein [Thu, 16 Feb 2023 17:57:12 +0000 (11:57 -0600)]
Add new pass `X86FixupInstTuning` for fixing up machine-instruction selection.
There are a variety of cases where we want more control over the exact
instruction emitted. This commit creates a new pass to fix up
instructions after the DAG has been lowered. The pass is only meant to
replace instructions that are guaranteed to be interchangeable, not to
do analysis for special cases.
Handling these instruction changes in X86ISelLowering or
X86ISelDAGToDAG isn't ideal, as it's liable to either break existing
patterns that expect a certain instruction or generate infinite
loops.
In addition, operating at the MachineInstruction level allows us to access
scheduling/code size information when making the decisions.
Currently only implements `{v}permilps` -> `{v}shufps/{v}pshufd`, but
more transforms can be added.
Differential Revision: https://reviews.llvm.org/D143787
Noah Goldstein [Sat, 25 Feb 2023 03:05:10 +0000 (21:05 -0600)]
Add tests for replacing `{v}permilps` -> `{v}shufps/{v}pshufd`; NFC
Differential Revision: https://reviews.llvm.org/D144779
Noah Goldstein [Thu, 16 Feb 2023 17:56:48 +0000 (11:56 -0600)]
Adding tuning flags for int <-> fp domain switching penalties; NFC
Atom
- No domain switching penalties
Nehalem+
- No penalty on moves
Haswell+
- No penalty on moves / shuffles
Skylake+
- No penalty on moves / shuffles / blends
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D143859
Fangrui Song [Tue, 28 Feb 2023 00:19:13 +0000 (16:19 -0800)]
[ELF][PPC64] Actually implement --no-power10-stubs
When a caller that does not use TOC calls a function, a call stub is needed if
the function may use TOC. --no-power10-stubs avoids PC-relative instructions in
the code sequence.
The --no-power10-stubs=no implementation added in D94627 is wrong.
First, the first instruction incorrectly uses `mflr 0` (instead of `mflr 12`).
Second, for the PLT case, it uses addis+addi with getVA instead of addis+ld with
getGotPltVA.
Amir Ayupov [Mon, 27 Feb 2023 23:40:45 +0000 (15:40 -0800)]
[BOLT][NFC] Simplify BinaryFunction::setTrapOnEntry
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D144758
Arthur Eubanks [Mon, 27 Feb 2023 23:38:39 +0000 (15:38 -0800)]
[polly] Remove unnecessary -enable-new-pm flags
Amir Ayupov [Mon, 27 Feb 2023 23:26:14 +0000 (15:26 -0800)]
[BOLT][NFC] Log reversing splitting decision
Expose log for testing purposes.
Reviewed By: #bolt, maksfb
Differential Revision: https://reviews.llvm.org/D144674
Amir Ayupov [Mon, 27 Feb 2023 23:21:23 +0000 (15:21 -0800)]
[BOLT] Prevent unsetting unknown control flow for split jump table
In case of a function with unknown control flow but with a single jump
table and a single jump table site, we attempt to match the jump table
and a site and update block successors using jump table targets.
Restrict this behavior for split jump tables which have targets in a
fragment function.
Fixes https://github.com/llvm/llvm-project/issues/60795.
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D144602
Amir Ayupov [Wed, 22 Feb 2023 06:09:17 +0000 (22:09 -0800)]
[BOLT][NFC] Const-ify analyzeJumpTable
Avoid modifying `BF`, instead set extra output parameter and modify BF in caller
scope.
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D144598
Maksim Panchenko [Fri, 24 Feb 2023 23:45:21 +0000 (15:45 -0800)]
[BOLT] Change call count output for ICF
ICF optimization runs multiple passes, and the order in which functions
are folded can depend on the order in which they are processed.
This order is nondeterministic because functions are intermediately stored in
std::unordered_map<>. Note that this order is mostly stable, but is not
guaranteed to be and can change, e.g. after switching to a different C++
library implementation.
Because the processing (and folding) order is nondeterministic, the
previous way of calculating the merged function call count could produce
different results.
Change the way we calculate the ICF call count to make it independent of
the function folding/processing order.
Mostly NFC, as the output binary should remain the same; the change
affects only the console output.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D144807
wpmed92 [Mon, 27 Feb 2023 20:51:28 +0000 (12:51 -0800)]
[mlir][core] Fix ValueRange printing in AsmPrinter
The ValueRange printing behaviour of `OpAsmPrinter` and `AsmPrinter` is different, as reported in https://github.com/llvm/llvm-project/issues/59334:
```
static void testPrint(AsmPrinter &p, Operation *op, ValueRange operands) {
  p << '(' << operands << ')';
}
```
Although the base `AsmPrinter` is passed as the first parameter (and not `OpAsmPrinter`), the code compiles fine. However, instead of the SSA values, the types for the operands will be printed. This is a violation of the Liskov Substitution Principle.
The desired behaviour would be that the above code does not compile. It compiles because the `TypeRange` version of the `<<` operator is selected, since `ValueRange` is implicitly convertible to `TypeRange`:
```
template <typename AsmPrinterT>
inline std::enable_if_t<std::is_base_of<AsmPrinter, AsmPrinterT>::value,
                        AsmPrinterT &>
operator<<(AsmPrinterT &p, const TypeRange &types) {
  llvm::interleaveComma(types, p);
  return p;
}
```
Vladislav Dzhidzhoev [Tue, 7 Feb 2023 20:32:50 +0000 (21:32 +0100)]
[AArch64][GlobalISel] Legalize G_SHUFFLE_VECTOR with smaller dest size
Legalize G_SHUFFLE_VECTOR having destination vector length smaller than
source vector length by reshaping destination vector.
Differential Revision: https://reviews.llvm.org/D144670
Fangrui Song [Mon, 27 Feb 2023 22:33:18 +0000 (14:33 -0800)]
[ELF][PPC64] Merge PPC64R12SetupStub and PPC64PCRelPLTStub. NFC
PPC64PCRelPLTStub (from D83669) duplicates a lot of code from
PPC64R12SetupStub. Just merge them.
Note: PPC64R12SetupStub does not correctly handle long branch to a
non-preemptible non-TOC code.
Sam Clegg [Fri, 24 Feb 2023 18:09:07 +0000 (10:09 -0800)]
[lld][WebAssembly] Fix handling of mixed strong and weak references
When adding an undefined symbol to the symbol table, if the existing
reference is weak, replace the symbol flags with the (potentially) non-weak
binding.
Fixes: https://github.com/llvm/llvm-project/issues/60829
Differential Revision: https://reviews.llvm.org/D144747
Arthur Eubanks [Mon, 27 Feb 2023 22:17:24 +0000 (14:17 -0800)]
[test] Remove unnecessary -enable-new-pm=0
Arthur Eubanks [Mon, 27 Feb 2023 19:50:30 +0000 (11:50 -0800)]
[LLVMContextImpl] Separate out opaque pointers
To make the map lookups simpler for opaque pointers and to simplify future typed pointer code removal. No significant compile time wins though.
While we're here, remove the address space 0 optimization for typed pointers.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144910
Maksim Panchenko [Sun, 26 Feb 2023 02:23:53 +0000 (18:23 -0800)]
[BOLT] Fix intermittent crash with instrumentation
When createInstrumentedIndirectCall() was invoked for tail calls, we
attached the annotation instruction twice to the new call instruction:
first in createDirectCall(), and then again while copying over the
metadata operands.
As a result, the annotations were not properly stripped for such calls
before the call to freeAnnotations() in the LowerAnnotations pass. That led
to a use-after-free while restoring the offsets with the setOffset() call.
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D144806
Matthew Voss [Mon, 27 Feb 2023 22:04:27 +0000 (14:04 -0800)]
[NFC][PGO] Prefix duplicate profile MemOp entry diagnostic with 'warning:'
Adding this prefix will indicate clearly that the compiler doesn't exit
when it hits this diagnostic. Searches for other non-fatal diagnostics
will also be able to find this diagnostic easily.
Arthur O'Dwyer [Thu, 19 Jan 2023 22:10:09 +0000 (17:10 -0500)]
[libc++] Fix "size_t" constants that should be "bool" or "int", and add tests
`is_placeholder`, despite having an "is_" name, actually returns an int:
1 for `_1`, 2 for `_2`, 3 for `_3`, and so on. But it should still be int,
not size_t.
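A short sketch of the values described above, using only `std::is_placeholder` from `<functional>`. The constant names are made up for the example, and the results are stored as `int` to match the type the standard specifies:

```cpp
#include <cassert>
#include <functional>

// std::is_placeholder reports the 1-based position of a placeholder,
// or 0 for any type that is not a placeholder.
constexpr int firstPos =
    std::is_placeholder<decltype(std::placeholders::_1)>::value;
constexpr int thirdPos =
    std::is_placeholder<decltype(std::placeholders::_3)>::value;
constexpr int notAPlaceholder = std::is_placeholder<int>::value;

static_assert(firstPos == 1, "_1 is placeholder number 1");
static_assert(thirdPos == 3, "_3 is placeholder number 3");
static_assert(notAPlaceholder == 0, "int is not a placeholder");
```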
Simon Pilgrim [Mon, 27 Feb 2023 21:49:18 +0000 (21:49 +0000)]
[X86] Split off x86-64-v* tuning flags. NFC
Noticed when reviewing D143786: we currently inherit the x86-64-v* tuning flags from specific CPUs, when really these need to be a mixture of common traits and tuning to avoid specific severe regressions.
Differential Revision: https://reviews.llvm.org/D144832
Michael Jones [Thu, 16 Feb 2023 19:23:44 +0000 (11:23 -0800)]
[libc] use vars in string to num fuzz targets
The string to integer and string to float standalone fuzz targets just
ran the functions and didn't do anything with the output. This was
intentional, since they are intended to be used with sanitizers to
detect buffer overflow bugs. Not using the variables was causing compile
warnings, so this patch adds trivial checks to use the variables.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D144208
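A hedged sketch of what such a target can look like. It uses `std::strtol` as a stand-in for the libc function under test, and the particular check is an assumption made for illustration; it is not the check added by the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  // Copy the input so that it is null-terminated for the C string API.
  std::string input(reinterpret_cast<const char *>(data), size);

  char *end = nullptr;
  long result = std::strtol(input.c_str(), &end, 10);

  // Trivial check that uses both outputs: if no characters were
  // consumed, the conversion must have produced 0.
  if (end == input.c_str() && result != 0)
    std::abort();
  return 0;
}
```

The check is deliberately cheap. Its main purpose is to keep `result` and `end` from being flagged as unused while sanitizers do the real bug detection.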
Chia-hung Duan [Mon, 27 Feb 2023 21:12:10 +0000 (21:12 +0000)]
Revert "[scudo] Only prepare PageMap entry for partial region"
This reverts commit 0a0b6fa4fbdf3bdeb300ddd58859f66b714b8bdf.
Arthur Eubanks [Mon, 27 Feb 2023 19:29:23 +0000 (11:29 -0800)]
[Bitcode] Remove typed pointer abbreviation
Since typed pointers are deprecated.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144901
Kazu Hirata [Mon, 27 Feb 2023 20:56:19 +0000 (12:56 -0800)]
[AArch64] Fix a warning
This patch fixes:
llvm/lib/Target/AArch64/AArch64MIPeepholeOpt.cpp:582:17: error:
unused variable 'INSvilaneMI' [-Werror,-Wunused-variable]
Michal Paszkowski [Mon, 27 Feb 2023 20:26:09 +0000 (21:26 +0100)]
[SPIR-V] Support TargetExtType for SPIR-V builtin types
This patch adds support for TargetExtType/target(...) representing
SPIR-V builtin types. After D135202, target(...) is the preferred way
of representing SPIR-V builtin types in LLVM IR and the only one that works
in opaque pointer mode.
In order to maintain compatibility with LLVM IR generated by older
versions of Clang and LLVM/SPIR-V Translator, pointers-to-opaque-structs
denoting SPIR-V/OpenCL builtin types will be translated to equivalent
SPIR-V target extension types. This translation is only available in the
typed pointer mode (-opaque-pointers=0).
The relevant LIT tests with SPIR-V builtins were converted to use the
new target(...) notation.
Differential Revision: https://reviews.llvm.org/D144494
Vasileios Porpodas [Mon, 27 Feb 2023 18:44:14 +0000 (10:44 -0800)]
[SLP] Fixes crash in BoUpSLP::isGatherShuffledEntry()
Crash caused by: 708eb1b96d9a36f9c0182b7d53c492059778fa35
Differential Revision: https://reviews.llvm.org/D144895
Nilanjana Basu [Thu, 26 Jan 2023 01:35:31 +0000 (17:35 -0800)]
[AArch64] Avoid using intermediate integer registers for copying between source and destination floating point registers
In post-isel code, there are cases where there were redundant copies from a source FPR to an intermediate GPR in order to copy to a destination FPR. In this patch, we identify these patterns in post-isel peephole optimization and replace them with a direct FPR-to-FPR copy.
One example is the insertion of the scalar result of the 'uaddlv' NEON intrinsic into a destination vector. During instruction selection, the 'uaddlv' result is copied to a GPR, and a vector insert instruction is matched separately to copy that result to a destination SIMD&FP register.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D142594
Daniel Thornburgh [Sun, 6 Feb 2022 13:20:54 +0000 (08:20 -0500)]
[Clang] [AVR] Fix USHRT_MAX for 16-bit int.
For AVR, the definition of USHRT_MAX overflows.
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D144218
Tamir Duberstein [Mon, 27 Feb 2023 20:02:51 +0000 (20:02 +0000)]
[clang-format-diff] Correctly parse start-of-file diffs
Handle the case where the diff is a pure removal of lines. Before this
change start_line would end up as 0 which is rejected by clang-format.
Submitting on behalf of @tamird.
Differential Revision: https://reviews.llvm.org/D144291
Rong Xu [Mon, 27 Feb 2023 17:51:28 +0000 (09:51 -0800)]
[Pass][CHR] Move ControlHeightReduction to module optimization pipeline
This is a modified version of commit b374423304a8 by
Arthur (https://reviews.llvm.org/D143424).
Here we invoke the pass independently of PGOOPT. We now check if the
profile is available through the program summary. This ensures CHR is
called in distributed ThinLTO BE compilation (where PGOOPT might not
be created).
Differential Revision: https://reviews.llvm.org/D144769
Florian Hahn [Mon, 27 Feb 2023 19:38:39 +0000 (20:38 +0100)]
[SCEV] Hoist common cleanup code to function. (NFC)
This allows for easier updating of common code in follow-on patches.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144847
Amara Emerson [Mon, 27 Feb 2023 19:02:37 +0000 (11:02 -0800)]
[AArch64][GlobalISel] Reorder stack up-adjustment and register copies
This change reorders the stack up-adjustment and return value copying phases of
machine-ir generation on AArch64. Doing so prevents a bug observed for fastcc
calls with >8 arguments, where the up-adjustment required from making that call
is placed in the wrong place relative to spill and reloading code.
See: https://github.com/llvm/llvm-project/issues/60972 for full issue
reproduction and context.
Patch contributed by Bruce Collie
Differential Revision: https://reviews.llvm.org/D144791
David Green [Mon, 27 Feb 2023 19:20:10 +0000 (19:20 +0000)]
[AArch64] Don't remove free sext_inreg(vector_extract(x)) if it leads to multiple extracts
If we have sext_inreg(vector_extract(x)) but the top bits are not used, DAG
will try to remove the sext_inreg, using vector_extract(x) directly. This can
lead to multiple uses of both sext_inreg(vector_extract(x)) and
vector_extract(x), leading to the generation of both umov and smov extracts.
This adds a target hook to prevent that under AArch64 where the sext_inreg can
be considered free if there are multiple uses of the sext and no uses of the
vector_extract. This helps fix a small regression from D144550.
Differential Revision: https://reviews.llvm.org/D144850
Frederik Gossen [Mon, 27 Feb 2023 18:52:15 +0000 (13:52 -0500)]
[MLIR] Add primitive builders for scf.if
Differential Revision: https://reviews.llvm.org/D144886
Chia-hung Duan [Fri, 24 Feb 2023 04:20:30 +0000 (04:20 +0000)]
[scudo] Only prepare PageMap entry for partial region
This reduces the size of PageMap and we are more likely to use the
static local buffer. Note that now this is only supported for single
region case, i.e. on SizeClassAllocator64. For SizeClassAllocator32,
it needs a different way to save the PageMap.
Differential Revision: https://reviews.llvm.org/D142659
Nikolas Klauser [Tue, 24 Jan 2023 08:32:31 +0000 (09:32 +0100)]
[libc++][NFC] Format __split_buffer and move constructors that are marked inline into the class body
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D142433
Nikolas Klauser [Sun, 26 Feb 2023 14:57:25 +0000 (15:57 +0100)]
[libc++] Simplify the modules_include.sh.cpp script a bit
Reviewed By: #libc, ldionne
Spies: vvereschaka, libcxx-commits
Differential Revision: https://reviews.llvm.org/D144825
Mark de Wever [Fri, 24 Feb 2023 20:35:41 +0000 (21:35 +0100)]
[libc++] Improves clang-format settings.
Add a new test based .clang-format file which inherits from the generic
one. This moves some test specific formatting rules to the test
directory.
The main benefit is that headers are sorted, which makes it more likely
to catch these errors before creating a review instead of spotting the
error in the CI clang-tidy step.
Reviewed By: ldionne, philnik, #libc
Differential Revision: https://reviews.llvm.org/D144755
Mark de Wever [Sat, 25 Feb 2023 14:24:57 +0000 (15:24 +0100)]
[libc++] Fixes operator& hijacking atomic types.
This uses std::addressof everywhere in atomic. This is not strictly
needed for the integral and floating point specializations. They should
not be used by user defined types. But it's easier to fix everything.
Note these changes are made using a WIP clang-tidy plugin.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D144786
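The hijacking problem can be reproduced with a small user-defined type. `Hijacker` and `realAddress` are hypothetical names for this sketch; only `std::addressof` is a real library facility:

```cpp
#include <cassert>
#include <memory>

// A user-defined type that hijacks unary operator&.
struct Hijacker {
  int payload = 7;
  Hijacker *operator&() { return nullptr; }
};

// Returns the real address regardless of any operator& overload.
template <typename T>
T *realAddress(T &object) {
  return std::addressof(object);
}
```

Library code that writes `&value` on a user-supplied type would call the overload and receive a bogus pointer, while `std::addressof(value)` always yields the object's actual address.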
Arthur Eubanks [Fri, 24 Feb 2023 05:47:03 +0000 (21:47 -0800)]
[LLVMContextImpl] Separate out integer constant ones
Very small compile time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=6a7a8907e8334eaf551742148079c628f78e6ed7&to=454d1181fbdb9121f0c7a3ecf526520db32ab420&stat=instructions:u
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144746
Arthur Eubanks [Fri, 24 Feb 2023 05:16:27 +0000 (21:16 -0800)]
[LLVMContextImpl] Separate out integer constant zeroes
Very small compile time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=a628ca4925f7249b4fbd3e932c9627b12e2770dd&to=6a7a8907e8334eaf551742148079c628f78e6ed7&stat=instructions:u
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144745
Alexey Bataev [Mon, 27 Feb 2023 17:20:49 +0000 (09:20 -0800)]
[SLP]Fix PR61018: Assertion `Mask[I] == UndefMaskElem && "Multiple uses of scalars."' failed.
Need to check for reused indices when checking whether 2 insertelement
instructions are from the same buildvector. If the indices are reused,
it is better not to match the buildvectors and to consider them different;
otherwise the order of the insertelement operations would need to be tracked.
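As a rough illustration of what "reused indices" means in the SLP fix above, the sketch below checks whether a chain of insertelement lane indices writes the same lane twice. It is a toy model under stated assumptions: `hasReusedIndices` and its parameters are hypothetical names, and the real SLP vectorizer code works on LLVM IR instructions, not on plain index lists.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if any lane index occurs more than once
// in the list of lanes written by a chain of insertelement operations.
bool hasReusedIndices(const std::vector<unsigned>& laneIndices,
                      std::size_t numLanes) {
  std::vector<bool> seen(numLanes, false);
  for (unsigned lane : laneIndices) {
    assert(lane < numLanes && "lane index out of range");
    if (seen[lane])
      return true; // this lane was already written by an earlier insert
    seen[lane] = true;
  }
  return false;
}
```

When such a check reports a reused lane, the order of the inserts starts to matter, which is the situation the commit avoids by treating the buildvectors as different.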
zhongyunde [Mon, 27 Feb 2023 18:00:43 +0000 (02:00 +0800)]
[AMDGPU] Update the CHECK autogenerated as it's expired
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144771
Craig Topper [Mon, 27 Feb 2023 17:19:27 +0000 (09:19 -0800)]
[Sema] Use isSVESizelessBuiltinType instead of isSizelessBuiltinType to prevent crashing on RISC-V.
These 2 spots are protecting calls to SVE specific functions. If RISC-V
sizeless types end up in there we trigger assertions.
Use the more specific isSVESizelessBuiltinType() to avoid letting
RISC-V vectors through.
Reviewed By: asb, c-rhodes
Differential Revision: https://reviews.llvm.org/D144772
Kiran Chandramohan [Mon, 27 Feb 2023 16:42:42 +0000 (16:42 +0000)]
[Flang][OpenMP][OpenACC] Error for loop with no control
Issue error if a DO construct associated with a loop does not have
loop control. Currently, it is issued only for the loop immediately
following the loop construct. This patch extends it to cases like
collapse where there is more than one loop associated. It also fixes
a crash since the existing code always expects loop control.
This is covered in OpenMP 4.5 standard, Section 2.7.1.
"The do-loop cannot be a DO WHILE or a DO loop without loop control."
OpenACC 3.3 covers this indirectly in Section 2.9.1.
"The trip count for all loops associated with the collapse clause must
be computable and invariant in all the loops".
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D144290
Joseph Huber [Mon, 27 Feb 2023 14:46:32 +0000 (08:46 -0600)]
[OpenMP] Ignore implicit casts on assertion for `use_device_ptr`
There was an assertion triggering when invoking a captured member whose
initializer was in a base class. This patch fixes it by allowing the
assertion on implicit casts to the base class rather than only the base
class itself.
Fixes https://github.com/llvm/llvm-project/issues/61027
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D144873
Kiran Chandramohan [Mon, 27 Feb 2023 16:31:16 +0000 (16:31 +0000)]
[Flang][OpenMP] NFC: Change a few message/comments to fit 80chars
Changes are all in the OpenMP semantic checks file.
Reviewed By: SBallantyne
Differential Revision: https://reviews.llvm.org/D144874
Nicolas Vasilache [Wed, 22 Feb 2023 13:24:25 +0000 (05:24 -0800)]
[mlir][Linalg] Reimplement hoisting on tensors as a subset-based transformation
This revision significantly rewrites hoisting on tensors.
Previously, `vector.transfer_read/write` and `tensor.extract/insert_slice` would
be clumped together when looking for candidate pairs.
This would significantly increase the complexity of the logic and would not apply
independently to `tensor.extract/insert_slice`.
The new implementation decouples the cases and starts to cast the problem
as a generic matching subset extract/insert, which will be future proof when
other such operation pairs are introduced.
Lastly, the implementation clearly distinguishes `vector.transfer_read/write`, for
which we allow bypasses of the disjoint subsets, from `tensor.extract/insert_slice`, for which we
do not yet allow it.
This can be extended in the future and unified once we have subset disjunction implemented more generally.
The algorithm can be rewritten to be less of a fixed point with interspersed canonicalizations.
As a consequence, the test explicitly adds a canonicalization to clean up the IR and verify we end up in the same state.
That extra canonicalization exhibited that one of the uses in one of the tests was dead, so we fix the appropriate test.
Differential Revision: https://reviews.llvm.org/D144656
Haojian Wu [Mon, 27 Feb 2023 16:10:33 +0000 (17:10 +0100)]
[mlir] Fix a -Wunused-variable warning, NFC
Nikita Popov [Mon, 27 Feb 2023 15:56:45 +0000 (16:56 +0100)]
[ConstExpr] Avoid creation of select constant expressions
These expressions will now only be created if explicitly requested
in IR/bitcode (and by LowerTypeTests, which has a tricky to remove
use).
This is in preparation for removing these expressions entirely,
but also fixes #60983 in the meantime.
Frederik Gossen [Mon, 27 Feb 2023 15:58:56 +0000 (10:58 -0500)]
[MLIR] Add pass to deduplicate functions
Deduplicate functions that are equivalent in all aspects but their symbol name.
The pass chooses one representative per equivalence class, erases the remainder, and updates function calls accordingly.
Differential Revision: https://reviews.llvm.org/D144738
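The idea in the deduplication commit above (one representative per equivalence class, erase the rest, update calls) can be sketched on a toy model. Everything here is an assumption made for illustration: `ToyFunc` and `deduplicate` are hypothetical, a function is reduced to a name, an opaque body string, and a list of callee names, and none of this uses the MLIR API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a function (not an MLIR type).
struct ToyFunc {
  std::string name;                 // symbol name, ignored for equivalence
  std::string body;                 // opaque text of the body
  std::vector<std::string> callees; // symbols this function calls
};

// Keeps the first function of each equivalence class, erases the others,
// and redirects calls from erased names to the kept representative.
std::vector<ToyFunc> deduplicate(const std::vector<ToyFunc>& funcs) {
  typedef std::pair<std::string, std::vector<std::string> > Key;
  std::map<Key, std::string> representative; // (body, callees) -> kept name
  std::map<std::string, std::string> rename; // erased name -> kept name
  std::vector<ToyFunc> kept;

  for (const ToyFunc& f : funcs) {
    auto result = representative.emplace(Key(f.body, f.callees), f.name);
    if (result.second)
      kept.push_back(f);                     // first of its class
    else
      rename[f.name] = result.first->second; // duplicate, erase it
  }

  // Update calls so nothing refers to an erased function.
  for (ToyFunc& f : kept) {
    for (std::string& callee : f.callees) {
      auto it = rename.find(callee);
      if (it != rename.end())
        callee = it->second;
    }
  }
  return kept;
}
```

This sketch makes a single pass, so functions that only become equivalent after their callees are renamed are not merged; whether the actual pass iterates further is not stated in the commit message.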
Haojian Wu [Mon, 27 Feb 2023 15:50:39 +0000 (16:50 +0100)]
Frederik Gossen [Mon, 27 Feb 2023 15:49:22 +0000 (10:49 -0500)]
[MLIR] Expose region equivalence check through OperationEquivalence
Differential Revision: https://reviews.llvm.org/D144735
Nikita Popov [Mon, 27 Feb 2023 15:36:07 +0000 (16:36 +0100)]
[InlineCost] Avoid ConstantExpr::getSelect()
Instead use ConstantFoldSelectInstruction(), which will return
nullptr if it cannot be folded and a constant expression would
be produced instead.
In preparation for removing select constant expressions.
Kohei Yamaguchi [Mon, 27 Feb 2023 15:36:16 +0000 (16:36 +0100)]
[mlir][sparse] Add checking parent op of SortOp
Fix crash with segmentation fault caused by setting a parent operator
that is not func::FuncOp with sparse_tensor SortOp.
fixes https://github.com/llvm/llvm-project/issues/59988
Reviewed By: aartbik, wrengr
Differential Revision: https://reviews.llvm.org/D143874
Kohei Yamaguchi [Mon, 27 Feb 2023 15:34:59 +0000 (16:34 +0100)]
[mlir][NFC] Cleanup Passes documentation
- Fix a place of NVGPU dialect's pass
- Move a summary of `-finalize-memref-to-llvm` into description
- Fix broken links
- Replace back-quote dialect headers with single-quote headers for
improved readability.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D142868
Amir Mohammad Tavakkoli [Mon, 27 Feb 2023 15:28:54 +0000 (16:28 +0100)]
[mlir][LinAlg][Transform][GPU] Add GPU memory hierarchy to the transform.promote op
In this patch we are adding support for copying a `memref.subview` to the shared or private memory in GPU. The global to shared memory copy is adopted from code implemented in IREE (https://github.com/iree-org/iree), but the private memory copy part has not been implemented in IREE. This patch enables transferring a subview from `global->shared`, `global->private`, and `shared->private`.
Our final aim is to provide a copy layout as an affine map to the `transform.promote` op to support transpose memory copy. This map is a permutation of the original affine index map. Although this has been implemented and users can copy data to an arbitrary layout, this attempt is not included in this patch, since we still have a problem changing the index maps of `linalg.generic` operations to the transformed index map. You can find more in the following links ([[ https://github.com/tavakkoliamirmohammad/iree-llvm-fork/commit/4fd5f93355951ad0fb338858393ff409bd9c62f8 | Initial attempt to support layout map in promote op in transform dialect ]]) ([[ https://github.com/tavakkoliamirmohammad/iree-llvm-fork/commit/9062b5849f91d4defb84996392b71087dadf7a8c | Fix data transpose in shared memory ]])
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D144666
Alexey Bataev [Mon, 27 Feb 2023 14:33:08 +0000 (06:33 -0800)]
[SLP]Fix a crash when trying to find reduced ops for the reduced value.
Need to use the original reduced value, not the one the compiler gets after
reduction, which may already have been replaced by the extractelement instruction.
Nikita Popov [Mon, 27 Feb 2023 15:31:21 +0000 (16:31 +0100)]
[InstCombine] Avoid ConstantExpr::getSelect() use (NFCI)
Instead let IRBuilder take care of constant folding.
In preparation for removing select constantexprs.