Srishti Srivastava [Fri, 21 Jul 2023 20:28:57 +0000 (13:28 -0700)]
[MLIR][ANALYSIS] Add liveness analysis utility
This commit adds a utility to implement liveness analysis using the
sparse backward data-flow analysis framework. Theoretically, liveness
analysis assigns liveness to each (value, program point) pair in the
program and it is thus a dense analysis. However, since values are
immutable in MLIR, a sparse analysis, which will assign liveness to
each value in the program, suffices here.
Liveness analysis has many applications. It can be used to avoid the
computation of extraneous operations that have no effect on the memory
or the final output of a program. It can also be used to optimize
register allocation. Both of these applications help achieve one very
important goal: reducing runtime.
A value is considered "live" iff it:
(1) has memory effects OR
(2) is returned by a public function OR
(3) is used to compute a value of type (1) or (2).
Note that a value can be of multiple types (1/2/3) at the same time.
A value "has memory effects" iff it:
(1.a) is an operand of an op with memory effects OR
(1.b) is a non-forwarded branch operand and a block to which its op could
transfer control has an op with memory effects.
A value `A` is said to be "used to compute" value `B` iff `B` cannot be
computed in the absence of `A`. Thus, in this implementation, we say that
value `A` is used to compute value `B` iff:
(3.a) `B` is a result of an op with operand `A` OR
(3.b) `A` is used to compute some value `C` and `C` is used to compute
`B`.
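The definition above can be sketched as a backward worklist over a def-use map. This is only an illustrative Python model, not the code added by this commit: `live_values`, `operands_of`, `memory_effect_operands`, and `public_returns` are hypothetical names, and rule (1.b) for branch operands is left out.

```python
from collections import deque


def live_values(operands_of, memory_effect_operands, public_returns):
    """Return the set of live values under rules (1.a), (2) and (3).

    operands_of: maps a value to the operands of the op that defines it.
    memory_effect_operands: operands of ops with memory effects (1.a).
    public_returns: values returned by public functions (2).
    """
    live = set(memory_effect_operands) | set(public_returns)
    worklist = deque(live)
    while worklist:
        value = worklist.popleft()
        # Rule (3.a): each operand of the defining op is used to compute
        # `value`. Draining the worklist yields the transitive rule (3.b).
        for operand in operands_of.get(value, ()):
            if operand not in live:
                live.add(operand)
                worklist.append(operand)
    return live
```

For example, with `b = add(a, a)`, `c = mul(b, b)`, and only `b` returned by a public function, the sketch reports `a` and `b` as live and `c` as not live.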
---
It is important to note that an MLIR liveness utility already exists
here: llvm-project/mlir/include/mlir/Analysis/Liveness.h. So why add
the new liveness analysis utility in this commit? The reasons are as
follows:
The two utilities are similar in that both use the fixpoint iteration
method to converge to the final liveness result, and both share the
same theoretical understanding of liveness.
However, the main difference between (a) the existing utility and (b)
the added utility is the "scope of the analysis". (a) is restricted to
analysing each block independently while (b) analyses blocks together,
i.e., it looks at how the control flows from one block to the other,
how a caller calls a callee, etc. The restriction in the former implies
that some potentially non-live values could be marked live and thus the
full potential of liveness analysis will not be realised.
This can be understood using the example below:
```
1 func.func private @private_dead_return_value_removal_0() -> (i32, i32) {
2 %0 = arith.constant 0 : i32
3 %1 = arith.addi %0, %0 : i32
4 return %0, %1 : i32, i32
5 }
6 func.func @public_dead_return_value_removal_0() -> (i32) {
7 %0:2 = func.call @private_dead_return_value_removal_0() : () -> (i32, i32)
8 return %0#0 : i32
9 }
```
Here, if we just restrict our analysis to a per-block basis like (a), we
will say that the %1 on line 3 is live because it is computed and then
returned outside its block by the function. But, if we perform a
backward data-flow analysis like (b) does, we will say that %0#1 of line
7 is not live because it isn't returned by the public function and thus,
%1 of line 3 is also not live. So, while (a) will be unable to suggest
any IR optimizations, (b) enables this IR to be converted to:
```
func.func private @private_dead_return_value_removal_0() -> i32 {
  %0 = arith.constant 0 : i32
  return %0 : i32
}
func.func @public_dead_return_value_removal_0() -> i32 {
  %0 = call @private_dead_return_value_removal_0() : () -> i32
  return %0 : i32
}
```
One operation and one unnecessary return value of the function were
removed, and the function signature was modified. This is an
optimization that (b) can enable but (a) cannot. Such optimizations can
help remove a lot of extraneous computations that are currently being
done.
Signed-off-by: Srishti Srivastava <srishtisrivastava.ai@gmail.com>
Reviewed By: matthiaskramm, jcai19
Differential Revision: https://reviews.llvm.org/D153779
Peter Klausler [Wed, 19 Jul 2023 19:06:31 +0000 (12:06 -0700)]
[flang] Catch case of character array constructor with indeterminable length
F'2023 7.8 para 5 requires that an implied DO loop with no iterations
in a character array constructor should have items whose lengths are
constant expressions independent of the value of the implied DO loop
index.
Differential Revision: https://reviews.llvm.org/D155968
Daniel Hoekwater [Tue, 27 Jun 2023 01:30:27 +0000 (01:30 +0000)]
[AArch64] Move branch relaxation after bbsection assignment
Because branch relaxation needs to factor in if branches target
a block in the same section or a different one, it needs to run
after the Basic Block Sections / Machine Function Splitting passes.
Because Jump table compression relies on block offsets remaining
fixed after the table is compressed, we must also move the JT
compression pass.
The only tests affected are the ones that enforce just the ordering and
a few whose basic block ids change because RenumberBlocks
hasn't run yet.
Differential Revision: https://reviews.llvm.org/D153829
Alexey Bataev [Fri, 21 Jul 2023 20:13:01 +0000 (13:13 -0700)]
[SLP][NFC]Add a test with strided loads, NFC.
Peter Klausler [Wed, 19 Jul 2023 00:05:47 +0000 (17:05 -0700)]
[flang][runtime] Detect NEWUNIT= without FILE= or STATUS='SCRATCH'
It is an error to open a new unit with OPEN(NEWUNIT=) and have
neither a file name nor a scratch status. Catch it, and report a
new error code.
Differential Revision: https://reviews.llvm.org/D155967
Florian Hahn [Fri, 21 Jul 2023 20:05:50 +0000 (22:05 +0200)]
[LV] Replace use of getMaxSafeDepDist with isSafeForAnyVector (NFC)
Replace the use of getMaxSafeDepDistBytes with the more direct
isSafeForAnyVector. This removes the need to define getMaxSafeDepDistBytes.
Matt Arsenault [Wed, 3 May 2023 13:52:53 +0000 (09:52 -0400)]
ValueTracking: Implement computeKnownFPClass for frexp
Work around the lack of proper multiple return values by looking
at the extractvalue.
https://reviews.llvm.org/D150982
Matt Arsenault [Wed, 3 May 2023 11:37:16 +0000 (07:37 -0400)]
ValueTracking: Add baseline tests for frexp handling in computeKnownFPClass
Matt Arsenault [Sun, 2 Jul 2023 00:21:28 +0000 (20:21 -0400)]
AMDGPU: Add baseline test for fdiv combine
Peter Klausler [Tue, 18 Jul 2023 23:14:49 +0000 (16:14 -0700)]
[flang] Preserve errors from generic matching
When searching for a matching specific procedure for a set of actual
arguments in a type-bound generic interface for a defined operator,
don't discard any error messages that may have been produced for
the specific that was found. Tweak the code to preserve those
messages and add them to the context's messages, and add a test.
Differential Revision: https://reviews.llvm.org/D155966
Slava Zakharin [Fri, 21 Jul 2023 19:11:51 +0000 (12:11 -0700)]
[flang][hlfir] Added missing fir.convert for i1 result of hlfir.dot_product.
Some operations using the result of hlfir.dot_product can tolerate
that the type of the result changes from !fir.logical to i1 during
intrinsics lowering, but some won't. I added a separate LIT case with
fir.store to mimic one of the nag tests.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D155914
Slava Zakharin [Fri, 21 Jul 2023 19:11:42 +0000 (12:11 -0700)]
[flang][hlfir] Preserve polymorphism for the result of hlfir.transpose.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D155912
Slava Zakharin [Fri, 21 Jul 2023 19:11:25 +0000 (12:11 -0700)]
[NFC][flang] Distinguish MATMUL and MATMUL-TRANSPOSE printouts.
When MatmulTranspose reports incorrect shapes of the arguments
it cannot represent itself as MATMUL, because the reading
of the first argument's shape will be confusing.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D155911
Peter Klausler [Tue, 18 Jul 2023 20:31:23 +0000 (13:31 -0700)]
[flang] Stricter checking of DIM= arguments to LBOUND/UBOUND/SIZE
DIM= arguments with constant values can be checked for validity
even when other arguments to an intrinsic function can't be
folded. Handle errors with assumed-rank arguments as well.
Differential Revision: https://reviews.llvm.org/D155964
Peter Klausler [Tue, 18 Jul 2023 21:53:21 +0000 (14:53 -0700)]
[flang] Finalize &/or destroy ABSTRACT types
The runtime type information tables always flag ABSTRACT types as
needing neither destruction in general nor finalization in particular.
This is incorrect. Although an ABSTRACT type may not itself have
a FINAL procedure -- its argument cannot be polymorphic, but
ABSTRACT types in declarations must always be so -- it can still
have finalizable components &/or components requiring deallocation.
Differential Revision: https://reviews.llvm.org/D155965
Noah Goldstein [Fri, 21 Jul 2023 18:31:47 +0000 (13:31 -0500)]
[InstCombine] If there is a known bit, transform is_pow2 check to just check for any other bits
In `ctpop(X) eq/ne 1` or `ctpop(X) ugt/ule 1`, if any bit of `X` is
known to be set, we can skip `ctpop` and just test whether any other
bit of `X` is set. If one is, `X` is not a power of 2. If none is,
`X` is a power of 2.
https://alive2.llvm.org/ce/z/eLMJgU
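The idea can be checked with a small brute-force model. This is a Python sketch under the assumption that "known bit" means a single bit known to be set; `is_pow2_given_known_bit` and `check_all` are hypothetical names, not InstCombine APIs.

```python
def is_pow2_given_known_bit(x: int, known_one_bit: int) -> bool:
    """Decide ctpop(x) == 1 without popcount, given one bit known to be set."""
    # Preconditions, assumed to be established by a known-bits analysis:
    # `known_one_bit` is a single bit, and that bit is set in `x`.
    assert known_one_bit != 0 and known_one_bit & (known_one_bit - 1) == 0
    assert x & known_one_bit
    # x is a power of 2 exactly when no other bit is set.
    return (x & ~known_one_bit) == 0


def check_all(width: int = 8) -> bool:
    """Compare against a popcount-based test for every x and every set bit."""
    for x in range(1, 1 << width):
        for i in range(width):
            bit = 1 << i
            if x & bit:
                expected = bin(x).count("1") == 1
                if is_pow2_given_known_bit(x, bit) != expected:
                    return False
    return True
```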
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D152677
Noah Goldstein [Mon, 12 Jun 2023 03:38:46 +0000 (22:38 -0500)]
[InstCombine] Add tests for ispow2 comparisons with a known bit; NFC
Differential Revision: https://reviews.llvm.org/D152676
Noah Goldstein [Sun, 11 Jun 2023 21:13:30 +0000 (16:13 -0500)]
[InstCombine] Canonicalize `(X^(X-1)) u{ge,lt} X` as pow2 test
https://alive2.llvm.org/ce/z/T8osF6
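A brute-force check of the pattern, as a Python sketch with hypothetical helper names. In this model `(X^(X-1)) uge X` holds exactly when `X` has at most one bit set (which includes zero); the precise canonical form emitted by the patch is not stated in this message.

```python
def xor_pattern_uge(x: int, width: int) -> bool:
    """Evaluate (x ^ (x - 1)) u>= x with wrap-around at `width` bits."""
    mask = (1 << width) - 1
    return ((x ^ ((x - 1) & mask)) & mask) >= x


def popcount_at_most_one(x: int) -> bool:
    return bin(x).count("1") <= 1


def patterns_agree(width: int = 8) -> bool:
    """Check both predicates on every `width`-bit value."""
    return all(
        xor_pattern_uge(x, width) == popcount_at_most_one(x)
        for x in range(1 << width)
    )
```

The reason is that `x ^ (x - 1)` is a mask covering the lowest set bit and everything below it, so it can only reach `x` when `x` has no bits above its lowest set bit.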
Differential Revision: https://reviews.llvm.org/D152673
Noah Goldstein [Sun, 11 Jun 2023 21:41:46 +0000 (16:41 -0500)]
[InstCombine] Add tests for canonicalizing `(X^(X-1)) u{ge,lt} X` as pow2 test; NFC
Differential Revision: https://reviews.llvm.org/D152672
Peter Klausler [Tue, 18 Jul 2023 16:32:33 +0000 (09:32 -0700)]
[flang] Support implicit global external as procedure pointer target
A name that has been used to reference an undeclared global external
procedure should be accepted as the target of a procedure pointer
assignment statement.
Fixes llvm-test-suite/Fortran/gfortran/regression/proc_ptr_45.f90.
Differential Revision: https://reviews.llvm.org/D155963
Peter Klausler [Mon, 17 Jul 2023 23:35:34 +0000 (16:35 -0700)]
[flang] Compare component types in AreSameComponent()
The subroutine AreSameComponent() of the predicate AreSameDerivedType()
had a TODO about checking component types that needed completion in order
to properly detect that two specific procedures of a generic are
distinguishable in the llvm-test-suite/Fortran/gfortran/regression
test import7.f90.
Differential Revision: https://reviews.llvm.org/D155962
Rahman Lavaee [Mon, 17 Jul 2023 14:23:42 +0000 (07:23 -0700)]
[llvm-objdump] Use BBEntry::BBID to represent basic block numbers.
Reviewed By: aidengrossman, mtrofin, JestrTulip
Differential Revision: https://reviews.llvm.org/D155464
Joseph Huber [Fri, 21 Jul 2023 16:18:13 +0000 (11:18 -0500)]
[libc] Disable `DecodeInOtherBases` test on GPU targets
This test is excessively slow on GPU targets, taking anywhere between 5
and 60 seconds to complete each time it's run. See
https://lab.llvm.org/buildbot/#/builders/55/builds/52203/steps/12/logs/stdio
for an example on the NVPTX buildbot. Simply disable testing this on the
GPU for now.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D155979
Simon Pilgrim [Fri, 21 Jul 2023 18:10:06 +0000 (19:10 +0100)]
[X86] combineBitcastvxi1 - don't prematurely create PACKSS nodes.
Similar to Issue #63710 - by truncating the v8i16 result with a PACKSS node before type legalization, we fail to make use of various folds that rely on TRUNCATE nodes.
This required tweaks to LowerTruncateVecPackWithSignBits to recognise when the truncation source has been widened and to more closely match combineVectorSignBitsTruncation wrt truncating with PACKSS/PACKUS on AVX512 targets.
One of the last stages before we can finally get rid of combineVectorSignBitsTruncation.
Simon Pilgrim [Fri, 21 Jul 2023 17:16:11 +0000 (18:16 +0100)]
[X86] truncateVectorWithPACK - avoid concat_vectors(extract_subvector(pack()),extract_subvector(pack())) for sub-128 bit vectors
As we start using this after type legalization, we must avoid creating concat_vectors nodes so late.
Stanislav Mekhanoshin [Thu, 20 Jul 2023 20:12:37 +0000 (13:12 -0700)]
[AMDGPU] Remove std::optional from VOPD::ComponentProps. NFC.
This class has to be fast and efficient with a trivial copy
constructor.
Differential Revision: https://reviews.llvm.org/D155881
Fangrui Song [Fri, 21 Jul 2023 17:28:52 +0000 (10:28 -0700)]
[test] Unsupport CodeGenCXX/destructors for LLVM_ENABLE_REVERSE_ITERATION builds
_ZN5test312_GLOBAL__N_11CD2Ev and _ZN5test312_GLOBAL__N_11DD0Ev are
swapped in LLVM_ENABLE_REVERSE_ITERATION builds. Unsupport for now.
Uday Bondhugula [Thu, 20 Jul 2023 16:20:13 +0000 (21:50 +0530)]
NFC. Move remaining affine/memref test cases into respective dialect dirs
Move a bunch of lingering test cases from test/Transforms/ into
test/Dialect/Affine and MemRef.
Differential Revision: https://reviews.llvm.org/D155855
Simon Pilgrim [Fri, 21 Jul 2023 16:48:11 +0000 (17:48 +0100)]
[X86] Add isUpperSubvectorUndef helper to simplify recognition of vectors widened with undef upper subvectors. NFC.
Arthur Eubanks [Fri, 21 Jul 2023 16:47:08 +0000 (09:47 -0700)]
[Sanitizers][Darwin][Test] Mark symbolize_pc test on Darwin/TSan+UBSan as UNSUPPORTED
Followup to https://reviews.llvm.org/rG760c208f6ff9e97a9a11523c00874a1eec4f876b which XFAIL'd them, but they pass in some configurations.
Peter Klausler [Mon, 17 Jul 2023 16:42:47 +0000 (09:42 -0700)]
[flang] Accept an assumed-rank array as operand of ASSOCIATED()
The ASSOCIATED() intrinsic was mistakenly defined in the intrinsic
function table as requiring operands of known rank, which unintentionally
prevented assumed-rank dummy arguments from being tested.
Fixes llvm-test-suite/Fortran/gfortran/regression/pr88932.f90.
Differential Revision: https://reviews.llvm.org/D155498
Sylvestre Ledru [Fri, 21 Jul 2023 16:23:14 +0000 (18:23 +0200)]
clang/Debian: add Debian Trixie now that it is in unstable
Lorenzo Chelini [Fri, 21 Jul 2023 11:28:45 +0000 (13:28 +0200)]
[MLIR][Linalg] Preserve DPS when decomposing Softmax
Preserve destination passing style (DPS) when decomposing
`linalg.Softmax`; instead of creating a new empty, which may materialize
as a new buffer after bufferization, use the result directly.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D155942
Alex Bradbury [Fri, 21 Jul 2023 15:37:50 +0000 (16:37 +0100)]
[RISCV][NFC] Add RISCVSubtarget field to RISCVExpandPseudo and RISCVPreRAExpandPseudo
To my eye, it's cleaner to just get hold of STI in runOnMachineFunction
(as we do already for InstrInfo) and then access the field as needed
rather than to have repeated lookup code in the member functions or
helpers that need it.
Differential Revision: https://reviews.llvm.org/D155840
Joseph Huber [Fri, 21 Jul 2023 15:34:09 +0000 (10:34 -0500)]
[libc] Treat the locks array as a bitfield
Currently we keep an internal buffer of device memory that is used to
indicate ownership of a port. Since we only use this as a single bit we
can simply turn this into a bitfield. I did this manually rather than
having a separate type as we need very special handling of the masks
used to interact with the locks.
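A rough picture of one lock per bit, as a single-threaded Python sketch. `LockBits` and `BITS_PER_WORD` are hypothetical; the real code works on device memory and needs atomic operations and lane masks, which this model ignores.

```python
BITS_PER_WORD = 32  # assumed word size for this sketch


class LockBits:
    """One lock bit per port, packed into integer words."""

    def __init__(self, num_ports: int):
        self.words = [0] * ((num_ports + BITS_PER_WORD - 1) // BITS_PER_WORD)

    @staticmethod
    def _slot(port: int):
        # Word index and single-bit mask for this port.
        return port // BITS_PER_WORD, 1 << (port % BITS_PER_WORD)

    def try_lock(self, port: int) -> bool:
        word, mask = self._slot(port)
        if self.words[word] & mask:
            return False  # already owned
        self.words[word] |= mask
        return True

    def unlock(self, port: int) -> None:
        word, mask = self._slot(port)
        self.words[word] &= ~mask
```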
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D155511
Fangrui Song [Fri, 21 Jul 2023 15:46:51 +0000 (08:46 -0700)]
[Support] Implement LLVM_ENABLE_REVERSE_ITERATION for StringMap
ProgrammersManual.html says
> StringMap iteration order, however, is not guaranteed to be deterministic, so any uses which require that should instead use a std::map.
This patch makes -DLLVM_REVERSE_ITERATION=on (currently
-DLLVM_ENABLE_REVERSE_ITERATION=on works as well) shuffle StringMap
iteration order (actually flipping the hash so that elements not in the
same bucket are reversed) to catch violations, similar to D35043 for
DenseMap. This should help change the hash function (e.g., D142862,
D155781).
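The kind of violation this catches can be illustrated with a toy Python model. `iterate`, `emit_order_dependent`, and `emit_deterministic` are hypothetical names; the `reverse` flag only stands in for a reverse-iteration build and is not how StringMap is implemented.

```python
def iterate(table, reverse):
    """Yield keys in the table's order, or reversed to mimic the test mode."""
    keys = list(table)
    return reversed(keys) if reverse else iter(keys)


def emit_order_dependent(table, reverse=False):
    # Output follows iteration order, so it changes with the build mode.
    return [name for name in iterate(table, reverse)]


def emit_deterministic(table, reverse=False):
    # Sorting first makes the output independent of iteration order.
    return sorted(iterate(table, reverse))
```

A test that compares the output of the first function against a fixed expectation passes in one mode and fails in the other, which is the signal the reverse-iteration build is meant to produce.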
With a lot of fixes, there are still some violations. This patch
implements the "reverse_iteration" lit feature to skip such tests.
Eventually we should remove this feature.
`ninja check-{llvm,clang,clang-tools}` are clean with
`#define LLVM_ENABLE_REVERSE_ITERATION 1`.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D155789
Mikhail Goncharov [Fri, 21 Jul 2023 15:38:01 +0000 (17:38 +0200)]
[bazel] update config.h.cmake
Fangrui Song [Fri, 21 Jul 2023 15:37:58 +0000 (08:37 -0700)]
[RISCV] Allow delayed decision for ADD/SUB relocations
For a label difference `A-B` in assembly, if A and B are separated by a
linker-relaxable instruction, we should emit a pair of ADD/SUB
relocations (e.g. R_RISCV_ADD32/R_RISCV_SUB32,
R_RISCV_ADD64/R_RISCV_SUB64).
However, the decision is made upfront at parsing time with inadequate
heuristics (`requiresFixups`). As a result, the LLVM integrated assembler
incorrectly suppresses R_RISCV_ADD32/R_RISCV_SUB32 for the following
code:
```
// Simplified from a workaround https://android-review.googlesource.com/c/platform/art/+/2619609
// Both end and begin are not defined yet. We decide ADD/SUB relocations upfront and don't know they will be needed.
.4byte end-begin
begin:
call foo
end:
```
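To see why the difference cannot be decided upfront, the layout can be modeled in a few lines. This is a toy Python model, not assembler code: `label_offsets` and `end_minus_begin` are hypothetical helpers, and the sizes are assumptions (`call foo` taken as 8 bytes before linker relaxation and 4 bytes after it).

```python
def label_offsets(items):
    """items: list of ("label", name) or ("insn", size_in_bytes) entries."""
    offsets, pc = {}, 0
    for kind, payload in items:
        if kind == "label":
            offsets[payload] = pc
        else:
            pc += payload
    return offsets


def end_minus_begin(call_size):
    layout = [
        ("insn", 4),          # the .4byte directive that holds end-begin
        ("label", "begin"),
        ("insn", call_size),  # linker-relaxable `call foo`
        ("label", "end"),
    ]
    offsets = label_offsets(layout)
    return offsets["end"] - offsets["begin"]
```

In this model `end_minus_begin(8)` and `end_minus_begin(4)` differ, so a value folded at assembly time would be stale after relaxation, whereas an ADD/SUB relocation pair lets the linker compute it from the final addresses.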
To fix the bug, make two primary changes:
* Delete `requiresFixups` and the overridden emitValueImpl (from D103539).
This deletion requires accurate evaluateAsAbsolute (D153097).
* In MCAssembler::evaluateFixup, call handleAddSubRelocations to emit
ADD/SUB relocations.
However, there is a remaining issue in
MCExpr.cpp:AttemptToFoldSymbolOffsetDifference. With MCAsmLayout, we may
incorrectly fold A-B even when A and B are separated by a
linker-relaxable instruction. This deficiency is acknowledged (see
D153097), but was previously bypassed by eagerly emitting ADD/SUB using
`requiresFixups`. To address this, we partially reintroduce `canFold` (from
D61584, removed by D103539).
Some expressions (e.g. .size and .fill) need to take the `MCAsmLayout`
code path in AttemptToFoldSymbolOffsetDifference, avoiding relocations
(weird, but matching GNU assembler and needed to match user
expectation). Switch to evaluateKnownAbsolute to leverage the `InSet`
condition.
As a bonus, this change allows for the removal of some relocations for
the FDE `address_range` field in the .eh_frame section.
riscv64-64b-pcrel.s contains the main test.
Add a linker relaxable instruction to dwarf-riscv-relocs.ll to test what
it intends to test.
Merge fixups-relax-diff.ll into fixups-diff.ll.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D155357
Phoebe Wang [Fri, 21 Jul 2023 15:29:11 +0000 (23:29 +0800)]
Revert "[X86][BF16] Do not scalarize masked load for BF16 when we have BWI"
This reverts commit
ca1c05208ed35ba72869c65ad773b2cca4bbd360.
It caused a Buildbot failure: https://lab.llvm.org/buildbot#builders/220/builds/24870
Andrzej Warzynski [Fri, 21 Jul 2023 09:47:36 +0000 (10:47 +0100)]
[flang][nfc] Clarify the usage of llvmArgs and mlirArgs
Differential Revision: https://reviews.llvm.org/D155931
Phoebe Wang [Fri, 21 Jul 2023 15:18:38 +0000 (23:18 +0800)]
[X86][BF16] Do not scalarize masked load for BF16 when we have BWI
Fixes #63017
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D155952
Guray Ozen [Fri, 21 Jul 2023 14:59:15 +0000 (16:59 +0200)]
[mlir][nvgpu] Set useDefaultAttributePrinterParser
Differential Revision: https://reviews.llvm.org/D155959
Jakub Kuderski [Fri, 21 Jul 2023 14:59:16 +0000 (10:59 -0400)]
Revert "[mlir][spirv] Add D155747 to `.git-blame-ignore-revs`"
This reverts commit
b8a20658fee019fe9126a29f930ddd5dedec51ff.
This does not preserve the line history of cut-and-pasted code like I
expected.
Jay Foad [Fri, 21 Jul 2023 14:43:11 +0000 (15:43 +0100)]
[ARM] Extend regression test for D154281
Add a test case with a larger call frame which does not satisfy
ARMFrameLowering::hasReservedCallFrame.
Nikita Popov [Fri, 21 Jul 2023 14:44:38 +0000 (16:44 +0200)]
[FunctionAttrs] Add tests for PR63936 (NFC)
Maciej Gabka [Thu, 20 Jul 2023 08:31:48 +0000 (08:31 +0000)]
Add missing SLEEF mappings to scalable vector functions for log2 and log2f
In the original commit adding SLEEF mappings, https://reviews.llvm.org/D146839
mappings for log2/log2f were missing.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D155801
Nikita Popov [Fri, 21 Jul 2023 13:24:08 +0000 (15:24 +0200)]
[ValueTracking] Check non-zero operator before dominating condition (NFC)
Prefer checking for non-zero operator before non-zero via
dominating conditions. This is to make sure we don't have
compile-time regressions when special cases that are currently
part of isKnownNonZero() get moved into isKnownNonZeroFromOperator().
Simon Pilgrim [Fri, 21 Jul 2023 13:44:03 +0000 (14:44 +0100)]
[DAG] SimplifyDemandedBits - call ComputeKnownBits for constant non-uniform ISD::SRL shift amounts
We only attempted to determine KnownBits for uniform constant shift amounts, but ComputeKnownBits is able to handle some non-uniform cases as well that we can use as a fallback.
Matthias Springer [Fri, 21 Jul 2023 13:35:53 +0000 (15:35 +0200)]
[mlir][linalg] MapCopyToThreadsOp: Support tensor.pad
Also return the generated loop op.
Differential Revision: https://reviews.llvm.org/D155950
Nikita Popov [Fri, 21 Jul 2023 13:23:02 +0000 (15:23 +0200)]
[ValueTracking] Extract isKnownNonZeroFromOperator() (NFC)
Split off the primary part of the isKnownNonZero() implementation,
in the same way it is done for computeKnownBits(). This makes it
easier to reorder different parts of isKnownNonZero().
Maciej Gabka [Fri, 21 Jul 2023 13:50:10 +0000 (13:50 +0000)]
Revert "[TLI][AArch64] Add missing SLEEF mappings to scalable vector functions for log2 and log2f"
This reverts commit
791c89600aaa288d7066aea95a1e06cd6d61b2e3.
Maciej Gabka [Thu, 20 Jul 2023 08:31:48 +0000 (08:31 +0000)]
[TLI][AArch64] Add missing SLEEF mappings to scalable vector functions for log2 and log2f
In the original commit adding SLEEF mappings, https://reviews.llvm.org/D146839
mappings for log2/log2f were missing.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D155623
Matthias Springer [Fri, 21 Jul 2023 13:29:16 +0000 (15:29 +0200)]
[mlir][linalg] BufferizeToAllocationOp: Add option to materialize buffers for operands
Add an option that does not bufferize the targeted op itself, but just materializes a buffer for the destination operands. This is useful for partial bufferization of complex ops such as `scf.forall`, which need special handling (and an analysis of the region).
Differential Revision: https://reviews.llvm.org/D155946
Matthias Springer [Fri, 21 Jul 2023 13:12:52 +0000 (15:12 +0200)]
[mlir][transform] Add `apply_cse` option to `transform.apply_patterns` op
Applying the canonicalizer and CSE in an interleaved fashion is useful after bufferization (and maybe other transforms) to fold away self copies.
Differential Revision: https://reviews.llvm.org/D155933
Daniel Krupp [Wed, 19 Jul 2023 12:01:53 +0000 (14:01 +0200)]
[clang][analyzer]Fix non-effective taint sanitation
There was a bug in the alpha.security.taint.TaintPropagation checker
in the Clang Static Analyzer.
Taint filtering could only sanitize const arguments.
After this patch, taint filtering is also effective
on non-const parameters.
Differential Revision: https://reviews.llvm.org/D155848
Corentin Jabot [Fri, 7 Jul 2023 08:58:13 +0000 (10:58 +0200)]
[Clang] Diagnose jumps into statement expressions
Such jumps are not allowed by GCC and allowing them
can lead to situations where we jump into unevaluated
statements.
Fixes #63682
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D154696
Jie Fu [Fri, 21 Jul 2023 12:45:59 +0000 (20:45 +0800)]
[mlir][nvgpu] Ignore -Wunused-function in NVGPUDialect.cpp (NFC)
In file included from /Users/jiefu/llvm-project/mlir/lib/Dialect/NVGPU/IR/NVGPUDialect.cpp:363:
/Users/jiefu/llvm-project/build-Release/tools/mlir/include/mlir/Dialect/NVGPU/IR/NVGPUAttrDefs.cpp.inc:22:36: error: unused function 'generatedAttributeParser' [-Werror,-Wunused-function]
static ::mlir::OptionalParseResult generatedAttributeParser(::mlir::AsmParser &parser, ::llvm::StringRef *mnemonic, ::mlir::Type type, ::mlir::Attribute &value) {
^
/Users/jiefu/llvm-project/build-Release/tools/mlir/include/mlir/Dialect/NVGPU/IR/NVGPUAttrDefs.cpp.inc:46:30: error: unused function 'generatedAttributePrinter' [-Werror,-Wunused-function]
static ::mlir::LogicalResult generatedAttributePrinter(::mlir::Attribute def, ::mlir::AsmPrinter &printer) {
^
2 errors generated.
David Berard [Fri, 21 Jul 2023 12:24:47 +0000 (05:24 -0700)]
[llvm][SLP] Exit early if inputs to comparator are equal
**TL;DR:** This PR modifies a comparator. The comparator is used in a subsequent call to llvm::stable_sort. Sorting comparators should follow strict weak ordering - in particular, (x < x) should return false. This PR adds a fix to avoid an infinite loop when the inputs to the comparator are equal.
**Details**:
Sometimes when two equivalent tensors are passed into the comparator, we encounter infinite looping (at https://github.com/llvm/llvm-project/blob/aae2eaae2cefd3132059925c4592276defdb1faa/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp#L4049)
Although it seems like this comparator will never be called with two equivalent pointers, some sanitizers, e.g. https://chromium.googlesource.com/chromiumos/third_party/gcc/+/refs/heads/stabilize-zako-5712.88.B/libstdc++-v3/include/bits/stl_algo.h#360, will add checks for (x < x). When this sanitizer is used with the current implementation, it triggers a comparator check for (x < x), which runs into the infinite loop.
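The property at stake can be sketched in Python. This only illustrates the early exit and the (x < x) check; `make_less`, `is_irreflexive`, and `stable_sort` are hypothetical helpers, and the real comparator in SLPVectorizer.cpp is more involved.

```python
from functools import cmp_to_key


def make_less(key):
    """Build a strict 'less than' comparator with an early exit."""
    def less(a, b):
        if a is b:
            return False  # early exit: x < x must be false
        return key(a) < key(b)
    return less


def is_irreflexive(less, items):
    """Mimic a debug-mode check that calls the comparator as less(x, x)."""
    return not any(less(x, x) for x in items)


def stable_sort(items, less):
    def cmp(a, b):
        if less(a, b):
            return -1
        if less(b, a):
            return 1
        return 0
    # sorted() is stable, so equal elements keep their input order.
    return sorted(items, key=cmp_to_key(cmp))
```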
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D155874
Kadir Cetinkaya [Wed, 14 Sep 2022 08:07:15 +0000 (10:07 +0200)]
[clangd] Prefer definitions for gototype and implementation
Differential Revision: https://reviews.llvm.org/D133843
Simon Pilgrim [Fri, 21 Jul 2023 11:26:45 +0000 (12:26 +0100)]
[X86] matchBinaryShuffle - match PACKUS for v2i64 -> v4i32 shuffle truncation patterns.
Handle PACKUSWD on +SSE41 targets, or fallback to PACKUSBW on any +SSE2 target
Simon Pilgrim [Fri, 21 Jul 2023 10:56:35 +0000 (11:56 +0100)]
[X86] Add packus.ll test coverage
Similar to the existing packss.ll tests
Simon Pilgrim [Thu, 20 Jul 2023 13:31:02 +0000 (14:31 +0100)]
[X86] packss.ll - add SSE4.2 test coverage
Michael Halkenhaeuser [Thu, 20 Jul 2023 19:41:35 +0000 (15:41 -0400)]
[OpenMP][OMPT] Add 'Initialized' flag
We observed some overhead and unnecessary debug output.
This can be alleviated by (re-)introducing a boolean that indicates whether
OMPT initialization has been performed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D155186
Haojian Wu [Fri, 21 Jul 2023 12:12:44 +0000 (14:12 +0200)]
[clangd] Make the order of missing-include edits deterministic
Fixes https://github.com/llvm/llvm-project/issues/63995
Aaron Ballman [Fri, 21 Jul 2023 12:08:09 +0000 (08:08 -0400)]
Mark this test as unsupported on Windows systems
There is a strange issue happening with command line processing though. The
command line argument
--export-dynamic-symbol 'f*'
does not have the single quotes stripped on some Windows targets (but not
all). This causes the glob matching to fail, which means the test fails on
some Windows bots and passes on others.
This is expected to be a temporary measure to get bots back to green. I've not
found a commit that has caused a behavioral change that could be reverted
instead, so this could be an issue with lit or test machine configuration.
Benjamin Kramer [Fri, 21 Jul 2023 11:51:32 +0000 (13:51 +0200)]
[bazel] Tweak dependency spaghetti after
70c2e0618a0f3c09ed7149d88b4987b932eb6705
Benjamin Kramer [Fri, 21 Jul 2023 11:51:10 +0000 (13:51 +0200)]
Matthias Springer [Fri, 21 Jul 2023 11:33:54 +0000 (13:33 +0200)]
[mlir] Fix build after D155680
Alexander Belyaev [Fri, 21 Jul 2023 11:16:31 +0000 (13:16 +0200)]
[mlir] Update bazel build after rG70c2e0618a0f3c09ed7149d88b4987b932eb6705
Corentin Jabot [Fri, 21 Jul 2023 10:32:23 +0000 (12:32 +0200)]
[Clang] Fix access to an uninitialized variable
This fixes the spurious test failure introduced in
f9caa12328b2
Shivam Gupta [Fri, 21 Jul 2023 10:29:56 +0000 (15:59 +0530)]
Revert "[LIT] Added an option to llvm-lit to emit the necessary test coverage data, divided per test case"
This reverts commit
d8e26bccb3016d255298b7db78fe8bf05dd880b2.
The test cases are meant to run only when LLVM_INDIVIDUAL_TEST_COVERAGE is set.
Michael Halkenhaeuser [Wed, 8 Jun 2022 23:33:01 +0000 (16:33 -0700)]
[OpenMP] [OMPT] [6/8] Added callback support for target data operations, target submit, and target regions.
This patch adds support for invoking target callbacks but does not yet
invoke them. A new structure OmptInterface has been added that tracks
thread local states including correlation ids. This structure defines
methods that will be called from the device independent target library
with information related to a target entry point for which a callback
is invoked. These methods in turn use the callback functions maintained
by OmptDeviceCallbacksTy to invoke the tool supplied callbacks.
Depends on D124652
Patch from John Mellor-Crummey <johnmc@rice.edu>
With contributions from:
Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>
Differential Revision: https://reviews.llvm.org/D127365
Matthias Springer [Fri, 21 Jul 2023 10:10:36 +0000 (12:10 +0200)]
[mlir][bufferization] Remove cleanup pipeline from bufferization pass
To keep the pass simple, users should apply cleanup passes manually when necessary. In particular, `-cse -canonicalize` are often desirable to fold away self-copies that are created by the bufferization.
This addresses a comment in D120191.
Differential Revision: https://reviews.llvm.org/D155923
Pranav Taneja [Fri, 21 Jul 2023 10:03:37 +0000 (15:33 +0530)]
[AMDGPU] [NFC] Fixed a typo in SIShrinkInstructions.cpp
Reviewed By: pravinjagtap
Differential Revision: https://reviews.llvm.org/D155785
Jay Foad [Tue, 18 Jul 2023 12:38:31 +0000 (13:38 +0100)]
[AMDGPU][RFC] Update isLegalAddressingMode for GFX9 SMEM signed offsets
Differential Revision: https://reviews.llvm.org/D155587
Jay Foad [Thu, 20 Jul 2023 15:51:48 +0000 (16:51 +0100)]
[AMDGPU] Add tests for SMEM addressing modes in CodeGenPrepare
Differential Revision: https://reviews.llvm.org/D155854
Shivam Gupta [Fri, 21 Jul 2023 09:26:02 +0000 (14:56 +0530)]
[LIT] Added an option to llvm-lit to emit the necessary test coverage data, divided per test case
This patch is the first part of https://llvm.org/OpenProjects.html#llvm_patch_coverage.
We first define a new variable, LLVM_TEST_COVERAGE, which, when set, passes the --emit-coverage option to
llvm-lit; this sets a unique value of LLVM_PROFILE_FILE for each RUN. For example,
coverage data for the test case llvm/test/Analysis/AliasSet/memtransfer.ll will be emitted as
build/test/Analysis/AliasSet/memtransfer.profraw
Reviewed By: hnrklssn
Differential Revision: https://reviews.llvm.org/D154280
Ingo Müller [Thu, 20 Jul 2023 09:58:41 +0000 (09:58 +0000)]
[mlir][transform][structured][python] Allow str arg in match_op_names.
Allow the `names` argument in `MatchOp.match_op_names` to be of type
`str` in addition to `Sequence[str]`. In this case, the argument is
treated as a list with one name, i.e., it is possible to write
`MatchOp.match_op_names(..., "test.dummy")` instead of
`MatchOp.match_op_names(..., ["test.dummy"])`.
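The argument handling described here can be sketched as follows. `normalize_op_names` is a hypothetical helper for illustration, not the function in the MLIR Python bindings.

```python
from typing import List, Sequence, Union


def normalize_op_names(names: Union[str, Sequence[str]]) -> List[str]:
    """Treat a single name as a one-element list of names."""
    # A str is itself a sequence of characters, so it must be checked
    # first; otherwise "test.dummy" would be split into letters.
    if isinstance(names, str):
        return [names]
    return list(names)
```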
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D155807
Ingo Müller [Thu, 20 Jul 2023 08:57:22 +0000 (08:57 +0000)]
[mlir][linalg][transform] Extend diagnostics of FuseIntoContainingOp.
This patch extends the diagnostic output of `FuseIntoContainingOp` when
it fails to find the next producer by also providing the location of the
affected transform op.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D155803
Guray Ozen [Fri, 21 Jul 2023 09:12:56 +0000 (11:12 +0200)]
[mlir][nvgpu] Add `tma.create.descriptor` to create tensor map descriptor
The Op creates a tensor map descriptor object representing a tiled memory region. The descriptor is used by Tensor Memory Access (TMA). The `tensor` is the source tensor to be tiled. The `boxDimensions` is the size of the tiled memory region in each dimension.
The pattern here lowers `tma.create.descriptor` to a runtime function call that eventually calls the CUDA driver's `cuTensorMapEncodeTiled`. For more information, see:
https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TENSOR__MEMORY.html
Depends on D155453
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155680
Luke Lau [Mon, 17 Jul 2023 11:11:21 +0000 (12:11 +0100)]
[RISCV] Add SDNode patterns for vrol.[vv,vx] and vror.[vv,vx,vi]
These correspond to ROTL/ROTR nodes
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155439
Andrzej Warzynski [Fri, 21 Jul 2023 07:25:12 +0000 (07:25 +0000)]
[mlir][test] Add missing LIT config for `mlir-cpu-config` + emulator
Similarly to when using `lli`, make sure that when using
`mlir-cpu-runner` with an emulator, a full path to `mlir-cpu-runner` is
used. Otherwise `mlir-cpu-runner` won't be found and you will get the
following error:
```
Error while loading mlir-cpu-runner: No such file or directory
```
This patch should fix:
* https://lab.llvm.org/buildbot/#/builders/179
The breakage was originally introduced in
https://reviews.llvm.org/D155405.
Differential Revision: https://reviews.llvm.org/D155920
Alex Zinenko [Thu, 20 Jul 2023 12:24:44 +0000 (12:24 +0000)]
[mlir] allow region branch spec from parent op to itself
RegionBranchOpInterface did not allow the operation with regions to
specify itself as a successor. This implied that the control
is always transferred to a region before being transferred back to the
parent op. Since the region can only transfer the control back to the
parent op from a terminator, this transitively implied that the first
block of any region with a RegionBranchOpInterface is always executed
until the terminator can transfer the control flow back. This is
trivially false for any conditional-like operation that may or may not
execute the region, as well as for loop-like operations that may not
execute the body.
Remove the restriction from the interface description and update the
only transform that relied on it.
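The reasoning above can be illustrated with a toy successor graph. This is a sketch under stated assumptions: the dictionary encoding and the names `PARENT` and `region_always_executes` are made up for illustration and are not the RegionBranchOpInterface API.

```python
PARENT = "parent"


def region_always_executes(successors: dict, region: str) -> bool:
    """True if every path from entering the op to leaving it visits `region`."""
    # Walk from the parent op without ever entering `region`. If control can
    # get back to the parent that way, the region can be skipped entirely.
    seen = set()
    stack = [s for s in successors[PARENT] if s != region]
    while stack:
        node = stack.pop()
        if node == PARENT:
            return False
        if node in seen:
            continue
        seen.add(node)
        stack.extend(s for s in successors[node] if s != region)
    return True


# Old model: the parent op may only branch into its region.
old_loop = {PARENT: ["body"], "body": ["body", PARENT]}
# New model: the parent op may list itself as a successor (zero iterations).
new_loop = {PARENT: ["body", PARENT], "body": ["body", PARENT]}
```

With the old encoding, the body appears to run on every path; with the parent-to-parent edge, the analysis can see that the body may not run at all.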
See
https://discourse.llvm.org/t/rfc-region-control-flow-interfaces-should-encode-region-not-executed-correctly/72103.
Depends On: https://reviews.llvm.org/D155757
Reviewed By: Mogball, springerm
Differential Revision: https://reviews.llvm.org/D155822
Alex Zinenko [Wed, 19 Jul 2023 21:58:01 +0000 (21:58 +0000)]
[mlir] allow dense dataflow to customize call and region operations
Initial implementations of dense dataflow analyses feature special cases
for operations that have region- or call-based control flow by
leveraging the corresponding interfaces. This is not necessarily
sufficient as these operations may influence the dataflow state by
themselves as well as through the control flow. For example,
`linalg.generic` and similar operations have region-based control flow
and their own memory effects, so any memory-related analyses such as
last-writer require processing `linalg.generic` directly instead of, or
in addition to, the region-based flow.
Provide hooks to customize the processing of operations with region-
and call-based control flow in forward and backward dense dataflow
analysis. These hooks are triggered when control flow is transferred
between the "main" operation, i.e. the call or the region owner, and
another region. Such an approach allows the analyses to update the
lattice before and/or after the regions. In the `linalg.generic`
example, the reads from memory are interpreted as happening before the
body region and the writes to memory are interpreted as happening after
the body region. Using these hooks in generic analysis may require
introducing additional interfaces, but for now assume that the specific
analyses have special cases for the (rare) operations with call- and
region-based control flow that need additional processing.
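The "reads before the body, writes after the body" interpretation can be sketched with a toy last-writer analysis. Everything here is an assumption for illustration: ops are plain dictionaries, and `last_writers` is a hypothetical function, not the dense dataflow framework or its hooks.

```python
from typing import Dict, List, Optional


def last_writers(ops: List[dict]) -> Dict[str, Dict[str, Optional[str]]]:
    """For each op, record which op last wrote every buffer it reads."""
    state: Dict[str, str] = {}  # buffer name -> op that last wrote it
    seen: Dict[str, Dict[str, Optional[str]]] = {}

    def visit(op: dict) -> None:
        # "Before region" step: the op's own reads are processed first.
        seen[op["name"]] = {buf: state.get(buf) for buf in op.get("reads", [])}
        # Region-based flow: process the ops nested in the body, if any.
        for inner in op.get("body", []):
            visit(inner)
        # "After region" step: the op's own writes are applied last.
        for buf in op.get("writes", []):
            state[buf] = op["name"]

    for op in ops:
        visit(op)
    return seen


program = [
    {"name": "init", "writes": ["A"]},
    {"name": "generic", "reads": ["A"], "writes": ["B"],
     "body": [{"name": "inner", "reads": ["B"]}]},
    {"name": "use", "reads": ["B"]},
]
result = last_writers(program)
```

In this toy program, the op nested in the body does not yet see the write to `B`, because that write is applied only after the body has been processed.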
Reviewed By: Mogball, phisiart
Differential Revision: https://reviews.llvm.org/D155757
Luke Lau [Thu, 20 Jul 2023 11:59:13 +0000 (12:59 +0100)]
[RISCV] Remove VPatBinaryExtVL_WV_WX multiclass. NFC
It's no longer needed now that the sext/zext patterns have been merged.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155815
Luke Lau [Wed, 19 Jul 2023 13:13:50 +0000 (14:13 +0100)]
[RISCV] Add patterns for vnsr[a,l].wx where shift amount has different type than vector element
We're currently only matching scalar shift amounts where the type is the same
as the vector element type. But because only the bottom log2(2*SEW) bits are
used, only 7 bits will be used at most so we can use any scalar type >= i8.
This patch adds patterns for the case above, as well as for when the shift
amount type is the same as the widened element type and doesn't need to be extended.
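The bit-count arithmetic above can be checked with a short sketch. The function name `shift_amount_bits` is hypothetical; the SEW values in the loop are simply the usual element widths, with SEW=64 giving the 7-bit upper bound mentioned above.

```python
def shift_amount_bits(sew: int) -> int:
    """Number of low bits of the shift amount used for a given SEW."""
    # The source operand is 2*SEW bits wide, so log2(2*SEW) bits are used.
    # For a power of two n, log2(n) equals n.bit_length() - 1.
    return (2 * sew).bit_length() - 1


for sew in (8, 16, 32, 64):
    # Any scalar type of at least i8 holds all of these bits.
    print(f"SEW={sew}: {shift_amount_bits(sew)} bits")
```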
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155698
Luke Lau [Wed, 19 Jul 2023 12:18:36 +0000 (13:18 +0100)]
[RISCV] Add tests for vnsr[l,a].wx patterns that could be matched
These patterns of ([l,a]shr v, ([s,z]ext splat)) only pick up the cases where
the scalar has the same type as the vector element. However, since only the low
log2(2*SEW) bits of the scalar are read, we could use any scalar type that has
been extended.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155697
Ilya Leoshkevich [Thu, 20 Jul 2023 08:11:24 +0000 (10:11 +0200)]
[SystemZ] Allow symbols in immediate asm operands
Currently mentioning any symbols in immediate asm operands is not
supported, for example:
error: invalid operand for instruction
lghi %r4,foo_end-foo
The immediate problem is that is*Imm() and print*Operand() functions do
not accept MCExprs, but simply relaxing these checks is not enough:
after symbol addresses are computed, range checks need to run against
resolved values.
Add a number of SystemZ::FixupKind members for each kind of immediate
value and process them in SystemZMCAsmBackend::applyFixup(). Only
perform the range checks, do not change anything.
Adjust the tests: move previously failing cases like the one shown
above out of insn-bad.s.
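The "resolve first, then range-check without changing anything" idea can be sketched as follows. This is not the SystemZMCAsmBackend code: the helper names, the symbol table, and the 16-bit signed width used in the usage note are assumptions for illustration.

```python
def fits_signed(value: int, bits: int) -> bool:
    """Return True if value is representable as a signed `bits`-bit immediate."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def check_fixup(symbols: dict, minuend: str, subtrahend: str, bits: int) -> int:
    """Resolve `minuend - subtrahend` and range-check it, leaving it unchanged."""
    value = symbols[minuend] - symbols[subtrahend]
    if not fits_signed(value, bits):
        raise ValueError(f"operand out of range: {value} needs more than {bits} bits")
    return value
```

For example, with hypothetical addresses `{"foo": 0x100, "foo_end": 0x180}`, `check_fixup(symbols, "foo_end", "foo", 16)` returns 128, while a difference of 0x10000 would be rejected.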
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D154899
Alexander Belyaev [Fri, 21 Jul 2023 09:08:44 +0000 (11:08 +0200)]
Corentin Jabot [Mon, 3 Jul 2023 17:02:24 +0000 (19:02 +0200)]
[Clang] Fix constraint checking of non-generic lambdas.
A lambda call operator can be a templated entity -
and therefore have constraints while not being a function template
template<class T> void f() {
[]() requires false { }();
}
In that case, we would check the constraints of the call operator
which is non-viable. However, we would find a viable candidate:
the conversion operator to function pointer, and use it to
perform a surrogate call.
These constraints were not checked because:
* We never check the constraints of surrogate functions
* The lambda conversion operator has no constraints.
From the wording, it is not clear what the intent is but
it seems reasonable to expect the constraints of the lambda conversion
operator to be checked and it is consistent with GCC and MSVC.
This patch also improves the diagnostics for constraint failure
on surrogate calls.
Fixes #63181
Reviewed By: #clang-language-wg, aaron.ballman
Differential Revision: https://reviews.llvm.org/D154368
Nikita Popov [Fri, 21 Jul 2023 08:39:45 +0000 (10:39 +0200)]
[X86] Expand constant expressions in test (NFC)
Guray Ozen [Fri, 21 Jul 2023 08:36:29 +0000 (10:36 +0200)]
[mlir][nvgpu] Improve finding module Op to for `mbarrier.create`
The current transformation expects the module op to be two levels higher; however, this is not always the case. This work searches for the module op in a while loop.
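The search can be sketched as a walk up the parent chain. The `Op` class below is a toy stand-in with a parent pointer, not the MLIR API, and `find_enclosing_module` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Op:
    """Toy stand-in for an operation with a parent pointer (not the MLIR API)."""
    name: str
    parent: Optional["Op"] = None


def find_enclosing_module(op: Op) -> Optional[Op]:
    """Walk up the parent chain until a module op is found, at any depth."""
    current = op.parent
    while current is not None and current.name != "builtin.module":
        current = current.parent
    return current
```

Unlike a fixed "two levels up" lookup, the loop works regardless of how deeply the op is nested and returns `None` when no module encloses it.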
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155825
Guray Ozen [Thu, 20 Jul 2023 11:56:11 +0000 (13:56 +0200)]
[mlir][nvgpu] Add nvgpu.tma.async.load and nvgpu.tma.descriptor
This work adds the `nvgpu.tma.async.load` Op, which requests a TMA load asynchronously using an mbarrier object.
It also creates the nvgpu.tma.descriptor type. The type is supposed to be created by the `cuTensorMapEncodeTiled` CUDA driver API.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155453
Alex Zinenko [Thu, 20 Jul 2023 13:53:51 +0000 (13:53 +0000)]
[mlir] remove RegionBranchOpInterface from linalg ops
Linalg structure ops do not implement control flow in the way expected
by RegionBranchOpInterface, and the interface implementation isn't
actually used anywhere. The presence of this interface without correct
implementation is confusing for, e.g., dataflow analyses.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D155841
Nikita Popov [Fri, 21 Jul 2023 08:12:05 +0000 (10:12 +0200)]
[LoopIdiom] Regenerate test checks (NFC)
Nikita Popov [Fri, 21 Jul 2023 08:11:35 +0000 (10:11 +0200)]
[InstCombine] Regenerate test checks (NFC)
Nikita Popov [Thu, 20 Jul 2023 12:31:18 +0000 (14:31 +0200)]
Reapply [IR] Mark and constant expressions as undesirable
Reapply after fixing an issue in canonicalizeLogicFirst() exposed
by this change (218f97578b26f7a89f7f8ed0748c31ef0181f80a).
-----
In preparation for removing support for and expressions, mark them
as undesirable. As such, we will no longer implicitly create such
expressions, but they still exist.
Haojian Wu [Fri, 21 Jul 2023 08:09:44 +0000 (10:09 +0200)]
[bazel] add missing dep for llvm/unittests:frontend_tests
Nikita Popov [Fri, 21 Jul 2023 07:58:22 +0000 (09:58 +0200)]
[IR] Accept non-Instruction in BinaryOperator::CreateWithCopiedFlags() (NFC)
The underlying copyIRFlags() API accepts arbitrary values and can
work with flags on operators (i.e. instructions or constant
expressions). Remove the arbitrary limitation that the
CreateWithCopiedFlags() API imposes, so we can directly pass through
values matched by PatternMatch, which can be constant expressions.
The attached test case works fine now, but would crash with an
upcoming change to not produce and constant expressions.
David Green [Fri, 21 Jul 2023 07:48:53 +0000 (08:48 +0100)]
[AArch64] Basic vector bswap costs
This adds some basic vector bswap costs, providing the type is supported.
Differential Revision: https://reviews.llvm.org/D155806