Alexey Bataev [Tue, 10 Jan 2023 12:28:21 +0000 (04:28 -0800)]
[SLP]Add shuffling of extractelements to avoid extra costs/data movement.
If the scalar must be extracted and then used in the gather node,
instead we can emit shuffle instruction to avoid those extra
extractelements and vector-to-scalar and back data movement.
Part of D110978
Differential Revision: https://reviews.llvm.org/D141940
David Green [Mon, 20 Feb 2023 14:13:53 +0000 (14:13 +0000)]
[AArch64] More consistently use buildvector for zero and all-ones constants
The AArch64 backend will use legal BUILDVECTORs for zero vectors or all-ones
vectors, so during selection tablegen patterns can rely on immAllZerosV and
immAllOnesV pattern frags in patterns like vnot. It was not always consistent
though, which this patch attempts to fix by recognizing where constant splat +
insert vector element is used. The main outcome of this will be that full
vector movi v0.2d, #0000000000000000 will be used as opposed to movi d0, #0, as
per https://reviews.llvm.org/D53579. This helps simplify what tablegen will
see, to make pattern matching simpler.
Differential Revision: https://reviews.llvm.org/D144018
Florian Hahn [Mon, 20 Feb 2023 14:11:18 +0000 (14:11 +0000)]
[VPlan] Use usesScalars in shouldPack.
Suggested by @Ayal as follow-up improvement in D143864.
I was unable to find a case where this actually changes generated code,
but it unifies the code by using common infrastructure.
Kerry McLaughlin [Mon, 20 Feb 2023 11:00:47 +0000 (11:00 +0000)]
[SME2][AArch64] Add multi-multi multiply-add long long intrinsics
Adds intrinsics for the following SME2 instructions (2 & 4 vectors):
- smlall
- smlsll
- umlall
- umlsll
- usmlall
NOTE: These intrinsics are still in development and are subject to future changes.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D143277
Aaron Ballman [Mon, 20 Feb 2023 13:35:42 +0000 (08:35 -0500)]
Fix LLVM sphinx build
This fixes the issue found by:
https://lab.llvm.org/buildbot/#/builders/30/builds/32127
Caroline Concatto [Mon, 20 Feb 2023 12:21:38 +0000 (12:21 +0000)]
[IR] Add new intrinsics interleave and deinterleave vectors
This patch adds 2 new intrinsics:
; Interleave two vectors into a wider vector
<vscale x 4 x i64> @llvm.vector.interleave2.nxv2i64(<vscale x 2 x i64> %even, <vscale x 2 x i64> %odd)
; Deinterleave the odd and even lanes from a wider vector
{<vscale x 2 x i64>, <vscale x 2 x i64>} @llvm.vector.deinterleave2.nxv2i64(<vscale x 4 x i64> %vec)
The main motivator for adding these intrinsics is to support vectorization of
complex types using scalable vectors.
The intrinsics are kept simple by only supporting a stride of 2, which makes
them easy to lower and type-legalize. A stride of 2 is sufficient to handle
complex types which only have a real/imaginary component.
The format of the intrinsics matches how `shufflevector` is used in
LoopVectorize. For example:
using cf = std::complex<float>;
void foo(cf * dst, int N) {
for (int i=0; i<N; ++i)
dst[i] += cf(1.f, 2.f);
}
For this loop, LoopVectorize:
(1) Loads a wide vector (e.g. <8 x float>)
(2) Extracts odd lanes using shufflevector (leading to <4 x float>)
(3) Extracts even lanes using shufflevector (leading to <4 x float>)
(4) Performs the addition
(5) Interleaves the two <4 x float> vectors into a single <8 x float> using
shufflevector
(6) Stores the wide vector.
In this example, we can 1-1 replace shufflevector in (2) and (3) with the
deinterleave intrinsic, and replace the shufflevector in (5) with the
interleave intrinsic.
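As an illustration only (this is not the lowering itself), the lane semantics
described above can be modelled in plain C++. The helper names `interleave2`
and `deinterleave2` are hypothetical stand-ins for the two intrinsics, and
`std::vector` stands in for a vector register:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical scalar model of llvm.vector.interleave2: result lane 2*i comes
// from `even`, result lane 2*i+1 comes from `odd`.
template <typename T>
std::vector<T> interleave2(const std::vector<T> &even,
                           const std::vector<T> &odd) {
  assert(even.size() == odd.size() && "operands must have the same lane count");
  std::vector<T> wide;
  wide.reserve(even.size() * 2);
  for (std::size_t i = 0; i < even.size(); ++i) {
    wide.push_back(even[i]);
    wide.push_back(odd[i]);
  }
  return wide;
}

// Hypothetical scalar model of llvm.vector.deinterleave2: splits a wide vector
// into its even-indexed lanes (first) and its odd-indexed lanes (second).
template <typename T>
std::pair<std::vector<T>, std::vector<T>>
deinterleave2(const std::vector<T> &wide) {
  assert(wide.size() % 2 == 0 && "wide vector must have an even lane count");
  std::vector<T> even, odd;
  for (std::size_t i = 0; i < wide.size(); i += 2) {
    even.push_back(wide[i]);
    odd.push_back(wide[i + 1]);
  }
  return {even, odd};
}
```

In this model, deinterleaving a wide vector and interleaving the two halves
again gives back the original, which mirrors steps (2), (3) and (5).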
The SelectionDAG nodes might be extended to support higher strides (3, 4, etc)
as well in the future.
Similar to what was done for vector.splice and vector.reverse, the intrinsic
is lowered to a shufflevector when the type is fixed width, so as to benefit from
existing code that was written to recognize/optimize shufflevector patterns.
Note that this approach does not prevent us from adding new intrinsics for other
strides, or adding a more generic shuffle intrinsic in the future. It just solves
the immediate problem of being able to vectorize loops with complex math.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D141924
Max Kazantsev [Mon, 20 Feb 2023 11:38:07 +0000 (18:38 +0700)]
Revert "[AssumptionCache] caches @llvm.experimental.guard's"
This reverts commit f9599bbc7a3f831e1793a549d8a7a19265f3e504.
For some reason it caused us a huge compile time regression in downstream
workloads. Not sure whether the source of it is in upstream code or not.
Temporarily reverting until investigated.
Differential Revision: https://reviews.llvm.org/D142330
Mel Chen [Mon, 13 Feb 2023 13:28:42 +0000 (05:28 -0800)]
[LV] Harden the test of the minmax with index pattern. (NFC)
- Add test config: -force-vector-width=4 -force-vector-interleave=1
- New test case: The test case both returns the minimum value and the index.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D143905
Florian Hahn [Mon, 20 Feb 2023 10:53:45 +0000 (10:53 +0000)]
[VPlan] Move shouldPack outside of DEBUG ifdef.
This fixes a build failure with assertions disabled.
Simon Tatham [Thu, 16 Feb 2023 15:34:33 +0000 (15:34 +0000)]
[LowerTypeTests] Support generating Armv6-M jump tables. (reland)
[Originally committed as f6ddf7781471b71243fa3c3ae7c93073f95c7dff;
reverted in bbef38352fbade9e014ec97d5991da5dee306da7 due to test
breakage; now relanded with the Arm tests conditioned on
`arm-registered-target`]
The LowerTypeTests pass emits a jump table in the form of an
`inlineasm` IR node containing a string representation of some
assembly. It tests the target triple to see what architecture it
should be generating assembly for. But that's not good enough for
`Triple::thumb`, because the 32-bit PC-relative `b.w` branch
instruction isn't available in all supported architecture versions. In
particular, Armv6-M doesn't support that instruction (although the
similar Armv8-M Baseline does).
Most of this patch is concerned with working out whether the
compilation target is Armv6-M or not, which I'm doing by going through
all the functions in the module, retrieving a TargetTransformInfo for
each one, and querying it via a new method I've added to check its
SubtargetInfo. If any function's TTI indicates that it's targeting an
architecture supporting B.W, then we assume we're also allowed to use
B.W in the jump table.
The Armv6-M compatible jump table format requires a temporary
register, and therefore also has to use the stack in order to restore
that register.
Another consequence of this change is that jump tables on Arm/Thumb
are no longer always the same size. In particular, on an architecture
that supports Arm and Thumb-1 but not Thumb-2, the Arm and Thumb
tables are different sizes from //each other//. As a consequence,
``getJumpTableEntrySize`` can no longer base its answer on the target
triple's architecture: it has to take into account the decision that
``selectJumpTableArmEncoding`` made, which meant I had to move that
function to an earlier point in the code and store its answer in the
``LowerTypeTestsModule`` class.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D143576
Florian Hahn [Mon, 20 Feb 2023 10:28:24 +0000 (10:28 +0000)]
[VPlan] Replace AlsoPack field with shouldPack() method (NFC).
There is no need to update the AlsoPack field when creating
VPReplicateRecipes. It can be easily computed based on the VP def-use
chains when it is needed.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D143864
Matt Devereau [Mon, 20 Feb 2023 09:47:53 +0000 (09:47 +0000)]
[InstSimplify] Correct icmp_lshr test to use ult instead of slt
Nicolas Vasilache [Mon, 20 Feb 2023 09:40:18 +0000 (01:40 -0800)]
[mlir][linalg][TransformOps] Connect hoistRedundantVectorTransfers
Connect the hoistRedundantVectorTransfers functionality to the transform
dialect.
Authored-by: Quentin Colombet <quentin.colombet@gmail.com>
Differential Revision: https://reviews.llvm.org/D144260
Nikita Popov [Wed, 15 Feb 2023 11:14:55 +0000 (12:14 +0100)]
[InstCombine] Call simplifyLoadInst()
InstCombine is supposed to be a superset of InstSimplify, but
failed to invoke load simplification.
Unfortunately, this causes a minor compile-time regression, which
will be mitigated in a future commit.
Max Kazantsev [Mon, 20 Feb 2023 09:42:14 +0000 (16:42 +0700)]
[Test] Move test for D143726 to LICM
Seems that it's a more appropriate place to do this transform.
Nikita Popov [Mon, 20 Feb 2023 09:38:22 +0000 (10:38 +0100)]
[InstCombine] Add additional load folding tests (NFC)
These show that we currently fail to call load simplification from
InstCombine.
Matt Devereau [Tue, 31 Jan 2023 13:30:09 +0000 (13:30 +0000)]
[InstSimplify] Simplify icmp between Shl instructions of the same value
define i1 @compare_vscales() {
%vscale = call i64 @llvm.vscale.i64()
%vscalex2 = shl nuw nsw i64 %vscale, 1
%vscalex4 = shl nuw nsw i64 %vscale, 2
%cmp = icmp ult i64 %vscalex2, %vscalex4
ret i1 %cmp
}
This IR is currently emitted by LLVM. This icmp is redundant as this snippet
can be simplified to true or false as both operands originate from the same
@llvm.vscale.i64() call.
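A minimal sketch of why the comparison is redundant, under two assumptions of
mine: the shared value is nonzero (which holds for vscale) and both shifts
satisfy `nuw`. The check below is an exhaustive 8-bit model in plain C++, not
the InstSimplify code, and the function names are hypothetical:

```cpp
#include <cstdint>

// Returns true if `x << amt` keeps every set bit of an 8-bit x, i.e. the
// shift would satisfy the `nuw` (no unsigned wrap) flag.
bool shlIsNuw(std::uint8_t x, unsigned amt) {
  return ((static_cast<unsigned>(x) << amt) >> 8) == 0;
}

// Exhaustively checks, for 8-bit values, that for nonzero x and shift amounts
// a < b that both satisfy nuw, (x << a) ult (x << b) always holds.
bool shlCompareFoldsToTrue() {
  for (unsigned x = 1; x < 256; ++x)
    for (unsigned a = 0; a < 8; ++a)
      for (unsigned b = a + 1; b < 8; ++b) {
        const auto x8 = static_cast<std::uint8_t>(x);
        if (!shlIsNuw(x8, a) || !shlIsNuw(x8, b))
          continue; // a wrapping shift would be poison in the IR
        const unsigned lhs = x << a;
        const unsigned rhs = x << b;
        if (!(lhs < rhs))
          return false;
      }
  return true;
}
```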
Differential Revision: https://reviews.llvm.org/D142542
Max Kazantsev [Mon, 20 Feb 2023 08:48:05 +0000 (15:48 +0700)]
[SCEV] Canonicalize ext(min/max(x, y)) to min/max(ext(x), ext(y))
I stumbled over this while trying to improve our exit count work. These expressions
are equivalent for complementary signed/unsigned ext and min/max (including umin_seq),
but they are not canonicalized and SCEV cannot recognize them as the same.
The benefit of this canonicalization is that SCEV can prove some new equivalences which
it couldn't prove because of different forms. There is 1 test where trip count seems pessimized,
I could not directly figure out why, but it just seems an unrelated issue that we can fix.
Other changes seem neutral or positive to me.
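The equivalence can be checked by brute force on small types. The sketch below
is plain C++ rather than SCEV code and covers only the matching pairs
(zext with umin/umax, sext with smin/smax), widening 8-bit values to 16 bits;
the function names are hypothetical:

```cpp
#include <algorithm>
#include <cstdint>

// Zero- and sign-extension from 8 to 16 bits.
std::uint16_t zext(std::uint8_t v) { return v; }
std::int16_t sext(std::int8_t v) { return v; }

// Exhaustively checks ext(min/max(x, y)) == min/max(ext(x), ext(y)) for the
// unsigned pair (zext with umin/umax) and the signed pair (sext with smin/smax).
bool extCommutesWithMatchingMinMax() {
  for (int i = 0; i < 256; ++i)
    for (int j = 0; j < 256; ++j) {
      const auto ux = static_cast<std::uint8_t>(i);
      const auto uy = static_cast<std::uint8_t>(j);
      const auto sx = static_cast<std::int8_t>(i - 128);
      const auto sy = static_cast<std::int8_t>(j - 128);
      if (zext(std::min(ux, uy)) != std::min(zext(ux), zext(uy))) return false;
      if (zext(std::max(ux, uy)) != std::max(zext(ux), zext(uy))) return false;
      if (sext(std::min(sx, sy)) != std::min(sext(sx), sext(sy))) return false;
      if (sext(std::max(sx, sy)) != std::max(sext(sx), sext(sy))) return false;
    }
  return true;
}
```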
Differential Revision: https://reviews.llvm.org/D141481
Reviewed By: nikic
Kazu Hirata [Mon, 20 Feb 2023 08:58:29 +0000 (00:58 -0800)]
Migrate away from the soft-deprecated functions in APInt.h (NFC)
Note that those functions on the left hand side are soft-deprecated in
favor of those on the right hand side:
getMinSignedBits -> getSignificantBits
getNullValue -> getZero
isNullValue -> isZero
isOneValue -> isOne
Sameer Sahasrabuddhe [Mon, 20 Feb 2023 08:55:37 +0000 (14:25 +0530)]
[llvm][Uniformity] A phi with an undef argument is not always divergent
The uniformity analysis treated an undef argument to phi to be distinct from any
other argument, equivalent to calling PHINode::hasConstantValue() instead of
PHINode::hasConstantOrUndefValue(). Such a phi was reported as divergent. This
is different from the older divergence analysis which treats such a phi as
uniform. Fixed uniformity analysis to match the older behaviour.
The original behaviour was added to DivergenceAnalysis in D19013. But it is not
clear if relying on the undef value is safe. The defined values are not constant
per se; they just happen to be uniform and the non-constant uniform value may
not dominate the PHI.
Reviewed By: ruiling
Differential Revision: https://reviews.llvm.org/D144254
Valentin Clement [Mon, 20 Feb 2023 08:43:57 +0000 (09:43 +0100)]
[flang] Carry over the derived type from target in pointer remapping
When calling PointerAssociateRemapping the dynamic type information
from the target needs to be carried over to the pointer if any.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D143717
Kohei Asano [Mon, 20 Feb 2023 08:38:03 +0000 (09:38 +0100)]
[InstSimplify] Fold LoadInst for uniform constant global variables
Fold LoadInst for uniformly initialized constants, even if there
are non-constant GEP indices.
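The idea can be sketched outside of LLVM as follows. This is a simplified
model that assumes an array of `int` elements and an in-bounds access; the
helper `foldUniformLoad` is hypothetical and is not the InstSimplify
implementation:

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: if every element of `init` holds the same value, a load
// from any in-bounds index must produce that value, so the load can be folded
// even when the index is not a constant. Returns true and sets `folded` in
// that case; returns false and leaves `folded` untouched otherwise.
template <std::size_t N>
bool foldUniformLoad(const std::array<int, N> &init, int &folded) {
  static_assert(N > 0, "initializer must not be empty");
  for (int v : init)
    if (v != init[0])
      return false;
  folded = init[0];
  return true;
}
```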
Goal proof: https://alive2.llvm.org/ce/z/oZtVby
Motivated by https://github.com/rust-lang/rust/issues/107208
Differential Revision: https://reviews.llvm.org/D144184
David Spickett [Fri, 3 Feb 2023 11:45:05 +0000 (11:45 +0000)]
[libc][AArch64] Fix fullbuild when using G++/GCC
The libc uses some functions that GCC does not currently
implement, that come from Arm's ACLE header usually.
These are:
```
__arm_wsr64
__arm_rsr64
__arm_wsr
__arm_rsr
```
This issue was reported to us (https://github.com/llvm/llvm-project/issues/60473)
and I've then reported that back to GCC (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108642).
Even if these functions are added, clang has some non standard extensions
to them that gcc may not take. So we're looking at a fix in gcc 13 at best,
and that may not be enough for what we're doing with them.
So I've added ifdefs to use alternatives with gcc.
For handling the stack pointer, inline assembly is unfortunately the only option.
I have verified that the single mov is essentially what __arm_rsr64 generates.
For fpsr and fpcr the gcc devs suggested using
https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/AArch64-Built-in-Functions.html#AArch64-Built-in-Functions.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D143261
Kohei Asano [Mon, 20 Feb 2023 08:23:39 +0000 (09:23 +0100)]
[InstSimplify] Add additional load folding tests (NFC)
For D144184.
Tobias Gysi [Mon, 20 Feb 2023 07:46:33 +0000 (08:46 +0100)]
[mlir][llvm] Add atomic support to the StoreOp.
This revision adds atomic support to the StoreOp. It chooses
to print the atomic keywords together with the syncscope and
ordering arguments. The revision also implements verifiers to
ensure the constraints that apply to atomic store operations
are checked.
Depends on D144112
Reviewed By: Dinistro
Differential Revision: https://reviews.llvm.org/D144200
Kazu Hirata [Mon, 20 Feb 2023 07:56:52 +0000 (23:56 -0800)]
Use APInt::getSignificantBits instead of APInt::getMinSignedBits (NFC)
Note that getMinSignedBits has been soft-deprecated in favor of
getSignificantBits.
Chuanqi Xu [Mon, 20 Feb 2023 07:37:11 +0000 (15:37 +0800)]
[NFC] Remove the unused parameter in Sema::PushGlobalModuleFragment
The `IsImplicit` parameter should be removed since it is not used now.
Kazu Hirata [Mon, 20 Feb 2023 07:35:39 +0000 (23:35 -0800)]
[llvm] Use APInt::isAllOnes instead of isAllOnesValue (NFC)
Note that isAllOnesValue has been soft-deprecated in favor of
isAllOnes.
Chuanqi Xu [Mon, 20 Feb 2023 07:07:07 +0000 (15:07 +0800)]
[NFC] Remove unused Sema::DirectModuleImports
Sema::DirectModuleImports is not used now. Remove it for clarity.
Kazu Hirata [Mon, 20 Feb 2023 07:06:36 +0000 (23:06 -0800)]
Use APInt::isOne instead of APInt::isOneValue (NFC)
Note that isOneValue has been soft-deprecated in favor of isOne.
Kazu Hirata [Mon, 20 Feb 2023 06:54:23 +0000 (22:54 -0800)]
Use APInt::getAllOnes instead of APInt::getAllOnesValue (NFC)
Note that getAllOnesValue has been soft-deprecated in favor of
getAllOnes.
Kazu Hirata [Mon, 20 Feb 2023 06:42:01 +0000 (22:42 -0800)]
[llvm] Use APInt::getZero instead of APInt::getNullValue (NFC)
Note that APInt::getNullValue has been soft-deprecated in favor of
APInt::getZero.
Serguei Katkov [Mon, 20 Feb 2023 06:03:18 +0000 (13:03 +0700)]
[SimpleLoopUnswitch] Fix an assert in injectPendingInvariantConditions
Since canonicalizeForInvariantConditionInjection was introduced, the
in-loop successor may be the second successor.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D144361
Kazu Hirata [Mon, 20 Feb 2023 06:23:57 +0000 (22:23 -0800)]
Use APInt::isZero instead of APInt::isNullValue (NFC)
Note that APInt::isNullValue has been soft-deprecated in favor of
APInt::isZero.
Chuanqi Xu [Mon, 20 Feb 2023 05:58:28 +0000 (13:58 +0800)]
[Modules] Handle the visibility of GMF during the template instantiation
Close https://github.com/llvm/llvm-project/issues/60775
Previously, we marked all the declarations in the GMF as not visible
to other module units. But this is too strict, and users may run into
problems during template instantiation, as the above example shows.
The patch addresses the problem.
Max Kazantsev [Mon, 20 Feb 2023 05:39:33 +0000 (12:39 +0700)]
[SCEV] Support umin/smin in SCEVLoopGuardRewriter
Adds support for these SCEVs to cover more cases.
Differential Revision: https://reviews.llvm.org/D143259
Reviewed By: dmakogon, fhahn
Kazu Hirata [Mon, 20 Feb 2023 06:04:47 +0000 (22:04 -0800)]
Use APInt::count{l,r}_{zero,one} (NFC)
Craig Topper [Mon, 20 Feb 2023 05:43:15 +0000 (21:43 -0800)]
[RISCV] Add more tests for D144166. NFC
Adding load and store tests. Addressing post commit feedback.
Fangrui Song [Mon, 20 Feb 2023 05:39:47 +0000 (21:39 -0800)]
[LoopIdiomRecognize] Remove legacy pass
Following recent changes to remove non-core legacy passes.
Alex Brachet [Mon, 20 Feb 2023 03:57:23 +0000 (03:57 +0000)]
[Fuchsia] Use cleaner method of adding driver binary
Alex Brachet [Mon, 20 Feb 2023 03:33:29 +0000 (03:33 +0000)]
[Fuchsia] Fix driver build on Windows
Don't include llvm-driver when building for Windows
sstwcw [Mon, 20 Feb 2023 03:03:33 +0000 (03:03 +0000)]
[clang-format] Put ports on separate lines in Verilog module headers
New:
```
module mh1
(input var int in1,
input var in2, in3,
output tagged_st out);
endmodule
```
Old:
```
module mh1
(input var int in1, input var in2, in3, output tagged_st out);
endmodule
```
`getNextNonComment` was modified to return a non-const pointer because
we needed to use it that way in `verilogGroupDecl`.
The comment on line 2626 was a typo. We corrected it while modifying
the function.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D143825
Chuanqi Xu [Mon, 20 Feb 2023 02:26:41 +0000 (10:26 +0800)]
Recommit [Coroutines] Stop supporting std::experimental::coroutine_traits
As we discussed before, we should stop supporting
std::experimental::coroutine_traits in clang17. Now that clang16 has
branched, we can clean them up.
All the removed tests have been duplicated before.
Kai Luo [Mon, 20 Feb 2023 02:05:37 +0000 (10:05 +0800)]
[GISelEmitter][NFC] Correct path of GISel's td file in the comment.
`include/llvm/CodeGen/TargetGlobalISel.td` no longer exists.
Matt Arsenault [Mon, 23 Jan 2023 15:22:33 +0000 (11:22 -0400)]
AMDGPU: Restrict foldFreeOpFromSelect combine based on legal source mods
Provides a small code size savings for some f32 cases.
Alex Brachet [Mon, 20 Feb 2023 01:57:40 +0000 (01:57 +0000)]
Reland "[Fuchsia] Enable llvm-driver build".
The MacOS problem has been fixed. Additionally, don't enable the
driver build on Windows. We can look into enabling it later if
symlinks work better than I think on Windows.
Differential Revision: https://reviews.llvm.org/D144287
Matt Arsenault [Thu, 15 Dec 2022 00:23:55 +0000 (19:23 -0500)]
AMDGPU: Teach fneg combines that select has source modifiers
We do match source modifiers for f32 typed selects already, but the
combiner code was never informed of this.
A long time ago the documentation lied and stated that source
modifiers don't work for v_cndmask_b32 when they in fact do. We had a
bunch of code operating under the assumption that they don't support
source modifiers, so we tried to move fnegs around to work around
this.
Gets a few small improvements here and there. The main hazard to watch
out for is infinite loops in the combiner since we try to move fnegs
up and down the DAG. For now, don't fold fneg directly into select.
The generic combiner does this for a restricted set of cases
when getNegatedExpression obviously shows an improvement for both
operands. It turns out to be trickier to avoid infinite looping the
combiner in conjunction with pulling out source modifiers, so
leave this for a later commit.
Amara Emerson [Sun, 19 Feb 2023 23:36:36 +0000 (15:36 -0800)]
[GlobalISel] Fix a store-merging bug due to use of >= instead of >.
This fixes a corner case where we would skip doing an alias check because of a
>= vs > bug, due to the presence of a non-aliasing instruction, in this case
the load %safeld.
Fixes issue #59376
Alex Brachet [Sun, 19 Feb 2023 23:42:11 +0000 (23:42 +0000)]
[CMake] Fix driver build on MacOS
Sanjay Patel [Sun, 19 Feb 2023 15:33:30 +0000 (10:33 -0500)]
[InstCombine] canonicalize "extract lowest set bit" away from cttz intrinsic
1 << (cttz X) --> -X & X
https://alive2.llvm.org/ce/z/qv3E9e
This creates an extra use of the input value, so that's generally
not preferred, but there are advantages to this direction:
1. 'negate' and 'and' allow for better analysis than 'cttz'.
2. This is more likely to induce follow-on transforms (in the
example from issue #60801, we'll get the decrement pattern).
3. The more basic ALU ops are more likely to result in better
codegen across a variety of targets.
This won't solve the motivating bugs (see issue #60799) because
we do not recognize the redundant icmp+sel, and the x86 backend
may not have the pattern-matching to produce the optimal BMI
instructions.
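As a sanity check of the identity, here is a small exhaustive test over 16-bit
values in plain C++. It excludes X == 0, where the shift form would shift by
the full bit width; `cttz16` is a hypothetical portable stand-in for the cttz
intrinsic:

```cpp
#include <cassert>
#include <cstdint>

// Portable count-trailing-zeros for a nonzero 16-bit value (stand-in for the
// cttz intrinsic; callers must not pass 0).
unsigned cttz16(std::uint16_t x) {
  assert(x != 0 && "cttz16 is only defined here for nonzero input");
  unsigned n = 0;
  while ((x & 1u) == 0) {
    x = static_cast<std::uint16_t>(x >> 1);
    ++n;
  }
  return n;
}

// Original form: 1 << cttz(x).
std::uint16_t lowestSetBitViaCttz(std::uint16_t x) {
  return static_cast<std::uint16_t>(1u << cttz16(x));
}

// Canonical form after the patch: -x & x.
std::uint16_t lowestSetBitViaNegAnd(std::uint16_t x) {
  return static_cast<std::uint16_t>((0u - x) & x);
}

// Compares the two forms for every nonzero 16-bit value.
bool formsAgreeForAllNonzero() {
  for (unsigned v = 1; v <= 0xFFFFu; ++v) {
    const auto x = static_cast<std::uint16_t>(v);
    if (lowestSetBitViaCttz(x) != lowestSetBitViaNegAnd(x))
      return false;
  }
  return true;
}
```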
Differential Revision: https://reviews.llvm.org/D144329
Erik Desjardins [Sun, 19 Feb 2023 18:47:09 +0000 (13:47 -0500)]
Recommit "[Support] change StringMap hash function from djbHash to xxHash"
This reverts commit 37eb9d13f891f7656f811516e765b929b169afe0.
Test failures have been fixed:
- ubsan failure fixed by 72eac42f21c0f45a27f3eaaff9364cbb5189b9e4
- warn-unsafe-buffer-usage-fixits-local-var-span.cpp fixed by
03cc52dfd1dbb4a59b479da55e87838fb93d2067 (wasn't related)
- test-output-format.ll failure was spurious, build failed at
https://lab.llvm.org/buildbot/#/builders/54/builds/3545 (b4431b2d945b6fc19b1a55ac6ce969a8e06e1e93)
but passed at
https://lab.llvm.org/buildbot/#/builders/54/builds/3546 (5ae99be0377248c74346096dc475af254a3fc799)
which is before my revert
https://github.com/llvm/llvm-project/compare/b4431b2d945b6fc19b1a55ac6ce969a8e06e1e93...5ae99be0377248c74346096dc475af254a3fc799
Original commit message:
Depends on https://reviews.llvm.org/D142861.
Alternative to https://reviews.llvm.org/D137601.
xxHash is much faster than djbHash. This makes a simple Rust test case with a large constant string 10% faster to compile.
Previous attempts at changing this hash function (e.g. https://reviews.llvm.org/D97396) had to be reverted due to breaking tests that depended on iteration order.
No additional tests fail with this patch compared to `main` when running `check-all` with `-DLLVM_ENABLE_PROJECTS="all"` (on a Linux host), so I hope I found everything that needs to be changed.
Differential Revision: https://reviews.llvm.org/D142862
Florian Hahn [Sun, 19 Feb 2023 21:42:04 +0000 (21:42 +0000)]
[SLP] Fix infinite loop in isUndefVector.
This fixes an infinite loop if isa<T>(II->getOperand(1)) is true.
Update Base at the top of the loop, before the continue.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D144292
Alex Bradbury [Sun, 19 Feb 2023 20:40:58 +0000 (20:40 +0000)]
[RISCV][MC] Mark Zawrs extension as non-experimental
Support for the unratified 1.0-rc3 specification was introduced in
D133443. The specification has since been ratified (in November 2022,
according to the recently ratified extensions list
<https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions>).
A review of the diff
<https://github.com/riscv/riscv-zawrs/compare/V1.0-rc3...main> of the
1.0-rc3 spec vs the current/ratified document shows no changes to the
instruction encoding or naming. At one point, a note was added
<https://github.com/riscv/riscv-zawrs/commit/e84f42406a7c88eb92452515b2035144a7023a51>
indicating Zawrs depends on the Zalrsc extension (not officially
specified, but I believe to be just the LR/SC instructions from the A
extension). The final text ended up as "The instructions in the Zawrs
extension are only useful in conjunction with the LR instructions, which
are provided by the A extension, and which we also expect to be provided
by a narrower Zalrsc extension in the future." I think it's consistent
with this phrasing to not require the A extension for Zawrs, which
matches what was implemented.
No intrinsics are implemented for Zawrs currently, meaning we don't need
to additionally review whether those intrinsics can be considered
finalised and ready for exposure to end users.
Differential Revision: https://reviews.llvm.org/D143507
Craig Topper [Sun, 19 Feb 2023 20:27:24 +0000 (12:27 -0800)]
[RISCV] Add fgtq.s and fgeq.s assembler aliases for Zfa.
We can swap operands and use fltq.s and fleq.s. Similar for D and H.
Craig Topper [Sun, 19 Feb 2023 19:37:17 +0000 (11:37 -0800)]
[RISCV] Remove Commutable property from Zfa fltq/fleq instructions.
Kazu Hirata [Sun, 19 Feb 2023 19:29:12 +0000 (11:29 -0800)]
Use APInt::popcount instead of APInt::countPopulation (NFC)
This is for consistency with the C++20-style bit manipulation
functions in <bit>.
Alex Bradbury [Sun, 19 Feb 2023 19:15:32 +0000 (19:15 +0000)]
[lld][test][RISCV] Don't use incorrectly normalised arch string in riscv-attributes-place.s
Per the psABI, the arch string should be normalised to (amongst other
things) always include the full version of each extension in form
zfoo1p0. riscv-attributes-place.s didn't conform to this, which is not a
problem for the current parsing logic, but this behaviour would change
with a patch I'm about to propose.
This makes riscv-attributes-place.s feature a valid arch string, and
maintains test coverage for this particular form of invalid arch string
by adding it to riscv-attributes.s.
David Green [Sun, 19 Feb 2023 19:13:41 +0000 (19:13 +0000)]
[ARM] Add targets for Arm DebugInfo tests. NFC
This prevents the instructions being invalid for the subtarget.
Florian Hahn [Sun, 19 Feb 2023 18:01:15 +0000 (18:01 +0000)]
[VPlan] Make sure properlyDominates(A, A) returns false.
At the moment, properlyDominates(A, A) can return true via
LocalComesBefore. Add an early exit to ensure it returns false if
A == B.
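A minimal sketch of the early exit, assuming a local ordering walk that
reports whichever of the two recipes it meets first. The types and function
bodies below are hypothetical models (recipes are plain integers), not the
VPlan code:

```cpp
#include <vector>

// Hypothetical model of a local ordering walk: scan the block and report
// whichever of `a` or `b` is seen first. When a == b the first test fires, so
// the walk on its own answers "true".
bool localComesBefore(const std::vector<int> &block, int a, int b) {
  for (int r : block) {
    if (r == a)
      return true;
    if (r == b)
      return false;
  }
  return false;
}

// Model of the fixed query: nothing properly dominates itself.
bool properlyDominates(const std::vector<int> &block, int a, int b) {
  if (a == b)
    return false; // early exit added for the A == B case
  return localComesBefore(block, a, b);
}
```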
Note: no test has been added because the existing test suite covers this
case already with libc++ with assertions enabled.
Fixes https://github.com/llvm/llvm-project/issues/60850.
Mehdi Amini [Sun, 19 Feb 2023 06:14:20 +0000 (22:14 -0800)]
Fix potential crash in Flang generateLLVMIR() when MLIR fails to translate to LLVM
This is pure code motion, to ensure that the check if we have a valid llvmModule
comes before trying to set option on this module.
Differential Revision: https://reviews.llvm.org/D144342
Mark de Wever [Sun, 19 Feb 2023 15:44:56 +0000 (16:44 +0100)]
[libc++][format] Disables test on GCC-12.
These tests fail in D144331, for the same reason other format tests fail
in GCC. This is a resource issue.
Carlos Galvez [Sun, 19 Feb 2023 13:58:31 +0000 (13:58 +0000)]
Fix clang-tools-extra docs build
Carlos Galvez [Wed, 25 Jan 2023 05:26:07 +0000 (05:26 +0000)]
[clang-tidy] Introduce HeaderFileExtensions and ImplementationFileExtensions options
Re-introduce the patch that was reverted previously.
In the first attempt, the checks would not be able to
read from the global option, since getLocalOrGlobal
only works with string types. Additional logic is needed
in order to support both use cases in the transition
period. All that logic will be removed when the local
options are fully removed.
We have a number of checks designed to analyze problems
in header files only, for example:
bugprone-suspicious-include
google-build-namespaces
llvm-header-guard
misc-definitions-in-header
...
All these checks duplicate the same logic and options
to determine whether a location is placed in the main
source file or in the header. More checks are coming
up with similar requirements.
Thus, to remove duplication, let's move this option
to the top-level configuration of clang-tidy (since
it's something all checks should share).
Add a deprecation notice for all checks that use the
local option, prompting to update to the global option.
Differential Revision: https://reviews.llvm.org/D142655
DianQK [Sun, 19 Feb 2023 13:08:29 +0000 (21:08 +0800)]
Revert "[SimplifyCFG] Check if the return instruction causes undefined behavior"
This reverts commit b6eed9a82e0ce530d94a194c88615d6c272e1854.
DianQK [Sun, 19 Feb 2023 08:42:33 +0000 (16:42 +0800)]
[SimplifyCFG] Check if the return instruction causes undefined behavior
This should fix https://github.com/rust-lang/rust/issues/107681.
Returning undef for a noundef return value is undefined behavior.
Example:
```
define noundef i32 @test_ret_noundef(i1 %cond) {
entry:
br i1 %cond, label %bb1, label %bb2
bb1:
br label %bb2
bb2:
%r = phi i32 [ undef, %entry ], [ 1, %bb1 ]
ret i32 %r
}
```
Differential Revision: https://reviews.llvm.org/D144319
Benjamin Kramer [Sun, 19 Feb 2023 09:54:10 +0000 (10:54 +0100)]
[lldb] Add missing wasm switch case
TypeSystemClang.cpp:4855:13: error: enumeration value 'WasmExternRef' not handled in switch [-Werror,-Wswitch]
Kristina Bessonova [Sun, 19 Feb 2023 08:09:23 +0000 (10:09 +0200)]
[BOLT] Attempt to fix bolt/test/runtime/AArch64/adrrelaxationpass.s after D144079
Differential Revision: https://reviews.llvm.org/D144344
Joshua Cao [Sun, 19 Feb 2023 06:10:36 +0000 (22:10 -0800)]
[SCEV] Add automated test checks for some tests
Vitaly Buka [Sun, 19 Feb 2023 07:39:34 +0000 (23:39 -0800)]
[sanitizers] Update global_symbols.txt
NAKAMURA Takumi [Fri, 17 Feb 2023 14:25:40 +0000 (23:25 +0900)]
llvm-tblgen: Anonymize some functions.
Craig Topper [Sun, 19 Feb 2023 01:24:10 +0000 (17:24 -0800)]
[RISCV] Add Zfa test cases for strict ONE and UEQ comparisons. NFC
These correspond to islessgreater and its inverse.
Fabian [Sat, 18 Feb 2023 20:31:37 +0000 (21:31 +0100)]
[mlir] Execute all requested translations in MlirTranslateMain
Currently, MlirTranslateMain only executes one of the requested translations, and does not error if multiple are specified. This commit enables translations to be chained in the specified order.
This makes round-trip tests easier, since existing import/export passes can be reused and no combined round-trip passes have to be registered (example: mlir-translate -serialize-spirv -deserialize-spirv).
Additionally, by leveraging TranslateRegistration with file-to-file TranslateFunctions, generic pre- and post-processing can be added before/after conversion to/from MLIR.
Reviewed By: lattner, Mogball
Differential Revision: https://reviews.llvm.org/D143719
Craig Topper [Sun, 19 Feb 2023 00:43:50 +0000 (16:43 -0800)]
[RISCV] Handle RISCVISD::SplitF64 and RISCVISD::BuildPairF64 during isel with Zfa.
Instead of special casing Zfa in the custom inserters, select the
correct instructions during isel.
BuildPairF64 we can do with a pattern, but SplitF64 requires custom
selection due to the two destinations.
If we didn't need SplitF64 without Zfa, I would have an extract low
and extract high ISD opcode for Zfa to avoid that issue.
Juneyoung Lee [Sat, 18 Feb 2023 20:43:26 +0000 (20:43 +0000)]
[DivRemPairs] Strip division's poison generating flag
Given this transformation: X % Y -> X - (X / Y) * Y
This patch strips off the poison-generating flag of X / Y such as exact, because it may make the optimized form result poison whereas X % Y does not.
The issue was reported here: https://github.com/llvm/llvm-project/issues/60748
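To make the problem concrete, here is a small C++ model. It checks that the
expanded form matches the remainder for ordinary truncating division, and it
models the precondition of `exact` (the division leaves no remainder). The
function names are hypothetical:

```cpp
#include <cassert>

// Expanded remainder, as in the transformation X % Y -> X - (X / Y) * Y.
int expandedRem(int x, int y) {
  assert(y != 0 && "division by zero");
  return x - (x / y) * y;
}

// Models the precondition of an `exact` division: the flag only holds when
// the division leaves no remainder; otherwise the IR result would be poison.
bool exactFlagWouldHold(int x, int y) { return x % y == 0; }

// Checks the expansion against the built-in remainder on a small range.
bool expansionMatchesRem() {
  for (int x = -50; x <= 50; ++x)
    for (int y = -50; y <= 50; ++y) {
      if (y == 0)
        continue;
      if (expandedRem(x, y) != x % y)
        return false;
    }
  return true;
}
```

For X = 7 and Y = 2 the remainder is well defined (1), but an `exact` flag on
7 / 2 would not hold, which is why the flag has to be dropped from the
division used in the expanded form.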
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144333
Juneyoung Lee [Sat, 18 Feb 2023 20:03:15 +0000 (20:03 +0000)]
Add a test for D144333
Michael Kirk [Sat, 18 Feb 2023 20:50:44 +0000 (12:50 -0800)]
[clang-format] Handle tabs in file path for git-clang-format
Vitaly Buka [Sat, 18 Feb 2023 02:29:42 +0000 (18:29 -0800)]
[SCEV] Fix FoldID::addInteger(unsigned long I)
"unsigned long" can be 8 bytes, but the code assumes 4.
This is the real root cause of why D122215 was reverted.
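A sketch of the pitfall, using a hypothetical ID builder that stores 32-bit
words (the type `WordID` and its methods are illustrative, not LLVM's API).
A 64-bit integer has to contribute two words; keeping only the low word would
make values that differ in their upper half indistinguishable:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical ID builder made of 32-bit words.
struct WordID {
  std::vector<std::uint32_t> words;

  void addInteger(std::uint32_t v) { words.push_back(v); }

  // A 64-bit value is split into its low and high 32-bit halves so that no
  // bits are dropped.
  void addInteger(std::uint64_t v) {
    addInteger(static_cast<std::uint32_t>(v));
    addInteger(static_cast<std::uint32_t>(v >> 32));
  }
};
```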
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D144316
Vitaly Buka [Sat, 18 Feb 2023 20:21:10 +0000 (12:21 -0800)]
Revert "[SimplifyCFG] Check if the return instruction causes undefined behavior"
Breaks bots
https://lab.llvm.org/buildbot/#/builders/236/builds/2349
https://lab.llvm.org/buildbot/#/builders/74/builds/17361
https://lab.llvm.org/buildbot/#/builders/168/builds/11972
This reverts commit 7be55b007698f6b6398cbbea69c327b5a971938a.
David Green [Sat, 18 Feb 2023 19:54:29 +0000 (19:54 +0000)]
[AArch64] Concat zip1 and zip2 is a wider zip1
Given concat(zip1(a, b), zip2(a, b)), we can convert that to a 128bit zip1(a, b)
if we widen a and b out first.
Fixes #54226
Differential Revision: https://reviews.llvm.org/D121088
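The equivalence can be checked on a list-based model of the lane shuffles. This is an illustrative sketch only: `zip1`, `zip2` and `widen` are hypothetical Python helpers that mimic the lane semantics, and the upper lanes added by widening are modelled as `None` (undefined):

```python
def zip1(a, b):
    """Interleave the lower halves of a and b."""
    half = len(a) // 2
    out = []
    for x, y in zip(a[:half], b[:half]):
        out += [x, y]
    return out


def zip2(a, b):
    """Interleave the upper halves of a and b."""
    half = len(a) // 2
    out = []
    for x, y in zip(a[half:], b[half:]):
        out += [x, y]
    return out


def widen(v):
    """Double the vector length; the new upper lanes are undefined."""
    return list(v) + [None] * len(v)


a = [0, 1, 2, 3]
b = [10, 11, 12, 13]
print(zip1(a, b) + zip2(a, b))    # [0, 10, 1, 11, 2, 12, 3, 13]
print(zip1(widen(a), widen(b)))   # [0, 10, 1, 11, 2, 12, 3, 13]
```

The wide zip1 only reads the lower half of each widened input, so the undefined upper lanes never reach the result.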
Noah Goldstein [Sat, 18 Feb 2023 19:36:06 +0000 (13:36 -0600)]
[ValueTracking] Add cases for additional ops in `isKnownNonZero`
Add cases for the following ops:
- 0-X -- https://alive2.llvm.org/ce/z/6C75Li
- bitreverse(X) -- https://alive2.llvm.org/ce/z/SGG1q9
- bswap(X) -- https://alive2.llvm.org/ce/z/p7pzwh
- ctpop(X) -- https://alive2.llvm.org/ce/z/c5y3BC
- abs(X) -- https://alive2.llvm.org/ce/z/yxXGz_
https://alive2.llvm.org/ce/z/rSRg4K
- uadd_sat(X, Y) -- https://alive2.llvm.org/ce/z/Zw-y4W
https://alive2.llvm.org/ce/z/2NRqRz
https://alive2.llvm.org/ce/z/M1OpF8
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D142828
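The Alive2 links above are the authoritative proofs. As a quick sanity check, the basic property (a non-zero input gives a non-zero result) can be brute-forced at a small bit width. The helpers below are hypothetical models, and any extra preconditions or flags the patch relies on are not modelled:

```python
BITS = 8
MASK = (1 << BITS) - 1


def neg(x):
    return (-x) & MASK


def bitreverse(x):
    return int(f"{x:0{BITS}b}"[::-1], 2)


def bswap16(x):
    # Byte swap needs at least two bytes, so this one is modelled on 16 bits.
    return ((x & 0xFF) << 8) | (x >> 8)


def ctpop(x):
    return bin(x).count("1")


def abs_wrap(x):
    # Two's complement abs; abs(INT_MIN) wraps back to INT_MIN.
    signed = x - (1 << BITS) if x & (1 << (BITS - 1)) else x
    return abs(signed) & MASK


def uadd_sat(x, y):
    return min(x + y, MASK)


def check_all():
    unary_ok = all(
        neg(x) and bitreverse(x) and ctpop(x) and abs_wrap(x)
        for x in range(1, 1 << BITS)
    )
    bswap_ok = all(bswap16(x) for x in range(1, 1 << 16))
    # uadd_sat is non-zero as soon as either operand is non-zero.
    sat_ok = all(
        uadd_sat(x, y)
        for x in range(1 << BITS)
        for y in range(1 << BITS)
        if x or y
    )
    return unary_ok and bswap_ok and sat_ok
```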
Noah Goldstein [Sat, 18 Feb 2023 19:36:24 +0000 (13:36 -0600)]
[ValueTracking] Add tests for additional `isKnownNonZero` cases; NFC
Add cases for the following ops:
- 0-X
- bitreverse(X)
- bswap(X)
- ctpop(X)
- abs(X)
- uadd_sat(X, Y)
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D142827
Noah Goldstein [Thu, 26 Jan 2023 17:36:05 +0000 (11:36 -0600)]
[ValueTracking] Add KnownBits patterns `xor(x, x - 1)` and `and(x, -x)` for knowing upper bits to be zero
These two BMI patterns will clear the upper bits of the result past the
first set bit. So if we know a single bit in `x` is set, we know that
`results[bitwidth - 1, log2(x) + 1] = 0`.
Alive2:
blsmsk: https://alive2.llvm.org/ce/z/a397BS
blsi: https://alive2.llvm.org/ce/z/tsbQhC
Differential Revision: https://reviews.llvm.org/D142271
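A small brute-force check of the claim at 8 bits, as a sketch rather than the patch's implementation. `blsmsk`, `blsi` and `upper_bits_clear` are hypothetical helper names:

```python
BITS = 8
MASK = (1 << BITS) - 1


def blsmsk(x):
    """xor(x, x - 1): mask up to and including the lowest set bit."""
    return (x ^ (x - 1)) & MASK


def blsi(x):
    """and(x, -x): isolate the lowest set bit."""
    return x & (-x) & MASK


def upper_bits_clear(k):
    """If bit k of x is known set, all result bits above k must be zero."""
    high = MASK & ~((1 << (k + 1)) - 1)  # bits strictly above k
    for x in range(1 << BITS):
        if x & (1 << k):
            if (blsmsk(x) & high) or (blsi(x) & high):
                return False
    return True


print(bin(blsmsk(0b01101000)))  # 0b1111
print(bin(blsi(0b01101000)))    # 0b1000
print(all(upper_bits_clear(k) for k in range(BITS)))  # True
```

The lowest set bit of `x` can be no higher than any bit known to be set, which is why everything above that known bit is zero in both results.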
Noah Goldstein [Thu, 26 Jan 2023 17:36:16 +0000 (11:36 -0600)]
[ValueTracking] Add tests for known bits after common BMI pattern (blsmsk/blsi); NFC
Differential Revision: https://reviews.llvm.org/D142270
Jay Foad [Thu, 26 Jan 2023 17:34:50 +0000 (11:34 -0600)]
[KnownBits] Add blsi and blsmsk
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D142519
Cyndy Ishida [Sat, 18 Feb 2023 19:27:44 +0000 (11:27 -0800)]
[llvm-tapi-diff] add default case to switch for symbol flags
Cyndy Ishida [Sat, 18 Feb 2023 18:37:44 +0000 (10:37 -0800)]
[TextAPI] Capture new properties from TBD to InterfaceFile
* Deployment Versions for targets
* Run Search Paths
* Text vs Data Segment attributes to symbols
Reviewed By: pete
Differential Revision: https://reviews.llvm.org/D144158
NAKAMURA Takumi [Sat, 18 Feb 2023 17:52:39 +0000 (02:52 +0900)]
llvm-tblgen: Add "TableGenBackends.h" to each emitter.
"TableGenBackends.h" has declarations of emitters.
NAKAMURA Takumi [Sat, 18 Feb 2023 18:01:00 +0000 (03:01 +0900)]
llvm-tblgen: Add missing includes
NAKAMURA Takumi [Sat, 18 Feb 2023 14:15:23 +0000 (23:15 +0900)]
llvm-tblgen: Reformat
Amara Emerson [Sat, 18 Feb 2023 17:51:17 +0000 (09:51 -0800)]
[GlobalISel] Fix G_ZEXTLOAD being converted to G_SEXTLOAD incorrectly.
The extending loads combine tries to prefer sign-extends folding into loads vs
zexts, and in cases where a G_ZEXTLOAD is first used by a G_ZEXT, and then used
by a G_SEXT, it would select the G_SEXT even though the load is already
zero-extending.
Fixes issue #59630
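To see why the choice matters, here is a sketch on plain integers. The bit widths (an 8-bit load extended to 16 and then 32 bits) are assumptions picked for illustration and are not stated in the commit:

```python
def zext(value, from_bits):
    """Zero-extend: keep the low from_bits, upper bits are zero."""
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits, to_bits):
    """Sign-extend the low from_bits of value to to_bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


byte_in_memory = 0x80

# Original program: G_ZEXTLOAD to 16 bits, then G_SEXT of that result.
zextload16 = zext(byte_in_memory, 8)
correct = sext(zextload16, 16, 32)

# Miscompile: the load itself is turned into a sign-extending load.
wrong = sext(byte_in_memory, 8, 32)

print(hex(correct))  # 0x80
print(hex(wrong))    # 0xffffff80
```

Sign-extending an already zero-extended value never sets the upper bits, so replacing the load with a sign-extending one changes the result whenever the loaded value has its top bit set.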
Florian Hahn [Sat, 18 Feb 2023 18:00:18 +0000 (18:00 +0000)]
Revert "[SCCP] Remove legacy SCCP pass."
This reverts commit 5356fefc19df3fbf32d180b1b10e6226e8743541.
It looks like Polly still relies on the legacy SCCP pass. Bring it back
until the best way forward is determined.
Mark de Wever [Sat, 18 Feb 2023 17:30:56 +0000 (18:30 +0100)]
[NFC][libc++][format] Small improvements.
While working on the formatter for thread::id, several minor issues
were spotted. This fixes them.
Florian Hahn [Sat, 18 Feb 2023 17:54:29 +0000 (17:54 +0000)]
[SCCP] Remove legacy SCCP pass.
This is part of the optimization pipeline, of which the legacy pass manager version is deprecated.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D144201
Mark de Wever [Sat, 18 Feb 2023 17:30:56 +0000 (18:30 +0100)]
[NFC][libc++][doc] Fixes formatting.
Kelvin Li [Sat, 18 Feb 2023 05:09:13 +0000 (00:09 -0500)]
[Flang] Add PowerPC intrinsics
This patch adds a subset of PowerPC intrinsics - fmadd,
fmsub, fnmadd and fnmsub.
Differential Revision: https://reviews.llvm.org/D143951
Kristina Bessonova [Sat, 18 Feb 2023 16:31:21 +0000 (18:31 +0200)]
[AArch64InstPrinter][llvm-objdump] Print ADR PC-relative label as a target address in hexadecimal form
This is similar to ADRP and matches GNU objdump:
GNU objdump:
```
0000000000200100 <_start>:
200100: adr x0, 201000 <_start+0xf00>
```
llvm-objdump (before patch):
```
0000000000200100 <_start>:
200100: adr x0, #3840
```
llvm-objdump (after patch):
```
0000000000200100 <_start>:
200100: adr x0, 0x201000 <_start+0xf00>
```
Reviewed By: simon_tatham, peter.smith
Differential Revision: https://reviews.llvm.org/D144079
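The numbers in the listings above are related by simple address arithmetic. A sketch of that arithmetic follows; `format_adr_target` is a hypothetical helper, not the printer's actual code:

```python
def format_adr_target(pc, imm, sym_name, sym_addr):
    """Render an ADR operand as '<target> <symbol+offset>'.

    pc       -- address of the ADR instruction
    imm      -- PC-relative immediate encoded in the instruction
    sym_addr -- address of the nearest preceding symbol
    """
    target = pc + imm
    return f"0x{target:x} <{sym_name}+0x{target - sym_addr:x}>"


print(format_adr_target(0x200100, 3840, "_start", 0x200100))
# 0x201000 <_start+0xf00>
```

The old `#3840` form is the raw immediate (0xf00); the new form adds it to the instruction address 0x200100 to give 0x201000.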
DianQK [Sat, 18 Feb 2023 15:09:07 +0000 (23:09 +0800)]
[SimplifyCFG] Check if the return instruction causes undefined behavior
This should fix https://github.com/rust-lang/rust/issues/107681.
Returning undef for a noundef return value is undefined behavior.
Example:
```
define noundef i32 @test_ret_noundef(i1 %cond) {
entry:
  br i1 %cond, label %bb1, label %bb2
bb1:
  br label %bb2
bb2:
  %r = phi i32 [ undef, %entry ], [ 1, %bb1 ]
  ret i32 %r
}
```
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D144319
Tue Ly [Sat, 18 Feb 2023 15:16:13 +0000 (10:16 -0500)]
[libc][bazel] Fix missing dependency in test/src/stdlib targets.
Sanjay Patel [Sat, 18 Feb 2023 12:19:04 +0000 (07:19 -0500)]
[InstCombine] add tests for 1<<cttz(x); NFC
issue #60799
issue #60801
Nikolas Klauser [Sat, 18 Feb 2023 00:27:24 +0000 (01:27 +0100)]
[libc++] Fix header includes in <__atomic/cxx_atomic_impl.h>
Reviewed By: #libc, philnik
Spies: Mordante, paulkirth, libcxx-commits
Differential Revision: https://reviews.llvm.org/D144307