Jingu Kang [Wed, 19 Jul 2023 11:07:18 +0000 (12:07 +0100)]
[MachineLICM] Handle Subloops
Following discussion on https://reviews.llvm.org/D154205, make MachineLICM pass
handle subloops by visiting the outermost loop's blocks only once.
Differential Revision: https://reviews.llvm.org/D154205
Yingwei Zheng [Thu, 20 Jul 2023 15:21:36 +0000 (23:21 +0800)]
[ConstraintElim] Store the triple Pred + LHS + RHS in ReproducerEntry instead of CmpInst + Not
This patch represents a condition with `Pred + LHS + RHS` in ReproducerEntry instead of `CmpInst + Not`.
It avoids creating temporary ICmpInsts in D155412.
Reviewed By: nikic, fhahn
Differential Revision: https://reviews.llvm.org/D155782
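A minimal sketch of the idea, in Python rather than the pass's C++: a condition is stored as a predicate plus two operands. How the patch encodes a negated condition is not stated above; storing the inverse predicate is an assumption made for illustration, and `entry_for` and the `INVERSE` table are hypothetical names.

```python
from dataclasses import dataclass

# Inverse of each integer comparison predicate (names follow LLVM's icmp predicates).
INVERSE = {
    "eq": "ne", "ne": "eq",
    "ult": "uge", "uge": "ult", "ugt": "ule", "ule": "ugt",
    "slt": "sge", "sge": "slt", "sgt": "sle", "sle": "sgt",
}

@dataclass(frozen=True)
class ReproducerEntry:
    """Hypothetical stand-in for the entry: predicate plus left and right operands."""
    pred: str
    lhs: str
    rhs: str

def entry_for(pred, lhs, rhs, negated=False):
    """Build an entry; a negated condition is stored via the inverse predicate
    (an assumption), so no compare instruction or separate Not flag is needed."""
    return ReproducerEntry(INVERSE[pred] if negated else pred, lhs, rhs)
```

With this shape, recording "not (a ult b)" needs no temporary instruction: it is simply the triple `("uge", a, b)`.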
Craig Topper [Thu, 20 Jul 2023 15:14:50 +0000 (08:14 -0700)]
[RISCV] Remove Opcode field from RVInst. Assign Inst{6-0} directly. NFC
Most places assign Opcode right after assigning every other bit in
Inst. I don't think treating Opcode separately adds much value. It
doesn't hide what bits belong to the opcode since every other bit is
listed.
This makes RVInst consistent with the RVInst16 subclasses, which already
assign Inst{1-0} directly.
Reviewed By: asb, wangpc
Differential Revision: https://reviews.llvm.org/D155797
Craig Topper [Thu, 20 Jul 2023 15:14:41 +0000 (08:14 -0700)]
[RISCV] Remove unused Opcode field from RVInst16. NFC
Unlike RVInst, which also has an Opcode field, all of the subclasses
of RVInst16 assign Inst{1-0} directly.
Reviewed By: asb, wangpc
Differential Revision: https://reviews.llvm.org/D155791
Jakub Kuderski [Thu, 20 Jul 2023 15:13:51 +0000 (11:13 -0400)]
[mlir][spirv] Extract Atomic/Cast/Group op implementation. NFC.
Continue the work outlined in D155747 and split the main SPIR-V ops
implementation file into a few smaller and quicker to compile files.
This organization matches the op definition organization in `.td` files.
In this patch, extract atomic, cast/conversion, and group op
implementation into separate files.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D155777
Craig Topper [Thu, 20 Jul 2023 15:03:59 +0000 (08:03 -0700)]
[RISCV] Order the RISCVInstrInfo*.td includes for standard extensions into logical groups. NFC
There are some ordering dependencies between these files.
-F must be included before D, Zfh, and Zfa. I recently suggested Zfbfmin
should be after Zfh.
-V must be before Zvk.
-Zc must be after Zb since Zbb instructions can be compressed.
So I've grouped all the scalar FP extensions together and all the vector
extensions together, and put the compressed instructions at the end.
Reviewed By: asb, wangpc
Differential Revision: https://reviews.llvm.org/D155780
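The ordering constraints listed in this message can be read as "must come before" pairs. A minimal sketch that checks a candidate include order against them; the extension names come from the message, while `CONSTRAINTS`, `respects_order`, and the `grouped` list are hypothetical and do not reflect the actual file contents.

```python
# Each pair (a, b) means "a must be included before b" (taken from the message above).
CONSTRAINTS = [
    ("F", "D"), ("F", "Zfh"), ("F", "Zfa"),
    ("Zfh", "Zfbfmin"),
    ("V", "Zvk"),
    ("Zb", "Zc"),
]

def respects_order(order):
    """Return True if every constrained pair present in `order` appears in sequence."""
    position = {name: i for i, name in enumerate(order)}
    return all(
        position[a] < position[b]
        for a, b in CONSTRAINTS
        if a in position and b in position
    )

# Illustrative grouping: scalar FP together, vector together, compressed last.
grouped = ["Zb", "F", "D", "Zfh", "Zfbfmin", "Zfa", "V", "Zvk", "Zc"]
```

Here `respects_order(grouped)` holds, while an order that places `D` ahead of `F` does not.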
Benjamin Kramer [Thu, 20 Jul 2023 15:05:28 +0000 (17:05 +0200)]
[bazel] Fix dependency path
Benjamin Kramer [Thu, 20 Jul 2023 15:02:50 +0000 (17:02 +0200)]
[bazel][clang] Add missing dependency for 8b5d3ba829c162fd4890fd65a4629ce0715825ee
Jingu Kang [Tue, 18 Jul 2023 12:52:20 +0000 (13:52 +0100)]
[AArch64] Reuse larger DUPLANE if available
When combining DUP, try to reuse a larger DUPLANE.
Differential Revision: https://reviews.llvm.org/D155592
Timm Bäder [Thu, 20 Jul 2023 14:38:32 +0000 (16:38 +0200)]
[clang][Interp] Provide required c++14 warnings
This doesn't show up in standards after c++14.
Sergio Afonso [Thu, 20 Jul 2023 14:38:32 +0000 (15:38 +0100)]
[Flang][OpenMP] Fix unit test using AMDGPU triple without requiring it
Ingo Müller [Mon, 17 Jul 2023 10:26:33 +0000 (10:26 +0000)]
[mlir][transform][python] Add extended ApplyPatternsOp.
This patch adds a mixin for ApplyPatternsOp to _transform_ops_ext.py
with syntactic sugar for constructing such ops. Curiously, the op did
not have any constructors yet, probably because its tablegen definition
said to skip the default builders. The new constructor is thus quite
straightforward. The commit also adds a refined `region` property which
returns the first block of the single region.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D155435
Ingo Müller [Thu, 20 Jul 2023 10:13:55 +0000 (10:13 +0000)]
[mlir][linalg][transform] Rename ApplyPatternsOp.{region => patterns}.
This gives the region a more meaningful name. The topic came up in a
discussion on https://reviews.llvm.org/D155435, where the name `region`
would have led to a situation where a convenience accessor called
`region` (after the ODS name) would have returned a Block.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D155810
Ingo Müller [Wed, 19 Jul 2023 15:31:23 +0000 (15:31 +0000)]
[mlir][transform][gpu][python] Add MapForallToBlocks mix-in.
This patch adds a mix-in class for MapForallToBlocks with overloaded
constructors. This makes it optional to provide the return type of the
op, which is defaulted to `AnyOpType`.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D155717
Timm Bäder [Thu, 20 Jul 2023 14:18:45 +0000 (16:18 +0200)]
[clang][Interp] Add missing static_assert messages
Kevin P. Neal [Thu, 20 Jul 2023 13:51:50 +0000 (09:51 -0400)]
[FPEnv][RISCV] Correct strictfp tests.
Correct RISC-V strictfp tests to follow the rules documented in the LangRef:
https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics
Mostly these tests just needed the strictfp attribute on function definitions.
I've also removed the strictfp attribute from uses of the constrained
intrinsics because it comes by default since D154991, but I only did this
in tests I was changing anyway.
Test changes verified with D146845.
Sergio Afonso [Thu, 1 Jun 2023 16:38:33 +0000 (17:38 +0100)]
[MLIR][OpenMP][OMPIRBuilder] Use target triple to initialize `IsGPU` flag
This patch modifies the construction of the `OpenMPIRBuilder` in MLIR to
initialize the `IsGPU` flag using target triple information passed down from
the Flang frontend. If not present, it will default to `false`.
This replicates the behavior currently implemented in Clang, where the
`CodeGenModule::createOpenMPRuntime()` method creates a different
`CGOpenMPRuntime` instance depending on the target triple, which in turn has an
effect on the `IsGPU` flag of the `OpenMPIRBuilderConfig` object.
Differential Revision: https://reviews.llvm.org/D151903
Timm Bäder [Sun, 30 Apr 2023 14:20:20 +0000 (16:20 +0200)]
[clang][Interp] Fix compound assign operator evaluation order
We need to evaluate the RHS before the LHS.
Differential Revision: https://reviews.llvm.org/D149550
Florian Hahn [Thu, 20 Jul 2023 13:56:18 +0000 (14:56 +0100)]
[IVUsers] Check getExpr result in findAddRecForLoop.
This fixes a crash if the SCEV for the use isn't invertible and nullptr
is returned.
Fixes https://github.com/llvm/llvm-project/issues/63840
Aaron Ballman [Thu, 20 Jul 2023 13:53:02 +0000 (09:53 -0400)]
Update NATVIS visualizers for Clang
This fixes issues with TemplateTypeParmType and TemplateTypeParmDecl.
Timm Bäder [Thu, 4 May 2023 05:29:57 +0000 (07:29 +0200)]
[clang][Interp] Implement __builtin_strcmp
Make our Function class keep a list of parameter offsets so we can
simply get a parameter by index when evaluating builtin functions.
Differential Revision: https://reviews.llvm.org/D149816
Jake Egan [Thu, 20 Jul 2023 13:44:14 +0000 (09:44 -0400)]
Implement -frecord-command-line for XCOFF integrated assembler path
The patch D153600 implemented `-frecord-command-line` for the XCOFF direct assembly path. This patch adds support for the XCOFF integrated assembly path.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D154921
Joseph Huber [Thu, 20 Jul 2023 13:31:37 +0000 (08:31 -0500)]
[libc] Add an override option for specifying the loader implementation
In some cases when testing, we want to override the logic for not
building tests if the loader is not present. This allows users to
specify an external binary that fulfils the same duties which will force
the tests to be built even without meeting the dependencies.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D155837
Alex Bradbury [Thu, 20 Jul 2023 13:35:55 +0000 (14:35 +0100)]
[RISCV][NFC] Get rid of additional unneeded static_cast around RISCVSubtarget
Some similar cases to 60152f1983336e709.
Timm Bäder [Tue, 9 May 2023 17:38:37 +0000 (19:38 +0200)]
[clang][Interp] Add more shift error checking
Differential Revision: https://reviews.llvm.org/D150209
Timm Bäder [Tue, 11 Jul 2023 07:47:03 +0000 (09:47 +0200)]
[clang][Interp] Call dtor of Floating values
The APFloat might heap-allocate some memory, so we need to call its
destructor.
Differential Revision: https://reviews.llvm.org/D154928
Nikita Popov [Thu, 20 Jul 2023 12:31:18 +0000 (14:31 +0200)]
[IR] Mark add constant expressions as undesirable
In preparation for removing support for add expressions, mark them
as undesirable. As such, we will no longer implicitly create such
expressions, but they still exist.
Nikita Popov [Thu, 20 Jul 2023 13:21:19 +0000 (15:21 +0200)]
[InstCombine] Avoid ConstantExpr::get()
Use ConstantFoldBinaryOpOperands() instead of ConstantExpr::get().
This will continue working with binary operands that are not
supported as constant expressions.
Jon Chesterfield [Thu, 20 Jul 2023 13:23:07 +0000 (14:23 +0100)]
[libc][amdgpu] Accept deadstripped clock_freq global
If the clock_freq symbol isn't used, and is removed,
we don't need to abort the loader. We can instead just not set it.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D155832
Timm Bäder [Tue, 18 Jul 2023 12:01:14 +0000 (14:01 +0200)]
[clang][NFC] Simplify SourceLocExpr::EvaluateInContext
Use ASTContext::MakeIntValue and remove the std::tie+lambda weirdness.
Differential Revision: https://reviews.llvm.org/D155584
Joseph Huber [Thu, 20 Jul 2023 00:10:30 +0000 (19:10 -0500)]
[libc] Remove global constructor in `getopt` implementation
This file required a global constructor due to copying the file stream
and having a non-constexpr constructor for the wrapper type. Also, I
changed `opterr` to be a pointer, because it seemed like it wasn't
being set correctly as an externally visible variable if we just
captured it by value.
Reviewed By: abrachet
Differential Revision: https://reviews.llvm.org/D155766
Joseph Huber [Wed, 19 Jul 2023 23:03:02 +0000 (18:03 -0500)]
[libc] Remove global constructors on File type
The `File` interface currently has a destructor to delete the buffer if
it is owned by the file. This is problematic for the globally allocated
`stdout`, `stdin`, and `stderr` files. This causes the file interface to
have global constructors to initialize the destructors to use these.
However, these never use the destructors because they don't own the
buffer. This patch removes the destructor and calls it manually in the
close implementation. The platform close should never need to access the
buffer and it needs to be done before clearing the whole thing, so this
should work.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D155762
Alex Bradbury [Thu, 20 Jul 2023 13:05:17 +0000 (14:05 +0100)]
[RISCV][NFC] Use templated getSubtarget in RISCVExpandPseudo::runOnMachineFunction
This avoids a static_cast.
Alexander Belyaev [Thu, 20 Jul 2023 13:08:19 +0000 (15:08 +0200)]
[mlir] Add UBDialect to BUILD.bazel.
Markus Böck [Wed, 19 Jul 2023 14:35:54 +0000 (16:35 +0200)]
[mlir][LLVM] Convert `noalias` parameters into alias scopes during inlining
Currently, inlining a function with a `noalias` parameter leads to a large loss of optimization potential as the `noalias` parameter, an important hint for alias analysis, is lost completely.
This patch fixes this with the same approach as LLVM by annotating all users of the `noalias` parameter with appropriate alias and noalias scope lists.
The implementation done here is not as sophisticated as LLVM's, which has more infrastructure related to escaping and captured pointers, but should work in the majority of important cases.
Any deficiency can be addressed in future patches.
Related LLVM code: https://github.com/llvm/llvm-project/blob/27ade4b554774187d2c0afcf64cd16fa6d5f619d/llvm/lib/Transforms/Utils/InlineFunction.cpp#L1090
Differential Revision: https://reviews.llvm.org/D155712
Danila Malyutin [Thu, 20 Jul 2023 07:10:26 +0000 (10:10 +0300)]
[Statepoint] Use correct RegisterClass for spilling
Copy propagation might have changed the register class of the register.
Differential Revision: https://reviews.llvm.org/D155792
Timm Bäder [Wed, 19 Jul 2023 11:55:43 +0000 (13:55 +0200)]
[clang][Interp][NFC] Add InterpStack::dump()
Simon Pilgrim [Thu, 20 Jul 2023 12:48:56 +0000 (13:48 +0100)]
[X86] LowerTRUNCATE - use LowerTruncateVecPackWithSignBits for prefer-256 bit AVX512 cases during type legalization
If the AVX512 target will split the 512-bit vector truncation then try to use PACKSS/PACKUS first.
David Green [Thu, 20 Jul 2023 12:53:18 +0000 (13:53 +0100)]
[AArch64] Update bswap cost test. NFC
See D155806
Martin Braenne [Thu, 20 Jul 2023 07:12:50 +0000 (07:12 +0000)]
[clang][dataflow] Add an `operator<<` for `OptionalTypeIdentifier`.
When tests fail in UncheckedOptionalAccessModelTest.cpp, this prints the name of the optional type instead of a blob of hex.
Reviewed By: ymandel
Differential Revision: https://reviews.llvm.org/D155788
Nikita Popov [Thu, 20 Jul 2023 12:49:08 +0000 (14:49 +0200)]
[LVI] Check ConstantFoldCompareInstOperands() failure (NFCI)
I don't believe this can happen right now (because we're only
working on icmps and as such can't hit the current fcmp null
paths), but this will be possible in the future when icmp
constant expressions are removed.
Nikita Popov [Thu, 20 Jul 2023 12:47:50 +0000 (14:47 +0200)]
[ConstantFolding] Update failure behavior documentation (NFC)
These functions may return null or a constant expression on failure,
depending on whether such a constant expression is still supported.
wanglei [Mon, 17 Jul 2023 23:03:03 +0000 (07:03 +0800)]
[LoongArch] Fix instruction definitions with incorrectly specified input/output operands
This has no impact on the current assembly functionality but will affect
the patches for the subsequent code generation.
Alex Bradbury [Thu, 20 Jul 2023 12:48:15 +0000 (13:48 +0100)]
[RISCV] Don't include X1 in the X0_PD register pair
Zdinx on RV32 defines the D instructions as taking even register pairs,
and specifies that when using X0 as a destination, X1 won't be
written, and if using X0 as a source then the value is still all 0s
(i.e. X1 isn't read). Therefore, it's incorrect to model X0_PD as having
X1 as a subregister. This will also be the case for register pairs in
Zacas and the P extension (and this patch takes the same approach as
D95588 does).
This patch introduces a dummy register that is solely used as a subreg
alongside X0 in X0_PD. An earlier version of the patch had a minor
effect on register allocation in some tests, which is now avoided by:
1) Adding RISCV::DUMMY_REG_PAIR_WITH_X0 to RISCVRegisterInfo::getReservedRegs
2) Defining a new register class that includes DUMMY_REG_PAIR_WITH_X0
Differential Revision: https://reviews.llvm.org/D153974
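The X0 pair semantics described above can be modelled with a toy register file. This is a sketch in Python, not the TableGen change itself; the class and method names are hypothetical, and placing the low 32 bits in the even register is an assumption of the sketch.

```python
class PairRegisterFile:
    """Toy RV32 register file where a 64-bit value lives in an even/odd register pair."""

    def __init__(self):
        self.x = [0] * 32

    def write_pair(self, even, value):
        if even % 2 != 0:
            raise ValueError("pair must start at an even register")
        if even == 0:
            return  # X0 as destination: nothing is written, so X1 stays untouched
        self.x[even] = value & 0xFFFFFFFF              # low half (assumed layout)
        self.x[even + 1] = (value >> 32) & 0xFFFFFFFF  # high half

    def read_pair(self, even):
        if even % 2 != 0:
            raise ValueError("pair must start at an even register")
        if even == 0:
            return 0  # X0 as source: all zeros, X1 is not read
        return self.x[even] | (self.x[even + 1] << 32)
```

Because neither path through the X0 pair touches `x[1]`, treating X1 as a subregister of that pair would describe a dependency that does not exist.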
Jon Chesterfield [Thu, 20 Jul 2023 12:43:17 +0000 (13:43 +0100)]
[libc][amdgpu] Tolerate different install directories for hsa.h
HSA headers might be under a hsa/ directory or might not.
This scheme matches the one used by the openmp amdgpu plugin.
Reviewed By: jhuber6, jplehr
Differential Revision: https://reviews.llvm.org/D155812
Martin Braenne [Thu, 20 Jul 2023 08:53:41 +0000 (08:53 +0000)]
[clang][dataflow] Print the source line if we saw unexpected diagnostics in tests.
This makes it easier to determine which line the unexpected diagnostic occurred on; previously, we would only get the line number.
Reviewed By: ymandel
Differential Revision: https://reviews.llvm.org/D155802
Weining Lu [Thu, 20 Jul 2023 11:35:37 +0000 (19:35 +0800)]
[LoongArch][NFC] Remove incorrect notes in clang tests
The assertions in these two tests were not auto-generated by update_cc_test_checks.py. Remove them.
Nikita Popov [Thu, 20 Jul 2023 12:17:07 +0000 (14:17 +0200)]
[InstCombine] Avoid ConstantExpr::getAnd() (NFCI)
In preparation for removing `and` constant expressions.
LLVM GN Syncbot [Thu, 20 Jul 2023 12:08:42 +0000 (12:08 +0000)]
[gn build] Port a2160dd34d56
Timm Bäder [Thu, 20 Jul 2023 08:51:29 +0000 (10:51 +0200)]
[clang][Interp][NFC] Add a debugging assertion
We will probably have to remove this at some point, but until then, make
sure we're not running into much-harder-to-debug problems later on.
yrong [Thu, 20 Jul 2023 12:00:10 +0000 (20:00 +0800)]
[libc++][ranges] Implement P2474R2(`views::repeat`).
- Implement https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2474r2.html
- Implement LWG3875(https://cplusplus.github.io/LWG/issue3875).
Depends on D151629
Reviewed By: #libc, Mordante, philnik, var-const
Differential Revision: https://reviews.llvm.org/D141699
Guray Ozen [Thu, 20 Jul 2023 10:26:35 +0000 (12:26 +0200)]
[mlir][nvgpu] Add `mbarrier.arrive.expect_tx` and `mbarrier.try_wait.parity`
This work adds two Ops:
`mbarrier.arrive.expect_tx` performs expect_tx on `mbarrier.barrier` and returns `mbarrier.barrier.token`
`mbarrier.try_wait.parity` waits on `mbarrier.barrier` and `mbarrier.barrier.token`
`mbarrier.arrive.expect_tx` is one of the requirements to enable H100 TMA support.
Depends on D154074 D154076 D154059 D154060
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D154094
Thorsten Schütt [Fri, 14 Jul 2023 08:48:25 +0000 (10:48 +0200)]
[GIsel][AArch64] extend legalization of G_INSERT_VECTOR_ELT
Fixes https://github.com/llvm/llvm-project/issues/63826
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D155274
Ilya Leoshkevich [Thu, 20 Jul 2023 11:28:28 +0000 (13:28 +0200)]
[TableGen][CodeEmitterGen] Avoid empty OpNum switches in getOperandBitOffset()
getOperandBitOffset() causes the following warning on MSVC:
E:\llvm\ninja\lib\Target\SystemZ\SystemZGenMCCodeEmitter.inc(15414): warning C4060: switch statement contains no 'case' or 'default' labels
Do not emit empty OpNum switches.
Reviewed By: RKSimon, uweigand
Differential Revision: https://reviews.llvm.org/D155805
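A minimal sketch of the emitter-side guard, written in Python for illustration. The real generator is C++ and the exact text it emits is not shown here, so `emit_opnum_switch` and the emitted line shapes are hypothetical.

```python
def emit_opnum_switch(cases):
    """Return the lines of an OpNum switch for `cases` (operand number -> bit offset).

    When there are no cases, emit nothing at all instead of an empty switch,
    which is what triggers MSVC warning C4060.
    """
    if not cases:
        return []
    lines = ["switch (OpNum) {"]
    for op_num, offset in sorted(cases.items()):
        lines.append(f"case {op_num}: return {offset};")
    lines.append("}")
    return lines
```

The guard sits in the generator, so instructions without any offset-bearing operands produce no switch statement in the generated `.inc` file.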
Simon Pilgrim [Thu, 20 Jul 2023 11:11:48 +0000 (12:11 +0100)]
[DAG] ShrinkDemandedConstant - early-out for empty DemandedBits/Elts
Leave this to constant folding in SimplifyDemandedBits
Fixes #63975
Tobias Gysi [Thu, 20 Jul 2023 08:13:17 +0000 (08:13 +0000)]
[mlir][llvm] Add branch weight op interface
This revision adds a branch weight op interface for the call / branch
operations that support branch weights. It can be used in the LLVM IR
import and export to simplify the branch weight conversion. An
additional mapping between call operations and instructions ensures
the actual conversion can be done in the module translation itself,
rather than in the dialect translation interface. It also has the
benefit that downstream users can amend custom metadata to the call
operation during the export to LLVM IR.
Reviewed By: zero9178, definelicht
Differential Revision: https://reviews.llvm.org/D155702
Michael Halkenhaeuser [Tue, 18 Jul 2023 16:56:12 +0000 (12:56 -0400)]
[clang][OpenMP] Add interop support for multiple depend clauses
This patch removes the constraint of the `interop` directive where only a single
`depend` clause was allowed.
Differential Revision: https://reviews.llvm.org/D155692
Bryan Chan [Thu, 20 Jul 2023 10:04:29 +0000 (06:04 -0400)]
[Clang][AArch64][SME] Generate target features from +(no)sme.* options
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D142702
Bryan Chan [Thu, 20 Jul 2023 10:04:14 +0000 (06:04 -0400)]
[Clang][AArch64][SME] Add outer product intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):
- svmopa_za32[_bf16]_m // also for s8, u8, f16, f32
- svmops_za32[_bf16]_m // also for s8, u8, f16, f32
- svsumopa_za32[_s8]_m
- svsumops_za32[_s8]_m
- svusmopa_za32[_u8]_m
- svusmops_za32[_u8]_m
When the sme-f64f64 feature is enabled, the following intrinsics are supported:
- svmopa_za64_f64_m
- svmops_za64_f64_m
When the sme-i16i64 feature is enabled, the following intrinsics are supported:
- svmopa_za64[_s16]_m // also for u16
- svmops_za64[_s16]_m // also for u16
- svsumopa_za64[_s16]_m
- svsumops_za64[_s16]_m
- svusmopa_za64[_u16]_m
- svusmops_za64[_u16]_m
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D134681
Bryan Chan [Thu, 20 Jul 2023 10:03:55 +0000 (06:03 -0400)]
[Clang][AArch64][SME] Add intrinsics for adding vector elements to ZA tile
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):
- svaddha_za32[_u32]_m // also for s32
- svaddva_za32[_u32]_m // also for s32
- svaddha_za64[_u64]_m // also for s64
- svaddva_za64[_u64]_m // also for s64
The _za64 versions are available only when the sme-i16i64 feature is enabled.
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D134680
Bryan Chan [Thu, 20 Jul 2023 09:58:45 +0000 (05:58 -0400)]
[Clang][AArch64][SME] Add intrinsics for reading streaming vector length
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):
- svcntsb
- svcntsh
- svcntsw
- svcntsd
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D134679
Bryan Chan [Thu, 20 Jul 2023 09:51:19 +0000 (05:51 -0400)]
[Clang][AArch64][SME] Add intrinsics for ZA array load/store (LDR/STR)
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):
- svldr_vnum_za
- svstr_vnum_za
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D134678
Bryan Chan [Thu, 20 Jul 2023 09:50:46 +0000 (05:50 -0400)]
[Clang][AArch64][SME] Add ZA zeroing intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):
- svzero_mask_za
- svzero_za
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D134677
Bryan Chan [Thu, 20 Jul 2023 09:50:16 +0000 (05:50 -0400)]
[Clang][AArch64][SME] Add vector read/write (mova) intrinsics
This patch adds support for the following SME ACLE intrinsics (as defined
in https://arm-software.github.io/acle/main/acle.html):
- svread_hor_za8[_s8]_m // also for u8
- svread_hor_za16[_s16]_m // also for u16, f16, bf16
- svread_hor_za32[_s32]_m // also for u32, f32
- svread_hor_za64[_s64]_m // also for u64, f64
- svread_hor_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
- svread_ver_za8[_s8]_m // also for u8
- svread_ver_za16[_s16]_m // also for u16, f16, bf16
- svread_ver_za32[_s32]_m // also for u32, f32
- svread_ver_za64[_s64]_m // also for u64, f64
- svread_ver_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
- svwrite_hor_za8[_s8]_m // also for u8
- svwrite_hor_za16[_s16]_m // also for u16, f16, bf16
- svwrite_hor_za32[_s32]_m // also for u32, f32
- svwrite_hor_za64[_s64]_m // also for u64, f64
- svwrite_hor_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
- svwrite_ver_za8[_s8]_m // also for u8
- svwrite_ver_za16[_s16]_m // also for u16, f16, bf16
- svwrite_ver_za32[_s32]_m // also for u32, f32
- svwrite_ver_za64[_s64]_m // also for u64, f64
- svwrite_ver_za128[_s8]_m // also for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
Co-authored-by: Sagar Kulkarni <sagar.kulkarni1@huawei.com>
Reviewed By: sdesmalen, kmclaughlin
Differential Revision: https://reviews.llvm.org/D128648
Simon Pilgrim [Thu, 20 Jul 2023 09:35:44 +0000 (10:35 +0100)]
[DAG] hoistLogicOpWithSameOpcodeHands - ensure SIGN_EXTEND_INREG nodes have the same extension value type
Fix bug in the check for matching SIGN_EXTEND_INREG types
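A small Python model shows why the extension value types must match before hoisting the logic op. The 8-bit width and the helper names are illustrative assumptions; `sext_inreg` here sign-extends the low `bits` bits of a value within that width.

```python
WIDTH = 8  # illustrative register width in bits

def sext_inreg(value, bits):
    """Sign-extend the low `bits` bits of `value` to WIDTH bits (returned as unsigned)."""
    low = value & ((1 << bits) - 1)
    if low & (1 << (bits - 1)):
        low -= 1 << bits
    return low & ((1 << WIDTH) - 1)

def unfolded(v0, v1):
    """and(sextinreg(v0, i2), sextinreg(v1, i5)): the original expression."""
    return sext_inreg(v0, 2) & sext_inreg(v1, 5)

def bad_fold(v0, v1):
    """sextinreg(and(v0, v1), i2): the incorrect result of hoisting."""
    return sext_inreg(v0 & v1, 2)

def same_type_fold_is_valid(bits):
    """Check exhaustively that hoisting is fine when both sides use the same type."""
    return all(
        sext_inreg(a, bits) & sext_inreg(b, bits) == sext_inreg(a & b, bits)
        for a in range(1 << WIDTH)
        for b in range(1 << WIDTH)
    )
```

For `v0 = v1 = 2`, `unfolded` gives `0x02` while `bad_fold` gives `0xFE`, so the fold is only sound when both extension types are equal.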
Simon Pilgrim [Thu, 20 Jul 2023 09:25:40 +0000 (10:25 +0100)]
[X86] Add test case showing incorrect and(sextinreg(v0,i2),sextinreg(v1,i5)) -> sextinreg(and(v0,v1),i2) fold
David Green [Thu, 20 Jul 2023 09:34:05 +0000 (10:34 +0100)]
[LV][AArch64] Fix reductions costs in strict-fadd-cost.ll. NFC
These tests were originally added in 0aff1798b5721d5f95d16f465b99d, where they
were measuring the cost of fadd and fmuladd reductions, which should be fairly
high cost. For some reason, due to the forced vector factors, the debug costs
of each instruction are printed twice by the vectorizer. Once as if the
instruction is a simple fadd/fmuladd, and later with the correct reduction
cost.
In d827865e9f778f5b27edb2afe003c2a the costs were updated to match the first
print statements, where they would be better to match the second to test the
cost of the reduction.
This patch returns them to testing the original reduction costs.
Ivan Butygin [Fri, 30 Jun 2023 18:51:20 +0000 (20:51 +0200)]
[mlir] Add `ub` dialect and `poison` op.
Add new dialect boilerplate and `poison` op definition.
Discussion: https://discourse.llvm.org/t/rfc-poison-semantics-for-mlir/66245/24
Differential Revision: https://reviews.llvm.org/D154248
wangpc [Thu, 20 Jul 2023 09:16:05 +0000 (17:16 +0800)]
[NFC][RISCV] Rewrite TableGen files using named arguments
To simplify code and show the usage of named arguments.
Reviewed By: michaelmaitland, MaskRay
Differential Revision: https://reviews.llvm.org/D154067
LLVM GN Syncbot [Thu, 20 Jul 2023 08:55:09 +0000 (08:55 +0000)]
[gn build] Port 1c154bd75515
LLVM GN Syncbot [Thu, 20 Jul 2023 08:55:08 +0000 (08:55 +0000)]
[gn build] Port 049d6a3f428e
Sylvestre Ledru [Thu, 20 Jul 2023 08:54:54 +0000 (10:54 +0200)]
Revert "[CUDA][HIP] Use the same default language std as C++"
This reverts commit 2d1d07152bd26b001dedec3400b4b01d3bb11622.
Markus Böck [Thu, 20 Jul 2023 07:53:38 +0000 (09:53 +0200)]
[mlir][LLVM] Handle access groups during inlining
Handling access groups is luckily rather trivial: Any access groups from the call instruction are simply appended to any memory operations.
This is similar to one of the steps when handling alias scopes.
This patch nevertheless implements it as a separate function purely for readability purposes as it uses a different interface than alias scopes.
Differential Revision: https://reviews.llvm.org/D155795
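The step described above can be sketched as follows. Operations are modelled as plain dictionaries, which is purely an illustration device; `inline_access_groups` and the dictionary keys are hypothetical and unrelated to the MLIR interfaces the patch uses.

```python
def inline_access_groups(call_access_groups, inlined_ops):
    """Append the call's access groups to every memory operation of the inlined body."""
    for op in inlined_ops:
        if op.get("is_memory_op"):
            # Existing groups are kept; the call's groups are appended after them.
            op.setdefault("access_groups", []).extend(call_access_groups)
    return inlined_ops
```

Non-memory operations are left untouched, and a memory operation that already carries access groups ends up with its own groups followed by those of the call.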
Matthias Springer [Thu, 20 Jul 2023 08:20:36 +0000 (10:20 +0200)]
[mlir] Remove some code duplication between `Builders.cpp` and `FoldUtils.cpp`
Also update the documentation of `Operation::fold`, which did not take into account in-place foldings.
Differential Revision: https://reviews.llvm.org/D155691
Weining Lu [Thu, 20 Jul 2023 08:09:34 +0000 (16:09 +0800)]
[LoongArch][NFC] Revise preprocessor test init-loongarch.c
- Add `--match-full-lines` to FileCheck invocations.
- Remove useless `grep __loongarch_`s.
Matthias Springer [Thu, 20 Jul 2023 08:11:44 +0000 (10:11 +0200)]
[mlir][IR] Implement proper folder for `IsCommutative` trait
Commutative ops were previously folded with a special rule in `OperationFolder`. This change turns the folding into a proper `OpTrait` folder.
Differential Revision: https://reviews.llvm.org/D155687
Advenam Tacet [Tue, 18 Jul 2023 19:15:13 +0000 (21:15 +0200)]
[ASan][libc++] Annotating std::deque with all allocators
This patch is part of our efforts to support container annotations with (almost) every allocator.
Annotating std::deque with default allocator is implemented in D132092.
Support in ASan API exists since rG1c5ad6d2c01294a0decde43a88e9c27d7437d157.
The motivation for the research and those changes was a bug, found by Trail of Bits, in real code where an out-of-bounds read could happen as two strings were compared via a `std::equal` function that took `iter1_begin`, `iter1_end`, `iter2_begin` iterators (with a custom comparison function).
When object `iter1` was longer than `iter2`, read out-of-bounds on `iter2` could happen. Container sanitization would detect it.
If you have any questions, please email:
- advenam.tacet@trailofbits.com
- disconnect3d@trailofbits.com
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D146815
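The bug pattern described above can be modelled in a few lines of Python. This is an analogy, not the original C++ code: `equal_three_iter` mimics the (begin1, end1, begin2) form, and Python's `IndexError` stands in for the out-of-bounds read that container annotations would let ASan report. Both function names are hypothetical.

```python
def equal_three_iter(first, second, pred=lambda a, b: a == b):
    """Model of the three-iterator form: the length of `second` is never checked."""
    for i in range(len(first)):
        # When `second` is shorter, C++ would read past its end here;
        # Python raises IndexError instead.
        if not pred(first[i], second[i]):
            return False
    return True

def equal_four_iter(first, second, pred=lambda a, b: a == b):
    """Model of the four-iterator form: a length mismatch is rejected up front."""
    if len(first) != len(second):
        return False
    return all(pred(a, b) for a, b in zip(first, second))
```

Comparing `"abcd"` against `"abc"` with the first helper runs off the end of the shorter string, while the second helper returns `False` without touching memory outside either range.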
Edoardo Sanguineti [Thu, 20 Jul 2023 08:14:07 +0000 (08:14 +0000)]
[libc++] Fix tests for the runtime assertions in <barrier>
As @ldionne pointed out to me in a newer revision, there is a //REQUIRE comment in both files edited by this patch that prevents the test from running on some platforms where it should actually run.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D155755
David Green [Thu, 20 Jul 2023 08:12:05 +0000 (09:12 +0100)]
[DAG][AArch64] Fix truncated vscale constant types
It appears that vscale values truncated to i1 cause mismatches in the constant
types when created in getNode. https://godbolt.org/z/TaaTo86ne.
Differential Revision: https://reviews.llvm.org/D155626
Ilya Leoshkevich [Thu, 20 Jul 2023 08:10:16 +0000 (10:10 +0200)]
[TableGen][CodeEmitterGen] Add support for querying operand bit offsets
In order to generate relocations or to apply fixups after the layout
has been computed, the targets need to know the offsets of the
respective operands. There are indirect ways to figure them out in some
cases, for example, on SystemZ, the first memory operand is always at
offset 2, and the second one is always at offset 4. But there are no
such tricks for the immediate operands on SystemZ, so one has to refer
to individual instruction encodings.
This information, however, is available to TableGen. Generate
the getOperandBitOffset() method to access it, and use it to simplify
getting memory operand offsets on SystemZ. This also paves the way for
implementing symbolic immediates on this platform.
For the multi-lit operands, getOperandBitOffset() returns the offset of
the first lit.
An alternative way to obtain offsets would be to pass them to the
encoder methods, but this would require reworking all targets. Also,
VarLenCodeEmitter already does this, but adopting it requires
reworking the respective targets without other significant benefits.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155329
Rik Huijzer [Thu, 20 Jul 2023 08:09:34 +0000 (10:09 +0200)]
[MLIR][Tensor] Avoid crash on negative dimensions
In https://reviews.llvm.org/D151611, a check was added to the tensor verifier to
emit an error on negative tensor dimensions. This check allowed for dynamic
dimensions, hence negative dimensions were still able to get through the verifier.
This is a problem in situations such as #60558, where the dynamic dimension is
converted to a static (and possibly negative) dimension by another pass in the
compiler. This patch fixes that by doing another check during the
`StaticTensorGenerate` conversion and returning a failure if the dimension is
negative.
As a side-note, I have to admit that I do not know why returning a failure in
`StaticTensorGenerate` gives a nice "tensor dimensions must be non-negative"
error. I suspect that the verifier runs again when `return failure()` is called,
but I am not sure.
Fixes #60558.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D155728
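A minimal Python model of the two checks discussed above. The `DYNAMIC` marker and both helpers are hypothetical; returning `None` stands in for `return failure()` in the conversion.

```python
DYNAMIC = None  # hypothetical marker for a dynamic dimension

def verify_shape(shape):
    """Verifier-style check: static dimensions must be non-negative, dynamic ones pass."""
    return all(d is DYNAMIC or d >= 0 for d in shape)

def make_static(shape, known_sizes):
    """Replace each dynamic dimension by the next known size.

    Returns None (standing in for a conversion failure) if any resulting
    dimension is negative, which the verifier alone could not have caught.
    """
    sizes = iter(known_sizes)
    result = [next(sizes) if d is DYNAMIC else d for d in shape]
    if any(d < 0 for d in result):
        return None
    return result
```

The shape `[2, DYNAMIC]` passes `verify_shape`, yet `make_static([2, DYNAMIC], [-1])` fails, which is the gap the second check closes.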
wangpc [Thu, 20 Jul 2023 08:02:03 +0000 (16:02 +0800)]
[TableGen] Support named arguments
We provide a way to specify arguments in the form of `name=value`
so that we don't have to specify all optional arguments before the
one we'd like to change. Required arguments can also be specified
in this way.
Note that an argument can only be specified once, whether by name or by
position, and positional arguments must be put before named
arguments.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D152998
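The binding rules in this message can be sketched in Python. This models the rules only and is not TableGen's parser: `bind_arguments`, `REQUIRED`, and the error texts are hypothetical. Taking positional and named arguments as separate lists encodes the rule that positional arguments come first.

```python
REQUIRED = object()  # sentinel: the parameter has no default value

def bind_arguments(params, positional, named):
    """Bind call arguments to `params`, a list of (name, default) in declaration order.

    `positional` is a list of values; `named` is a list of (name, value) pairs.
    """
    if len(positional) > len(params):
        raise ValueError("too many positional arguments")
    bound = {}
    for (name, _), value in zip(params, positional):
        bound[name] = value
    known = {name for name, _ in params}
    for name, value in named:
        if name not in known:
            raise ValueError(f"unknown argument '{name}'")
        if name in bound:
            # Covers both a repeated name and a name already given positionally.
            raise ValueError(f"argument '{name}' specified more than once")
        bound[name] = value
    for name, default in params:
        if name not in bound:
            if default is REQUIRED:
                raise ValueError(f"missing required argument '{name}'")
            bound[name] = default
    return bound
```

With parameters `a` (required), `b = 1`, and `c = 2`, a call that passes `10` positionally and `c=5` by name changes `c` without having to spell out `b`.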
Fangrui Song [Thu, 20 Jul 2023 07:42:38 +0000 (00:42 -0700)]
[llvm-readobj] Print <null> for relocation target with an empty name
For a relocation, we don't differentiate the two cases:
* the symbol index is 0
* the symbol index is non zero, the type is not STT_SECTION, and the name is empty. Clang generates such local symbols for RISC-V linker relaxation.
So we may print
```
Offset Info Type Symbol's Value Symbol's Name + Addend
000000000000001c 0000000100000039 R_RISCV_32_PCREL
0000000000000000 0
// llvm-readobj
0x1C R_RISCV_32_PCREL - 0x0
```
while GNU readelf prints "<null>", which is clearer. Let's match the GNU behavior.
Related to https://reviews.llvm.org/D81842
```
000000000000001c 0000000100000039 R_RISCV_32_PCREL
0000000000000000 <null> + 0
// llvm-readobj
0x1C R_RISCV_32_PCREL <null> 0x0
```
Reviewed By: jhenderson, kito-cheng
Differential Revision: https://reviews.llvm.org/D155353
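A minimal sketch of the distinction this commit draws, in standalone C++. The helper `relocTargetName` is hypothetical and is not the llvm-readobj implementation. The "-" placeholder for symbol index 0 follows the llvm-readobj line quoted above, and showing the section name for STT_SECTION symbols is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: picks the text shown for a relocation's target.
std::string relocTargetName(unsigned symIndex, bool isSectionSymbol,
                            const std::string &symName,
                            const std::string &sectionName) {
  if (symIndex == 0)
    return "-"; // no symbol at all (placeholder is an assumption)
  if (isSectionSymbol)
    return sectionName; // assumption: section symbols show the section name
  if (symName.empty())
    return "<null>"; // a real symbol whose name is empty
  return symName;
}

// Usage: the two cases from the commit message no longer print the same text.
inline void example() {
  assert(relocTargetName(0, false, "", "") == "-");
  assert(relocTargetName(1, false, "", "") == "<null>");
}
```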
Fangrui Song [Thu, 20 Jul 2023 07:39:01 +0000 (00:39 -0700)]
[llvm-readobj][test] Pre-commit an empty symbol name test for D155353
wangpc [Thu, 20 Jul 2023 07:16:21 +0000 (15:16 +0800)]
[TableGen][NFC] Remove unreachable code
The removed code assumed that we can define classes inside a multiclass,
so the name of the outer multiclass was concatenated to the qualified name.
But the current TableGen grammar does not allow defining classes in a
multiclass, so the code is unnecessary.
This commit was requested in D152998.
Corentin Jabot [Thu, 20 Jul 2023 07:17:56 +0000 (09:17 +0200)]
[clang-tools-extra] the message in a static_assert is not always a string literal
Fixes build failure introduced by 47ccfd7.
Nuno Lopes [Thu, 20 Jul 2023 07:14:43 +0000 (08:14 +0100)]
[InstCombineVectorOps] Use poison instead of undef as placeholder [NFC]
Undef was being used to populate unused vector lanes.
While at it, switch extractelement to use poison as the OOB value (per LangRef).
Craig Topper [Thu, 20 Jul 2023 07:03:44 +0000 (00:03 -0700)]
[RISCV] Sink more common code from RVInst/RVInst16 into RVInstCommon. NFC
Reviewed By: wangpc
Differential Revision: https://reviews.llvm.org/D155787
Fangrui Song [Thu, 20 Jul 2023 07:06:47 +0000 (00:06 -0700)]
[WebAssembly] Use MapVector to stabilize iteration order after D150803
StringMap iteration order is not guaranteed to be deterministic
(https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h).
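The idea behind switching to an insertion-ordered container can be sketched with the standard library alone. `OrderedMap` below is a hypothetical stand-in written for this example. It is not LLVM's MapVector and makes no claim about how that class is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Minimal insertion-ordered map: a hash index for lookup plus a vector
// that remembers the order in which keys were first inserted.
class OrderedMap {
public:
  // Inserts key with value 0 if absent and returns a reference to its value.
  int &operator[](const std::string &key) {
    auto [it, inserted] = index_.try_emplace(key, entries_.size());
    if (inserted)
      entries_.emplace_back(key, 0);
    return entries_[it->second].second;
  }

  // Iterating over this vector visits keys in insertion order, independent
  // of how the hash table happens to lay out its buckets.
  const std::vector<std::pair<std::string, int>> &entries() const {
    return entries_;
  }

private:
  std::unordered_map<std::string, std::size_t> index_;
  std::vector<std::pair<std::string, int>> entries_;
};

// Usage: updating an existing key does not change its position.
inline void example() {
  OrderedMap m;
  m["b"] = 1;
  m["a"] = 2;
  m["b"] = 3;
  assert(m.entries().front().first == "b");
}
```

The cost of this layout is one extra index per key, in exchange for output that does not depend on hashing details.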
Fangrui Song [Thu, 20 Jul 2023 07:02:06 +0000 (00:02 -0700)]
[WebAssembly] Use SetVector to stabilize iteration order after D120365
StringMap iteration order is not guaranteed to be deterministic
(https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h).
Corentin Jabot [Tue, 1 Nov 2022 12:37:12 +0000 (13:37 +0100)]
[Clang] Implement P2741R3 - user-generated static_assert messages
Reviewed By: #clang-language-wg, aaron.ballman
Differential Revision: https://reviews.llvm.org/D154290
Freddy Ye [Thu, 20 Jul 2023 06:30:46 +0000 (14:30 +0800)]
[X86] Add AVX-VNNI-INT16 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D155145
Fangrui Song [Thu, 20 Jul 2023 06:30:30 +0000 (23:30 -0700)]
.debug_gnu_pub{names,types}: Stabilize iteration order
StringMap iteration order is not guaranteed to be deterministic
(https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h).
Sort by DIE offset (which looks like a pre-order traversal order).
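Sorting by a stable key before emission can be sketched as follows. The function `sortedByOffset` and the use of `std::unordered_map` as the unordered source are assumptions for this example, and the tie-break on the name is an addition of the sketch, not something the commit states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>
#include <unordered_map>
#include <utility>
#include <vector>

// Copies (name, offset) pairs out of an unordered container and sorts them
// by offset, so the emitted order no longer depends on hash iteration order.
std::vector<std::pair<std::string, uint64_t>>
sortedByOffset(const std::unordered_map<std::string, uint64_t> &names) {
  std::vector<std::pair<std::string, uint64_t>> out(names.begin(),
                                                    names.end());
  std::sort(out.begin(), out.end(), [](const auto &a, const auto &b) {
    // Offset first; the name breaks ties so the result is fully determined.
    return std::tie(a.second, a.first) < std::tie(b.second, b.first);
  });
  return out;
}

// Usage: the entry with the smallest offset always comes first.
inline void example() {
  auto v = sortedByOffset({{"main", 0x40}, {"helper", 0x10}});
  assert(v.front().first == "helper");
}
```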
Danila Malyutin [Wed, 19 Jul 2023 16:32:59 +0000 (19:32 +0300)]
[X86] Recognize standalone `(1 << nbits) - 1` pattern as bzhi
This can be thought of as a subcase of `x & ((1 << nbits) - 1)` where x == -1.
Differential Revision: https://reviews.llvm.org/D155622
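The identity behind this change can be checked at the source level. The function names below are hypothetical, the sketch assumes 32-bit values, and it says nothing about which instructions a compiler actually selects.

```cpp
#include <cstdint>

// Keeps the low nbits bits of x. Requires nbits < 32, since shifting a
// 32-bit value by 32 or more is undefined in C++.
constexpr uint32_t extractLowBits(uint32_t x, unsigned nbits) {
  return x & ((uint32_t{1} << nbits) - 1);
}

// The standalone mask pattern from the commit title.
constexpr uint32_t lowBitsMask(unsigned nbits) {
  return (uint32_t{1} << nbits) - 1;
}

// With x == -1 (all bits set), the masked form reduces to the bare mask.
static_assert(lowBitsMask(5) == extractLowBits(~uint32_t{0}, 5),
              "mask equals low-bit extraction of all-ones");
```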
Danila Malyutin [Wed, 19 Jul 2023 18:00:08 +0000 (21:00 +0300)]
[X86][AArch64] Add additional extract_lowbits test
Check that the vreg_width-1 mask is only removed for shifts.
Differential Revision: https://reviews.llvm.org/D155734
Timm Bäder [Wed, 19 Jul 2023 13:58:40 +0000 (15:58 +0200)]
[clang][Interp][NFC] Add another assertion to InterpStack::peek()
Timm Bäder [Wed, 19 Jul 2023 11:53:18 +0000 (13:53 +0200)]
[clang][Interp][NFC] Fix a doc comment mixup
Timm Bäder [Wed, 19 Jul 2023 11:51:42 +0000 (13:51 +0200)]
[clang][Interp][NFC] Clear InterpStack::ItemTypes in clear()
Craig Topper [Thu, 20 Jul 2023 05:36:16 +0000 (22:36 -0700)]
[RISCV] Use the opcodestr and argstr arguments of Pseudo to simplify tablegen code. NFC
We can replace several "let AsmString =".
Freddy Ye [Thu, 20 Jul 2023 05:34:47 +0000 (13:34 +0800)]
[X86] Add SM4 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
Reviewed By: pengfei, skan
Differential Revision: https://reviews.llvm.org/D155148