Sanjay Patel [Sun, 13 Nov 2022 17:38:12 +0000 (12:38 -0500)]
[SystemZ] improve test for showing store merge miscompile; NFC
See issue #58883 for details.
Philip Reames [Mon, 14 Nov 2022 16:29:55 +0000 (08:29 -0800)]
[RISCV] Implement assembler support for XVentanaCondOps
This change provides an implementation of the XVentanaCondOps vendor extension. This extension is defined in version 1.0.0 of the VTx-family custom instructions specification (https://github.com/ventanamicro/ventana-custom-extensions/releases/download/v1.0.0/ventana-custom-extensions-v1.0.0.pdf) by Ventana Micro Systems.
In addition to the technical contribution, this change is intended to be a test case for our vendor extension policy.
Once this lands, I plan to use this extension to prototype selection lowering to conditional moves. There's an RVI proposal in flight, and the expectation is that lowering to these and the new RVI instructions is likely to be substantially similar.
Differential Revision: https://reviews.llvm.org/D137350
bixia1 [Wed, 9 Nov 2022 17:07:06 +0000 (09:07 -0800)]
[mlir][sparse] Add rewriting rules for sparse_tensor.sort_coo.
Refactor the rewriting of sparse_tensor.sort to support the implementation of
sparse_tensor.sort_coo.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D137522
Sylvain Audi [Wed, 9 Nov 2022 15:01:55 +0000 (10:01 -0500)]
[PDB] Don't include input files in the 'cmd' entry of S_ENVBLOCK
MSVC records the command line arguments in S_ENVBLOCK, skipping the input file arguments.
This patch adds this filtering on lld-link side.
Differential Revision: https://reviews.llvm.org/D137723
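The filtering described above can be sketched in a few lines of Python. This is an illustration only, not lld-link's C++ code: the function name `build_cmd_entry` and the way input files are passed in are assumptions, and argument quoting is ignored.

```python
def build_cmd_entry(args, input_files):
    """Join command-line arguments for a 'cmd' entry, skipping input files.

    `args` is the argument list without the program name. `input_files` is
    the collection of arguments that were resolved as inputs (hypothetical;
    the real linker may identify inputs differently).
    """
    inputs = set(input_files)
    kept = [arg for arg in args if arg not in inputs]
    return " ".join(kept)
```

For example, `build_cmd_entry(["/debug", "a.obj", "/out:a.exe"], ["a.obj"])` keeps only the two option arguments.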
Simon Pilgrim [Mon, 14 Nov 2022 16:13:16 +0000 (16:13 +0000)]
[MCA][X86] Ensure the avx512 vnni tests use the upper xmm/ymm registers
Ensure we're testing the avx512vl vnni instructions and not the avx vnni instructions
Simon Pilgrim [Mon, 14 Nov 2022 15:57:13 +0000 (15:57 +0000)]
[MCA][X86] Add test coverage for VBMI2 instructions
Chris Bieneman [Mon, 14 Nov 2022 16:28:36 +0000 (10:28 -0600)]
[NFC] Fixing spelling in code comment
bixia1 [Fri, 11 Nov 2022 22:24:26 +0000 (14:24 -0800)]
[mlir][sparse][NFC] Add comments to tests that are run with and without runtime libraries.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D137869
Ivan Kosarev [Mon, 14 Nov 2022 16:10:23 +0000 (16:10 +0000)]
[AMDGPU][AsmParser] Forbid TFE modifiers for MBUF stores.
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D137832
Nicholas Guy [Mon, 14 Nov 2022 15:55:44 +0000 (15:55 +0000)]
[NFC] Removal of complex deinterleaving test case complex_mul_v8f64
This test is not particularly useful for testing complex deinterleaving,
especially due to f64 muls not being supported in mve. The test is
being removed as it's hitting an unrelated pre-existing condition
regarding register spilling.
Jay Foad [Mon, 14 Nov 2022 15:27:59 +0000 (15:27 +0000)]
[AMDGPU] More use of DivergentBinFrag and friends. NFC.
Nikita Popov [Tue, 18 Oct 2022 10:11:04 +0000 (12:11 +0200)]
[AA] Move MayBeCrossIteration into AAQI (NFC)
Move the MayBeCrossIteration flag from BasicAA into AAQI. This is
in preparation for exposing it to users of the AA API.
Ivan Kosarev [Mon, 14 Nov 2022 12:37:26 +0000 (12:37 +0000)]
[AMDGPU][MC] Support TFE modifiers in MUBUF loads and stores.
Reviewed By: dp, arsenm
Differential Revision: https://reviews.llvm.org/D137783
Mindong Chen [Mon, 14 Nov 2022 15:18:47 +0000 (23:18 +0800)]
[docs][OpaquePtr] Fix hyperlinks
Jay Foad [Mon, 14 Nov 2022 15:14:55 +0000 (15:14 +0000)]
[AMDGPU] Define and use UniformTernaryFrag. NFC.
Simon Pilgrim [Mon, 14 Nov 2022 10:58:20 +0000 (10:58 +0000)]
[X86] Remove unnecessary overrides for CBW/CWDE/CDQE/CMC instructions
All of these match the default WriteALU schedule
Caroline Concatto [Thu, 3 Nov 2022 12:18:20 +0000 (12:18 +0000)]
[AArch64] Add all SME2.1 instructions Assembly/Disassembly
This patch adds a new feature flag:
sme-f16f16 to represent FEAT_SME-F16F16
This patch adds the following instructions:
SME2.1 stand alone instructions:
  MOVAZ (array to vector, four registers): Move and zero four ZA single-vector groups to vector registers.
        (array to vector, two registers): Move and zero two ZA single-vector groups to vector registers.
        (tile to vector, four registers): Move and zero four ZA tile slices to vector registers.
        (tile to vector, single): Move and zero ZA tile slice to vector register.
        (tile to vector, two registers): Move and zero two ZA tile slices to vector registers.
  LUTI2 (Strided four registers): Lookup table read with 2-bit indexes.
        (Strided two registers): Lookup table read with 2-bit indexes.
  LUTI4 (Strided four registers): Lookup table read with 4-bit indexes.
        (Strided two registers): Lookup table read with 4-bit indexes.
  ZERO (double-vector): Zero ZA double-vector groups.
       (quad-vector): Zero ZA quad-vector groups.
       (single-vector): Zero ZA single-vector groups.
SME2p1 and SME-F16F16:
  All instructions are half precision elements:
  FADD: Floating-point add multi-vector to ZA array vector accumulators.
  FSUB: Floating-point subtract multi-vector from ZA array vector accumulators.
  FMLA (multiple and indexed vector): Multi-vector floating-point fused multiply-add by indexed element.
       (multiple and single vector): Multi-vector floating-point fused multiply-add by vector.
       (multiple vectors): Multi-vector floating-point fused multiply-add.
  FMLS (multiple and indexed vector): Multi-vector floating-point fused multiply-subtract by indexed element.
       (multiple and single vector): Multi-vector floating-point fused multiply-subtract by vector.
       (multiple vectors): Multi-vector floating-point fused multiply-subtract.
  FCVT (widening): Multi-vector floating-point convert from half-precision to single-precision (in-order).
  FCVTL: Multi-vector floating-point convert from half-precision to deinterleaved single-precision.
  FMOPA (non-widening): Floating-point outer product and accumulate.
  FMOPS (non-widening): Floating-point outer product and subtract.
SME2p1 and B16B16:
  BFADD: BFloat16 floating-point add multi-vector to ZA array vector accumulators.
  BFSUB: BFloat16 floating-point subtract multi-vector from ZA array vector accumulators.
  BFCLAMP: Multi-vector BFloat16 floating-point clamp to minimum/maximum number.
  BFMLA (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-add by indexed element.
        (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-add by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-add.
  BFMLS (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-subtract by indexed element.
        (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-subtract by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-subtract.
  BFMAX (multiple and single vector): Multi-vector BFloat16 floating-point maximum by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point maximum.
  BFMAXNM (multiple and single vector): Multi-vector BFloat16 floating-point maximum number by vector.
          (multiple vectors): Multi-vector BFloat16 floating-point maximum number.
  BFMIN (multiple and single vector): Multi-vector BFloat16 floating-point minimum by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point minimum.
  BFMINNM (multiple and single vector): Multi-vector BFloat16 floating-point minimum number by vector.
          (multiple vectors): Multi-vector BFloat16 floating-point minimum number.
  BFMOPA (non-widening): BFloat16 floating-point outer product and accumulate.
  BFMOPS (non-widening): BFloat16 floating-point outer product and subtract.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
Differential Revision: https://reviews.llvm.org/D137571
Nikita Popov [Mon, 14 Nov 2022 14:46:00 +0000 (15:46 +0100)]
[AST] Remove legacy AliasSetPrinter pass
A NewPM version of this pass exists, drop the legacy version of
this testing-only pass.
Sjoerd Meijer [Fri, 11 Nov 2022 12:56:42 +0000 (18:26 +0530)]
[AArch64] Add match patterns for the reassociated forms of FNMUL
Differential Revision: https://reviews.llvm.org/D137925
Nikita Popov [Mon, 14 Nov 2022 14:28:09 +0000 (15:28 +0100)]
[LoopVersioningLICM] Clarify scope of AST (NFC)
Make it clearer that the AST is only temporarily used during the
legality check, and does not have to survive into the transformation
phase.
Joseph Huber [Mon, 14 Nov 2022 14:11:33 +0000 (08:11 -0600)]
[OpenMP] Fix installation to old resource dir
Summary:
The changes in D125860 renamed the old resource directory to only use
the major version. This was not updated for the OpenMP project, causing
OpenMP resources to still be installed in the old `major.minor.rev`
folder. This led to problems including the header files.
fixes #58966
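To make the directory mismatch concrete, here is a small Python sketch of the two naming schemes. The `lib/clang/<version>` layout and the helper name `resource_dir` are assumptions for illustration; the real install rules live in the CMake files.

```python
def resource_dir(prefix, version, major_only=True):
    """Build a resource directory path from a version string.

    With major_only=True only the major version is used (the scheme after
    D125860); otherwise the full `major.minor.rev` string is used.
    The `lib/clang` layout is an assumed example.
    """
    major = version.split(".")[0]
    name = major if major_only else version
    return f"{prefix}/lib/clang/{name}"
```

If one project installs with `major_only=True` and another with `major_only=False`, their files end up in different folders, which is the situation the commit describes.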
Luca Di Sera [Mon, 14 Nov 2022 14:17:22 +0000 (15:17 +0100)]
Add clang_CXXMethod_isMoveAssignmentOperator to libclang
The new method is a wrapper of `CXXMethodDecl::isMoveAssignmentOperator` and
can be used to recognize move-assignment operators in libclang.
An export for the function, together with its documentation, was added to
"clang/include/clang-c/Index.h" with an implementation provided in
"clang/tools/libclang/CIndex.cpp". The implementation was based on
similar `clang_CXXMethod.*` implementations, following the same
structure but calling `CXXMethodDecl::isMoveAssignmentOperator` for its
main logic.
The new symbol was further added to "clang/tools/libclang/libclang.map"
to be exported, under the LLVM16 tag.
"clang/tools/c-index-test/c-index-test.c" was modified to print a
specific tag, "(move-assignment operator)", for cursors that are
recognized by `clang_CXXMethod_isMoveAssignmentOperator`.
A new regression test file,
"clang/test/Index/move-assignment-operator.cpp", was added to check
that the correct constructs are recognized by the new function.
The "clang/test/Index/get-cursor.cpp" regression test file was updated
as it was affected by the new "(move-assignment operator)" tag.
A binding for the new function was added to libclang's Python
bindings, in "clang/bindings/python/clang/cindex.py", adding a new
method for `Cursor`, `is_move_assignment_operator_method`.
An accompanying test was added to
`clang/bindings/python/tests/cindex/test_cursor.py`, testing the new
function with the same methodology as the corresponding libclang test.
The current release note, `clang/docs/ReleaseNotes.rst`, was modified to
report the new addition under the "libclang" section.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D137246
Nikita Popov [Mon, 14 Nov 2022 14:16:38 +0000 (15:16 +0100)]
[LoopVersioningLICM] Remove unnecessary reset code (NFC)
The LoopVersioningLICM object is only ever used for a single loop,
but there was various unnecessary code for handling the case where
it is reused across loops. Drop that code, and pass the loop to the
constructor.
LLVM GN Syncbot [Mon, 14 Nov 2022 14:05:19 +0000 (14:05 +0000)]
[gn build] Port d52e2839f3b1
Nicholas Guy [Mon, 14 Nov 2022 13:59:59 +0000 (13:59 +0000)]
[ARM][CodeGen] Add support for complex deinterleaving
Adds the Complex Deinterleaving Pass implementing support for complex numbers in a target-independent manner, deferring to the TargetLowering for the given target to create a target-specific intrinsic.
Differential Revision: https://reviews.llvm.org/D114174
revunov.denis@huawei.com [Mon, 14 Nov 2022 13:25:20 +0000 (13:25 +0000)]
[BOLT][NFC] Fix possible use-after-free
If NewName twine has reference to the old name, then after
Section.Name = NewName.str(); this reference is invalidated,
so we cannot use NewName.str() anymore.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D137616
Valentin Clement [Mon, 14 Nov 2022 13:27:44 +0000 (14:27 +0100)]
[flang][NFC] Fix typo in fir.box_typecode op description
Dmitry Preobrazhensky [Mon, 14 Nov 2022 13:20:20 +0000 (16:20 +0300)]
[AMDGPU][MC][GFX11] Improve diagnostic messages for invalid VOPD syntax
Differential Revision: https://reviews.llvm.org/D137842
Nicolas Vasilache [Wed, 9 Nov 2022 11:56:26 +0000 (03:56 -0800)]
[mlir][Transform] Add support for dynamically unpacking tile_sizes / num_threads in tile_to_foreach_thread
This commit adds automatic unpacking of Value's of type pdl::OperationType to the underlying single-result OpResult.
This allows mixing single-value, attribute and multi-value pdl::Operation tile sizes and num threads to TileToForeachThreadOp.
Differential Revision: https://reviews.llvm.org/D137896
Ying Yi [Mon, 10 Oct 2022 12:26:56 +0000 (13:26 +0100)]
[ThinLTO] a ThinLTO warning is added if cache_size_bytes or cache_size_files is too small for the current link job. The warning recommends the user to consider adjusting --thinlto-cache-policy.
A specific case for ThinLTO cache pruning is that the current build is huge, and the cache wasn't big enough to hold the intermediate object files of that build. So in doing that build, a file would be cached, and later in that same build it would be evicted. This was significantly decreasing the effectiveness of the cache. By giving this warning, the user could identify the required cache size/files and improve ThinLTO link speed.
Differential Revision: https://reviews.llvm.org/D135590
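The check behind such a warning can be sketched as follows. This is a simplified Python model, not the LLVM implementation: the function name, the use of `None` for "no limit", and the message wording are all assumptions.

```python
def cache_too_small(file_sizes, cache_size_bytes=None, cache_size_files=None):
    """Return warnings if one link job alone would overflow the cache limits.

    `file_sizes` lists the sizes in bytes of the object files the current
    link would place in the cache. A limit of None means "no limit" in this
    sketch (an assumption).
    """
    warnings = []
    total_bytes = sum(file_sizes)
    if cache_size_bytes is not None and total_bytes > cache_size_bytes:
        warnings.append(
            f"cache_size_bytes={cache_size_bytes} is below the "
            f"{total_bytes} bytes needed by this link"
        )
    if cache_size_files is not None and len(file_sizes) > cache_size_files:
        warnings.append(
            f"cache_size_files={cache_size_files} is below the "
            f"{len(file_sizes)} files needed by this link"
        )
    return warnings
```

When either list entry is present, files cached early in the build would be evicted before the build finishes, which is the case the warning is meant to surface.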
Jay Foad [Mon, 14 Nov 2022 11:36:24 +0000 (11:36 +0000)]
[AMDGPU] Simplify SelectPat and remove comment obsoleted by D133593
Thomas Symalla [Mon, 14 Nov 2022 11:55:05 +0000 (12:55 +0100)]
[InstCombine][NFC] Add extractelement tests
HanSheng Zhang [Mon, 14 Nov 2022 11:45:23 +0000 (12:45 +0100)]
[reg2mem] Skip non-sized Instructions (PR58890)
We can only convert sized values into alloca/load/store, skip
instructions returning other types.
Fixes https://github.com/llvm/llvm-project/issues/58890.
Differential Revision: https://reviews.llvm.org/D137700
Christian Sigg [Mon, 14 Nov 2022 11:21:59 +0000 (12:21 +0100)]
[mlir][bazel] NFC: change MLIR_GPU_TO_CUBIN_PASS_ENABLE from `defines` to `local_defines`.
Joshua Cao [Mon, 14 Nov 2022 03:24:15 +0000 (22:24 -0500)]
Do not write a comma when varargs is the only argument
Fixes https://github.com/llvm/llvm-project/issues/56544
AsmWriter always writes ", ..." when a tail call has a varargs argument. This patch only writes the ", " when there is an argument before the varargs argument.
I did not write a dedicated test for this change, but I modified an existing test that will test for a regression.
Reviewed By: avogelsgesang
Differential Revision: https://reviews.llvm.org/D137893
Signed-off-by: Adrian Vogelsgesang <avogelsgesang@salesforce.com>
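The comma fix above can be illustrated with a toy Python model of the parameter printer. Both function names are hypothetical and the real AsmWriter is C++; the sketch only shows the difference between always writing ", ..." and writing the separator only when an argument precedes the varargs marker.

```python
def format_params_old(param_types, is_vararg):
    """Old behaviour in this toy model: always write ', ...' for varargs."""
    text = ", ".join(param_types)
    if is_vararg:
        text += ", ..."
    return f"({text})"


def format_params_new(param_types, is_vararg):
    """New behaviour: write ', ' only if an argument precedes '...'."""
    text = ", ".join(param_types)
    if is_vararg:
        if param_types:
            text += ", "
        text += "..."
    return f"({text})"
```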
Jean Perier [Mon, 14 Nov 2022 10:19:21 +0000 (11:19 +0100)]
[flang] Add hlfir.declare codegen
hlfir.declare codegen generates a fir.declare, and may generate a
fir.embox/fir.rebox/fir.emboxchar if the base value does not convey
all the variable bounds and length parameter information.
Leave OPTIONAL as a TODO to keep this patch simple. It will require
making the embox/rebox optional to preserve the optionality aspects.
Differential Revision: https://reviews.llvm.org/D137789
LLVM GN Syncbot [Mon, 14 Nov 2022 10:12:18 +0000 (10:12 +0000)]
[gn build] Port dd46a08008f7
Haojian Wu [Mon, 14 Nov 2022 10:10:55 +0000 (11:10 +0100)]
Update the wrong isSelfContainedHeader API usage in the test.
Nikita Popov [Mon, 14 Nov 2022 10:01:15 +0000 (11:01 +0100)]
[ConstraintElimination] Use SmallVectorImpl (NFC)
When passing a SmallVector by reference, don't specify its size.
Nikita Popov [Tue, 8 Nov 2022 13:46:09 +0000 (14:46 +0100)]
[TableGen] Use MemoryEffects to represent intrinsic memory effects (NFCI)
The TableGen implementation was using a homegrown implementation of
FunctionModRefInfo. This switches it to use MemoryEffects instead.
This makes the code simpler, and will allow exposing the full
representational power of MemoryEffects in the future. Among other
things, this will allow us to map IntrHasSideEffects to an
inaccessiblemem readwrite, rather than just ignoring it entirely
in most cases.
To avoid layering issues, this moves the ModRef.h header from IR
to Support, so that it can be included in the TableGen layer.
Differential Revision: https://reviews.llvm.org/D137641
Valentin Clement [Mon, 14 Nov 2022 09:50:56 +0000 (10:50 +0100)]
[flang] Add fir.box_typecode operation
The `fir.box_typecode` operation retrieves the type code from a boxed
value. It will be used when converting `fir.select_type` to an
if-then-else ladder for type guard statements with an intrinsic type
spec, instead of using a runtime call.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D137829
Valentin Clement [Mon, 14 Nov 2022 09:46:53 +0000 (10:46 +0100)]
[flang] Initial lowering of SELECT TYPE construct to fir.select_type operation
This patch is the initial path to lower the SELECT TYPE construct to the
fir.select_type operation. More work is required in the AssocEntity
mapping but it will be done in a follow up patch to ease the review.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D137728
Sebastian Neubauer [Mon, 14 Nov 2022 09:46:46 +0000 (10:46 +0100)]
[Coroutines] Do not add allocas for retcon coroutines
Same as for async-style lowering, if there are no resume points in a
function, the coroutine frame pointer will be replaced by an undef,
making all accesses to the frame undefined behavior.
Fix this by not adding allocas to the coroutine frame if there are no
resume points.
Differential Revision: https://reviews.llvm.org/D137866
Sebastian Neubauer [Fri, 11 Nov 2022 21:20:43 +0000 (22:20 +0100)]
[Coroutines] Presubmit retcon without suspend test
The test gets incorrectly optimized to unreachable.
Nikita Popov [Fri, 11 Nov 2022 16:27:31 +0000 (17:27 +0100)]
[ConstraintElimination] Add Decomposition struct (NFCI)
Replace the vector of DecompEntry with a struct that stores the
constant offset separately. I think this is cleaner than giving the
first element special handling.
This probably also fixes some potential ubsan errors by more
consistently using addWithOverflow/multiplyWithOverflow.
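A rough Python sketch of the idea, keeping the constant offset in its own field and checking every arithmetic step for 64-bit overflow. The class layout, the method names, and the choice to raise on overflow are assumptions for illustration; the actual struct is C++ and uses addWithOverflow/multiplyWithOverflow.

```python
from dataclasses import dataclass, field

INT64_MIN, INT64_MAX = -(1 << 63), (1 << 63) - 1


def checked(value):
    """Return value, or raise if it does not fit in a signed 64-bit integer."""
    if not INT64_MIN <= value <= INT64_MAX:
        raise OverflowError("value does not fit in 64 bits")
    return value


@dataclass
class Decomposition:
    offset: int = 0                            # constant part, stored on its own
    terms: list = field(default_factory=list)  # (coefficient, value) pairs

    def add(self, constant):
        """Add a constant to the offset, with an overflow check."""
        self.offset = checked(self.offset + constant)

    def mul(self, factor):
        """Scale the offset and every coefficient, with overflow checks."""
        self.offset = checked(self.offset * factor)
        self.terms = [(checked(c * factor), v) for c, v in self.terms]
```

Because the offset is a separate field, no element of `terms` needs special handling as "the constant".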
Nikita Popov [Fri, 11 Nov 2022 16:02:40 +0000 (17:02 +0100)]
[ConstraintElimination] Make decompose() infallible
decompose() currently returns a mix of {} and 0 + 1*V on failure.
This changes it to always return the 0 + 1*V form, thus making
decompose() infallible.
This makes the code marginally more powerful, e.g. we now fold
sub_decomp_i80 by treating the constant as a symbolic value.
Differential Revision: https://reviews.llvm.org/D137847
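The "0 + 1*V" fallback can be sketched in Python on a tiny expression language. Everything here is a simplified model under stated assumptions: expressions are nested tuples, only `add` and multiplication by a constant are understood, and overflow checks are left out.

```python
def decompose(expr):
    """Return (offset, terms) with expr == offset + sum(coeff * value).

    `terms` maps a value to its coefficient. Any expression this sketch
    does not understand is returned as 0 + 1*expr, so the function never
    fails.
    """
    kind = expr[0]
    if kind == "const":
        return expr[1], {}
    if kind == "add":
        lhs_offset, lhs_terms = decompose(expr[1])
        rhs_offset, rhs_terms = decompose(expr[2])
        terms = dict(lhs_terms)
        for value, coeff in rhs_terms.items():
            terms[value] = terms.get(value, 0) + coeff
        return lhs_offset + rhs_offset, terms
    if kind == "mul" and expr[2][0] == "const":
        offset, terms = decompose(expr[1])
        factor = expr[2][1]
        return offset * factor, {v: c * factor for v, c in terms.items()}
    # Fallback: treat the whole expression as one symbolic value.
    return 0, {expr: 1}
```

Callers no longer need a separate "decomposition failed" path, since the fallback is itself a valid decomposition.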
Jean Perier [Mon, 14 Nov 2022 09:38:22 +0000 (10:38 +0100)]
[flang][RFC] Do not rely on attributes to tag HLFIR variable uses
After more consideration and experience, switch to one of the
alternative plans for HLFIR variables. This avoids having to name
designators and to maintain and update names in attributes after
inlining or code duplication.
The cost is the increase of fir.box usage, which in most cases should
be removed when lowering from HLFIR to FIR.
Differential Revision: https://reviews.llvm.org/D137634
Jean Perier [Mon, 14 Nov 2022 09:37:04 +0000 (10:37 +0100)]
[flang][NFC] rename hlfir::FortranEntity into EntityWithAttributes
This reflects the fact that Attributes will not always be visible when
looking at an HLFIR variable. The EntityWithAttributes class is used
to denote in the compiler code that the value at hand has visible
attributes. It is intended to be used in lowering so that the code
can query about operands attributes when generating code.
Differential Revision: https://reviews.llvm.org/D137792
Jean Perier [Mon, 14 Nov 2022 09:25:03 +0000 (10:25 +0100)]
[flang] Add hlfir.declare operation
This operation will be used to declare named variables in HLFIR.
See the added description in HLFIROpBase.td for more info about it.
The motivation behind this operation is described in https://reviews.llvm.org/D137634.
The FortranVariableInterface verifier is changed a bit. It used to
operate using the result type to verify the provided shape and length
parameters. This is a bit incorrect because what matters to verify the
information is the input address (This worked OK with fir.declare where
the input memref type is the same as the output result). Also, not all
operation defining variables will have an input memref with the same
meaning (hlfir.designate and hlfir.associate for instance).
Hence, this verifier is now optional and must be provided a memref to
operate.
Differential Revision: https://reviews.llvm.org/D137781
Haojian Wu [Mon, 7 Nov 2022 12:30:47 +0000 (13:30 +0100)]
Move the isSelfContainedHeader function from clangd to libtooling.
We plan to reuse it in the include-cleaner library, this patch moves
this functionality from clangd to libtooling, so that this piece of code can be
shared among all clang tools.
Differential Revision: https://reviews.llvm.org/D137697
Kazu Hirata [Mon, 14 Nov 2022 08:31:06 +0000 (00:31 -0800)]
[llvm] Use std::is_integral_v (NFC)
Muhammad Omair Javaid [Mon, 14 Nov 2022 08:27:11 +0000 (12:27 +0400)]
Revert "[libclang] Expose completion result kind in `CXCompletionResult`"
This reverts commit 97105e5bf70fae5d9902081e917fd178b57f1717.
It breaks clang-armv8-quick buildbot:
https://lab.llvm.org/buildbot/#/builders/245/builds/761
Dmitry Makogon [Fri, 11 Nov 2022 09:49:48 +0000 (16:49 +0700)]
[IRCE] Bail out if AddRec in icmp is for another loop (PR58912)
When IRCE runs on outer loop and sees a check of an AddRec of
inner loop, it crashes with an assert in SCEV that the AddRec
must be loop invariant.
This adds a bail out if the AddRec which is checked in icmp
is for another loop.
Fixes https://github.com/llvm/llvm-project/issues/58912.
Differential Revision: https://reviews.llvm.org/D137822
Craig Topper [Mon, 14 Nov 2022 07:43:55 +0000 (23:43 -0800)]
[RISCV] Add PACKH/PACKW/PACK to hasAllNBitUsers.
Craig Topper [Mon, 14 Nov 2022 06:49:51 +0000 (22:49 -0800)]
[RISCV] Add another PACKH pattern.
This handles the case where the upper bits are zeroed with an AND
after the OR.
Kazu Hirata [Mon, 14 Nov 2022 07:50:08 +0000 (23:50 -0800)]
[Support] Use std::is_scalar_v (NFC)
Johannes Reifferscheid [Mon, 14 Nov 2022 07:14:31 +0000 (08:14 +0100)]
Add missing include.
Aiden Grossman [Mon, 14 Nov 2022 06:59:43 +0000 (06:59 +0000)]
[Docs] Add Documentation on BOLT Build Configs
This patch adds documentation into the advanced builds documentation on
how to use the BOLT caches, including the combinations with the PGO
multistage builds and (Thin)LTO.
Reviewed By: sylvestre.ledru, Amir
Differential Revision: https://reviews.llvm.org/D137899
Aiden Grossman [Sat, 12 Nov 2022 22:55:08 +0000 (22:55 +0000)]
[Docs] Add Documentation on (Thin)LTO + PGO Build Configs
This patch adds documentation on the AdvancedBuilds page on how to do
PGO builds with (Thin)LTO with the currently undocumented (as far as I
can tell) PGO_INSTRUMENT_LTO option in the Clang PGO caches.
Reviewed By: sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D137898
Med Ismail Bennani [Mon, 14 Nov 2022 06:19:14 +0000 (22:19 -0800)]
[lldb] Re-phase comments in `ScriptedThread.get_stackframes` method (NFC)
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Med Ismail Bennani [Mon, 14 Nov 2022 06:05:21 +0000 (22:05 -0800)]
[lldb] Remove unused `stack_memory_dump` member from ScriptedProcess class (NFC)
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Craig Topper [Mon, 14 Nov 2022 05:54:15 +0000 (21:54 -0800)]
[RISCV] Improve PACKH instruction selection
Handle AssertZExt in addition to AND.
wangpc [Mon, 14 Nov 2022 05:50:51 +0000 (13:50 +0800)]
[RISCV] Don't use zero-stride vector load if there's no optimized u-arch
For vector strided instructions, as the RVV spec says:
> When rs2=x0, then an implementation is allowed, but not required, to
> perform fewer memory operations than the number of active elements, and
> may perform different numbers of memory operations across different
> dynamic executions of the same static instruction.
So the compiler shouldn't assume that fewer memory operations will be
performed when rs2=x0.
We add a target feature to specify whether the u-arch supports optimized
zero-stride vector loads, and we do the vector splat optimization iff
this feature is supported.
This feature is enabled by default since most designs implement this
optimization.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D137699
Chuanqi Xu [Fri, 11 Nov 2022 08:39:12 +0000 (16:39 +0800)]
[C++20] [Modules] Emit Macro Definition in -module-file-info action
It is helpful to know which macro definitions are emitted in the module
file without opening it directly, and this is not easy to test with a
lit test. So this patch adds the facility to emit macro definitions in
the `-module-file-info` action. This should be harmless for all other
cases.
Mehdi Amini [Thu, 3 Nov 2022 21:44:12 +0000 (21:44 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in MergerTest.cpp (NFC)
Mehdi Amini [Thu, 3 Nov 2022 21:39:14 +0000 (21:39 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in AttrOrTypeFormatGen.cpp (NFC)
Mehdi Amini [Thu, 3 Nov 2022 21:01:01 +0000 (21:01 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in MLIRContext.cpp (NFC)
Craig Topper [Mon, 14 Nov 2022 04:00:34 +0000 (20:00 -0800)]
[RISCV] Add PACKW and PACKH to isSignExtendingOpW in RISCVSExtWRemoval.
PACKW sign extends like other W instructions.
PACKH zeroes bits 63:16 which means bits 63:31 are all zero.
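The two claims above can be checked with a small bit-level model in Python. The helper names are made up for this sketch, and the models follow my reading of the RV64 instruction definitions rather than any LLVM code.

```python
MASK64 = (1 << 64) - 1


def sext32(x):
    """Sign-extend the low 32 bits of x to a 64-bit register value."""
    x &= 0xFFFFFFFF
    return x | 0xFFFFFFFF00000000 if x & 0x80000000 else x


def packh(rs1, rs2):
    """Low byte of rs2 above low byte of rs1; all other bits are zero."""
    return ((rs2 & 0xFF) << 8) | (rs1 & 0xFF)


def packw(rs1, rs2):
    """Low halfword of rs2 above low halfword of rs1, sign-extended from bit 31."""
    return sext32(((rs2 & 0xFFFF) << 16) | (rs1 & 0xFFFF))


def is_sign_extended_w(x):
    """True if bits 63:31 of x are all equal, i.e. x == sext32(x)."""
    return sext32(x) == x
```

In this model the result of `packh` never has bit 31 set, and `packw` ends with an explicit sign extension, so `is_sign_extended_w` holds for both.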
Craig Topper [Mon, 14 Nov 2022 02:34:20 +0000 (18:34 -0800)]
[RISCV] Improve selection of PACKW.
Use hasAllWUsers to check if the upper bits are ignored so we can
use PACKW even when no sign_extend_inreg is present before the OR.
luxufan [Fri, 11 Nov 2022 13:21:45 +0000 (21:21 +0800)]
[LoopFlatten] Forget all block and loop dispositions after flatten
The forgetLoop method only forgets the expressions of phis and their
users. Other SCEV expressions may still have loop dispositions that
point to the destroyed loop, which might cause a crash.
Fixes: https://github.com/llvm/llvm-project/issues/58865
Reviewed By: nikic, fhahn
Differential Revision: https://reviews.llvm.org/D137651
gonglingqin [Mon, 14 Nov 2022 01:27:01 +0000 (09:27 +0800)]
[LoongArch] Expand atomicrmw fadd/fsub/fmin/fmax with CmpXChg
Differential Revision: https://reviews.llvm.org/D137311
Yeting Kuo [Sun, 13 Nov 2022 13:43:31 +0000 (21:43 +0800)]
[RISCV][NFC] Remove dead code.
No ISD::BSWAP nodes are custom lowered in RISC-V now, so this patch
removes the dead code for ISD::BSWAP in LowerOperation.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D137907
Craig Topper [Mon, 14 Nov 2022 01:23:54 +0000 (17:23 -0800)]
[RISCV] Fix incorrect early out from isSignExtendedW in RISCVSExtWRemoval.
We can only return false to abort. If the criteria are met we need
to use continue instead. Returning true stops us from visiting all
nodes and makes the caller think it is safe to remove sext.w.
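The early-out bug described above follows a common worklist pattern, sketched here in Python. The graph shape, the node names, and the `buggy` switch are assumptions for illustration and do not mirror the actual RISCVSExtWRemoval code.

```python
def all_sources_sign_extended(start, graph, known_sext, buggy=False):
    """Check that every value feeding `start` is known to be sign-extended.

    `graph` maps a node to the nodes it is computed from; `known_sext` is
    the set of nodes known to produce sign-extended results.
    """
    worklist = [start]
    visited = set()
    while worklist:
        node = worklist.pop()
        if node in visited:
            continue
        visited.add(node)
        if node in known_sext:
            if buggy:
                return True  # wrong: remaining worklist entries are skipped
            continue  # this node is fine, keep checking the others
        sources = graph.get(node, [])
        if not sources:
            return False  # unknown leaf: abort, the answer is "no"
        worklist.extend(sources)
    return True  # every visited node passed
```

With `graph = {"phi": ["b", "a"]}` and `known_sext = {"a"}`, the buggy variant answers True after seeing only "a", while the fixed variant goes on to visit "b" and answers False.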
Craig Topper [Mon, 14 Nov 2022 00:46:43 +0000 (16:46 -0800)]
[RISCV] Add test for incorrect sext.w removal. NFC
Kazu Hirata [Mon, 14 Nov 2022 00:22:33 +0000 (16:22 -0800)]
[PowerPC] Use ArrayRef (NFC)
This patch teaches getStoreOpcodesForSpillArray and
getLoadOpcodesForSpillArray to return ArrayRef. This way,
isLoadFromStackSlot and isStoreToStackSlot can use llvm::is_contained.
Craig Topper [Sun, 13 Nov 2022 21:35:04 +0000 (13:35 -0800)]
[RISCV] Improve selection of PACK/PACKW for AssertZExt input.
Kazu Hirata [Sun, 13 Nov 2022 22:54:29 +0000 (14:54 -0800)]
[IR] Use llvm::any_of (NFC)
Krzysztof Parzyszek [Sun, 13 Nov 2022 21:10:06 +0000 (15:10 -0600)]
[Hexagon] Use `Register` instead of `unsigned`, NFC
Use `Register` instead of `unsigned` in HexagonInstrInfo,
HexagonRegisterInfo, HexagonFrameLowering, and HexagonHardwareLoops.
Arthur Eubanks [Sun, 13 Nov 2022 22:21:12 +0000 (14:21 -0800)]
[lldb][test] Avoid UB in optimized_code test
Florian Hahn [Sun, 13 Nov 2022 22:05:37 +0000 (22:05 +0000)]
[VectorUtils] Skip interleave members with diff type and alloca sizes.
Currently, codegen doesn't support cases where the type size doesn't
match the alloc size. Skip them for now.
Fixes #58722.
Arthur Eubanks [Sun, 13 Nov 2022 22:02:24 +0000 (14:02 -0800)]
[clang][test] Avoid UB in overload.cl
Krzysztof Parzyszek [Sun, 13 Nov 2022 20:33:26 +0000 (14:33 -0600)]
[Hexagon] Pass Hexagon::PC to InitializeHexagonMCRegisterInfo
That will make MCRegisterInfo::getProgramCounter return the right thing.
Krzysztof Parzyszek [Sun, 13 Nov 2022 20:32:18 +0000 (14:32 -0600)]
[Hexagon] Remove unneeded HexagonRegisterInfo::getRARegister
That function exists in MCRegisterInfo, and is inherited into HRI.
Krzysztof Parzyszek [Sun, 13 Nov 2022 19:56:06 +0000 (13:56 -0600)]
[Hexagon] Reduce the spill alignment for double/quad vector classes
The spill alignment for HVX vectors is always the single vector size,
regardless of whether the class describes vector tuples or not.
Craig Topper [Sun, 13 Nov 2022 19:52:14 +0000 (11:52 -0800)]
[RISCV] Add BREV8 to hasAllWUsers in RISCVSExtWRemoval.
This instruction reverses the bits in each byte. Since we're only
interested in whether the upper 32 bits are used or not, we can
look through them to check their users.
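A short Python model shows why looking through BREV8 is safe for this purpose: each result byte depends only on the input byte at the same position, so the low 32 bits of the result depend only on the low 32 bits of the input. The function is a sketch written for this note, not LLVM code.

```python
def brev8(x, xlen=64):
    """Reverse the bit order inside each byte of an xlen-bit value."""
    result = 0
    for byte_index in range(xlen // 8):
        byte = (x >> (8 * byte_index)) & 0xFF
        reversed_byte = int(f"{byte:08b}"[::-1], 2)  # reverse the 8 bits
        result |= reversed_byte << (8 * byte_index)
    return result
```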
Craig Topper [Sun, 13 Nov 2022 19:34:06 +0000 (11:34 -0800)]
[RISCV] Add PACK/PACKH/PACKW to hasAllWUsers in RISCVSExtWRemoval.
Renato Golin [Sat, 12 Nov 2022 21:48:54 +0000 (21:48 +0000)]
[MLIR] Move JitRunner Options to header, pass to mlirTransformer
This allows the MLIR transformer to see the command line options and
make decisions based on them. No change upstream, but my use-case is to
look at the entry point name and type to make sure I can use them.
Differential Revision: https://reviews.llvm.org/D137861
Krzysztof Parzyszek [Sat, 12 Nov 2022 23:27:59 +0000 (17:27 -0600)]
Move variable declarations out of #if guard, NFC
They are used in other sides of the #if/#else.
Florian Hahn [Sun, 13 Nov 2022 17:38:39 +0000 (17:38 +0000)]
[SimpleLoopUnswitch] Forget SCEVs for replaced phis.
Forget SCEVs based on exit phis in case SCEV looked through the phi.
After unswitching, it may not be possible to look through the phi due to
it having multiple incoming values, so it needs to be re-computed.
Fixes #58868
bzcheeseman [Fri, 11 Nov 2022 19:22:06 +0000 (11:22 -0800)]
[MLIR][Bytecode] Ensure `dataIt` is aligned coming out of `EncodingReader::alignTo`.
This addresses the TODO in the code previously and checks that the address of `dataIt` is properly aligned to the requested alignment.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D137855
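The kind of post-condition described above can be sketched with plain integer offsets in Python. The real reader works on a byte pointer in C++; the function below and its power-of-two requirement are assumptions made for the sketch.

```python
def align_to(offset, alignment):
    """Round offset up to a multiple of alignment, then verify the result."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    aligned = (offset + alignment - 1) & ~(alignment - 1)
    # Post-condition: the position handed back is really aligned.
    assert aligned % alignment == 0
    return aligned
```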
Simon Pilgrim [Sun, 13 Nov 2022 17:12:18 +0000 (17:12 +0000)]
[X86] Regenerate combine-movmsk.ll
Adds an AVX check that we lost at some point
Sanjay Patel [Sun, 13 Nov 2022 16:47:21 +0000 (11:47 -0500)]
Revert "[InstCombine] allow more folds for multi-use selects (2nd try)"
This reverts commit 6eae6b3722d9204fa93b772e24afab93406cc143.
This version of the patch results in the same DFSAN bot failure as before,
so my guess about the SimplifyQuery context instruction was wrong.
I don't know what the real bug is.
Sanjay Patel [Sun, 13 Nov 2022 15:19:41 +0000 (10:19 -0500)]
[InstCombine] allow more folds for multi-use selects (2nd try)
The 1st try (681a6a399022) was reverted because it caused
a DataFlowSanitizer bot failure.
This try modifies the existing calls to simplifyBinOp() to
not use a query that sets the context instruction because
that seems like a likely source of failure. Since we already
try those simplifies with multi-use patterns in some cases,
that means the bug is likely present even without this patch.
However, I have not been able to reduce a test to prove that
this was the bug, so if we see any bot failures with this patch,
then it should be reverted again.
The reduced simplify power does not affect any optimizations
in existing, motivating regression tests.
Original commit message:
The 'and' case showed up in a recent bug report and prevented
more follow-on transforms from happening.
We could handle more patterns (for example, the select arms
simplified, but not to constant values), but this seems
like a safe, conservative enhancement. The backend can
convert select-of-constants to math/logic in many cases
if it is profitable.
There is a lot of overlapping logic for these kinds of patterns
(see SimplifySelectsFeedingBinaryOp() and FoldOpIntoSelect()),
so there may be some opportunity to improve efficiency.
There are also optimization gaps/inconsistency because we do
not call this code for all bin-opcodes (see TODO for ashr test).
Simon Pilgrim [Sun, 13 Nov 2022 15:19:30 +0000 (15:19 +0000)]
[X86] Update WriteMPSAD class and remove VMPSADBWrri override
AMD 15h SoG + Agner both indicate there's no difference between MPSADBWrri + VMPSADBWrri - I can't find any data on the folded variant so I've kept the existing numbers
Removes the last X86 override for WriteMPSAD/WritePSADBW classes - removing a further 3 entries from every sched class table
Simon Pilgrim [Sun, 13 Nov 2022 14:54:16 +0000 (14:54 +0000)]
[X86] Remove unnecessary VPSADBW/VDBPSADBW zmm overrides
These match the existing WritePSADBWZ schedule classes
Simon Pilgrim [Sun, 13 Nov 2022 14:49:29 +0000 (14:49 +0000)]
[MCA][X86] Add test coverage for VDBPSADBW instructions
Nico Weber [Thu, 10 Nov 2022 13:59:44 +0000 (08:59 -0500)]
[gn build] Extract gen_arch_intrinsics() template to remove some duplication
No behavior change.
Differential Revision: https://reviews.llvm.org/D137784
Simon Pilgrim [Sun, 13 Nov 2022 14:10:03 +0000 (14:10 +0000)]
[X86] Fix scheduler tag for GFNI YMM instructions
These were hardcoded to XMM width
chenglin.bi [Sun, 13 Nov 2022 11:19:38 +0000 (19:19 +0800)]
[GlobalISel] Correct constant type in matchReassocConstantInnerLHS
When we match a pattern from m_GCst, the register type could be different from that of the original op, so we can't replace the original op with the vreg directly.
This code creates a new constant with the original op type and then replaces the original op.
Fixes #58906
Reviewed By: arsenm, aemerson
Differential Revision: https://reviews.llvm.org/D137778
Simon Pilgrim [Sun, 13 Nov 2022 11:13:30 +0000 (11:13 +0000)]
[X86] Cleanup CVTPD2PS schedule values
The znver1/znver2 schedules for CVTPD2PS were incorrectly double pumping the xmm-load variant instead of the ymm variants (znver1 only)
Also, the xmm-load variant was incorrectly using FP03 instead of just FP3
Confirmed by the AMD SoG 17h tables, Agner + uops.info
Another step towards removing a lot of unnecessary overrides from all the x86 scheduler models - these should hopefully be convertible into regular WriteCvtPD2I classes soon.