Sanjay Patel [Fri, 13 Jan 2023 17:30:15 +0000 (12:30 -0500)]
[InstCombine] add tests for pow reassociation; NFC
Florian Mayer [Fri, 13 Jan 2023 18:24:45 +0000 (10:24 -0800)]
[NFC] [scudo] actually fix DCHECK now
Florian Mayer [Fri, 13 Jan 2023 18:18:25 +0000 (10:18 -0800)]
[NFC] [scudo] fix mistake in DCHECK
sorry, my test build (and all the pre-merge bots) did not exercise this.
Carlo Bertolli [Fri, 13 Jan 2023 18:18:49 +0000 (12:18 -0600)]
[OpenMP][libomptarget][AMDGPU] lock/unlock (pin/unpin) mechanism in libomptarget amdgpu plugin (API and implementation)
Currently, the only way to obtain pinned memory with libomptarget is to use a custom allocator, llvm_omp_target_alloc_host.
This matches the CUDA implementation of libomptarget well, but it does not correctly expose the AMDGPU runtime API,
where any system-allocated page can be locked/unlocked through a call to hsa_amd_memory_lock/unlock.
This patch enables users to allocate memory through malloc (mmap, sbrk) and then pin the related memory pages
with a special libomptarget call. It provides basic support in the amdgpu libomptarget plugin that lets users prelock
their host memory pages so that the runtime doesn't need to lock them itself for asynchronous memory transfers.
Reviewed By: jdoerfert, ye-luo
Differential Revision: https://reviews.llvm.org/D139208
Matt Arsenault [Tue, 10 Jan 2023 22:17:29 +0000 (17:17 -0500)]
AMDGPU: Fix format string indexes for existing llvm.printf.fmts
The index stored to the buffer is just an index into this named
metadata. It would be more robust to produce a private constant table,
and use a constant expression to index into it.
Alex Brachet [Fri, 13 Jan 2023 18:17:46 +0000 (18:17 +0000)]
Revert "[clang-scan-deps] Migrate to OptTable"
This reverts commit 0c60ec699fc1ccca2e444ceb041cad9b1dca3a64.
Alex Brachet [Fri, 13 Jan 2023 18:11:29 +0000 (18:11 +0000)]
[clang-scan-deps] Migrate to OptTable
Differential Revision: https://reviews.llvm.org/D139949
Roman Lebedev [Fri, 13 Jan 2023 17:44:41 +0000 (20:44 +0300)]
[SimplifyCFG] Reapply: when eliminating `unreachable` landing pads, mark `call`s as `nounwind`
This time the change is in its least intrusive form, since only the return
type in the prototype for `removeUnwindEdge()` is changed, because only a single
specific caller needs that knowledge.
We really can't recover that knowledge, and `nounwind` knowledge,
(and not just a lack of the unwind edge, aka `call` instead of `invoke`),
is e.g. part of the reasoning in e.g. `mayHaveSideEffects()`.
Note that this is call-site-specific knowledge,
just because some callsite had an `unreachable`
unwind edge, does not mean that all will.
Christopher Bate [Sat, 24 Dec 2022 00:21:46 +0000 (17:21 -0700)]
[mlir][gpu] Migrate hard-coded address space integers to an enum attribute (gpu::AddressSpaceAttr)
This is a purely mechanical change that introduces an enum attribute in the GPU
dialect to represent the various memref memory spaces as opposed to the
hard-coded integer attributes that are currently used.
The following steps were taken to make the transition across the codebase:
1. Introduce a pass "gpu-lower-memory-space-attributes":
The pass updates all memref types that have a memory space attribute that is a
`gpu::AddressSpaceAttr`. These attributes are changed to `IntegerAttr`'s using a
mapping that is given by the caller. This pass is based on the
"map-memref-spirv-storage-class" pass and the common functions can probably
be refactored into a set of utilities under the MemRef dialect.
2. Update the verifiers of GPU/NVGPU dialect operations.
If a verifier currently checks the address space of an operand using
e.g. `getWorkspaceAddressSpace`, then it can continue to do so. However, the
checks are changed to only fail if the memory space is either missing or a wrong
value of type `gpu::AddressSpaceAttr`. Otherwise, it just assumes the address
space is correct because it was specifically lowered to something other than a
`gpu::AddressSpaceAttr`.
3. Update existing gpu-to-llvm conversion infrastructure.
In the existing gpu-to-X passes, we add a full conversion equivalent to
`gpu-lower-memory-space-attributes` just before doing the conversion to the
LLVMDialect. This is done because currently both the gpu-to-llvm passes
(rocdl,nvvm) run gpu-to-gpu rewrites within the pass, which introduce
`AddressSpaceAttr` memory space annotations. Therefore, I inserted the
memory space conversion between the gpu-to-gpu rewrites and the LLVM
conversion.
For more context see the Discourse discussion below:
https://discourse.llvm.org/t/gpu-workgroup-shared-memory-address-space-is-hard-coded/
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D140644
Blue Gaston [Sat, 7 Jan 2023 00:34:52 +0000 (17:34 -0700)]
[Sanitizer] Clean up SANITIZER_CAN_USE_ALLOCATOR64 logic
Update: A change to this code recently broke our bots after this change enabled the 64 bit allocator for defined(aarch64): https://reviews.llvm.org/D137136
We added logic initially to get our test passing, but want to further clean up this code to enable MacOS to use allocator64 and increase readability and clarity of the logic.
rdar://103647896
Differential Revision: https://reviews.llvm.org/D141171
Florian Mayer [Thu, 12 Jan 2023 21:39:04 +0000 (13:39 -0800)]
allocation_ring_buffer_size to 0 disables stack collection
Reviewed By: hctim, eugenis
Differential Revision: https://reviews.llvm.org/D141631
Florian Hahn [Fri, 13 Jan 2023 17:50:58 +0000 (17:50 +0000)]
[VPlan] Use to_vector when iterating over a temporary vector. (NFC)
Matt Arsenault [Tue, 10 Jan 2023 22:07:01 +0000 (17:07 -0500)]
AMDGPU: Some printf call edge case tests
Check printf printing printf, and printf passed to a function.
Matt Arsenault [Tue, 10 Jan 2023 22:02:47 +0000 (17:02 -0500)]
AMDGPU: Don't expand printf users if printf is defined
Benjamin Kramer [Fri, 13 Jan 2023 17:13:31 +0000 (18:13 +0100)]
[thread] Fix a FIXME with std::apply. NFCI
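For readers unfamiliar with the C++17 facility named in the title: `std::apply` invokes a callable with the elements of a tuple as its arguments, which replaces hand-written index-sequence unpacking. The following is a minimal sketch, not the code from the patch; `callWithTuple` and `addThree` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <tuple>
#include <utility>

// Hypothetical helper: forwards a callable and a tuple of arguments to
// std::apply, which expands the tuple into individual call arguments.
template <typename Fn, typename Tuple>
decltype(auto) callWithTuple(Fn &&F, Tuple &&Args) {
  return std::apply(std::forward<Fn>(F), std::forward<Tuple>(Args));
}

// Hypothetical callee used to exercise the helper.
inline int addThree(int A, int B, int C) { return A + B + C; }
```

Calling `callWithTuple(addThree, std::make_tuple(1, 2, 3))` passes the three tuple elements as separate arguments.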
Med Ismail Bennani [Fri, 13 Jan 2023 16:51:03 +0000 (08:51 -0800)]
Revert "[lldb] Add Debugger & ScriptedMetadata reference to Platform::CreateInstance"
This reverts commit 2d53527e9c64c70c24e1abba74fa0a8c8b3392b1.
Mark de Wever [Mon, 9 Jan 2023 18:06:25 +0000 (19:06 +0100)]
[libc++][doc] Updates the release notes.
This is a preparation for the upcoming LLVM 17 release.
Reviewed By: ldionne, philnik, avogelsgesang, jloser, #libc
Differential Revision: https://reviews.llvm.org/D141304
bixia1 [Thu, 12 Jan 2023 00:32:42 +0000 (16:32 -0800)]
[mlir][sparse] Improve ConcatenateOp rewriting for annotated all dense result.
Previously, we relied on InsertOp to add values to the result, in the same way we
add values to a sparse tensor with compressed dimensions. We now directly store
values to the values buffer.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D141517
Guillaume Chatelet [Fri, 13 Jan 2023 15:50:38 +0000 (15:50 +0000)]
[NFC] Remove Function::getParamAlignment
Differential Revision: https://reviews.llvm.org/D141696
David Green [Fri, 13 Jan 2023 16:09:47 +0000 (16:09 +0000)]
[AArch64] Add some tests for vscale being a power 2. NFC
Fahad Nayyar [Tue, 22 Nov 2022 16:25:59 +0000 (16:25 +0000)]
[Clang][Sema] Enabled implicit conversion warning for CompoundAssignment operator.
This change enables implicit conversion warnings (like Wshorten-64-to-32) for compound assignment operator with integral operands.
rdar://10466193
Differential Revision: https://reviews.llvm.org/D139114
Fahad Nayyar [Wed, 11 Jan 2023 17:02:54 +0000 (17:02 +0000)]
[libunwind] Fixed an upcoming clang -Wsign-conversion warning
Fixing an upcoming clang warning (from https://reviews.llvm.org/D139114) in libunwind.
Differential Revision: https://reviews.llvm.org/D141515
Jakub Kuderski [Fri, 13 Jan 2023 15:55:04 +0000 (10:55 -0500)]
[mlir][spirv] Fix crash in spirv-lower-abi-attributes
... when there are no SPIR-V env attributes.
Fixes: https://github.com/llvm/llvm-project/issues/59983
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D141695
Matthias Springer [Fri, 13 Jan 2023 15:52:48 +0000 (16:52 +0100)]
[mlir][vector] Support multiple result types in vector.mask
The verifier already had support for multiple result types, but the op definition assumed a single, optional result.
Differential Revision: https://reviews.llvm.org/D141683
Kito Cheng [Fri, 30 Dec 2022 06:59:40 +0000 (14:59 +0800)]
[Driver][RISCV] Adjust the priority between -mcpu, -mtune and -march
RISC-V supports `-march`, `-mtune`, and `-mcpu`: `-march` provides the
architecture extension information, `-mtune` provides the pipeline model, and
`-mcpu` provides both.
What's the priority among those options for now (w/o this patch)?
Pipeline model:
- Take from `-mtune` if present.
- Take from `-mcpu` if present
- Use the default pipeline model: `generic-rv32` or `generic-rv64`
Architecture extension has quite complicated behavior now:
- Take union from `-march` and `-mcpu` if both are present.
- Take from `-march` if present.
- Take from `-mcpu` if present.
- Implied from `-mabi` if present.
- Use the default architecture depending on the target triple
We treat `-mcpu`/`-mtune` and `-mcpu`/`-march` differently, and it's
kind of counterintuitive: -march is explicitly specified but ignored.
This patch adjusts the priority between `-mcpu`/`-march`, letting it use
architecture extension information from `-march` if it's present.
So the priority of architecture extension information becomes:
- Take from `-march` if present.
- Take from `-mcpu` if present.
- Implied from `-mabi` if present.
- Use the default architecture depending on the target triple
This also matches what we implement in RISC-V GCC.
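The new priority order for architecture extension information can be sketched as a simple first-match lookup. This is an illustrative model only, not the driver code: `ArchInputs`, `selectArch`, and the architecture strings used with them are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, simplified view of the relevant driver inputs. Each optional
// holds the architecture string the corresponding option would provide, if
// that option was given on the command line.
struct ArchInputs {
  std::optional<std::string> March;    // from -march
  std::optional<std::string> McpuArch; // architecture implied by -mcpu
  std::optional<std::string> MabiArch; // architecture implied by -mabi
  std::string TripleDefault;           // default for the target triple
};

// Returns the first available source in the priority order listed above:
// -march, then -mcpu, then -mabi, then the target-triple default.
inline std::string selectArch(const ArchInputs &In) {
  if (In.March)
    return *In.March;
  if (In.McpuArch)
    return *In.McpuArch;
  if (In.MabiArch)
    return *In.MabiArch;
  return In.TripleDefault;
}
```

With both `March` and `McpuArch` set, the sketch returns the `-march` value and ignores the one implied by `-mcpu`, rather than taking a union of the two.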
Reviewed By: craig.topper, MaskRay
Differential Revision: https://reviews.llvm.org/D140693
Frank (Fang) Gao [Fri, 13 Jan 2023 15:47:44 +0000 (10:47 -0500)]
[mlir][vector] Add scalable vectors support to OuterProductOp
This will probably be the first in a series of patches that tries to
enable code generation for ARM SME (extension of SVE).
Since SME's core operation is the outer product instruction, I figured
that it would probably be a good idea to enable the outer product
operation to properly accept and generate scalable vectors.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D138718
Jens Massberg [Fri, 13 Jan 2023 10:37:13 +0000 (11:37 +0100)]
Move definitions to prevent incomplete types.
C++20 is stricter about erroring out due to incomplete types.
Thus the code required some restructuring so that it compiles in C++20.
Differential Revision: https://reviews.llvm.org/D141671
Jens Massberg [Thu, 12 Jan 2023 10:43:08 +0000 (11:43 +0100)]
Add default constructors to `filter_iterator_impl` and `filter_iterator_impl`.
Bases of `reverse_iterator` must be default-constructible. This is enforced when using `libstdc++-12` plus C++20.
Differential Revision: https://reviews.llvm.org/D141587
Matthias Springer [Fri, 13 Jan 2023 15:31:01 +0000 (16:31 +0100)]
[mlir][bufferization][NFC] Make getEnclosingRepetitiveRegion public
These functions are generally useful and not specific to One-Shot Analysis. Move them to `BufferizableOpInterface.h` and make them public.
Differential Revision: https://reviews.llvm.org/D141685
Sanjay Patel [Fri, 13 Jan 2023 13:59:21 +0000 (08:59 -0500)]
[InstCombine] improve description of fold and add TODO; NFC
D58633
Florian Hahn [Fri, 13 Jan 2023 15:32:44 +0000 (15:32 +0000)]
[MachineCombiner] Lift same-bb restriction for reassociable ops.
This patch relaxes the restriction that both reassociate operands must
be in the same block as the root instruction.
The comment indicates that the reason for this restriction was that the
operands not in the same block won't have a depth in the trace.
I believe this is outdated; if the operand is in a different block, it
must dominate the current block (otherwise it would need to be phi),
which in turn means the operand's block must be included in the current
trace, and depths must be available.
There's a test case (no_reassociate_different_block) added in
70520e2f1c5fc4 which shows that we have accurate depths for operands
defined in other blocks.
This allows reassociation of code that computes the final reduction
value after vectorization, among other things.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D141302
Benjamin Kramer [Fri, 13 Jan 2023 15:19:06 +0000 (16:19 +0100)]
[HashBuilder] Simplify with C++17. NFCI
Haojian Wu [Fri, 13 Jan 2023 15:25:44 +0000 (16:25 +0100)]
[include-cleaner] Remove a stale FIXME.
This FIXME was addressed in 0e545816a9e582af29ea4b9441fea8ed376cf52a.
Keith Walker [Fri, 13 Jan 2023 13:54:28 +0000 (13:54 +0000)]
[AArch64] Update 2 RME MEC instruction encodings
The encodings of these 2 RME MEC instructions are
incorrect and need swapping:
- DC CIPAE
- DC CIGDPAE
The correct encoding is:
Operation        op0   op1    CRn     CRm     op2
DC CIPAE, Xt     0b01  0b100  0b0111  0b1110  0b000
DC CIGDPAE, Xt   0b01  0b100  0b0111  0b1110  0b111
Differential Revision: https://reviews.llvm.org/D141689
David Green [Fri, 13 Jan 2023 15:19:15 +0000 (15:19 +0000)]
[AArch64] Add extra tests for sinking to umull/smull. NFC
Joseph Huber [Fri, 13 Jan 2023 15:16:10 +0000 (09:16 -0600)]
[lld][Mach-O] Fix build with Mach-O due to missing library
Summary:
The build was failing due to an undefined symbol. This is because this
function is defined in the `BitWriter` component but was not linked.
Jirui Wu [Fri, 13 Jan 2023 15:07:12 +0000 (15:07 +0000)]
[ARM] Accept two-register form of vnmul
The previous vnmul only accepts three registers. It should accept either
two or three registers as vmul does.
Differential Revision: https://reviews.llvm.org/D141405
Benjamin Kramer [Fri, 13 Jan 2023 15:14:47 +0000 (16:14 +0100)]
[analyzer] Fix a FIXME. NFCI
Lorenzo Chelini [Sun, 8 Jan 2023 13:04:18 +0000 (14:04 +0100)]
[MLIR] Fold outer dims permutation to pack when propagating
Instead of folding the transpose into the linalg.generic, keep the
transposition in the packing operation, effectively making the
linalg.generic transparent to the propagation. Additionally, if the init
operand of the generic has users, pack the init and pass it as the
operand to the generic.
Reviewed By: hanchung
Differential Revision: https://reviews.llvm.org/D141483
Nikita Popov [Fri, 13 Jan 2023 15:07:06 +0000 (16:07 +0100)]
[LVI] Check for non-speculatable instructions
When constraining an operand based on a condition at a (potentially
transitive) use site, make sure we don't skip over non-speculatable
instructions. While the result is only used under the condition,
the non-speculatable instruction may have a side-effect or UB.
Demonstrating this issue requires raising the limit on the walk,
so do that.
Guillaume Chatelet [Fri, 13 Jan 2023 15:05:24 +0000 (15:05 +0000)]
Deprecate DataLayout::getPrefTypeAlignment
Guillaume Chatelet [Fri, 13 Jan 2023 15:04:06 +0000 (15:04 +0000)]
[lldb][NFC] Remove dependency on DataLayout::getPrefTypeAlignment
Guillaume Chatelet [Fri, 13 Jan 2023 15:03:52 +0000 (15:03 +0000)]
[mlir][NFC] Remove dependency on DataLayout::getPrefTypeAlignment
Guillaume Chatelet [Fri, 13 Jan 2023 15:01:09 +0000 (15:01 +0000)]
[clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignment
Nikita Popov [Fri, 13 Jan 2023 14:59:17 +0000 (15:59 +0100)]
[CVP] Add additional tests for use-site conditions (NFC)
For longer chains, we need to consider non-speculatable instructions.
Paul Robinson [Fri, 13 Jan 2023 14:52:14 +0000 (06:52 -0800)]
[compiler-rt] fix typo in #95507c82
Erich Keane [Fri, 13 Jan 2023 14:50:35 +0000 (06:50 -0800)]
Revert "Workaround an assertion failure during module build"
This reverts commit e77e14ecf17bba5f9e2ef43d8c3dbc9c86685287.
According to @rsmith on https://reviews.llvm.org/D131858, he believes
the reason for this workaround in the first place has been fixed.
Reverting the workaround to hopefully confirm that is the case.
Nikita Popov [Fri, 13 Jan 2023 14:44:27 +0000 (15:44 +0100)]
[CVP] Handle use-site conditions in urem folds
Nikita Popov [Fri, 13 Jan 2023 14:40:40 +0000 (15:40 +0100)]
[CVP] Add additional tests for use-site conditions (NFC)
Paul Robinson [Fri, 13 Jan 2023 14:16:33 +0000 (06:16 -0800)]
[compiler-rt] Fix XFAIL conditions after converting to 'target=...'
Fixes #60002.
Alexandros Lamprineas [Wed, 11 Jan 2023 11:50:32 +0000 (11:50 +0000)]
[IPSCCP] Enable specialization of functions.
Re-enable the optimization after having fixed the compilation error
found in SPEC/CINT2017rate/502.gcc_r when both LTO and PGO are in use
(see https://reviews.llvm.org/D141474).
Differential Revision: https://reviews.llvm.org/D140210
Dmitri Gribenko [Fri, 13 Jan 2023 14:02:39 +0000 (15:02 +0100)]
[lld][MachO] Store test outputs in %t
Nikita Popov [Fri, 13 Jan 2023 13:56:07 +0000 (14:56 +0100)]
[CVP] Avoid duplicate range fetch (NFC)
In preparation for switching this to use getConstantRangeAtUse().
Takuya Shimizu [Fri, 13 Jan 2023 13:20:14 +0000 (08:20 -0500)]
Fix -Wlogical-op-parentheses warning inconsistency for const and constexpr values
When using the && operator within a || operator, both Clang and GCC
produce a warning for potentially confusing operator precedence.
However, Clang avoids this warning for certain patterns, such as
a && b || 0 or a || b && 1, where the operator precedence of && and ||
does not change the result.
However, this behavior appears inconsistent when using the const or
constexpr qualifiers. For example:
bool t = true;
bool tt = true || false && t; // Warning: '&&' within '||'
const bool t = true;
bool tt = true || false && t; // No warning
const bool t = false;
bool tt = true || false && t; // Warning: '&&' within '||'
The second example does not produce a warning because
true || false && t matches the a || b && 1 pattern, while the third one
does not match any of them.
This behavior can lead to the lack of warnings for complicated
constexpr expressions. Clang should only suppress this warning when
literal values are placed in the place of t in the examples above.
This patch adds the literal-or-not check to fix the inconsistent
warnings for && within || when using const or constexpr.
Guillaume Chatelet [Fri, 13 Jan 2023 13:18:45 +0000 (13:18 +0000)]
[clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignment
Paul Walker [Sun, 18 Dec 2022 16:22:39 +0000 (16:22 +0000)]
[SVE] Restrict SVE fixed length extload/truncstore combine to float and double types.
Prior to this patch we would create floating point extending load
and truncating store operations involving fp128 types, which we
cannot lower.
Fixes #58530
Differential Revision: https://reviews.llvm.org/D140318
Freddy Ye [Fri, 13 Jan 2023 13:09:13 +0000 (21:09 +0800)]
[X86][test] Add pre-commit test for D141657.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D141677
Quentin Colombet [Fri, 13 Jan 2023 10:32:54 +0000 (10:32 +0000)]
Re-apply "[mlir][SparseTensor] Add a few more tests for sparse vectorization"
This reverts commit 93f40c983e0adbb63cbb7c59814090134d691dd1.
Update the tests to also work on Windows.
The order in which the `arith.constant`s appear in the output IR is
slightly different between Windows and Linux.
Use CHECK.*-DAG for the constants.
Original message:
These tests cover muli, xor, and, addf, subf, and addi.
The tests themselves are not that interesting, their goal is to provide
code coverage for all the types of reductions currently supported.
NFC
Differential Revision: https://reviews.llvm.org/D141369
LiaoChunyu [Thu, 12 Jan 2023 13:58:38 +0000 (21:58 +0800)]
[RISCV] Optimize (brcond (seteq (and X, (1 << C)-1), 0))
Inspired by gcc's assembly: https://godbolt.org/z/54hbzsGYn, while referring to D130203
Replace AND+IMM{32,64} with a slli.
But gcc does not handle 0xffff and 0xffffffff, which also seem to be optimizable.
The test cases copy all the bit widths from D130203 and add 16, 32, and 64 bits.
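The equivalence behind this fold can be checked with a small sketch on 64-bit values: testing the low C bits with a mask gives the same answer as shifting left by (64 - C) and comparing with zero. This is an illustration of the arithmetic only, not the selection code from the patch; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Tests the low C bits of X with an explicit mask: (X & ((1 << C) - 1)) == 0.
// Restricted to 1 <= C <= 63 so the shift that builds the mask is well defined.
inline bool lowBitsZeroWithMask(uint64_t X, unsigned C) {
  assert(C >= 1 && C <= 63 && "mask construction needs C in [1, 63]");
  uint64_t Mask = (uint64_t{1} << C) - 1;
  return (X & Mask) == 0;
}

// Same test without materializing the mask: shifting left by (64 - C) discards
// every bit except the low C bits, so the result is zero exactly when they are.
inline bool lowBitsZeroWithShift(uint64_t X, unsigned C) {
  assert(C >= 1 && C <= 63 && "shift amount must stay below 64");
  return (X << (64 - C)) == 0;
}
```

The shift form avoids loading a wide mask constant into a register, which is the benefit the commit message attributes to using slli.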
Differential Revision: https://reviews.llvm.org/D141607
Emmmer [Wed, 4 Jan 2023 09:31:10 +0000 (17:31 +0800)]
[LLDB][RISCV] Add RVDC instruction support for EmulateInstructionRISCV
RVC is the RISC-V standard compressed instruction-set extension, named "C", which reduces static and dynamic code size by adding short 16-bit instruction encodings for common operations, and RVCD is the compressed "D extension".
And "D extension" is a double-precision floating-point instruction-set extension, which adds double-precision floating-point computational instructions compliant with the IEEE 754-2008 arithmetic standard.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D140961
Peixin Qiao [Fri, 13 Jan 2023 12:40:51 +0000 (20:40 +0800)]
[flang] Initial support of allocate statement with source
Support allocate statement with source in runtime version. The source
expression is evaluated only once for each allocate statement. When the
source expression has shape-spec, use it for bounds. Otherwise, get
the bounds from the source expression. Get the length if the source
expression has a deferred length parameter.
Reviewed By: clementval, jeanPerier
Differential Revision: https://reviews.llvm.org/D137812
Ties Stuij [Fri, 13 Jan 2023 10:00:23 +0000 (10:00 +0000)]
[lld][ARM] support position independent thunks for Armv4(T)
- Position independent thunks now work for both Armv4 and Armv4T
- Armv4 arm->arm thunks don't emit a BX anymore, which doesn't exist for the
arch. This fixes https://github.com/llvm/llvm-project/issues/50764.
- Armv4 and Armv4T both have the same arm->arm behaviour. Which also is
desirable for the above ticket.
Reviewed By: MaskRay, peter.smith
Differential Revision: https://reviews.llvm.org/D141272
Jay Foad [Fri, 13 Jan 2023 11:12:17 +0000 (11:12 +0000)]
[TableGen] Use raw_ostream::write_escaped instead of reinventing it. NFCI.
Francesco Petrogalli [Fri, 13 Jan 2023 10:16:37 +0000 (11:16 +0100)]
Recommit [SchedBoundary] Add dump method for resource usage.
Summary:
As supporting information, I have added an example that describes how
the indexes of the vector of resources SchedBoundary::ReservedCycles
are tracked by the field SchedBoundary::ReservedCyclesIndex.
This has a minor rework of
https://github.com/llvm/llvm-project/commit/b39a9a94f420a25a239ae03097c255900cbd660e
which was reverted in
https://github.com/llvm/llvm-project/commit/df6ae1779fafd9984e144a27315d6dd65b32c325
because the llc invocation of the test was missing the argument
`-mtriple`.
See for example the failure at
https://lab.llvm.org/buildbot#builders/231/builds/7245 that reported
the following when targeting a non-aarch64 native build:
'cortex-a55' is not a recognized processor for this target (ignoring processor)
Reviewers: jroelofs
Subscribers:
Differential Revision: https://reviews.llvm.org/D141367
Joshua Cao [Thu, 12 Jan 2023 06:19:30 +0000 (22:19 -0800)]
[SCEV] Support all Min/Max SCEVs for GetMinTrailingZeros
There is already support for U/SMax. No reason why Min and SequentialMin
should not be supported.
NFC: code in GetMinTrailingZeros is duplicated for a couple of node types.
Refactor them into a single code block.
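The reasoning that makes every min/max flavor eligible can be sketched on concrete integers: a min or max always evaluates to one of its operands, so the smallest trailing-zero count among the operands is a safe lower bound for the result. This is a simplified model on known values, not the SCEV implementation; both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Counts trailing zero bits of a 32-bit value; returns 32 for zero.
inline unsigned trailingZeros(uint32_t V) {
  if (V == 0)
    return 32;
  unsigned N = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// A min or max expression always evaluates to one of its operands, so a safe
// lower bound on its trailing zeros is the smallest bound among the operands.
inline unsigned minTrailingZerosOfMinMax(const std::vector<uint32_t> &Ops) {
  assert(!Ops.empty() && "min/max needs at least one operand");
  unsigned Result = trailingZeros(Ops.front());
  for (uint32_t Op : Ops)
    Result = std::min(Result, trailingZeros(Op));
  return Result;
}
```

For operands 8, 12, and 16 the bound is 2, and both the minimum (8) and the maximum (16) have at least that many trailing zeros.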
Differential Revision: https://reviews.llvm.org/D141568
Quentin Colombet [Fri, 13 Jan 2023 10:30:09 +0000 (10:30 +0000)]
Revert "[mlir][SparseTensor] Add a few more tests for sparse vectorization"
This reverts commit 904f2ccc3ba1d3aaf94140aa4595fd41af67d897.
This breaks a Windows bot. Reverting while I investigate.
https://lab.llvm.org/buildbot/#/builders/13/builds/30748
Quentin Colombet [Fri, 13 Jan 2023 10:09:10 +0000 (10:09 +0000)]
[mlir][memref] Add details on the semantic of reinterpret_cast
Make it clearer what the semantic of reinterpret_cast is.
In particular, call out that this instruction is not a no-op.
NFC
Related to https://github.com/llvm/llvm-project/issues/59896
Differential Revision: https://reviews.llvm.org/D141662
Francesco Petrogalli [Fri, 13 Jan 2023 10:12:15 +0000 (11:12 +0100)]
Revert "[SchedBoundary] Add dump method for resource usage."
Reverting because of https://lab.llvm.org/buildbot#builders/16/builds/41860
When building on x86, I also need to specify -mtriple in the
invocation of llc, otherwise the following error shows up:
'cortex-a55' is not a recognized processor for this target (ignoring processor)
This reverts commit b39a9a94f420a25a239ae03097c255900cbd660e.
Jean Perier [Fri, 13 Jan 2023 09:58:03 +0000 (10:58 +0100)]
[flang] Implement codegen of hlfir.designate with component refs
Lower all the different kinds of hlfir.designate component refs to
FIR.
Differential Revision: https://reviews.llvm.org/D141476
Nikita Popov [Fri, 13 Jan 2023 10:04:03 +0000 (11:04 +0100)]
[InstCombine] Regenerate test checks (NFC)
Pick up changes in variable names.
Dmitri Gribenko [Fri, 13 Jan 2023 09:55:26 +0000 (10:55 +0100)]
[bazel] Updates for https://github.com/llvm/llvm-project/commit/aa0883b59ae17e5465906dad51b5561b5292a28d
Matthias Springer [Fri, 13 Jan 2023 09:42:01 +0000 (10:42 +0100)]
[mlir] GreedyPatternRewriter: Add ancestors to worklist
When adding an op to the worklist, also add its ancestors to the worklist. This allows for RewritePatterns to match an op `a` based on what is inside of the body of `a`.
This change fixes a problem that became apparent with `vector.warp_execute_on_lane_0`, but could probably be triggered with similar patterns. The pattern extracts an op `b` with `eligible = true` from the body of an op `a`:
```
test.a {
%0 = test.b() {eligible = true}
yield %0
}
```
Afterwards:
```
%0 = test.b() {eligible = true}
test.a {
yield %0
}
```
The pattern is an `OpRewritePattern<OpA>`. For some reason, `test.a` is not on the GreedyPatternRewriter's worklist. E.g., because no pattern could be applied and it was removed. Now, another pattern updates `test.b`, so that `eligible` is changed from `true` to `false`. The `OpRewritePattern<OpA>` could now be applied, but (without this revision) `test.a` is still not on the worklist.
Note: In the above example, an `OpRewritePattern<OpB>` could have been used instead of an `OpRewritePattern<OpA>`. With such a design, we can run into the same problem (when the `eligible` attr is on `test.a` and `test.b` is removed from the worklist because no patterns could be applied).
Note: This change uncovered an unrelated bug in TestSCFUtils.cpp that was triggered due to a change in the order in which ops are processed. A TODO is added to the broken code and test cases are adapted so that the bug is no longer triggered.
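The core idea (adding an op to the worklist also adds every op that encloses it) can be sketched with a toy data structure. This is not the GreedyPatternRewriter code; `ToyOp`, `ToyWorklist`, and `addWithAncestors` are hypothetical names, and the parent pointer stands in for the enclosing operation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an operation: a name plus a link to the op whose
// region contains it (nullptr for a top-level op).
struct ToyOp {
  std::string Name;
  ToyOp *Parent = nullptr;
};

// Hypothetical worklist that holds each op at most once.
struct ToyWorklist {
  std::vector<ToyOp *> Ops;

  bool contains(const ToyOp *Op) const {
    for (const ToyOp *O : Ops)
      if (O == Op)
        return true;
    return false;
  }

  // Adds Op and then walks up the parent chain, adding every enclosing op, so
  // that patterns rooted at an ancestor get another chance to match.
  void addWithAncestors(ToyOp *Op) {
    for (ToyOp *Cur = Op; Cur; Cur = Cur->Parent)
      if (!contains(Cur))
        Ops.push_back(Cur);
  }
};
```

In this model, updating a nested `test.b` and re-adding it also puts the enclosing `test.a` back on the worklist, even if `test.a` had been dropped earlier.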
Differential Revision: https://reviews.llvm.org/D140304
Ivan Kosarev [Fri, 13 Jan 2023 09:33:11 +0000 (09:33 +0000)]
[AMDGPU][AsmParser][NFC] Refine defining i8- and i16-typed custom operands.
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D140799
Quentin Colombet [Tue, 10 Jan 2023 10:58:01 +0000 (10:58 +0000)]
[mlir][SparseTensor] Add a few more tests for sparse vectorization
These tests cover muli, xor, and, addf, subf, and addi.
The tests themselves are not that interesting, their goal is to provide
code coverage for all the types of reductions currently supported.
NFC
Differential Revision: https://reviews.llvm.org/D141369
Nikita Popov [Thu, 15 Dec 2022 13:05:00 +0000 (14:05 +0100)]
[MemDep] Reduce block limit
The non-local MemDep analysis has a limit on the number of blocks
it will scan trying to find dependencies. The current limit of 1000
is very high, especially when we consider that each block scan can
also visit up to 100 instructions. In degenerate cases (where we
actually scan that many blocks) MemDep/GVN dominate overall
compile-time, for little benefit.
This patch reduces the limit to 200, which is probably still too
large, but at least mitigates some of the more catastrophic cases.
(For comparison, MSSA clobber walks consider up to 100
MemoryDefs/MemoryPhis, rather than 200 blocks * 100 instructions,
but these limits aren't directly comparable.)
I know that we were kind of hoping that this issue would resolve
itself in time, either by a switch to NewGVN or use of MSSA in GVN.
But I think we should still address this in the meantime.
Additionally, a switch to an MSSA-based implementation will
effectively be doing this as well, in a roundabout way (by dint of
MSSA having lower cutoffs than MDA).
Differential Revision: https://reviews.llvm.org/D140097
Francesco Petrogalli [Fri, 13 Jan 2023 09:04:39 +0000 (10:04 +0100)]
[SchedBoundary] Add dump method for resource usage.
As supporting information, I have added an example that describes how
the indexes of the vector of resources SchedBoundary::ReservedCycles
are tracked by the field SchedBoundary::ReservedCyclesIndex.
Reviewed By: jroelofs
Differential Revision: https://reviews.llvm.org/D141367
Nikita Popov [Thu, 12 Jan 2023 11:04:56 +0000 (12:04 +0100)]
[Mips] Convert some tests to opaque pointers (NFC)
I'm not sure why, but the absence of bitcasts / no-op GEPs causes
the branch delay slot to be used.
Differential Revision: https://reviews.llvm.org/D141593
Matthias Springer [Fri, 13 Jan 2023 09:23:51 +0000 (10:23 +0100)]
[mlir][affine][transform] Simplify affine.min/max ops with given constraints
This transform op uses `mlir::simplifyConstrainedMinMaxOp` to simplify `affine.min` and `affine.max` ops based on given constraints.
Differential Revision: https://reviews.llvm.org/D140997
Sergey Kachkov [Fri, 13 Jan 2023 08:21:55 +0000 (11:21 +0300)]
[GVN][NFC] Refactor GVN::AnalyzeLoadAvailability method
Simplify AnalyzeLoadAvailability code:
1. Use std::optional for return value
2. Use range-based loop for non-local dependencies
Differential Revision: https://reviews.llvm.org/D141664
Tobias Gysi [Fri, 13 Jan 2023 09:14:21 +0000 (10:14 +0100)]
Reland "[mlir][llvm] Add an explicit void type debug info attribute."
Previously, the DISubroutineType attribute used an optional result
parameter and an optional argument types array to model the subroutine
signature. LLVM IR debug metadata, on the other hand, has one types
list whose first entry maps to the result type. That entry may be
null to model a void result type. The type list may also be entirely
empty not specifying any type information. The latter is problematic
since the current DISubroutineType attribute cannot express it.
The revision changes DISubroutineTypeAttr to closely follow the
LLVM metadata design. In particular, it uses a single types parameter
array to model the subroutine signature and introduces an explicit
DIVoidResultTypeAttr to model the null entries.
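The difference between "void result" and "no type information" in a single types list can be sketched as follows. This is a toy model, not the MLIR attribute: `ToyTypeList` and the helper names are hypothetical, and `std::nullopt` plays the role of the explicit void marker.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of a single types list: entry 0 is the result type and
// the rest are argument types. std::nullopt at entry 0 stands for "void".
using ToyTypeList = std::vector<std::optional<std::string>>;

// True when the list carries no type information at all.
inline bool hasNoTypeInfo(const ToyTypeList &Types) { return Types.empty(); }

// True when the list is non-empty and its result entry is the void marker.
inline bool hasVoidResult(const ToyTypeList &Types) {
  return !Types.empty() && !Types.front().has_value();
}
```

A list holding only the void marker and an entirely empty list are distinct values here, which is the case the commit message says the earlier two-parameter form could not express.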
Reviewed By: Dinistro
Differential Revision: https://reviews.llvm.org/D141261
This reverts commit 81f57b6
and relands commit a960547
Fixes flang build and drop_begin on an empty array ref.
Ben Shi [Thu, 12 Jan 2023 11:58:53 +0000 (19:58 +0800)]
[clang] Redefine some AVR specific macros
Fixes https://github.com/llvm/llvm-project/issues/58855
Reviewed By: aykevl, Miss_Grape
Differential Revision: https://reviews.llvm.org/D141598
Rainer Orth [Fri, 13 Jan 2023 09:08:33 +0000 (10:08 +0100)]
[Driver] Add crtfastmath.o on Solaris if appropriate
`Flang :: Driver/fast_math.f90` `FAIL`s on Solaris because `crtfastmath.o`
is missing from the link line.
This patch adds it as appropriate.
Tested on `amd64-pc-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D141596
Diana Picus [Mon, 9 Jan 2023 08:21:41 +0000 (09:21 +0100)]
MachineIRBuilder: Add buildMergeValues. NFC
Add a `buildMergeValues` method that unconditionally builds a
G_MERGE_VALUES instruction, as opposed to `buildMergeLikeInstr` which
may decide on a different opcode based on the input types.
I haven't audited all the uses of `buildMergeLikeInstr` to see if they
can be replaced with `buildMergeValues`, but I did find a couple of
obvious ones where we check that we're merging scalars right before
calling `buildMerge`.
This is a follow-up suggested in https://reviews.llvm.org/D140964
Differential Revision: https://reviews.llvm.org/D141373
Diana Picus [Mon, 9 Jan 2023 10:59:00 +0000 (11:59 +0100)]
MachineIRBuilder: Rename buildMerge. NFC
`buildMerge` may build a G_MERGE_VALUES, G_BUILD_VECTOR or
G_CONCAT_VECTORS. Rename it to `buildMergeLikeInstr`.
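A type-driven opcode choice of this kind can be sketched with a reduced model. The exact rules below are an assumption for illustration, not a copy of the MachineIRBuilder implementation; `ToyType`, `MergeOpcode`, and `pickMergeLikeOpcode` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical, reduced model of a type: only whether it is a vector matters.
struct ToyType {
  bool IsVector;
};

enum class MergeOpcode { MergeValues, BuildVector, ConcatVectors };

// Sketch of an opcode choice driven by the input types. Assumed rules: vector
// sources into a vector concatenate, scalar sources into a vector build a
// vector, and anything with a scalar destination is a plain merge.
inline MergeOpcode pickMergeLikeOpcode(ToyType Dst, ToyType Src) {
  if (Dst.IsVector)
    return Src.IsVector ? MergeOpcode::ConcatVectors
                        : MergeOpcode::BuildVector;
  return MergeOpcode::MergeValues;
}
```

A caller that already knows it is merging scalars into a scalar does not need this dispatch, which is the situation an unconditional builder addresses.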
This is a follow-up suggested in https://reviews.llvm.org/D140964
Differential Revision: https://reviews.llvm.org/D141372
Diana Picus [Thu, 12 Jan 2023 10:01:54 +0000 (11:01 +0100)]
GlobalISel: s/Op/Instr in some places. NFC
This patch replaces `GMergeLikeOp` with `GMergeLikeInstr` and
`MachineIRBuilder::buildAssertOp` with `buildAssertInstr` in order to
remove ambiguity. Discussed in: https://reviews.llvm.org/D141372
Jean Perier [Fri, 13 Jan 2023 08:15:52 +0000 (09:15 +0100)]
[flang] Lower elemental intrinsics to hlfir.elemental
- Move the core code generating hlfir.elemental for user calls from
genUserElementalCall into a new ElementalCallBuilder class and use
C++ CRTP (curiously recurring template pattern) to implement the
parts specific to user and intrinsic call into ElementalUserCallBuilder
and ElementalIntrinsicCallBuilder. This allows sharing the core logic
to lower elemental procedures for both user-defined and intrinsic
procedures.
- To allow using ElementalCallBuilder, split the intrinsic lowering code
into two parts: first lower the arguments to hlfir::Entity regardless
of the interface of the intrinsics, and then, in a different function
(genIntrinsicProcRefCore), prepare the hlfir::Entity according to the
interface. This allows using the same core logic to prepare "normal"
arguments for non-elemental intrinsics, and to prepare the elements of
array arguments inside an elemental call (ElementalIntrinsicCallBuilder
calls genIntrinsicProcRefCore once it has computed the scalar actual
arguments).
To allow this split, genExprBox/genExprAddr/genExprValue logic had to
be split in ConvertExprToHlfir.[cpp/h].
- Add missing statement context pushScope/finalizeAndPop around the
code generation inside the hlfir.elemental so that any temps created
while lowering the call at the element level are correctly cleaned up.
- One piece of code in hlfir::Entity::hasNonDefaultLowerBounds() was wrong for assumed shape arrays (it returned true when an assumed shape array had no explicit lower bounds). This caused the added test to hit a bogus TODO, so fix it.
Elemental intrinsics returning are still TODO (e.g., adjustl). I will implement this in a later patch; this one is big enough.
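For readers unfamiliar with CRTP, here is a minimal standalone sketch of the sharing scheme described in the first bullet. The class names `ToyElementalBuilder`, `ToyUserCallBuilder` and `ToyIntrinsicCallBuilder` and the string output are hypothetical stand-ins; the real builders in the patch emit HLFIR operations and have different interfaces.

```cpp
#include <cassert>
#include <string>

// Shared driver: owns the common sequence and defers the call-specific
// step to the derived class through a static downcast (CRTP).
template <typename Impl>
class ToyElementalBuilder {
public:
  std::string build(const std::string &callee) {
    std::string out = "elemental {";
    out += static_cast<Impl *>(this)->genElementCall(callee);
    out += "}";
    return out;
  }
};

// Derived builder for user procedure calls.
class ToyUserCallBuilder : public ToyElementalBuilder<ToyUserCallBuilder> {
public:
  std::string genElementCall(const std::string &callee) {
    return "call " + callee;
  }
};

// Derived builder for intrinsic calls.
class ToyIntrinsicCallBuilder
    : public ToyElementalBuilder<ToyIntrinsicCallBuilder> {
public:
  std::string genElementCall(const std::string &callee) {
    return "intrinsic " + callee;
  }
};
```

The base class resolves `genElementCall` at compile time, so the shared logic is written once without virtual dispatch.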
Differential Revision: https://reviews.llvm.org/D141612
Siva Chandra Reddy [Fri, 13 Jan 2023 07:42:49 +0000 (07:42 +0000)]
[libc][NFC] Use C headers in host CPU sniffing code.
Yingchi Long [Fri, 16 Sep 2022 14:13:54 +0000 (22:13 +0800)]
[C2x] reject type definitions in offsetof
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2350.htm makes it
clear that having type definitions within offsetof is undefined behavior.
After this patch, clang will reject any type definitions in __builtin_offsetof.
Fixes https://github.com/llvm/llvm-project/issues/57065
```
local/offsetof.c:10:38: error: 'struct S' cannot be defined in '__builtin_offsetof'
return __builtin_offsetof(struct S{ int a, b;}, a);
^
```
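Code that trips the diagnostic above can usually be adapted by defining the type before it is used as the offsetof operand. The sketch below is written in C++ rather than C (the same ordering applies in C); `offsetOfA` and `offsetOfB` are hypothetical helper names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Define the type first, outside of the offsetof operand.
struct S {
  int a, b;
};

// The operand now only names an already-defined type.
constexpr std::size_t offsetOfA() { return offsetof(S, a); }
constexpr std::size_t offsetOfB() { return offsetof(S, b); }
```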
Reviewed By: aaron.ballman, #clang-language-wg
Differential Revision: https://reviews.llvm.org/D133574
chenglin.bi [Fri, 13 Jan 2023 07:31:19 +0000 (15:31 +0800)]
[Instcombine] Add precommit tests for pattern xor(and, or); NFC
Carlos Galvez [Thu, 12 Jan 2023 10:10:48 +0000 (10:10 +0000)]
[clang-tidy][doc] Deprecate the AnalyzeTemporaryDtors option
It's not used anywhere, and we should not keep it
for eternity. Document it as deprecated and announce
its removal 2 releases later, so people have time
to update their .clang-tidy files.
Differential Revision: https://reviews.llvm.org/D141583
Yingchi Long [Fri, 9 Dec 2022 13:05:37 +0000 (21:05 +0800)]
[RISCV][VP] expand vp intrinsics if no +zve32x feature
If the subtarget does not support VInstructions, expand vp intrinsics to scalar instructions.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D139706
Ben Shi [Thu, 12 Jan 2023 09:21:55 +0000 (17:21 +0800)]
[AVR] Fix a bug in AsmPrinter when printing inline-asm operands
Fixes https://github.com/llvm/llvm-project/issues/58878
Reviewed By: aykevl, Miss_Grape
Differential Revision: https://reviews.llvm.org/D141589
Vitaly Buka [Fri, 13 Jan 2023 05:46:51 +0000 (21:46 -0800)]
[X86] Remove unused variable after D140087
Joshua Cao [Fri, 6 Jan 2023 04:48:13 +0000 (20:48 -0800)]
[LoopReroll] Allow for multiple loop control only induction vars
Before this, LoopReroll would fail an assertion, falsely assuming that
there can only be a single loop-control-only induction variable.
For example:
```
%a = phi i16 [ %dec2, %for.body ], [ 0, %entry ]
%b = phi i16 [ %dec1, %for.body ], [ 0, %entry ]
%a.next = add nsw i16 %1, -1
%b.next = add nsw i16 %0, -1
%add = add nsw i16 %a, %b
; ... rerollable code
%cmp.not = icmp eq i16 -10, %add
br i1 %cmp.not, label %exit, label %loop
```
Both %a and %b are valid loop-control-only induction vars.
Additionally, this makes some NFC changes to remove an unnecessary isa<PHINode> check
and updates the complex_reroll checks.
Differential Revision: https://reviews.llvm.org/D141109
Noah Goldstein [Fri, 13 Jan 2023 02:30:00 +0000 (18:30 -0800)]
[X86] Improve mul x, 2^N +/- 2 pattern by making the +/- 2x compute independently to x << N
The previous pattern was emitting ops in sequence, which just increases the
latency (to 3c, same as imul!), i.e.:
`(add/sub (add/sub (shl x, N), x), x)`
It is better to compute 2x independently of x << N for better ULP, i.e.:
`(add/sub (shl x, N), (add x, x))`
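The two forms compute the same value; only the dependency structure differs. A small sketch on 32-bit unsigned values, where `mulChained`, `mulSplit` and `mulSplitSub` are hypothetical names for this note and not code from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Dependent chain: each add waits on the previous result.
constexpr uint32_t mulChained(uint32_t x, unsigned n) {
  return ((x << n) + x) + x;
}

// Split form: (x << n) and (x + x) do not depend on each other.
constexpr uint32_t mulSplit(uint32_t x, unsigned n) {
  return (x << n) + (x + x);
}

// Subtract variant for multipliers of the form 2^N - 2.
constexpr uint32_t mulSplitSub(uint32_t x, unsigned n) {
  return (x << n) - (x + x);
}
```

For example, with x = 7 and N = 3, both add forms give 7 * 10 = 70 and the sub form gives 7 * 6 = 42.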
Reviewed By: pengfei, RKSimon
Differential Revision: https://reviews.llvm.org/D141113
Noah Goldstein [Fri, 13 Jan 2023 02:24:43 +0000 (18:24 -0800)]
[X86] Replace (31/63 -/^ X) with (NOT X) and ignore (32/64 ^ X) when computing shift count
The shift count is masked by hardware, so these peepholes just extend
common patterns for NOT to the lower bits of the shift count.
Likewise, (32/64 ^ X) is masked off by the shift, so it can be safely
ignored.
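The underlying arithmetic can be checked for the 32-bit case with a short sketch. It assumes the count is masked to its low 5 bits, as described above; `shlMasked` and the three predicates are hypothetical helpers, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models a 32-bit shift whose count is masked to the low 5 bits.
constexpr uint32_t shlMasked(uint32_t value, uint32_t count) {
  return value << (count & 31u);
}

// (31 - x) and (~x) agree in the low 5 bits because 31 is -1 modulo 32.
constexpr bool subMatchesNot(uint32_t x) {
  return ((31u - x) & 31u) == (~x & 31u);
}

// (31 ^ x) flips exactly the low 5 bits, which is what (~x) does there.
constexpr bool xorMatchesNot(uint32_t x) {
  return ((31u ^ x) & 31u) == (~x & 31u);
}

// (32 ^ x) only flips bit 5, which the mask discards.
constexpr bool xor32IsIgnored(uint32_t x) {
  return ((32u ^ x) & 31u) == (x & 31u);
}
```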
Reviewed By: pengfei, lebedev.ri
Differential Revision: https://reviews.llvm.org/D140087
Matt Arsenault [Fri, 30 Dec 2022 14:21:49 +0000 (09:21 -0500)]
AMDGPU/GlobalISel: Make regbankselect of implicit_def consistent with constants
LiDongjin [Fri, 13 Jan 2023 03:48:22 +0000 (11:48 +0800)]
[RISCV] Change the return type of getStreamer() to support the use of overloading and other functions in RISCVELFStreamer
Move the declaration of RISCVELFStreamer from RISCVELFStreamer.cpp to RISCVELFStreamer.h.
Change the return type of getStreamer() to support the use of overloading and other functions in RISCVELFStreamer.
Differential Revision: https://reviews.llvm.org/D138500
Med Ismail Bennani [Wed, 11 Jan 2023 00:28:25 +0000 (16:28 -0800)]
[lldb/test] Fix data racing issue in TestStackCoreScriptedProcess
This patch should fix a nondeterministic error in TestStackCoreScriptedProcess.
In order to test both the multithreading capability and shared library
loading in Scripted Processes, the test would create multiple threads
that would take the same variable as a reference.
The first thread would alter the value and the second thread would
monitor the value until it gets altered. This assumed a certain ordering
regarding the `std::thread` spawning, however the ordering was not
always guaranteed at runtime.
To fix that, the test now makes use of a `std::condition_variable`
shared between the threads. The former thread notifies the other
thread when the variable gets initialized or updated, and the latter
waits until it receives a new notification.
This should fix the data racing issue while preserving the testing
coverage.
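The synchronization scheme can be sketched as follows. This is a minimal standalone illustration, not the test program from the patch; `runHandshake` and all variable names are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns the value observed by the waiting thread.
int runHandshake() {
  std::mutex m;
  std::condition_variable cv;
  int value = 0;
  bool updated = false;
  int observed = -1;

  // Waiter: blocks until the writer signals, whatever order the
  // threads happen to start in.
  std::thread waiter([&] {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [&] { return updated; });
    observed = value;
  });

  // Writer: updates the shared variable under the lock, then notifies.
  std::thread writer([&] {
    {
      std::lock_guard<std::mutex> lock(m);
      value = 42;
      updated = true;
    }
    cv.notify_one();
  });

  waiter.join();
  writer.join();
  return observed;
}
```

The predicate passed to `wait` makes the result independent of spawn order: if the writer runs first, the waiter sees `updated` already set and does not block.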
rdar://98678134
Differential Revision: https://reviews.llvm.org/D139484
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Med Ismail Bennani [Thu, 12 Jan 2023 23:30:24 +0000 (15:30 -0800)]
[lldb] Update custom commands to always be overridden
This is a follow-up patch to 6f7835f309b9.
As explained previously, when running from an IDE, it can happen that
the IDE imports some lldb scripts by itself. If the user also tries to
import these commands, lldb will show the following message:
```
error: cannot add command: user command exists and force replace not set
```
This message is confusing to the user, because it suggests that the
command import failed and that the execution should stop. However, in
this case, lldb will continue the execution with the command added
previously by the user.
To prevent that, this patch updates all first-party lldb-packaged
custom commands to override commands that were pre-imported in lldb.
Differential Revision: https://reviews.llvm.org/D140293
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>