platform/upstream/llvm.git
2 years ago[llvm] Use make_early_inc_range (NFC)
Kazu Hirata [Tue, 16 Nov 2021 05:28:46 +0000 (21:28 -0800)]
[llvm] Use make_early_inc_range (NFC)

2 years ago[gn build] Port dc84770d559b
LLVM GN Syncbot [Tue, 16 Nov 2021 05:11:06 +0000 (05:11 +0000)]
[gn build] Port dc84770d559b

2 years ago[GlobalISel] Add a store-merging optimization pass and enable for AArch64.
Amara Emerson [Mon, 15 Nov 2021 22:45:30 +0000 (14:45 -0800)]
[GlobalISel] Add a store-merging optimization pass and enable for AArch64.

This is a first attempt at a constant value consecutive store merging pass,
a counterpart to the DAGCombiner's store merging optimization.

The high level goals of this pass:

* Have a simple and efficient algorithm. As close to linear time as we can get.
  Thus, prioritizing scalability of the algorithm over merging every corner case
  we can find. The DAGCombiner's store merging code has been the source of
  compile time and complexity issues in the past and I wanted to avoid that.
* Don't introduce any new data structures for ordering memory operations. In MIR,
  we don't have the concept of chains like we do in the DAG, and the instruction
  order is stricter than enforcing ordering with graph edges. Although I
  considered adding something similar, I couldn't justify the overhead.

The pass is currently split into 3 main parts. The main store merging code focuses
on identifying candidate stores and managing the candidate group that's under
consideration for merging. Analyzing addressing of stores is a potentially
complex part and for now there's just a basic implementation to identify easy
cases. Finally, the other main bit of complexity is the alias analysis, which
tries to follow the same logic as the DAG's AA.

Currently this implementation only supports merging of constant stores. Stores
of arbitrary variables are technically possible with a very small change, but
the DAG chooses not to do this. Doing so here makes most code worse since
there's extra overhead in merging values into wider registers.
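
As a rough illustration of the idea (not the pass's actual code), the sketch below merges runs of adjacent constant stores in a single linear scan. The `Store` record, the `merge_constant_stores` helper, the little-endian layout, and the 8-byte width limit are all assumptions made for this example; it ignores aliasing, intervening instructions, and whether the merged width is legal for the target.

```python
from dataclasses import dataclass


@dataclass
class Store:
    # Hypothetical record for a constant store; not an LLVM type.
    offset: int  # byte offset from a common base pointer
    size: int    # width of the store in bytes
    value: int   # constant being stored


def merge_constant_stores(stores, max_width=8):
    """Greedily merge adjacent constant stores, assuming little-endian layout."""
    merged = []
    for st in stores:
        value = st.value & ((1 << (8 * st.size)) - 1)
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev.offset + prev.size == st.offset
                and prev.size + st.size <= max_width):
            # The store at the higher address supplies the high-order bytes.
            combined = prev.value | (value << (8 * prev.size))
            merged[-1] = Store(prev.offset, prev.size + st.size, combined)
        else:
            merged.append(Store(st.offset, st.size, value))
    return merged
```

Each store is visited once, which mirrors the stated preference for a near-linear algorithm over catching every corner case.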

On AArch64 -Os, this optimization results in very minor savings on CTMark.

Differential Revision: https://reviews.llvm.org/D109131

2 years ago[llvm-profgen] Add switch to allow use of first loadable segment for calculating...
Wenlei He [Fri, 12 Nov 2021 02:28:47 +0000 (18:28 -0800)]
[llvm-profgen] Add switch to allow use of first loadable segment for calculating offset

Adding `-use-loadable-segment-as-base` to allow use of first loadable segment for calculating offset. By default first executable segment is used for calculating offset. The switch helps compatibility with unsymbolized profile generated from older tools.

Differential Revision: https://reviews.llvm.org/D113727

2 years agoAdd the stop count to "statistics dump" in each target's dictionary.
Greg Clayton [Fri, 12 Nov 2021 23:26:27 +0000 (15:26 -0800)]
Add the stop count to "statistics dump" in each target's dictionary.

It is useful to know how many times the target has stopped over its lifetime, because the target can stop and resume without the user seeing it, for example for shared library loading or for signals that are not notified and are auto-continued. The count can help explain why a debug session might be slow. It is now included as "stopCount" inside each target's JSON.

Differential Revision: https://reviews.llvm.org/D113810

2 years ago[RISCV] Override TargetLowering::hasAndNot for Zbb.
Craig Topper [Tue, 16 Nov 2021 01:45:16 +0000 (17:45 -0800)]
[RISCV] Override TargetLowering::hasAndNot for Zbb.

Differential Revision: https://reviews.llvm.org/D113937

2 years ago[RISCV] Add test cases to prepare for overring TargetLowering::hasAndNot. NFC
Craig Topper [Tue, 16 Nov 2021 01:32:15 +0000 (17:32 -0800)]
[RISCV] Add test cases to prepare for overring TargetLowering::hasAndNot. NFC

These test files are copied directly from AArch64. Some of the cases
may benefit from ANDN with the Zbb extension. Some cases already
improve by using ANDN.

selectcc-to-shiftand.ll also contains tests that test select->and
conversion even when an ANDN isn't needed. I think this improves our
coverage of these optimizations.

Differential Revision: https://reviews.llvm.org/D113935

2 years ago[X86] Fix crash with inline asm using wrong register name
Fabian Wolff [Tue, 16 Nov 2021 00:54:31 +0000 (08:54 +0800)]
[X86] Fix crash with inline asm using wrong register name

Fixes PR#48678. `X86TargetLowering::getRegForInlineAsmConstraint()` can adjust the register class to match the type, e.g. change `VR128X` to `VR256X` if the type needs 256 bits. However, the function currently returns the unadjusted register and the adjusted register class, e.g. `xmm15` and `VR256X`, which then causes an assertion failure later because the register class does not contain that register. This patch fixes this behavior.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D113834

2 years agoAMDGPU: Mark prolog/epilog SCC defs as dead
Matt Arsenault [Sat, 13 Nov 2021 17:03:44 +0000 (12:03 -0500)]
AMDGPU: Mark prolog/epilog SCC defs as dead

A future change will add SCC liveness checks. Since we are still
relying on forward register scavenging, add dead flags to avoid
spuriously detecting SCC as live.

2 years agoAMDGPU: Regenerate test checks
Matt Arsenault [Sat, 13 Nov 2021 17:13:58 +0000 (12:13 -0500)]
AMDGPU: Regenerate test checks

2 years ago[mlir][linalg][bufferize][NFC] Clean up tensor op bufferization
Matthias Springer [Tue, 16 Nov 2021 02:10:57 +0000 (11:10 +0900)]
[mlir][linalg][bufferize][NFC] Clean up tensor op bufferization

Differential Revision: https://reviews.llvm.org/D113730

2 years ago[clang] NFC: rename internal `IsPossiblyOpaquelyQualifiedType` overload
Matheus Izvekov [Tue, 16 Nov 2021 00:49:30 +0000 (01:49 +0100)]
[clang] NFC: rename internal `IsPossiblyOpaquelyQualifiedType` overload

Rename `IsPossiblyOpaquelyQualifiedType` overload taking a Type*
as `IsPossiblyOpaquelyQualifiedTypeInternal` instead.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D113954

2 years agoDebugInfo: Make DWARFExpression::iterator::skipBytes() const, NFC
Duncan P. N. Exon Smith [Tue, 16 Nov 2021 01:18:07 +0000 (17:18 -0800)]
DebugInfo: Make DWARFExpression::iterator::skipBytes() const, NFC

Given that DWARFExpression::iterator::skipBytes() doesn't change any
state (it returns a new `iterator`), it might as well be
`const`-qualified.

2 years agoDebugInfo: const-qualify accessors of DWARFExpression::Operation
Duncan P. N. Exon Smith [Tue, 16 Nov 2021 01:05:09 +0000 (17:05 -0800)]
DebugInfo: const-qualify accessors of DWARFExpression::Operation

Add `const` to DWARFExpression::Operation's accessors and make
Operation::extract() private, since it's only used by the friend class
DWARFExpression::iterator.

2 years agoDebugInfo: Make DWARFExpression::iterator::operator++ return itself
Duncan P. N. Exon Smith [Tue, 16 Nov 2021 00:42:48 +0000 (16:42 -0800)]
DebugInfo: Make DWARFExpression::iterator::operator++ return itself

Looks like an accident that `operator++` was returning `Operator&`
instead of `iterator&`. Update to match standard iterator behaviour.

2 years ago[DAGCombiner] Prevent unfoldMaskedMerge from creating an AND with two inverted inputs.
Craig Topper [Mon, 15 Nov 2021 23:16:37 +0000 (15:16 -0800)]
[DAGCombiner] Prevent unfoldMaskedMerge from creating an AND with two inverted inputs.

It's possible that the mask is already a NOT. At least if InstCombine
hasn't canonicalized the input. In that case we will form an ANDN with
X instead of with Y. So we don't need to worry about Y being a constant.

We might need to check that X isn't a constant instead, but we don't
have a test case for that yet.
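
For readers unfamiliar with the pattern, the sketch below checks the masked-merge identity exhaustively on 4-bit values. It assumes the unfold rewrites ((x ^ y) & m) ^ y into (x & m) | (y & ~m); that form is an assumption based on the function name and is not spelled out in this message. The helper names are made up for the example. The last function shows why an already-inverted mask moves the and-not onto x.

```python
from itertools import product

BITS = 4
MASK = (1 << BITS) - 1


def masked_merge(x, y, m):
    # Take bits of x where m is set, bits of y elsewhere.
    return ((x ^ y) & m) ^ y


def unfolded(x, y, m):
    # Assumed unfolded form: the and-not pairs with y.
    return (x & m) | (y & ~m & MASK)


def unfolded_inverted_mask(x, y, n):
    # When m is already ~n, the and-not pairs with x instead of y.
    return (x & ~n & MASK) | (y & n)


def check_masked_merge():
    for x, y, m in product(range(1 << BITS), repeat=3):
        n = ~m & MASK
        if masked_merge(x, y, m) != unfolded(x, y, m):
            return False
        if masked_merge(x, y, m) != unfolded_inverted_mask(x, y, n):
            return False
    return True
```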

This fixes a size regression found when trying to enable this combine
for RISCV in D113937.

Differential Revision: https://reviews.llvm.org/D113948

2 years agoadd tsan shared lib
ZijunZhao [Wed, 1 Sep 2021 21:52:25 +0000 (21:52 +0000)]
add tsan shared lib

Change-Id: Ic83ff1ec86d6a7d61b07fa3df7e0cb2790b5ebc7

2 years ago[LLDB][NativePDB] Fix local-variables.cpp failure on windows bots
Zequan Wu [Tue, 16 Nov 2021 00:14:49 +0000 (16:14 -0800)]
[LLDB][NativePDB] Fix local-variables.cpp failure on windows bots

2 years ago[Bazel] Enable layering_check for MLIR build
Geoffrey Martin-Noble [Mon, 15 Nov 2021 23:31:08 +0000 (15:31 -0800)]
[Bazel] Enable layering_check for MLIR build

This feature checks that headers included by a file are provided by a
header exported by one of the direct dependencies of the build rule in
which it is contained. It ensures that appropriate layering (a goal of
the LLVM project) is preserved. So far, I'm only adding this to MLIR
because we've had it turned on internally since the beginning, so MLIR
is already layering clean. It would be nice to also enable it for LLVM,
but that requires some additional cleanup.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D113952

2 years ago[InstSimplify] Fold A|B | (A^B) --> A|B
Mehrnoosh Heidarpour [Mon, 15 Nov 2021 23:54:07 +0000 (18:54 -0500)]
[InstSimplify] Fold A|B | (A^B) --> A|B

This patch adds the following fold opportunity:
A|B | (A^B) --> A|B

that is reported here: https://bugs.llvm.org/show_bug.cgi?id=52479

https://alive2.llvm.org/ce/z/33-My-

Test cases with base results are added in D113860
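
The fold can also be sanity-checked by brute force. The snippet below is only an illustration on small bit widths (the alive2 link above is the real proof); `or_xor_fold_holds` is a name invented for this example.

```python
def or_xor_fold_holds(bits=4):
    """Check (A|B) | (A^B) == A|B for every pair of `bits`-wide values."""
    limit = 1 << bits
    return all(
        ((a | b) | (a ^ b)) == (a | b)
        for a in range(limit)
        for b in range(limit)
    )
```

The fold holds because every bit set in A^B is set in A or in B, so it is already set in A|B.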

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D113861

2 years ago[RISCV] Optimize immediate materialisation with SH*ADD
Ben Shi [Thu, 11 Nov 2021 12:57:33 +0000 (12:57 +0000)]
[RISCV] Optimize immediate materialisation with SH*ADD

Use LUI+SH*ADD+ADDI to compose specific immediates.

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D113568

2 years ago[RISCV][test] Add more tests of immediate materialisation
Ben Shi [Wed, 10 Nov 2021 07:40:13 +0000 (07:40 +0000)]
[RISCV][test] Add more tests of immediate materialisation

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D113567

2 years ago[mlir][sparse] first version of "truly" dynamic sparse tensors as outputs of kernels
Aart Bik [Thu, 11 Nov 2021 18:05:01 +0000 (10:05 -0800)]
[mlir][sparse] first version of "truly" dynamic sparse tensors as outputs of kernels

This revision contains all "sparsification" ops and rewriting necessary to support sparse output tensors when the kernel has no reduction (viz. insertions occur in lexicographic order and are "injective"). This will be later generalized to allow reductions too. Also, this first revision only supports sparse 1-d tensors (viz. vectors) as output in the runtime support library. This will be generalized to n-d tensors shortly. But this way, the revision is kept to a manageable size.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113705

2 years ago[tosa][mlir] Refactor tosa.reshape lowering to linalg for dynamic cases.
natashaknk [Mon, 15 Nov 2021 23:10:36 +0000 (15:10 -0800)]
[tosa][mlir] Refactor tosa.reshape lowering to linalg for dynamic cases.

Split tosa.reshape into three individual lowerings: collapse, expand and a
combination of both. Add simple dynamic shape support.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113936

2 years agoRevert "[InstSimplify] Fold A|B | (A^B) --> A|B"
Stanislav Mekhanoshin [Mon, 15 Nov 2021 22:56:20 +0000 (14:56 -0800)]
Revert "[InstSimplify] Fold A|B | (A^B) --> A|B"

This reverts commit 193c40e9667ca2b173232b393fc72ea9e4944aa3.

2 years agoAdd `isInitCapture` and `forEachLambdaCapture` matchers.
James King [Mon, 15 Nov 2021 18:56:22 +0000 (18:56 +0000)]
Add `isInitCapture` and `forEachLambdaCapture` matchers.

This contributes follow-up work from https://reviews.llvm.org/D112491, which
allows for increased control over the matching of lambda captures. This also
updates the documentation for the `lambdaCapture` matcher.

Reviewed By: ymandel, aaron.ballman

Differential Revision: https://reviews.llvm.org/D113575

2 years ago[llvm-reduce] Don't reuse SmallVector across calls to getAllMetadata()
Arthur Eubanks [Fri, 12 Nov 2021 23:10:59 +0000 (15:10 -0800)]
[llvm-reduce] Don't reuse SmallVector across calls to getAllMetadata()

The SmallVector is not cleared in calls to getAllMetadata().

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113808

2 years ago[mlir][tosa] Add tosa.mul by one canonicalization
not-jenni [Mon, 15 Nov 2021 22:44:13 +0000 (14:44 -0800)]
[mlir][tosa] Add tosa.mul by one canonicalization

Multiply by one can be removed during canonicalization. This optimizes away unneeded operations.

Differential Revision: https://reviews.llvm.org/D113807

2 years ago[NewPM] Only invalidate modified functions' analyses in CGSCC passes + turn on eagerl...
Arthur Eubanks [Mon, 3 May 2021 23:50:26 +0000 (16:50 -0700)]
[NewPM] Only invalidate modified functions' analyses in CGSCC passes + turn on eagerly invalidate analyses

Previously, any change in any function in an SCC would cause all
analyses for all functions in the SCC to be invalidated. With this
change, we now manually invalidate analyses for functions we modify,
then let the pass manager know that all function analyses should be
preserved since we've already handled function analysis invalidation.

So far this only touches the inliner, argpromotion, function-attrs, and
updateCGAndAnalysisManager(), since they are the most used.

This is part of an effort to investigate running the function
simplification pipeline less on functions we visit multiple times in the
inliner pipeline.

However, this causes major memory regressions especially on larger IR.
To counteract this, turn on the option to eagerly invalidate function
analyses. This invalidates analyses on functions immediately after
they're processed in a module or scc to function adaptor for specific
parts of the pipeline.

Within an SCC, if a pass only modifies one function, other functions in
the SCC do not have their analyses invalidated, so in later function
passes in the SCC pass manager the analyses may still be cached. It is
only after the function passes that the eager invalidation takes effect.
For the default pipelines this makes sense because the inliner pipeline
runs the function simplification pipeline after all other SCC passes
(except CoroSplit which doesn't request any analyses).

Overall this has mostly positive effects on compile time and positive effects on memory usage.
https://llvm-compile-time-tracker.com/compare.php?from=7f627596977624730f9298a1b69883af1555765e&to=39e824e0d3ca8a517502f13032dfa67304841c90&stat=instructions
https://llvm-compile-time-tracker.com/compare.php?from=7f627596977624730f9298a1b69883af1555765e&to=39e824e0d3ca8a517502f13032dfa67304841c90&stat=max-rss

D113196 shows that we slightly regressed compile times in exchange for
some memory improvements when turning on eager invalidation.  D100917
shows that we slightly improved compile times in exchange for major
memory regressions in some cases when invalidating less in SCC passes.
Turning these on at the same time keeps the memory improvements while
keeping compile times neutral/slightly positive.

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D113304

2 years ago[clang] retain type sugar in auto / template argument deduction
Matheus Izvekov [Fri, 12 Nov 2021 23:40:18 +0000 (00:40 +0100)]
[clang] retain type sugar in auto / template argument deduction

This implements the following changes:
* AutoType retains sugared deduced-as-type.
* Template argument deduction machinery analyses the sugared type all the way
down. It would previously lose the sugar on first recursion.
* Undeduced AutoType will be properly canonicalized, including the constraint
template arguments.
* Remove the decltype node created from the decltype(auto) deduction.

As a result, we start seeing sugared types in a lot more test cases,
including some which showed very unfriendly `type-parameter-*-*` types.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith, #libc, ldionne

Differential Revision: https://reviews.llvm.org/D110216

2 years ago[JITLink] Fix splitBlock if there are symbols span across the boundary
Steven Wu [Mon, 15 Nov 2021 21:55:06 +0000 (13:55 -0800)]
[JITLink] Fix splitBlock if there are symbols span across the boundary

Fix `splitBlock` so that it can handle the case where the block being
split has symbols that span across the split boundary. This is an error
case in general, but for EHFrame splitting on MachO platforms there is an
anonymous symbol that marks the entire block. The current implementation
leaves a symbol that is out of bounds of the underlying block. Fix
the problem by dropping such symbols when the block is split.
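
A toy model of the described behaviour, for illustration only: the `Symbol` record and `split_symbols` helper are hypothetical and do not mirror the JITLink API. Symbols entirely before the split point stay with the first part, symbols entirely after it are rebased onto the second part, and symbols that straddle the boundary are dropped.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    # Hypothetical stand-in for a symbol defined inside a block.
    name: str
    offset: int  # start offset within the original block
    size: int


def split_symbols(symbols, split_index):
    """Partition symbols around split_index, dropping any that span it."""
    first, second = [], []
    for sym in symbols:
        end = sym.offset + sym.size
        if end <= split_index:
            first.append(sym)
        elif sym.offset >= split_index:
            # Rebase the offset so it is relative to the second part.
            second.append(Symbol(sym.name, sym.offset - split_index, sym.size))
        # Otherwise the symbol spans the boundary and is dropped.
    return first, second
```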

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D113912

2 years ago[InstSimplify] Fold A|B | (A^B) --> A|B
Stanislav Mekhanoshin [Mon, 15 Nov 2021 20:24:44 +0000 (12:24 -0800)]
[InstSimplify] Fold A|B | (A^B) --> A|B

This patch adds the following fold opportunity:
A|B | (A^B) --> A|B

that is reported here: https://bugs.llvm.org/show_bug.cgi?id=52479

https://alive2.llvm.org/ce/z/33-My-

Test cases with base results are added in D113860

(authored by MehrHeidar, committed by rampitec).

Differential Revision:  https://reviews.llvm.org/D113861

2 years ago[SystemZ] Support symbolic displacements.
Jonas Paulsson [Fri, 5 Nov 2021 17:51:13 +0000 (18:51 +0100)]
[SystemZ] Support symbolic displacements.

This patch adds support for symbolic displacements, such as 'lg %r0,
sym(%r1)', which is done using relocations. This is needed to compile the
kernel without disabling the integrated assembler.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D113341

2 years ago[mlir][Vector] Make vector.shape_cast based size-1 foldings opt-in and separate.
Nicolas Vasilache [Mon, 15 Nov 2021 21:17:24 +0000 (21:17 +0000)]
[mlir][Vector] Make vector.shape_cast based size-1 foldings opt-in and separate.

This is in anticipation of dropping them altogether and using insert/extract based patterns.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D113928

2 years ago[NFC][Regalloc] Factor types that would be used by the eviction advisor
Mircea Trofin [Mon, 15 Nov 2021 19:42:04 +0000 (11:42 -0800)]
[NFC][Regalloc] Factor types that would be used by the eviction advisor

This is in preparation for pulling the eviction decision-making into an
analysis pass, which would then allow swapping that decision making
process.

RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html

Differential Revision: https://reviews.llvm.org/D113929

2 years ago[X86] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off build. NFC
Fangrui Song [Mon, 15 Nov 2021 21:10:47 +0000 (13:10 -0800)]
[X86] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off build. NFC

2 years ago[llvm] adapt DWARFExpression.h for 6b9b86db9dd974585a5c71cf2e5231d1e3385f7c
Krasimir Georgiev [Mon, 15 Nov 2021 20:40:37 +0000 (21:40 +0100)]
[llvm] adapt DWARFExpression.h for 6b9b86db9dd974585a5c71cf2e5231d1e3385f7c

No functional changes intended.

Updated the iterator class for https://github.com/llvm/llvm-project/commit/6b9b86db9dd974585a5c71cf2e5231d1e3385f7c.

Differential Revision: https://reviews.llvm.org/D113934

2 years ago[unroll-runtime] Relax two profitability limitations on multi-exit unrolling
Philip Reames [Mon, 15 Nov 2021 20:30:48 +0000 (12:30 -0800)]
[unroll-runtime] Relax two profitability limitations on multi-exit unrolling

This change is mostly about getting rid of some "uninteresting" cases ahead of a follow-on deeper heuristic change.  If anyone sees actually interesting code differences out of this, please let me know.  I'm not expecting this to have much impact at all.

Case 1 - With the single deoptimize non-latch exit, we can't have two exiting blocks sharing an exit block.  We can only hit this with a poorly documented debug flag.

Case 2 - Why should we treat epilog cases differently from prolog cases?  Or to say it differently, why should starting with a constant control whether a multiple exit loop gets unrolled?

Sorry for the lack of tests here.  These are both *exceedingly* narrow cases in practice, and after a while trying, I couldn't come up with a test which did anything "useful" as opposed to simply exercise a random combination of force flags.  Note that the legality cases for each are already exercised with force flags.

2 years ago[InstrProf][NFC] Fix a few typos in code comments.
Snehasish Kumar [Mon, 15 Nov 2021 20:53:47 +0000 (12:53 -0800)]
[InstrProf][NFC] Fix a few typos in code comments.

2 years ago[mlir][Linalg] Add a DownscaleDepthwiseConv2DNhwcHwcOp decomposition pattern.
Nicolas Vasilache [Mon, 15 Nov 2021 20:39:37 +0000 (20:39 +0000)]
[mlir][Linalg] Add a DownscaleDepthwiseConv2DNhwcHwcOp decomposition pattern.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113907

2 years ago[asm] Make EmitMSInlineAsmStr and EmitGCCInlineAsmStr more alike
Nico Weber [Mon, 15 Nov 2021 18:55:05 +0000 (13:55 -0500)]
[asm] Make EmitMSInlineAsmStr and EmitGCCInlineAsmStr more alike

https://reviews.llvm.org/D71677 copied a bunch of code from
EmitGCCInlineAsmStr() to EmitMSInlineAsmStr() but made a few small
(likely unintentional) changes. This makes these pieces look the same.

No behavior change.

(Why are these functions two copies? No great reason as far as I can tell.
https://reviews.llvm.org/rG1778831a3d1d24ab6545635f63da4d9c5f8f0ac7 did the
split; we might want to undo them at some point. But PR23933 suggests
that a bigger change is planned for this file in the future, so keeping
this incremental for now.)

Differential Revision: https://reviews.llvm.org/D113924

2 years ago[asm] Convert AsmPrinter::PrintSpecial() to StringRef
Nico Weber [Mon, 15 Nov 2021 17:32:12 +0000 (12:32 -0500)]
[asm] Convert AsmPrinter::PrintSpecial() to StringRef

No behavior change.

Differential Revision: https://reviews.llvm.org/D113911

2 years ago[asm] Correctly handle special names in variants
Nico Weber [Mon, 15 Nov 2021 17:09:09 +0000 (12:09 -0500)]
[asm] Correctly handle special names in variants

There's really no reason why anyone should use these special names in a variant.
I noticed this while reading the code: all other writes to OS are guarded by
this conditional, and the behavior with the check seems more correct, so
let's add the check.

Differential Revision: https://reviews.llvm.org/D113909

2 years ago[PowerPC] Fix 32bit vector insert instructions for ISA3.1
Lei Huang [Fri, 12 Nov 2021 21:05:52 +0000 (15:05 -0600)]
[PowerPC] Fix 32bit vector insert instructions for ISA3.1

The platform-independent ISD::INSERT_VECTOR_ELT takes an element index,
but vins* instructions take a byte index. Update the 32-bit td patterns for
vector insert to handle the element index accordingly.

Since vector insert with a non-constant index is supported in
ISA3.1, there is no need to use the platform-specific ISD node,
PPCISD::VECINSERT.  Update the td patterns to use
ISD::INSERT_VECTOR_ELT directly instead.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D113802

2 years ago[NFC][lld] Inclusive language: change master file to merged file
Quinn Pham [Mon, 15 Nov 2021 15:54:41 +0000 (09:54 -0600)]
[NFC][lld] Inclusive language: change master file to merged file

[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with merged in these comments.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D113903

2 years ago[LLDB][NativePDB] Fix image lookup by address
Zequan Wu [Sat, 13 Nov 2021 01:38:05 +0000 (17:38 -0800)]
[LLDB][NativePDB] Fix image lookup by address

`image lookup -a ` doesn't work because the compilands list is always empty.
Create the CU at the given index if it doesn't exist.

Differential Revision: https://reviews.llvm.org/D113821

2 years agoMaking the code compliant to the documentation about Floating Point
Zahira Ammarguellat [Mon, 15 Nov 2021 19:31:59 +0000 (14:31 -0500)]
Making the code compliant to the documentation about Floating Point
support default values for C/C++. FFP-MODEL=PRECISE enables FFP-CONTRACT
(FMA is enabled).

2 years agoFix a misleading FIXME in an unroll test
Philip Reames [Mon, 15 Nov 2021 20:19:53 +0000 (12:19 -0800)]
Fix a misleading FIXME in an unroll test

2 years ago[SLP][NFC]Add a test for extra shuffle emission, NFC.
Alexey Bataev [Mon, 15 Nov 2021 20:09:32 +0000 (12:09 -0800)]
[SLP][NFC]Add a test for extra shuffle emission, NFC.

2 years ago[llvm][fix] Inclusive language: replace master with main in find_interesting_reviews.py
Quinn Pham [Mon, 15 Nov 2021 18:38:26 +0000 (12:38 -0600)]
[llvm][fix] Inclusive language: replace master with main in find_interesting_reviews.py

As part of using inclusive language within the llvm project and to fix
the command because the master branch was renamed, this patch replaces
master with main in `find_interesting_reviews.py`.

Reviewed By: kristof.beyls

Differential Revision: https://reviews.llvm.org/D113918

2 years ago[runtime-unroll] Inline canSafelyUnrollMultiExitLoop [NFC]
Philip Reames [Mon, 15 Nov 2021 19:38:06 +0000 (11:38 -0800)]
[runtime-unroll] Inline canSafelyUnrollMultiExitLoop [NFC]

All of the interesting logic from this routine has been removed, so inline the single check into the sole non-assert caller.  The assert use has little value with the restructured code and is simply dropped.

2 years ago[mlir][linalg] Make loop ops in TileLoopNest accessible
Lei Zhang [Mon, 15 Nov 2021 19:33:31 +0000 (14:33 -0500)]
[mlir][linalg] Make loop ops in TileLoopNest accessible

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113927

2 years ago[PatternMatch] Add m_BinOp/m_c_BinOp with specific opcode
Stanislav Mekhanoshin [Tue, 9 Nov 2021 20:22:03 +0000 (12:22 -0800)]
[PatternMatch] Add m_BinOp/m_c_BinOp with specific opcode

Differential Revision: https://reviews.llvm.org/D113508

2 years ago[msan] Disabled test failing on new GLIBC
Vitaly Buka [Mon, 15 Nov 2021 19:18:40 +0000 (11:18 -0800)]
[msan] Disabled test failing on new GLIBC

2 years ago[runtime-unroll] Restructure if-clause to improve readability [NFC]
Philip Reames [Mon, 15 Nov 2021 19:13:27 +0000 (11:13 -0800)]
[runtime-unroll] Restructure if-clause to improve readability [NFC]

2 years ago[SLP][DOT][NFCI]Output all scalars for the splats, not only the first one.
Alexey Bataev [Mon, 15 Nov 2021 18:53:39 +0000 (10:53 -0800)]
[SLP][DOT][NFCI]Output all scalars for the splats, not only the first one.

2 years agoRevert "[llvm][ubsan] Inclusive language: replace use of blacklist HandleLLVMOptions...
Zarko Todorovski [Mon, 15 Nov 2021 18:54:15 +0000 (18:54 +0000)]
Revert "[llvm][ubsan] Inclusive language: replace use of blacklist HandleLLVMOptions.cmake"

This reverts commit 44a64afd43943ed6f47c37f61a6cd2e99c7287f3.

2 years ago[AIX][llvm-go] AIX linker does not recognize `-rpath`
Steven Wan [Mon, 15 Nov 2021 18:12:59 +0000 (13:12 -0500)]
[AIX][llvm-go] AIX linker does not recognize `-rpath`

The existing logic adds `-rpath` to CGO_LDFLAGS, which is not a valid linker option on AIX. This patch substitutes it with `-blibpath` on AIX.

Reviewed By: daltenty

Differential Revision: https://reviews.llvm.org/D113704

2 years ago[analyzer][NFC] Separate CallDescription from CallEvent
Balazs Benics [Mon, 15 Nov 2021 18:10:46 +0000 (19:10 +0100)]
[analyzer][NFC] Separate CallDescription from CallEvent

`CallDescription` deserves its own translation unit.
This patch simply moves the corresponding parts.
Also includes the `CallDescription.h` where it's necessary.

Reviewed By: martong, xazax.hun, Szelethus

Differential Revision: https://reviews.llvm.org/D113587

2 years ago[X86] Add generic splitVectorOp helper. NFC
Simon Pilgrim [Mon, 15 Nov 2021 17:57:33 +0000 (17:57 +0000)]
[X86] Add generic splitVectorOp helper. NFC

Update splitVectorIntUnary/splitVectorIntBinary to use this internally, after some operand type sanity checks.

Avoids code duplication and makes it easier to split other vector instruction forms in the future.

2 years ago[RISCV] Teach needVSETVLIPHI to handle mask register instructions.
Craig Topper [Mon, 15 Nov 2021 17:49:22 +0000 (09:49 -0800)]
[RISCV] Teach needVSETVLIPHI to handle mask register instructions.

This handles the case where the mask register instruction input
comes from a Phi of vsetvlis. If the VLMAX is the same as the VLMAX
required by the mask register instruction, we can avoid a vsetvli.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D113204

2 years ago[flang] Allow implicit procedure pointers to associate with explicit procedures
Peter Steinfeld [Thu, 11 Nov 2021 18:46:26 +0000 (10:46 -0800)]
[flang] Allow implicit procedure pointers to associate with explicit procedures

Section 10.2.2.4, paragraph 3 states that, for procedure pointer assignment:
  If the pointer object has an explicit interface, its characteristics shall be
  the same as the pointer target ...

Thus, it's illegal for a procedure pointer with an explicit interface to be
associated with a procedure whose interface is implicit.  However, there's no
prohibition that disallows a procedure pointer with an implicit interface from
being associated with a procedure whose interface is explicit.

We were incorrectly emitting an error message for this latter case.

We were also not covering the case of procedures with explicit
interfaces where calling them requires the use of a descriptor.  Such
procedures cannot be associated with procedure pointers with implicit
interfaces.

Differential Revision: https://reviews.llvm.org/D113706

2 years ago[NFC][X86][Costmodel] Improve test coverage for {i8,i16,i32,i64}->i1 vector trunc
Roman Lebedev [Mon, 15 Nov 2021 17:46:14 +0000 (20:46 +0300)]
[NFC][X86][Costmodel] Improve test coverage for {i8,i16,i32,i64}->i1 vector trunc

2 years ago[NFC][X86][Costmodel] Improve test coverage for i1->{i8,i16,i32,i64} vector *ext
Roman Lebedev [Mon, 15 Nov 2021 17:33:11 +0000 (20:33 +0300)]
[NFC][X86][Costmodel] Improve test coverage for i1->{i8,i16,i32,i64} vector *ext

2 years ago[InstCombine] Fold (A^B)|~A-->~(A&B)
Mehrnoosh Heidarpour [Mon, 15 Nov 2021 17:20:46 +0000 (12:20 -0500)]
[InstCombine] Fold (A^B)|~A-->~(A&B)

https://alive2.llvm.org/ce/z/2v6rhF

Fixes:
https://llvm.org/PR52478
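
As a quick illustration (the alive2 link above is the actual proof), the fold can be checked exhaustively on small bit widths. `xor_or_not_fold_holds` is a name invented for this example; the mask emulates fixed-width bitwise NOT in Python.

```python
def xor_or_not_fold_holds(bits=4):
    """Check (A^B) | ~A == ~(A&B) for every pair of `bits`-wide values."""
    mask = (1 << bits) - 1
    return all(
        ((a ^ b) | (~a & mask)) == (~(a & b) & mask)
        for a in range(mask + 1)
        for b in range(mask + 1)
    )
```

Per bit: where A is 0 both sides are 1, and where A is 1 both sides reduce to ~B.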

Differential Revision: https://reviews.llvm.org/D113783

2 years ago[analyzer] Fix region cast between the same types with different qualifiers.
Denys Petrov [Tue, 9 Nov 2021 12:20:23 +0000 (14:20 +0200)]
[analyzer] Fix region cast between the same types with different qualifiers.

Summary: Specifically, this fixes the case where an array element is accessed through a pointer to the element. This covers several FIXMEs in https://reviews.llvm.org/D111654.
Example:
  const int arr[4][2];
  const int *ptr = arr[1]; // Fixes this.
The issue is that `arr[1]` is `int*` (&Element{Element{glob_arr5,1 S64b,int[2]},0 S64b,int}), and `ptr` is `const int*`. We don't take qualifiers into account. Consequently, we don't match the types as the same ones.

Differential Revision: https://reviews.llvm.org/D113480

2 years ago[X86] LowerFunnelShift - pull out repeated EltSizeInBits variable. NFC.
Simon Pilgrim [Mon, 15 Nov 2021 16:34:05 +0000 (16:34 +0000)]
[X86] LowerFunnelShift - pull out repeated EltSizeInBits variable. NFC.

2 years ago[NFC][X86][Costmodel] Add i1 replication shuffle costmodel test coverage
Roman Lebedev [Mon, 15 Nov 2021 16:57:30 +0000 (19:57 +0300)]
[NFC][X86][Costmodel] Add i1 replication shuffle costmodel test coverage

2 years ago[PatternMatch] Add a new m_Any that binds a value.
Chris Lattner [Mon, 15 Nov 2021 16:21:49 +0000 (08:21 -0800)]
[PatternMatch] Add a new m_Any that binds a value.

This is analogous to what LLVM's PatternMatch.h supports,
but LLVM calls it m_Value for both the binding and
nonbinding versions.

This is an upstream from CIRCT and is used there.

Differential Revision: https://reviews.llvm.org/D113905

2 years ago[llvm][ubsan] Inclusive language: replace use of blacklist HandleLLVMOptions.cmake
Zarko Todorovski [Mon, 15 Nov 2021 15:32:02 +0000 (15:32 +0000)]
[llvm][ubsan] Inclusive language: replace use of blacklist HandleLLVMOptions.cmake

This patch changes it to ignorelist and contains a filename change for the
.txt file that's called.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D113689

2 years ago[x86] fold vector (X > -1) & Y to shift+andn (2nd try)
Sanjay Patel [Mon, 15 Nov 2021 15:47:11 +0000 (10:47 -0500)]
[x86] fold vector (X > -1) & Y to shift+andn (2nd try)

The first try at this patch ( bf5748a1af0d ) was reverted ( 5be64d416481 )
because it could crash. The cause of that problem was failing to account
for the optional peek-through-bitcast in the enclosing function.

This version of the patch adds a clause to avoid the fold in case of
bitcasts because it is unlikely to be profitable in that scenario.

A test case based on https://llvm.org/PR52504 was added to make sure
we don't have that problem again.

Original commit message:

and (pcmpgt X, -1), Y --> pandn (vsrai X, BitWidth-1), Y

This avoids the -1 constant vector in favor of an arithmetic shift
instruction if it exists (the ISA is still not complete after all
these years...).

We catch this pattern late in combining by matching PCMPGT, so it
should not interfere with more general folds.

Differential Revision: https://reviews.llvm.org/D113603
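
The equivalence in the original commit message can be modeled one lane at a time. This is a rough scalar sketch under stated assumptions: 32-bit lanes, Python integers standing in for vector elements, and hypothetical function names that only mirror the instruction names in the message.

```python
BITS = 32                # assumed lane width
MASK = (1 << BITS) - 1


def to_signed(x: int) -> int:
    """Reinterpret the low BITS bits of x as a two's-complement value."""
    x &= MASK
    return x - (1 << BITS) if x >> (BITS - 1) else x


def and_pcmpgt(x: int, y: int) -> int:
    """and (pcmpgt X, -1), Y: the compare yields all-ones when X > -1."""
    cmp_mask = MASK if to_signed(x) > -1 else 0
    return cmp_mask & y & MASK


def pandn_vsrai(x: int, y: int) -> int:
    """pandn (vsrai X, BitWidth-1), Y: pandn computes (~first) & second."""
    # Arithmetic shift by BITS-1 splats the sign bit: 0 or all-ones.
    sign_splat = (to_signed(x) >> (BITS - 1)) & MASK
    return ~sign_splat & y & MASK
```

For a non-negative X both forms pass Y through unchanged, and for a negative X both produce zero, which is why the -1 constant vector is no longer needed.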

2 years ago[x86] add test for vector signbit mask fold (PR52504); NFC
Sanjay Patel [Mon, 15 Nov 2021 15:08:47 +0000 (10:08 -0500)]
[x86] add test for vector signbit mask fold (PR52504); NFC

This goes with D113603 -
which was reverted because it could crash on this and similar examples.

2 years ago[X86][Costmodel] `getReplicationShuffleCost()`: promote 8 bit-wide elements to 32...
Roman Lebedev [Mon, 15 Nov 2021 15:55:45 +0000 (18:55 +0300)]
[X86][Costmodel] `getReplicationShuffleCost()`: promote 8 bit-wide elements to 32 bit when no AVX512VBMI

Currently `X86TTIImpl::getInterleavedMemoryOpCostAVX512()` asks about i8 elt type,
so this change does affect vectorization. In the end, it will ask about i1.

We should also try to promote to i16 if we have AVX512BW; I'll do that in a follow-up.
All costs here look good; I've added the missing truncation costs in preparatory patches.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113853

2 years ago[X86][Costmodel] `trunc v32i16 to v64i8` can appear after legalization, cost is same...
Roman Lebedev [Mon, 15 Nov 2021 15:55:44 +0000 (18:55 +0300)]
[X86][Costmodel] `trunc v32i16 to v64i8` can appear after legalization, cost is same as for `trunc v32i16 to v32i8`

Some of the costs get larger here,
but I suppose that makes sense since we'd previously query
scalarization costs that may not be really representative of the reality.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113852

2 years ago[X86][Costmodel] `trunc v8i64 to v16i8/v32i8/v64i8` can appear after legalization...
Roman Lebedev [Mon, 15 Nov 2021 15:55:39 +0000 (18:55 +0300)]
[X86][Costmodel] `trunc v8i64 to v16i8/v32i8/v64i8` can appear after legalization, cost is same as for `trunc v8i64 to v8i8`

While this one is trivial and identical to the previous patch,
there is a weird cost change in a follow-up patch that I'm not sure about.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113851

2 years ago[X86][Costmodel] `trunc v16i32 to v32i8/v64i8` can appear after legalization, cost...
Roman Lebedev [Mon, 15 Nov 2021 15:55:35 +0000 (18:55 +0300)]
[X86][Costmodel] `trunc v16i32 to v32i8/v64i8` can appear after legalization, cost is same as for `trunc v16i32 to v16i8`

While this one is trivial and identical to the previous patch,
there is a weird cost change in a follow-up patch that I'm not sure about.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D113850

2 years ago[Flang] Add the FIR LLVMPointer Type
Kiran Chandramohan [Sat, 13 Nov 2021 18:02:16 +0000 (18:02 +0000)]
[Flang] Add the FIR LLVMPointer Type

Add a fir.llvm_ptr type to allow any level of indirections

Currently, fir pointer types (fir.ref, fir.ptr, and fir.heap) carry
special Fortran semantics and cannot be freely combined/nested.

When implementing some features, lowering sometimes needs more liberty
regarding the number of indirection levels. Add a fir.llvm_ptr that has
no constraints.

Allow its usage in fir.coordinate_op, fir.load, and fir.store.

Convert the FIR LLVMPointer to an LLVMPointer in the LLVM dialect.

Reviewed By: clementval

Differential Revision: https://reviews.llvm.org/D113755

Co-authored-by: Jean Perier <jperier@nvidia.com>

2 years ago[openmp][amdgpu] Add comment warning that libm may be broken
Jon Chesterfield [Mon, 15 Nov 2021 15:56:00 +0000 (15:56 +0000)]
[openmp][amdgpu] Add comment warning that libm may be broken

Using llvm-link to add rocm device-libs probably doesn't work

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D112639

2 years ago[X86] getAVX512Node() - find constant broadcasts to encourage load-folding
Simon Pilgrim [Mon, 15 Nov 2021 15:51:02 +0000 (15:51 +0000)]
[X86] getAVX512Node() - find constant broadcasts to encourage load-folding

If an operand is a bitcasted or widened constant, try to more aggressively create broadcastable constants for folding, which in particular helps non-VLX modes.

I've refactored getAVX512Node so that VLX targets can make better use of this as well.

NOTE: In the future, I think we should consider removing the broadcast of constant data from DAG entirely and move this to either X86InstrInfo::foldMemoryOperand or a new pass - AVX1/2 targets have similar problems with missed (whole vector) folds that need to be improved as well.

Differential Revision: https://reviews.llvm.org/D113845

2 years ago[SLP]Improve splat detection.
Alexey Bataev [Fri, 12 Nov 2021 15:11:51 +0000 (07:11 -0800)]
[SLP]Improve splat detection.

A bunch of scalars can be treated as a splat not only if all elements
are the same but also if some of them are undefvalues.

Differential Revision: https://reviews.llvm.org/D113774
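
The relaxed splat rule described above can be sketched as follows. This is an illustration only, not the SLP vectorizer's code: `UNDEF` and `is_splat` are hypothetical names, and treating an all-undef bundle as "not a splat" is an assumption of this sketch.

```python
UNDEF = None  # stand-in for an undef value (assumption of this sketch)


def is_splat(scalars) -> bool:
    """True if every defined element equals the first defined element.

    Undef elements are ignored, so [x, undef, x] counts as a splat of x.
    A bundle with no defined element is reported as not a splat here.
    """
    defined = [s for s in scalars if s is not UNDEF]
    return bool(defined) and all(s == defined[0] for s in defined)
```

For example, `is_splat([7, UNDEF, 7, 7])` holds under the relaxed rule, whereas a strict all-elements-equal check would reject it.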

2 years ago[clang] Allow clang-check to customize analyzer output file or dir name
Ella Ma [Mon, 15 Nov 2021 15:47:39 +0000 (16:47 +0100)]
[clang] Allow clang-check to customize analyzer output file or dir name

Required by https://stackoverflow.com/questions/58073606

As the output argument is stripped out in the clang-check tool, it seems impossible for clang-check users to customize the output file name, even with -extra-args and -extra-arg-before.

This patch adds the -analyzer-output-path argument to allow users to adjust the output name. And if the argument is not set or the analyzer is not enabled, the original strip output adjuster will remove the output arguments.

Differential Revision: https://reviews.llvm.org/D97265

2 years ago[fir] Add fir.global_len conversion placeholder
Valentin Clement [Mon, 15 Nov 2021 15:47:04 +0000 (16:47 +0100)]
[fir] Add fir.global_len conversion placeholder

As for D113662, this patch just adds a placeholder for
the fir.global_len operation conversion. This operation
is part of F20xx and is not implemented yet.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D113887

2 years ago[flang][CodeGen] Transform `fir.unboxchar` to a sequence of LLVM MLIR
Andrzej Warzynski [Wed, 10 Nov 2021 13:24:56 +0000 (13:24 +0000)]
[flang][CodeGen] Transform `fir.unboxchar` to a sequence of LLVM MLIR

This patch extends the `FIRToLLVMLowering` pass in Flang by adding a
hook to transform `fir.unboxchar` to a sequence of LLVM MLIR
instructions.

This is part of the upstreaming effort from the `fir-dev` branch in [1].

[1] https://github.com/flang-compiler/f18-llvm-project

Differential Revision: https://reviews.llvm.org/D113747

Originally written by:
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>

2 years ago[fir] Remove extra return in SelectTypeOpConversion
Valentin Clement [Mon, 15 Nov 2021 15:20:23 +0000 (16:20 +0100)]
[fir] Remove extra return in SelectTypeOpConversion

Extra commit after D113878

2 years ago[libc++] Add missing _LIBCPP_HIDE_FROM_ABI to __rewrap_iter
Louis Dionne [Mon, 15 Nov 2021 15:10:14 +0000 (10:10 -0500)]
[libc++] Add missing _LIBCPP_HIDE_FROM_ABI to __rewrap_iter

2 years agoRegenerate acle_st1*.c tests
Matt Devereau [Mon, 15 Nov 2021 15:02:46 +0000 (15:02 +0000)]
Regenerate acle_st1*.c tests

Regenerate acle_st1*.c tests using update_cc_test_checks.py

2 years ago[NFC][InstSimplify] add test cases with base results for or-xor fold
Mehrnoosh Heidarpour [Mon, 15 Nov 2021 15:02:09 +0000 (10:02 -0500)]
[NFC][InstSimplify] add test cases with base results for or-xor fold

This patch adds tests with baseline results as a pre-commit for D113861

Differential Revision: https://reviews.llvm.org/D113860

2 years ago[SLP][NFC]Use `isa_and_nonnull` and fix comment, NFC.
Alexey Bataev [Mon, 15 Nov 2021 14:49:07 +0000 (06:49 -0800)]
[SLP][NFC]Use `isa_and_nonnull` and fix comment, NFC.

2 years agoFix an unused variable warning
Kirstóf Umann [Mon, 15 Nov 2021 14:45:01 +0000 (15:45 +0100)]
Fix an unused variable warning

2 years agoRevert "[GVN][NFC] Remove redundant check"
ksyx [Mon, 15 Nov 2021 14:03:15 +0000 (09:03 -0500)]
Revert "[GVN][NFC] Remove redundant check"

This reverts commit c35e8185d8c170c20e28956e0c9f3c1be895fefb.

mstorsjo reported in the revision thread that one VNCoercion assertion
is violated, seemingly in relation to this commit. As per "If a test case
that demonstrates a problem is reported in the commit thread, please
revert and investigate offline", this commit is reverted.

2 years ago[SLP]Do not create unused gather nodes for scalar arguments of vector intrinsics.
Alexey Bataev [Fri, 12 Nov 2021 21:34:32 +0000 (13:34 -0800)]
[SLP]Do not create unused gather nodes for scalar arguments of vector intrinsics.

If the vector intrinsic has a scalar argument, we currently still create
a tree entry for this argument. This entry is not used, just consumes
resources and increases the cost of the tree.

Differential Revision: https://reviews.llvm.org/D113806

2 years ago[CMake] Allow passing extra options to extract_symbols.py.
Simon Tatham [Mon, 15 Nov 2021 14:01:21 +0000 (14:01 +0000)]
[CMake] Allow passing extra options to extract_symbols.py.

When cross-compiling LLVM in an environment where there //is// an
objdump binary available but it does not understand the target
platform's object file format, extract_symbols.py fails, because its
initial check for tool availability decides that the existence of
objdump at all is good enough to settle on it as the tool of choice.

In such an environment it's useful to work around this by telling
extract_symbols.py to use llvm-readobj instead. The script itself has
an option for that, but its invocation in AddLLVM.cmake wasn't
providing a mechanism to add extra options passed through from the
cmake command line.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D113557

2 years ago[lldb] Unwrap the type when dereferencing the value
Andy Yankovsky [Thu, 11 Nov 2021 14:39:49 +0000 (15:39 +0100)]
[lldb] Unwrap the type when dereferencing the value

The value type can be a typedef of a reference (e.g. `typedef int& myint`).
In this case `GetQualType(type)` will return `clang::Typedef`, which cannot
be cast to `clang::ReferenceType`.

Fix a regression introduced in https://reviews.llvm.org/D103532.

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D113673

2 years ago[fir] Add fir.select_type conversion placeholder
Valentin Clement [Mon, 15 Nov 2021 13:35:44 +0000 (14:35 +0100)]
[fir] Add fir.select_type conversion placeholder

As for D113662, this patch just adds a placeholder for
the `fir.select_type` operation conversion. This operation
is part of F20xx and is not implemented yet.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D113878

2 years ago[IVDescriptor] Make sure the sign is included for negative extension.
Florian Hahn [Mon, 15 Nov 2021 10:31:07 +0000 (10:31 +0000)]
[IVDescriptor] Make sure the sign is included for negative extension.

At the moment, computeRecurrenceType does not include any sign bits in
the maximum bit width. If the value can be negative, this means the sign
bit will be missing and the sext won't properly extend the value.

If the value can be negative, increment the bitwidth by one to make sure
there is at least one sign bit in the result value.

Note that the increment is also needed *if* the value is *known* to be
negative, as a sign bit needs to be preserved for the sext to work.

Note that this at the moment prevents vectorization, because the
analysis computes i1 as type for the recurrence when looking through the
AND in lookThroughAnd.

Fixes PR51794, PR52485.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D113056
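
The need for the extra bit can be seen with a small numeric model. This is a hedged sketch rather than the code in computeRecurrenceType: the 32-bit starting width and the helpers `num_sign_bits`, `narrow_width`, and `sext` are assumptions made up for the illustration.

```python
WIDTH = 32  # assumed original type width


def num_sign_bits(value: int, width: int = WIDTH) -> int:
    """Count the leading bits that are copies of the sign bit (incl. itself)."""
    v = value & ((1 << width) - 1)
    sign = v >> (width - 1)
    count = 0
    for i in range(width - 1, -1, -1):
        if (v >> i) & 1 != sign:
            break
        count += 1
    return count


def narrow_width(value: int, width: int = WIDTH) -> int:
    """Bit width left after dropping every sign-bit copy (at least 1)."""
    return max(1, width - num_sign_bits(value, width))


def sext(value: int, bits: int) -> int:
    """Truncate value to `bits` bits, then sign-extend the result."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value
```

With -3 the narrowed width is 2 bits, and `sext(-3, 2)` gives 1 instead of -3 because no sign bit survived the truncation. Widening by one bit, `sext(-3, 3)`, recovers -3, which matches the increment described above.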

2 years ago[libcxx] Fix enable_if condition of std::reverse_iterator::operator=
Mikhail Maltsev [Mon, 15 Nov 2021 13:08:36 +0000 (13:08 +0000)]
[libcxx] Fix enable_if condition of std::reverse_iterator::operator=

The template std::is_assignable<T, U> checks that T is assignable from
U. Hence, the order of operands in the instantiation of
std::is_assignable in the std::reverse_iterator::operator= condition
should be reversed.

This issue remained unnoticed because std::reverse_iterator has an
implicit conversion constructor. This patch adds a test to check that
the assignment operator is used directly, without any implicit
conversions. The patch also adds a similar test for
std::move_iterator.

Reviewed By: Quuxplusone, ldionne, #libc

Differential Revision: https://reviews.llvm.org/D113417

2 years ago[llvm-nm][test] Move X86 lit.local.cfg into the X86 subfolder
James Henderson [Mon, 15 Nov 2021 11:49:47 +0000 (11:49 +0000)]
[llvm-nm][test] Move X86 lit.local.cfg into the X86 subfolder

The file seems to have been put in the wrong place in its original
commit. This had the effect of marking all llvm-nm tests as unsupported,
unless X86 was enabled, even for tests that weren't X86 specific.

Fixes https://bugs.llvm.org/show_bug.cgi?id=52506.

Reviewed by: mstorsjo

Differential Revision: https://reviews.llvm.org/D113882

2 years ago[mlir][Linalg] Fix and improve vectorization of depthwise convolutions.
Nicolas Vasilache [Mon, 15 Nov 2021 12:47:18 +0000 (12:47 +0000)]
[mlir][Linalg] Fix and improve vectorization of depthwise convolutions.

When trying to connect the vectorization of depthwise convolutions to e2e execution
a number of problems surfaced.
Fix an off-by-one error on the size of the input vector (similarly to what was previously done for regular conv).
Rewrite the lowering to vector.fma instead of vector.contract: the KW reduction dimension has already been unrolled and vector.contract requires a reduction dimension to be valid.

Differential Revision: https://reviews.llvm.org/D113884
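
Both points can be illustrated with a scalar model of a 1-D depthwise convolution. This is a sketch under assumptions, not the Linalg vectorizer: the commit does not spell out the size formula, so `input_width` uses the standard convolution extent relation, and all names and the `[w][c]` data layout are hypothetical.

```python
def input_width(out_w: int, kw: int, stride: int = 1, dilation: int = 1) -> int:
    """Input positions needed to produce out_w outputs (standard relation)."""
    return (out_w - 1) * stride + (kw - 1) * dilation + 1


def depthwise_conv_1d(inp, filt, out_w, stride=1, dilation=1):
    """inp[w][c] and filt[kw][c] give out[w][c]; channels are never mixed.

    The KW loop is written out as the outer loop, so each step is an
    elementwise multiply-add over channels with no reduction dimension left.
    """
    channels = len(filt[0])
    acc = [[0.0] * channels for _ in range(out_w)]
    for k, taps in enumerate(filt):              # unrolled KW dimension
        for w in range(out_w):
            src = inp[w * stride + k * dilation]
            # Elementwise fma over channels: src * taps + acc.
            acc[w] = [s * t + a for s, t, a in zip(src, taps, acc[w])]
    return acc
```

The largest index read is `(out_w - 1) * stride + (kw - 1) * dilation`, one less than `input_width`, which is the kind of boundary where an off-by-one in the input size would show up.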

2 years ago[mlir][Linalg] Add bounded recursion declaration to FMAOp -> LLVM conversion.
Nicolas Vasilache [Mon, 15 Nov 2021 12:19:00 +0000 (12:19 +0000)]
[mlir][Linalg] Add bounded recursion declaration to FMAOp -> LLVM conversion.

FMAOp -> LLVM conversion is done progressively by peeling off 1 dimension from FMAOp at each pattern iteration. Add the recursively bounded property declaration to the pattern so that the rewriter can apply it multiple times.

Without this, FMAOps with 3+D do not lower to LLVM.

Differential Revision: https://reviews.llvm.org/D113886
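
The progressive peeling can be pictured with nested lists standing in for n-D vectors. This is only a sketch of the idea, not the MLIR pattern: `rank`, `fma_1d`, and `lower_fma` are hypothetical names, and the recursion plays the role of the pattern being applied again to the ops it produces.

```python
def rank(v) -> int:
    """Nesting depth of a list-of-lists (hypothetical helper)."""
    r = 0
    while isinstance(v, list):
        r += 1
        if not v:
            break
        v = v[0]
    return r


def fma_1d(a, b, c):
    """Elementwise a * b + c on flat lists; the base case of the lowering."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def lower_fma(a, b, c):
    """Peel one leading dimension per step until only 1-D FMAs remain.

    Each call handles operands of rank n by emitting FMAs of rank n - 1,
    so the recursion depth is bounded by the rank of the operands.
    """
    if rank(a) <= 1:
        return fma_1d(a, b, c)
    return [lower_fma(x, y, z) for x, y, z in zip(a, b, c)]
```

A 3-D input needs two peeling steps before reaching the 1-D base case, which corresponds to the 3+D case that did not lower when the pattern was applied only once.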

2 years ago[mlir] Move min/max ops from Std to Arith.
Alexander Belyaev [Mon, 15 Nov 2021 11:52:37 +0000 (12:52 +0100)]
[mlir] Move min/max ops from Std to Arith.

Differential Revision: https://reviews.llvm.org/D113881