platform/upstream/llvm.git
3 years agoAArch64: support i128 (& larger) returns in GlobalISel
Tim Northover [Mon, 26 Jul 2021 12:53:34 +0000 (13:53 +0100)]
AArch64: support i128 (& larger) returns in GlobalISel

3 years ago[SimplifyCFG] Improve store speculation check
Nikita Popov [Wed, 21 Jul 2021 20:34:28 +0000 (22:34 +0200)]
[SimplifyCFG] Improve store speculation check

isSafeToSpeculateStore() looks for a preceding store to the same
location to make sure that introducing a new store of the same
value is safe. It currently bails on intervening mayHaveSideEffect()
instructions. However, I believe just checking mayWriteToMemory()
is sufficient there -- we just need to make sure that we know which
value was stored, we don't care if we can unwind in the meantime.

While looking into this, I started having some doubts about the
correctness of the transform with regard to thread safety. While
we don't try to hoist non-simple stores, I believe we also need
to make sure that the preceding store is simple as well. Otherwise
we could introduce a spurious non-atomic write after an atomic write
-- under our memory model this would result in a subsequent undef
atomic read, even if the second write stores the same value as the
first.

Example: https://alive2.llvm.org/ce/z/q_3YAL

Differential Revision: https://reviews.llvm.org/D106742

3 years ago[SVE] Fix casts to <FixedVectorType> in truncateToMinimalBitwidths
Kerry McLaughlin [Mon, 26 Jul 2021 09:55:15 +0000 (10:55 +0100)]
[SVE] Fix casts to <FixedVectorType> in truncateToMinimalBitwidths

Fixes more casts to `<FixedVectorType>` for the cases where the
instruction is an Insert/ExtractElementInst.

For fixed-width vectors, this part of truncateToMinimalBitwidths is tested by
AArch64/type-shrinkage-insertelt.ll. I attempted to write a test case for this part
of truncateToMinimalBitwidths that uses scalable vectors, but was unable to add
one. The tests in type-shrinkage-insertelt.ll rely, for instance, on scalarization to
create extractelement instructions, which is not possible for scalable vectors.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106163

3 years agoRevert "[SLP]Fix costs calculations."
Alexey Bataev [Mon, 26 Jul 2021 12:42:11 +0000 (05:42 -0700)]
Revert "[SLP]Fix costs calculations."

This reverts commit a053afed49897aa34e08287f91c5255efa4e5131 to fix
buildbots.

3 years ago[AArch64][SVE] Remove vector_splice from AddedComplexity pattern
Caroline Concatto [Mon, 26 Jul 2021 12:08:33 +0000 (13:08 +0100)]
[AArch64][SVE] Remove vector_splice from AddedComplexity pattern

The pattern for vector_splice with an index greater than or equal to
zero was misplaced inside the AddedComplexity = 1 block in the AArch64
tablegen file. This patch fixes it by removing the vector_splice pattern
from inside the AddedComplexity = 1 block.

3 years ago[mlir] split type conversion to two lines for GCC's sake
Tres Popp [Mon, 26 Jul 2021 12:15:37 +0000 (14:15 +0200)]
[mlir] split type conversion to two lines for GCC's sake

3 years ago[SLP]Fix costs calculations.
Alexey Bataev [Thu, 22 Jul 2021 17:48:36 +0000 (10:48 -0700)]
[SLP]Fix costs calculations.

Need to fix several cost-related problems. The final type may be defined
incorrectly because it is defined too early (we may end up with the wider
type), the CommonCost should not be redefined in ExtractElements
cost-related calculations, and the shuffle of the final insertelements
vectors should be calculated as the cost of a single-vector permutation
+ the costs of two-vector permutations for the other n-1 incoming vectors.

Differential Revision: https://reviews.llvm.org/D106578

3 years ago[NFC] Change VFShape so it contains an ElementCount rather than separate VF and IsSca...
Paul Walker [Sat, 24 Jul 2021 15:34:20 +0000 (16:34 +0100)]
[NFC] Change VFShape so it contains an ElementCount rather than separate VF and IsScalable properties.

Differential Revision: https://reviews.llvm.org/D106750

3 years ago[Inliner] Make the CallPenalty configurable
Philipp Krones [Wed, 14 Jul 2021 11:21:40 +0000 (12:21 +0100)]
[Inliner] Make the CallPenalty configurable

Tests with multiple benchmarks, like Embench [1], showed that the
CallPenalty magic number has the most influence on inlining decisions
when optimizing for size.

On the other hand, there was no good default value for this parameter.
Some benchmarks profited strongly from a reduced call penalty. One
example is the picojpeg benchmark compiled for RISC-V, which got 6%
smaller with a CallPenalty of 10 instead of 12. Other benchmarks
increased in size, like matmult.

This commit makes the compromise of turning the magic number constant of
CallPenalty into a configurable value. This introduces the flag
`--inline-call-penalty`. With that flag users can fine tune the inliner
to their needs.

The CallPenalty constant was also used for loops. This commit replaces
the CallPenalty constant with a new LoopPenalty constant that is now
used instead.

This is a slimmed down version of https://reviews.llvm.org/D30899

[1]: https://github.com/embench/embench-iot

Differential Revision: https://reviews.llvm.org/D105976

3 years ago[VPlan] Use stored value from recipes for interleave groups.
Florian Hahn [Mon, 26 Jul 2021 10:38:39 +0000 (11:38 +0100)]
[VPlan] Use stored value from recipes for interleave groups.

Instead of getting the VPValue for the stored IR values through the
current plan, use the stored value of the recipes directly.

This way, the correct VPValues are used if the store recipes have been
modified in the VPlan and the IR value is not correct any longer. This
can happen, e.g. due to D105008.

3 years ago[SVE] Add support for folding for select + masked loads
Dylan Fleming [Mon, 26 Jul 2021 10:26:28 +0000 (11:26 +0100)]
[SVE] Add support for folding for select + masked loads

Add folds to instcombine to support the removal of the select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0).
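
The fold can be checked on a small lane-wise model. This is only a sketch: `masked_load` and `select` below are hypothetical Python stand-ins for the IR operations, not LLVM APIs, and the memory contents are made up for illustration.

```python
def masked_load(memory, mask, passthru):
    """Model a masked load: active lanes read memory, inactive lanes take passthru."""
    return [m if active else p for m, active, p in zip(memory, mask, passthru)]


def select(mask, on_true, on_false):
    """Model a lane-wise vector select."""
    return [t if active else f for t, active, f in zip(on_true, mask, on_false)]


memory = [11, 22, 33, 44]            # illustrative data
mask = [True, False, True, False]
zeros = [0, 0, 0, 0]

loaded = masked_load(memory, mask, zeros)
# The select uses the same mask and the same zero fallback, so it changes nothing.
folded = select(mask, loaded, zeros)
```

Because the inactive lanes of the load are already zero, the select with the same mask and a zero false operand is redundant, which is what allows it to be removed.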

Patch originally authored by @paulwalker-arm

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106376

3 years ago[SVE][AArch64] Improve code generation for vector_splice for Imm > 0
Caroline Concatto [Mon, 19 Jul 2021 10:14:20 +0000 (11:14 +0100)]
[SVE][AArch64] Improve code generation for vector_splice for Imm > 0

This patch implements vector_splice in tablegen for all cases when the
Immediate is positive and lower than the known minimum value of
a scalable vector.
Vector_splice can be implemented using SVE instruction EXT.
For instance:
    @llvm.experimental.vector.splice(Vector_1, Vector_2, Imm)
    @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E>
        EXT  Vector_1, Vector_2, Imm              // Vector_1 = B, C, D + Vector_2 = E
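
As a rough illustration of the positive-immediate case, the following sketch models the splice on plain lists. The function name `splice_positive` is hypothetical, and the model works on whole elements and ignores how the EXT immediate is actually encoded.

```python
def splice_positive(v1, v2, imm):
    """Concatenate v1 and v2, then take len(v1) elements starting at index imm."""
    n = len(v1)
    if len(v2) != n or not 0 <= imm < n:
        raise ValueError("expects equal-length vectors and 0 <= imm < length")
    return (v1 + v2)[imm:imm + n]


v1 = ["A", "B", "C", "D"]
v2 = ["E", "F", "G", "H"]
result = splice_positive(v1, v2, 1)  # trailing part of v1 followed by the head of v2
```

With an immediate of 1 the result takes B, C, D from the first vector and E from the second, matching the example above.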

Depends on D105633

Differential Revision: https://reviews.llvm.org/D106273

3 years agoFix test failures caused by 0aff1798b5721d5f95d16f465b99d357012bb8d1
David Sherwood [Mon, 26 Jul 2021 10:38:49 +0000 (11:38 +0100)]
Fix test failures caused by 0aff1798b5721d5f95d16f465b99d357012bb8d1

3 years ago[AArch64][SVE] Improve code generation for vector_splice for Imm == -1
Caroline Concatto [Tue, 6 Jul 2021 08:19:16 +0000 (09:19 +0100)]
[AArch64][SVE] Improve code generation for vector_splice for Imm == -1

This patch implements vector_splice in tablegen for:
  a) when the immediate is equal to -1 (Imm == -1) and uses:
       INSR  +  LASTB
For instance:
@llvm.experimental.vector.splice(Vector_1, Vector_2, -1)
@llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -1) ==> <D, E, F, G>
    LASTB  RegLast, Vector_1                 // RegLast = D
    INSR   Res, (Vector_2 >> 1), RegLast     // Res = D + E, F, G
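
The two-instruction sequence for an immediate of -1 can be modeled on lists as follows. This is a sketch under simplifying assumptions: `lastb` and `insr` are hypothetical helpers that mimic the instructions with an all-true predicate, and lane 0 is taken to be the first list element.

```python
def lastb(vec):
    """Model LASTB with all lanes active: extract the last element."""
    return vec[-1]


def insr(vec, scalar):
    """Model INSR: shift every element up one lane and put scalar in lane 0."""
    return [scalar] + vec[:-1]


v1 = ["A", "B", "C", "D"]
v2 = ["E", "F", "G", "H"]

reg_last = lastb(v1)          # D
result = insr(v2, reg_last)   # D followed by E, F, G
```

The last element of the first vector ends up in lane 0, followed by all but the last element of the second vector.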

Differential Revision: https://reviews.llvm.org/D105633

3 years ago[X86][AVX] Prefer vinsertf128 to vperm2f128 on AVX1 targets
Simon Pilgrim [Mon, 26 Jul 2021 10:11:56 +0000 (11:11 +0100)]
[X86][AVX] Prefer vinsertf128 to vperm2f128 on AVX1 targets

Splatting the lower xmm with vinsertf128 is at least as quick as vperm2f128, and a lot faster on some AMD targets.

First step towards PR50053

3 years ago[X86][SSE] Don't scrub address math from interleaved shuffle tests
Simon Pilgrim [Mon, 26 Jul 2021 10:03:31 +0000 (11:03 +0100)]
[X86][SSE] Don't scrub address math from interleaved shuffle tests

3 years agoRevert "[clangd] Avoid range-loop init-list lifetime subtleties."
Sam McCall [Mon, 26 Jul 2021 08:42:03 +0000 (10:42 +0200)]
Revert "[clangd] Avoid range-loop init-list lifetime subtleties."

This reverts commit 253b8145dedbe8d10792f44b4af7f52dbecd527f.

This doesn't actually fix anything - I should stop guessing.
See https://github.com/clangd/clangd/issues/800 for update

3 years ago[AArch64][AsmParser] NFC: Parser.getTok().getLoc() -> getLoc()
Cullen Rhodes [Mon, 26 Jul 2021 09:23:59 +0000 (09:23 +0000)]
[AArch64][AsmParser] NFC: Parser.getTok().getLoc() -> getLoc()

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D106635

3 years ago[Analysis] Add simple cost model for strict (in-order) reductions
David Sherwood [Wed, 7 Jul 2021 12:18:20 +0000 (13:18 +0100)]
[Analysis] Add simple cost model for strict (in-order) reductions

I have added a new FastMathFlags parameter to getArithmeticReductionCost
to indicate what type of reduction we are performing:

  1. Tree-wise. This is the typical fast-math reduction that involves
  continually splitting a vector up into halves and adding each
  half together until we get a scalar result. This is the default
  behaviour for integers, whereas for floating point we only do this
  if reassociation is allowed.
  2. Ordered. This now allows us to estimate the cost of performing
  a strict vector reduction by treating it as a series of scalar
  operations in lane order. This is the case when FP reassociation
  is not permitted. For scalable vectors this is more difficult
  because at compile time we do not know how many lanes there are,
  and so we use the worst case maximum vscale value.
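
The difference between the two reduction shapes can be sketched in a few lines. The helpers below are hypothetical and only count additions or levels; they do not reproduce the cost model itself. The input values are chosen so that floating-point rounding makes the two orders disagree.

```python
def ordered_reduce(start, lanes):
    """Strict reduction: one scalar add per lane, in lane order."""
    acc, adds = start, 0
    for x in lanes:
        acc = acc + x
        adds += 1
    return acc, adds


def tree_reduce(lanes):
    """Tree-wise reduction: repeatedly add the two halves (power-of-two length assumed)."""
    levels = 0
    while len(lanes) > 1:
        half = len(lanes) // 2
        lanes = [a + b for a, b in zip(lanes[:half], lanes[half:])]
        levels += 1
    return lanes[0], levels


lanes = [1e17, 1.0, -1e17, 1.0]
ordered = ordered_reduce(0.0, lanes)  # 1.0 is absorbed when added to 1e17
tree = tree_reduce(lanes)             # the large values cancel before the small ones are added
```

The ordered form needs one dependent add per lane, while the tree form needs a number of levels logarithmic in the lane count, and the two give different results here, which is why the tree form requires reassociation to be allowed for floating point.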

I have also fixed getTypeBasedIntrinsicInstrCost to pass in the
FastMathFlags, which meant fixing up some X86 tests where we always
assumed the vector.reduce.fadd/mul intrinsics were 'fast'.

New tests have been added here:

  Analysis/CostModel/AArch64/reduce-fadd.ll
  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll
  Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll

Differential Revision: https://reviews.llvm.org/D105432

3 years ago[SelectionDAG] Support scalable-vector splats in yet more cases
Fraser Cormack [Thu, 22 Jul 2021 17:03:40 +0000 (18:03 +0100)]
[SelectionDAG] Support scalable-vector splats in yet more cases

This patch extends support for (scalable-vector) splats in the
DAGCombiner via the `ISD::matchBinaryPredicate` function, which enables a
variety of simple combines of constants.

Users of this function may now have to distinguish between
`BUILD_VECTOR` and `SPLAT_VECTOR` vector operands. The way of dealing
with this in-tree follows the approach added for
`ISD::matchUnaryPredicate` implemented in D94501.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D106575

3 years agoRevert "Revert D106562 "[clangd] Get rid of arg adjusters in CommandMangler""
Kadir Cetinkaya [Sat, 24 Jul 2021 17:44:15 +0000 (19:44 +0200)]
Revert "Revert D106562 "[clangd] Get rid of arg adjusters in CommandMangler""

This reverts commit 2aa0cf19e7fe17c9eb5eb2555e10184061b933f1.
Get rid of reference to the temporary.

3 years ago[libomptarget] Build amdgpu plugin without hsa
Jon Chesterfield [Mon, 26 Jul 2021 08:54:50 +0000 (09:54 +0100)]
[libomptarget] Build amdgpu plugin without hsa

Default to building the amdgpu plugin to use dlopen when hsa is
not found instead of disabling it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106600

3 years ago[libomptarget][nfc] Squash unused variable warning
Jon Chesterfield [Mon, 26 Jul 2021 08:54:30 +0000 (09:54 +0100)]
[libomptarget][nfc] Squash unused variable warning

Suppress only current warning on openmp-clang-x86_64-linux-debian

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106777

3 years ago[mlir] Fix RankedTensorType::walkImmediateSubElements method
Vladislav Vinogradov [Wed, 7 Jul 2021 15:07:51 +0000 (18:07 +0300)]
[mlir] Fix RankedTensorType::walkImmediateSubElements method

Add 'encoding' attribute visitor.
Without it ASM printer doesn't use attribute aliases for 'encoding'.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D105554

3 years ago[libc] fix LibcUnitTestMain when building with shared libraries
Guillaume Chatelet [Mon, 26 Jul 2021 08:43:45 +0000 (08:43 +0000)]
[libc] fix LibcUnitTestMain when building with shared libraries

3 years ago[ORC][ORC-RT] Add initial Objective-C and Swift support to MachOPlatform.
Lang Hames [Mon, 26 Jul 2021 07:49:05 +0000 (17:49 +1000)]
[ORC][ORC-RT] Add initial Objective-C and Swift support to MachOPlatform.

This allows ORC to execute code containing Objective-C and Swift classes and
methods (provided that the language runtime is loaded into the executor).

3 years ago[mlir] Added new RegionBranchTerminatorOpInterface and adapted uses of hasTrait<Retur...
Marcel Koester [Fri, 23 Jul 2021 09:59:21 +0000 (11:59 +0200)]
[mlir] Added new RegionBranchTerminatorOpInterface and adapted uses of hasTrait<ReturnLike>.

This CL adds a new RegionBranchTerminatorOpInterface to query information about operands that can be
passed to successor regions. Similar to the BranchOpInterface, it allows the involved operands to be
defined freely. However, in contrast to the BranchOpInterface, it expects an additional region
number to distinguish between various use cases which might require different operands passed to
different regions.

Moreover, we added new utility functions (namely getMutableRegionBranchSuccessorOperands and
getRegionBranchSuccessorOperands) to query (mutable) operand ranges for operations equipped with the
ReturnLike trait and/or implementing the newly added interface.  This simplifies reasoning about
terminators in the scope of the nested regions.

We also adjusted the SCF.ConditionOp to benefit from the newly added capabilities.

Differential Revision: https://reviews.llvm.org/D105018

3 years ago[Preprocessor] Implement -fminimize-whitespace.
Michael Kruse [Mon, 26 Jul 2021 02:39:08 +0000 (21:39 -0500)]
[Preprocessor] Implement -fminimize-whitespace.

This patch adds the -fminimize-whitespace flag with the following effects:

 * If combined with -E, remove as much non-line-breaking whitespace as
   possible.

 * If combined with -E -P, removes as much whitespace as possible,
   including line-breaks.

The motivation is to reduce the amount of insignificant changes in the
preprocessed output with source files where only whitespace has been
changed (add/remove comments, clang-format, etc.) which is in particular
useful with ccache.
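
The caching motivation can be illustrated with a toy model. This is not clang's implementation and not ccache's hashing scheme: `minimize_whitespace` and `cache_key` are hypothetical helpers that work on raw text rather than tokens, so they would mishandle whitespace inside string literals.

```python
import hashlib
import re


def minimize_whitespace(text):
    """Collapse runs of spaces and tabs and trim each line; line breaks are kept."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(lines)


def cache_key(preprocessed):
    """Hash the whitespace-minimized text, as a stand-in for a compiler-cache key."""
    return hashlib.sha256(minimize_whitespace(preprocessed).encode()).hexdigest()


original = "int  main( ) {\n\treturn 0;\n}\n"
reformatted = "int main( ) {\n    return 0;\n}\n"
changed = "int main( ) {\n    return 1;\n}\n"
```

Two inputs that differ only in spacing map to the same key, while a real code change still produces a different key. A text-level approach like this one is exactly the kind of shortcut that breaks on raw strings, which is why doing the work in the preprocessor is attractive.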

A patch that makes use of this flag has been proposed to ccache as well:
https://github.com/ccache/ccache/pull/815. It will use
-fminimize-whitespace when clang-13 has been detected, and additionally
uses -P in "unify_mode". ccache already had a unify_mode in an older
version, which was removed because of problems that using the
preprocessor itself does not have (such as the custom tokenizer not
recognizing C++11 raw strings).

This patch slightly reorganizes which part is responsible for adding
newlines that are required for semantics. It is now either
startNewLineIfNeeded() or MoveToLine() but never both; this avoids the
ShouldUpdateCurrentLine workaround and avoids redundant lines being
inserted in some cases. It also fixes a mandatory newline not inserted
after a _Pragma("...") that is expanded into a #pragma.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D104601

3 years ago[Object] make SourceMgr available to MCContext during inline asm symbols
Yuanfang Chen [Mon, 26 Jul 2021 04:12:28 +0000 (21:12 -0700)]
[Object] make SourceMgr available to MCContext during inline asm symbols
collection

Fixes PR51210.

3 years ago[Debug-Info][llvm-dwarfdump] Don't use DW_FORM_data4/8
Esme-Yi [Mon, 26 Jul 2021 03:47:02 +0000 (03:47 +0000)]
[Debug-Info][llvm-dwarfdump] Don't use DW_FORM_data4/8
to encode the constants for DW_AT_data_member_location.

Summary: In DWARF v3, DW_FORM_data4/8 in
DW_AT_data_member_location are interpreted as location
list pointers. Interpreting constants as pointers is
not expected, so we use DW_FORM_udata to encode the
constants.
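
DW_FORM_udata stores its value as an unsigned LEB128 number, so a constant member offset is written as a variable-length byte sequence rather than a fixed 4 or 8 bytes. A minimal sketch of that encoding follows; `encode_uleb128` is a hypothetical helper name, not an LLVM function.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F        # low seven bits of the remaining value
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(byte)         # final byte: continuation bit clear
            return bytes(out)


small_offset = encode_uleb128(8)       # a typical struct member offset
large_value = encode_uleb128(624485)   # multi-byte case
```

Small offsets fit in a single byte, and there is no ambiguity with a location list pointer because the form itself says the value is a constant.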

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D105687

3 years agoRevert "Build libSupport with -Werror=global-constructors (NFC)"
Mehdi Amini [Mon, 26 Jul 2021 03:08:26 +0000 (03:08 +0000)]
Revert "Build libSupport with -Werror=global-constructors (NFC)"

This reverts commit 579cc9ad2e2db6c3f1670b9f42c2cfe67bc5722c.
This breaks on Windows.

3 years agoBuild libSupport with -Werror=global-constructors (NFC)
Mehdi Amini [Fri, 16 Jul 2021 03:32:59 +0000 (03:32 +0000)]
Build libSupport with -Werror=global-constructors (NFC)

Ensure that libSupport does not carry any static global initializer.
libSupport can be embedded in use cases where we don't want to load all
cl::opt unless we want to parse the command line.
ManagedStatic can be used to enable lazy-initialization of globals.

3 years ago[yaml2obj] Do not write the string table if there is no string entry.
Esme-Yi [Mon, 26 Jul 2021 02:37:49 +0000 (02:37 +0000)]
[yaml2obj] Do not write the string table if there is no string entry.

Summary: yaml2obj shouldn't create the string table that isn't needed
         - doing so wastes time and disk space.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D106420

3 years ago[OPENCL] opencl-c.h: add initial CL 3.0 conditionals for atomic operations.
Dave Airlie [Mon, 26 Jul 2021 01:04:25 +0000 (11:04 +1000)]
[OPENCL] opencl-c.h: add initial CL 3.0 conditionals for atomic operations.

This adds the optional wrappers around things. However, this isn't yet sufficient for CL 3.0 without generic address space; I've got one more patch to add all those APIs, but this is an easier-to-review precursor.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D106111

3 years agoRevert "Build libSupport with -Werror=global-constructors (NFC)"
Mehdi Amini [Mon, 26 Jul 2021 00:55:36 +0000 (00:55 +0000)]
Revert "Build libSupport with -Werror=global-constructors (NFC)"

This reverts commit 5eb2e9aa64b7be7cd8ed7f36de19c2c9bdf1977c.
This broke macOS builds; it needs a safer check guarding the flag
addition.

3 years agoBuild libSupport with -Werror=global-constructors (NFC)
Mehdi Amini [Fri, 16 Jul 2021 03:32:59 +0000 (03:32 +0000)]
Build libSupport with -Werror=global-constructors (NFC)

Ensure that libSupport does not carry any static global initializer.
libSupport can be embedded in use cases where we don't want to load all
cl::opt unless we want to parse the command line.
ManagedStatic can be used to enable lazy-initialization of globals.

3 years agoRemove the NotUnderValgrind caching flag
Mehdi Amini [Mon, 26 Jul 2021 00:20:24 +0000 (00:20 +0000)]
Remove the NotUnderValgrind caching flag

The motivation for this caching wasn't clear, remove it in an effort to
simplify the code and make libSupport free of global dynamic constructor.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D106206

3 years ago[SimplifyCFG] Fold branch to common dest: if branch is unpredictable, prefer to speculate
Roman Lebedev [Sun, 25 Jul 2021 23:57:19 +0000 (02:57 +0300)]
[SimplifyCFG] Fold branch to common dest: if branch is unpredictable, prefer to speculate

This is consistent with the two other usages of prof md in this pass.

3 years ago[SimplifyCFG] Don't speculatively execute BB[s] if they are predictably not taken
Roman Lebedev [Sun, 25 Jul 2021 23:38:40 +0000 (02:38 +0300)]
[SimplifyCFG] Don't speculatively execute BB[s] if they are predictably not taken

Same as D106650, but for `FoldTwoEntryPHINode()`

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D106717

3 years ago[SimplifyCFG] Don't speculatively execute BB if it's predictably not taken
Roman Lebedev [Sun, 25 Jul 2021 23:30:31 +0000 (02:30 +0300)]
[SimplifyCFG] Don't speculatively execute BB if it's predictably not taken

If the branch isn't `unpredictable`, and it is predicted to *not* branch
to the block we are considering speculatively executing,
then it seems counter-productive to execute the code that is predicted not to be executed.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D106650

3 years ago[NFC][SimplifyCFG] Add more negative tests for profmd-induced speculation avoidance
Roman Lebedev [Sun, 25 Jul 2021 23:30:12 +0000 (02:30 +0300)]
[NFC][SimplifyCFG] Add more negative tests for profmd-induced speculation avoidance

3 years ago[ELF] Support quoted symbols in symbol assignments
Fangrui Song [Sun, 25 Jul 2021 23:26:37 +0000 (16:26 -0700)]
[ELF] Support quoted symbols in symbol assignments

glibc/elf/tst-absolute-zero-lib.lds uses `"absolute" = 0;`

3 years ago[lld/mac] Make comment style uniform in start-end.s test
Nico Weber [Sun, 25 Jul 2021 22:37:49 +0000 (18:37 -0400)]
[lld/mac] Make comment style uniform in start-end.s test

3 years ago[lld/mac] Add support for segment$start$ and segment$end$ symbols
Nico Weber [Fri, 23 Jul 2021 14:12:55 +0000 (10:12 -0400)]
[lld/mac] Add support for segment$start$ and segment$end$ symbols

These symbols are somewhat interesting in that they create non-existing
segments, which as far as I know is the only way to create segments
that don't contain any sections.

Final part of PR50760. Like D106629, but for segments instead
of sections. I'm not aware of anything that needs this in practice.

Differential Revision: https://reviews.llvm.org/D106767

3 years ago[lld/mac] Move output segment rename logic into OutputSegment
Nico Weber [Sun, 25 Jul 2021 14:09:37 +0000 (10:09 -0400)]
[lld/mac] Move output segment rename logic into OutputSegment

Fixes the output segment name if both -rename_section and
-rename_segment are used and the post-section-rename segment
name is the same as the pre-segment-rename segment name to
match ld64's behavior.

The motivation is that segment$start$ can create section-less segments,
and this creates a corner case in the interaction between segment$start$ and
-rename_segment in the upcoming segment$start$ patch.

Differential Revision: https://reviews.llvm.org/D106766

3 years ago[lld/mac] Reland: Add tests for the interaction between -rename_section and -rename_s...
Nico Weber [Sun, 25 Jul 2021 14:05:58 +0000 (10:05 -0400)]
[lld/mac] Reland: Add tests for the interaction between -rename_section and -rename_segment

No behavior change.

Differential Revision: https://reviews.llvm.org/D106765

3 years ago[libomptarget][amdgpu] More robust handling of failure to init HSA
Jon Chesterfield [Sun, 25 Jul 2021 22:14:42 +0000 (23:14 +0100)]
[libomptarget][amdgpu] More robust handling of failure to init HSA

If hsa_init fails, subsequent calls into hsa are not safe. Except for
hsa_init, but we don't retry on failure.

This patch:
- deletes a print that called into hsa to ask why it can't call into hsa
- drops a merge conflict block next to that print
- reliably initializes number of devices to zero
- skips the plugin destructor contents if the constructor failed to init hsa

Tested by making hsa_init return error, and by forcing the dynamic library
use which was then deleted from disk. Before this patch, both segv. After it,
friendly message about offloading being unavailable.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106774

3 years agoRevert "[lld/mac] Add tests for the interaction between -rename_section and -rename_s...
Nico Weber [Sun, 25 Jul 2021 22:11:36 +0000 (18:11 -0400)]
Revert "[lld/mac] Add tests for the interaction between -rename_section and -rename_segment"

This reverts commit a6eb34624dcfa5a33caa0211f4a16710b22079c2.
The test fails, I screwed something up.

3 years ago[lld/mac] Add tests for the interaction between -rename_section and -rename_segment
Nico Weber [Sun, 25 Jul 2021 14:05:58 +0000 (10:05 -0400)]
[lld/mac] Add tests for the interaction between -rename_section and -rename_segment

No behavior change.

Differential Revision: https://reviews.llvm.org/D106765

3 years ago[docs] Update release notes to mention lli JIT engine switch
Stefan Gränitz [Sun, 25 Jul 2021 21:58:43 +0000 (23:58 +0200)]
[docs] Update release notes to mention lli JIT engine switch

3 years agoRevert "[VPlan] Add recipe for first-order rec phis, make splicing explicit."
Nico Weber [Sun, 25 Jul 2021 21:31:02 +0000 (17:31 -0400)]
Revert "[VPlan] Add recipe for first-order rec phis, make splicing explicit."

Makes clang crash: https://reviews.llvm.org/D105008#2903350
This reverts commit d2a73fb44ea0b8c981e4b923f811f18793fc4770.

Also revert a minor formatting follow-up:
This reverts commit 82834a673246f27a541ffcc57e0eb65b008102ef.

3 years agoRevert "[libomptarget] Build amdgpu plugin without hsa"
Jon Chesterfield [Sun, 25 Jul 2021 20:03:51 +0000 (21:03 +0100)]
Revert "[libomptarget] Build amdgpu plugin without hsa"

Inaccurate error handling around hsa_init

This reverts commit e30b3b23a4eddbc08b5648e643f0a0b456a57832.

3 years ago[LangRef] Reorder two paragraphs for comdat
Fangrui Song [Sun, 25 Jul 2021 19:53:14 +0000 (12:53 -0700)]
[LangRef] Reorder two paragraphs for comdat

so that IMAGE_COMDAT_SELECT_LARGEST refers to the correct example.

3 years ago[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI.
Simon Pilgrim [Sun, 25 Jul 2021 19:37:42 +0000 (20:37 +0100)]
[X86][AVX] Add getBROADCAST_LOAD helper function. NFCI.

Begin replacing individual getMemIntrinsicNode calls and setup (for X86ISD::VBROADCAST_LOAD + X86ISD::SUBV_BROADCAST_LOAD opcodes) with this getBROADCAST_LOAD helper.

3 years ago[libomptarget] Build amdgpu plugin without hsa
Jon Chesterfield [Sun, 25 Jul 2021 18:33:35 +0000 (19:33 +0100)]
[libomptarget] Build amdgpu plugin without hsa

Default to building the amdgpu plugin to use dlopen when hsa is
not found instead of disabling it.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106600

3 years ago[OpenMP] Introduce RAII to protect certain RTL calls from DCE
Joseph Huber [Fri, 23 Jul 2021 20:32:47 +0000 (16:32 -0400)]
[OpenMP] Introduce RAII to protect certain RTL calls from DCE

This patch introduces a new RAII struct that will temporarily make an OpenMP
RTL function have external linkage. This is done before the attributor is
invoked to prevent it from incorrectly removing some function definitions that
we will use later. For example, if we determine all calls to one function are
dead, because it has internal linkage it can safely be removed. Later when we
try to get an instance to that function to modify the source using
`getOrCreateRuntimeFunction` we will then get an empty declaration for that
function that won't be defined anywhere. This patch prevents this from
occurring.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106707

3 years ago[NFC][Codegen][X86] Improve test coverage for insertions into XMM vector
Roman Lebedev [Sun, 25 Jul 2021 14:36:51 +0000 (17:36 +0300)]
[NFC][Codegen][X86] Improve test coverage for insertions into XMM vector

3 years ago[AArch64] Fix Local Deallocation for Homogeneous Prolog/Epilog
Kyungwoo Lee [Sun, 25 Jul 2021 17:50:39 +0000 (10:50 -0700)]
[AArch64] Fix Local Deallocation for Homogeneous Prolog/Epilog

The stack adjustment for local deallocation was incorrectly ported.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D106760

3 years ago[OpenMP][tests][NFC] Update test status for gcc 11 and 12
Joachim Protze [Sun, 25 Jul 2021 16:27:40 +0000 (18:27 +0200)]
[OpenMP][tests][NFC] Update test status for gcc 11 and 12

gcc 11 introduced support for depend clause, but the gomp interface of libomp
does not yet handle the information.
Also remove -fopenmp-version=50, which is no longer needed for clang, but not
supported by gcc.

3 years ago[X86][SSE] LowerRotate - perform modulo on the amount splat source directly.
Simon Pilgrim [Sun, 25 Jul 2021 16:30:17 +0000 (17:30 +0100)]
[X86][SSE] LowerRotate - perform modulo on the amount splat source directly.

If the rotation amount is a known splat, perform the modulo on the splat source, and then perform the splat. That way the amount-extension performed later by LowerScalarVariableShift can fold the splats away without any multiple-use issues.
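
The reordering relies on the modulo commuting with the splat, which a small model makes concrete. This is only an illustration with hypothetical helpers (`rotl`, `splat`) on 8-bit lanes; it says nothing about the actual DAG nodes involved.

```python
BITS = 8  # lane width assumed for the illustration


def rotl(value, amount, bits=BITS):
    """Rotate an unsigned lane left; the amount is taken modulo the lane width."""
    mask = (1 << bits) - 1
    value &= mask
    amount %= bits
    return ((value << amount) | (value >> (bits - amount))) & mask


def splat(scalar, lanes):
    """Broadcast one scalar to every lane."""
    return [scalar] * lanes


amount = 9
modulo_after_splat = [a % BITS for a in splat(amount, 4)]
modulo_before_splat = splat(amount % BITS, 4)
```

Both orders yield the same per-lane amounts, so performing the modulo on the scalar source first leaves a plain splat for later lowering to work with.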

Fixes one of the concerns raised on D104156

3 years ago[Attributes] Clean up handling of UB implying attributes (NFC)
Nikita Popov [Sun, 25 Jul 2021 16:21:13 +0000 (18:21 +0200)]
[Attributes] Clean up handling of UB implying attributes (NFC)

Rather than adding methods for dropping these attributes in
various places, add a function that returns an AttrBuilder with
these attributes, which can then be used with existing methods
for dropping attributes. This is with an eye on D104641, which
also needs to drop them from returns, not just parameters.

Also be more explicit about the semantics of the method in the
documentation. Refer to UB rather than Undef, which is what this
is actually about.

3 years ago[Attributes] Remove nonnull from UB-implying attributes
Nikita Popov [Sun, 25 Jul 2021 16:04:50 +0000 (18:04 +0200)]
[Attributes] Remove nonnull from UB-implying attributes

From LangRef:

> if the parameter or return pointer is null, poison value is
> returned or passed instead. The nonnull attribute should be
> combined with the noundef attribute to ensure a pointer is not
> null or otherwise the behavior is undefined.

Dropping noundef is sufficient to prevent UB. Including nonnull
in this method just muddies the semantics.

3 years agoRevert rG939291041bb35b8088e3b61be2b8b3bc950f64a7 "[AMDGPU] Regenerate wave32.ll...
Simon Pilgrim [Sun, 25 Jul 2021 14:59:26 +0000 (15:59 +0100)]
Revert rG939291041bb35b8088e3b61be2b8b3bc950f64a7 "[AMDGPU] Regenerate wave32.ll test checks"

This still breaks buildbots

3 years ago[JITLink][RISCV] Run new test from 0ad562b48 only if the RISCV backend is enabled
Nico Weber [Sun, 25 Jul 2021 14:45:46 +0000 (10:45 -0400)]
[JITLink][RISCV] Run new test from 0ad562b48 only if the RISCV backend is enabled

3 years ago[InstCombine] Fix PR47960 - Incorrect transformation of fabs with nnan flag
Krishna Kariya [Sun, 25 Jul 2021 14:36:25 +0000 (10:36 -0400)]
[InstCombine] Fix PR47960 - Incorrect transformation of fabs with nnan flag

Bug Fix for PR: https://llvm.org/PR47960

This patch makes sure that the fast math flag used in the 'select'
instruction is the same as the 'fabs' instruction after the transformation.

Differential Revision: https://reviews.llvm.org/D101727

3 years ago[OpenMP][NVPTX] Disable OpenMPOpt when building deviceRTLs
Shilei Tian [Sun, 25 Jul 2021 14:38:15 +0000 (10:38 -0400)]
[OpenMP][NVPTX] Disable OpenMPOpt when building deviceRTLs

We build `deviceRTLs` with `-O1` by default, which also triggers OpenMPOpt. When
the info cache is created, some attributes are removed. As a result, although we
mark a few functions `noinline`, they are still inlined when the bitcode library
is generated. This can cause an issue in middle end optimization.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106710

3 years ago[NFC][Codegen][X86] Improve test coverage for repeated insertions of the same scalar...
Roman Lebedev [Sun, 25 Jul 2021 14:36:51 +0000 (17:36 +0300)]
[NFC][Codegen][X86] Improve test coverage for repeated insertions of the same scalar into different elements

3 years ago[AMDGPU] Regenerate wave32.ll test checks
Simon Pilgrim [Sun, 25 Jul 2021 14:12:42 +0000 (15:12 +0100)]
[AMDGPU] Regenerate wave32.ll test checks

To simplify diff in future patch

3 years ago[AMDGPU] Regenerate mul24 test checks
Simon Pilgrim [Sun, 25 Jul 2021 14:11:42 +0000 (15:11 +0100)]
[AMDGPU] Regenerate mul24 test checks

To simplify diffs in future patch

3 years ago[x86] improve CMOV codegen by pushing add into operands, part 2
Sanjay Patel [Sun, 25 Jul 2021 14:01:14 +0000 (10:01 -0400)]
[x86] improve CMOV codegen by pushing add into operands, part 2

This is a minimum extension of D106607 to allow folding for
2 non-zero constants that can be materialized as immediates.

In the reduced test examples, we save 1 instruction by rolling
the constants into LEA/ADD. In the motivating test from the bullet
benchmark, we absorb both of the constant moves into add ops via
LEA magic, so we reduce by 2 instructions.

Differential Revision: https://reviews.llvm.org/D106684

3 years ago[GlobalISel] Remove FlagsOp (NFC)
Kazu Hirata [Sun, 25 Jul 2021 14:05:07 +0000 (07:05 -0700)]
[GlobalISel] Remove FlagsOp (NFC)

The class was introduced without a use on Dec 11, 2018 in commit
cef44a234219e38e1c28c902ff24586150eef682.

3 years ago[Inline] Fix a warning by removing an explicit copy constructor
Kazu Hirata [Sun, 25 Jul 2021 13:56:47 +0000 (06:56 -0700)]
[Inline] Fix a warning by removing an explicit copy constructor

This patch fixes the warning:

  llvm/include/llvm/Analysis/InlineCost.h:62:3: error: definition of
  implicit copy assignment operator for 'CostBenefitPair' is
  deprecated because it has a user-declared copy constructor
  [-Werror,-Wdeprecated-copy]

by removing the explicit copy constructor.
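
A minimal sketch of the pattern behind this warning, assuming a hypothetical
`CostBenefitPairSketch` with plain `int` fields; it is not the real class from
InlineCost.h. Once a class declares its own copy constructor, the implicitly
generated copy assignment operator is deprecated, which is what
-Wdeprecated-copy can report. Leaving the declaration out lets the compiler
generate both copy operations.

```cpp
#include <type_traits>

// Hypothetical stand-in for illustration; not the real class from InlineCost.h.
class CostBenefitPairSketch {
public:
  constexpr CostBenefitPairSketch(int Cost, int Benefit)
      : Cost(Cost), Benefit(Benefit) {}

  // A user-declared copy constructor such as
  //   CostBenefitPairSketch(const CostBenefitPairSketch &) = default;
  // makes the implicit copy assignment operator deprecated, which
  // -Wdeprecated-copy can report. Leaving it out lets the compiler
  // generate both copy operations implicitly.

  constexpr int getCost() const { return Cost; }
  constexpr int getBenefit() const { return Benefit; }

private:
  int Cost;
  int Benefit;
};

// Both copy operations remain available without any explicit declaration.
static_assert(std::is_copy_constructible<CostBenefitPairSketch>::value,
              "copy construction is implicitly generated");
static_assert(std::is_copy_assignable<CostBenefitPairSketch>::value,
              "copy assignment is implicitly generated");
```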

3 years ago[X86][AVX] Adjust AllowBWIVPERMV3 tolerance to account for VariableCrossLaneShuffleDepth
Simon Pilgrim [Sun, 25 Jul 2021 12:30:05 +0000 (13:30 +0100)]
[X86][AVX] Adjust AllowBWIVPERMV3 tolerance to account for VariableCrossLaneShuffleDepth

As noticed on D105390, we were hardwiring the depth limit for combining to VPERMI2W/VPERMI2B instructions. Not only had we made the limit too low, we hadn't accounted for slow/fast shuffles via the VariableCrossLaneShuffleDepth control.

3 years ago[AMDGPU] Regenerate global-load-saddr-to-vaddr test checks
Simon Pilgrim [Sun, 25 Jul 2021 12:27:30 +0000 (13:27 +0100)]
[AMDGPU] Regenerate global-load-saddr-to-vaddr test checks

To simplify diff in future patch

3 years ago[AMDGPU] Regenerate ctpop16 test checks
Simon Pilgrim [Sun, 25 Jul 2021 12:24:09 +0000 (13:24 +0100)]
[AMDGPU] Regenerate ctpop16 test checks

To simplify diff in future patch

3 years ago[AMDGPU] Regenerate half test checks
Simon Pilgrim [Sun, 25 Jul 2021 12:02:34 +0000 (13:02 +0100)]
[AMDGPU] Regenerate half test checks

To simplify diff in future patch

3 years ago[AMDGPU] Regenerate anyext test checks
Simon Pilgrim [Sun, 25 Jul 2021 12:01:20 +0000 (13:01 +0100)]
[AMDGPU] Regenerate anyext test checks

To simplify diff in future patch

3 years ago[llvm][Inline] Add interface to return cost-benefit stuff
Liqiang Tao [Sun, 25 Jul 2021 12:17:30 +0000 (20:17 +0800)]
[llvm][Inline] Add interface to return cost-benefit stuff

Return cost-benefit stuff which is computed by cost-benefit analysis.

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D105349

3 years ago[AArch64][GlobalISel] Widen non-pow-2 types for shifts before clamping.
Amara Emerson [Sat, 24 Jul 2021 21:06:37 +0000 (14:06 -0700)]
[AArch64][GlobalISel] Widen non-pow-2 types for shifts before clamping.

For types like s96, we don't want to clamp to s64, we want to first widen to
s128 and then narrow it. Otherwise we end up with impossible to legalize types.
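
The ordering described above can be sketched as plain arithmetic on bit
widths. The helper names are hypothetical and are not part of the GlobalISel
legalizer API; the sketch only shows that rounding s96 up to a power of two
first gives a width that splits evenly into s64 pieces.

```cpp
#include <cstdint>

// Hypothetical helpers; these are not GlobalISel's legalizer API.

// Round a scalar bit width up to the next power of two (s96 -> s128).
constexpr uint32_t widenToNextPow2(uint32_t Bits) {
  uint32_t Result = 1;
  while (Result < Bits)
    Result <<= 1;
  return Result;
}

// Number of s64 pieces the widened type splits into when it is narrowed.
constexpr uint32_t numS64Parts(uint32_t Bits) {
  return (widenToNextPow2(Bits) + 63) / 64;
}
```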

3 years ago[mlir] Async: lower SCF operations into CFG inside coroutines
Eugene Zhulenev [Sat, 24 Jul 2021 13:51:15 +0000 (06:51 -0700)]
[mlir] Async: lower SCF operations into CFG inside coroutines

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D106747

3 years ago[RISCV] Custom lower (i32 (fptoui/fptosi X)).
Craig Topper [Sat, 24 Jul 2021 17:34:49 +0000 (10:34 -0700)]
[RISCV] Custom lower (i32 (fptoui/fptosi X)).

I stumbled onto a case where our (sext_inreg (assertzexti32 (fptoui X)), i32)
isel pattern can cause an fcvt.wu and fcvt.lu to be emitted if
the assertzexti32 has an additional user. If we add a one use check
it would just cause a fcvt.lu followed by a sext.w when we only need
a fcvt.wu to satisfy both users.

To mitigate this I've added custom isel and new ISD opcodes for
fcvt.wu. This allows us to know it started life as a conversion
to i32 without needing to match multiple nodes. ComputeNumSignBits
has been taught that these new nodes produce 33 sign bits. To
prevent regressions when we need to zero extend the result of an
(i32 (fptoui X)), I've added a DAG combine to convert it to an
(i64 (fptoui X)) before type legalization. In most cases this would
happen in InstCombine, but a zero_extend can be created for function
returns or arguments.
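
A small sketch of where the figure of 33 sign bits comes from, assuming the
new node yields a 32-bit result sign-extended into a 64-bit register. The
helper names are hypothetical; this models the arithmetic only, not
ComputeNumSignBits itself.

```cpp
#include <cstdint>
#include <limits>

// Sign-extend a 32-bit result into a 64-bit register value.
constexpr int64_t signExtend32(int32_t Value) {
  return static_cast<int64_t>(Value);
}

// Count how many of the top bits are copies of the sign bit,
// including the sign bit itself.
constexpr unsigned numSignBits(int64_t Value) {
  const uint64_t Bits = static_cast<uint64_t>(Value);
  const uint64_t Sign = Bits >> 63;
  unsigned Count = 1;
  for (int Bit = 62; Bit >= 0 && ((Bits >> Bit) & 1) == Sign; --Bit)
    ++Count;
  return Count;
}
```

Bits 63 down to 31 of a sign-extended 32-bit value are identical, so the
count never drops below 33 whatever the 32-bit payload is.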

To keep everything consistent I've added new nodes for fptosi as well.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D106346

3 years ago[Tests] Add additional tests for incorrect willreturn handling (NFC)
Nikita Popov [Sat, 24 Jul 2021 15:00:16 +0000 (17:00 +0200)]
[Tests] Add additional tests for incorrect willreturn handling (NFC)

Highlight a few of the places that don't handle non-willreturn
calls correctly right now.

3 years ago[Tests] Add missing willreturn attributes (NFC)
Nikita Popov [Wed, 21 Jul 2021 20:47:26 +0000 (22:47 +0200)]
[Tests] Add missing willreturn attributes (NFC)

To retain the spirit of these tests after an upcoming change
to mayHaveSideEffect(), add willreturn attributes to a number
of functions.

3 years ago[LICM] Extract debugify test (NFC)
Nikita Popov [Sat, 24 Jul 2021 15:03:20 +0000 (17:03 +0200)]
[LICM] Extract debugify test (NFC)

Only one of the tests in the file wants to check debug info, so
move it into a separate file. This allows update_test_checks to
work.

3 years ago[ADT] Remove WrappedPairNodeDataIterator (NFC)
Kazu Hirata [Sat, 24 Jul 2021 15:02:57 +0000 (08:02 -0700)]
[ADT] Remove WrappedPairNodeDataIterator (NFC)

The last use was removed on Jul 16, 2020 in commit
f1d4db4f0cdcbfeaee0840bf8a4fb5dc1b9b56fd.

3 years ago[X86] Add additional div-mod-pair negative test coverage
Simon Pilgrim [Sat, 24 Jul 2021 14:21:46 +0000 (15:21 +0100)]
[X86] Add additional div-mod-pair negative test coverage

As suggested on D106745

3 years ago[mlir] Restore markUnknownOpDynamicallyLegal to call isDynamicallyLegal by default
Benjamin Kramer [Sat, 24 Jul 2021 13:54:42 +0000 (15:54 +0200)]
[mlir] Restore markUnknownOpDynamicallyLegal to call isDynamicallyLegal by default

Looks like an oversight from b7a464989955e6374b39b518e317b59b510d4dc5

This should probably have a test case ...

3 years ago[BasicTTI] Set scalarization cost of scalable vector casts to Invalid.
Sander de Smalen [Sat, 24 Jul 2021 12:41:40 +0000 (13:41 +0100)]
[BasicTTI] Set scalarization cost of scalable vector casts to Invalid.

When BasicTTIImpl::getCastInstrCost can't determine the cost of a
vector cast operation when the types need legalization, it falls
back to calculating scalarization costs. Instead of crashing on
`cast<FixedVectorType>(DstVTy)` when the type is a scalable vector,
return an Invalid cost.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106655

3 years ago[X86] Add i128 div-mod-pair test coverage
Simon Pilgrim [Sat, 24 Jul 2021 13:00:40 +0000 (14:00 +0100)]
[X86] Add i128 div-mod-pair test coverage

3 years ago[SVE][NFC] Cleanup fixed length code gen tests to make them more resilient.
Paul Walker [Thu, 22 Jul 2021 16:09:18 +0000 (17:09 +0100)]
[SVE][NFC] Cleanup fixed length code gen tests to make them more resilient.

Many of the tests have used NEXT when DAG is more appropriate. In
some cases single DAG lines have been used. Note that these are
manual tests because they're too complex for update_llc_test_checks.py
and so it's worth not relying too much on the ordered output.

I've also made the CHECK lines more uniform when it comes to the
ordering of things like LO/HI.

3 years ago[CGP] despeculateCountZeros - Don't create is-zero branch if cttz/ctlz source is...
Simon Pilgrim [Sat, 24 Jul 2021 11:58:02 +0000 (12:58 +0100)]
[CGP] despeculateCountZeros - Don't create is-zero branch if cttz/ctlz source is known non-zero

If value tracking can confirm that the cttz/ctlz source is known non-zero then we don't need to create a branch (which DAG will struggle to recover from).

Differential Revision: https://reviews.llvm.org/D106685

3 years ago[gn build] Port 6aa9e746ebde
LLVM GN Syncbot [Sat, 24 Jul 2021 12:03:50 +0000 (12:03 +0000)]
[gn build] Port 6aa9e746ebde

3 years ago[AVR] Only support sp, r0 and r1 in llvm.read_register
Ayke van Laethem [Thu, 18 Feb 2021 17:02:13 +0000 (18:02 +0100)]
[AVR] Only support sp, r0 and r1 in llvm.read_register

Most other registers are allocatable and therefore cannot be used.

This issue was flagged by the machine verifier, because reading other
registers is considered reading from an undefined register.

Differential Revision: https://reviews.llvm.org/D96969

3 years ago[AVR] Fix rotate instructions
Ayke van Laethem [Thu, 18 Feb 2021 15:12:11 +0000 (16:12 +0100)]
[AVR] Fix rotate instructions

This patch fixes some issues with the RORB pseudo instruction.

  - A minor issue in which the instructions were said to use the SREG,
    which is not true.
  - An issue with the BLD instruction, which did not have an output operand.
  - A major issue in which invalid instructions were generated. The fix
    also reduces RORB from 4 to 3 instructions, so it's also a small
    optimization.

These issues were flagged by the machine verifier.

Differential Revision: https://reviews.llvm.org/D96957

3 years ago[AVR] Expand large shifts early in IR
Ayke van Laethem [Sun, 14 Feb 2021 22:29:59 +0000 (23:29 +0100)]
[AVR] Expand large shifts early in IR

This patch makes sure shift instructions such as this one:

    %result = shl i32 %n, %amount

are expanded just before the IR to SelectionDAG conversion to a loop so
that calls to non-existing library functions such as __ashlsi3 are
avoided. The generated code is currently pretty bad but there's a lot of
room for improvement: the shift itself can be done in just four
instructions.
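
The idea of replacing a variable 32-bit shift with a loop can be sketched in
C++ as follows. The function name is hypothetical and the exact shape of the
loop the pass emits is an assumption; the sketch only shows that repeated
single-bit shifts need no library call.

```cpp
#include <cstdint>

// Shift left by a variable amount using only single-bit shifts in a loop,
// so no helper such as __ashlsi3 is required.
constexpr uint32_t shiftLeftByLoop(uint32_t Value, uint8_t Amount) {
  while (Amount != 0) {
    Value <<= 1;
    --Amount;
  }
  return Value;
}
```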

Differential Revision: https://reviews.llvm.org/D96677

3 years ago[AVR] Improve 8/16 bit atomic operations
Ayke van Laethem [Sat, 20 Feb 2021 19:51:38 +0000 (20:51 +0100)]
[AVR] Improve 8/16 bit atomic operations

There were some serious issues with atomic operations. This patch should
fix the biggest issues.

For details on the issue take a look at this Compiler Explorer sample:
https://godbolt.org/z/n3ndhn

Code:

    void atomicadd(_Atomic char *val) {
        *val += 5;
    }

Output:

    atomicadd:
        movw    r26, r24
        ldi     r24, 5     ; 'operand' register
        in      r0, 63
        cli
        ld      r24, X     ; load value
        add     r24, r26   ; value += X
        st      X, r24     ; store value back
        out     63, r0
        ret                ; return the wrong value (in r24)

There are various problems with this.

 - The value to add (5) is stored in r24. However, the value to add to
   is loaded in the same register: r24.
 - The `add` instruction adds half of the pointer to the loaded value,
   instead of (attempting to) add the operand with value 5.
 - The output value of the cmpxchg instruction (which is not used in
   this code sample) is the new value with 5 added, not the old value.
   The LangRef specifies that it has to be the old value, before the
   operation.

This patch fixes the first two and leaves the third problem to be fixed
at a later date. I believe atomics were mostly broken before this patch;
with this patch they should become usable as long as you ignore the
output of the atomic operation. In particular it fixes the following
things:

 - It sets the earlyclobber flag for the input ('$operand' operand) so
   that the register allocator puts it in a different register than the
   output value.
 - It fixes a number of issues with the pseudo op expansion pass, for
   example now it adds the $operand field instead of the pointer. This
   fixes most machine instruction verifier issues (other flagged issues
   are unrelated to atomics).

Differential Revision: https://reviews.llvm.org/D97127

3 years ago[AVR] Set R31R30 as clobbered after ADJCALLSTACKDOWN
Ayke van Laethem [Tue, 2 Mar 2021 00:21:41 +0000 (01:21 +0100)]
[AVR] Set R31R30 as clobbered after ADJCALLSTACKDOWN

In most cases, using R31R30 is fine because the call (which always
precedes ADJCALLSTACKDOWN) will clobber R31R30 anyway. However, in some
rare cases the register allocator might insert an instruction between
the call and the ADJCALLSTACKDOWN instruction and expect the register
pair to be live afterwards. I think this happens as a result of
rematerialization. Therefore, to fix this, the instruction needs to have
Defs set to R31R30.

Setting the Defs field does have the effect of making the instruction
look dead, which it certainly is not. This is fixed by setting
hasSideEffects to true.

Differential Revision: https://reviews.llvm.org/D97745

3 years ago[AVR] Do not chain stores in call frame setup
Ayke van Laethem [Wed, 3 Mar 2021 13:23:31 +0000 (14:23 +0100)]
[AVR] Do not chain stores in call frame setup

Previously, AVRTargetLowering::LowerCall attempted to keep stack stores
in order with chains. Perhaps this worked in the past, but it does not
work now: it appears that the SelectionDAG legalization phase removes
these chains. Therefore, I've removed these chains entirely to match
X86 (which, similar to AVR, also prefers to use push instructions over
stack-relative stores to set up a call frame). With this change, all the
stack stores are in a somewhat reasonable order.

Differential Revision: https://reviews.llvm.org/D97853

3 years ago[lld][WebAssembly] Align __heap_base
Ayke van Laethem [Wed, 21 Jul 2021 21:35:10 +0000 (23:35 +0200)]
[lld][WebAssembly] Align __heap_base

__heap_base was not aligned. In practice, it will often be aligned
simply because it follows the stack, but when the stack is placed at the
beginning (with the --stack-first option), the __heap_base might be
unaligned. It could even be byte-aligned.
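
Rounding an address up to an alignment boundary can be sketched like this.
The helper name and the alignment values used with it are assumptions for
illustration; this message does not state which alignment the linker applies.

```cpp
#include <cstdint>

// Round Address up to the next multiple of Alignment.
// Alignment must be a power of two.
constexpr uint64_t alignUp(uint64_t Address, uint64_t Alignment) {
  return (Address + Alignment - 1) & ~(Alignment - 1);
}
```

An address that is already aligned is returned unchanged, so applying this to
a heap base that happens to follow an aligned stack has no effect.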

At least wasi-libc appears to expect that __heap_base is aligned:
https://github.com/WebAssembly/wasi-libc/blob/659ff414560721b1660a19685110e484a081c3d4/dlmalloc/src/malloc.c#L5224

While WebAssembly itself does not appear to require any alignment for
memory accesses, it is sometimes required when sharing a pointer
externally. For example, WASI might expect alignment up to 8:
https://github.com/WebAssembly/WASI/blob/main/phases/snapshot/docs.md#-timestamp-u64

This issue got introduced with the addition of the --stack-first flag:
https://reviews.llvm.org/D46141
I suspect the lack of alignment wasn't intentional here.

Differential Revision: https://reviews.llvm.org/D106499

3 years ago[mlir] ConversionTarget legality callbacks refactoring
Butygin [Tue, 6 Jul 2021 16:11:16 +0000 (19:11 +0300)]
[mlir] ConversionTarget legality callbacks refactoring

* Get rid of Optional<std::function> as std::function already has a null state
* Add private setLegalityCallback function to set legality callback for unknown ops
* Get rid of unknownOpsDynamicallyLegal flag, use unknownLegalityFn state instead. This causes a behavior change when the user first calls markUnknownOpDynamicallyLegal with a callback and then without, but I am not sure if the original behavior was really a 'feature' or just an oversight in the original implementation.
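
A minimal sketch of the first point, using a hypothetical callback type that
takes an `int` instead of MLIR's real legality callback signature. An empty
`std::function` already tests as false, so it can represent "no callback set"
without an Optional wrapper.

```cpp
#include <functional>

// Hypothetical callback type; MLIR's actual legality callback differs.
using LegalityCallback = std::function<bool(int)>;

// An empty std::function converts to false, so no Optional wrapper is
// needed to express "no callback set".
inline bool hasCallback(const LegalityCallback &Callback) {
  return static_cast<bool>(Callback);
}

// Fall back to a default answer when no callback is installed.
inline bool isLegal(const LegalityCallback &Callback, int Op, bool Default) {
  return Callback ? Callback(Op) : Default;
}
```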

Differential Revision: https://reviews.llvm.org/D105496