platform/upstream/llvm.git
2 years ago[AMDGPU][MC][NFC] Refactored sendmsg(...) handling
Dmitry Preobrazhensky [Mon, 21 Mar 2022 12:23:44 +0000 (15:23 +0300)]
[AMDGPU][MC][NFC] Refactored sendmsg(...) handling

Differential Revision: https://reviews.llvm.org/D121995

2 years ago [LV] Fix typo in comment
Sophia [Fri, 18 Mar 2022 02:14:50 +0000 (10:14 +0800)]
[LV] Fix typo in comment

    Reviewed by: fhahn (Florian Hahn)

    Differential Revision: https://reviews.llvm.org/D121781

2 years ago[mlir] Fix block merging with the result of a terminator
Markus Böck [Mon, 21 Mar 2022 12:26:00 +0000 (13:26 +0100)]
[mlir] Fix block merging with the result of a terminator

When the current implementation merges two blocks whose operands are defined outside of their respective blocks, it adds a block argument to the resulting merged block and adds successor arguments to the predecessors.
There is a special case where this is incorrect, however: if the terminator of one of the predecessors produces the operand, inserting the block argument and updating the predecessor would lead to the terminator using its own result as a successor argument.
IR Example:
```
  %0 = "test.producing_br"()[^bb1, ^bb2] {
        operand_segment_sizes = dense<0> : vector<2 x i32>
} : () -> i32

^bb1:
  "test.br"(%0)[^bb4] : (i32) -> ()
```
Merging `^bb1` with another block would then lead to:
 ```
  %0 = "test.producing_br"(%0)[^bb1, ^bb2]
```

This patch fixes that issue during clustering by making sure that an operand from an outside block is not produced by the terminator of a predecessor.

Differential Revision: https://reviews.llvm.org/D121988

2 years ago[Utils] Fix %S substitution
Nikita Popov [Mon, 21 Mar 2022 12:20:32 +0000 (13:20 +0100)]
[Utils] Fix %S substitution

%S refers to the directory of %s, not to the cwd. This is mostly
handled correctly, but update_cc_test_checks.py used the wrong
path for non-FileCheck RUN lines.
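
A minimal sketch of the distinction, not the code from update_cc_test_checks.py: the helper name `expand_run_line` is hypothetical, and only the two substitutions named above are modelled (real lit handles more). `posixpath` is used so the result does not depend on the host platform.

```python
import posixpath


def expand_run_line(run_line: str, source_path: str) -> str:
    """Expand %s and %S in a RUN line (hypothetical helper, illustration only)."""
    # %S is the directory that contains the test file, NOT the current
    # working directory of the process running the script.
    source_dir = posixpath.dirname(source_path)
    # %S and %s differ only in case, so the two replacements cannot collide.
    return run_line.replace("%S", source_dir).replace("%s", source_path)
```

With `source_path = "/src/test/foo.c"`, a `-I%S/Inputs` argument resolves next to the test file regardless of where the script is invoked from.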

2 years ago[mlir] Add a function to print C-strings to RunnerUtils.cpp.
Alexander Belyaev [Mon, 21 Mar 2022 10:01:50 +0000 (11:01 +0100)]
[mlir] Add a function to print C-strings to RunnerUtils.cpp.

Differential Revision: https://reviews.llvm.org/D122066

2 years ago[X86] Add nounwind to adc/sbb tests to prevent cfi noise
Simon Pilgrim [Mon, 21 Mar 2022 11:44:12 +0000 (11:44 +0000)]
[X86] Add nounwind to adc/sbb tests to prevent cfi noise

2 years ago[AMDGPU] Add an agpr copy propagation test
Jay Foad [Mon, 21 Mar 2022 11:41:22 +0000 (11:41 +0000)]
[AMDGPU] Add an agpr copy propagation test

2 years ago[AMDGPU] Update checks in agpr-copy-propagation.mir
Jay Foad [Mon, 21 Mar 2022 11:40:42 +0000 (11:40 +0000)]
[AMDGPU] Update checks in agpr-copy-propagation.mir

2 years ago[lld-macho][nfc] Add comment explaining why a cast<> is safe
Jez Ng [Wed, 16 Mar 2022 21:53:02 +0000 (17:53 -0400)]
[lld-macho][nfc] Add comment explaining why a cast<> is safe

2 years ago[lld-macho][nfc] Have findContainingSubsection take a Section
Jez Ng [Wed, 16 Mar 2022 22:05:32 +0000 (18:05 -0400)]
[lld-macho][nfc] Have findContainingSubsection take a Section

... instead of an instance of `Subsections`.

This simplifies the code slightly since all its callsites have a Section
instance anyway.

2 years ago[clang] Fix wrong -Wunused-local-typedef warning within a template function
Kristina Bessonova [Mon, 21 Mar 2022 11:21:24 +0000 (13:21 +0200)]
[clang] Fix wrong -Wunused-local-typedef warning within a template function

Partially fixes PR24883.

The patch sets Reference bit while instantiating a typedef if it
previously was found referenced.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D114382

2 years ago[clang-tidy] Skip parentheses in `readability-make-member-function-const`
Evgeny Shulgin [Sat, 19 Mar 2022 18:23:47 +0000 (21:23 +0300)]
[clang-tidy] Skip parentheses in `readability-make-member-function-const`

The checker should ignore parentheses when determining whether the
function should be marked as `const`.

Fixes https://github.com/llvm/llvm-project/issues/52838

Reviewed By: mgehre-amd, njames93

Differential Revision: https://reviews.llvm.org/D122075

2 years ago[AMDGPU] SIInstrInfo::verifyInstruction tweaks. NFCI.
Jay Foad [Mon, 21 Mar 2022 11:14:17 +0000 (11:14 +0000)]
[AMDGPU] SIInstrInfo::verifyInstruction tweaks. NFCI.

Simplify some for loops. Don't bother checking src2 operand for
writelane because it doesn't have one. Check all VALU instructions,
not just VOP1/2/3/C/SDWA.

2 years ago[mlir][OpenMP] Added translation from `omp.atomic.capture` to LLVM IR
Shraiysh Vaishay [Mon, 21 Mar 2022 10:50:54 +0000 (16:20 +0530)]
[mlir][OpenMP] Added translation from `omp.atomic.capture` to LLVM IR

This patch adds translation from `omp.atomic.capture` to LLVM IR. Also
added tests for the same.

Depends on D121546

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D121554

2 years ago[CodeGen][OpenMP] Use correct type in EmitLoadOfPointer()
Nikita Popov [Mon, 21 Mar 2022 10:54:39 +0000 (11:54 +0100)]
[CodeGen][OpenMP] Use correct type in EmitLoadOfPointer()

Rather than using a dummy void pointer type, we should specify the
correct private type and perform the bitcast beforehand rather than
afterwards. This way, the Address will have correct alignment
information.

2 years ago[X86] combineAddOrSubToADCOrSBB - Fold ADD/SUB + (AND(SRL(X,Y),1) -> ADC/SBB+BT(X,Y)
Simon Pilgrim [Mon, 21 Mar 2022 10:56:27 +0000 (10:56 +0000)]
[X86] combineAddOrSubToADCOrSBB - Fold ADD/SUB + (AND(SRL(X,Y),1) -> ADC/SBB+BT(X,Y)

As suggested on PR35908, if we are adding/subtracting an extracted bit, attempt to use BT instead to fold the op and use an ADC/SBB op.

Differential Revision: https://reviews.llvm.org/D122084
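
The arithmetic identity behind this fold can be checked with a small model. This is only a sketch on 32-bit values: the function names are illustrative, not LLVM APIs, and the shift amount is assumed to be in range (0..31).

```python
MASK32 = 0xFFFFFFFF


def bt(x: int, y: int) -> int:
    """Model of BT: the carry flag becomes bit y of x."""
    return (x >> y) & 1


def adc(a: int, b: int, carry: int) -> int:
    """Model of ADC: a + b + carry, wrapped to 32 bits."""
    return (a + b + carry) & MASK32


def sbb(a: int, b: int, borrow: int) -> int:
    """Model of SBB: a - b - borrow, wrapped to 32 bits."""
    return (a - b - borrow) & MASK32


def add_extracted_bit(a: int, x: int, y: int) -> int:
    """The original pattern: ADD(a, AND(SRL(x, y), 1))."""
    return (a + ((x >> y) & 1)) & MASK32


def sub_extracted_bit(a: int, x: int, y: int) -> int:
    """The original pattern: SUB(a, AND(SRL(x, y), 1))."""
    return (a - ((x >> y) & 1)) & MASK32
```

For any in-range `y`, `add_extracted_bit(a, x, y)` equals `adc(a, 0, bt(x, y))`, and `sub_extracted_bit(a, x, y)` equals `sbb(a, 0, bt(x, y))`, which is why the shift and mask can be replaced by a bit test feeding the carry flag.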

2 years ago[OpenMP][IRBuilder] Fix emitAtomicUpdate conditions
Shraiysh Vaishay [Mon, 21 Mar 2022 10:08:45 +0000 (15:38 +0530)]
[OpenMP][IRBuilder] Fix emitAtomicUpdate conditions

This patch fixes the condition for emitting atomic update using
`atomicrmw` instruction or compare-exchange loop.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D121546

2 years ago[X86] Add (x - y - ((z & m) >> s)) sub -> sbb test case for D122084
Simon Pilgrim [Mon, 21 Mar 2022 10:44:05 +0000 (10:44 +0000)]
[X86] Add (x - y - ((z & m) >> s)) sub -> sbb test case for D122084

2 years ago[instcombine] Support and test __builtin_object_size interaction with __strdup and...
serge-sans-paille [Fri, 18 Mar 2022 13:37:43 +0000 (14:37 +0100)]
[instcombine] Support and test __builtin_object_size interaction with __strdup and __strndup

Differential Revision: https://reviews.llvm.org/D122005

2 years ago[LowerConstantIntrinsics] Support phi operand in __builtin_object_size folder
serge-sans-paille [Thu, 17 Mar 2022 10:18:54 +0000 (11:18 +0100)]
[LowerConstantIntrinsics] Support phi operand in __builtin_object_size folder

The implementation is just a generalization of the Select handler.
We're not trying to be smart and compute any kind of fixed point.

Differential Revision: https://reviews.llvm.org/D121897

2 years ago[LV] Remove unneeded Loop argument from completeLoopSkeleton. (NFCI)
Florian Hahn [Mon, 21 Mar 2022 10:07:25 +0000 (10:07 +0000)]
[LV] Remove unneeded Loop argument from completeLoopSkeleton. (NFCI)

completeLoopSkeleton uses its loop argument only to get the
pre-header, but the pre-header is already known (we created/cached it
earlier). Remove the unneeded loop argument.

2 years ago[clang-format] [doc] Improve BraceWrapping documentation.
Marek Kurdej [Mon, 21 Mar 2022 09:24:56 +0000 (10:24 +0100)]
[clang-format] [doc] Improve BraceWrapping documentation.

2 years ago[libcxx] [ci] Check that Windows static libraries don't contain dllexports
Martin Storsjö [Mon, 7 Mar 2022 21:35:45 +0000 (23:35 +0200)]
[libcxx] [ci] Check that Windows static libraries don't contain dllexports

Differential Revision: https://reviews.llvm.org/D121164

2 years ago[Docs] Update opaque pointers docs (NFC)
Nikita Popov [Mon, 21 Mar 2022 09:11:03 +0000 (10:11 +0100)]
[Docs] Update opaque pointers docs (NFC)

Mention automatic enablement of opaque pointers mode that was
recently implemented. Update wording in the transition state,
because it seems like my overly cautious wording has given some
people an incorrect impression of the state of opaque pointer
support in clang.

2 years ago[Docs] Fix reference (NFC)
Nikita Popov [Mon, 21 Mar 2022 09:06:04 +0000 (10:06 +0100)]
[Docs] Fix reference (NFC)

2 years ago[clang-format] Use range-for loop with drop_end. NFC.
Marek Kurdej [Fri, 18 Mar 2022 13:28:31 +0000 (14:28 +0100)]
[clang-format] Use range-for loop with drop_end. NFC.

2 years agoRevert "[AMDGPU] Improve v_cmpx usage on GFX10.3."
Thomas Symalla [Mon, 21 Mar 2022 08:49:30 +0000 (09:49 +0100)]
Revert "[AMDGPU] Improve v_cmpx usage on GFX10.3."

This reverts commit 011c64191ef9ccc6538d52f4b57f98f37d4ea36e and
e725e2afe02e18398525652c9bceda1eb055ea64.

Differential Revision: https://reviews.llvm.org/D122117

2 years ago[ADT] Add drop_end.
Marek Kurdej [Fri, 18 Mar 2022 13:18:37 +0000 (14:18 +0100)]
[ADT] Add drop_end.

This patch adds drop_end, which is analogous to drop_begin.
It fills the functional gap where one could drop the first elements but not the last ones.
The need for it came up when refactoring clang-format.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D122009
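
As a rough illustration of the intended semantics (a Python model on lists, not the C++ ADT helpers themselves; the default of one element and the error on over-long drops are assumptions of this sketch):

```python
def drop_begin(seq: list, n: int = 1) -> list:
    """Return seq without its first n elements."""
    if n > len(seq):
        raise ValueError("cannot drop more elements than the sequence holds")
    return seq[n:]


def drop_end(seq: list, n: int = 1) -> list:
    """Return seq without its last n elements."""
    if n > len(seq):
        raise ValueError("cannot drop more elements than the sequence holds")
    # len(seq) - n rather than -n, so that n == 0 keeps every element.
    return seq[: len(seq) - n]
```

A loop that must treat the final element specially can then iterate over `drop_end(items)` and handle `items[-1]` separately, mirroring what `drop_begin` already allowed for the first element.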

2 years ago[AMDGPU] [NFC] Fix missing include.
Thomas Symalla [Mon, 21 Mar 2022 08:37:22 +0000 (09:37 +0100)]
[AMDGPU] [NFC] Fix missing include.

2 years ago[AMDGPU] Improve v_cmpx usage on GFX10.3.
Thomas Symalla [Tue, 1 Feb 2022 09:28:18 +0000 (10:28 +0100)]
[AMDGPU] Improve v_cmpx usage on GFX10.3.

On GFX10.3 targets, the following instruction sequence

v_cmp_* SGPR, ...
s_and_saveexec ..., SGPR

leads to a fairly long stall caused by a VALU write to an SGPR and having the
following SALU wait for the SGPR.

An equivalent sequence is to save the exec mask manually instead of letting
s_and_saveexec do the work and use a v_cmpx instruction instead to do the
comparison.

This patch modifies the SIOptimizeExecMasking pass as this is the last position
where s_and_saveexec instructions are inserted. It does the transformation by
trying to find the pattern, extracting the operands and generating the new
instruction sequence.

It also changes some existing lit tests and introduces a few new tests to show
the changed behavior on GFX10.3 targets.

Reviewed By: sebastian-ne, critson

Differential Revision: https://reviews.llvm.org/D119696

2 years ago[tests][intelpt] Fix outdated trace load test
Alisamar Husain [Mon, 21 Mar 2022 07:38:40 +0000 (13:08 +0530)]
[tests][intelpt] Fix outdated trace load test

Differential Revision: https://reviews.llvm.org/D122114

2 years ago[gn build] Port 9ada761be3b9
LLVM GN Syncbot [Mon, 21 Mar 2022 07:47:10 +0000 (07:47 +0000)]
[gn build] Port 9ada761be3b9

2 years ago[clang][Bazel] Add missing dependency from symbol_graph to llvm:support.
Adrian Kuegel [Mon, 21 Mar 2022 07:41:17 +0000 (08:41 +0100)]
[clang][Bazel] Add missing dependency from symbol_graph to llvm:support.

This did not show up as a build error because the build also works if the
dependency is transitively available. But there should be a direct
dependency anyway.

2 years ago[PowerPC][NFC] rename file for PPCCTRLoopsVerify pass.
Chen Zheng [Mon, 21 Mar 2022 07:38:46 +0000 (03:38 -0400)]
[PowerPC][NFC] rename file for PPCCTRLoopsVerify pass.

Rename file for PPCCTRLoopsVerify pass from PPCCTRLoops.cpp
to PPCCTRLoopsVerify.cpp.

There will be a new file PPCCTRLoops.cpp for PPC CTR loops
generation later.

2 years agoRevert "[lldb/test] Add events listener helper class to lldbtest"
Pavel Labath [Mon, 21 Mar 2022 07:26:34 +0000 (08:26 +0100)]
Revert "[lldb/test] Add events listener helper class to lldbtest"

It removes the "wait-until-event-thread-stops" logic, which makes
TestDiagnosticReporting.py flaky.

This reverts commits 09ff41a087760ea7e80b8e5390a05101c5a5b929 and
acdd41b4590935e39208a941fbac7889d778e8e5.

2 years ago[libc] Add a linux file implementation.
Siva Chandra Reddy [Thu, 17 Mar 2022 08:38:37 +0000 (08:38 +0000)]
[libc] Add a linux file implementation.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D121976

2 years ago[trace][intelpt] Added total memory usage by decoded trace
Alisamar Husain [Sun, 20 Mar 2022 07:31:31 +0000 (13:01 +0530)]
[trace][intelpt] Added total memory usage by decoded trace

This fails currently but the basics are there

Differential Revision: https://reviews.llvm.org/D122093

2 years ago[CodeGen] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)
Kazu Hirata [Mon, 21 Mar 2022 06:11:06 +0000 (23:11 -0700)]
[CodeGen] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)

2 years ago[X86] Simplify the interface to getCondNoFromDesc.
Craig Topper [Mon, 21 Mar 2022 05:02:40 +0000 (22:02 -0700)]
[X86] Simplify the interface to getCondNoFromDesc.

Instead of taking a SkipDefs parameter, rename to getCondSrcNoFromDesc
and have it return the source operand number. Make getCondFromMI
responsible for adding the number of Defs for MI instructions.

While there remove some unneeded casts to unsigned and check for
negative numbers instead of explicitly -1. Less than 0 is easier
for a compiler to codegen.

Differential Revision: https://reviews.llvm.org/D122113

2 years ago[llvm-pdbutil] Replace ExitOnError with explicit error handling.
Carlos Alberto Enciso [Sun, 20 Mar 2022 07:40:32 +0000 (07:40 +0000)]
[llvm-pdbutil] Replace ExitOnError with explicit error handling.

At Sony we are developing llvm-dva

https://lists.llvm.org/pipermail/llvm-dev/2020-August/144174.html

For its PDB support, it requires functionality already present
in llvm-pdbutil.

We intend to move that functionality into the PDB library to be
shared by both tools. That change will be done in two steps,
submitted as two patches:

(1) Replace 'ExitOnError' with explicit error handling.
(2) Move the intended shared code to the PDB library.

This patch is for step (1).

As 'ExitOnError' is intended to be used only in tool code, replace
all occurrences in the code that will be moved to the PDB library
with explicit error handling.

Reviewed By: aganea, dblaikie, rnk

Differential Revision: https://reviews.llvm.org/D121801

2 years ago[IROutliner] Do not outline from functions with optnone
Andrew Litteken [Mon, 14 Mar 2022 04:34:30 +0000 (23:34 -0500)]
[IROutliner] Do not outline from functions with optnone

Since the IROutliner is performing an optimization, it should not outline from functions explicitly marked with optnone. This adds an extra check and test to make sure this does not occur.

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D121567

2 years ago[X86][NFC] Run clang-format on cb26730aaa8b, fix typo and remove redundant else
Shengchen Kan [Mon, 21 Mar 2022 04:05:05 +0000 (12:05 +0800)]
[X86][NFC] Run clang-format on cb26730aaa8b, fix typo and remove redundant else

2 years ago[X86][NFC] Unify implementations of getting condition code
Shengchen Kan [Mon, 21 Mar 2022 03:28:24 +0000 (11:28 +0800)]
[X86][NFC] Unify implementations of getting condition code

2 years ago[mlir][Arith] Add constant folder for right shift
jacquesguan [Fri, 18 Mar 2022 07:57:47 +0000 (15:57 +0800)]
[mlir][Arith] Add constant folder for right shift

Differential Revision: https://reviews.llvm.org/D121985

2 years ago[Analysis] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)
Kazu Hirata [Mon, 21 Mar 2022 01:21:40 +0000 (18:21 -0700)]
[Analysis] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)

2 years ago[LV] Remove dead Loop argument from emitMemRuntimeChecks. (NFC)
Florian Hahn [Sun, 20 Mar 2022 21:01:15 +0000 (21:01 +0000)]
[LV] Remove dead Loop argument from emitMemRuntimeChecks. (NFC)

2 years ago[SLP] Explicit track required stacksave/alloca dependency
Philip Reames [Sun, 20 Mar 2022 20:50:36 +0000 (13:50 -0700)]
[SLP] Explicit track required stacksave/alloca dependency

The semantics of an inalloca alloca instruction requires that it not be reordered with a preceding stacksave intrinsic call.  Unfortunately, there's no def/use edge or memory dependence edge.  (The memory point is slightly subtle, but in general a new allocation can't alias with a call which executes strictly before it comes into existence.)

I'd tried to tackle this same case previously in 689babdf6, but the fix chosen there turned out to be incomplete.  As such, this change contains a full revert of the first fix attempt.

This was noticed when investigating problems which surfaced with D118538, but this is definitely an existing bug.  This time around, I managed to reduce a couple of additional cases, including one which was being actively miscompiled even without the new scheduling change.  (See test diffs)

Compile time wise, we only spend extra time when seeing a stacksave (rare), and even then we walk the block at most once per schedule window extension.  Likely a non-issue.

2 years ago[PPCISelLowering] Avoid emitting calls to __multi3, __muloti4
Aaron Puchert [Sun, 20 Mar 2022 19:59:06 +0000 (20:59 +0100)]
[PPCISelLowering] Avoid emitting calls to __multi3, __muloti4

After D108936, @llvm.smul.with.overflow.i64 was lowered to __multi3
instead of __mulodi4, which also doesn't exist on PowerPC 32-bit, not
even with compiler-rt. Block it as well so that we get inline code.

Because libgcc doesn't have __muloti4, we block that as well.

Fixes #54460.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122090

2 years ago[Transform] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)
Kazu Hirata [Sun, 20 Mar 2022 17:41:22 +0000 (10:41 -0700)]
[Transform] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)

2 years ago[libc++][test][NFC] Remove libcpp-no-concepts.
Mark de Wever [Sun, 20 Mar 2022 13:15:00 +0000 (14:15 +0100)]
[libc++][test][NFC] Remove libcpp-no-concepts.

This is no longer needed.

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D122099

2 years ago[PowerPC][NFC] use right hardware loop intrinsics in test case
Chen Zheng [Sun, 20 Mar 2022 13:58:41 +0000 (09:58 -0400)]
[PowerPC][NFC] use right hardware loop intrinsics in test case

2 years ago[XCOFF] support XCOFFObjectWriter for fileHeader and sectionHeaders in 64-bit XCOFF.
esmeyi [Sun, 20 Mar 2022 13:31:29 +0000 (09:31 -0400)]
[XCOFF] support XCOFFObjectWriter for fileHeader and sectionHeaders in 64-bit XCOFF.

This is the first patch to enable the XCOFF64 object writer.
Currently only fileHeader and sectionHeaders are supported.

Reviewed By: jhenderson, DiggerLin

Differential Revision: https://reviews.llvm.org/D120861

2 years ago[MLIR][Presburger] remove redundant constraints in coalesce
Michel Weber [Sun, 20 Mar 2022 12:59:05 +0000 (12:59 +0000)]
[MLIR][Presburger] remove redundant constraints in coalesce

This patch improves the representation size of individual
`IntegerRelation`s by calling the function
`IntegerRelation::removeRedundantConstraints`. While this is only a
slight optimization in the current version, it will be necessary for
patches to come.

Reviewed By: arjunp

Differential Revision: https://reviews.llvm.org/D121989

2 years agoenable binop identity constant folds for add
Luo, Yuanke [Mon, 14 Mar 2022 00:27:30 +0000 (08:27 +0800)]
enable binop identity constant folds for add

Differential Revision: https://reviews.llvm.org/D119654

2 years ago[X86] Simplify function isDataInvariant by using X86MnemonicTables
Shengchen Kan [Sun, 20 Mar 2022 10:38:09 +0000 (18:38 +0800)]
[X86] Simplify function isDataInvariant by using X86MnemonicTables

This is not an NFC change because we add more instructions like
IMUL16/32/64r, MOV16ao16 and MOV16rr_REV etc to the list.
But I think it's reasonable.

Reviewed By: Amir

Differential Revision: https://reviews.llvm.org/D122063

2 years ago[VPlan] Add test for VPExpandSCEVRecipe printing.
Florian Hahn [Sun, 20 Mar 2022 10:11:40 +0000 (10:11 +0000)]
[VPlan] Add test for VPExpandSCEVRecipe printing.

2 years ago[X86] Add test add with bit0 extraction and improve comments
Simon Pilgrim [Sun, 20 Mar 2022 09:31:52 +0000 (09:31 +0000)]
[X86] Add test add with bit0 extraction and improve comments

Based on feedback from D122084

2 years ago[X86] combineAddOrSubToADCOrSBB - split to more cleanly handle commuted variants.
Simon Pilgrim [Sat, 19 Mar 2022 21:09:43 +0000 (21:09 +0000)]
[X86] combineAddOrSubToADCOrSBB - split to more cleanly handle commuted variants.

Split combineAddOrSubToADCOrSBB into wrapper (which handles ADDs with commuted args) and the real combine, which no longer has to account for commutation.

I'm intending to extend combineAddOrSubToADCOrSBB to detect patterns other than just X86ISD::SETCC, so we need to detect all patterns without detecting them as part of a commutation swap.

2 years ago[X86][NFC] Generate fields and getters for subtarget features
Shengchen Kan [Sun, 20 Mar 2022 07:22:02 +0000 (15:22 +0800)]
[X86][NFC] Generate fields and getters for subtarget features

Non-duplicated comments are moved from X86Subtarget.h to X86.td.
This is a follow-up patch for D120906.

2 years ago[trace][intelpt] Instruction count in trace info
Alisamar Husain [Sat, 19 Mar 2022 19:58:04 +0000 (01:28 +0530)]
[trace][intelpt] Instruction count in trace info

Added a line to `thread trace dump info` results which shows the total number of instructions executed until now.

Differential Revision: https://reviews.llvm.org/D122076

2 years ago[X86][NFC] Remove unused variable UseAA
Shengchen Kan [Sun, 20 Mar 2022 05:21:25 +0000 (13:21 +0800)]
[X86][NFC] Remove unused variable UseAA

2 years ago[X86][NFC] Remove unused feature UseAA
Shengchen Kan [Sun, 20 Mar 2022 05:14:13 +0000 (13:14 +0800)]
[X86][NFC] Remove unused feature UseAA

2 years ago[X86][NFC] Rename hasCMOV() to canUseCMOV(), hasLAHFSAHF() to canUseLAHFSAHF()
Shengchen Kan [Sun, 20 Mar 2022 04:00:25 +0000 (12:00 +0800)]
[X86][NFC] Rename hasCMOV() to canUseCMOV(), hasLAHFSAHF() to canUseLAHFSAHF()

To make them less like other feature functions.
This is a follow-up patch for D121978.

2 years ago[SelectionDAG][RISCV] Make RegsForValue::getCopyToRegs explicitly zero_extend constants.
Craig Topper [Sat, 19 Mar 2022 01:33:30 +0000 (18:33 -0700)]
[SelectionDAG][RISCV] Make RegsForValue::getCopyToRegs explicitly zero_extend constants.

ComputePHILiveOutRegInfo assumes that constant incoming values to
Phis will be zero extended if they aren't a legal type. To guarantee
that we should zero_extend rather than any_extend constants.

This fixes a bug for RISCV where any_extend of constants can be
treated as a sign_extend.

Differential Revision: https://reviews.llvm.org/D122053
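
To see why the choice of extension matters, here is a small model of widening an i32 constant to i64. It is only a sketch of the arithmetic, not SelectionDAG code, and the helper names are illustrative.

```python
def zero_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen by filling the new high bits with zeros."""
    return value & ((1 << from_bits) - 1) & ((1 << to_bits) - 1)


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen by copying the sign bit into the new high bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        # Sign bit set: reinterpret as a negative number before widening.
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)
```

For a constant with the top bit clear the two agree, but for `0x80000000` the zero-extended value is `0x0000000080000000` while the sign-extended one is `0xFFFFFFFF80000000`. An any_extend that happens to be materialized as a sign extension therefore breaks any consumer that assumed the upper bits were zero.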

2 years ago[RISCV] Add test case for miscompile caused by treating ANY_EXTEND of constants as...
Craig Topper [Sat, 19 Mar 2022 01:32:23 +0000 (18:32 -0700)]
[RISCV] Add test case for miscompile caused by treating ANY_EXTEND of constants as SIGN_EXTEND.

The code that inserts AssertZExt based on predecessor information assumes
constants are zero extended for phi incoming values. This allows
AssertZExt to be created in blocks consuming a Phi.
SelectionDAG::getNode treats any_extend of i32 constants as sext for RISCV.
The code that creates phi incoming values in the predecessors creates an
any_extend for the constants which then gets treated as a sext by getNode.
This makes the AssertZExt incorrect and can cause zexts to be
incorrectly removed.

This bug was introduced by D105918

Differential Revision: https://reviews.llvm.org/D122052

2 years ago[slp,tests] Consolidate two test files into one
Philip Reames [Sun, 20 Mar 2022 01:23:03 +0000 (18:23 -0700)]
[slp,tests] Consolidate two test files into one

There are some slight changes to the test lines due to different cost threshold choices in the two command lines, but I don't believe these to be interesting for the purpose of the tests.

2 years ago[bazel][mlir] Add MLIR PDLL LSP server target
Jacques Pienaar [Sun, 20 Mar 2022 00:53:37 +0000 (17:53 -0700)]
[bazel][mlir] Add MLIR PDLL LSP server target

Add targets for PDLL LSP server.

2 years ago[tests, SLP] Add coverage for missing dependencies for stacksave intrinsics
Philip Reames [Sun, 20 Mar 2022 01:03:57 +0000 (18:03 -0700)]
[tests, SLP] Add coverage for missing dependencies for stacksave intrinsics

The existing scheduling doesn't account for the scheduling restrictions implied by inalloca allocas combined with stacksave/stackrestore.  This adds coverage including one currently miscompiling case.

2 years agoRevert "[amdgpu][nfc] Pass function instead of module to allocateModuleLDSGlobal"
Jon Chesterfield [Sun, 20 Mar 2022 00:57:20 +0000 (00:57 +0000)]
Revert "[amdgpu][nfc] Pass function instead of module to allocateModuleLDSGlobal"
Reconsidered, better to handle per-function state in the constructor as before.
This reverts commit 98e474c1b3210d90e313457bf6a6e39a7edb4d2b.

2 years agomlir: set CMAKE_INCLUDE_CURRENT_DIR to fix out-of-tree builds
Will Dietz [Sat, 19 Mar 2022 21:53:59 +0000 (16:53 -0500)]
mlir: set CMAKE_INCLUDE_CURRENT_DIR to fix out-of-tree builds

This option tells CMake to add current source and binary
directories to the include path for each directory[1].

Required include directories from build tree (for generated
files) were previously added in `mlir_tablegen` but this was
changed in 03078ec20b12605fd4dfd9fe9c98a26c9d2286d7 .

These are still needed, however, for out-of-tree builds
that don't build as part of LLVM (via LLVM_ENABLE_PROJECTS).
Building as part of LLVM works regardless, AFAICT,
because LLVM sets this option and so the MLIR build inherits it.

FWIW, various other (in-tree) LLVM projects set this as well.

And of course this fixes the out-of-tree
mlir-by-itself build scenario I'm using.

[1] https://cmake.org/cmake/help/latest/variable/CMAKE_INCLUDE_CURRENT_DIR.html

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D122088

2 years ago[mlir:PDLL][NFC] Remove a dead comment about constant params
River Riddle [Sat, 19 Mar 2022 20:38:27 +0000 (13:38 -0700)]
[mlir:PDLL][NFC] Remove a dead comment about constant params

These were removed, and the FIXME is no longer relevant.

2 years ago[SLP] Respect control dependence within a block during scheduling
Philip Reames [Fri, 18 Mar 2022 22:25:10 +0000 (15:25 -0700)]
[SLP] Respect control dependence within a block during scheduling

This fixes an active miscompile visible in the test changes.  The basic problem is that the scheduling dependency graph didn't have any edges for control dependence within a single basic block.  The result is that we could (and in some rare cases *did*) perform reorderings within a block which could introduce new undefined behavior along paths which didn't previously contain any.

Impact wise, we have two major cases where control is not guaranteed to reach a later instruction in the block: may throw calls, and calls containing infinite loops.
* The former case was mostly covered by the memory dependencies, and triggering it requires a function which can throw, but not write to memory.  In theory, such a case is possible, but not likely in practice.
* The latter case is likely more of an issue in practice.  After this code was first written, we changed the IR semantics to allow well defined infinite loops without satisfying mustprogress.  Even for C/C++ - which do imply mustprogress - recent changes to how we treat atomics (e.g. an atomic read does not always imply a write) could expose this issue.  I'm a bit shocked we don't seem to have a bug report which hit this in real code actually.

Compile time wise, this results in a single extra scan of the scheduling window in the common case.  Since we stop scanning at the next instruction which isn't guaranteed to execute, no matter what order we traverse instructions in, we scan the block once.  The exception to this is that when we extend the scheduling window downwards, we invalidate all dependencies, and thus rescan.  So the potentially expensive case is when we have a call in a big schedule window which is frequently extended.  We could optimize this case (by caching the last instruction not guaranteed to transfer execution, and scanning only the extended window starting there), but I decided to leave the complexity until it mattered.  That same case is already degenerate with memory dependences, which are more expensive than the control dependence scan.

We could also consider combining the memory dependence and control dependence sets to reduce memory usage, but since it complicates the code slightly and makes debugging a bit harder, I went with the simplest scheme for now.

This was noticed while trying to understand the failures reported against D118538, but is not otherwise related to that change.
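
The scan described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the SLP vectorizer's code: the `Inst` record and its two flags are hypothetical stand-ins for "may not transfer execution to its successor" and "safe to speculatively execute".

```python
from dataclasses import dataclass


@dataclass
class Inst:
    name: str
    may_not_return: bool = False  # e.g. a may-throw call or a call that may loop forever
    speculatable: bool = True     # safe to run even if control would not reach it


def control_dependence_edges(block: list) -> list:
    """Return (earlier, later) pairs that must keep their relative order."""
    edges = []
    for i, src in enumerate(block):
        if not src.may_not_return:
            continue
        for dst in block[i + 1:]:
            if dst.speculatable:
                continue  # may be hoisted above src without adding new behaviour
            edges.append((src.name, dst.name))
            if dst.may_not_return:
                # Everything after dst depends on dst instead, so stop here.
                break
    return edges
```

Stopping at the next instruction that may not return is what keeps the total work close to one pass over the window in this model: each such instruction only scans up to its successor of the same kind.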

2 years ago[libSupport] make CallBacksToRun static local
Tal Kedar [Sat, 19 Mar 2022 19:08:02 +0000 (19:08 +0000)]
[libSupport] make CallBacksToRun static local

In order to allow compiling with -Werror=global-constructors with c++20 and above.

Discussion: https://discourse.llvm.org/t/llvm-lib-support-signals-cpp-fails-to-compile-due-to-werror-global-constructors/61070

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D122067

2 years ago[mlir:PDL] Remove the ConstantParams support from native Constraints/Rewrites
River Riddle [Mon, 14 Mar 2022 05:09:20 +0000 (22:09 -0700)]
[mlir:PDL] Remove the ConstantParams support from native Constraints/Rewrites

This support has never really worked well, and is incredibly clunky to
use (it effectively creates two argument APIs), and clunky to generate (it isn't
clear how we should actually expose this from PDL frontends). Treating these
as just attribute arguments is much much cleaner in every aspect of the stack.
If we need to optimize lots of constant parameters, it would be better to
investigate internal representation optimizations (e.g. batch attribute creation),
that do not affect the user (we want a clean external API).

Differential Revision: https://reviews.llvm.org/D121569

2 years ago[mlir][PDLL] Add signature help to the PDLL language server
River Riddle [Fri, 11 Mar 2022 08:38:17 +0000 (00:38 -0800)]
[mlir][PDLL] Add signature help to the PDLL language server

This commit adds signature support to the language server,
and initially supports providing help for: operation operands and results,
and constraint/rewrite calls.

Differential Revision: https://reviews.llvm.org/D121545

2 years ago[mlir][PDLL] Add code completion to the PDLL language server
River Riddle [Fri, 11 Mar 2022 08:32:49 +0000 (00:32 -0800)]
[mlir][PDLL] Add code completion to the PDLL language server

This commit adds code completion support to the language server,
and initially supports providing completions for: Member access,
attributes/constraint/dialect/operation names, and pattern metadata.

Differential Revision: https://reviews.llvm.org/D121544

2 years ago[mlir][PDLL] Add symbol support to the PDLL language server
River Riddle [Fri, 11 Mar 2022 08:23:39 +0000 (00:23 -0800)]
[mlir][PDLL] Add symbol support to the PDLL language server

This adds support for documenting the top-level "symbols",
e.g. patterns, constraints, rewrites, etc., within a PDLL file.

Differential Revision: https://reviews.llvm.org/D121543

2 years ago[mlir][PDLL] Add hover support to the PDLL language server
River Riddle [Fri, 11 Mar 2022 08:18:44 +0000 (00:18 -0800)]
[mlir][PDLL] Add hover support to the PDLL language server

This adds support for providing information when hovering over
operation names, variables, patterns, constraints, and rewrites.

Differential Revision: https://reviews.llvm.org/D121542

2 years ago[mlir][PDLL] Add an initial language server for PDLL
River Riddle [Fri, 11 Mar 2022 07:44:53 +0000 (23:44 -0800)]
[mlir][PDLL] Add an initial language server for PDLL

This commit adds a basic language server for PDLL to enable providing
language features in IDEs such as VSCode. This initial commit only
adds support for tracking definitions, references, and diagnostics, but
follow-up commits will build upon this to provide more significant behavior.

In addition to the server, this commit also updates mlir-vscode to support
the PDLL language and invoke the server.

Differential Revision: https://reviews.llvm.org/D121541

2 years ago[LV] Remove unnecessary uses of Loop* (NFC).
Florian Hahn [Sat, 19 Mar 2022 20:18:47 +0000 (20:18 +0000)]
[LV] Remove unnecessary uses of Loop* (NFC).

Update functions that previously took a loop pointer only to get the
pre-header so that they take the pre-header block directly. This removes
the requirement for the loop object to be created up-front.

2 years ago[X86] Rename FeatureCMPXCHG8B/FeatureCMPXCHG16B to FeatureCX8/CX16 to match CPUID.
Craig Topper [Sat, 19 Mar 2022 19:31:12 +0000 (12:31 -0700)]
[X86] Rename FeatureCMPXCHG8B/FeatureCMPXCHG16B to FeatureCX8/CX16 to match CPUID.

Rename hasCMPXCHG16B() to canUseCMPXCHG16B() to make it less like other
feature functions. Add a similar canUseCMPXCHG8B() that aliases
hasCX8() to keep similar naming.

Differential Revision: https://reviews.llvm.org/D121978

2 years ago[X86] Add some initial test coverage for PR35908 add/sub + bittest patterns
Simon Pilgrim [Sat, 19 Mar 2022 19:20:12 +0000 (19:20 +0000)]
[X86] Add some initial test coverage for PR35908 add/sub + bittest patterns

2 years ago[OpenMP][FIX] Do not crash when kernels are debug wrapper functions
Johannes Doerfert [Fri, 18 Mar 2022 21:53:40 +0000 (16:53 -0500)]
[OpenMP][FIX] Do not crash when kernels are debug wrapper functions

With debug information enabled (-g) Clang will wrap the actual target
region into a new function which is called from the "kernel". The problem
is that the "kernel" is now basically a wrapper without all the things
we expect. More importantly, if we end up asking for an AAKernelInfo
for the "target region function" we might try to turn it into SPMD mode.
That used to cause an assertion as that function doesn't have an
appropriately named `_exec_mode` global. While the global is going away
soon we still need to make sure to properly handle this case, e.g.,
perform optimizations reliably.

Differential Revision: https://reviews.llvm.org/D122043

2 years ago[docs] Fix a couple of typos
Itay Bookstein [Sat, 19 Mar 2022 18:24:38 +0000 (20:24 +0200)]
[docs] Fix a couple of typos

Signed-off-by: Itay Bookstein <ibookstein@gmail.com>

2 years ago[X86] combineAddOrSubToADCOrSBB - pull out repeated Y.getOperand(1) calls. NFC.
Simon Pilgrim [Sat, 19 Mar 2022 17:56:06 +0000 (17:56 +0000)]
[X86] combineAddOrSubToADCOrSBB - pull out repeated Y.getOperand(1) calls. NFC.

2 years ago[libc++] Prepare string tests for constexpr
Nikolas Klauser [Thu, 10 Mar 2022 19:15:23 +0000 (20:15 +0100)]
[libc++] Prepare string tests for constexpr

These are the last™ changes to the tests for constexpr preparation.

Reviewed By: Quuxplusone, #libc, Mordante

Spies: Mordante, EricWF, libcxx-commits

Differential Revision: https://reviews.llvm.org/D120951

2 years ago[docs] Fixed minor ordering issue
Alisamar Husain [Sat, 19 Mar 2022 16:49:13 +0000 (22:19 +0530)]
[docs] Fixed minor ordering issue

Differential Revision: https://reviews.llvm.org/D122073

2 years ago[X86] createShuffleMaskFromVSELECT - handle BLENDV constant masks as well as VSELECT...
Simon Pilgrim [Sat, 19 Mar 2022 16:51:00 +0000 (16:51 +0000)]
[X86] createShuffleMaskFromVSELECT - handle BLENDV constant masks as well as VSELECT constant masks

Handle constant masks for both vselect nodes (mask != 0) and blendv nodes (mask < 0)
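
The difference between the two mask conventions can be sketched with a scalar model. This is only an illustration, not the code from the patch: `Vec4`, `selectNonZero` and `selectSignBit` are hypothetical names, and the operand order (mask true picks `LHS`) is assumed to follow a select.

```cpp
#include <array>
#include <cassert> // used by the checks in the note below
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a 4 x i32 vector; not an LLVM type.
using Vec4 = std::array<int32_t, 4>;

// VSELECT-style mask: a lane takes LHS when its mask element is non-zero.
Vec4 selectNonZero(const Vec4 &Mask, const Vec4 &LHS, const Vec4 &RHS) {
  Vec4 Result{};
  for (std::size_t I = 0; I < Result.size(); ++I)
    Result[I] = (Mask[I] != 0) ? LHS[I] : RHS[I];
  return Result;
}

// BLENDV-style mask: a lane takes LHS only when the sign bit is set (mask < 0).
Vec4 selectSignBit(const Vec4 &Mask, const Vec4 &LHS, const Vec4 &RHS) {
  Vec4 Result{};
  for (std::size_t I = 0; I < Result.size(); ++I)
    Result[I] = (Mask[I] < 0) ? LHS[I] : RHS[I];
  return Result;
}
```

A mask element of `1` is the case where the two disagree: it is non-zero, so the vselect model picks `LHS`, but its sign bit is clear, so the blendv model picks `RHS`.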

2 years ago[SLP,tests] Add coverage showing need for control dependencies during scheduling
Philip Reames [Sat, 19 Mar 2022 16:41:14 +0000 (09:41 -0700)]
[SLP,tests] Add coverage showing need for control dependencies during scheduling

2 years ago[amdgpu][nfc] Pass function instead of module to allocateModuleLDSGlobal
Jon Chesterfield [Sat, 19 Mar 2022 16:10:05 +0000 (16:10 +0000)]
[amdgpu][nfc] Pass function instead of module to allocateModuleLDSGlobal

2 years ago[X86] combineSelect - don't constant fold BLENDV nodes like VSELECT
Simon Pilgrim [Sat, 19 Mar 2022 16:31:15 +0000 (16:31 +0000)]
[X86] combineSelect - don't constant fold BLENDV nodes like VSELECT

If an X86ISD::BLENDV op appears before legalization (in this test case due to the icmp_slt x, 0), its constant mask was being treated as a vselect mask (mask != 0) instead of a blendv mask (mask < 0).

This just prevents constant folding entirely for non-VSELECT ops.

2 years ago[X86] Add test showing a bug where a BLENDV mask is being constant folded as VSELECT...
Simon Pilgrim [Sat, 19 Mar 2022 16:26:20 +0000 (16:26 +0000)]
[X86] Add test showing a bug where a BLENDV mask is being constant folded as VSELECT mask

combineSelect doesn't expect X86ISD::BLENDV ops to appear before legalization and is treating the constant mask as a vselect mask (mask != 0) instead of blendv (mask < 0)

2 years ago[VPlan] Improve pattern in vplan-printing.ll check line.
Florian Hahn [Sat, 19 Mar 2022 16:02:41 +0000 (16:02 +0000)]
[VPlan] Improve pattern in vplan-printing.ll check line.

The existing pattern only matched a single value, which breaks if the
numbering slightly changes.

2 years ago[X86] Update remaining AVX512 VBMI2 VL intrinsic tests to avoid adds
Simon Pilgrim [Sat, 19 Mar 2022 15:41:25 +0000 (15:41 +0000)]
[X86] Update remaining AVX512 VBMI2 VL intrinsic tests to avoid adds

As noticed in D119654, adding the masked intrinsic results together can end up with the selects being canonicalized away from the intrinsic. This isn't what we want to test here, so replace the adds with an insertvalue chain into an aggregate to retain all the results.

2 years ago[X86] LowerAndToBT - fold BT(NOT(X),Y) -> BT(X,Y) and flip the CondCode
Simon Pilgrim [Sat, 19 Mar 2022 14:03:03 +0000 (14:03 +0000)]
[X86] LowerAndToBT - fold BT(NOT(X),Y) -> BT(X,Y) and flip the CondCode
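
Why the fold is sound can be checked with a small scalar model. This is a sketch under assumptions: `bitTest`, `beforeFold` and `afterFold` are hypothetical names, and the model covers only the 32-bit register form of BT, where the bit index is taken modulo 32.

```cpp
#include <cassert> // used by the check in the note below
#include <cstdint>

// Model of BT on a 32-bit register: returns bit (Y mod 32) of X,
// i.e. the value BT would place in the carry flag.
bool bitTest(uint32_t X, uint32_t Y) { return ((X >> (Y & 31u)) & 1u) != 0; }

// Before the fold: test a bit of NOT(X) and branch on "bit set".
bool beforeFold(uint32_t X, uint32_t Y) { return bitTest(~X, Y); }

// After the fold: test the same bit of X and flip the condition.
bool afterFold(uint32_t X, uint32_t Y) { return !bitTest(X, Y); }
```

Because a bit of `~X` is set exactly when the same bit of `X` is clear, `assert(beforeFold(X, Y) == afterFold(X, Y))` holds for every `X` and `Y`, and the NOT no longer has to be materialized.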

2 years ago[X86][SSE] Add initial support for extracting non-constant bool vector elements
Simon Pilgrim [Sat, 19 Mar 2022 13:31:05 +0000 (13:31 +0000)]
[X86][SSE] Add initial support for extracting non-constant bool vector elements

We can use MOVMSK+TEST/BT to extract individual bool elements even if the index isn't constant

This relies on combineBitcastvxi1 so some AVX512 cases still aren't optimized as they avoid MOVMSK usage.
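
The idea can be sketched with a scalar model of a 4-lane bool vector whose lanes are all-ones or zero. The names `moveMask` and `extractBool` are hypothetical, and wrapping the index with `Idx & 3` is an assumption made only to keep the model well defined for out-of-range indices.

```cpp
#include <array>
#include <cassert> // used by the check in the note below
#include <cstddef>
#include <cstdint>

// Model of MOVMSK: pack the sign bit of each lane into the low bits of a scalar.
uint32_t moveMask(const std::array<int32_t, 4> &Lanes) {
  uint32_t Bits = 0;
  for (std::size_t I = 0; I < Lanes.size(); ++I)
    if (Lanes[I] < 0)
      Bits |= 1u << I;
  return Bits;
}

// Model of MOVMSK followed by a bit test at a runtime index: no lane is
// spilled to memory, the element is read straight from the packed scalar.
bool extractBool(const std::array<int32_t, 4> &Lanes, uint32_t Idx) {
  return ((moveMask(Lanes) >> (Idx & 3u)) & 1u) != 0;
}
```

For lanes `{-1, 0, -1, 0}` the packed mask is `0b0101`, so `assert(extractBool(Lanes, 2))` holds while index 1 and index 3 yield false.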

2 years ago[X86][SSE] Add tests for non-constant bool vector extractions
Simon Pilgrim [Sat, 19 Mar 2022 13:25:21 +0000 (13:25 +0000)]
[X86][SSE] Add tests for non-constant bool vector extractions

We should be able to perform this with MOVMSK+TEST/BT instead of spilling to stack

2 years ago[AArch64] Combine ISD::SETCC into AArch64ISD::ANDS
chenglin.bi [Sat, 19 Mar 2022 12:54:44 +0000 (12:54 +0000)]
[AArch64] Combine ISD::SETCC into AArch64ISD::ANDS

When N > 12, (2^N -1) is not a legal add immediate (isLegalAddImmediate will return false).
If the SetCC input uses this number, the DAG combiner will generate one more SRL instruction.
So combine [setcc (srl x, imm), 0, ne] to [setcc (and x, (-1 << imm)), 0, ne] to get better optimization in emitComparison.
Fixes https://github.com/llvm/llvm-project/issues/54283
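
The equivalence behind the combine can be checked on scalars. This is a sketch, not the DAG combine itself: `viaShift` and `viaMask` are hypothetical names, and the model assumes a 32-bit value with a shift amount below 32.

```cpp
#include <cassert>
#include <cstdint>

// setcc (srl x, imm), 0, ne: some bit at position >= Imm is set.
bool viaShift(uint32_t X, unsigned Imm) {
  assert(Imm < 32 && "shift amount must be smaller than the bit width");
  return (X >> Imm) != 0;
}

// setcc (and x, (-1 << imm)), 0, ne: mask off the low Imm bits and compare.
bool viaMask(uint32_t X, unsigned Imm) {
  assert(Imm < 32 && "shift amount must be smaller than the bit width");
  return (X & (~0u << Imm)) != 0;
}
```

Both forms ask whether any bit at position `Imm` or above is set, so they agree for every input. The masked form is an AND compared against zero, which is the shape the title's AArch64ISD::ANDS targets.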

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D121449

2 years ago[AArch64] Fix incorrect getSetCCInverse usage within trySwapVSelectOperands.
Paul Walker [Thu, 17 Mar 2022 21:55:55 +0000 (21:55 +0000)]
[AArch64] Fix incorrect getSetCCInverse usage within trySwapVSelectOperands.

When inverting the compare predicate trySwapVSelectOperands is
incorrectly using the type of the select's cond operand rather
than the type of cond's operands. This means we're treating all
inversions as if they're integer.
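
The effect of inverting a floating-point compare as if it were an integer compare can be seen with NaN inputs. The sketch below is a scalar model with hypothetical function names; it assumes the integer-style inverse of an ordered less-than is an ordered greater-or-equal, whereas the floating-point inverse must also be true for unordered inputs.

```cpp
#include <cassert> // used by the check in the note below
#include <cmath>

// Ordered less-than: true only when neither input is NaN and A < B.
bool orderedLess(double A, double B) { return A < B; }

// Integer-style inversion (lt -> ge) stays "ordered": false if either input is NaN.
bool integerStyleInverse(double A, double B) { return A >= B; }

// Floating-point inversion: true when the inputs are unordered or A >= B,
// which is exactly the negation of orderedLess.
bool floatInverse(double A, double B) {
  return std::isnan(A) || std::isnan(B) || A >= B;
}
```

With `A = NaN`, `orderedLess` and `integerStyleInverse` are both false, so the "inverse" is not an inverse at all. `assert(floatInverse(A, B) == !orderedLess(A, B))` holds for all inputs, including NaN.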

Differential Revision: https://reviews.llvm.org/D121968

2 years ago[libc++][test] Improves handle formatter.
Mark de Wever [Thu, 3 Mar 2022 16:21:17 +0000 (17:21 +0100)]
[libc++][test] Improves handle formatter.

Previously it accepted only one output iterator type. Now it accepts all
output iterator types as required by BasicFormatter.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D120916