review.tizen.org Git - platform/upstream/llvm.git/log


Mehdi Amini [Wed, 3 Nov 2021 23:59:05 +0000 (23:59 +0000)]

Revert "Fix iterator_adaptor_base/enumerator_iter to allow composition of llvm::enumerate with llvm::make_filter_range"

This reverts commit ba7a6b314fd14bb2c9ff5d3f4fe2b6525514cada.

Post-commit review showed that the fix implemented wasn't correct, and a
more principled fix is possible.


Volodymyr Sapsai [Tue, 21 Sep 2021 01:59:19 +0000 (18:59 -0700)]

[clang][objc] Speed up populating the global method pool from modules.

For each selector encountered in the source code, we need to load
selectors from the imported modules and check that we are calling a
selector with compatible types.

At the moment, for each module we are storing methods declared in the
headers belonging to this module and methods from the transitive closure
of imported modules. When a module is imported by a few other modules,
methods from the shared module are duplicated in each importer. As a
result, we can end up with lots of identical methods that we try to add
to the global method pool. Doing this duplicate work is useless and
relatively expensive.

Avoid processing duplicate methods by storing in each module only its
own methods and not storing methods from dependencies. Collect methods
from dependencies by walking the graph of module dependencies.
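
As a rough illustration of the approach described above (not the actual Clang implementation), the graph walk can be sketched as follows. The `Module` struct and `collectMethods` function are hypothetical names; each module stores only its own methods, and a visited set ensures a shared dependency is processed once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a module: it stores only the methods declared in
// its own headers, plus the indices of the modules it imports directly.
struct Module {
  std::vector<std::string> ownMethods;
  std::vector<int> imports;
};

// Collect every method visible from `root` by walking the import graph.
// The visited set guarantees that a module shared by several importers is
// processed a single time, so its methods are not added repeatedly.
std::vector<std::string> collectMethods(const std::vector<Module> &modules,
                                        int root) {
  assert(root >= 0 && static_cast<std::size_t>(root) < modules.size());
  std::vector<std::string> result;
  std::unordered_set<int> visited;
  std::vector<int> worklist{root};
  while (!worklist.empty()) {
    int current = worklist.back();
    worklist.pop_back();
    if (!visited.insert(current).second)
      continue; // already handled through another importer
    const Module &m = modules[current];
    result.insert(result.end(), m.ownMethods.begin(), m.ownMethods.end());
    for (int dep : m.imports)
      worklist.push_back(dep);
  }
  return result;
}
```

With a diamond-shaped import graph, the methods of the shared module appear once in the result instead of once per importer.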

The issue was discovered and reported by Richard Howell. He has done the
hard work for this fix as he has investigated and provided a detailed
explanation of the performance problem.

Differential Revision: https://reviews.llvm.org/D110123


Michael Jones [Tue, 2 Nov 2021 21:49:38 +0000 (14:49 -0700)]

[libc][NFC] rename str_conv_utils to str_to_integer

rename str_conv_utils to str_to_integer to be more
in line with str_to_float.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D113061


Jacques Pienaar [Wed, 3 Nov 2021 22:34:13 +0000 (15:34 -0700)]

[mlir] Use _odsPrinter for printer name in generated code

The generated name should not be load-bearing, so this should be an NFC change.

Differential Revision: https://reviews.llvm.org/D113149


Philip Reames [Wed, 3 Nov 2021 22:13:31 +0000 (15:13 -0700)]

Backout must-exit based parts of 3fc9882e, and 412eb0

Not sure these are correct. I think I missed a case when porting this from the original SCEV change to the IndVar changes. I may end up reapplying this later with a comment about how this is correct, but in case the current bad feeling turns out to be true, I'm removing these from the tree while investigating further.


Jonas Devlieghere [Wed, 3 Nov 2021 21:55:28 +0000 (14:55 -0700)]

[lldb] Update tagged pointer command output and test.

- Use formatv to print the addresses.
- Add check for 0x0 which is treated as an invalid address.
- Use an address that's less likely to be interpreted as a real
tagged pointer.


Arthur Eubanks [Wed, 3 Nov 2021 22:00:28 +0000 (15:00 -0700)]

[NFC] Clarify why LinkAll*.h are actually necessary

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D113074


Arthur Eubanks [Tue, 2 Nov 2021 02:49:05 +0000 (19:49 -0700)]

[ArgPromo] Preserve FunctionAnalysisManagerCGSCCProxy

We already make sure to properly clear analyses for deleted functions.

This makes investigating some future potential compile time improvements easier.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D113032


Mogball [Wed, 3 Nov 2021 21:12:41 +0000 (21:12 +0000)]

[mlir] fix Debug unittests

Flag NDEBUG needed to be changed to LLVM_ENABLE_ABI_BREAKING_CHECKS


Philip Reames [Wed, 3 Nov 2021 21:33:08 +0000 (14:33 -0700)]

[tests] Precommit for generalization of D112262


Craig Topper [Wed, 3 Nov 2021 21:11:18 +0000 (14:11 -0700)]

[RISCV] Use HasVInstructions and HasVInstructionsAnyF in more places in TableGen. NFC

Change RISCVSubtarget.hasVInstructionsAnyF() to call hasVInstructionsF32
so that any changes to hasVInstructionsF32 are reflected.

The files were missed in D112496.


Matthias Braun [Tue, 28 Sep 2021 00:57:22 +0000 (17:57 -0700)]

X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr

This is a re-commit of e2c7ee0743592e39274e28dbe0d0c213ba342317 which
was reverted in a2a58d91e82db38fbdf88cc317dcb3753d79d492. This includes
a fix to consistently check for EFLAGS being live-out. See phabricator
review.

Original Summary:

This extends `optimizeCompareInstr` to re-use previous comparison
results if the previous comparison was with an immediate that was 1
bigger or smaller. Example:

    CMP x, 13
    ...
    CMP x, 12   ; can be removed if we change the SETg
    SETg ...    ; x > 12  changed to `SETge` (x >= 13) removing CMP

Motivation: This often happens because SelectionDAG canonicalization
tends to add/subtract 1 often when optimizing for fallthrough blocks.
Example for `x > C` the fallthrough optimization switches true/false
blocks with `!(x > C)` --> `x <= C` and canonicalization turns this into
`x < C + 1`.
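
The arithmetic fact behind the example above can be sketched in plain C++. This only models the integer equivalence, assuming the immediate plus one does not overflow; it says nothing about the EFLAGS liveness handling that the actual patch needs. The function names are illustrative.

```cpp
#include <cassert>
#include <limits>

// Signed "greater than" against an immediate, as SETg would test it.
bool greaterThan(int x, int imm) { return x > imm; }

// The same predicate rewritten against imm + 1, as SETge would test it.
// Only valid when imm + 1 does not overflow.
bool greaterOrEqualNext(int x, int imm) {
  assert(imm != std::numeric_limits<int>::max());
  return x >= imm + 1;
}
```

Because both functions agree for every `x`, a second `CMP x, 12` is redundant once the flags from `CMP x, 13` are available and the condition code is adjusted.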

Differential Revision: https://reviews.llvm.org/D110867


Lang Hames [Wed, 3 Nov 2021 20:42:05 +0000 (13:42 -0700)]

[ORC-RT] Add SPS serialization for span<const char> / SPSSequence<char>.


Philip Reames [Wed, 3 Nov 2021 20:38:09 +0000 (13:38 -0700)]

Revert "[indvars] Move a check slightly earlier [NFC]"

This reverts commit 7ff943a9ed878e3b8ffe162b2af41a81da1a11a2.

This wasn't NFC. isSigned != !isUnsigned as there are also relational operators.
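
A small self-contained model of why the two tests are not interchangeable, assuming the point is that some predicates are neither signed nor unsigned (equality is used here). The reduced `Pred` enum and both helpers are hypothetical stand-ins, not the LLVM API.

```cpp
#include <cassert>

// Hypothetical, reduced set of integer comparison predicates.
enum class Pred { EQ, NE, SGT, SLT, UGT, ULT };

bool isSigned(Pred p) { return p == Pred::SGT || p == Pred::SLT; }
bool isUnsigned(Pred p) { return p == Pred::UGT || p == Pred::ULT; }

// True when replacing `isSigned(p)` with `!isUnsigned(p)` keeps the result.
bool substitutionIsSafe(Pred p) { return isSigned(p) == !isUnsigned(p); }
```

For `Pred::SGT` and `Pred::UGT` the substitution is safe, but for `Pred::EQ` it flips the answer, which is how a change assumed to be NFC can alter behavior.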


River Riddle [Wed, 3 Nov 2021 19:57:36 +0000 (19:57 +0000)]

[mlir] Avoid folding in OpBuilder::tryFold when types change

This was missed when tightening fold restrictions in https://reviews.llvm.org/D95991.

Differential Revision: https://reviews.llvm.org/D113138


Kirill Stoimenov [Wed, 3 Nov 2021 18:39:38 +0000 (18:39 +0000)]

[ASan] Process functions in Asan module pass

This came up as recommendation while reviewing D112098.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D112732


alex-t [Sun, 31 Oct 2021 20:34:03 +0000 (23:34 +0300)]

[AMDGPU] Enable divergence-driven BFE selection

Detailed description: This change enables the bit field extract patterns
selection to s_bfe_u32 or v_bfe_u32 dependent on the pattern root node
divergence.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D110950


Vitaly Buka [Wed, 3 Nov 2021 20:11:05 +0000 (13:11 -0700)]

[asan] Disable test on Android Arm 32bit

Caused by D111703.


Valentin Clement [Wed, 3 Nov 2021 19:44:51 +0000 (20:44 +0100)]

[fir] Use notifyMatchFailure in fir.zero_bits conversion

Change emitOpError to notifyMatchFailure in conversion pattern.

Post-commit change after D113014

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D113091


Martin Storsjö [Thu, 28 Oct 2021 07:57:27 +0000 (10:57 +0300)]

[Support] [Windows] Use RemoveFileOnSignal if unable to use the delete-on-close flag

This takes care of cleaning up the temp files on crashes. It doesn't
handle cleanup when explicitly killed though.

Differential Revision: https://reviews.llvm.org/D112710


Philip Reames [Wed, 3 Nov 2021 19:24:10 +0000 (12:24 -0700)]

[indvars] Move a check slightly earlier [NFC]


Philip Reames [Wed, 3 Nov 2021 19:08:16 +0000 (12:08 -0700)]

[indvars] Rotate zext through icmp to reduce loop varying computation

This change looks for cases where we can prove that an exit test of a loop can be performed in a narrower bitwidth, and that by doing so we can replace a loop-varying extend with a loop-invariant truncate.

The motivation here is that doing this unblocks the trip count analysis for narrow IVs involved in extended compare exit tests. It also has the nice side effect of simply making the code faster, even if we gain no other benefit from the improved analysis ability.
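
A minimal sketch of the idea, assuming an unsigned comparison whose loop-invariant bound is known to fit in the narrow type (one plausible precondition; the exact conditions the pass checks are in the review). The function names and the 8-bit/32-bit widths are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: the narrow induction variable is zero-extended on every
// iteration before being compared with the loop-invariant bound.
bool exitTestWide(std::uint8_t iv, std::uint32_t bound) {
  return static_cast<std::uint32_t>(iv) < bound;
}

// Narrow form: the bound is truncated once, outside the loop. This is only
// equivalent when the bound is known to fit in the narrow type.
bool exitTestNarrow(std::uint8_t iv, std::uint32_t bound) {
  assert(bound <= UINT8_MAX);
  return iv < static_cast<std::uint8_t>(bound);
}
```

The truncate of `bound` does not depend on the induction variable, so it can be computed once in the preheader, while the zero-extend in the wide form has to be redone every iteration.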

I've noted a few places this could be extended, but I think this stands reasonable on its own as well.

Differential Revision: https://reviews.llvm.org/D112262


Vitaly Buka [Wed, 3 Nov 2021 19:02:09 +0000 (12:02 -0700)]

[PassBuilder] Remove unused function after D113072


Keith Smiley [Wed, 3 Nov 2021 18:28:45 +0000 (11:28 -0700)]

[lld-macho] Enable search-paths tests on macOS

I'm not sure what the history is here but this test passes on macOS
today. It seems like we should unify these tests if they need to run
cross platform.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D113085


Vitaly Buka [Wed, 3 Nov 2021 18:56:26 +0000 (11:56 -0700)]

[sanitizer] Disable new test on Android

Test added with D113055


River Riddle [Wed, 3 Nov 2021 18:22:49 +0000 (18:22 +0000)]

[mlir] Move the Operation OperandStorage to the first trailing object

The main benefits of this change are faster access to operands
(no need to compute the offset, as the storage now sits right after
the operation) and simpler code (no need to manage much of the "is
the operand storage trailing" logic we had before). The major
downside is that operand-holding operations now grow in size by
1 word (no matter how we make this change, some additional
bookkeeping is needed).

Differential Revision: https://reviews.llvm.org/D111695


Vitaly Buka [Wed, 3 Nov 2021 00:06:28 +0000 (17:06 -0700)]

[NFC][asan] Use AddressSanitizerOptions in ModuleAddressSanitizerPass

Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D113072


Keith Smiley [Wed, 3 Nov 2021 18:08:57 +0000 (11:08 -0700)]

[lld-macho] Cache discovered framework paths

On our large iOS project this took a link from 1 minute 45 seconds to 45
seconds. For reference ld64 does the same link in ~20 seconds.
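
The general technique is memoization of a lookup that is repeated for the same name. The sketch below is not lld's code: `PathCache` is a hypothetical class, an empty string stands for "not found", and caching negative results is an assumption made here for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical memoizing wrapper around an expensive path search.
class PathCache {
public:
  using Lookup = std::function<std::string(const std::string &)>;
  explicit PathCache(Lookup lookup) : lookup_(std::move(lookup)) {
    assert(lookup_ && "a lookup callback is required");
  }

  // Returns the cached result (an empty string means "not found"), so the
  // underlying search runs at most once per name.
  std::string find(const std::string &name) {
    auto it = cache_.find(name);
    if (it != cache_.end())
      return it->second;
    std::string result = lookup_(name);
    cache_.emplace(name, result);
    return result;
  }

private:
  Lookup lookup_;
  std::unordered_map<std::string, std::string> cache_;
};
```

When many inputs request the same framework, only the first request pays for the search over the configured directories.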

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D113063


Markus Böck [Wed, 3 Nov 2021 18:02:50 +0000 (19:02 +0100)]

[mlir] Change ABI breaking use of NDEBUG to LLVM_ENABLE_ABI_BREAKING_CHECKS in DebugActions.h

A quick grep for NDEBUG in MLIR revealed a use in DebugActions.h that breaks ABI. This patch changes the use of NDEBUG to LLVM_ENABLE_ABI_BREAKING_CHECKS, which has the advantage of being independent of whether clients build their own app in debug or release, as it is purely dependent on how MLIR itself was built.

Differential Revision: https://reviews.llvm.org/D113088
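
To illustrate why guarding a data member with NDEBUG breaks ABI, here is a hedged sketch with made-up names. `MYLIB_ENABLE_ABI_BREAKING_CHECKS` is a hypothetical library-owned macro standing in for the role of LLVM_ENABLE_ABI_BREAKING_CHECKS; the two explicit structs show the layouts a library and a client would disagree on if the guard followed each side's own NDEBUG setting.

```cpp
#include <cassert>
#include <cstddef>

// Layout seen by a translation unit where the checks are compiled out.
struct HandlerWithoutChecks {
  int id;
};

// Layout seen by a translation unit where the checks are compiled in.
struct HandlerWithChecks {
  int id;
  const char *debugTag;
};

// Hypothetical library-owned macro, fixed when the library itself is built.
// Unlike NDEBUG, a client's own debug/release setting does not change it.
#define MYLIB_ENABLE_ABI_BREAKING_CHECKS 1

struct Handler {
  int id;
#if MYLIB_ENABLE_ABI_BREAKING_CHECKS
  const char *debugTag; // present or absent in every translation unit alike
#endif
};

static_assert(sizeof(HandlerWithChecks) > sizeof(HandlerWithoutChecks),
              "the guarded member changes the object size");
```

If the guard were `#ifndef NDEBUG` instead, a release-built client linking against a debug-built library would use the smaller layout while the library used the larger one.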


Kirill Stoimenov [Wed, 3 Nov 2021 17:59:29 +0000 (17:59 +0000)]

Revert "[ASan] Process functions in Asan module pass"

This reverts commit 76ea87b94e5cba335d691e4e18e3464ad45c8b52.

Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D113129


Kirill Stoimenov [Wed, 3 Nov 2021 16:32:44 +0000 (16:32 +0000)]

[ASan] Process functions in Asan module pass

This came up as recommendation while reviewing D112098.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D112732


Sanjay Patel [Wed, 3 Nov 2021 16:53:23 +0000 (12:53 -0400)]

[InstCombine] adjust test for icmp fold; NFC

I missed that the bitwidth changed from the previous test in the sequence.


Tamir Duberstein [Wed, 3 Nov 2021 17:21:43 +0000 (10:21 -0700)]

[sanitizer] Allow getsockname with NULL addrlen

This is already permitted in getpeername, and returns EFAULT
on Linux (does not crash the program).

Fixes https://github.com/google/sanitizers/issues/1451.

Differential Revision: https://reviews.llvm.org/D113055


Fangrui Song [Wed, 3 Nov 2021 17:21:13 +0000 (10:21 -0700)]

[docs] Mention --leading-lines instead of --no-leading-lines


Tamir Duberstein [Wed, 3 Nov 2021 17:16:20 +0000 (10:16 -0700)]

[sanitizer] Mark before deref in PosixSpawnImpl

Read each pointer in the argv and envp arrays before dereferencing
it; this correctly marks an error when these pointers point into
memory that has been freed.
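
One way to picture the ordering, as a sketch only: `markRead` is a hypothetical recorder standing in for the sanitizer's read-range check, and `markStringArray` is an illustrative helper, not the interceptor's actual code. Each array slot is checked before the pointer stored in it is followed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical recorder standing in for the sanitizer's "this range is read"
// check; it just logs (address, size) pairs in call order.
std::vector<std::pair<const void *, std::size_t>> readLog;
void markRead(const void *p, std::size_t n) { readLog.emplace_back(p, n); }

// Walk a NULL-terminated array of C strings. Each array slot is marked as
// read before the pointer stored in it is dereferenced.
void markStringArray(const char *const *arr) {
  assert(arr != nullptr);
  for (const char *const *slot = arr;; ++slot) {
    markRead(slot, sizeof(*slot)); // check the slot itself first
    if (!*slot)
      break;
    markRead(*slot, std::strlen(*slot) + 1); // then the string it points to
  }
}
```

With a real checker in place of the recorder, a freed `argv` array would be reported at the slot check rather than being read silently first.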

Differential Revision: https://reviews.llvm.org/D113046


Keith Smiley [Wed, 3 Nov 2021 16:49:13 +0000 (09:49 -0700)]

[lld-macho] Cache library paths from findLibrary

On top of https://reviews.llvm.org/D113063 this took another 10 seconds
off our overall link time.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D113073


Louis Dionne [Wed, 3 Nov 2021 15:30:12 +0000 (11:30 -0400)]

[libc++] Fix GDB pretty printer tests for older Clangs and GCC

This was missed by https://llvm.org/D111477, which broke the CI.

Differential Revision: https://reviews.llvm.org/D113112


Shivam Gupta [Wed, 3 Nov 2021 16:44:42 +0000 (22:14 +0530)]

[Docs] Document scripts that are used to generate assertions in test cases

This patch documents the llvm/utils/update_* Python scripts that are used to generate
assertions in many of the LLVM regression test cases.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D112936


Harald van Dijk [Wed, 3 Nov 2021 16:43:44 +0000 (16:43 +0000)]

[X86] Fix X32 indirect call generation

The check for whether a zero extension was needed was subtly wrong and
saw a value that was already 64 bits, so did not extend.

Fixes PR52357.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D112860


Sanjay Patel [Wed, 3 Nov 2021 16:13:53 +0000 (12:13 -0400)]

[InstCombine] refactor fold for icmp with trunc op; NFC

There are at least 3 related folds we can add here - see D112634.


Sanjay Patel [Wed, 3 Nov 2021 16:06:32 +0000 (12:06 -0400)]

[InstCombine] add tests for icmp with trunc op; NFC


Roman Lebedev [Wed, 3 Nov 2021 16:40:23 +0000 (19:40 +0300)]

[NFC] Add forgotten `REQUIRES: asserts` into the new costmodel test


Roman Lebedev [Wed, 3 Nov 2021 16:23:25 +0000 (19:23 +0300)]

[PassManager] `buildModuleOptimizationPipeline()`: schedule `LoopDeletion` pass run before vectorization passes

Test thanks to Michael Kuklinski from `#llvm`: https://godbolt.org/z/bdrah5Goo
originally inspired by Daniel Lemire's https://lemire.me/blog/2021/10/26/in-c-is-empty-faster-than-comparing-the-size-with-zero/

We manage to deduce that the answer does not require looping,
but we do that after the last `LoopDeletion` pass run,
so we end up being stuck with a dead loop.

Now, as with all things SCEV, this has
a very expected ~`+0.12%` compile time performance regression:
https://llvm-compile-time-tracker.com/compare.php?from=0ae7bf124a9bca76dd9a91b2f7379168ff13f562&to=c2ae57c9b961aeb4a28c747266949340613a6d84&stat=instructions
(for comparison, doing that in function simplification pipeline
would have been ~`+0.5%` compile time performance regression, D112840)

Looking at the transformation stats over vanilla test-suite, i think it's rather expected:
```
| statistic name                                   |  baseline |  proposed |     Δ |      % | abs(%) |
|--------------------------------------------------|----------:|----------:|------:|-------:|-------:|
| scalar-evolution.NumBruteForceTripCountsComputed |       789 |       888 |    99 | 12.55% | 12.55% |
| scalar-evolution.NumTripCountsNotComputed        |    105592 |    117900 | 12308 | 11.66% | 11.66% |
| loop-delete.NumBackedgesBroken                   |       542 |       559 |    17 |  3.14% |  3.14% |
| regalloc.numExtends                              |        81 |        79 |    -2 | -2.47% |  2.47% |
| indvars.NumFoldedUser                            |       408 |       400 |    -8 | -1.96% |  1.96% |
| indvars.NumElimCmp                               |      3831 |      3758 |   -73 | -1.91% |  1.91% |
| scalar-evolution.NumTripCountsComputed           |    299759 |    304278 |  4519 |  1.51% |  1.51% |
| loop-delete.NumDeleted                           |      8055 |      8128 |    73 |  0.91% |  0.91% |
| machine-cse.NumCommutes                          |       111 |       110 |    -1 | -0.90% |  0.90% |
| globaldce.NumFunctions                           |      1187 |      1192 |     5 |  0.42% |  0.42% |
| codegenprepare.NumSelectsExpanded                |       277 |       278 |     1 |  0.36% |  0.36% |
| loop-unroll.NumRuntimeUnrolled                   |     13841 |     13791 |   -50 | -0.36% |  0.36% |
| machinelicm.NumPostRAHoisted                     |      1168 |      1172 |     4 |  0.34% |  0.34% |
| phi-node-elimination.NumCriticalEdgesSplit       |     83054 |     82879 |  -175 | -0.21% |  0.21% |
| machine-cse.NumPREs                              |      3085 |      3079 |    -6 | -0.19% |  0.19% |
| branch-folder.NumBranchOpts                      |    108122 |    107942 |  -180 | -0.17% |  0.17% |
| loop-unroll.NumUnrolled                          |     40136 |     40067 |   -69 | -0.17% |  0.17% |
| branch-folder.NumDeadBlocks                      |    130818 |    130607 |  -211 | -0.16% |  0.16% |
| codegenprepare.NumBlocksElim                     |     92856 |     92714 |  -142 | -0.15% |  0.15% |
| instsimplify.NumSimplified                       |    103263 |    103129 |  -134 | -0.13% |  0.13% |
| instcombine.NumConstProp                         |     26070 |     26102 |    32 |  0.12% |  0.12% |
| instsimplify.NumExpand                           |      1716 |      1718 |     2 |  0.12% |  0.12% |
| loop-unroll.NumCompletelyUnrolled                |      9236 |      9225 |   -11 | -0.12% |  0.12% |
| branch-folder.NumHoist                           |      2773 |      2770 |    -3 | -0.11% |  0.11% |
| regalloc.NumReloadsRemoved                       |     10822 |     10834 |    12 |  0.11% |  0.11% |
| regalloc.NumSnippets                             |     11394 |     11406 |    12 |  0.11% |  0.11% |
| machine-cse.NumCrossBBCSEs                       |      1052 |      1053 |     1 |  0.10% |  0.10% |
| machinelicm.NumCSEed                             |     99887 |     99784 |  -103 | -0.10% |  0.10% |
| branch-folder.NumTailMerge                       |     72501 |     72435 |   -66 | -0.09% |  0.09% |
| codegenprepare.NumExtUses                        |     22007 |     21987 |   -20 | -0.09% |  0.09% |
| local.NumRemoved                                 |     68232 |     68294 |    62 |  0.09% |  0.09% |
| loop-vectorize.LoopsAnalyzed                     |     75483 |     75413 |   -70 | -0.09% |  0.09% |
```

Note that i'm only changing current PM, and not touching obsolete PM.

This is an alternative to the function simplification pipeline variant
of the same change, D112840. It has both less compile time impact
(since the additional number of SCEV trip count calculations
is far lower than with D112840), and it is
much more powerful/impactful (almost 2x more loops deleted).

I have checked, and doing this after loop rotation
is favorable (more loops deleted).

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D112851


Kazu Hirata [Wed, 3 Nov 2021 16:22:50 +0000 (09:22 -0700)]

[AArch64, AMDGPU] Use make_early_inc_range (NFC)


Roman Lebedev [Wed, 3 Nov 2021 16:15:05 +0000 (19:15 +0300)]

[NFC] Rewrite runlines in interleaved-store-accesses-with-gaps.ll once again

https://lab.llvm.org/buildbot/#/builders/98/builds/8198 is still failing,
and i really don't understand how runlines in this test differ
from the ones in other nearby tests...


Hans Wennborg [Wed, 3 Nov 2021 15:54:28 +0000 (16:54 +0100)]

Revert "X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr"

This caused miscompiles of switches, see comments on the code review.

> This extends `optimizeCompareInstr` to re-use previous comparison
> results if the previous comparison was with an immediate that was 1
> bigger or smaller. Example:
>
>     CMP x, 13
>     ...
>     CMP x, 12   ; can be removed if we change the SETg
>     SETg ...    ; x > 12  changed to `SETge` (x >= 13) removing CMP
>
> Motivation: This often happens because SelectionDAG canonicalization
> tends to add/subtract 1 often when optimizing for fallthrough blocks.
> Example for `x > C` the fallthrough optimization switches true/false
> blocks with `!(x > C)` --> `x <= C` and canonicalization turns this into
> `x < C + 1`.
>
> Differential Revision: https://reviews.llvm.org/D110867

This reverts commit e2c7ee0743592e39274e28dbe0d0c213ba342317.


Roman Lebedev [Wed, 3 Nov 2021 15:14:35 +0000 (18:14 +0300)]

[X86] `X86TTIImpl::getInterleavedMemoryOpCostAVX512()`: fallback to scalarization cost computation for mask

I don't really buy that masked interleaved memory loads/stores are supported on X86.
There is zero costmodel test coverage, no actual cost modelling for the generation
of the mask repetition, and basically only two LV tests.
Additionally, i'm not very interested in AVX512.

I don't know if this really helps "soft" block over at
https://reviews.llvm.org/D111460#inline-1075467,
but i think it can't make things worse at least.

When we are being told that there is a masking, instead of
completely giving up and falling back to
fully scalarizing `BasicTTIImplBase::getInterleavedMemoryOpCost()`,
let's correctly query the cost of masked memory ops,
keep all the pretty shuffle cost modelling,
but scalarize the cost computation for the mask replication.

I think, not scalarizing the shuffles themselves
may adjust the computed costs a bit,
and maybe hopefully just enough to hide the "regressions"
at https://reviews.llvm.org/D111460#inline-1075467
I do mean hide, because the test coverage is non-existent.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D112873


Roman Lebedev [Wed, 3 Nov 2021 15:12:12 +0000 (18:12 +0300)]

[NFC] Use single-dash-prefixed options in newly-added test

https://lab.llvm.org/buildbot/#/builders/98/builds/8195 complains,
and this is the only guess i have.


Clement Courbet [Wed, 3 Nov 2021 14:43:04 +0000 (15:43 +0100)]

[Sema][NFC] Improve test coverage for builtin binary operators.

In preparation for D112453.


Erich Keane [Wed, 3 Nov 2021 14:42:00 +0000 (07:42 -0700)]

Update ast-dump-decl.mm test to work on 32 bit windows

Windows member functions have __attribute__((thiscall)) on their type,
so this test fails on any 32 bit Windows machine. Add a wildcard, plus
an additional run line to explain why.


Roman Lebedev [Wed, 3 Nov 2021 14:33:28 +0000 (17:33 +0300)]

[BasicTTI] getInterleavedMemoryOpCost(): discount unused members of mask if mask for gap will be used

As it can be seen in `InnerLoopVectorizer::vectorizeInterleaveGroup()`,
in some cases (reported by `UseMaskForGaps`), the gaps in the interleaved load/store group
will be masked away by another constant mask, so there is no need to
account for the cost of replication of the mask for these.

Differential Revision: https://reviews.llvm.org/D112877


Roman Lebedev [Wed, 3 Nov 2021 14:13:17 +0000 (17:13 +0300)]

[NFC][X86] Duplicate LV test into a costmodel test

Copied from llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
As discussed in D111460 / D112877 / D112873 we have basically no test coverage
for this part of cost model.


Erich Keane [Wed, 3 Nov 2021 14:13:02 +0000 (07:13 -0700)]

Revert part of D112349 to allow ifunc resolvers to be declarations.

The patch in D112349 added a previously nonexistent restriction on ifunc
resolvers that they MUST be definitions. However, function
multiversioning depends on being able to resolve these resolvers at
link-time, so this additional restriction was breaking.


David Sherwood [Wed, 3 Nov 2021 13:37:30 +0000 (13:37 +0000)]

[NFC][LoopVectorize] Simple tidy-up in InnerLoopVectorizer::createVectorIntOrFpInductionPHI

Use getSignedIntOrFpConstant instead of creating int or FP constants
manually.


David Spickett [Wed, 3 Nov 2021 13:32:34 +0000 (13:32 +0000)]

Reland "[lldb] Remove non address bits when looking up memory regions"

This reverts commit 5fbcf677347e38718461496d9e9e184a7a30c3fb.

ProcessDebugger is used in ProcessWindows and NativeProcessWindows.
I thought I was simplifying things by renaming to DoGetMemoryRegionInfo
in ProcessDebugger but the Native process side expects "GetMemoryRegionInfo".

Follow the pattern that WriteMemory uses. So:
* ProcessWindows::DoGetMemoryRegionInfo calls ProcessDebugger::GetMemoryRegionInfo
* NativeProcessWindows::GetMemoryRegionInfo does the same


Peter Waller [Wed, 3 Nov 2021 13:40:22 +0000 (13:40 +0000)]

Reland "[AArch64][SVE][InstCombine] Combine contiguous gather/scatter to load/store"

This reverts commit 753eba64213ef20195644994df53d564f30eb65f.

Contiguous gather => masked load:

  (sve.ld1.gather.index Mask BasePtr (sve.index IndexBase 1))
  => (masked.load (gep BasePtr IndexBase) Align Mask undef)

Contiguous scatter => masked store:

  (sve.st1.scatter.index Value Mask BasePtr (sve.index IndexBase 1))
  => (masked.store Value (gep BasePtr IndexBase) Align Mask)

Tests with <vscale x 2 x double>:

[Gather, Scatter] for each [Positive test (index=1), Negative test
(index=2), Alignment propagation].
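
The gather rewrite above can be modelled with scalar C++, as a sketch rather than the InstCombine code. All three helpers are hypothetical; inactive lanes are filled with 0.0 here purely so results can be compared, whereas the real pattern leaves them undefined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of sve.index(indexBase, step): indexBase, indexBase + step, ...
std::vector<long> indexVector(long indexBase, long step, std::size_t lanes) {
  std::vector<long> out(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    out[i] = indexBase + static_cast<long>(i) * step;
  return out;
}

// Model of a gather: active lane i loads base[indices[i]].
std::vector<double> gatherIndex(const std::vector<bool> &mask,
                                const double *base,
                                const std::vector<long> &indices) {
  assert(mask.size() == indices.size());
  std::vector<double> out(mask.size(), 0.0);
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      out[i] = base[indices[i]];
  return out;
}

// Model of a masked contiguous load: active lane i loads ptr[i].
std::vector<double> maskedLoad(const double *ptr,
                               const std::vector<bool> &mask) {
  std::vector<double> out(mask.size(), 0.0);
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      out[i] = ptr[i];
  return out;
}
```

With a step of 1 the gather reads the same elements as a masked load from `base + indexBase`; with a step of 2 it does not, which matches the positive and negative tests listed above.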

Differential Revision: https://reviews.llvm.org/D112076


Peter Waller [Wed, 3 Nov 2021 13:39:38 +0000 (13:39 +0000)]

Revert "[AArch64][SVE][InstCombine] Combine contiguous gather/scatter to load/store"

This reverts commit 1febf42f03f664ec84aedf0ece3b29f92b10dce9, which has
a use-of-uninitialized-memory bug.

See: https://reviews.llvm.org/D112076


David Spickett [Wed, 3 Nov 2021 13:27:41 +0000 (13:27 +0000)]

Revert "[lldb] Remove non address bits when looking up memory regions"

This reverts commit 6f5ce43b433706c3ae5c37022d6c0964b6bfadf8 due to
build failure on Windows.


Florian Hahn [Wed, 3 Nov 2021 13:26:15 +0000 (14:26 +0100)]

[LV] Drop unneeded use of getVPSingleValue (NFC).

VPReductionPHIRecipe inherits from VPValue, so there's no need to call
getVPSingleValue.


Konstantin Boyarinov [Wed, 3 Nov 2021 13:08:27 +0000 (16:08 +0300)]

[libcxx][test][NFC] More tests for containers comparisons

Add more missing tests for comparisons to improve code coverage (follow-up for D111738)

Reviewed By: ldionne, rarutyun, #libc

Differential Revision: https://reviews.llvm.org/D112424


Sanjay Patel [Wed, 3 Nov 2021 12:55:50 +0000 (08:55 -0400)]

[PhaseOrdering] add tests for x86 abs/max using SSE intrinsics (PR34047); NFC

D113035


Florian Hahn [Wed, 3 Nov 2021 13:11:01 +0000 (14:11 +0100)]

[VPlan] Make VPWidenCanonicalIVRecipe a VPValue (NFC).

The recipe produces exactly one VPValue and can inherit directly from
it. This is in line with other recipes and avoids having to use
getVPSingleValue.


Andrew Savonichev [Wed, 3 Nov 2021 12:48:04 +0000 (15:48 +0300)]

[NVPTX] Mark special registers as reserved

A reserved register:
- is not allocatable
- is considered always live
- is ignored by liveness tracking

NVPTX special registers match these criteria, and marking them as
reserved helps to avoid a machine verifier error:

    *** Bad machine code: Using an undefined physical register ***
    - function:    foo
    - basic block: %bb.0  (0x557bb178b708)
    - instruction: %0:int32regs = MOV_SPECIAL $envreg0
    - operand 1:   $envreg0

Differential Revision: https://reviews.llvm.org/D113008


Clement Courbet [Wed, 3 Nov 2021 09:44:21 +0000 (10:44 +0100)]

[Sema][NFC] Improve test coverage for builtin operators.

In preparation for D112453.


Pavel Labath [Wed, 3 Nov 2021 11:59:51 +0000 (12:59 +0100)]

[lldb] Remove ConstString from plugin names in PluginManager innards

This completes de-constification of plugin names.


Cullen Rhodes [Wed, 3 Nov 2021 08:04:28 +0000 (08:04 +0000)]

[TableGen] Emit a warning for unused template args

Add a warning to TableGen for unused template arguments in classes and
multiclasses, for example:

  multiclass Foo<int x> {
    def bar;
  }

  $ llvm-tblgen foo.td

  foo.td:1:20: warning: unused template argument: Foo::x
  multiclass Foo<int x> {
                     ^
A flag '--no-warn-on-unused-template-args' is added to disable the
warning. The warning is disabled for LLVM and sub-projects if
'LLVM_ENABLE_WARNINGS=OFF'.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D109359


Cullen Rhodes [Wed, 3 Nov 2021 11:49:58 +0000 (11:49 +0000)]

[mlir][nvvm] NFC: Fix unused template arg tablegen warning

Identified in D109359.


Butygin [Thu, 28 Oct 2021 16:04:35 +0000 (19:04 +0300)]

[mlir] spirv: Add some atomic ops

Differential Revision: https://reviews.llvm.org/D112812


Andrew Savonichev [Wed, 3 Nov 2021 11:42:32 +0000 (14:42 +0300)]

[NVPTX] Add MoveParam instruction for TargetExternalSymbol operand

TargetExternalSymbol is considered to be an immediate and not a
register, so the machine verifier emits an error:

    *** Bad machine code: Expected a register operand. ***
    - function:    static_offset
    - basic block: %bb.0 bb (0x560e9b306028)
    - instruction: %3:int64regs = MoveParamI64 &static_offset_param_1
    - operand 1:   &static_offset_param_1

The patch adds variants of this instruction with an immediate operand
for byval arguments on 64-bit and 32-bit targets.

Differential Revision: https://reviews.llvm.org/D113006


David Green [Wed, 3 Nov 2021 11:41:06 +0000 (11:41 +0000)]

[ARM] Treat MVE gather add-like-or's like adds

LLVM has the habit of turning adds with no common bits set into ors,
which means we need to detect them and treat them like adds again in the
MVE gather/scatter lowering pass.
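
The identity being relied on is that an `or` of two values with no common set bits equals their sum, since no carry can occur. A minimal sketch, with an illustrative helper name:

```cpp
#include <cassert>
#include <cstdint>

// True when `a | b` computes the same value as `a + b`: with no bit set in
// both operands, the addition never produces a carry.
bool isAddLikeOr(std::uint32_t a, std::uint32_t b) { return (a & b) == 0; }

// Treat the `or` as an add once the disjointness check has passed.
std::uint32_t orAsAdd(std::uint32_t a, std::uint32_t b) {
  assert(isAddLikeOr(a, b));
  return a + b;
}
```

For example `0xF0 | 0x0F` and `0xF0 + 0x0F` are both `0xFF`, while `3 | 1` is 3 but `3 + 1` is 4, so the second pair must not be treated as an add.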

Differential Revision: https://reviews.llvm.org/D112922


David Spickett [Wed, 3 Nov 2021 10:14:13 +0000 (10:14 +0000)]

[lldb] Remove non address bits when looking up memory regions

On AArch64 we have various things using the non address bits
of pointers. This means when you look up their containing region
you won't find it if you don't remove them.

This changes Process GetMemoryRegionInfo to a non virtual method
that uses the current ABI plugin to remove those bits. Then it
calls DoGetMemoryRegionInfo.

That function does the actual work and is virtual to be overridden
by Process implementations.

A test case is added that runs on AArch64 Linux using the top
byte ignore feature.
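
The structure described above is the non-virtual interface pattern. The sketch below is heavily simplified and is not the lldb API: the real methods take and return richer types, `FixAddress` is a hypothetical stand-in for the ABI plugin, and the top-byte mask is an assumption modelled on top byte ignore.

```cpp
#include <cassert>
#include <cstdint>

class Process {
public:
  virtual ~Process() = default;

  // Non-virtual entry point: strip non-address bits, then dispatch.
  bool GetMemoryRegionInfo(std::uint64_t addr) {
    return DoGetMemoryRegionInfo(FixAddress(addr));
  }

protected:
  // Does the actual work; overridden by each Process implementation.
  virtual bool DoGetMemoryRegionInfo(std::uint64_t addr) = 0;

private:
  // Hypothetical fix-up: clear the top byte of the pointer.
  static std::uint64_t FixAddress(std::uint64_t addr) {
    return addr & 0x00FFFFFFFFFFFFFFULL;
  }
};

// Hypothetical implementation with a single mapped region [0x1000, 0x2000).
class FakeProcess : public Process {
protected:
  bool DoGetMemoryRegionInfo(std::uint64_t addr) override {
    return addr >= 0x1000 && addr < 0x2000;
  }
};
```

Because the fix-up lives in the non-virtual method, every implementation gets it without having to remember to strip the bits itself.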

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D102757


Peter Waller [Mon, 25 Oct 2021 12:54:33 +0000 (12:54 +0000)]

[AArch64][SVE][InstCombine] Combine contiguous gather/scatter to load/store

Contiguous gather => masked load:

  (sve.ld1.gather.index Mask BasePtr (sve.index IndexBase 1))
  => (masked.load (gep BasePtr IndexBase) Align Mask undef)

Contiguous scatter => masked store:

  (sve.st1.scatter.index Value Mask BasePtr (sve.index IndexBase 1))
  => (masked.store Value (gep BasePtr IndexBase) Align Mask)

Tests with <vscale x 2 x double>:

[Gather, Scatter] for each [Positive test (index=1), Negative test (index=2), Alignment propagation].
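
As a rough model of why the index=1 case is contiguous while index=2 is not, the following Python sketch treats memory as a list of elements and uses `None` for inactive lanes (standing in for undef). All function names are hypothetical; this is not the InstCombine implementation.

```python
def index_vector(index_base, step, lanes):
    """Model of sve.index: index_base, index_base + step, index_base + 2*step, ..."""
    return [index_base + step * lane for lane in range(lanes)]


def gather_index(mask, memory, base, indices):
    """Model of a gather: each active lane loads memory[base + index]."""
    return [memory[base + i] if m else None for m, i in zip(mask, indices)]


def masked_load(mask, memory, start):
    """Model of a masked load: active lanes load consecutive elements from start."""
    return [memory[start + lane] if m else None for lane, m in enumerate(mask)]


memory = [float(i) for i in range(16)]
mask = [True, False, True, True]
base, index_base = 2, 3

# Step 1: the gather reads consecutive elements, so a masked load is equivalent.
contiguous = gather_index(mask, memory, base, index_vector(index_base, 1, 4))
assert contiguous == masked_load(mask, memory, base + index_base)

# Step 2: the gather skips elements, so the rewrite does not apply.
strided = gather_index(mask, memory, base, index_vector(index_base, 2, 4))
assert strided != masked_load(mask, memory, base + index_base)
```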

Differential Revision: https://reviews.llvm.org/D112076

commit | commitdiff | tree

David Green [Wed, 3 Nov 2021 11:00:05 +0000 (11:00 +0000)]

[ARM] Push gather/scatter shl index updates out of loops

This teaches the MVE gather/scatter lowering pass that SHL is
essentially the same as Mul, so that we can optimize the induction
of a gather/scatter address by pushing the index updates out of loops.
https://alive2.llvm.org/ce/z/wG4VyT
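
The arithmetic behind this can be checked with a small Python sketch that models 32-bit wrapping integers. It is an illustration only, not the pass itself; `shl32`, `mul32` and `add32` are hypothetical helpers.

```python
MASK32 = 0xFFFFFFFF


def shl32(x: int, k: int) -> int:
    """32-bit wrapping left shift."""
    return (x << k) & MASK32


def mul32(x: int, y: int) -> int:
    """32-bit wrapping multiply."""
    return (x * y) & MASK32


def add32(x: int, y: int) -> int:
    """32-bit wrapping add."""
    return (x + y) & MASK32


# A shift by k is a multiply by 2**k, even when the result wraps.
for x in (0, 1, 7, 0x7FFFFFFF, 0xFFFFFFFF):
    for k in range(32):
        assert shl32(x, k) == mul32(x, 1 << k)

# Shifting distributes over the induction step, so the shift can be applied
# once to the start value and once to the step, outside the loop.
start, step, k = 5, 4, 2
index, shifted = start, shl32(start, k)
for _ in range(8):
    assert shl32(index, k) == shifted
    index = add32(index, step)
    shifted = add32(shifted, shl32(step, k))
```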

Differential Revision: https://reviews.llvm.org/D112920

commit | commitdiff | tree

David Spickett [Tue, 5 Oct 2021 09:13:35 +0000 (10:13 +0100)]

[libcxx][utils] Note read only mount and ptrace permission in container script

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D110938

commit | commitdiff | tree

Qiu Chaofan [Wed, 3 Nov 2021 09:57:25 +0000 (17:57 +0800)]

[PowerPC] Implement longdouble pack/unpack builtins

Implement two builtins to pack/unpack IBM extended long double float,
according to GCC 'Basic PowerPC Builtin Functions Available ISA 2.05'.
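
For readers unfamiliar with the format: an IBM extended long double is a pair of doubles whose sum is the represented value. The Python sketch below models packing and unpacking such a pair. The function names are hypothetical stand-ins, and the assumption that index 0 selects the first component is mine, not taken from the patch.

```python
from fractions import Fraction


def pack_longdouble(first: float, second: float) -> tuple:
    """Model a double-double value as a pair of doubles."""
    return (first, second)


def unpack_longdouble(value: tuple, index: int) -> float:
    """Return one component of the pair; index must be 0 or 1."""
    if index not in (0, 1):
        raise ValueError("index must be 0 or 1")
    return value[index]


def exact_value(value: tuple) -> Fraction:
    """The represented value is the exact sum of both components."""
    return Fraction(value[0]) + Fraction(value[1])


x = pack_longdouble(1.0, 2.0 ** -60)
assert unpack_longdouble(x, 0) == 1.0
assert unpack_longdouble(x, 1) == 2.0 ** -60

# A single double cannot hold 1 + 2**-60, but the pair can.
assert 1.0 + 2.0 ** -60 == 1.0
assert exact_value(x) > 1
```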

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D112055

commit | commitdiff | tree

David Sherwood [Wed, 27 Oct 2021 13:21:54 +0000 (14:21 +0100)]

[NFC][LoopVectorize] Add test for tail-folding loop with conditional uniform load

I've added a test for a loop containing a conditional uniform load for
a target that supports masked loads. The test just ensures that we
correctly use gather instructions and have the correct mask.

Differential Revision: https://reviews.llvm.org/D112619

commit | commitdiff | tree

Alex Zinenko [Tue, 2 Nov 2021 15:44:37 +0000 (16:44 +0100)]

[mlir][python] expose the shape property of shaped types

This was missing from the original definition of shaped types.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113025

commit | commitdiff | tree

Alex Zinenko [Tue, 2 Nov 2021 13:15:25 +0000 (14:15 +0100)]

[mlir][python] improve usability of Python affine construct bindings

- Provide the operator overloads for constructing (semi-)affine expressions in
Python by combining existing expressions with constants.
- Make AffineExpr, AffineMap and IntegerSet hashable in Python.
- Expose the AffineExpr composition functionality.

Reviewed By: gysit, aoyal

Differential Revision: https://reviews.llvm.org/D113010

commit | commitdiff | tree

rkayaith [Tue, 2 Nov 2021 16:04:42 +0000 (17:04 +0100)]

[mlir][python] Make Operation and Value hashable

This allows operations and values to be used as dict keys.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D112669

commit | commitdiff | tree

Andrew Savonichev [Wed, 3 Nov 2021 09:38:06 +0000 (12:38 +0300)]

[NVPTX] Copy machine operand flags in TII::insertBranch

Before this patch, flags such as undef were dropped by TII::insertBranch
(used by BranchFolding pass), resulting in the following error from
machine verifier:

    *** Bad machine code: Reading virtual register without a def ***
    - function:    hoge
    - basic block: %bb.0 bb (0x562e9c240e68)
    - instruction: CBranch %2:int1regs, %bb.3
    - operand 0:   %2:int1regs

Differential Revision: https://reviews.llvm.org/D113001

commit | commitdiff | tree

Yi Kong [Wed, 3 Nov 2021 09:18:04 +0000 (17:18 +0800)]

[ARM][AsmParser] Don't emit "deprecated instruction in IT block" warning if requested

Also fixed formatting in AsmMatcherEmitter because it was confusing.

Differential Revision: https://reviews.llvm.org/D112993

commit | commitdiff | tree

Valentin Clement [Wed, 3 Nov 2021 09:13:35 +0000 (10:13 +0100)]

[fir] Add substr information to fircg.ext_embox and fircg.ext_rebox operations

This patch adds the substring information to the fircg.ext_embox and
fircg.ext_rebox operations.

Substring is used for CHARACTER types.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D112807

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>

commit | commitdiff | tree

Andrew Savonichev [Wed, 3 Nov 2021 09:08:39 +0000 (12:08 +0300)]

[X86][clang] Disable long double type for -mno-x87 option

This patch attempts to fix a compiler crash that occurs when long
double type is used with -mno-x87 compiler option.

The option disables x87 target feature, which in turn disables x87
registers, so CG cannot select them for x86_fp80 LLVM IR type. Long
double is lowered as x86_fp80 for some targets, so it leads to a
crash.

The option seems to contradict the SystemV ABI, which requires long
double to be represented as an 80-bit floating point value, and which
also requires the use of x87 registers.

To avoid that, `long double` type is disabled when -mno-x87 option is
set. In addition to that, `float` and `double` also use x87 registers
for return values on 32-bit x86, so they are disabled as well.

Differential Revision: https://reviews.llvm.org/D98895

commit | commitdiff | tree

Kazushi (Jam) Marukawa [Wed, 3 Nov 2021 06:04:39 +0000 (15:04 +0900)]

[VE] Change to omitting the frame pointer on leaf functions

Change to omitting the frame pointer on leaf functions by default for VE.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D113087

commit | commitdiff | tree

Piotr Sobczak [Fri, 22 Oct 2021 15:11:13 +0000 (17:11 +0200)]

[InstCombine] Extend pattern to replace shuffle's insertelement operand

In D71220 a pattern was added to replace shuffle's insertelement operand
if inserted scalar is not demanded. The pattern was added only for
the case where the shuffle's mask size is equal to element's vector size.
However, that condition is not required because the pattern does not
change the shuffle vector size.

This patch extends the pattern to also include cases where shuffle's mask
size is not equal to element's vector size.

Differential Revision: https://reviews.llvm.org/D112318

commit | commitdiff | tree

Nicolas Vasilache [Mon, 1 Nov 2021 10:40:56 +0000 (10:40 +0000)]

[mlir][Linalg] Refactor vectorization of conv1d more aggressively.

This better decouples transfer read/write from vector-only rewrite of conv.
This form is close to ready to plop into a new vector.conv op, with the vector.transfer operations to be generalized as part of generic vectorization once the properties of ConvolutionOpInterface are inferred from the indexing maps.

This also results in a nice perf boost in the dw == 1 cases.

Differential revision: https://reviews.llvm.org/D112822

commit | commitdiff | tree

Nicolas Vasilache [Fri, 29 Oct 2021 10:17:24 +0000 (10:17 +0000)]

[mlir][Linalg] Refactor conv vectorization to decouple memory from vector ops.

This refactoring prepares conv1d vectorization for a future integration into
the generic codegen path.
Once transfer_read / transfer_write vectorization also supports sliding windows,
the special pattern for conv can disappear.
This will also likely need a vector.conv operation.

Differential Revision: https://reviews.llvm.org/D112797

commit | commitdiff | tree

Fangrui Song [Wed, 3 Nov 2021 07:56:09 +0000 (00:56 -0700)]

Revert "[ELF] Try appeasing --target=armv7-linux-androideabi24 sanitizer symbolization tests"

This reverts commit 5cbec88cbf1c8d06030b84ebf17f5eebc3e3f1f9.

Vitaly said that 2faac77f26dee2a1367f373180573ea9c56efed1 actually works.

Sanitizer's armv7-linux-androideabi24 configuration has other issues which haven't been identified yet, but that's unrelated to the empty symbol name issue.

commit | commitdiff | tree

Markus Böck [Wed, 3 Nov 2021 07:54:47 +0000 (08:54 +0100)]

[mlir] Fix typos in comments in DebugAction.h

commit | commitdiff | tree

Ben Shi [Wed, 3 Nov 2021 06:15:21 +0000 (14:15 +0800)]

Revert "[AArch64] Optimize add/sub with immediate"

This reverts commit 3de3ca3137bec5115cd10c53f4059f9bf1054e96.

commit | commitdiff | tree

Chen Zheng [Wed, 3 Nov 2021 05:17:41 +0000 (05:17 +0000)]

[PowerPC] handle more splat loads without stack operation

This mostly improves splat load code generation on Power7.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D106555

commit | commitdiff | tree

Johannes Doerfert [Sun, 31 Oct 2021 19:22:50 +0000 (14:22 -0500)]

[OpenMP][FIX] Do not signal SPMD-mode but then keep generic-mode

If we assume SPMD-mode during the fixpoint iteration we have to execute
the kernel in SPMD-mode. If we change our mind during manifest there is
the chance of a mismatch between the simplification, e.g., of
`__kmpc_is_spmd_exec_mode` calls, and the execution mode. This problem
was introduced in D109438.

This patch is a compromise to resolve the problem purely in OpenMP-opt
while trying to keep the benefits of D109438 around. This might not
always work, see `get_hardware_num_threads_in_block_fold` but it often
does. At the same time we do keep value specialization and execution
mode in sync.

Proper solutions to this problem should be considered. I believe a new
execution mode is the easiest way forward (Singleton-SPMD).
Alternatively, SPMD-mode execution can be used with a way to provide a
new thread_limit (here 1) to the runtime. This is more general and could
be useful if we see `num_threads` clauses or workshared loops with small
trip counts in the kernel. In either proposal we need to disable the
guarding for the kernel (which was the motivation for D109438).

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D112894

commit | commitdiff | tree

Johannes Doerfert [Sun, 31 Oct 2021 17:23:45 +0000 (12:23 -0500)]

[OpenMP][FIX] Introduce and use a simple generic-mode barrier

Before we had aligned barriers the `__kmpc_barrier_simple_spmd` was
OK to be used in the custom state machine. Now that SPMD barriers are
assumed to be aligned we need to use a "generic" barrier in places
that are not aligned.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D112893

commit | commitdiff | tree

Johannes Doerfert [Fri, 17 Sep 2021 18:23:42 +0000 (13:23 -0500)]

[NVVM] Update intrinsic definitions to include more attributes

A lot of NVVM intrinsics can use the default intrinsic attributes (e.g.,
nosync, nofree, ...) as well as `speculatable`. The latter is important
if we want to recompute intrinsics results instead of communicating them
via memory.

I did use default attributes for almost all `readnone` attributes but
speculatable only where I had reasonable confidence they cannot
experience UB. That said, someone should double check.

TODO: There seem to be various intrinsics marked `Commutative` which
should not be, e.g., fma and div.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D109987

commit | commitdiff | tree

Johannes Doerfert [Fri, 29 Oct 2021 22:34:51 +0000 (17:34 -0500)]

[OpenMP][FIX] Ensure guarding uses proper global name

Global symbols cannot have arbitrary names, so we need to sanitize the
string first. Also remove an assertion that is neither necessary nor
true in general.

Reviewed By: ggeorgakoudis

Differential Revision: https://reviews.llvm.org/D112892

commit | commitdiff | tree

Johannes Doerfert [Sat, 30 Oct 2021 19:24:25 +0000 (14:24 -0500)]

[OpenMP][FIX] Avoid a race between initialization and first state reads

When we pick thread 0 to initialize state but thread N is going to be the
"main thread", in generic mode, we would require extra synchronization.
Instead, we should pick the main thread to initialize state in generic
mode and any thread in SPMD mode.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D112874

commit | commitdiff | tree

Abinav Puthan Purayil [Tue, 2 Nov 2021 07:33:42 +0000 (13:03 +0530)]

[AMDGPU] Fix SGPR checks in S_MOV_B64_IMM_PSEUDO generation.

The function to generate S_MOV_B64_IMM_PSEUDO was recently modified to
optimize AGPR-to-AGPR copies, but that change missed the check for SGPR
clobbering when generating S_MOV_B64_IMM_PSEUDO.

Differential Revision: https://reviews.llvm.org/D113005

commit | commitdiff | tree

Ben Shi [Mon, 1 Nov 2021 09:47:44 +0000 (09:47 +0000)]

[AArch64] Optimize add/sub with immediate

Optimize ([add|sub] r, imm) -> ([ADD|SUB] ([ADD|SUB] r, #imm0, lsl #12), #imm1),
if imm == (imm0<<12)+imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.

Optimize ([add|sub] r, imm) -> ([SUB|ADD] ([SUB|ADD] r, #imm0, lsl #12), #imm1),
if imm == -(imm0<<12)-imm1, and both imm0 and imm1 are non-zero 12-bit unsigned
integers.
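
The immediate split described above can be sketched in Python. This only models the arithmetic conditions stated in the message; the actual patch may apply further checks, and `split_imm24` and `lower_add_imm` are hypothetical names.

```python
def split_imm24(imm: int):
    """Return (imm0, imm1) with imm == (imm0 << 12) + imm1, where both parts
    are non-zero 12-bit unsigned integers, or None if no such split exists."""
    if imm <= 0 or imm >= (1 << 24):
        return None
    imm0, imm1 = imm >> 12, imm & 0xFFF
    if imm0 == 0 or imm1 == 0:
        return None
    return imm0, imm1


def lower_add_imm(imm: int):
    """Model (add r, imm): return (opcode, imm0, imm1) for the instruction pair,
    or None when neither pattern from the message applies."""
    positive = split_imm24(imm)
    if positive is not None:
        return ("ADD",) + positive
    negative = split_imm24(-imm)  # imm == -(imm0 << 12) - imm1
    if negative is not None:
        return ("SUB",) + negative
    return None


assert lower_add_imm(0x123456) == ("ADD", 0x123, 0x456)
assert lower_add_imm(-0x123456) == ("SUB", 0x123, 0x456)
assert lower_add_imm(0x1000) is None  # imm1 would be zero
assert lower_add_imm(0xFFF) is None   # imm0 would be zero
```

For a `sub`, the same split applies with the opcodes swapped.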

Reviewed By: jaykang10, dmgreen

Differential Revision: https://reviews.llvm.org/D111034

commit | commitdiff | tree

wlei [Tue, 2 Nov 2021 04:35:32 +0000 (21:35 -0700)]

[llvm-profgen] Refactor the code of getHashCode

Refactor to generate the hash code lazily. Tested on a clang self build, with no observable regression in generation time.
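
The general idea of lazy hash generation can be sketched in Python as below. This is not the llvm-profgen code; the class `ContextKey` and its fields are hypothetical and only show the caching pattern.

```python
class ContextKey:
    """Key whose hash is computed on first use and cached afterwards."""

    def __init__(self, frames):
        self.frames = tuple(frames)
        self._hash = None           # not computed yet
        self.hash_computations = 0  # counter kept only for illustration

    def get_hash_code(self) -> int:
        if self._hash is None:
            self._hash = hash(self.frames)
            self.hash_computations += 1
        return self._hash

    def __hash__(self) -> int:
        return self.get_hash_code()

    def __eq__(self, other) -> bool:
        return isinstance(other, ContextKey) and self.frames == other.frames


key = ContextKey(["main", "foo", "bar"])
assert key.hash_computations == 0  # nothing computed at construction time

counts = {key: 1}
counts[key] += 1
assert key.hash_computations == 1  # computed once, reused on later lookups
```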

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D113059

commit | commitdiff | tree

wlei [Tue, 26 Oct 2021 02:20:28 +0000 (19:20 -0700)]

[llvm-profgen] Warn on invalid range and show warning summary

Two things in this diff:

1) Warn on invalid ranges. There are currently three types of checks; see the detailed messages in the code.

2) In some situations, llvm-profgen gives lots of warnings about truncated stacks, which is noisy. This change provides a `--show-detailed-warning` switch so that those warnings can be skipped. Instead, we print a summary for those warnings and show the percentage of cases with each issue.

Example of the warning summary:
```
warning: 0.05%(1120/2428958) cases with issue: Profile context truncated due to missing probe for call instruction.
warning: 0.00%(2/178637) cases with issue: Range does not belong to any functions, likely from external function.
```
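
A minimal sketch of how such a summary line could be produced, written in Python for illustration. The format is inferred from the example output above, and `format_warning_summary` is a hypothetical helper, not a function from llvm-profgen.

```python
def format_warning_summary(num: int, total: int, message: str) -> str:
    """Format one summary line as percentage(count/total) followed by the issue."""
    percent = 100.0 * num / total if total else 0.0
    return f"warning: {percent:.2f}%({num}/{total}) cases with issue: {message}"


line = format_warning_summary(
    1120,
    2428958,
    "Profile context truncated due to missing probe for call instruction.",
)
assert line.startswith("warning: 0.05%(1120/2428958) cases with issue: ")
print(line)
```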

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D111902
