review.tizen.org Git - platform/upstream/llvm.git/log


commit | commitdiff | tree

Alexey Bataev [Mon, 20 Mar 2023 14:26:33 +0000 (07:26 -0700)]

[SLP]Find reused scalars in buildvector sequences, if any.

The patch generalizes the analysis of scalars. The main part is outlined
into a lambda, which can be used to find reused inserted scalars and emit
a shuffle for them instead of multiple insertelement instructions, if the
permutation has already been found. I.e. some scalars are transformed by
the permutation of previously vectorized nodes, and some are inserted
directly.
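
A minimal sketch of the idea, not the SLP vectorizer's actual code: scalars are modeled as strings, and the hypothetical helper `reuseMask` reports, for each wanted lane, which lane of a previously vectorized node it can be shuffled from, or -1 when it has to be inserted directly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model. `vectorized` lists the scalars held in the lanes of a
// previously vectorized node; `wanted` is the buildvector sequence.
// The result has one entry per wanted lane: the source lane to shuffle
// from, or -1 if the scalar must be inserted with an insertelement.
std::vector<int> reuseMask(const std::vector<std::string> &vectorized,
                           const std::vector<std::string> &wanted) {
  std::vector<int> mask;
  mask.reserve(wanted.size());
  for (const std::string &scalar : wanted) {
    int lane = -1;
    for (std::size_t i = 0; i < vectorized.size(); ++i) {
      if (vectorized[i] == scalar) {
        lane = static_cast<int>(i);
        break;
      }
    }
    mask.push_back(lane);
  }
  return mask;
}
```

With lanes `a b c d` already vectorized, the sequence `b a x d` would need a single shuffle for three lanes and one direct insert for `x`.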

Reworked part of D110978

Differential Revision: https://reviews.llvm.org/D146564

commit | commitdiff | tree

Philip Reames [Wed, 5 Apr 2023 16:26:17 +0000 (09:26 -0700)]

[IVDescriptors] Add pointer InductionDescriptors with non-constant strides (try 2)

(JFYI - This has been heavily reframed since original attempt at landing.)

This change updates the InductionDescriptor logic to allow matching a pointer IV with a non-constant stride, but also updates the LoopVectorizer to bail out on such descriptors by default. This preserves the default vectorizer behavior.

In review, it was pointed out that there are multiple unfortunate performance implications which need to be addressed before this can be enabled. Having a flag allows us to exercise the behavior, and write test cases for logic which is otherwise unreachable (or hard to reach).

This will also enable non-constant stride pointer recurrences for other consumers. I've audited said code, and don't see any obvious issues.

Differential Revision: https://reviews.llvm.org/D147336

commit | commitdiff | tree

Martin Storsjö [Sat, 1 Apr 2023 22:02:37 +0000 (01:02 +0300)]

[libunwind] Fix a case of inconsistent indentation. NFC.

commit | commitdiff | tree

Stefan Gränitz [Tue, 4 Apr 2023 17:06:10 +0000 (19:06 +0200)]

[JITLink][AArch32] Implement ELF::R_ARM_ABS32 after we stopped skipping debug info sections

We create LinkGraph sections with NoAlloc lifetime now since f05ac803ffe76c7f4299a4e1288cc6bb8b098410
This means we do process debug info sections now with all their relocations. That's ok for the moment.

commit | commitdiff | tree

Adrian Prantl [Wed, 5 Apr 2023 16:00:46 +0000 (09:00 -0700)]

Skip tests under asan

commit | commitdiff | tree

Shengchen Kan [Wed, 5 Apr 2023 15:39:32 +0000 (23:39 +0800)]

[X86][mem-fold] Simplify code by using StringRef::drop_back, NFCI

commit | commitdiff | tree

OCHyams [Wed, 5 Apr 2023 15:23:02 +0000 (16:23 +0100)]

[DebugInfo] Update test to use opaque ptrs

The test was added in D99169.

commit | commitdiff | tree

Shengchen Kan [Wed, 5 Apr 2023 15:22:18 +0000 (23:22 +0800)]

[X86][mem-fold] Remove the logic for FoldGenData, NFCI

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 15:03:20 +0000 (17:03 +0200)]

Revert "[SimplifyCFG][LICM] Preserve nonnull, range and align metadata when speculating"

This reverts commit 78b1fbc63f78660ef10e3ccf0e527c667a563bc8.

This causes or exposes miscompiles in Rust, revert until they
have been investigated.

commit | commitdiff | tree

Paul Scoropan [Wed, 5 Apr 2023 14:31:16 +0000 (14:31 +0000)]

[Flang] Fix usage of uninitialized resolution variable

A recent Flang PowerPC intrinsics patch used the resolution variable without first checking that it exists, causing segmentation faults in some scenarios. This patch checks that the resolution variable exists before using it.

Reviewed By: DanielCChen

Differential Revision: https://reviews.llvm.org/D147616

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 14:59:04 +0000 (16:59 +0200)]

[Coroutines] Convert test to opaque pointers (NFC)

commit | commitdiff | tree

Philip Reames [Wed, 5 Apr 2023 14:47:24 +0000 (07:47 -0700)]

[RISCV] Account for LMUL in memory op costs

Generally, the cost of a memory op will scale with the number of vector registers accessed. Machines might exist which have a narrower memory access width than vector register width, but machines with a wider memory access width than vector register width seem unlikely.

I noticed this because we were preferring wide loads + deinterleaves on examples where the cost of a short gather (actually a strided load) would be better. Touching 8 vector registers instead of doing a 4 element gather is not a good tradeoff.
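
A rough sketch of the scaling described above, not the actual RISC-V cost model: the hypothetical helpers `regsTouched` and `memOpCost` assume a unit base cost per vector register touched and treat a fractional register as one whole register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: number of vector registers touched by an access of
// `numElts` elements of `eltBits` bits when one register holds `vlenBits`
// bits. Partial registers are rounded up.
inline uint64_t regsTouched(uint64_t numElts, uint64_t eltBits,
                            uint64_t vlenBits) {
  assert(vlenBits > 0 && "register width must be non-zero");
  uint64_t bits = numElts * eltBits;
  return (bits + vlenBits - 1) / vlenBits;
}

// Assumed cost: linear in the number of registers touched.
inline uint64_t memOpCost(uint64_t numElts, uint64_t eltBits,
                          uint64_t vlenBits, uint64_t baseCost = 1) {
  return baseCost * regsTouched(numElts, eltBits, vlenBits);
}
```

Under these assumptions, a wide load of 32 x i32 with 128-bit registers costs 8, while a 4 x i32 access costs 1, which is the tradeoff the message refers to.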

Differential Revision: https://reviews.llvm.org/D147470

commit | commitdiff | tree

Dávid Bolvanský [Wed, 5 Apr 2023 10:52:30 +0000 (12:52 +0200)]

[AggressiveInstCombine] Enable also for -O2

Next step after https://reviews.llvm.org/D113179

Recently a set of patches by @anton-afanasyev improved many cases (better and cleaner vectorized code) thanks to improvements to AIC's TruncInstCombine (IC cannot handle it), motivated by real examples in bug reports. There was a discussion that -O2 could benefit from AIC as well, but the discussion then stalled, so I would like to restart it with new numbers from the LLVM compile time tracker.

As the -O2 pipeline is not tracked by the LLVM compile time tracker, I disabled AIC for -O3 to get an idea of how expensive it is. Without AIC, I observed that the geomean was about -0.10%. Given that AIC seems quite cheap and is heavily tested by the -O3 pipeline, I am proposing to enable it also with -O2 and similar to improve the quality of vectorized code.

https://llvm-compile-time-tracker.com/compare.php?from=a1df5abef5f27646c809c7b85cf6170eb68f7735&to=e1ba6068f58c6ca862b920b8750faccb42a5843c&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D147604
Reviewed-By: nikic

commit | commitdiff | tree

Florian Hahn [Wed, 5 Apr 2023 14:49:06 +0000 (15:49 +0100)]

[Matrix] Limit dot lowering to column major matrixes.

Limit dot product lowering to column-major matrixes for now. This
simplifies the code and reasoning for upcoming planned improvements.
Support for row-major matrixes can be added later as an extension.

commit | commitdiff | tree

David Sherwood [Mon, 3 Apr 2023 10:19:19 +0000 (10:19 +0000)]

[AArch64][SME] Disable ZA LDR/STR addressing optimisations

Since the same encoded offset is used for both the vector
select offset and the address offset, we have to spot two
patterns simultaneously in the ldr/str intrinsic inputs, i.e.

vector select = base + off
address = base + (off * VL)

whereas currently we only look for the address pattern. I
don't think this is possible in tablegen, so I suspect we'll
have to do this manually as part of lowering or as a target
DAG combine. For now, I've removed these tablegen patterns
so that we at least do the correct thing even if the code
quality isn't great.

I've also changed some of the ldr/str tests to pass in the
same vector select pattern (base + off) as the address
pattern.
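
To illustrate why both patterns have to be matched together, here is a minimal sketch under simplifying assumptions: the inputs are already split into a base plus a constant addend, VL is a known byte count, and `LdrStrInputs` and `sharedOffset` are hypothetical names that do not exist in the backend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of the two intrinsic inputs after each has
// been decomposed into "base + constant addend".
struct LdrStrInputs {
  uint64_t sliceAddend; // vector select = base + sliceAddend
  uint64_t addrAddend;  // address = base + addrAddend (bytes)
};

// Succeeds only if one offset satisfies both patterns at once:
//   vector select = base + off   and   address = base + (off * VL)
// On success, `off` receives the shared offset.
inline bool sharedOffset(const LdrStrInputs &in, uint64_t vlBytes,
                         uint64_t &off) {
  assert(vlBytes > 0 && "vector length must be non-zero");
  if (in.addrAddend % vlBytes != 0)
    return false; // address is not of the form off * VL
  uint64_t candidate = in.addrAddend / vlBytes;
  if (candidate != in.sliceAddend)
    return false; // address pattern matches, vector select does not
  off = candidate;
  return true;
}
```

Matching only the address addend, as the removed patterns did, would accept the second case in which the vector select uses a different offset and the single encoded immediate would then be wrong for one of the two operands.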

Differential Revision: https://reviews.llvm.org/D147433

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 14:22:56 +0000 (16:22 +0200)]

[InstCombine] Remove varargs cast transform (NFC)

This is no longer relevant with opaque pointers.

Also drop the CastInst::isLosslessCast() method, which was only
used here.

commit | commitdiff | tree

Jie Fu [Wed, 5 Apr 2023 14:34:42 +0000 (22:34 +0800)]

[Transforms] Fix -Wunused-function for 'GetReplicateRegion' with -DLLVM_ENABLE_ASSERTIONS=OFF (NFC)

/Users/jiefu/llvm-project/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:614:23: error: unused function 'GetReplicateRegion' [-Werror,-Wunused-function]
static VPRegionBlock *GetReplicateRegion(VPRecipeBase *R) {
^
1 error generated.

commit | commitdiff | tree

Krzysztof Parzyszek [Wed, 5 Apr 2023 13:54:31 +0000 (06:54 -0700)]

[Hexagon] Remove -opaque-pointers=0 from tests

Two tests still had opaque-pointers=0:
(1) llvm/test/CodeGen/Hexagon/addrmode.ll
(2) llvm/test/CodeGen/Hexagon/swp-epilog-phi7.ll

Deleted (1) since it no longer exercised the original scenario, modified
(2) to reflect codegen changes.

This fixes https://github.com/llvm/llvm-project/issues/61928

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 14:06:34 +0000 (16:06 +0200)]

[CodeGenCXX] Convert some tests to opaque pointers (NFC)

In particular also fixes fallout from instcombine changes.

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 13:59:29 +0000 (15:59 +0200)]

[InstCombine] Remove convertBitCastToGEP() fold (NFC)

This only applies to typed pointers, so the fold is no longer
necessary.

commit | commitdiff | tree

Richard Howell [Tue, 4 Apr 2023 19:35:45 +0000 (12:35 -0700)]

[clang] don't serialize MODULE_DIRECTORY with ModuleFileHomeIsCwd

Fix a bug in the MODULE_DIRECTORY serialization logic
that would cause MODULE_DIRECTORY to be serialized when
`-fmodule-file-home-is-cwd` is specified.

This matches the original logic added in:
https://github.com/apple/llvm-project/commit/f7b41371d9ede1aecf0930e5bd4a463519264633

Reviewed By: keith

Differential Revision: https://reviews.llvm.org/D147561

commit | commitdiff | tree

Florian Hahn [Wed, 5 Apr 2023 14:19:11 +0000 (15:19 +0100)]

[Matrix] Add dotproduct tests with row-major default layout.

commit | commitdiff | tree

Jie Fu [Wed, 5 Apr 2023 14:17:36 +0000 (22:17 +0800)]

[InstCombine] Remove unneeded internal function 'decomposeSimpleLinearExpr' in InstCombineCasts.cpp (NFC)

/data/llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:32:15: error: function 'decomposeSimpleLinearExpr' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
static Value *decomposeSimpleLinearExpr(Value *Val, unsigned &Scale,
^
1 error generated.

commit | commitdiff | tree

Shengchen Kan [Wed, 5 Apr 2023 14:16:17 +0000 (22:16 +0800)]

[X86][mem-fold] Remove the logic for TB_NO_FORWARD | TB_NO_REVERSE, NFCI

commit | commitdiff | tree

Eric Gullufsen [Tue, 4 Apr 2023 19:33:37 +0000 (15:33 -0400)]

[InstCombine] Preserve nsw/nuw flags in canonicalization

canonicalizeLogicFirst reorders logic op / math op for suitable
constants, and this commit makes this function pass through
nsw/nuw flags on the Add.

Differential Revision: https://reviews.llvm.org/D147568

commit | commitdiff | tree

Eric Gullufsen [Tue, 4 Apr 2023 18:30:14 +0000 (14:30 -0400)]

[InstCombine] Add baseline tests for nsw/nuw (NFC)

nsw/nuw flags are currently not preserved when canonicalizing.
This adds baseline tests; a subsequent patch will fix the issue.

Differential Revision: https://reviews.llvm.org/D147566

commit | commitdiff | tree

Florian Hahn [Wed, 5 Apr 2023 14:04:05 +0000 (15:04 +0100)]

[Matrix] Add test variants where 2nd operand of dotprod is add/sub.

commit | commitdiff | tree

Fangrui Song [Wed, 5 Apr 2023 13:59:09 +0000 (06:59 -0700)]

[RuntimeDyld][ELF] Actually fix R_AARCH64_ABS{16,32} overflow check

7b58259481417bb22d144a9c12ee8f4fb0a046e0 is incorrect.
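
For context, a sketch of the range that the AArch64 ELF ABI documents for these relocations, namely -2^(N-1) <= X < 2^N for ABS16 and ABS32. The helper `fitsAbs` is hypothetical and is not the RuntimeDyld code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if `value` is representable by an
// R_AARCH64_ABS<bits> relocation, i.e. -2^(bits-1) <= value < 2^bits.
// Only meant for bits == 16 or bits == 32.
inline bool fitsAbs(int64_t value, unsigned bits) {
  assert(bits == 16 || bits == 32);
  const int64_t lo = -(int64_t{1} << (bits - 1)); // inclusive lower bound
  const int64_t hi = int64_t{1} << bits;          // exclusive upper bound
  return value >= lo && value < hi;
}
```

The boundary values are where an off-by-one shows up: 0xFFFFFFFF and -2^31 should both be accepted for ABS32, while 2^32 and -2^31 - 1 should be rejected.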

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 13:41:53 +0000 (15:41 +0200)]

[InstCombine] Remove PromoteCastOfAllocation() fold (NFC)

This fold does not apply to opaque pointers, and as such is no
longer needed.

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 13:53:15 +0000 (15:53 +0200)]

[Coroutines] Convert some tests to opaque pointers (NFC)

commit | commitdiff | tree

Alex Bradbury [Wed, 5 Apr 2023 13:52:08 +0000 (14:52 +0100)]

[docs][RISCV] Remove outdated note about zfa implementation status

MC layer and codegen support for fli.{h,s,d} has since been implemented.

commit | commitdiff | tree

Fangrui Song [Wed, 5 Apr 2023 13:52:54 +0000 (06:52 -0700)]

[RuntimeDyld][ELF] Fix off-by-1 issues in R_AARCH64_ABS{16,32} overflow checks

commit | commitdiff | tree

Florian Hahn [Wed, 5 Apr 2023 13:49:12 +0000 (14:49 +0100)]

[AArch64] Add cost-model tests for fshr.

commit | commitdiff | tree

Alex Bradbury [Wed, 5 Apr 2023 13:46:44 +0000 (14:46 +0100)]

[RISCV][NFC] Use RISCVSubtarget method for predicate in RISCVFeatures.td when available

As RISCVSubtarget defines hasStdExtZfhOrZfhmin() and hasStdExtCOrZca(),
just use these for the matching Predicate definitions rather than
repeating the logic.

commit | commitdiff | tree

Benjamin Kramer [Wed, 5 Apr 2023 13:35:02 +0000 (15:35 +0200)]

[mlir] Fix a use after free when loading dependent dialects

The way dependent dialects are implemented is by recursively calling
loadDialect in the constructor. This means we have to reload from the
dialect table because the constructor might have rehashed that table.

The steps for loading a dialect are
  1. Insert a nullptr into loadedDialects. This indicates the dialect is
     loading
  2. Call ctor(). This recursively loads dependent dialects
  3. Insert the new dialect into the table.

We had a conflict between steps 2 and 3 here. You have to be extremely
unlucky though as rehashing is rare and operator[] does no generation
checking on DenseMap. Changing that to an iterator would've uncovered
this issue immediately.
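
The three steps can be sketched as follows. This is a simplified stand-in, not MLIR's implementation: a `std::vector` plays the role of the table (growing it invalidates references much like a DenseMap rehash does), and `Context`, `Dialect`, and `loadDialect` here are hypothetical toy types.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Dialect {
  std::string name;
};

class Context {
public:
  using Ctor = std::function<std::unique_ptr<Dialect>(Context &)>;

  Dialect *loadDialect(const std::string &name, const Ctor &ctor) {
    if (Slot *slot = find(name))
      return slot->second.get(); // loaded, or null while still loading
    // Step 1: a null entry marks the dialect as "loading".
    table.emplace_back(name, nullptr);
    // Step 2: the constructor may recursively load dependencies, which can
    // grow `table` and invalidate any reference taken before this call.
    std::unique_ptr<Dialect> dialect = ctor(*this);
    // Step 3: look the slot up again instead of reusing a stale reference.
    Slot *slot = find(name);
    assert(slot && "placeholder must still be present");
    slot->second = std::move(dialect);
    return slot->second.get();
  }

  std::size_t size() const { return table.size(); }

private:
  using Slot = std::pair<std::string, std::unique_ptr<Dialect>>;
  std::vector<Slot> table;

  Slot *find(const std::string &name) {
    for (Slot &s : table)
      if (s.first == name)
        return &s;
    return nullptr;
  }
};
```

The buggy shape would keep a reference to the slot obtained in step 1 and assign through it in step 3; the sketch avoids that by performing a fresh lookup after the constructor returns.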

commit | commitdiff | tree

Alex Zinenko [Wed, 5 Apr 2023 13:42:02 +0000 (15:42 +0200)]

[mlir] update Bazel build for 1ef51e0452a473f404edc635412685fce6f61004

commit | commitdiff | tree

Alexey Lapshin [Wed, 5 Apr 2023 13:40:02 +0000 (15:40 +0200)]

Revert "[dsymutil][NFC] Move ARM specific test into the ARM directory."

This reverts commit adeb1fa7a34d097825f71dfdfe5c62a242353bb9.

commit | commitdiff | tree

Florian Hahn [Wed, 5 Apr 2023 13:29:24 +0000 (14:29 +0100)]

[VPlan] Replace check for replicate regions with assert (NFCI).

After recent changes, replication regions only get introduced later, so
there's no need to check for them.

commit | commitdiff | tree

Shengchen Kan [Wed, 5 Apr 2023 13:23:12 +0000 (21:23 +0800)]

[X86][mem-fold] Remove definition of NotMemoryFoldable and move code into a def file, NFCI

The goal is to centralize the logic of the memory fold.

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 13:17:15 +0000 (15:17 +0200)]

[InstCombine] Convert tests to opaque pointers (NFC)

The two debuginfo tests go away because the relevant transforms
no longer occur in this form, e.g. the "cast of alloca" transform
just doesn't exist with opaque pointers.

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 13:14:03 +0000 (15:14 +0200)]

[InstCombine] Regenerate test checks (NFC)

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 13:07:26 +0000 (15:07 +0200)]

[InstCombine] Name instructions in test (NFC)

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 13:05:16 +0000 (15:05 +0200)]

[InstCombine] Regenerate test checks (NFC)

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 13:02:02 +0000 (15:02 +0200)]

[InstCombine] Use CreateGEP() API (NFC)

Use the IRBuilder API that accepts inbounds as a boolean parameter,
rather than using a ternary.

commit | commitdiff | tree

Christian Ulmann [Wed, 5 Apr 2023 12:55:22 +0000 (12:55 +0000)]

[mlir][Analysis] Introduce LoopInfo in mlir

This commit introduces an instantiation of LLVM's LoopInfo for CFGs in
MLIR. To test the LoopInfo, a test pass is added that checks the analysis
results for a set of CFGs.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D147323

commit | commitdiff | tree

Simon Pilgrim [Wed, 5 Apr 2023 12:42:02 +0000 (13:42 +0100)]

[X86] combinePredicateReduction - reuse LowerVectorAllEqual for all_of/any_of(vXi1 eq/ne) reductions

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 12:33:11 +0000 (14:33 +0200)]

[ArgPromotion] Require noundef to copy poison-generating metadata

For poison-generating (rather than IUB) metadata, only copy it
from the dominating must-exec load if it is combined with !noundef.
This could be further extended by additionally intersecting the
metadata from all loads, which does not require !noundef.

commit | commitdiff | tree

Simon Pilgrim [Wed, 5 Apr 2023 11:28:26 +0000 (12:28 +0100)]

[X86] LowerVectorAllEqual - split ALLOF(CMPEQ(X,Y)) -> AND(CMPEQ(X[0],Y[0]),CMPEQ(X[1],Y[1]),....) on MOVMSK codegen

Fix a minor regression on pre-PTEST targets: since these are always 128 bits, we're better off reducing the comparison results (assuming we're not comparing against 0/-1).

commit | commitdiff | tree

Felipe de Azevedo Piovezan [Tue, 4 Apr 2023 13:35:23 +0000 (09:35 -0400)]

[GlobalISel] Improve stack slot tracking in dbg.values

For IR like:

```
%alloca = alloca ...
dbg.value(%alloca, !myvar, OP_deref(<other_ops>))
```

GlobalISel lowers it to MIR:

```
%some_reg = G_FRAME_INDEX <stack_slot>
DBG_VALUE %some_reg, !myvar, OP_deref(<other_ops>)
```

In other words, if the value of `!myvar` can be obtained by
dereferencing an alloca, in MIR we say that the _location_ of a variable
is obtained by dereferencing register %some_reg (plus some
`<other_ops>`).

We can instead remove the use of `%some_reg`: the location of `!myvar`
_is_ `<stack_slot>` (plus some `<other_ops>`). This patch implements
this transformation, which improves debug information handling in O0, as
these registers hardly ever survive register allocation.

A note about testing: similar to what was done in D76934
(f24e2e9eebde4b7a1d), this patch exposed a bug in the Builder class when
using `-debug`, where we tried to print an incomplete instruction. The
changes in `MachineIRBuilder.cpp` address that.

Differential Revision: https://reviews.llvm.org/D147536

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 12:09:00 +0000 (14:09 +0200)]

[X86] Convert tests to opaque pointers (NFC)

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 12:07:52 +0000 (14:07 +0200)]

[X86] Name instructions in test (NFC)

commit | commitdiff | tree

Nikita Popov [Wed, 5 Apr 2023 11:02:19 +0000 (13:02 +0200)]

[X86] Convert some tests to opaque pointers (NFC)

commit | commitdiff | tree

Jie Fu [Wed, 5 Apr 2023 11:54:23 +0000 (19:54 +0800)]

[lldb] Remove unused private field 'm_orig_rax_info' in RegisterContextLinux_x86_64.h (NFC)

/data/llvm-project/lldb/source/Plugins/Process/Utility/RegisterContextLinux_x86_64.h:32:30: error: private field 'm_orig_rax_info' is not used [-Werror,-Wunused-private-field]
lldb_private::RegisterInfo m_orig_rax_info;
^
1 error generated.

commit | commitdiff | tree

Jez Ng [Wed, 5 Apr 2023 05:52:14 +0000 (01:52 -0400)]

[lld-macho][nfc] std::find_if -> llvm::find_if

commit | commitdiff | tree

Jez Ng [Wed, 5 Apr 2023 05:48:34 +0000 (01:48 -0400)]

[lld-macho][nfc] Clean up a bunch of clang-tidy issues

commit | commitdiff | tree

Tres Popp [Tue, 4 Apr 2023 09:36:30 +0000 (11:36 +0200)]

[MLIR] Clarify (test-scf-)parallel-loop-collapsing

1. parallel-loop-collapsing is renamed to test-scf-parallel-loop-collapsing.
2. The pass adds various checks to provide error messages instead of
hitting assert failures.
3. Testing is added to verify these error messages

This is roughly an NFC. The name changes, but all checked behavior
previously would have resulted in an assertion failure. Almost no new
support is added, so this pass is still limited in scope to testing the
transform behaves correctly with input arguments that perfectly match
the ParallelLoop's iterator arg set. The one new piece of functionality
is that invalid operations will now be skipped with an error message
instead of producing an assertion failure, so the pass can be used with
expected failures for pieces of the IR not cared about with a specific
RUN command.

Differential Revision: https://reviews.llvm.org/D147514

commit | commitdiff | tree

Pavel Labath [Thu, 12 Jan 2023 13:04:57 +0000 (14:04 +0100)]

[lldb] Detach the child process when stepping over a fork

Step over thread plans were claiming to explain the fork stop reasons,
which prevented the default fork logic (detaching from the child
process) from kicking in. This patch changes that.

Differential Revision: https://reviews.llvm.org/D141605

commit | commitdiff | tree

Pavel Labath [Tue, 28 Mar 2023 12:50:16 +0000 (14:50 +0200)]

[lldb] Drop RegisterInfoInterface::GetDynamicRegisterInfo

"Dynamic register info" is a very overloaded term, and this particular
instance of it was only used for passing the information about the
"orig_[re]ax" pseudo-register on x86 through some generic code. Since
both sides of the code are x86-specific, I have replaced this with a
more direct route.

Differential Revision: https://reviews.llvm.org/D147045

commit | commitdiff | tree

Vladislav Dzhidzhoev [Tue, 28 Mar 2023 13:57:35 +0000 (15:57 +0200)]

[AArch64][GlobalISel] Add support for some across-vector NEON intrinsics

Support uaddv, saddv, umaxv, smaxv, uminv, sminv, fmaxv, fminv,
fmaxnmv, fminnmv intrinsics in GlobalISel.

GlobalISelEmitter couldn't import SelectionDAG patterns containing nodes
with 8-bit result type, since they had untyped values. Therefore,
register type for FPR8 is set to i8 to eliminate untyped nodes in these
patterns.

Differential Revision: https://reviews.llvm.org/D146531

commit | commitdiff | tree

Paul Walker [Tue, 4 Apr 2023 12:49:44 +0000 (12:49 +0000)]

[NFC][InstCombine] Add tests that show bogus combine of SVE intrinsics when using strictfp.

commit | commitdiff | tree

David Green [Wed, 5 Apr 2023 10:52:05 +0000 (11:52 +0100)]

[ARM] Fold fadd of vcmul into vcmla

This adds an extra tablegen combine for folding fadd(a, vcmul(b, c)) into
vcmla(a, b, c), so long as the fadd is allowed to contract.

Differential Revision: https://reviews.llvm.org/D147201

commit | commitdiff | tree

Sven van Haastregt [Wed, 5 Apr 2023 10:49:41 +0000 (11:49 +0100)]

Update mentions of reduction intrinsics; NFC

The intrinsics have been out of experimental since 322d0afd875d
("[llvm][mlir] Promote the experimental reduction intrinsics to be
first class intrinsics.", 2020-10-07); update some places that still
referred to them as experimental.

commit | commitdiff | tree

Piotr Zegar [Wed, 5 Apr 2023 09:44:36 +0000 (09:44 +0000)]

[clang-tidy] Fix init-list handling in readability-implicit-bool-conversion

Adds support for explicit casts using initListExpr,
for example: int{boolValue} constructions.

Fixes: #47000

Reviewed By: ccotter

Differential Revision: https://reviews.llvm.org/D147551

commit | commitdiff | tree

Alexey Lapshin [Wed, 5 Apr 2023 10:01:10 +0000 (12:01 +0200)]

[dsymutil][NFC] Move ARM specific test into the ARM directory.

This patch moves fat-header.test -> ARM/fat-header.test

commit | commitdiff | tree

OCHyams [Wed, 5 Apr 2023 10:00:26 +0000 (11:00 +0100)]

[Assignment Tracking][SROA] Handle createFragmentExpression failure

createFragmentExpression will fail if it determines that the expression cannot
be split over fragments. Handle this case in SROA. Similarly to D147312 this
should be a rare occurrence as the `dbg.assign` will usually reference the
`Value` being stored without modifying it with a `DIExpression`.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D147431

commit | commitdiff | tree

Graham Hunter [Fri, 16 Sep 2022 14:23:18 +0000 (15:23 +0100)]

[LV] Use available masked vector function variants when required

LLVM has the ability to vectorize using function variants that require
a mask by creating an all-true mask, and to vectorize a conditional
call via scalarization. Now we want to join the two parts together
and use a masked variant when a mask is required.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D136251

commit | commitdiff | tree

Dinar Temirbulatov [Wed, 5 Apr 2023 10:10:55 +0000 (10:10 +0000)]

[AArch64][SME] Fix an infinite loop in DAGCombine related to adding -force-streaming-compatible-sve flag.

The compiler hits an infinite loop in DAGCombine. For force-streaming-compatible-sve
mode we have custom lowering for 128-bit vector splats, and later in
DAGCombiner::SimplifyVCastOp() we scalarized SPLAT because we have custom
lowering for SME. Later, we restored the SPLAT operation via performMulCombine().

commit | commitdiff | tree

Andrew Ng [Fri, 31 Mar 2023 16:50:22 +0000 (17:50 +0100)]

[Support] Improve Windows performance of buffered raw_ostream

The "preferred" buffer size for raw_ostream is set to BUFSIZ which on
Windows is only 512. This results in more calls to write and this
overhead can have a significant negative impact on performance,
especially when Anti-Virus is also involved.

Therefore increase the "preferred" buffer size to 16KB for Windows.

One example of where this helps is the LLD --Map option which dumps out
the symbol map for a link. In a link of UE4, this change has been seen
to improve the performance of the symbol map writing by more than a
factor of 6.
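
As a rough illustration of why the buffer size matters, the sketch below counts how often a buffered stream would hand data to the underlying write. `CountingBufferedStream` is a hypothetical stand-in and much simpler than raw_ostream; it only models "flush when the buffer is full".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a buffered stream: counts how many times the
// underlying write would be invoked for a given buffer size.
class CountingBufferedStream {
public:
  explicit CountingBufferedStream(std::size_t bufferSize)
      : capacity(bufferSize) {
    assert(bufferSize > 0 && "buffer size must be non-zero");
  }

  void write(const std::string &data) {
    for (char c : data) {
      buffer.push_back(c);
      if (buffer.size() == capacity)
        flush(); // buffer full: one call to the underlying write
    }
  }

  void flush() {
    if (buffer.empty())
      return;
    ++writeCalls;
    buffer.clear();
  }

  std::size_t underlyingWrites() const { return writeCalls; }

private:
  std::size_t capacity;
  std::string buffer;
  std::size_t writeCalls = 0;
};
```

In this model, writing 1 MiB through a 512-byte buffer takes 2048 underlying writes, versus 64 with a 16 KB buffer. The count says nothing about wall-clock time, which depends on the per-call overhead the message describes.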

Differential Revision: https://reviews.llvm.org/D147340

commit | commitdiff | tree

Shengchen Kan [Wed, 5 Apr 2023 09:42:12 +0000 (17:42 +0800)]

[X86][mem-fold][NFC] Refine code

1. Use `unsigned` for `KeyOp` and `DstOp` b/c `Opcode` is of type `unsigned`.
2. Align the comparator used in X86FoldTablesEmitter.cpp with the one in
CodeGenTarget::ComputeInstrsByEnum.

commit | commitdiff | tree

LLVM GN Syncbot [Wed, 5 Apr 2023 09:49:30 +0000 (09:49 +0000)]

[gn build] Port 628f11f78d33

commit | commitdiff | tree

Alexey Lapshin [Tue, 4 Apr 2023 12:23:34 +0000 (14:23 +0200)]

[DWARFLinkerParallel] Add StringTable class.

This patch adds a StringTable class, which is used to prepare
strings for emission into the .debug_str table. Specifically,
this class translates strings if necessary, keeps them in order,
and assigns each an index and offset.
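
A minimal sketch of the bookkeeping such a table performs, assuming NUL-terminated strings laid out back to back and deduplication on insert. `SimpleStringTable` and `StringEntry` are hypothetical names, the translation step is omitted, and none of this is the DWARFLinkerParallel API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct StringEntry {
  std::string value;
  uint64_t index;  // position in insertion order
  uint64_t offset; // byte offset the string would have in the section
};

class SimpleStringTable {
public:
  // Returns the entry for `str`, adding it on first sight (assumed
  // behavior). Returned by value so later additions cannot invalidate it.
  StringEntry add(const std::string &str) {
    auto it = lookup.find(str);
    if (it != lookup.end())
      return entries[it->second];
    StringEntry entry{str, static_cast<uint64_t>(entries.size()), nextOffset};
    nextOffset += str.size() + 1; // +1 for the NUL terminator
    lookup.emplace(str, entries.size());
    entries.push_back(entry);
    return entry;
  }

  // Entries in the order they were first added.
  const std::vector<StringEntry> &inOrder() const { return entries; }

private:
  std::vector<StringEntry> entries;
  std::unordered_map<std::string, std::size_t> lookup;
  uint64_t nextOffset = 0;
};
```

Adding "abc" and then "de" yields offsets 0 and 4 in this layout, and adding "abc" again returns the first entry unchanged.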

Differential Revision: https://reviews.llvm.org/D147529

commit | commitdiff | tree

Jay Foad [Wed, 5 Apr 2023 09:36:32 +0000 (10:36 +0100)]

[AMDGPU] Add machine verifier to a test

commit | commitdiff | tree

David Green [Wed, 5 Apr 2023 09:31:19 +0000 (10:31 +0100)]

[ARM] Combine fadd into fcmla

This is the MVE equivalent of https://reviews.llvm.org/D146407. It adds a
target combine for fadd(a, vcmla(b, c, d)) -> vcmla(fadd(a, b), c, d), pushing
the fadd into the operands of the fcmla, which can help simplify away some
additions.

Differential Revision: https://reviews.llvm.org/D147200

commit | commitdiff | tree

Mariya Podchishchaeva [Wed, 5 Apr 2023 09:01:42 +0000 (05:01 -0400)]

[clang] Fix crash when handling nested immediate invocations

Before this patch it was expected that if there were several immediate
invocations, they all belonged to the same expression evaluation context.
During parsing of a non-local variable initializer a new evaluation context is
pushed, so code like this
```
namespace scope {
struct channel {
    consteval channel(const char* name) noexcept { }
};
consteval const char* make_channel_name(const char* name) { return name;}

channel rsx_log(make_channel_name("rsx_log"));
}
```
produced a nested immediate invocation whose subexpressions are attached
to different expression evaluation contexts. The constructor call
belongs to TU context and `make_channel_name` call to context of
variable initializer.

This patch removes this assumption and adds tracking of previously
failed immediate invocations, so when handling an immediate invocation
it is possible to check whether its subexpressions, possibly from another
evaluation context, contain errors and to avoid producing duplicate
diagnostics.

Fixes https://github.com/llvm/llvm-project/issues/58207

Reviewed By: aaron.ballman, shafik

Differential Revision: https://reviews.llvm.org/D146234

commit | commitdiff | tree

Nikita Popov [Fri, 24 Mar 2023 15:35:02 +0000 (16:35 +0100)]

[LICM] Don't require optimized uses

LICM currently requests optimized use MSSA form. This is wasteful,
because LICM doesn't actually care about most uses, only those of
invariant pointers in loops. Everything else doesn't need to be
optimized.

LICM already uses the clobber walker in most places. This patch
adjusts one place that was using getDefiningAccess() to use it as
well, so we no longer have a dependence on pre-optimized uses.

This change is not NFC in that the fallback on the defining access
when there are too many clobber calls may now fall back to an
unoptimized use. In practice, I've not seen any problems with this
though. If desired, we could also increase licm-mssa-optimization-cap
to a higher value (increasing this from 100 to 200 has no impact on
average compile-time -- but also doesn't appear to have any impact
on LICM quality either).

This makes for a 0.9% geomean compile-time improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D147437

commit | commitdiff | tree

Hassnaa Hamdi [Thu, 30 Mar 2023 14:41:58 +0000 (14:41 +0000)]

[AArch64][DAGCombiner]: combine <2xi64> add/sub.

64-bit vector mul is not supported in NEON,
so we use SVE's mul.
To improve performance, we can go one step further
and use SVE's add/sub, so that we can use SVE's mla/mls.
This works on the following patterns:
// add v1, (mul v2, v3)
// sub v1, (mul v2, v3)

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D147236

commit | commitdiff | tree

Heejin Ahn [Wed, 29 Mar 2023 19:40:30 +0000 (12:40 -0700)]

[WebAssembly] Fix selection of global calls

When selecting calls, currently we unconditionally remove `Wrapper`s of
the call target. But we are supposed to do that only when the target is
a function, an external symbol (= library function), or an alias of a
function. Otherwise we end up directly calling globals that are not
functions.

Fixes https://github.com/llvm/llvm-project/issues/60003.

Reviewed By: tlively, HerrCai0907

Differential Revision: https://reviews.llvm.org/D147397

commit | commitdiff | tree

Heejin Ahn [Sun, 2 Apr 2023 03:08:42 +0000 (20:08 -0700)]

[WebAssembly] Move call_indirect_alloca to call.ll

I'm not sure about the distinction between `call.ll` and `call-indirect.ll`,
because `call.ll` also seems to contain many `call_indirect` tests. Also,
before D147033 `call-indirect.ll` only contained a single test and it
also tests it with `obj2yaml`, so I guess that file was created for
testing functionality for object files as well.

We can probably merge these two someday. But anyway, this moves
`call_indirect_alloca` I added in D147033 to `call.ll`, given that that
file contains more `call_indirect` tests and I'm planning to add more
`call_indirect` tests in a followup CL.

Reviewed By: tlively

Differential Revision: https://reviews.llvm.org/D147396

commit | commitdiff | tree

OCHyams [Wed, 5 Apr 2023 08:28:15 +0000 (09:28 +0100)]

[Assignment Tracking] Ignore zero-sized fragments

Such dbg.assigns will occur if you write zero-sized memcpys (see
https://reviews.llvm.org/D146987#4240016).

Handle this in AssignmentTrackingAnalysis (back end) rather than
AssignmentTrackingPass (declare-to-assign) in case it is possible to reproduce
this as a result of optimisations.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D147435

commit | commitdiff | tree

Kai Luo [Wed, 5 Apr 2023 08:19:18 +0000 (16:19 +0800)]

[PowerPC] Precommit test case for issue 61882. NFC.

commit | commitdiff | tree

David Spickett [Wed, 5 Apr 2023 08:05:29 +0000 (08:05 +0000)]

Revert "[-Wunsafe-buffer-usage] Fix-Its transforming `&DRE[any]` to `&DRE.data()[any]`"

This reverts commit 87b5807d3802b932c06d83c4287014872aa2caab.

The test case is failing on Windows https://lab.llvm.org/buildbot/#/builders/65/builds/8950.

commit | commitdiff | tree

Jean Perier [Wed, 5 Apr 2023 08:04:29 +0000 (10:04 +0200)]

[flang][hlfir] Support TYPE(*) actual argument in intrinsic procedures

Similar to https://reviews.llvm.org/D147487.
TYPE(*) evaluate::ActualArgument wraps a symbol instead of an
expression. This requires special handling, which is limited because
C710 restricts the intrinsics in which TYPE(*) may appear as arguments
(there is for instance no need to deal with dynamic presence aspects).

Differential Revision: https://reviews.llvm.org/D147513

commit | commitdiff | tree

Carlos Galvez [Tue, 4 Apr 2023 19:45:24 +0000 (19:45 +0000)]

[clang-tidy] Deprecate cert-dcl21-cpp

It is no longer part of the CERT standard. Looking at the
CERT webpage, we can see it has been moved to the Void
section:
https://wiki.sei.cmu.edu/confluence/display/cplusplus/5+The+Void

Differential Revision: https://reviews.llvm.org/D147563

commit | commitdiff | tree

sgokhale [Wed, 5 Apr 2023 05:41:36 +0000 (11:11 +0530)]

[AArch64][SVE][CodeGen] Generate fused mul+add/sub ops with one of add/sub operands as splat

Currently, whether to generate mul+(add/sub immediate) or mov+mla/mad/msb/mls ops is
decided by whether the add/sub instruction can synthesize the immediate directly.

If the add/sub can synthesize the immediate directly, the fused ops won't get generated. This
patch tries to address this by giving the fused ops a makeshift higher priority.

Specifically, the patch aims at a transformation similar to the one below:
add ( mul, splat_vector(C))
->
MOV C
MAD

Differential Revision: https://reviews.llvm.org/D142656

commit | commitdiff | tree

Serguei Katkov [Wed, 5 Apr 2023 05:08:58 +0000 (12:08 +0700)]

[InstSimplify] Pre-land test for fp min/max optimization.

commit | commitdiff | tree

Jonas Devlieghere [Wed, 5 Apr 2023 04:49:55 +0000 (21:49 -0700)]

[dsymutil] Prevent interleaved errors and warnings

Use a mutex to protect the printing of errors and warnings and prevent
interleaving. There are two sources of parallelism in dsymutil that
could result in interleaved output: errors from different architectures
being processed in parallel and errors from the analyze and clone steps
which execute in lockstep. This patch addresses both by using a unique
mutex across all error reporting.
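
As a rough illustration of the idea (not the actual dsymutil code), the
sketch below routes both warnings and errors through one helper that holds
a single mutex while a message is written. The `Reporter` class, its
constructor and the message layout are hypothetical; only the names
`reportWarning` and `reportError` echo the wrappers mentioned in this log.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical reporter: every message, warning or error, takes the same
// mutex, so one complete line is written before the next one starts.
class Reporter {
public:
  explicit Reporter(std::ostream &OS) : OS(OS) {}

  void reportWarning(const std::string &Msg, const std::string &Context) {
    report("warning: ", Msg, Context);
  }

  void reportError(const std::string &Msg, const std::string &Context) {
    report("error: ", Msg, Context);
  }

private:
  void report(const char *Prefix, const std::string &Msg,
              const std::string &Context) {
    assert(!Msg.empty() && "refusing to print an empty diagnostic");
    // One lock for all reporting paths; released when Guard goes out of scope.
    std::lock_guard<std::mutex> Guard(Mutex);
    OS << Prefix << Msg << " [" << Context << "]\n";
  }

  std::ostream &OS;
  std::mutex Mutex;
};
```

Sharing one mutex between the warning and error paths is what keeps a
warning from one thread from landing in the middle of an error printed by
another; two separate mutexes would not give that guarantee.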

commit | commitdiff | tree

Jonas Devlieghere [Wed, 5 Apr 2023 04:48:58 +0000 (21:48 -0700)]

[dsymutil] Unify reporting of warnings and errors

Make all error reporting in DwarfLinkerForBinary go through the
`reportWarning` and `reportError` wrappers.

commit | commitdiff | tree

Jonas Devlieghere [Wed, 5 Apr 2023 04:28:57 +0000 (21:28 -0700)]

[dsymutil] Make copySwiftInterfaces a member of DwarfLinkerForBinary (NFC)

Make copySwiftInterfaces a member of DwarfLinkerForBinary instead of a
static function.

commit | commitdiff | tree

Aviad Cohen [Sun, 2 Apr 2023 09:12:15 +0000 (12:12 +0300)]

[mlir][tosa] Add InferTensorType interface to tosa reduce operations

When this interface is used, a call to inferReturnTypeComponents()
is generated on creation and verification of the op.

Reviewed By: jpienaar, eric-k256

Differential Revision: https://reviews.llvm.org/D147407

commit | commitdiff | tree

Lang Hames [Wed, 5 Apr 2023 01:07:44 +0000 (18:07 -0700)]

[ORC] Return bootstrap map values via reference argument.

This simplifies checking of the result (it's just an Error, rather than an
optional<Expected<T>>), and allows T to be deduced rather than requiring that
it be specified.
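
A simplified sketch of the calling convention described above, using
stand-in types rather than the real ORC classes: `SimpleError`,
`BootstrapMap`, `set` and `getValue` are all hypothetical names, and the
string-backed storage is an assumption made only to keep the example
self-contained. The point is that the caller checks a single error value
and that `T` is deduced from the output argument.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <sstream>
#include <string>

// Stand-in for an error type: an empty message means success.
struct SimpleError {
  std::string Msg;
  explicit operator bool() const { return !Msg.empty(); }
};

// Hypothetical bootstrap map that stores raw text values.
class BootstrapMap {
public:
  void set(const std::string &Key, const std::string &Raw) {
    Values[Key] = Raw;
  }

  // T is deduced from Val. A missing key is not an error: Val is simply
  // left empty. A value that does not parse as T is reported as an error.
  template <typename T>
  SimpleError getValue(const std::string &Key, std::optional<T> &Val) const {
    assert(!Key.empty() && "key must not be empty");
    Val.reset();
    auto It = Values.find(Key);
    if (It == Values.end())
      return SimpleError{};
    std::istringstream In(It->second);
    T Parsed{};
    if (!(In >> Parsed) || !(In >> std::ws).eof())
      return SimpleError{"malformed value for key '" + Key + "'"};
    Val = Parsed;
    return SimpleError{};
  }

private:
  std::map<std::string, std::string> Values;
};
```

A caller would declare `std::optional<std::uint64_t> PageSize;` and write
`if (auto Err = M.getValue("page-size", PageSize))` without spelling out
the template argument, then test `PageSize` to see whether the key was
present.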

commit | commitdiff | tree

Nico Weber [Wed, 5 Apr 2023 01:25:45 +0000 (21:25 -0400)]

[gn build] Port 443825c517c8

commit | commitdiff | tree

Nico Weber [Wed, 5 Apr 2023 01:23:29 +0000 (21:23 -0400)]

[gn] port 4dc3bcf0124a

commit | commitdiff | tree

Tomás Longeri [Wed, 5 Apr 2023 01:07:25 +0000 (01:07 +0000)]

Fix bazel overlay after "[mlir] Introduce IRDL dialect"

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D147583

commit | commitdiff | tree

David Blaikie [Wed, 5 Apr 2023 01:06:16 +0000 (01:06 +0000)]

Fix a few clang-tidy warnings (container empty checks, function decl/def param naming)

commit | commitdiff | tree

Joseph Huber [Tue, 4 Apr 2023 23:32:00 +0000 (18:32 -0500)]

[libc] Forward CUDA options to the runtimes invocation of `libc`

Some configurations may require `-DCUDAToolkit_ROOT` to find CUDA
properly. This is currently not forwarded to the CMake invocation. This
patch adds a prefix so it will be visible when the runtimes build is
started.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D147582

commit | commitdiff | tree

Joseph Huber [Tue, 4 Apr 2023 23:20:13 +0000 (18:20 -0500)]

[libc] Ensure that the required clang tools are up-to-date for libc GPU

The `clang-offload-packager`, `nvptx-arch`, and `amdgpu-arch` tools are
required for building the GPU target of `libc`. This patch ensures that
we build these tools when directly building `libc` via `ninja libc` or similar.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D147581

commit | commitdiff | tree

Joseph Huber [Tue, 4 Apr 2023 23:10:51 +0000 (18:10 -0500)]

[nvptx-arch] Dynamically load `libcuda.so.1` directly instead

This patch loads the CUDA driver library directly via its real
`DT_SONAME`. This prevents the filesystem from needing to reload it in
cases when it's already loaded.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D147579

commit | commitdiff | tree

Hongtao Yu [Fri, 31 Mar 2023 00:36:51 +0000 (17:36 -0700)]

[FS-AFDO] Assign discriminators to pseudo probes

This is the first change for FS-AFDO integration with CSSPGO. There are more patches coming.

With pseudo probes, we do not assign FS discriminators to any other instructions since we will be using only probes for profile correlation.

Call instructions are also excluded since their DWARF discriminators are used for other purposes, i.e., storing probe ids. Since they are not getting an FS discriminator, they will also be excluded from MIR profile loading. The corresponding changes will be in the subsequent patches.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D147286

commit | commitdiff | tree

Ian Douglas Scott [Tue, 4 Apr 2023 23:32:14 +0000 (16:32 -0700)]

[M68k] Add `TRAP`, `TRAPV`, `BKPT`, `ILLEGAL` instructions

This makes it possible to use TRAP to make Linux system calls using
inline assembly for instance.

Differential Revision: https://reviews.llvm.org/D147102

commit | commitdiff | tree

Sanjeet Karan Singh [Tue, 4 Apr 2023 22:17:29 +0000 (15:17 -0700)]

asan_memory_profile: Fix for deadlock in memory profiler code.

Calling symbolization directly from stopTheWorld was causing a deadlock.
For libc dep systems, symbolization uses dl_iterate_phdr, which acquires a
dl write lock. It could deadlock if the lock is already held by one of
the suspended threads.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D146990
