review.tizen.org Git - platform/upstream/llvm.git/log

Peiming Liu [Tue, 15 Nov 2022 00:02:43 +0000 (00:02 +0000)]

[mlir][sparse] fix memory leak sparse2sparse reshape

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137994

Stella Stamenova [Tue, 15 Nov 2022 00:18:04 +0000 (16:18 -0800)]

Revert "[mlir][sparse] Macros to clean up StridedMemRefType in the SparseTensorRuntime" and "[mlir][sparse] move SparseTensorReader functions into the _mlir_ciface_ section"

This reverts commits 6c22dad and 92bc3fb.

These broke the Windows MLIR buildbot.

Matt Arsenault [Mon, 14 Nov 2022 20:42:08 +0000 (12:42 -0800)]

GlobalISel: Add debug print for applied rule in generated combiner

Fangrui Song [Mon, 14 Nov 2022 23:51:03 +0000 (15:51 -0800)]

Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"

This reverts commit bf8381a8bce28fc69857645cc7e84a72317e693e.

There is a layering violation: LLVMAnalysis depends on LLVMCore, so
LLVMCore should not include the LLVMAnalysis header
llvm/Analysis/ModuleSummaryAnalysis.h.

Alexander Shaposhnikov [Mon, 14 Nov 2022 23:15:19 +0000 (23:15 +0000)]

[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm

Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D137768

Peiming Liu [Mon, 14 Nov 2022 22:28:12 +0000 (22:28 +0000)]

[mlir][sparse] fix memory leak in test cases

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137985

wren romano [Mon, 14 Nov 2022 22:43:03 +0000 (14:43 -0800)]

[mlir][sparse] Fix warning on GCC

Reviewed By: aartbik, Peiming

Differential Revision: https://reviews.llvm.org/D137987

Philip Reames [Mon, 14 Nov 2022 22:21:51 +0000 (14:21 -0800)]

[RISCV] Add codegen coverage for select idioms which might benefit from XVentanaCondOps

Joseph Huber [Mon, 14 Nov 2022 20:58:19 +0000 (14:58 -0600)]

[libc] Forward LLVM_LIBC options when using a runtimes build

The `LLVM_ENABLE_RUNTIMES` mode is commonly used to build runtimes that
depend on an up-to-date version of clang. Currently, `libc` uses some
internal variables that are not forwarded when building in this mode.
This patch forwards the relevant arguments beginning with `LLVM_LIBC` to
the build when built this way.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D137977

bixia1 [Mon, 14 Nov 2022 18:05:19 +0000 (10:05 -0800)]

[mlir][sparse] Make three tests run with the codegen path.

Reviewed By: aartbik, Peiming

Differential Revision: https://reviews.llvm.org/D137964

wren romano [Wed, 9 Nov 2022 21:38:11 +0000 (13:38 -0800)]

[mlir][sparse] move SparseTensorReader functions into the _mlir_ciface_ section

Depends On D137735

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137737

wren romano [Wed, 9 Nov 2022 21:35:45 +0000 (13:35 -0800)]

[mlir][sparse] Macros to clean up StridedMemRefType in the SparseTensorRuntime

In particular, this silences warnings from [-Wsign-compare].

Depends On D137681

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137735

wren romano [Wed, 9 Nov 2022 21:33:01 +0000 (13:33 -0800)]

[mlir][sparse] Making way for SparseTensorRuntime to support non-permutations

Systematically updates the SparseTensorRuntime to properly distinguish tensor-dimensions from storage-levels (and their associated ranks, shapes, sizes, indices, etc). With a few exceptions which are noted in the code, this ensures the runtime has all the **semantic** changes necessary to support non-permutations.

(Whereas **operationally**, since we're still using `std::vector<uint64_t>` to represent the mappings, there's no way to pass in any interesting non-permutations. Changing the representation to `std::function` will be done in a separate differential.)
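
The distinction between tensor dimensions and storage levels can be illustrated with a minimal sketch. It assumes the mapping is a plain permutation stored as a vector, as the message describes; the function name `levelSizesFromDimSizes` and the parameter `dim2lvl` are hypothetical and are not taken from the SparseTensorRuntime.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the SparseTensorRuntime API.
// dim2lvl[d] is the storage level that tensor dimension d is mapped to.
std::vector<uint64_t> levelSizesFromDimSizes(
    const std::vector<uint64_t> &dimSizes,
    const std::vector<uint64_t> &dim2lvl) {
  assert(dimSizes.size() == dim2lvl.size() && "rank mismatch");
  std::vector<uint64_t> lvlSizes(dimSizes.size());
  for (uint64_t d = 0; d < dimSizes.size(); ++d) {
    assert(dim2lvl[d] < lvlSizes.size() && "mapping is out of range");
    // The size of a level is the size of the dimension stored at that level.
    lvlSizes[dim2lvl[d]] = dimSizes[d];
  }
  return lvlSizes;
}
```

A vector can only express a reordering like this one, which is presumably why the message notes that a different representation is needed before interesting non-permutations can be passed in.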

Depends On D137680

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137681

Craig Topper [Mon, 14 Nov 2022 21:34:24 +0000 (13:34 -0800)]

[RISCV] Add PseudoCCMOVGPR to RISCVSExtWRemoval.

This instruction is a conditional move. It propagates sign bits
from its inputs.

Alexander Shaposhnikov [Mon, 14 Nov 2022 21:31:30 +0000 (21:31 +0000)]

Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"

This reverts commit ef9e624694c0f125c53f7d0d3472fd486bada57d
for further investigation offline.
It appears to break the buildbot
llvm-clang-x86_64-sie-ubuntu-fast.

Alexander Shaposhnikov [Mon, 14 Nov 2022 21:10:24 +0000 (21:10 +0000)]

[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm

Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D137768

Jonas Devlieghere [Mon, 14 Nov 2022 21:03:29 +0000 (13:03 -0800)]

Revert "[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty"

This reverts commit 68efb4772c0d0e60cbfb09ea619b58d80c31ff0f because the
test fails on some of the buildbots.

Xiang Li [Fri, 11 Nov 2022 08:00:11 +0000 (00:00 -0800)]

[DirectX backend] Fix build and test error caused by out of sync with upstream change.

Fix build and test error caused by
https://github.com/llvm/llvm-project/commit/a2620e00ffa232a406de3a1d8634beeda86956fd#
and
https://github.com/llvm/llvm-project/commit/304f1d59ca41872c094def3aee0a8689df6aa398

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D137815

Jonas Devlieghere [Fri, 11 Nov 2022 19:50:45 +0000 (11:50 -0800)]

[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty

Fix an assertion in dsymutil coming from the Reproducer/FileCollector.
When TMPDIR is empty, the root becomes a relative path, triggering an
assertion when adding a relative path to the VFS mapping. This patch
fixes the issue by resolving the relative path and also moves the
assertion up to make it easier to diagnose these issues in the future.

rdar://102170986

Differential revision: https://reviews.llvm.org/D137959

Andreas Hollandt [Mon, 14 Nov 2022 20:27:12 +0000 (12:27 -0800)]

[cmake] Fix _GNU_SOURCE being added unconditionally

Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D137917

Nico Weber [Mon, 14 Nov 2022 19:23:56 +0000 (14:23 -0500)]

[COFF, Mach-O] Include -mllvm options in thinlto cache key

Like D134013, but for COFF and Mach-O.

Also expand the ELF test a bit. I at first didn't realize that `getValue()` for
`-mllvm -foo=bar` would return `-foo=bar` instead of just `bar`, and so
I wrote the test to check if we indeed get this wrong. We don't, but
having the test for it seems nice, so I'm including it.

Differential Revision: https://reviews.llvm.org/D137971

Jakub Kuderski [Mon, 14 Nov 2022 20:07:18 +0000 (15:07 -0500)]

[mlir][arith][spirv] Handle i1 sign extension in arith-to-spirv

Also fix some surrounding nits.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D137974

Mehdi Amini [Mon, 14 Nov 2022 06:24:54 +0000 (06:24 +0000)]

Apply clang-tidy fixes for readability-identifier-naming in AlgebraicSimplification.cpp (NFC)

Mehdi Amini [Mon, 14 Nov 2022 06:10:39 +0000 (06:10 +0000)]

Apply clang-tidy fixes for readability-simplify-boolean-expr in GPUDialect.cpp (NFC)

Rob Suderman [Mon, 14 Nov 2022 19:38:16 +0000 (11:38 -0800)]

[mlir][tosa] Remove zero-fill of tosa.concat outputs when lowering to linalg.

Since all output elements are known to be overridden by construction the fill is not required. This change makes the tosa lowering consistent with the MHLO and Torch lowerings of concat which do not do the fill.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D137967

Guozhi Wei [Mon, 14 Nov 2022 19:34:59 +0000 (19:34 +0000)]

[MachineCSE] Allow CSE for instructions with ignorable operands

Ignorable operands don't impact an instruction's behavior, so we can safely do
CSE on the instruction.

It is split from D130919. It has a big impact on some AMDGPU test cases.
For example in atomic_optimizations_raw_buffer.ll, when trying to check if the
following instruction can be CSEed

  %37:vgpr_32 = V_MOV_B32_e32 0, implicit $exec

Function isCallerPreservedOrConstPhysReg is called on operand "implicit $exec",
this function is implemented as

  -  return TRI.isCallerPreservedPhysReg(Reg, MF) ||
  +  return TRI.isCallerPreservedPhysReg(Reg, MF) || TII.isIgnorableUse(MO) ||
            (MRI.reservedRegsFrozen() && MRI.isConstantPhysReg(Reg));

Both TRI.isCallerPreservedPhysReg and MRI.isConstantPhysReg return false on this
operand, so isCallerPreservedOrConstPhysReg is also false, and LLVM fails to CSE
this instruction.

With this patch TII.isIgnorableUse returns true for the operand $exec, so
isCallerPreservedOrConstPhysReg also returns true, and this instruction is
CSEed with the previous instruction

  %14:vgpr_32 = V_MOV_B32_e32 0, implicit $exec

So the results differ from this point on. AMDGPU's implementation of
isIgnorableUse is

  bool SIInstrInfo::isIgnorableUse(const MachineOperand &MO) const {
    // Any implicit use of exec by VALU is not a real register read.
    return MO.getReg() == AMDGPU::EXEC && MO.isImplicit() &&
           isVALU(*MO.getParent()) && !resultDependsOnExec(*MO.getParent());
  }

Since the operand $exec is not a real register read, my understanding is it's
reasonable to do CSE on such instructions.

Because more instructions are CSEed, fewer instructions are generated for
these tests.

Differential Revision: https://reviews.llvm.org/D137222

Matt Arsenault [Fri, 28 Oct 2022 23:10:41 +0000 (16:10 -0700)]

clang/AMDGPU: Use Support's wrapper around getenv

This does some extra stuff for Windows, so might as well
use it just in case.

Quentin Colombet [Thu, 20 Oct 2022 22:18:58 +0000 (22:18 +0000)]

[mlir][MemRef] Change the anchor point of a reshapeLikeOp pattern

Essentially, this patch changes the anchor point of the
`extract_strided_metadata(reshapeLikeOp)` pattern from
`extract_strided_metadata` to `reshapeLikeOp`.

In detail, this means that instead of replacing:
```
base, offset, sizes, strides =
extract_strided_metadata(reshapeLikeOp(src))
```
With
```
base, offset = extract_strided_metadata(src)
sizes = <some math>
strides = <some math>
```

We replace only the reshapeLikeOp part and connect it back with a
reinterpret_cast:
```
val = reshapeLikeOp(src)
```
=>
```
base, offset, ... = extract_strided_metadata(src)
sizes = <some math>
strides = <some math>
val = reinterpret_cast base, offset, sizes, strides
```

Differential Revision: https://reviews.llvm.org/D136386

Quentin Colombet [Wed, 12 Oct 2022 21:23:27 +0000 (21:23 +0000)]

[mlir][MemRef] Change the anchor point of a subview pattern

Essentially, this patch changes the anchor point of the
`extract_strided_metadata(subview)` pattern from
`extract_strided_metadata` to `subview`.

In detail, this means that instead of replacing:
```
base, offset, sizes, strides = extract_strided_metadata(subview(src))
```
With
```
base, ... = extract_strided_metadata(src)
offset = <some math>
sizes = subSizes
strides = <some math>
```

We replace only the subview part and connect it back with a
reinterpret_cast:
```
val = subview(src)
```
=>
```
base, ... = extract_strided_metadata(src)
offset = <some math>
sizes = subSizes
strides = <some math>
val = reinterpret_cast base, offset, sizes, strides
```
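
The `<some math>` placeholders above can be made concrete with a minimal sketch. It assumes the usual strided-layout arithmetic for a non-rank-reducing subview (the new offset adds each subview offset scaled by the source stride, and each new stride is the product of the source stride and the subview stride). `StridedLayout` and `subviewLayout` are hypothetical names for illustration, not MLIR APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain-data stand-in for the results of extract_strided_metadata.
struct StridedLayout {
  int64_t offset;
  std::vector<int64_t> sizes;
  std::vector<int64_t> strides;
};

// Computes the layout of subview(src) from the layout of src.
StridedLayout subviewLayout(const StridedLayout &src,
                            const std::vector<int64_t> &subOffsets,
                            const std::vector<int64_t> &subSizes,
                            const std::vector<int64_t> &subStrides) {
  const std::size_t rank = src.strides.size();
  assert(subOffsets.size() == rank && subSizes.size() == rank &&
         subStrides.size() == rank && "rank mismatch");
  // sizes = subSizes; offset and strides are filled in below.
  StridedLayout result{src.offset, subSizes, std::vector<int64_t>(rank)};
  for (std::size_t i = 0; i < rank; ++i) {
    result.offset += subOffsets[i] * src.strides[i];
    result.strides[i] = src.strides[i] * subStrides[i];
  }
  return result;
}
```

For a row-major 8x16 source with strides {16, 1}, taking a subview at offsets {2, 4} with strides {1, 2} gives offset 36 and strides {16, 2}; these are the values the `reinterpret_cast` would be built from.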

Differential Revision: https://reviews.llvm.org/D135839

Quentin Colombet [Wed, 12 Oct 2022 21:18:53 +0000 (21:18 +0000)]

[mlir][MemRef] Simplify extract_strided_metadata(reinterpret_cast)

This patch adds a pattern to simplify
```
base, offset, sizes, strides =
extract_strided_metadata(
reinterpret_cast(src, srcOffset, srcSizes, srcStrides))
```

Into
```
base, baseOffset, ... = extract_strided_metadata(src)
offset = srcOffset
sizes = srcSizes
strides = srcStrides
```

Note: reinterpret_cast operations with unranked sources are not simplified
since they cannot feed extract_strided_metadata operations.

Differential Revision: https://reviews.llvm.org/D135837

Nico Weber [Mon, 14 Nov 2022 18:30:55 +0000 (13:30 -0500)]

[lto] Update function name in comment after 5f312ad45

Craig Topper [Mon, 14 Nov 2022 18:28:29 +0000 (10:28 -0800)]

[RISCV] Add scalar FP compares to isSignExtendingOpW in RISCVSExtWRemoval.

Joe Nash [Mon, 14 Nov 2022 15:15:27 +0000 (10:15 -0500)]

[AMDGPU][MC][NFC] Rename VOP3 VOPC test files

D136149 and D136148 renamed the MC test files for VOP3 promoted from VOP1 and
VOP2 in a consistent way. Do the same for VOP3 coming from VOPC.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D137950

Craig Topper [Mon, 14 Nov 2022 17:59:03 +0000 (09:59 -0800)]

[RISCV] Move FixableDef handling out of isSignExtendingOpW.

We have two layers of opcode checks. The first is in
isSignExtendingOpW. If that returns false, a second switch is used
for looking through nodes by adding them to the worklist.

Move the FixableDef handling to the second switch. This simplifies
the interface of isSignExtendingOpW and makes that function more
accurate to its name.

Yashwant Singh [Mon, 14 Nov 2022 17:57:08 +0000 (23:27 +0530)]

[GlobalIsel][AMDGPU] Changing legalize rule for G_{UADDO|UADDE|USUBO|USUBE|SADDE|SSUBE}

Generic add and sub with carry are currently legalized in a way that explicitly recalculates the carry/borrow output. For example,
%6:_(s64), %7:_(s1) = G_UADDO %0, %1
becomes,
%13:_(s32), %14:_(s1) = G_UADDO %2, %4
%15:_(s32), %16:_(s1) = G_UADDE %3, %5, %14
%6:_(s64) = G_MERGE_VALUES %13(s32), %15(s32)
%7:_(s1) = G_ICMP intpred(ult), %6(s64), %1

Here the G_MERGE and G_ICMP instructions used to recalculate the carry output are redundant. (The case for sub with borrow is similar.)
This change fixes this.
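
Why the recomputation is redundant can be seen in a minimal sketch, written in plain C++ rather than MIR. It assumes a 64-bit unsigned add split into two 32-bit halves, mirroring the G_UADDO / G_UADDE pair above; `AddWithCarry` and `uaddo64ViaHalves` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct AddWithCarry {
  uint64_t sum;
  bool carry;
};

// 64-bit add with carry-out, built from two 32-bit adds.
AddWithCarry uaddo64ViaHalves(uint64_t a, uint64_t b) {
  const uint32_t aLo = static_cast<uint32_t>(a);
  const uint32_t aHi = static_cast<uint32_t>(a >> 32);
  const uint32_t bLo = static_cast<uint32_t>(b);
  const uint32_t bHi = static_cast<uint32_t>(b >> 32);

  // Low halves: plain add with carry-out (the G_UADDO step).
  const uint32_t lo = aLo + bLo;
  const bool carryLo = lo < aLo;

  // High halves: add with carry-in and carry-out (the G_UADDE step).
  const uint64_t hiWide = uint64_t{aHi} + bHi + (carryLo ? 1 : 0);
  const uint32_t hi = static_cast<uint32_t>(hiWide);
  const bool carryHi = (hiWide >> 32) != 0;

  const uint64_t sum = (uint64_t{hi} << 32) | lo;
  // The carry a 64-bit compare would recompute (sum < b) is the same bit
  // the high-half add already produced.
  assert(carryHi == (sum < b) && "recomputed carry must agree");
  return {sum, carryHi};
}
```

The assertion inside the function states the point of the change: the carry-out of the high-half add can be used directly as the overall carry.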

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D137932

Thomas Raoux [Sun, 13 Nov 2022 18:52:03 +0000 (18:52 +0000)]

[mlir][linalg] Add reduction tiling using scf.foreachthread

This adds a transformation to tile reduction operations to partial
reduction using scf.foreachthread. This uses
PartialReductionOpInterface to create a merge operation of the partial
tiles.
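
The idea of tiling a reduction into partial reductions followed by a merge can be sketched outside MLIR. This is a sequential stand-in under stated assumptions: the loop over `t` plays the role of the parallel scf.foreachthread iterations, addition is the reduction, and `tiledSum` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

int64_t tiledSum(const std::vector<int64_t> &input, std::size_t numThreads) {
  assert(numThreads > 0 && "need at least one thread");
  // One partial accumulator per thread, initialized to the neutral element.
  std::vector<int64_t> partial(numThreads, 0);
  const std::size_t tile = (input.size() + numThreads - 1) / numThreads;
  for (std::size_t t = 0; t < numThreads; ++t) {
    // Each iteration only touches its own tile and its own accumulator,
    // so the iterations are independent of one another.
    const std::size_t begin = std::min(t * tile, input.size());
    const std::size_t end = std::min(begin + tile, input.size());
    for (std::size_t i = begin; i < end; ++i)
      partial[t] += input[i];
  }
  // Merge step: reduce the partial results into the final value.
  return std::accumulate(partial.begin(), partial.end(), int64_t{0});
}
```

The final `std::accumulate` corresponds to the merge operation over the partial tiles that the message mentions.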

Differential Revision: https://reviews.llvm.org/D137912

Benjamin Kramer [Mon, 14 Nov 2022 18:02:56 +0000 (19:02 +0100)]

[bazel] Add another missing dependency after D137833

While there, run buildifier.

Quentin Colombet [Wed, 12 Oct 2022 00:29:39 +0000 (00:29 +0000)]

[mlir][MemRef] Make reinterpret_cast(extract_strided_metadata) more robust

Prior to this patch the canonicalization pattern that turns
`reinterpret_cast(extract_strided_metadata)` into cast was only applied
when all the input operands of the `reinterpret_cast` are exactly all the
output results of the `extract_strided_metadata`.

This missed simplification opportunities when the operands held the same
constant values as the results but came from different actual values.

E.g., prior to this patch, a pattern of the form:
```
%base, %offset = extract_strided_metadata %source : memref<i16>
reinterpret_cast %base to offset:[0]
```
Wouldn't have been simplified into a simple cast, because %offset is not
directly the same value object as 0.

This patch teaches this pattern how to check if the constant values
match what the results of the `extract_strided_metadata` operation would
have held.
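
The difference between the old and new matching can be sketched with plain data. This is an illustration only: `Value`, `sameValueOrConstant` and `castIsRedundant` are hypothetical names, and the model assumes each value optionally carries a statically known constant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for an SSA value: an identity plus, when it is
// statically known, the constant the value holds.
struct Value {
  int id;
  std::optional<int64_t> constant;
};

// Old behavior: only identical values match. New behavior: values that are
// known to hold the same constant match as well.
bool sameValueOrConstant(const Value &a, const Value &b) {
  if (a.id == b.id)
    return true;
  return a.constant && b.constant && *a.constant == *b.constant;
}

// The cast is redundant when every operand matches the corresponding result
// of extract_strided_metadata.
bool castIsRedundant(const std::vector<Value> &castOperands,
                     const std::vector<Value> &metadataResults) {
  assert(castOperands.size() == metadataResults.size() &&
         "operand/result count mismatch");
  for (std::size_t i = 0; i < castOperands.size(); ++i)
    if (!sameValueOrConstant(castOperands[i], metadataResults[i]))
      return false;
  return true;
}
```

In the example above, the literal offset 0 and the `%offset` result are different values but hold the same constant, so under this model the cast is recognized as redundant.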

Differential Revision: https://reviews.llvm.org/D135736

Chenguang Wang [Mon, 14 Nov 2022 17:57:52 +0000 (18:57 +0100)]

[bazel] Fix Bufferization dialect build

D137833 added a new .td file and updated existing files to use it.
This broke the bazel build.

Differential Revision: https://reviews.llvm.org/D137961

Jason Molenda [Mon, 14 Nov 2022 17:50:58 +0000 (09:50 -0800)]

Change last-ditch magic address in IRMemoryMap::FindSpace

When we cannot allocate memory in the inferior process, the IR
interpreter's IRMemoryMap::FindSpace will create an lldb local
buffer and assign it an address range in the inferior address
space. When the interpreter sees an address in that range, it
will read/write from the local buffer instead of the target. If
this magic address overlaps with actual data in the target, the
target cannot be accessed through expressions.

Instead of using a high memory address that is validly addressable,
this patch uses an address that cannot be accessed on 64-bit systems
that don't actually use all 64 bits of the virtual address.
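
One way to picture an address that cannot be accessed is the canonical-address rule. The sketch below assumes an x86-64-style scheme with 48 implemented virtual address bits, where the upper bits must be a sign extension of bit 47. That scheme, the default width, and the name `isCanonicalAddress` are assumptions for illustration; the message does not state which address or rule the patch uses.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if addr is canonical for a machine that implements vaBits
// bits of virtual address, i.e. bits [63 : vaBits-1] are all equal.
bool isCanonicalAddress(uint64_t addr, unsigned vaBits = 48) {
  assert(vaBits >= 1 && vaBits <= 64 && "unsupported address width");
  if (vaBits == 64)
    return true; // every bit is implemented, so every address is usable
  const uint64_t upper = addr >> (vaBits - 1);
  const uint64_t allOnes = ~uint64_t{0} >> (vaBits - 1);
  return upper == 0 || upper == allOnes;
}
```

Under these assumptions, a non-canonical address can never alias real data in the target, which is the property the message asks of the last-ditch range.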

Differential Revision: https://reviews.llvm.org/D137682
rdar://96248287

Yabin Cui [Mon, 14 Nov 2022 17:48:44 +0000 (17:48 +0000)]

[Support] Use thread safe version of getpwuid and getpwnam.

The OpenGroup specification doesn't require getpwuid and getpwnam
to be thread-safe, and musl libc's implementation is not thread-safe.
When building clang with musl, this can make clang-scan-deps crash.
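
For reference, the reentrant POSIX counterpart `getpwuid_r` writes into caller-provided storage instead of a shared static buffer. The following is a minimal sketch of that calling pattern, not the code from this patch; the wrapper name `homeDirectoryForUid` and the fallback buffer size are assumptions.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

#include <pwd.h>
#include <unistd.h>

// Looks up the home directory of uid using only caller-owned storage.
std::optional<std::string> homeDirectoryForUid(uid_t uid) {
  // sysconf may return -1 when there is no fixed limit; fall back to a guess.
  const long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
  std::vector<char> buf(hint > 0 ? static_cast<std::size_t>(hint) : 16384);

  struct passwd pw;
  struct passwd *result = nullptr;
  int rc;
  // ERANGE means the buffer was too small, so grow it and retry.
  while ((rc = getpwuid_r(uid, &pw, buf.data(), buf.size(), &result)) == ERANGE)
    buf.resize(buf.size() * 2);

  if (rc != 0 || result == nullptr)
    return std::nullopt; // lookup failed or there is no such user
  assert(result == &pw && "on success, result points at the caller's struct");
  if (pw.pw_dir == nullptr)
    return std::nullopt;
  return std::string(pw.pw_dir);
}
```

Because every call uses its own `passwd` struct and buffer, concurrent callers do not overwrite each other's results.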

Reviewed By: pirama

Differential Revision: https://reviews.llvm.org/D137864

Arthur Eubanks [Sun, 13 Nov 2022 22:54:26 +0000 (14:54 -0800)]

[LegacyPM] Remove cl::opts controlling optimization pass manager passes

Move these to the new PM if they're used there.

Part of removing the legacy pass manager for optimization pipeline.

Reland with UseNewGVN usage in clang removed.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D137915

Craig Topper [Mon, 14 Nov 2022 17:17:47 +0000 (09:17 -0800)]

[RISCV] Remove old test case. NFC

This seemed to be testing a pattern for an RV64 Zbp instruction, but
on RV32. On RV32, it's just swizzling registers so isn't very
interesting.

Craig Topper [Mon, 14 Nov 2022 17:15:31 +0000 (09:15 -0800)]

[RISCV] Improve use of PACK instruction on rv64.

Handle the case where the lower bits come from a zero extending
load or other operation with known zero bits.
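
A minimal sketch of the equivalence being exploited, assuming the RV64 `pack` semantics of placing the low 32 bits of rs1 in the lower half of the result and the low 32 bits of rs2 in the upper half. `pack64` models the instruction and `combineHalves` models the shift-and-or pattern; both names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Model of RV64 pack: low 32 bits of rs1 go to the lower half,
// low 32 bits of rs2 go to the upper half.
uint64_t pack64(uint64_t rs1, uint64_t rs2) {
  return (rs2 << 32) | (rs1 & 0xFFFFFFFFULL);
}

// The source pattern: (high << 32) | low, where low is known to have
// zero upper bits (for example, the result of a zero-extending load).
uint64_t combineHalves(uint64_t lowZeroExtended, uint64_t high) {
  assert((lowZeroExtended >> 32) == 0 && "upper 32 bits must be known zero");
  return (high << 32) | lowZeroExtended;
}
```

When the upper bits of the low operand are known to be zero, the explicit masking that `pack` performs makes no difference, so the two functions agree and the pattern can be selected as a single `pack`.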

Arthur Eubanks [Mon, 14 Nov 2022 17:33:38 +0000 (09:33 -0800)]

Revert "[LegacyPM] Remove cl::opts controlling optimization pass manager passes"

This reverts commit 7ec05fec7115a910b2e172de794adc462388c25e.

Breaks bots, e.g. https://lab.llvm.org/buildbot#builders/217/builds/15008

Arthur Eubanks [Sun, 13 Nov 2022 22:54:26 +0000 (14:54 -0800)]

[LegacyPM] Remove cl::opts controlling optimization pass manager passes

Move these to the new PM if they're used there.

Part of removing the legacy pass manager for optimization pipeline.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D137915

Nicolas Vasilache [Sun, 13 Nov 2022 13:28:32 +0000 (05:28 -0800)]

[mlir][Transform]Significantly cleanup scf.foreach_thread and GPU transform permutation handling

Previously, the need for a dense permutation leaked into the thread_dim_mapping specification.
This revision allows using a sparse specification of the thread_dim_mapping; the proper completion / sorting is applied automatically.

In the process, the semantics of scf.foreach_thread are tightened to require a matching number of thread dimensions and mappings.
The relevant negative test is added.

Differential Revision: https://reviews.llvm.org/D137906

Akash Banerjee [Wed, 9 Nov 2022 15:54:21 +0000 (15:54 +0000)]

Migrate getOrCreateInternalVariable from Clang to OMPIRBuilder.

This patch removes getOrCreateInternalVariable from Clang OMP CodeGen and replaces its uses with OMPBuilder::getOrCreateInternalVariable. It also refactors OMPBuilder::getOrCreateInternalVariable to change the type of name from Twine to StringRef.

Differential Revision: https://reviews.llvm.org/D137720

Lorenzo Chelini [Fri, 11 Nov 2022 12:35:16 +0000 (13:35 +0100)]

[MLIR][Transform] Expose map layout option in `OneShotBufferizeOp`

Expose `function-boundary-type-conversion` in `OneShotBufferizeOp`. To
reuse options between passes and transform operations, create a
`BufferizationEnums.td`.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D137833

Sanjay Patel [Mon, 14 Nov 2022 16:54:47 +0000 (11:54 -0500)]

[InstSimplify] restrict logic fold with partial undef vector

https://alive2.llvm.org/ce/z/4ncsnX

Fixes #58977

Sanjay Patel [Sun, 13 Nov 2022 17:38:12 +0000 (12:38 -0500)]

[SystemZ] improve test for showing store merge miscompile; NFC

See issue #58883 for details.

Philip Reames [Mon, 14 Nov 2022 16:29:55 +0000 (08:29 -0800)]

[RISCV] Implement assembler support for XVentanaCondOps

This change provides an implementation of the XVentanaCondOps vendor extension. This extension is defined in version 1.0.0 of the VTx-family custom instructions specification (https://github.com/ventanamicro/ventana-custom-extensions/releases/download/v1.0.0/ventana-custom-extensions-v1.0.0.pdf) by Ventana Micro Systems.

In addition to the technical contribution, this change is intended to be a test case for our vendor extension policy.

Once this lands, I plan to use this extension to prototype selection lowering to conditional moves. There's an RVI proposal in flight, and the expectation is that lowering to these and the new RVI instructions is likely to be substantially similar.

Differential Revision: https://reviews.llvm.org/D137350

bixia1 [Wed, 9 Nov 2022 17:07:06 +0000 (09:07 -0800)]

[mlir][sparse] Add rewriting rules for sparse_tensor.sort_coo.

Refactor the rewriting of sparse_tensor.sort to support the implementation of
sparse_tensor.sort_coo.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137522

Sylvain Audi [Wed, 9 Nov 2022 15:01:55 +0000 (10:01 -0500)]

[PDB] Don't include input files in the 'cmd' entry of S_ENVBLOCK

MSVC records the command line arguments in S_ENVBLOCK, skipping the input file arguments.
This patch adds this filtering on lld-link side.

Differential Revision: https://reviews.llvm.org/D137723

Simon Pilgrim [Mon, 14 Nov 2022 16:13:16 +0000 (16:13 +0000)]

[MCA][X86] Ensure the avx512 vnni tests use the upper xmm/ymm registers

Ensure we're testing the avx512vl vnni instructions and not the avx vnni instructions

Simon Pilgrim [Mon, 14 Nov 2022 15:57:13 +0000 (15:57 +0000)]

[MCA][X86] Add test coverage for VBMI2 instructions

Chris Bieneman [Mon, 14 Nov 2022 16:28:36 +0000 (10:28 -0600)]

[NFC] Fixing spelling in code comment

bixia1 [Fri, 11 Nov 2022 22:24:26 +0000 (14:24 -0800)]

[mlir][sparse][NFC] Add comments to tests that are run with and without runtime libraries.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137869

Ivan Kosarev [Mon, 14 Nov 2022 16:10:23 +0000 (16:10 +0000)]

[AMDGPU][AsmParser] Forbid TFE modifiers for MBUF stores.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D137832

Nicholas Guy [Mon, 14 Nov 2022 15:55:44 +0000 (15:55 +0000)]

[NFC] Removal of complex deinterleaving test case complex_mul_v8f64

This test is not particularly useful for testing complex deinterleaving,
especially due to f64 muls not being supported in mve. The test is
being removed as it's hitting an unrelated pre-existing condition
regarding register spilling.

Jay Foad [Mon, 14 Nov 2022 15:27:59 +0000 (15:27 +0000)]

[AMDGPU] More use of DivergentBinFrag and friends. NFC.

Nikita Popov [Tue, 18 Oct 2022 10:11:04 +0000 (12:11 +0200)]

[AA] Move MayBeCrossIteration into AAQI (NFC)

Move the MayBeCrossIteration flag from BasicAA into AAQI. This is
in preparation for exposing it to users of the AA API.

Ivan Kosarev [Mon, 14 Nov 2022 12:37:26 +0000 (12:37 +0000)]

[AMDGPU][MC] Support TFE modifiers in MUBUF loads and stores.

Reviewed By: dp, arsenm

Differential Revision: https://reviews.llvm.org/D137783

Mindong Chen [Mon, 14 Nov 2022 15:18:47 +0000 (23:18 +0800)]

[docs][OpaquePtr] Fix hyperlinks

Jay Foad [Mon, 14 Nov 2022 15:14:55 +0000 (15:14 +0000)]

[AMDGPU] Define and use UniformTernaryFrag. NFC.

Simon Pilgrim [Mon, 14 Nov 2022 10:58:20 +0000 (10:58 +0000)]

[X86] Remove unnecessary overrides for CBW/CWDE/CDQE/CMC instructions

All of these match the default WriteALU schedule

Caroline Concatto [Thu, 3 Nov 2022 12:18:20 +0000 (12:18 +0000)]

[AArch64] Add all SME2.1 instructions Assembly/Disassembly

This patch adds a new feature flag:
sme-f16f16 to represent FEAT_SME-F16F16

This patch adds the following instructions:
SME2.1 stand alone instructions:
   MOVAZ (array to vector, four registers): Move and zero four ZA single-vector groups to vector registers.
         (array to vector, two registers): Move and zero two ZA single-vector groups to vector registers.
         (tile to vector, four registers): Move and zero four ZA tile slices to vector registers.
         (tile to vector, single): Move and zero ZA tile slice to vector register.
         (tile to vector, two registers): Move and zero two ZA tile slices to vector registers.

   LUTI2 (Strided four registers): Lookup table read with 2-bit indexes.
         (Strided two registers): Lookup table read with 2-bit indexes.

   LUTI4 (Strided four registers): Lookup table read with 4-bit indexes.
         (Strided two registers): Lookup table read with 4-bit indexes.

   ZERO (double-vector): Zero ZA double-vector groups.
        (quad-vector): Zero ZA quad-vector groups.
        (single-vector): Zero ZA single-vector groups.

SME2p1 and SME-F16F16:
All instructions operate on half-precision elements:
   FADD: Floating-point add multi-vector to ZA array vector accumulators.

   FSUB: Floating-point subtract multi-vector from ZA array vector accumulators.

   FMLA (multiple and indexed vector): Multi-vector floating-point fused multiply-add by indexed element.
        (multiple and single vector): Multi-vector floating-point fused multiply-add by vector.
        (multiple vectors): Multi-vector floating-point fused multiply-add.

   FMLS (multiple and indexed vector): Multi-vector floating-point fused multiply-subtract by indexed element.
        (multiple and single vector): Multi-vector floating-point fused multiply-subtract by vector.
        (multiple vectors): Multi-vector floating-point fused multiply-subtract.

   FCVT (widening): Multi-vector floating-point convert from half-precision to single-precision (in-order).

   FCVTL: Multi-vector floating-point convert from half-precision to deinterleaved single-precision.

   FMOPA (non-widening): Floating-point outer product and accumulate.

   FMOPS (non-widening): Floating-point outer product and subtract.

SME2p1 and B16B16:
   BFADD: BFloat16 floating-point add multi-vector to ZA array vector accumulators.

   BFSUB: BFloat16 floating-point subtract multi-vector from ZA array vector accumulators.

   BFCLAMP: Multi-vector BFloat16 floating-point clamp to minimum/maximum number.

   BFMLA (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-add by indexed element.
         (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-add by vector.
         (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-add.

   BFMLS (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-subtract by indexed element.
         (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-subtract by vector.
         (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-subtract.

   BFMAX (multiple and single vector): Multi-vector BFloat16 floating-point maximum by vector.
         (multiple vectors): Multi-vector BFloat16 floating-point maximum.

   BFMAXNM (multiple and single vector): Multi-vector BFloat16 floating-point maximum number by vector.
           (multiple vectors): Multi-vector BFloat16 floating-point maximum number.

   BFMIN (multiple and single vector): Multi-vector BFloat16 floating-point minimum by vector.
         (multiple vectors): Multi-vector BFloat16 floating-point minimum.

   BFMINNM (multiple and single vector): Multi-vector BFloat16 floating-point minimum number by vector.
           (multiple vectors): Multi-vector BFloat16 floating-point minimum number.

   BFMOPA (non-widening): BFloat16 floating-point outer product and accumulate.

   BFMOPS (non-widening): BFloat16 floating-point outer product and subtract.

The reference can be found here:

https://developer.arm.com/documentation/ddi0602/2022-09

Differential Revision: https://reviews.llvm.org/D137571

Nikita Popov [Mon, 14 Nov 2022 14:46:00 +0000 (15:46 +0100)]

[AST] Remove legacy AliasSetPrinter pass

A NewPM version of this pass exists, drop the legacy version of
this testing-only pass.

Sjoerd Meijer [Fri, 11 Nov 2022 12:56:42 +0000 (18:26 +0530)]

[AArch64] Add match patterns for the reassociated forms of FNMUL

Differential Revision: https://reviews.llvm.org/D137925

Nikita Popov [Mon, 14 Nov 2022 14:28:09 +0000 (15:28 +0100)]

[LoopVersioningLICM] Clarify scope of AST (NFC)

Make it clearer that the AST is only temporarily used during the
legality check, and does not have to survive into the transformation
phase.

Joseph Huber [Mon, 14 Nov 2022 14:11:33 +0000 (08:11 -0600)]

[OpenMP] Fix installation to old resource dir

Summary:
The changes in D125860 renamed the old resource directory to only use
the major version. This was not updated for the OpenMP project, causing
OpenMP resources to still be installed in the old `major.minor.rev`
folder. This led to problems including the header files.

fixes #58966

Luca Di Sera [Mon, 14 Nov 2022 14:17:22 +0000 (15:17 +0100)]

Add clang_CXXMethod_isMoveAssignmentOperator to libclang

The new method is a wrapper of `CXXMethodDecl::isMoveAssignmentOperator` and
can be used to recognize move-assignment operators in libclang.

An export for the function, together with its documentation, was added to
"clang/include/clang-c/Index.h" with an implementation provided in
"clang/tools/libclang/CIndex.cpp". The implementation was based on
similar `clang_CXXMethod.*` implementations, following the same
structure but calling `CXXMethodDecl::isMoveAssignmentOperator` for its
main logic.

The new symbol was further added to "clang/tools/libclang/libclang.map"
to be exported, under the LLVM16 tag.

"clang/tools/c-index-test/c-index-test.c" was modified to print a
specific tag, "(move-assignment operator)", for cursors that are
recognized by `clang_CXXMethod_isMoveAssignmentOperator`.
A new regression test file,
"clang/test/Index/move-assignment-operator.cpp", was added to check
that the new function recognizes the correct constructs.

The "clang/test/Index/get-cursor.cpp" regression test file was updated
as it was affected by the new "(move-assignment operator)" tag.

A binding for the new function was added to libclang's Python
bindings, in "clang/bindings/python/clang/cindex.py", adding a new
method for `Cursor`, `is_move_assignment_operator_method`.
An accompanying test was added to
`clang/bindings/python/tests/cindex/test_cursor.py`, testing the new
function with the same methodology as the corresponding libclang test.

The current release note, `clang/docs/ReleaseNotes.rst`, was modified to
report the new addition under the "libclang" section.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D137246

commit | commitdiff | tree

Nikita Popov [Mon, 14 Nov 2022 14:16:38 +0000 (15:16 +0100)]

[LoopVersioningLICM] Remove unnecessary reset code (NFC)

The LoopVersioningLICM object is only ever used for a single loop,
but there was various unnecessary code for handling the case where
it is reused across loops. Drop that code, and pass the loop to the
constructor.

commit | commitdiff | tree

LLVM GN Syncbot [Mon, 14 Nov 2022 14:05:19 +0000 (14:05 +0000)]

[gn build] Port d52e2839f3b1

commit | commitdiff | tree

Nicholas Guy [Mon, 14 Nov 2022 13:59:59 +0000 (13:59 +0000)]

[ARM][CodeGen] Add support for complex deinterleaving

Adds the Complex Deinterleaving Pass implementing support for complex numbers in a target-independent manner, deferring to the TargetLowering for the given target to create a target-specific intrinsic.

Differential Revision: https://reviews.llvm.org/D114174

commit | commitdiff | tree

revunov.denis@huawei.com [Mon, 14 Nov 2022 13:25:20 +0000 (13:25 +0000)]

[BOLT][NFC] Fix possible use-after-free

If NewName twine has reference to the old name, then after
Section.Name = NewName.str(); this reference is invalidated,
so we cannot use NewName.str() anymore.
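
The hazard can be illustrated with a small stand-in, shown here as a
sketch only. `LazyConcat`, `Section` and `renameSection` are hypothetical
names, not BOLT or LLVM APIs. `LazyConcat` mimics a twine by holding
references to its pieces and building the string only on demand. In this
simplified analogue, re-evaluating after the assignment would read the
already-modified name rather than freed memory, but the remedy has the
same shape: materialize the string once, before touching anything it may
refer to.

```cpp
#include <string>

// Hypothetical stand-in for a lazily evaluated concatenation: it stores
// references to its operands and only builds the result in str().
struct LazyConcat {
  const std::string &LHS;
  const std::string &RHS;
  std::string str() const { return LHS + RHS; }
};

// Hypothetical section record with two name fields.
struct Section {
  std::string Name;
  std::string OutputName;
};

// Renames a section. NewName may refer to S.Name itself, so it is
// evaluated exactly once, before S.Name is overwritten.
void renameSection(Section &S, const LazyConcat &NewName) {
  std::string Materialized = NewName.str();
  S.Name = Materialized;
  S.OutputName = Materialized; // reuse the copy, do not call str() again
}
```

Calling `NewName.str()` a second time after `S.Name` has been assigned
would, in this sketch, append the suffix to the new name instead of the
old one.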

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D137616

commit | commitdiff | tree

Valentin Clement [Mon, 14 Nov 2022 13:27:44 +0000 (14:27 +0100)]

[flang][NFC] Fix typo in fir.box_typecode op description

commit | commitdiff | tree

Dmitry Preobrazhensky [Mon, 14 Nov 2022 13:20:20 +0000 (16:20 +0300)]

[AMDGPU][MC][GFX11] Improve diagnostic messages for invalid VOPD syntax

Differential Revision: https://reviews.llvm.org/D137842

commit | commitdiff | tree

Nicolas Vasilache [Wed, 9 Nov 2022 11:56:26 +0000 (03:56 -0800)]

[mlir][Transform] Add support for dynamically unpacking tile_sizes / num_threads in tile_to_foreach_thread

This commit adds automatic unpacking of Values of type pdl::OperationType to the underlying single-result OpResult.
This allows mixing single-value, attribute and multi-value pdl::Operation tile sizes and num threads to TileToForeachThreadOp.

Differential Revision: https://reviews.llvm.org/D137896

commit | commitdiff | tree

Ying Yi [Mon, 10 Oct 2022 12:26:56 +0000 (13:26 +0100)]

[ThinLTO] A ThinLTO warning is added if cache_size_bytes or cache_size_files is too small for the current link job. The warning recommends that the user consider adjusting --thinlto-cache-policy.

A specific case for ThinLTO cache pruning is that the current build is huge, and the cache wasn't big enough to hold the intermediate object files of that build. So in doing that build, a file would be cached, and later in that same build it would be evicted. This was significantly decreasing the effectiveness of the cache. By giving this warning, the user could identify the required cache size/files and improve ThinLTO link speed.
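
The condition described above can be sketched as a simple comparison
between what one link job produces and what the cache may hold. This is
an illustration under stated assumptions, not the code from the patch:
`CachePolicy`, `checkCacheLimits`, the message wording, and the
convention that 0 means "no limit" are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical cache limits; in this sketch 0 means "no limit".
struct CachePolicy {
  std::uint64_t MaxSizeBytes;
  std::uint64_t MaxSizeFiles;
};

// Returns a warning if the files created by a single link job cannot all
// stay in the cache at once, or an empty string if they fit.
std::string checkCacheLimits(const CachePolicy &Policy,
                             const std::vector<std::uint64_t> &FileSizes) {
  std::uint64_t TotalBytes = 0;
  for (std::size_t I = 0; I < FileSizes.size(); ++I)
    TotalBytes += FileSizes[I];

  if (Policy.MaxSizeFiles != 0 && FileSizes.size() > Policy.MaxSizeFiles)
    return "link job created " + std::to_string(FileSizes.size()) +
           " files but cache_size_files is " +
           std::to_string(Policy.MaxSizeFiles) +
           "; consider adjusting --thinlto-cache-policy";

  if (Policy.MaxSizeBytes != 0 && TotalBytes > Policy.MaxSizeBytes)
    return "link job created " + std::to_string(TotalBytes) +
           " bytes but cache_size_bytes is " +
           std::to_string(Policy.MaxSizeBytes) +
           "; consider adjusting --thinlto-cache-policy";

  return std::string();
}
```

A non-empty result tells the user the minimum file count or byte size the
cache would need for this build to avoid evicting its own entries.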

Differential Revision: https://reviews.llvm.org/D135590

commit | commitdiff | tree

Jay Foad [Mon, 14 Nov 2022 11:36:24 +0000 (11:36 +0000)]

[AMDGPU] Simplify SelectPat and remove comment obsoleted by D133593

commit | commitdiff | tree

Thomas Symalla [Mon, 14 Nov 2022 11:55:05 +0000 (12:55 +0100)]

[InstCombine][NFC] Add extractelement tests

commit | commitdiff | tree

HanSheng Zhang [Mon, 14 Nov 2022 11:45:23 +0000 (12:45 +0100)]

[reg2mem] Skip non-sized Instructions (PR58890)

We can only convert sized values into alloca/load/store, skip
instructions returning other types.

Fixes https://github.com/llvm/llvm-project/issues/58890.

Differential Revision: https://reviews.llvm.org/D137700

commit | commitdiff | tree

Christian Sigg [Mon, 14 Nov 2022 11:21:59 +0000 (12:21 +0100)]

[mlir][bazel] NFC: change MLIR_GPU_TO_CUBIN_PASS_ENABLE from `defines` to `local_defines`.

commit | commitdiff | tree

Joshua Cao [Mon, 14 Nov 2022 03:24:15 +0000 (22:24 -0500)]

Do not write a comma when varargs is the only argument

Fixes https://github.com/llvm/llvm-project/issues/56544

AsmWriter always writes ", ..." when a tail call has a varargs argument. This patch only writes the ", " when there is an argument before the varargs argument.
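
The separator rule can be sketched as follows. `formatParams` is a
hypothetical helper written for illustration; it is not the AsmWriter
code and it works on plain strings rather than LLVM types.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Formats a parameter list. The ", " before "..." is written only when
// at least one fixed parameter precedes the varargs marker.
std::string formatParams(const std::vector<std::string> &Params,
                         bool IsVarArg) {
  std::string Out = "(";
  for (std::size_t I = 0; I < Params.size(); ++I) {
    if (I != 0)
      Out += ", ";
    Out += Params[I];
  }
  if (IsVarArg) {
    if (!Params.empty())
      Out += ", ";
    Out += "...";
  }
  Out += ")";
  return Out;
}
```

With no fixed parameters and `IsVarArg` set, this yields `(...)` rather
than `(, ...)`.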

I did not write a dedicated test for this change, but I modified an existing test that will test for a regression.

Reviewed By: avogelsgesang

Differential Revision: https://reviews.llvm.org/D137893

Signed-off-by: Adrian Vogelsgesang <avogelsgesang@salesforce.com>

commit | commitdiff | tree

Jean Perier [Mon, 14 Nov 2022 10:19:21 +0000 (11:19 +0100)]

[flang] Add hlfir.declare codegen

hlfir.declare codegen generates a fir.declare, and may generate a
fir.embox/fir.rebox/fir.emboxchar if the base value does not convey
all the variable bounds and length parameter information.

Leave OPTIONAL as a TODO to keep this patch simple. It will require
making the embox/rebox optional to preserve the optionality aspects.

Differential Revision: https://reviews.llvm.org/D137789

commit | commitdiff | tree

LLVM GN Syncbot [Mon, 14 Nov 2022 10:12:18 +0000 (10:12 +0000)]

[gn build] Port dd46a08008f7

commit | commitdiff | tree

Haojian Wu [Mon, 14 Nov 2022 10:10:55 +0000 (11:10 +0100)]

Update the wrong isSelfContainedHeader API usage in the test.

commit | commitdiff | tree

Nikita Popov [Mon, 14 Nov 2022 10:01:15 +0000 (11:01 +0100)]

[ConstraintElimination] Use SmallVectorImpl (NFC)

When passing a SmallVector by reference, don't specify its size.

commit | commitdiff | tree

Nikita Popov [Tue, 8 Nov 2022 13:46:09 +0000 (14:46 +0100)]

[TableGen] Use MemoryEffects to represent intrinsic memory effects (NFCI)

The TableGen implementation was using a homegrown implementation of
FunctionModRefInfo. This switches it to use MemoryEffects instead.
This makes the code simpler, and will allow exposing the full
representational power of MemoryEffects in the future. Among other
things, this will allow us to map IntrHasSideEffects to an
inaccessiblemem readwrite, rather than just ignoring it entirely
in most cases.

To avoid layering issues, this moves the ModRef.h header from IR
to Support, so that it can be included in the TableGen layer.

Differential Revision: https://reviews.llvm.org/D137641

commit | commitdiff | tree

Valentin Clement [Mon, 14 Nov 2022 09:50:56 +0000 (10:50 +0100)]

[flang] Add fir.box_typecode operation

The `fir.box_typecode` operation retrieves the type code
from a boxed value. This will be used in the `fir.select_type` conversion
to an if-then-else ladder for type guard statements with an intrinsic
type spec, instead of using a runtime call.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D137829

commit | commitdiff | tree

Valentin Clement [Mon, 14 Nov 2022 09:46:53 +0000 (10:46 +0100)]

[flang] Initial lowering of SELECT TYPE construct to fir.select_type operation

This patch is the initial path to lower the SELECT TYPE construct to the
fir.select_type operation. More work is required in the AssocEntity
mapping but it will be done in a follow up patch to ease the review.

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D137728

commit | commitdiff | tree

Sebastian Neubauer [Mon, 14 Nov 2022 09:46:46 +0000 (10:46 +0100)]

[Coroutines] Do not add allocas for retcon coroutines

Same as for async-style lowering, if there are no resume points in a
function, the coroutine frame pointer will be replaced by an undef,
making all accesses to the frame undefined behavior.

Fix this by not adding allocas to the coroutine frame if there are no
resume points.

Differential Revision: https://reviews.llvm.org/D137866

commit | commitdiff | tree

Sebastian Neubauer [Fri, 11 Nov 2022 21:20:43 +0000 (22:20 +0100)]

[Coroutines] Presubmit retcon without suspend test

The test gets incorrectly optimized to unreachable.

commit | commitdiff | tree

Nikita Popov [Fri, 11 Nov 2022 16:27:31 +0000 (17:27 +0100)]

[ConstraintElimination] Add Decomposition struct (NFCI)

Replace the vector of DecompEntry with a struct that stores the
constant offset separately. I think this is cleaner than giving the
first element special handling.

This probably also fixes some potential ubsan errors by more
consistently using addWithOverflow/multiplyWithOverflow.

commit | commitdiff | tree

Nikita Popov [Fri, 11 Nov 2022 16:02:40 +0000 (17:02 +0100)]

[ConstraintElimination] Make decompose() infallible

decompose() currently returns a mix of {} and 0 + 1*V on failure.
This changes it to always return the 0 + 1*V form, thus making
decompose() infallible.

This makes the code marginally more powerful, e.g. we now fold
sub_decomp_i80 by treating the constant as a symbolic value.
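
A toy version of the idea, as a sketch only: `Expr`, `Kind`, `Term` and
this `Decomposition` are hypothetical simplifications and do not mirror
the types in ConstraintElimination. The point shown is the fallback:
an expression that cannot be broken down further is returned as
0 + 1*V, with the whole expression acting as the symbolic value V, so
callers never need a separate failure path.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical linear form: Offset + sum(Coefficient * Variable).
struct Term {
  std::int64_t Coefficient;
  std::string Variable;
};
struct Decomposition {
  std::int64_t Offset;
  std::vector<Term> Terms;
};

// Hypothetical toy expression. Name holds the variable or, for Opaque,
// a label for the whole expression.
enum class Kind { Constant, Variable, AddConstant, Opaque };
struct Expr {
  Kind K;
  std::string Name;
  std::int64_t Value;
};

// Always returns a usable decomposition; unsupported expressions become
// 0 + 1*V instead of an empty result.
Decomposition decompose(const Expr &E) {
  switch (E.K) {
  case Kind::Constant:
    return Decomposition{E.Value, {}};
  case Kind::AddConstant:
    return Decomposition{E.Value, {Term{1, E.Name}}};
  case Kind::Variable:
  case Kind::Opaque:
    break;
  }
  // Fallback: treat the expression itself as a single symbolic value.
  return Decomposition{0, {Term{1, E.Name}}};
}
```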

Differential Revision: https://reviews.llvm.org/D137847

commit | commitdiff | tree

Jean Perier [Mon, 14 Nov 2022 09:38:22 +0000 (10:38 +0100)]

[flang][RFC] Do not rely on attributes to tag HLFIR variable uses

After more consideration and experience, switch to one of the
alternative plans for HLFIR variables that will avoid requiring naming
designators and having to maintain and update names in attributes after
inlining or code duplication.

The cost is the increase of fir.box usage, which in most cases should
be removed when lowering from HLFIR to FIR.

Differential Revision: https://reviews.llvm.org/D137634

commit | commitdiff | tree

Jean Perier [Mon, 14 Nov 2022 09:37:04 +0000 (10:37 +0100)]

[flang][NFC] rename hlfir::FortranEntity into EntityWithAttributes

This reflects the fact that Attributes will not always be visible when
looking at an HLFIR variable. The EntityWithAttributes class is used
to denote in the compiler code that the value at hand has visible
attributes. It is intended to be used in lowering so that the code
can query operand attributes when generating code.

Differential Revision: https://reviews.llvm.org/D137792

commit | commitdiff | tree

Jean Perier [Mon, 14 Nov 2022 09:25:03 +0000 (10:25 +0100)]

[flang] Add hlfir.declare operation

This operation will be used to declare named variables in HLFIR.
See the added description in HLFIROpBase.td for more info about it.

The motivation behind this operation is described in https://reviews.llvm.org/D137634.

The FortranVariableInterface verifier is changed a bit. It used to
operate using the result type to verify the provided shape and length
parameters. This is a bit incorrect because what matters to verify the
information is the input address (This worked OK with fir.declare where
the input memref type is the same as the output result). Also, not all
operations defining variables will have an input memref with the same
meaning (hlfir.designate and hlfir.associate for instance).
Hence, this verifier is now optional and must be provided a memref to
operate.

Differential Revision: https://reviews.llvm.org/D137781

commit | commitdiff | tree

Haojian Wu [Mon, 7 Nov 2022 12:30:47 +0000 (13:30 +0100)]

Move the isSelfContainedHeader function from clangd to libtooling.

We plan to reuse it in the include-cleaner library. This patch moves
this functionality from clangd to libtooling so that the code can be
shared among all clang tools.

Differential Revision: https://reviews.llvm.org/D137697

Domain: System / Toolchain;
