review.tizen.org Git - platform/upstream/llvm.git/log


commit | commitdiff | tree

Michele Scandale [Fri, 11 Nov 2022 20:06:29 +0000 (12:06 -0800)]

Fix `unsafe-fp-math` attribute emission.

The conditions under which Clang emits the `unsafe-fp-math` function
attribute were modified as part of
`84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`.
In the backend code generators, `"unsafe-fp-math"="true"` enables floating
point contraction for the whole function.
The intent of the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`
was to prevent backend code generators performing contractions when that
is not expected.
However, the change is inaccurate and incomplete because it also allows
`unsafe-fp-math` to be set when only in-statement contraction is
allowed.

Consider the following example
```
float foo(float a, float b, float c) {
  float tmp = a * b;
  return tmp + c;
}
```
and compile it with the command line
```
clang -fno-math-errno -funsafe-math-optimizations -ffp-contract=on \
  -O2 -mavx512f -S -o -
```
The resulting assembly has a `vfmadd213ss` instruction which corresponds
to a fused multiply-add. From the user perspective there shouldn't be
any contraction because the multiplication and the addition are not in
the same statement.
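
To see why an unexpected contraction is observable, the following C++ sketch compares a separately rounded multiply and add with a fused multiply-add. The function names and the input values mentioned below are illustrative assumptions, not part of the patch; `volatile` is used only to force the product to be rounded to `float` regardless of the compiler's own contraction settings.

```cpp
#include <cassert>
#include <cmath>

// Multiply and add as two separately rounded operations. The volatile
// temporary forces the product to be rounded to float before the add,
// so the compiler cannot fuse the two operations.
float mul_then_add(float a, float b, float c) {
  volatile float tmp = a * b;
  return tmp + c;
}

// Fused multiply-add: a * b + c is computed with a single rounding.
float fused_mul_add(float a, float b, float c) {
  return std::fma(a, b, c);
}
```

With `a = b = 1 + 2^-12` and `c = -(1 + 2^-11)`, the product rounds to `1 + 2^-11`, so `mul_then_add` returns `0`, while `fused_mul_add` keeps the low-order term and returns `2^-24`.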

The optimized IR is:
```
define float @foo(float noundef %a, float noundef %b, float noundef %c) #0 {
  %mul = fmul reassoc nsz arcp afn float %b, %a
  %add = fadd reassoc nsz arcp afn float %mul, %c
  ret float %add
}

attributes #0 = {
  [...]
  "no-signed-zeros-fp-math"="true"
  "no-trapping-math"="true"
  [...]
  "unsafe-fp-math"="true"
}
```
The `"unsafe-fp-math"="true"` function attribute allows the backend code
generator to perform `(fadd (fmul a, b), c) -> (fmadd a, b, c)`.

In the current IR representation there is no way to determine the
statement boundaries from the original source code.
Because of this, for in-statement-only contraction the generated IR
doesn't have instructions with the `contract` fast-math flag, and
`llvm.fmuladd` is used to represent contraction opportunities
that occur within a single statement.
Therefore `"unsafe-fp-math"="true"` can only be emitted when contraction
across statements is allowed.

Moreover the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` doesn't
take into account that the floating point math function attributes can
be refined during IR code generation of a function to handle the cases
where the floating point math options are modified within a compound
statement via pragmas (see `CGFPOptionsRAII`).
For consistency, `unsafe-fp-math` needs to be disabled if the contraction
mode for any scope/operation is not `fast`.
Similarly, for consistency, the initialization of `UnsafeFPMath` in
`TargetOptions` for the backend code generation should take the
contraction mode into account as well.

Reviewed By: zahiraam

Differential Revision: https://reviews.llvm.org/D136786

commit | commitdiff | tree

Chuanqi Xu [Tue, 15 Nov 2022 03:50:51 +0000 (11:50 +0800)]

[NFC] [C++20] [Modules] Remove unused Global Module Fragment variables/arguments

commit | commitdiff | tree

Craig Topper [Tue, 15 Nov 2022 03:37:04 +0000 (19:37 -0800)]

[RISCV] Expand i32 abs to negw+max at isel.

This adds a RISCVISD::ABSW to remember that we started with an i32
abs. Previously we used a DAG combine of (sext_inreg (abs)) to
delay emitting a freeze from type legalization in order to make
ComputeNumSignBits optimizations work on other promoted nodes.

This new approach always uses negw+max even if the result doesn't
need to be sign extended. This helps the RISCVSExtWRemoval pass
if the sext.w is in another basic block.
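
As a rough scalar model of the negw+max idiom (not the actual isel code), `abs(x)` can be written as the maximum of `x` and its wrapping 32-bit negation. The helper names below are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Wrapping 32-bit negate, modelling what negw computes on the low 32 bits.
// The subtraction is done in unsigned arithmetic to avoid signed-overflow
// UB; the conversion back to int32_t is modular on GCC and Clang.
int32_t wrapping_neg32(int32_t x) {
  return static_cast<int32_t>(0u - static_cast<uint32_t>(x));
}

// abs(x) expressed as max(x, -x).
int32_t abs_via_neg_max(int32_t x) {
  return std::max(x, wrapping_neg32(x));
}
```

Note that `INT32_MIN` maps to itself in this model, since its wrapping negation is again `INT32_MIN`.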

commit | commitdiff | tree

Jonas Devlieghere [Fri, 11 Nov 2022 19:50:45 +0000 (11:50 -0800)]

[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty

Fix an assertion in dsymutil coming from the Reproducer/FileCollector.
When TMPDIR is empty, the root becomes a relative path, triggering an
assertion when adding a relative path to the VFS mapping. This patch
fixes the issue by resolving the relative path and also moves the
assertion up to make it easier to diagnose these issues in the future.

rdar://102170986

Differential revision: https://reviews.llvm.org/D137959

commit | commitdiff | tree

Chen Zheng [Tue, 8 Nov 2022 06:39:09 +0000 (01:39 -0500)]

[PowerPC] make expensive mflr be away from its user in the function prologue

mflr is fairly expensive on Power versions earlier than 10, so we should
schedule the store of the mflr's def away from the mflr.

In epilogue, the expensive mtlr has no user for its def, so it doesn't
matter that the load and the mtlr are back-to-back.

Reviewed By: RolandF

Differential Revision: https://reviews.llvm.org/D137423

commit | commitdiff | tree

River Riddle [Tue, 15 Nov 2022 00:55:33 +0000 (16:55 -0800)]

[mlir][AttrTypeReplacer] Make attribute dictionary replacement optional

This provides an optimization opportunity for clients that don't want/need
to recurse attribute dictionaries.

commit | commitdiff | tree

Xiaodong Liu [Tue, 15 Nov 2022 01:55:03 +0000 (09:55 +0800)]

[LoongArch] Handle register spill in BranchRelaxation pass

When the target of an unconditional branch is out of range, an indirect
branch is used instead. If no register can be scavenged for the
indirect branch, a register has to be spilled to the stack.

Reviewed By: SixWeining, wangleiat

Differential Revision: https://reviews.llvm.org/D137821

commit | commitdiff | tree

Jakub Kuderski [Tue, 15 Nov 2022 01:54:14 +0000 (20:54 -0500)]

[mlir][arith][spirv] Clean up arith-to-spirv. NFC.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D137978

commit | commitdiff | tree

Jakub Kuderski [Tue, 15 Nov 2022 01:51:58 +0000 (20:51 -0500)]

[mlir][arith] Add `arith.shrsi` support to WIE

This includes LIT tests over the generated ops and runtime tests.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D137965

commit | commitdiff | tree

TatWai Chong [Tue, 15 Nov 2022 01:29:45 +0000 (17:29 -0800)]

[mlir][tosa] Create a profile validation pass for TOSA dialect

Add a separate validation pass to check whether TOSA operations match
the specification for the given requirements. Profile type checking is
the initial feature of the pass.

This is an optional pass that can be enabled via the command line, e.g.
$ mlir-opt --tosa-validate="profile=bi" for validating against the
base inference profile.

Description:
TOSA defines a variety of operator behaviors and requirements in the
specification. It would be helpful to have a separate validation pass
that checks that the inputs of TOSA operations match the TOSA
specification for given criteria, and that also reduces the burden of
dialect validation during compilation.

TOSA supports three profiles of which two are for inference purposes.
The main inference profile supports both integer and floating-point
data types, but the base inference profile only supports integers.
In this initial PR, operations are validated against a given TOSA
profile, so that validation fails if a floating point tensor is
present when the base inference profile is selected. Other checks,
e.g. validation of control flow operators and custom operators, will
be added to the pass later if needed.
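
The profile rule described above can be sketched in a few lines of C++. The enum and function names are hypothetical stand-ins for the real MLIR types, and the check is reduced to element kinds only; it is an illustration of the rule, not the pass itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the real MLIR types; names are illustrative.
enum class Profile { BaseInference, MainInference };
enum class ElementKind { Integer, Float };

// Returns true if every tensor element kind is allowed by the profile.
// Base inference is integer-only; main inference accepts both kinds.
bool validateProfile(Profile profile, const std::vector<ElementKind> &kinds) {
  if (profile == Profile::MainInference)
    return true;
  for (ElementKind kind : kinds)
    if (kind == ElementKind::Float)
      return false;
  return true;
}
```

A floating point tensor makes validation fail under `Profile::BaseInference` but is accepted under `Profile::MainInference`.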

The pass is expected to be able to run at any point of the TOSA dialect
conversion/transformation pipeline and not to depend on a particular
pass having run before it, so that it can be used to validate the initial
tosa operations just converted from other dialects, the intermediate form,
or the final tosa operations output.

Change-Id: Ib58349c873c783056e89d2ab3b3312b8d2c61863

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D137279

commit | commitdiff | tree

Roman Lebedev [Tue, 15 Nov 2022 00:29:58 +0000 (03:29 +0300)]

[NFC][Clang] Autogenerate checklines in a test being affected by a patch

commit | commitdiff | tree

Aart Bik [Mon, 14 Nov 2022 21:51:23 +0000 (13:51 -0800)]

[mlir][sparse] avoid nop rewriting on runtime lib path in pipeline

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D137981

commit | commitdiff | tree

Dmitry Vassiliev [Tue, 15 Nov 2022 00:30:00 +0000 (04:30 +0400)]

[NVPTX] Emit pragma nounroll for llvm.loop.unroll.count=1

Emit pragma nounroll for llvm.loop.unroll.count=1 (#pragma unroll 1).

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D137991

commit | commitdiff | tree

Peiming Liu [Tue, 15 Nov 2022 00:02:43 +0000 (00:02 +0000)]

[mlir][sparse] fix memory leak sparse2sparse reshape

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137994

commit | commitdiff | tree

Stella Stamenova [Tue, 15 Nov 2022 00:18:04 +0000 (16:18 -0800)]

Revert "[mlir][sparse] Macros to clean up StridedMemRefType in the SparseTensorRuntime" and "[mlir][sparse] move SparseTensorReader functions into the _mlir_ciface_ section"

This reverts commits 6c22dad and 92bc3fb.

These broke the windows mlir buildbot.

commit | commitdiff | tree

Matt Arsenault [Mon, 14 Nov 2022 20:42:08 +0000 (12:42 -0800)]

GlobalISel: Add debug print for applied rule in generated combiner

commit | commitdiff | tree

Fangrui Song [Mon, 14 Nov 2022 23:51:03 +0000 (15:51 -0800)]

Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"

This reverts commit bf8381a8bce28fc69857645cc7e84a72317e693e.

There is a layering violation: LLVMAnalysis depends on LLVMCore, so
LLVMCore should not include LLVMAnalysis header
llvm/Analysis/ModuleSummaryAnalysis.h

commit | commitdiff | tree

Alexander Shaposhnikov [Mon, 14 Nov 2022 23:15:19 +0000 (23:15 +0000)]

[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm

Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D137768

commit | commitdiff | tree

Peiming Liu [Mon, 14 Nov 2022 22:28:12 +0000 (22:28 +0000)]

[mlir][sparse] fix memory leak in test cases

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137985

commit | commitdiff | tree

wren romano [Mon, 14 Nov 2022 22:43:03 +0000 (14:43 -0800)]

[mlir][sparse] Fix warning on GCC

Reviewed By: aartbik, Peiming

Differential Revision: https://reviews.llvm.org/D137987

commit | commitdiff | tree

Philip Reames [Mon, 14 Nov 2022 22:21:51 +0000 (14:21 -0800)]

[RISCV] Add codegen coverage for select idioms which might benefit from XVentanaCondOps

commit | commitdiff | tree

Joseph Huber [Mon, 14 Nov 2022 20:58:19 +0000 (14:58 -0600)]

[libc] Forward LLVM_LIBC options when using a runtimes build

The `LLVM_ENABLE_RUNTIMES` mode is commonly used to build runtimes that
depend on an up-to-date version of clang. Currently, `libc` uses some
internal variables that are not forwarded when building in this mode.
This patch forwards the relevant arguments beginning with `LLVM_LIBC` to
the build when built this way.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D137977

commit | commitdiff | tree

bixia1 [Mon, 14 Nov 2022 18:05:19 +0000 (10:05 -0800)]

[mlir][sparse] Make three tests run with the codegen path.

Reviewed By: aartbik, Peiming

Differential Revision: https://reviews.llvm.org/D137964

commit | commitdiff | tree

wren romano [Wed, 9 Nov 2022 21:38:11 +0000 (13:38 -0800)]

[mlir][sparse] move SparseTensorReader functions into the _mlir_ciface_ section

Depends On D137735

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137737

commit | commitdiff | tree

wren romano [Wed, 9 Nov 2022 21:35:45 +0000 (13:35 -0800)]

[mlir][sparse] Macros to clean up StridedMemRefType in the SparseTensorRuntime

In particular, this silences warnings from [-Wsign-compare].

Depends On D137681

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137735

commit | commitdiff | tree

wren romano [Wed, 9 Nov 2022 21:33:01 +0000 (13:33 -0800)]

[mlir][sparse] Making way for SparseTensorRuntime to support non-permutations

Systematically updates the SparseTensorRuntime to properly distinguish tensor-dimensions from storage-levels (and their associated ranks, shapes, sizes, indices, etc). With a few exceptions which are noted in the code, this ensures the runtime has all the **semantic** changes necessary to support non-permutations.

(Whereas **operationally**, since we're still using `std::vector<uint64_t>` to represent the mappings, there's no way to pass in any interesting non-permutations. Changing the representation to `std::function` will be done in a separate differential.)

Depends On D137680

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137681

commit | commitdiff | tree

Craig Topper [Mon, 14 Nov 2022 21:34:24 +0000 (13:34 -0800)]

[RISCV] Add PseudoCCMOVGPR to RISCVSExtWRemoval.

This instruction is a conditional move. It propagates sign bits
from its inputs.

commit | commitdiff | tree

Alexander Shaposhnikov [Mon, 14 Nov 2022 21:31:30 +0000 (21:31 +0000)]

Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"

This reverts commit ef9e624694c0f125c53f7d0d3472fd486bada57d
for further investigation offline.
It appears to break the buildbot
llvm-clang-x86_64-sie-ubuntu-fast.

commit | commitdiff | tree

Alexander Shaposhnikov [Mon, 14 Nov 2022 21:10:24 +0000 (21:10 +0000)]

[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm

Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D137768

commit | commitdiff | tree

Jonas Devlieghere [Mon, 14 Nov 2022 21:03:29 +0000 (13:03 -0800)]

Revert "[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty"

This reverts commit 68efb4772c0d0e60cbfb09ea619b58d80c31ff0f because the
test fails on some of the buildbots.

commit | commitdiff | tree

Xiang Li [Fri, 11 Nov 2022 08:00:11 +0000 (00:00 -0800)]

[DirectX backend] Fix build and test error caused by out of sync with upstream change.

Fix build and test error caused by
https://github.com/llvm/llvm-project/commit/a2620e00ffa232a406de3a1d8634beeda86956fd#
and
https://github.com/llvm/llvm-project/commit/304f1d59ca41872c094def3aee0a8689df6aa398

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D137815

commit | commitdiff | tree

Jonas Devlieghere [Fri, 11 Nov 2022 19:50:45 +0000 (11:50 -0800)]

[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty

Fix an assertion in dsymutil coming from the Reproducer/FileCollector.
When TMPDIR is empty, the root becomes a relative path, triggering an
assertion when adding a relative path to the VFS mapping. This patch
fixes the issue by resolving the relative path and also moves the
assertion up to make it easier to diagnose these issues in the future.

rdar://102170986

Differential revision: https://reviews.llvm.org/D137959

commit | commitdiff | tree

Andreas Hollandt [Mon, 14 Nov 2022 20:27:12 +0000 (12:27 -0800)]

[cmake] Fix _GNU_SOURCE being added unconditionally

Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D137917

commit | commitdiff | tree

Nico Weber [Mon, 14 Nov 2022 19:23:56 +0000 (14:23 -0500)]

[COFF, Mach-O] Include -mllvm options in thinlto cache key

Like D134013, but for COFF and Mach-O.

Also expand the ELF test a bit. I at first didn't realize that `getValue()` for
`-mllvm -foo=bar` would return `-foo=bar` instead of just `bar`, and so
I wrote the test to check if we indeed get this wrong. We don't, but
having the test for it seems nice, so I'm including it.

Differential Revision: https://reviews.llvm.org/D137971

commit | commitdiff | tree

Jakub Kuderski [Mon, 14 Nov 2022 20:07:18 +0000 (15:07 -0500)]

[mlir][arith][spirv] Handle i1 sign extension in arith-to-spirv

Also fix some surrounding nits.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D137974

commit | commitdiff | tree

Mehdi Amini [Mon, 14 Nov 2022 06:24:54 +0000 (06:24 +0000)]

Apply clang-tidy fixes for readability-identifier-naming in AlgebraicSimplification.cpp (NFC)

commit | commitdiff | tree

Mehdi Amini [Mon, 14 Nov 2022 06:10:39 +0000 (06:10 +0000)]

Apply clang-tidy fixes for readability-simplify-boolean-expr in GPUDialect.cpp (NFC)

commit | commitdiff | tree

Rob Suderman [Mon, 14 Nov 2022 19:38:16 +0000 (11:38 -0800)]

[mlir][tosa] Remove zero-fill of tosa.concat outputs when lowering to linalg.

Since all output elements are known to be overridden by construction the fill is not required. This change makes the tosa lowering consistent with the MHLO and Torch lowerings of concat which do not do the fill.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D137967

commit | commitdiff | tree

Guozhi Wei [Mon, 14 Nov 2022 19:34:59 +0000 (19:34 +0000)]

[MachineCSE] Allow CSE for instructions with ignorable operands

Ignorable operands don't impact an instruction's behavior, so we can safely do
CSE on the instruction.

It is split from D130919. It has a big impact on some AMDGPU test cases.
For example in atomic_optimizations_raw_buffer.ll, when trying to check if the
following instruction can be CSEed

  %37:vgpr_32 = V_MOV_B32_e32 0, implicit $exec

Function isCallerPreservedOrConstPhysReg is called on operand "implicit $exec",
this function is implemented as

  -  return TRI.isCallerPreservedPhysReg(Reg, MF) ||
  +  return TRI.isCallerPreservedPhysReg(Reg, MF) || TII.isIgnorableUse(MO) ||
            (MRI.reservedRegsFrozen() && MRI.isConstantPhysReg(Reg));

Both TRI.isCallerPreservedPhysReg and MRI.isConstantPhysReg return false on this
operand, so isCallerPreservedOrConstPhysReg is also false, which causes LLVM to
fail to CSE this instruction.

With this patch TII.isIgnorableUse returns true for the operand $exec, so
isCallerPreservedOrConstPhysReg also returns true, which causes this instruction
to be CSEed with the previous instruction

  %14:vgpr_32 = V_MOV_B32_e32 0, implicit $exec

So I got a different result from here. AMDGPU's implementation of isIgnorableUse
is

  bool SIInstrInfo::isIgnorableUse(const MachineOperand &MO) const {
    // Any implicit use of exec by VALU is not a real register read.
    return MO.getReg() == AMDGPU::EXEC && MO.isImplicit() &&
           isVALU(*MO.getParent()) && !resultDependsOnExec(*MO.getParent());
  }

Since the operand $exec is not a real register read, my understanding is it's
reasonable to do CSE on such instructions.

Because more instructions are CSEed, fewer instructions are generated for
these tests.
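
The effect of the extra disjunct can be illustrated with a toy C++ model of the predicate. The struct below is a hypothetical summary of the answers that the queries named in this message give for one operand; the real code asks `TRI`, `TII` and `MRI` directly.

```cpp
#include <cassert>

// Hypothetical per-operand summary; each field mirrors one query from the
// commit message.
struct OperandFacts {
  bool callerPreserved;    // TRI.isCallerPreservedPhysReg(Reg, MF)
  bool ignorableUse;       // TII.isIgnorableUse(MO)
  bool reservedRegsFrozen; // MRI.reservedRegsFrozen()
  bool constantPhysReg;    // MRI.isConstantPhysReg(Reg)
};

// The check as it was before this patch.
bool checkBeforePatch(const OperandFacts &f) {
  return f.callerPreserved || (f.reservedRegsFrozen && f.constantPhysReg);
}

// The check with the added isIgnorableUse disjunct.
bool checkAfterPatch(const OperandFacts &f) {
  return f.callerPreserved || f.ignorableUse ||
         (f.reservedRegsFrozen && f.constantPhysReg);
}
```

For an implicit `$exec` use on a VALU instruction (not caller-preserved, not constant, but ignorable), only the patched check returns true, which is what lets the second `V_MOV_B32_e32` be CSEed.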

Differential Revision: https://reviews.llvm.org/D137222

commit | commitdiff | tree

Matt Arsenault [Fri, 28 Oct 2022 23:10:41 +0000 (16:10 -0700)]

clang/AMDGPU: Use Support's wrapper around getenv

This does some extra stuff for Windows, so might as well
use it just in case.

commit | commitdiff | tree

Quentin Colombet [Thu, 20 Oct 2022 22:18:58 +0000 (22:18 +0000)]

[mlir][MemRef] Change the anchor point of a reshapeLikeOp pattern

Essentially, this patch changes the anchor point of the
`extract_strided_metadata(reshapeLikeOp)` pattern from
`extract_strided_metadata` to `reshapeLikeOp`.

In detail, this means that instead of replacing:
```
base, offset, sizes, strides =
extract_strided_metadata(reshapeLikeOp(src))
```
With
```
base, offset = extract_strided_metadata(src)
sizes = <some math>
strides = <some math>
```

We replace only the reshapeLikeOp part and connect it back with a
reinterpret_cast:
```
val = reshapeLikeOp(src)
```
=>
```
base, offset, ... = extract_strided_metadata(src)
sizes = <some math>
strides = <some math>
val = reinterpret_cast base, offset, sizes, strides
```

Differential Revision: https://reviews.llvm.org/D136386

commit | commitdiff | tree

Quentin Colombet [Wed, 12 Oct 2022 21:23:27 +0000 (21:23 +0000)]

[mlir][MemRef] Change the anchor point of a subview pattern

Essentially, this patch changes the anchor point of the
`extract_strided_metadata(subview)` pattern from
`extract_strided_metadata` to `subview`.

In detail, this means that instead of replacing:
```
base, offset, sizes, strides = extract_strided_metadata(subview(src))
```
With
```
base, ... = extract_strided_metadata(src)
offset = <some math>
sizes = subSizes
strides = <some math>
```

We replace only the subview part and connect it back with a
reinterpret_cast:
```
val = subview(src)
```
=>
```
base, ... = extract_strided_metadata(src)
offset = <some math>
sizes = subSizes
strides = <some math>
val = reinterpret_cast base, offset, sizes, strides
```
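
The `<some math>` placeholders can be made concrete with a small C++ sketch, assuming the usual strided-layout semantics of `memref.subview` (result offset is the source offset plus the sum of `subOffset[i] * srcStride[i]`, and each result stride is `subStride[i] * srcStride[i]`). The struct and function names are hypothetical, and rank-reducing subviews are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain-data view of a strided memref layout.
struct StridedLayout {
  int64_t offset;
  std::vector<int64_t> sizes;
  std::vector<int64_t> strides;
};

// Computes the layout of subview(src) from the source layout and the
// subview's offsets, sizes and strides (all of the same rank).
StridedLayout subviewLayout(const StridedLayout &src,
                            const std::vector<int64_t> &subOffsets,
                            const std::vector<int64_t> &subSizes,
                            const std::vector<int64_t> &subStrides) {
  assert(subOffsets.size() == src.strides.size());
  assert(subSizes.size() == src.strides.size());
  assert(subStrides.size() == src.strides.size());
  StridedLayout result{src.offset, subSizes, {}};
  for (std::size_t i = 0; i < src.strides.size(); ++i) {
    result.offset += subOffsets[i] * src.strides[i];
    result.strides.push_back(subStrides[i] * src.strides[i]);
  }
  return result;
}
```

For a row-major 4x8 source (offset 0, strides `{8, 1}`) and a subview with offsets `{1, 2}`, sizes `{2, 3}` and strides `{1, 2}`, this yields offset 10 and strides `{8, 2}`.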

Differential Revision: https://reviews.llvm.org/D135839

commit | commitdiff | tree

Quentin Colombet [Wed, 12 Oct 2022 21:18:53 +0000 (21:18 +0000)]

[mlir][MemRef] Simplify extract_strided_metadata(reinterpret_cast)

This patch adds a pattern to simplify
```
base, offset, sizes, strides =
extract_strided_metadata(
reinterpret_cast(src, srcOffset, srcSizes, srcStrides))
```

Into
```
base, baseOffset, ... = extract_strided_metadata(src)
offset = srcOffset
sizes = srcSizes
strides = srcStrides
```

Note: `reinterpret_cast` operations with unranked sources are not simplified
since they cannot feed extract_strided_metadata operations.

Differential Revision: https://reviews.llvm.org/D135837

commit | commitdiff | tree

Nico Weber [Mon, 14 Nov 2022 18:30:55 +0000 (13:30 -0500)]

[lto] Update function name in comment after 5f312ad45

commit | commitdiff | tree

Craig Topper [Mon, 14 Nov 2022 18:28:29 +0000 (10:28 -0800)]

[RISCV] Add scalar FP compares to isSignExtendingOpW in RISCVSExtWRemoval.

commit | commitdiff | tree

Joe Nash [Mon, 14 Nov 2022 15:15:27 +0000 (10:15 -0500)]

[AMDGPU][MC][NFC] Rename VOP3 VOPC test files

D136149 and D136148 renamed the MC test files for VOP3 promoted from VOP1 and
VOP2 in a consistent way. Do the same for VOP3 coming from VOPC.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D137950

commit | commitdiff | tree

Craig Topper [Mon, 14 Nov 2022 17:59:03 +0000 (09:59 -0800)]

[RISCV] Move FixableDef handling out of isSignExtendingOpW.

We have two layers of opcode checks. The first is in
isSignExtendingOpW. If that returns false, a second switch is used
for looking through nodes by adding them to the worklist.

Move the FixableDef handling to the second switch. This simplifies
the interface of isSignExtendingOpW and makes that function more
accurate to its name.

commit | commitdiff | tree

Yashwant Singh [Mon, 14 Nov 2022 17:57:08 +0000 (23:27 +0530)]

[GlobalIsel][AMDGPU] Changing legalize rule for G_{UADDO|UADDE|USUBO|USUBE|SADDE|SSUBE}

Generic add and sub with carry are currently legalized in a way that explicitly recalculates the carry/borrow output, i.e.
%6:_(s64), %7:_(s1) = G_UADDO %0, %1
becomes,
%13:_(s32), %14:_(s1) = G_UADDO %2, %4
%15:_(s32), %16:_(s1) = G_UADDE %3, %5, %14
%6:_(s64) = G_MERGE_VALUES %13(s32), %15(s32)
%7:_(s1) = G_ICMP intpred(ult), %6(s64), %1

Here the G_MERGE and G_ICMP instructions are redundant for recalculating the carry output (the case for sub with borrow is similar).
This change fixes this.
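
The arithmetic behind this can be checked with a short C++ sketch: when a 64-bit add is split into a low-half add and a high-half add-with-carry, the carry out of the high half is already the carry of the full 64-bit add, so no extra compare is needed. The struct and function names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

struct AddWithCarry {
  uint64_t sum;
  bool carry;
};

// 64-bit unsigned add built from two 32-bit halves, mirroring the
// G_UADDO (low half) + G_UADDE (high half) split.
AddWithCarry add64ViaHalves(uint64_t a, uint64_t b) {
  // Low halves: the carry-out is bit 32 of the widened sum.
  uint64_t lo = (a & 0xFFFFFFFFu) + (b & 0xFFFFFFFFu);
  bool carryLo = (lo >> 32) != 0;
  // High halves plus the incoming carry.
  uint64_t hi = (a >> 32) + (b >> 32) + (carryLo ? 1 : 0);
  bool carryHi = (hi >> 32) != 0;
  // Merge the two 32-bit results; carryHi is the carry of the 64-bit add.
  uint64_t sum = (hi << 32) | (lo & 0xFFFFFFFFu);
  return {sum, carryHi};
}
```

The returned carry agrees with the compare-based formulation `(a + b) < a` for every input, which is why the trailing `G_ICMP` adds no information.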

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D137932

commit | commitdiff | tree

Thomas Raoux [Sun, 13 Nov 2022 18:52:03 +0000 (18:52 +0000)]

[mlir][linalg] Add reduction tiling using scf.foreachthread

This adds a transformation to tile reduction operations to partial
reduction using scf.foreachthread. This uses
PartialReductionOpInterface to create a merge operation of the partial
tiles.

Differential Revision: https://reviews.llvm.org/D137912

commit | commitdiff | tree

Benjamin Kramer [Mon, 14 Nov 2022 18:02:56 +0000 (19:02 +0100)]

[bazel] Add another missing dependency after D137833

While there run buildifier.

commit | commitdiff | tree

Quentin Colombet [Wed, 12 Oct 2022 00:29:39 +0000 (00:29 +0000)]

[mlir][MemRef] Make reinterpret_cast(extract_strided_metadata) more robust

Prior to this patch the canonicalization pattern that turns
`reinterpret_cast(extract_strided_metadata)` into cast was only applied
when all the input operands of the `reinterpret_cast` are exactly all the
output results of the `extract_strided_metadata`.

This missed simplification opportunities when the values would have held
the same constant values but come from different actual values.

E.g., prior to this patch, a pattern of the form:
```
%base, %offset = extract_strided_metadata %source : memref<i16>
reinterpret_cast %base to offset:[0]
```
would not have been simplified into a simple cast, because %offset is not
the same value object as the constant 0.

This patch teaches this pattern how to check if the constant values
match what the results of the `extract_strided_metadata` operation would
have held.

Differential Revision: https://reviews.llvm.org/D135736

commit | commitdiff | tree

Chenguang Wang [Mon, 14 Nov 2022 17:57:52 +0000 (18:57 +0100)]

[bazel] Fix Bufferization dialect build

D137833 added a new .td file and updated existing files to use it.
It broke bazel build.

Differential Revision: https://reviews.llvm.org/D137961

commit | commitdiff | tree

Jason Molenda [Mon, 14 Nov 2022 17:50:58 +0000 (09:50 -0800)]

Change last-ditch magic address in IRMemoryMap::FindSpace

When we cannot allocate memory in the inferior process, the IR
interpreter's IRMemoryMap::FindSpace will create an lldb local
buffer and assign it an address range in the inferior address
space. When the interpreter sees an address in that range, it
will read/write from the local buffer instead of the target. If
this magic address overlaps with actual data in the target, the
target cannot be accessed through expressions.

Instead of using a high memory address that is validly addressable,
this patch uses an address that cannot be accessed on 64-bit systems
that don't actually use all 64 bits of the virtual address.

Differential Revision: https://reviews.llvm.org/D137682
rdar://96248287

commit | commitdiff | tree

Yabin Cui [Mon, 14 Nov 2022 17:48:44 +0000 (17:48 +0000)]

[Support] Use thread safe version of getpwuid and getpwnam.

The OpenGroup specification doesn't require getpwuid and getpwnam
to be thread-safe, and musl libc's implementation is not thread-safe.
When building clang with musl, this can make clang-scan-deps crash.
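
A minimal sketch of a thread-safe lookup, assuming the POSIX reentrant variant `getpwuid_r` is what is meant by the thread-safe version. The function name and the choice to return the home directory are illustrative, not taken from the patch.

```cpp
#include <cassert>
#include <cerrno>
#include <pwd.h>
#include <string>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Looks up the home directory of `uid` with the reentrant getpwuid_r,
// which writes into a caller-owned buffer instead of static storage.
// Returns false if there is no such user or the lookup fails.
bool homeDirectoryForUid(uid_t uid, std::string &home) {
  long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
  std::vector<char> buffer(hint > 0 ? static_cast<size_t>(hint) : 16384);
  struct passwd entry;
  struct passwd *result = nullptr;
  int rc;
  // Grow the buffer until the entry fits.
  while ((rc = getpwuid_r(uid, &entry, buffer.data(), buffer.size(),
                          &result)) == ERANGE)
    buffer.resize(buffer.size() * 2);
  if (rc != 0 || result == nullptr)
    return false;
  home = result->pw_dir ? result->pw_dir : "";
  return true;
}
```

Because every call uses its own buffer, concurrent lookups from several threads do not overwrite each other's results.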

Reviewed By: pirama

Differential Revision: https://reviews.llvm.org/D137864

commit | commitdiff | tree

Arthur Eubanks [Sun, 13 Nov 2022 22:54:26 +0000 (14:54 -0800)]

[LegacyPM] Remove cl::opts controlling optimization pass manager passes

Move these to the new PM if they're used there.

Part of removing the legacy pass manager for optimization pipeline.

Reland with UseNewGVN usage in clang removed.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D137915

commit | commitdiff | tree

Craig Topper [Mon, 14 Nov 2022 17:17:47 +0000 (09:17 -0800)]

[RISCV] Remove old test case. NFC

This seemed to be testing a pattern for an RV64 Zbp instruction, but
on RV32. On RV32, it's just swizzling registers so isn't very
interesting.

commit | commitdiff | tree

Craig Topper [Mon, 14 Nov 2022 17:15:31 +0000 (09:15 -0800)]

[RISCV] Improve use of PACK instruction on rv64.

Handle the case where the lower bits come from a zero extending
load or other operation with known zero bits.
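
As a scalar illustration (not the actual isel pattern), the RV64 `pack` instruction places the low 32 bits of `rs1` in the lower half of the result and the low 32 bits of `rs2` in the upper half. When the upper 32 bits of the low operand are known to be zero, a plain shift-or of the two values computes the same result. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of RV64 pack: low 32 bits of rs1 go to the lower half,
// low 32 bits of rs2 go to the upper half.
uint64_t pack64(uint64_t rs1, uint64_t rs2) {
  return (rs2 << 32) | (rs1 & 0xFFFFFFFFull);
}

// The shift-or idiom as it might appear in source code.
uint64_t shiftOr(uint64_t lo, uint64_t hi) {
  return (hi << 32) | lo;
}
```

The two functions agree whenever `lo` has no bits set above bit 31, for example when it comes from a zero-extending 32-bit load; they can differ otherwise.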

commit | commitdiff | tree

Arthur Eubanks [Mon, 14 Nov 2022 17:33:38 +0000 (09:33 -0800)]

Revert "[LegacyPM] Remove cl::opts controlling optimization pass manager passes"

This reverts commit 7ec05fec7115a910b2e172de794adc462388c25e.

Breaks bots, e.g. https://lab.llvm.org/buildbot#builders/217/builds/15008

commit | commitdiff | tree

Arthur Eubanks [Sun, 13 Nov 2022 22:54:26 +0000 (14:54 -0800)]

[LegacyPM] Remove cl::opts controlling optimization pass manager passes

Move these to the new PM if they're used there.

Part of removing the legacy pass manager for optimization pipeline.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D137915

commit | commitdiff | tree

Nicolas Vasilache [Sun, 13 Nov 2022 13:28:32 +0000 (05:28 -0800)]

[mlir][Transform]Significantly cleanup scf.foreach_thread and GPU transform permutation handling

Previously, the need for a dense permutation leaked into the thread_dim_mapping specification.
This revision allows using a sparse specification of the thread_dim_mapping, and the proper completion / sorting is applied automatically.

In the process, the semantics of scf.foreach_thread are tightened to require a matching number of thread dimensions and mappings.
The relevant negative test is added.

Differential Revision: https://reviews.llvm.org/D137906

commit | commitdiff | tree

Akash Banerjee [Wed, 9 Nov 2022 15:54:21 +0000 (15:54 +0000)]

Migrate getOrCreateInternalVariable from Clang to OMPIRBuilder.

This patch removes getOrCreateInternalVariable from Clang OMP CodeGen and replaces its uses with OMPBuilder::getOrCreateInternalVariable. It also refactors OMPBuilder::getOrCreateInternalVariable to change the type of name from Twine to StringRef.

Differential Revision: https://reviews.llvm.org/D137720

commit | commitdiff | tree

Lorenzo Chelini [Fri, 11 Nov 2022 12:35:16 +0000 (13:35 +0100)]

[MLIR][Transform] Expose map layout option in `OneShotBufferizeOp`

Expose `function-boundary-type-conversion` in `OneShotBufferizeOp`. To
reuse options between passes and transform operations, create a
`BufferizationEnums.td`.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D137833

commit | commitdiff | tree

Sanjay Patel [Mon, 14 Nov 2022 16:54:47 +0000 (11:54 -0500)]

[InstSimplify] restrict logic fold with partial undef vector

https://alive2.llvm.org/ce/z/4ncsnX

Fixes #58977

commit | commitdiff | tree

Sanjay Patel [Sun, 13 Nov 2022 17:38:12 +0000 (12:38 -0500)]

[SystemZ] improve test for showing store merge miscompile; NFC

See issue #58883 for details.

commit | commitdiff | tree

Philip Reames [Mon, 14 Nov 2022 16:29:55 +0000 (08:29 -0800)]

[RISCV] Implement assembler support for XVentanaCondOps

This change provides an implementation of the XVentanaCondOps vendor extension. This extension is defined in version 1.0.0 of the VTx-family custom instructions specification (https://github.com/ventanamicro/ventana-custom-extensions/releases/download/v1.0.0/ventana-custom-extensions-v1.0.0.pdf) by Ventana Micro Systems.

In addition to the technical contribution, this change is intended to be a test case for our vendor extension policy.

Once this lands, I plan to use this extension to prototype selection lowering to conditional moves. There's an RVI proposal in flight, and the expectation is that lowering to these and the new RVI instructions is likely to be substantially similar.

Differential Revision: https://reviews.llvm.org/D137350

commit | commitdiff | tree

bixia1 [Wed, 9 Nov 2022 17:07:06 +0000 (09:07 -0800)]

[mlir][sparse] Add rewriting rules for sparse_tensor.sort_coo.

Refactor the rewriting of sparse_tensor.sort to support the implementation of
sparse_tensor.sort_coo.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137522

commit | commitdiff | tree

Sylvain Audi [Wed, 9 Nov 2022 15:01:55 +0000 (10:01 -0500)]

[PDB] Don't include input files in the 'cmd' entry of S_ENVBLOCK

MSVC records the command line arguments in S_ENVBLOCK, skipping the input file arguments.
This patch adds this filtering on lld-link side.

Differential Revision: https://reviews.llvm.org/D137723

commit | commitdiff | tree

Simon Pilgrim [Mon, 14 Nov 2022 16:13:16 +0000 (16:13 +0000)]

[MCA][X86] Ensure the avx512 vnni tests use the upper xmm/ymm registers

Ensure we're testing the avx512vl vnni instructions and not the avx vnni instructions

commit | commitdiff | tree

Simon Pilgrim [Mon, 14 Nov 2022 15:57:13 +0000 (15:57 +0000)]

[MCA][X86] Add test coverage for VBMI2 instructions

commit | commitdiff | tree

Chris Bieneman [Mon, 14 Nov 2022 16:28:36 +0000 (10:28 -0600)]

[NFC] Fixing spelling in code comment

commit | commitdiff | tree

bixia1 [Fri, 11 Nov 2022 22:24:26 +0000 (14:24 -0800)]

[mlir][sparse][NFC] Add comments to tests that are run both with and without runtime libraries.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137869

commit | commitdiff | tree

Ivan Kosarev [Mon, 14 Nov 2022 16:10:23 +0000 (16:10 +0000)]

[AMDGPU][AsmParser] Forbid TFE modifiers for MBUF stores.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D137832

commit | commitdiff | tree

Nicholas Guy [Mon, 14 Nov 2022 15:55:44 +0000 (15:55 +0000)]

[NFC] Removal of complex deinterleaving test case complex_mul_v8f64

This test is not particularly useful for testing complex deinterleaving,
especially because f64 multiplies are not supported in MVE. The test is
being removed because it hits an unrelated, pre-existing issue with
register spilling.

commit | commitdiff | tree

Jay Foad [Mon, 14 Nov 2022 15:27:59 +0000 (15:27 +0000)]

[AMDGPU] More use of DivergentBinFrag and friends. NFC.

commit | commitdiff | tree

Nikita Popov [Tue, 18 Oct 2022 10:11:04 +0000 (12:11 +0200)]

[AA] Move MayBeCrossIteration into AAQI (NFC)

Move the MayBeCrossIteration flag from BasicAA into AAQI. This is
in preparation for exposing it to users of the AA API.

commit | commitdiff | tree

Ivan Kosarev [Mon, 14 Nov 2022 12:37:26 +0000 (12:37 +0000)]

[AMDGPU][MC] Support TFE modifiers in MUBUF loads and stores.

Reviewed By: dp, arsenm

Differential Revision: https://reviews.llvm.org/D137783

commit | commitdiff | tree

Mindong Chen [Mon, 14 Nov 2022 15:18:47 +0000 (23:18 +0800)]

[docs][OpaquePtr] Fix hyperlinks

commit | commitdiff | tree

Jay Foad [Mon, 14 Nov 2022 15:14:55 +0000 (15:14 +0000)]

[AMDGPU] Define and use UniformTernaryFrag. NFC.

commit | commitdiff | tree

Simon Pilgrim [Mon, 14 Nov 2022 10:58:20 +0000 (10:58 +0000)]

[X86] Remove unnecessary overrides for CBW/CWDE/CDQE/CMC instructions

All of these match the default WriteALU schedule

commit | commitdiff | tree

Caroline Concatto [Thu, 3 Nov 2022 12:18:20 +0000 (12:18 +0000)]

[AArch64] Add all SME2.1 instructions Assembly/Disassembly

This patch adds a new feature flag:
sme-f16f16 to represent FEAT_SME-F16F16

This patch adds the following instructions:
SME2.1 standalone instructions:
   MOVAZ (array to vector, four registers): Move and zero four ZA single-vector groups to vector registers.
         (array to vector, two registers): Move and zero two ZA single-vector groups to vector registers.
         (tile to vector, four registers): Move and zero four ZA tile slices to vector registers.
         (tile to vector, single): Move and zero ZA tile slice to vector register.
         (tile to vector, two registers): Move and zero two ZA tile slices to vector registers.

   LUTI2 (Strided four registers): Lookup table read with 2-bit indexes.
         (Strided two registers): Lookup table read with 2-bit indexes.

   LUTI4 (Strided four registers): Lookup table read with 4-bit indexes.
         (Strided two registers): Lookup table read with 4-bit indexes.

   ZERO (double-vector): Zero ZA double-vector groups.
        (quad-vector): Zero ZA quad-vector groups.
        (single-vector): Zero ZA single-vector groups.

SME2p1 and SME-F16F16:
All instructions operate on half-precision elements:
   FADD: Floating-point add multi-vector to ZA array vector accumulators.

   FSUB: Floating-point subtract multi-vector from ZA array vector accumulators.

   FMLA (multiple and indexed vector): Multi-vector floating-point fused multiply-add by indexed element.
        (multiple and single vector): Multi-vector floating-point fused multiply-add by vector.
        (multiple vectors): Multi-vector floating-point fused multiply-add.

   FMLS (multiple and indexed vector): Multi-vector floating-point fused multiply-subtract by indexed element.
        (multiple and single vector): Multi-vector floating-point fused multiply-subtract by vector.
        (multiple vectors): Multi-vector floating-point fused multiply-subtract.

   FCVT (widening): Multi-vector floating-point convert from half-precision to single-precision (in-order).

   FCVTL: Multi-vector floating-point convert from half-precision to deinterleaved single-precision.

   FMOPA (non-widening): Floating-point outer product and accumulate.

   FMOPS (non-widening): Floating-point outer product and subtract.

SME2p1 and B16B16:
   BFADD: BFloat16 floating-point add multi-vector to ZA array vector accumulators.

   BFSUB: BFloat16 floating-point subtract multi-vector from ZA array vector accumulators.

   BFCLAMP: Multi-vector BFloat16 floating-point clamp to minimum/maximum number.

   BFMLA (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-add by indexed element.
         (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-add by vector.
         (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-add.

   BFMLS (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-subtract by indexed element.
         (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-subtract by vector.
         (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-subtract.

   BFMAX (multiple and single vector): Multi-vector BFloat16 floating-point maximum by vector.
         (multiple vectors): Multi-vector BFloat16 floating-point maximum.

   BFMAXNM (multiple and single vector): Multi-vector BFloat16 floating-point maximum number by vector.
           (multiple vectors): Multi-vector BFloat16 floating-point maximum number.

   BFMIN (multiple and single vector): Multi-vector BFloat16 floating-point minimum by vector.
         (multiple vectors): Multi-vector BFloat16 floating-point minimum.

   BFMINNM (multiple and single vector): Multi-vector BFloat16 floating-point minimum number by vector.
           (multiple vectors): Multi-vector BFloat16 floating-point minimum number.

   BFMOPA (non-widening): BFloat16 floating-point outer product and accumulate.

   BFMOPS (non-widening): BFloat16 floating-point outer product and subtract.

The reference can be found here:

https://developer.arm.com/documentation/ddi0602/2022-09

Differential Revision: https://reviews.llvm.org/D137571

commit | commitdiff | tree

Nikita Popov [Mon, 14 Nov 2022 14:46:00 +0000 (15:46 +0100)]

[AST] Remove legacy AliasSetPrinter pass

A NewPM version of this pass exists, so drop the legacy version of
this testing-only pass.

commit | commitdiff | tree

Sjoerd Meijer [Fri, 11 Nov 2022 12:56:42 +0000 (18:26 +0530)]

[AArch64] Add match patterns for the reassociated forms of FNMUL

Differential Revision: https://reviews.llvm.org/D137925

commit | commitdiff | tree

Nikita Popov [Mon, 14 Nov 2022 14:28:09 +0000 (15:28 +0100)]

[LoopVersioningLICM] Clarify scope of AST (NFC)

Make it clearer that the AST is only temporarily used during the
legality check, and does not have to survive into the transformation
phase.

commit | commitdiff | tree

Joseph Huber [Mon, 14 Nov 2022 14:11:33 +0000 (08:11 -0600)]

[OpenMP] Fix installation to old resource dir

Summary:
The changes in D125860 renamed the old resource directory to only use
the major version. This was not updated for the OpenMP project, causing
OpenMP resources to still be installed in the old `major.minor.rev`
folder. This led to problems including the header files.

fixes #58966

commit | commitdiff | tree

Luca Di Sera [Mon, 14 Nov 2022 14:17:22 +0000 (15:17 +0100)]

Add clang_CXXMethod_isMoveAssignmentOperator to libclang

The new method is a wrapper of `CXXMethodDecl::isMoveAssignmentOperator` and
can be used to recognize move-assignment operators in libclang.

An export for the function, together with its documentation, was added to
"clang/include/clang-c/Index.h" with an implementation provided in
"clang/tools/libclang/CIndex.cpp". The implementation was based on
similar `clang_CXXMethod.*` implementations, following the same
structure but calling `CXXMethodDecl::isMoveAssignmentOperator` for its
main logic.

The new symbol was further added to "clang/tools/libclang/libclang.map"
to be exported, under the LLVM16 tag.

"clang/tools/c-index-test/c-index-test.c" was modified to print a
specific tag, "(move-assignment operator)", for cursors that are
recognized by `clang_CXXMethod_isMoveAssignmentOperator`.
A new regression test file,
"clang/test/Index/move-assignment-operator.cpp", was added to check that
the correct constructs are recognized by the new function.

The "clang/test/Index/get-cursor.cpp" regression test file was updated
as it was affected by the new "(move-assignment operator)" tag.

A binding for the new function was added to libclang's Python
bindings, in "clang/bindings/python/clang/cindex.py", as a new
method for `Cursor`, `is_move_assignment_operator_method`.
An accompanying test was added to
`clang/bindings/python/tests/cindex/test_cursor.py`, testing the new
function with the same methodology as the corresponding libclang test.

The current release note, `clang/docs/ReleaseNotes.rst`, was modified to
report the new addition under the "libclang" section.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D137246

commit | commitdiff | tree

Nikita Popov [Mon, 14 Nov 2022 14:16:38 +0000 (15:16 +0100)]

[LoopVersioningLICM] Remove unnecessary reset code (NFC)

The LoopVersioningLICM object is only ever used for a single loop,
but there was various unnecessary code for handling the case where
it is reused across loops. Drop that code, and pass the loop to the
constructor.

commit | commitdiff | tree

LLVM GN Syncbot [Mon, 14 Nov 2022 14:05:19 +0000 (14:05 +0000)]

[gn build] Port d52e2839f3b1

commit | commitdiff | tree

Nicholas Guy [Mon, 14 Nov 2022 13:59:59 +0000 (13:59 +0000)]

[ARM][CodeGen] Add support for complex deinterleaving

Adds the Complex Deinterleaving Pass implementing support for complex numbers in a target-independent manner, deferring to the TargetLowering for the given target to create a target-specific intrinsic.

Differential Revision: https://reviews.llvm.org/D114174

commit | commitdiff | tree

revunov.denis@huawei.com [Mon, 14 Nov 2022 13:25:20 +0000 (13:25 +0000)]

[BOLT][NFC] Fix possible use-after-free

If the NewName twine holds a reference to the old name, that reference
is invalidated after Section.Name = NewName.str();
so NewName.str() cannot be used afterwards.
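
The hazard and the safe pattern can be sketched as follows. This is a simplified illustration rather than the BOLT code: `LazyConcat` is a hypothetical stand-in for `llvm::Twine` built on `std::string_view`, and `Section` and `renameSection` are names invented for the example.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for llvm::Twine: it stores views of its operands and
// only builds the concatenated string when str() is called.
struct LazyConcat {
  std::string_view lhs;
  std::string_view rhs;
  std::string str() const { return std::string(lhs) + std::string(rhs); }
};

struct Section {
  std::string Name;
};

// Materialize the new name once, before the assignment can invalidate the
// views held by newName, and reuse that copy afterwards.
std::string renameSection(Section &section, const LazyConcat &newName) {
  std::string materialized = newName.str();
  section.Name = materialized; // newName may refer to the old Name buffer
  return materialized;         // use the copy, never newName.str() again
}
```

Calling `renameSection` with a `LazyConcat` that views `section.Name` and the suffix `.cold` stays well defined, because the views are read only once, before the assignment.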

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D137616

commit | commitdiff | tree

Valentin Clement [Mon, 14 Nov 2022 13:27:44 +0000 (14:27 +0100)]

[flang][NFC] Fix typo in fir.box_typecode op description

commit | commitdiff | tree

Dmitry Preobrazhensky [Mon, 14 Nov 2022 13:20:20 +0000 (16:20 +0300)]

[AMDGPU][MC][GFX11] Improve diagnostic messages for invalid VOPD syntax

Differential Revision: https://reviews.llvm.org/D137842

commit | commitdiff | tree

Nicolas Vasilache [Wed, 9 Nov 2022 11:56:26 +0000 (03:56 -0800)]

[mlir][Transform] Add support for dynamically unpacking tile_sizes / num_threads in tile_to_foreach_thread

This commit adds automatic unpacking of Value's of type pdl::OperationType to the underlying single-result OpResult.
This allows mixing single-value, attribute and multi-value pdl::Operation tile sizes and num threads to TileToForeachThreadOp.

Differential Revision: https://reviews.llvm.org/D137896

commit | commitdiff | tree

Ying Yi [Mon, 10 Oct 2022 12:26:56 +0000 (13:26 +0100)]

[ThinLTO] Add a warning if cache_size_bytes or cache_size_files is too small for the current link job. The warning recommends that the user consider adjusting --thinlto-cache-policy.

ThinLTO cache pruning is ineffective when the current build is so large that the cache cannot hold all of its intermediate object files: a file is cached and then evicted later in that same build, which significantly decreases the effectiveness of the cache. With this warning, the user can identify the required cache size or file count and improve ThinLTO link speed.
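
The condition behind such a warning can be sketched as follows. This is an assumption-laden illustration, not the LLVM implementation: `CachePolicy`, its fields, `cacheTooSmallForLink`, and the convention that 0 means "no limit" are all hypothetical names and choices made for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical policy limits; in this sketch 0 means "no limit".
struct CachePolicy {
  std::uint64_t maxSizeBytes = 0;
  std::uint64_t maxSizeFiles = 0;
};

// Returns true when the object files produced by a single link job cannot all
// fit in the cache, which is the situation that would deserve a warning.
bool cacheTooSmallForLink(const CachePolicy &policy,
                          const std::vector<std::uint64_t> &fileSizes) {
  std::uint64_t totalBytes = 0;
  for (std::uint64_t size : fileSizes)
    totalBytes += size;
  const bool tooManyFiles =
      policy.maxSizeFiles != 0 && fileSizes.size() > policy.maxSizeFiles;
  const bool tooManyBytes =
      policy.maxSizeBytes != 0 && totalBytes > policy.maxSizeBytes;
  return tooManyFiles || tooManyBytes;
}
```

When the check returns true, files cached early in the link are evicted before the link finishes, so a later build finds fewer cache hits than expected.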

Differential Revision: https://reviews.llvm.org/D135590

commit | commitdiff | tree

Jay Foad [Mon, 14 Nov 2022 11:36:24 +0000 (11:36 +0000)]

[AMDGPU] Simplify SelectPat and remove comment obsoleted by D133593

commit | commitdiff | tree

Thomas Symalla [Mon, 14 Nov 2022 11:55:05 +0000 (12:55 +0100)]

[InstCombine][NFC] Add extractelement tests

commit | commitdiff | tree

HanSheng Zhang [Mon, 14 Nov 2022 11:45:23 +0000 (12:45 +0100)]

[reg2mem] Skip non-sized Instructions (PR58890)

We can only convert sized values into alloca/load/store, skip
instructions returning other types.

Fixes https://github.com/llvm/llvm-project/issues/58890.

Differential Revision: https://reviews.llvm.org/D137700

commit | commitdiff | tree

Christian Sigg [Mon, 14 Nov 2022 11:21:59 +0000 (12:21 +0100)]

[mlir][bazel] NFC: change MLIR_GPU_TO_CUBIN_PASS_ENABLE from `defines` to `local_defines`.

commit | commitdiff | tree

Joshua Cao [Mon, 14 Nov 2022 03:24:15 +0000 (22:24 -0500)]

Do not write a comma when varargs is the only argument

Fixes https://github.com/llvm/llvm-project/issues/56544

AsmWriter always writes ", ..." when a tail call has a varargs argument. This patch only writes the ", " when there is an argument before the varargs argument.
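
The separator rule can be sketched as follows. This is a minimal illustration, not the AsmWriter code: `formatParams` is a hypothetical function that renders a parameter list from plain strings.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the printing logic: renders a parameter list and
// appends "..." for varargs, writing ", " only when a parameter precedes it.
std::string formatParams(const std::vector<std::string> &params, bool isVarArg) {
  std::string out = "(";
  for (std::size_t i = 0; i < params.size(); ++i) {
    if (i != 0)
      out += ", ";
    out += params[i];
  }
  if (isVarArg) {
    if (!params.empty())
      out += ", "; // separator only when something was already written
    out += "...";
  }
  out += ")";
  return out;
}
```

In this sketch a varargs-only list renders as `(...)`, while a list with one fixed `i32` parameter renders as `(i32, ...)`.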

I did not write a dedicated test for this change, but I modified an existing test so that it will catch a regression.

Reviewed By: avogelsgesang

Differential Revision: https://reviews.llvm.org/D137893

Signed-off-by: Adrian Vogelsgesang <avogelsgesang@salesforce.com>

commit | commitdiff | tree

Jean Perier [Mon, 14 Nov 2022 10:19:21 +0000 (11:19 +0100)]

[flang] Add hlfir.declare codegen

hlfir.declare codegen generates a fir.declare, and may generate a
fir.embox/fir.rebox/fir.emboxchar if the base value does not convey
all the variable bounds and length parameter information.

Leave OPTIONAL as a TODO to keep this patch simple. It will require
making the embox/rebox optional to preserve the optionality aspects.

Differential Revision: https://reviews.llvm.org/D137789

commit | commitdiff | tree

LLVM GN Syncbot [Mon, 14 Nov 2022 10:12:18 +0000 (10:12 +0000)]

[gn build] Port dd46a08008f7
