platform/upstream/llvm.git
21 months ago[Assignment Tracking][23/*] Account for assignment tracking in SLP Vectorizer
OCHyams [Tue, 15 Nov 2022 15:17:30 +0000 (15:17 +0000)]
[Assignment Tracking][23/*] Account for assignment tracking in SLP Vectorizer

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

The SLP-Vectorizer can merge a set of scalar stores into a single vectorized
store. Merge DIAssignID intrinsics from the scalar stores onto the new
vectorized store.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133320

21 months agoReapply [Hexagon] Use default attributes for intrinsics
Nikita Popov [Tue, 8 Nov 2022 10:48:03 +0000 (11:48 +0100)]
Reapply [Hexagon] Use default attributes for intrinsics

The issue that caused the revert has been fixed in:
44bd80751274a81c870882968ecd478b03af292a

-----

This switches Hexagon intrinsics to use the default attributes
(nosync, nofree, nocallback and willreturn). Especially willreturn
is needed to prevent optimization regressions in the future.

The only intrinsics I've excluded here are the load/store locked
intrinsics, which presumably aren't nosync.

Differential Revision: https://reviews.llvm.org/D137623

21 months ago[Assignment Tracking][22/*] Add loop-deletion test
OCHyams [Tue, 15 Nov 2022 14:40:38 +0000 (14:40 +0000)]
[Assignment Tracking][22/*] Add loop-deletion test

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

This test covers the NFC-for-normal-debug-info change D133303.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133319

21 months ago[NFC] Fix the typo and the format in the StandardCPlusPlusModules
Chuanqi Xu [Tue, 15 Nov 2022 14:52:21 +0000 (22:52 +0800)]
[NFC] Fix the typo and the format in the StandardCPlusPlusModules
document

21 months ago[Hexagon] Adjust handling of stack with variable-size and extra alignment
Krzysztof Parzyszek [Mon, 14 Nov 2022 16:23:03 +0000 (08:23 -0800)]
[Hexagon] Adjust handling of stack with variable-size and extra alignment

Make the stack alignment register (AP) reserved in the given function. This
will make it available everywhere in the function, and allow aligned access
to vector register spill slots.

21 months ago[libc++] Make it an error to define _LIBCPP_DEBUG
Louis Dionne [Mon, 14 Nov 2022 19:56:35 +0000 (09:56 -1000)]
[libc++] Make it an error to define _LIBCPP_DEBUG

We have been transitioning off of that macro since LLVM 15.

Differential Revision: https://reviews.llvm.org/D137975

21 months ago[MergeICmps][NFC] Fix a couple of typos in a comment
Fraser Cormack [Tue, 15 Nov 2022 14:46:23 +0000 (14:46 +0000)]
[MergeICmps][NFC] Fix a couple of typos in a comment

21 months ago[MemProf] ThinLTO summary support
Teresa Johnson [Tue, 11 Oct 2022 21:00:37 +0000 (14:00 -0700)]
[MemProf] ThinLTO summary support

Implements the ThinLTO summary support for memprof related metadata.

This includes support for the assembly format, and for building the
summary from IR during ModuleSummaryAnalysis.

To reduce space in both the bitcode format and the in memory index,
we do 2 things:
1. We keep a single vector of all unique stack id hashes, and record the
   index into this vector in the callsite and allocation memprof
   summaries.
2. When building the combined index during the LTO link, the callsite
   and allocation memprof summaries are only kept on the FunctionSummary
   of the prevailing copy.
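
A minimal sketch of the first space-saving step above, assuming a plain list plus a lookup dictionary. The `StackIdTable` class and its `intern` method are hypothetical names chosen for illustration; they are not taken from the patch.

```python
class StackIdTable:
    """Hypothetical model: one shared vector of unique stack id hashes."""

    def __init__(self):
        self.ids = []      # single vector of unique stack id hashes
        self._index = {}   # hash -> position in self.ids

    def intern(self, stack_id):
        """Return the index of stack_id, appending it on first sight."""
        if stack_id not in self._index:
            self._index[stack_id] = len(self.ids)
            self.ids.append(stack_id)
        return self._index[stack_id]


table = StackIdTable()
# Two allocation contexts that share the frames 0xAAAA and 0xBBBB.
# Each summary records indices into the shared vector, not the hashes.
context_a = [table.intern(h) for h in (0xAAAA, 0xBBBB, 0xCCCC)]
context_b = [table.intern(h) for h in (0xAAAA, 0xBBBB, 0xDDDD)]
```

Shared frames are stored once in `table.ids`, and each summary carries only small indices.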

Differential Revision: https://reviews.llvm.org/D135714

21 months ago[AVR] Add FeatureEIJMPCALL to FamilyAVR6
Ayke van Laethem [Mon, 7 Nov 2022 17:57:35 +0000 (18:57 +0100)]
[AVR] Add FeatureEIJMPCALL to FamilyAVR6

This feature was probably missed when adding FamilyAVR6, but should
definitely be there. I checked all four devices in the AVR6 family and
they all support eijmp/eicall.

Found while working on https://reviews.llvm.org/D137572.

Differential Revision: https://reviews.llvm.org/D137573

21 months ago[AVR][Clang] Implement __AVR_ARCH__ macro
Ayke van Laethem [Mon, 7 Nov 2022 02:36:08 +0000 (03:36 +0100)]
[AVR][Clang] Implement __AVR_ARCH__ macro

This macro is defined in avr-gcc, and is very useful especially in
assembly code to check whether particular instructions are supported. It
is also the basis for other macros like __AVR_HAVE_ELPM__.

Differential Revision: https://reviews.llvm.org/D137521

21 months ago[AVR][Clang] Move family names into MCU list
Ayke van Laethem [Mon, 7 Nov 2022 02:04:54 +0000 (03:04 +0100)]
[AVR][Clang] Move family names into MCU list

This simplifies the code by avoiding some special cases for family names
(as opposed to device names).

Differential Revision: https://reviews.llvm.org/D137520

21 months ago[clang-tidy] Optionally ignore findings in macros in `readability-const-return-type`.
Thomas Etter [Mon, 14 Nov 2022 19:08:11 +0000 (19:08 +0000)]
[clang-tidy] Optionally ignore findings in macros in `readability-const-return-type`.

Adds support for options-controlled configuration of the check to ignore results in macros.

Differential Revision: https://reviews.llvm.org/D137972

21 months ago[clang-tidy] Ignore overriden methods in `readability-const-return-type`.
Thomas Etter [Mon, 14 Nov 2022 18:56:07 +0000 (18:56 +0000)]
[clang-tidy] Ignore overriden methods in `readability-const-return-type`.

Overrides are constrained by the signature of the overridden method, so a
warning on an override is frequently unactionable.

Differential Revision: https://reviews.llvm.org/D137968

21 months agoPEI should be able to use backward walk in replaceFrameIndicesBackward.
Alexander Timofeev [Fri, 4 Nov 2022 20:16:46 +0000 (21:16 +0100)]
PEI should be able to use backward walk in replaceFrameIndicesBackward.

The backward register scavenger has correct register
liveness information. PEI should leverage the backward register scavenger.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D137574

21 months ago[Assignment Tracking][20/*] Account for assignment tracking in DSE
OCHyams [Tue, 15 Nov 2022 13:38:03 +0000 (13:38 +0000)]
[Assignment Tracking][20/*] Account for assignment tracking in DSE

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

DeadStoreElimination shortens stores that are shadowed by later stores such
that the overlapping part of the earlier store is omitted. Insert an unlinked
dbg.assign intrinsic with a variable fragment that describes the omitted part
to signal that that fragment of the variable has a stale value in memory.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133315

21 months ago[AAPointerInfo] refactor how offsets and Access objects are tracked
Sameer Sahasrabuddhe [Tue, 15 Nov 2022 13:22:11 +0000 (18:52 +0530)]
[AAPointerInfo] refactor how offsets and Access objects are tracked

This restores commit b756096b0cbef0918394851644649b3c28a886e2, which was
originally reverted in 00b09a7b18abb253d36b3d3e1c546007288f6e89.

AAPointerInfo now maintains a list of all Access objects that it owns, along
with the following maps:

- OffsetBins: OffsetAndSize -> { Access }
- InstTupleMap: RemoteI x LocalI -> Access

A RemoteI is any instruction that accesses memory. RemoteI is different from
LocalI if and only if LocalI is a call; then RemoteI is some instruction in the
callgraph starting from LocalI.

Motivation: When AAPointerInfo recomputes the offset for an instruction, it sets
the value to Unknown if the new offset is not the same as the old offset. The
instruction must now be moved from its current bin to the bin corresponding to
the new offset. This happens for example, when:

- A PHINode has operands that result in different offsets.
- The same remote inst is reachable from the same local inst via different paths
  in the callgraph:

```
               A (local inst)
               |
               B
              / \
             C1  C2
              \ /
               D (remote inst)

```
This fixes a bug where a store is incorrectly eliminated in a lit test.

Reviewed By: jdoerfert, ye-luo

Differential Revision: https://reviews.llvm.org/D136526

21 months ago[Assignment Tracking][19/*] Account for assignment tracking in ADCE
OCHyams [Tue, 15 Nov 2022 13:11:25 +0000 (13:11 +0000)]
[Assignment Tracking][19/*] Account for assignment tracking in ADCE

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

In an attempt to preserve more info, don't delete dbg.assign intrinsics that
are considered "out of scope" if they're linked to instructions.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133314

21 months ago[AArch64][SVE] Fix bad PTEST(PTRUE_ALL, PTEST_LIKE) optimization
Cullen Rhodes [Tue, 15 Nov 2022 12:00:00 +0000 (12:00 +0000)]
[AArch64][SVE] Fix bad PTEST(PTRUE_ALL, PTEST_LIKE) optimization

AArch64InstrInfo::optimizePTestInstr attempts to remove a PTEST of a
predicate generating operation that identically sets flags (implicitly).

When the mask is an all active of matching element size the PTEST is
currently removed. For while instructions this is correct since they
perform an implicit PTEST with an all active mask. However, for other
instructions such as compares the mask could be different.

This patch fixes this bug by only removing the PTEST if the same all
active mask is used by the predicate-generating instruction.

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D137718

21 months agoUse TI.hasBuiltinAtomic() when setting ATOMIC_*_LOCK_FREE values. NFCI
Alex Richardson [Tue, 15 Nov 2022 12:29:23 +0000 (12:29 +0000)]
Use TI.hasBuiltinAtomic() when setting ATOMIC_*_LOCK_FREE values. NFCI

I noticed that the values for __{CLANG,GCC}_ATOMIC_POINTER_LOCK_FREE were
incorrectly set to 1 instead of 2 in downstream CHERI targets because
pointers are handled specially there. While fixing this downstream, I
noticed that the existing code could be refactored to use
TargetInfo::hasBuiltinAtomic instead of repeating the almost identical
logic. In theory there could be a difference here since hasBuiltinAtomic() also
returns true for types less than 1 char in size, but since
InitializePredefinedMacros() never passes such a value this change should
not introduce any functional changes.

Reviewed By: rprichard, efriedma

Differential Revision: https://reviews.llvm.org/D135142

21 months ago[AArch64] Add some missing tests for FNMADD combine patterns. NFC.
Sjoerd Meijer [Tue, 15 Nov 2022 12:22:56 +0000 (17:52 +0530)]
[AArch64] Add some missing tests for FNMADD combine patterns. NFC.

21 months ago[clang][Tooling] Sort filenames in test
Kadir Cetinkaya [Tue, 15 Nov 2022 12:31:29 +0000 (13:31 +0100)]
[clang][Tooling] Sort filenames in test

21 months ago[Assignment Tracking][18/*] Account for assignment tracking in LICM
OCHyams [Tue, 15 Nov 2022 12:18:49 +0000 (12:18 +0000)]
[Assignment Tracking][18/*] Account for assignment tracking in LICM

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Merge DIAssignID attachments on stores that are merged and sunk out of
loops. The store may be sunk into multiple exit blocks, and in this case all
the copies of the store get the same DIAssignID.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133313

21 months ago[NFC][AArch64] SME2 Add instruction name convention and fix LookupTable number of...
Caroline Concatto [Tue, 8 Nov 2022 12:49:56 +0000 (12:49 +0000)]
[NFC][AArch64] SME2 Add instruction name convention and fix LookupTable number of registers

This patch adds the naming convention for SME instructions.
It also fixes the number of registers for LookupTable in the AsmParser.
The number of registers is not used at the moment, but it is needed.
The switch in getNumRegsForRegKind needs to cover every RegKind
enumerator.

21 months ago[NFC][SME2] Change instruction name for ADD/SUB array accumulator
Caroline Concatto [Tue, 15 Nov 2022 11:50:20 +0000 (11:50 +0000)]
[NFC][SME2] Change instruction name for ADD/SUB array accumulator

Now the names for all ADD, SUB, FADD and FSUB array accumulators instructions
are consistent with the developer's page and their operands.

21 months ago[AArch64][SVE] Fix bad PTEST(X, X) optimization
Cullen Rhodes [Tue, 15 Nov 2022 10:55:15 +0000 (10:55 +0000)]
[AArch64][SVE] Fix bad PTEST(X, X) optimization

AArch64InstrInfo::optimizePTestInstr attempts to remove a PTEST of a
predicate generating operation that identically sets flags (implicitly).

When the mask is the same as the input predicate the PTEST is currently
removed. This is incorrect since the mask for the implicit PTEST
performed by the flag-setting instruction differs from the mask
specified to the explicit PTEST and could set different flags.

For example, consider

  PG=<1, 1, x, x>
  Z0=<1, 2, x, x>
  Z1=<2, 1, x, x>

  X=CMPLE(PG, Z0, Z1)
   =<0, 1, x, x>       NZCV=0xxx
  PTEST(X, X),         NZCV=1xxx

where the first active flag (bit 'N' in NZCV) is set by the explicit
PTEST, but not by the implicit PTEST as part of the compare. Given the
PTEST mask and source are the same however, first is equivalent to any,
so the PTEST could be removed if the condition is changed. The same
applies to last active. It is safe to remove the PTEST for any active,
but this information isn't available in the current optimization.
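
The example above can be replayed with a small model of the flag computation. This is only a sketch: `first_active` and `any_active` are hypothetical helpers, the don't-care lanes (`x`) are assumed to be inactive, and only the "first active" and "any active" conditions are modelled.

```python
def first_active(mask, pred):
    """Value of pred at the first lane enabled in mask (the 'N' condition)."""
    for m, p in zip(mask, pred):
        if m:
            return bool(p)
    return False


def any_active(mask, pred):
    """True if any lane enabled in mask is also set in pred."""
    return any(bool(m and p) for m, p in zip(mask, pred))


PG = [1, 1, 0, 0]  # governing predicate of the compare ('x' lanes assumed 0)
X = [0, 1, 0, 0]   # compare result taken from the example above

implicit_n = first_active(PG, X)  # flag set implicitly by the compare
explicit_n = first_active(X, X)   # flag set by PTEST(X, X)
```

The two "first active" results differ, while `any_active(PG, X)` and `any_active(X, X)` agree, which matches the observation that only the any active case is safe.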

This patch fixes the bad optimization; a later patch will implement the
optimization proposed above and fix the any active case.

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D137717

21 months ago[Assignment Tracking][17/*] Account for assignment tracking in memcpyopt
OCHyams [Tue, 15 Nov 2022 11:51:10 +0000 (11:51 +0000)]
[Assignment Tracking][17/*] Account for assignment tracking in memcpyopt

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Maintain and propagate DIAssignID attachments in memcpyopt.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133312

21 months ago[Assignment Tracking][16/*] Account for assignment tracking in mldst-motion
OCHyams [Tue, 15 Nov 2022 11:28:20 +0000 (11:28 +0000)]
[Assignment Tracking][16/*] Account for assignment tracking in mldst-motion

The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

mldst-motion will merge and sink the stores in if-diamond branches into the
common successor. Attach a merged DIAssignID to the merged store.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133311

21 months ago[Assignment Tracking] Update mem2reg tests to use opaque pointers
OCHyams [Tue, 15 Nov 2022 11:20:59 +0000 (11:20 +0000)]
[Assignment Tracking] Update mem2reg tests to use opaque pointers

Follow up to 0946e463e8649896654b0dd39193db76a5789e11 (D133295).

21 months ago[Assignment Tracking][12/*] Account for assignment tracking in mem2reg
OCHyams [Tue, 15 Nov 2022 10:52:45 +0000 (10:52 +0000)]
[Assignment Tracking][12/*] Account for assignment tracking in mem2reg

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

The changes for assignment tracking in mem2reg don't require much of a
deviation from existing behaviour. dbg.assign intrinsics linked to an alloca
are treated much in the same way as dbg.declare users of an alloca, except that
we don't insert dbg.value intrinsics to describe assignments when there is
already a dbg.assign intrinsic present, e.g. one linked to a store that is
going to be removed.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133295

21 months ago[MCA][X86] Ensure the avx512 gfni tests use the upper xmm/ymm registers
Simon Pilgrim [Tue, 15 Nov 2022 11:06:50 +0000 (11:06 +0000)]
[MCA][X86] Ensure the avx512 gfni tests use the upper xmm/ymm registers

Ensure we're testing the avx512vl gfni instructions and not the avx gfni instructions

21 months ago[flang] Lower intrinsic assignment to fir.assign
Jean Perier [Tue, 15 Nov 2022 11:01:21 +0000 (12:01 +0100)]
[flang] Lower intrinsic assignment to fir.assign

Lower intrinsic assignment to hlfir.assign, except when the LHS
is a whole allocatable (this part will be done later to keep patch
simpler).

Differential Revision: https://reviews.llvm.org/D138013

21 months ago[flang] Add hlfir.assign definition
Jean Perier [Tue, 15 Nov 2022 10:56:29 +0000 (11:56 +0100)]
[flang] Add hlfir.assign definition

Add hlfir.assign that represents Fortran assignment.
See https://github.com/llvm/llvm-project/blob/main/flang/docs/HighLevelFIR.md.
Operation attributes will be added later when they can be used.

Differential Revision: https://reviews.llvm.org/D138012

21 months ago[AArch64][SVE] Fix bad PTEST(PG, OP(PG, ...)) optimization
Cullen Rhodes [Tue, 15 Nov 2022 10:34:23 +0000 (10:34 +0000)]
[AArch64][SVE] Fix bad PTEST(PG, OP(PG, ...)) optimization

AArch64InstrInfo::optimizePTestInstr attempts to remove a PTEST of a
predicate generating operation that identically sets flags (implicitly).

When the PTEST and the predicate-generating operation use the same mask
the PTEST is currently removed. This is incorrect since it doesn't
consider element size. PTEST operates on 8-bit predicates, but for
instructions like compare that also support 16/32/64-bit predicates, the
implicit PTEST performed by the instruction will consider fewer lanes
for these element sizes and could set different first or last active
flags.

For example, consider the following instruction sequence

  ptrue p0.b ; P0=1111-1111-1111-1111
  index z0.s, #0, #1 ; Z0=<0,1,2,3>
  index z1.s, #1, #1 ; Z1=<1,2,3,4>
  cmphi p1.s, p0/z, z1.s, z0.s  ; P1=0001-0001-0001-0001
;       ^ last active
  ptest p0, p1.b ; P1=0001-0001-0001-0001
;     ^ last active

where the compare generates a canonical all active 32-bit predicate (equivalent
to 'ptrue p1.s, all'). The implicit PTEST sets the last active flag, whereas
the PTEST instruction with the same mask doesn't.
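
The element-size effect in the sequence above can be sketched with a bit-level model. The assumptions here are a 128-bit vector (16 predicate bits, one per byte), list index `i` standing for predicate bit `i`, and a hypothetical `last_active` helper that looks only at the lowest bit of each element.

```python
VL_BYTES = 16  # assumed 128-bit vector: one predicate bit per byte


def last_active(mask, pred, esize):
    """Value of pred at the last element enabled in mask.

    esize is the element size in bytes; only the lowest predicate bit of
    each element is significant.
    """
    lanes = [i for i in range(0, VL_BYTES, esize) if mask[i]]
    return bool(pred[lanes[-1]]) if lanes else False


p0 = [1] * VL_BYTES    # ptrue p0.b
p1 = [1, 0, 0, 0] * 4  # cmphi result: canonical all-active .s predicate

implicit_last = last_active(p0, p1, esize=4)  # compare tests 32-bit elements
explicit_last = last_active(p0, p1, esize=1)  # ptest tests 8-bit elements
```

With 32-bit elements the last element considered is bit 12, which is set; with 8-bit elements it is bit 15, which is clear, so the two tests disagree on the last active flag.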

This patch restricts the optimization to instructions operating on 8-bit
predicates. One caveat is that the optimization is safe regardless of
element size for any active; this will be addressed in a later patch.

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D137716

21 months ago[AArch64][SVE] Add more ptest removal tests
Cullen Rhodes [Tue, 15 Nov 2022 10:32:28 +0000 (10:32 +0000)]
[AArch64][SVE] Add more ptest removal tests

Precommit test containing examples where PTEST is incorrectly optimized
away. Later patches will fix incorrect optimizations.

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D137715

21 months ago[lit] [Windows] Print exit codes > 255 as hex too
Martin Storsjö [Wed, 2 Nov 2022 13:41:06 +0000 (13:41 +0000)]
[lit] [Windows] Print exit codes > 255 as hex too

891bb4872c8098f8a851a92e989af3252b46f5ad made negative exit codes
be printed as hex, which makes it easier to recognize e.g.
0xC0000005 instead of -1073741819. However, current Python versions
(at least the ones I'm using) seem to end up with positive unsigned
return codes, so that again ends up printed as 3221225477.

Print any return code over 255 as a hexadecimal number instead.
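
A minimal sketch of the formatting rule described above. `format_exit_code` is a hypothetical helper, not lit's actual function, and the 32-bit mask is an assumption based on Windows status codes.

```python
def format_exit_code(code):
    """Render small exit codes in decimal and everything else as 32-bit hex."""
    if code < 0 or code > 255:
        # Mask to 32 bits so negative and unsigned values print identically.
        return "0x%08X" % (code & 0xFFFFFFFF)
    return str(code)


signed_form = format_exit_code(-1073741819)
unsigned_form = format_exit_code(3221225477)
```

Both representations of the access-violation status then print as the same recognizable hexadecimal value.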

Differential Revision: https://reviews.llvm.org/D137771

21 months ago[clang][Tooling] Make the filename behaviour consistent
Kadir Cetinkaya [Mon, 14 Nov 2022 17:55:19 +0000 (18:55 +0100)]
[clang][Tooling] Make the filename behaviour consistent

Dotdots were removed only when converting a relative path to absolute
one. This patch makes the behaviour consistent for absolute file paths by
removing them in that case as well. Also updates the documentation to mention
the behaviour.
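
The consistent behaviour can be illustrated with Python's `posixpath` as a stand-in; the patch itself works on LLVM's own path utilities, so `to_absolute` below is a hypothetical helper and not the Tooling API.

```python
import posixpath


def to_absolute(path, working_dir):
    """Make path absolute and remove dotdots in both cases."""
    if not posixpath.isabs(path):
        path = posixpath.join(working_dir, path)
    # Lexical normalization: removes ".." and "." without touching the disk.
    return posixpath.normpath(path)


from_relative = to_absolute("src/../lib/a.cpp", "/work")
from_absolute = to_absolute("/work/src/../lib/a.cpp", "/elsewhere")
```

Both inputs now map to the same file name, regardless of whether the original path was relative or absolute.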

Fixes https://github.com/clangd/clangd/issues/1317

Differential Revision: https://reviews.llvm.org/D137962

21 months agoPre-commit load/store cases for PowerPC direct-move
Qiu Chaofan [Tue, 15 Nov 2022 09:34:18 +0000 (17:34 +0800)]
Pre-commit load/store cases for PowerPC direct-move

21 months ago[C++20] [Modules] Attach implicitly declared allocation funcitons to
Chuanqi Xu [Tue, 15 Nov 2022 03:35:21 +0000 (11:35 +0800)]
[C++20] [Modules] Attach implicitly declared allocation funcitons to
global module fragment

[basic.stc.dynamic.general]p2 says:
> The library provides default definitions for the global allocation
> and deallocation functions. Some global allocation and
> deallocation
> functions are replaceable ([new.delete]); these are attached to
> the global module ([module.unit]).

But we did not implement this before, so the implicitly generated
functions lived in the module purview when compiling a module unit. This
is bad since the owning module affects the linkage of the
declarations. This patch addresses this.

Closes https://github.com/llvm/llvm-project/issues/58560

21 months ago[gn build] Port 1ebfe9b264bb
LLVM GN Syncbot [Tue, 15 Nov 2022 09:07:55 +0000 (09:07 +0000)]
[gn build] Port 1ebfe9b264bb

21 months ago[CMake][compiler-rt] Don't load LLVM config in the runtimes build
Petr Hosek [Fri, 11 Nov 2022 22:50:14 +0000 (22:50 +0000)]
[CMake][compiler-rt] Don't load LLVM config in the runtimes build

LLVM runtimes build already loads the LLVM config and sets all
appropriate variables, no need to do it again.

Differential Revision: https://reviews.llvm.org/D137870

21 months ago[TargetParser] Split AArch64TargetParser from ARMTargetParser
Tomas Matheson [Sun, 23 Oct 2022 21:19:49 +0000 (22:19 +0100)]
[TargetParser] Split AArch64TargetParser from ARMTargetParser

AArch64TargetParser reuses data structures and some data from ARMTargetParser,
which causes more problems than it solves. This change separates them.

Code which is common to ARM and AArch64 is moved to ARMTargetParserCommon
which both ARMTargetParser and AArch64TargetParser use.

Some of the information in AArch64TargetParser.def was unused or nonsensical
(CPU_ATTR, ARCH_ATTR, ARCH_FPU) because it reused data structures from
ARMTargetParser where some of these make sense. These are removed.

Differential Revision: https://reviews.llvm.org/D137924

21 months ago[code-owners] Add Vassil as a code owner for clang incremental compilation.
Vassil Vassilev [Tue, 15 Nov 2022 08:52:45 +0000 (08:52 +0000)]
[code-owners] Add Vassil as a code owner for clang incremental compilation.

Update the code owners file based on the discussion in the forum:
https://discourse.llvm.org/t/rfc-add-a-code-owner-for-incremental-compilation-incremental-c/66345

21 months ago[flang] Lower symbols to hlfir.declare
Jean Perier [Tue, 15 Nov 2022 08:49:19 +0000 (09:49 +0100)]
[flang] Lower symbols to hlfir.declare

Update lowering to generate hlfir.declare instead of fir.declare.
Introduce the hlfir::Entity class that will be used to work with
Fortran objects in HLFIR transformation.

Fix lower bounds that were swapped with extents in fir.declare
generation.

Update tests that expected fir.declare.

Differential Revision: https://reviews.llvm.org/D137951

21 months ago[GlobalISel] Remove semantic operand of G_IS_FPCLASS
Serge Pavlov [Tue, 15 Nov 2022 05:02:06 +0000 (12:02 +0700)]
[GlobalISel] Remove semantic operand of G_IS_FPCLASS

Instruction G_IS_FPCLASS had an operand that represented floating-point
semantics of its first operand. It allowed types that have the same length,
like `bfloat16` and `half`, to be distinguished. Unfortunately, it is
not sufficient, as other operations still cannot distinguish such types.
A solution to this problem must be more general, so this operand is now removed.

Differential Revision: https://reviews.llvm.org/D138004

21 months ago[LoongArch] Implement the TargetLowering::getRegisterByName hook
gonglingqin [Tue, 15 Nov 2022 08:09:26 +0000 (16:09 +0800)]
[LoongArch] Implement the TargetLowering::getRegisterByName hook

Only reserved registers can be read.

Differential Revision: https://reviews.llvm.org/D137532

21 months ago[lldb] Re-phase comments in `ScriptedProcess.get_loaded_images` method (NFC)
Med Ismail Bennani [Mon, 14 Nov 2022 08:17:22 +0000 (00:17 -0800)]
[lldb] Re-phase comments in `ScriptedProcess.get_loaded_images` method (NFC)

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

21 months agoUpdate bazel files.
Johannes Reifferscheid [Tue, 15 Nov 2022 06:48:51 +0000 (07:48 +0100)]
Update bazel files.

21 months ago[libunwind][LoongArch] Add 64-bit LoongArch support
zhanglimin [Tue, 15 Nov 2022 06:36:30 +0000 (14:36 +0800)]
[libunwind][LoongArch] Add 64-bit LoongArch support

Defines enums for the LoongArch registers.
Adds the register class implementation for LoongArch.
Adds save and restore context functionality.

This only supports the 64-bit integer and floating-point register
implementation.

Fix https://github.com/llvm/llvm-project/issues/55398

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137010

21 months ago[RISCV] Teach shouldSinkOperands that vp.add and friends are commutative.
Craig Topper [Tue, 15 Nov 2022 05:51:08 +0000 (21:51 -0800)]
[RISCV] Teach shouldSinkOperands that vp.add and friends are commutative.

We previously had a bug that our isel patterns weren't commutative,
but that has been fixed for a while.

21 months agoFix `unsafe-fp-math` attribute emission.
Michele Scandale [Fri, 11 Nov 2022 20:06:29 +0000 (12:06 -0800)]
Fix `unsafe-fp-math` attribute emission.

The conditions under which Clang emits the `unsafe-fp-math` function
attribute were modified as part of
`84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`.
In the backend code generators `"unsafe-fp-math"="true"` enables floating
point contraction for the whole function.
The intent of the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`
was to prevent backend code generators from performing contractions when
that is not expected.
However the change is inaccurate and incomplete because it allows
`unsafe-fp-math` to be set even when only in-statement contraction is
allowed.

Consider the following example
```
float foo(float a, float b, float c) {
  float tmp = a * b;
  return tmp + c;
}
```
and compile it with the command line
```
clang -fno-math-errno -funsafe-math-optimizations -ffp-contract=on \
  -O2 -mavx512f -S -o -
```
The resulting assembly has a `vfmadd213ss` instruction which corresponds
to a fused multiply-add. From the user perspective there shouldn't be
any contraction because the multiplication and the addition are not in
the same statement.

The optimized IR is:
```
define float @test(float noundef %a, float noundef %b, float noundef %c) #0 {
  %mul = fmul reassoc nsz arcp afn float %b, %a
  %add = fadd reassoc nsz arcp afn float %mul, %c
  ret float %add
}

attributes #0 = {
  [...]
  "no-signed-zeros-fp-math"="true"
  "no-trapping-math"="true"
  [...]
  "unsafe-fp-math"="true"
}
```
The `"unsafe-fp-math"="true"` function attribute allows the backend code
generator to perform `(fadd (fmul a, b), c) -> (fmadd a, b, c)`.

In the current IR representation there is no way to determine the
statement boundaries from the original source code.
Because of this, for in-statement-only contraction the generated IR
doesn't have instructions with the `contract` fast-math flag, and
`llvm.fmuladd` is used to represent contraction opportunities
that occur within a single statement.
Therefore `"unsafe-fp-math"="true"` can only be emitted when contraction
across statements is allowed.

Moreover the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` doesn't
take into account that the floating point math function attributes can
be refined during IR code generation of a function to handle the cases
where the floating point math options are modified within a compound
statement via pragmas (see `CGFPOptionsRAII`).
For consistency, `unsafe-fp-math` needs to be disabled if the contraction
mode for any scope/operation is not `fast`.
Similarly, for consistency, the initialization of `UnsafeFPMath` in
`TargetOptions` for the backend code generation should take the
contraction mode into account as well.
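
The rule described above can be sketched as a predicate. This is only an illustration: `emit_unsafe_fp_math` and its parameters are hypothetical, `unsafe_math_requested` stands for all the other conditions Clang checks, and the mode strings mirror the `-ffp-contract` values.

```python
def emit_unsafe_fp_math(unsafe_math_requested, contract_modes):
    """Decide whether "unsafe-fp-math"="true" may be attached to a function.

    contract_modes lists the contraction mode of every scope/operation in
    the function; pragmas may make it differ from the command-line value.
    """
    return unsafe_math_requested and all(m == "fast" for m in contract_modes)


in_statement_only = emit_unsafe_fp_math(True, ["on"])
cross_statement = emit_unsafe_fp_math(True, ["fast"])
pragma_refined = emit_unsafe_fp_math(True, ["fast", "on"])
```

Only the function whose scopes all use `fast` contraction keeps the attribute; a single scope with a stricter mode disables it.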

Reviewed By: zahiraam

Differential Revision: https://reviews.llvm.org/D136786

21 months ago[NFC] [C++20] [Modules] Remove unused Global Module Fragment variables/arguments
Chuanqi Xu [Tue, 15 Nov 2022 03:50:51 +0000 (11:50 +0800)]
[NFC] [C++20] [Modules] Remove unused Global Module Fragment variables/arguments

21 months ago[RISCV] Expand i32 abs to negw+max at isel.
Craig Topper [Tue, 15 Nov 2022 03:37:04 +0000 (19:37 -0800)]
[RISCV] Expand i32 abs to negw+max at isel.

This adds a RISCVISD::ABSW to remember that we started with an i32
abs. Previously we used a DAG combine of (sext_inreg (abs)) to
delay emitting a freeze from type legalization in order to make
ComputeNumSignBits optimizations work on other promoted nodes.

This new approach always uses negw+max even if the result doesn't
need to be sign extended. This helps the RISCVSExtWRemoval pass
if the sext.w is in another basic block.
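
The negw+max expansion can be checked with a small arithmetic model. The sketch assumes the input is already a sign-extended 32-bit value held in a 64-bit register; `sext32`, `negw` and `abs_i32` are hypothetical helpers that only mimic the instruction semantics.

```python
def sext32(value):
    """Keep the low 32 bits and sign-extend them, as the *w instructions do."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def negw(x):
    """32-bit negate with a sign-extended result."""
    return sext32(-x)


def abs_i32(x):
    """abs expanded as max(x, negw(x)) on sign-extended inputs."""
    return max(x, negw(x))
```

Because `negw` already yields a sign-extended value, the result of `max` is sign-extended as well, and the most negative input wraps to itself.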

21 months ago[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty
Jonas Devlieghere [Fri, 11 Nov 2022 19:50:45 +0000 (11:50 -0800)]
[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty

Fix an assertion in dsymutil coming from the Reproducer/FileCollector.
When TMPDIR is empty, the root becomes a relative path, triggering an
assertion when adding a relative path to the VFS mapping. This patch
fixes the issue by resolving the relative path and also moves the
assertion up to make it easier to diagnose these issues in the future.

rdar://102170986

Differential revision: https://reviews.llvm.org/D137959

21 months ago[PowerPC] make expensive mflr be away from its user in the function prologue
Chen Zheng [Tue, 8 Nov 2022 06:39:09 +0000 (01:39 -0500)]
[PowerPC] make expensive mflr be away from its user in the function prologue

mflr is fairly expensive on Power versions earlier than 10, so we should
schedule the store for the mflr's def away from the mflr.

In epilogue, the expensive mtlr has no user for its def, so it doesn't
matter that the load and the mtlr are back-to-back.

Reviewed By: RolandF

Differential Revision: https://reviews.llvm.org/D137423

21 months ago[mlir][AttrTypeReplacer] Make attribute dictionary replacement optional
River Riddle [Tue, 15 Nov 2022 00:55:33 +0000 (16:55 -0800)]
[mlir][AttrTypeReplacer] Make attribute dictionary replacement optional

This provides an optimization opportunity for clients that don't want/need
to recurse attribute dictionaries.

21 months ago[LoongArch] Handle register spill in BranchRelaxation pass
Xiaodong Liu [Tue, 15 Nov 2022 01:55:03 +0000 (09:55 +0800)]
[LoongArch] Handle register spill in BranchRelaxation pass

When the target of an unconditional branch is out of range, an indirect
branch is used instead. If no register can be scavenged for the
indirect branch, a register must be spilled to the stack.

Reviewed By: SixWeining, wangleiat

Differential Revision: https://reviews.llvm.org/D137821

21 months ago[mlir][arith][spirv] Clean up arith-to-spirv. NFC.
Jakub Kuderski [Tue, 15 Nov 2022 01:54:14 +0000 (20:54 -0500)]
[mlir][arith][spirv] Clean up arith-to-spirv. NFC.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D137978

21 months ago[mlir][arith] Add `arith.shrsi` support to WIE
Jakub Kuderski [Tue, 15 Nov 2022 01:51:58 +0000 (20:51 -0500)]
[mlir][arith] Add `arith.shrsi` support to WIE

This includes LIT tests over the generated ops and runtime tests.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D137965

21 months ago[mlir][tosa] Create a profile validation pass for TOSA dialect
TatWai Chong [Tue, 15 Nov 2022 01:29:45 +0000 (17:29 -0800)]
[mlir][tosa] Create a profile validation pass for TOSA dialect

Add a separate validation pass to check whether TOSA operations conform to
the specification under given requirements. Profile type checking is the
initial feature of the pass.

This is an optional pass that can be enabled on the command line, e.g.
$ mlir-opt --tosa-validate="profile=bi" for validating against the
base inference profile.

Description:
TOSA defines a variety of operator behaviors and requirements in its
specification. A separate validation pass helps keep TOSA operation
inputs consistent with the TOSA specification for given criteria, and
also reduces the burden of dialect validation during compilation.

TOSA supports three profiles of which two are for inference purposes.
The main inference profile supports both integer and floating-point
data types, but the base inference profile only supports integers.
In this initial PR, operations are validated against a given TOSA
profile, so validation fails if a floating-point tensor is present
when the base inference profile is selected. Further checks, e.g.
validation of control flow operators and custom operators, will be
added to the pass later if needed.
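
As a rough illustration of the profile check described above, the sketch below
models only the rule stated in this message (the base inference profile accepts
integers only, the main inference profile accepts integers and floating point).
The names `Profile`, `ElemKind`, and `passesProfileCheck` are hypothetical and
are not the pass's API.

```cpp
#include <vector>

// Hypothetical model of the two inference profiles named in the message.
enum class Profile { BaseInference, MainInference };
enum class ElemKind { Integer, Float };

// An element kind is allowed if the profile is main inference,
// or if the element kind is an integer.
bool isAllowed(Profile profile, ElemKind kind) {
  return profile == Profile::MainInference || kind == ElemKind::Integer;
}

// Validation fails as soon as one tensor element kind is not allowed.
bool passesProfileCheck(Profile profile, const std::vector<ElemKind> &kinds) {
  for (ElemKind kind : kinds)
    if (!isAllowed(profile, kind))
      return false;
  return true;
}
```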

The pass is expected to be runnable at any point of the TOSA dialect
conversion/transformation pipeline, and does not depend on any particular
pass having run before it. It can therefore be used to validate the
initial TOSA operations just converted from other dialects, the
intermediate form, or the final TOSA operations output.

Change-Id: Ib58349c873c783056e89d2ab3b3312b8d2c61863

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D137279

21 months ago[NFC][Clang] Autogenerate checklines in a test being affected by a patch
Roman Lebedev [Tue, 15 Nov 2022 00:29:58 +0000 (03:29 +0300)]
[NFC][Clang] Autogenerate checklines in a test being affected by a patch

21 months ago[mlir][sparse] avoid nop rewriting on runtime lib path in pipeline
Aart Bik [Mon, 14 Nov 2022 21:51:23 +0000 (13:51 -0800)]
[mlir][sparse] avoid nop rewriting on runtime lib path in pipeline

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D137981

21 months ago[NVPTX] Emit pragma nounroll for llvm.loop.unroll.count=1
Dmitry Vassiliev [Tue, 15 Nov 2022 00:30:00 +0000 (04:30 +0400)]
[NVPTX] Emit pragma nounroll for llvm.loop.unroll.count=1

Emit pragma nounroll for llvm.loop.unroll.count=1 (#pragma unroll 1).

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D137991

21 months ago[mlir][sparse] fix memory leak sparse2sparse reshape
Peiming Liu [Tue, 15 Nov 2022 00:02:43 +0000 (00:02 +0000)]
[mlir][sparse] fix memory leak sparse2sparse reshape

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137994

21 months agoRevert "[mlir][sparse] Macros to clean up StridedMemRefType in the SparseTensorRuntim...
Stella Stamenova [Tue, 15 Nov 2022 00:18:04 +0000 (16:18 -0800)]
Revert "[mlir][sparse] Macros to clean up StridedMemRefType in the SparseTensorRuntime" and "[mlir][sparse] move SparseTensorReader functions into the _mlir_ciface_ section"

This reverts commits 6c22dad and 92bc3fb.

These broke the windows mlir buildbot.

21 months agoGlobalISel: Add debug print for applied rule in generated combiner
Matt Arsenault [Mon, 14 Nov 2022 20:42:08 +0000 (12:42 -0800)]
GlobalISel: Add debug print for applied rule in generated combiner

21 months agoRevert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"
Fangrui Song [Mon, 14 Nov 2022 23:51:03 +0000 (15:51 -0800)]
Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"

This reverts commit bf8381a8bce28fc69857645cc7e84a72317e693e.

There is a layering violation: LLVMAnalysis depends on LLVMCore, so
LLVMCore should not include LLVMAnalysis header
llvm/Analysis/ModuleSummaryAnalysis.h

21 months ago[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm
Alexander Shaposhnikov [Mon, 14 Nov 2022 23:15:19 +0000 (23:15 +0000)]
[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm

Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).
This is a recommit of ef9e62469.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D137768

21 months ago[mlir][sparse] fix memory leak in test cases
Peiming Liu [Mon, 14 Nov 2022 22:28:12 +0000 (22:28 +0000)]
[mlir][sparse] fix memory leak in test cases

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137985

21 months ago[mlir][sparse] Fix warning on GCC
wren romano [Mon, 14 Nov 2022 22:43:03 +0000 (14:43 -0800)]
[mlir][sparse] Fix warning on GCC

Reviewed By: aartbik, Peiming

Differential Revision: https://reviews.llvm.org/D137987

21 months ago[RISCV] Add codegen coverage for select idioms which might benefit from XVentanaCondOps
Philip Reames [Mon, 14 Nov 2022 22:21:51 +0000 (14:21 -0800)]
[RISCV] Add codegen coverage for select idioms which might benefit from XVentanaCondOps

21 months ago[libc] Forward LLVM_LIBC options when using a runtimes build
Joseph Huber [Mon, 14 Nov 2022 20:58:19 +0000 (14:58 -0600)]
[libc] Forward LLVM_LIBC options when using a runtimes build

The `LLVM_ENABLE_RUNTIMES` mode is commonly used to build runtimes that
depend on an up-to-date version of clang. Currently, `libc` uses some
internal variables that are not forwarded when building in this mode.
This patch forwards the relevant arguments beginning with `LLVM_LIBC` to
the build when built this way.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D137977

21 months ago[mlir][sparse] Make three tests run with the codegen path.
bixia1 [Mon, 14 Nov 2022 18:05:19 +0000 (10:05 -0800)]
[mlir][sparse] Make three tests run with the codegen path.

Reviewed By: aartbik, Peiming

Differential Revision: https://reviews.llvm.org/D137964

21 months ago[mlir][sparse] move SparseTensorReader functions into the _mlir_ciface_ section
wren romano [Wed, 9 Nov 2022 21:38:11 +0000 (13:38 -0800)]
[mlir][sparse] move SparseTensorReader functions into the _mlir_ciface_ section

Depends On D137735

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137737

21 months ago[mlir][sparse] Macros to clean up StridedMemRefType in the SparseTensorRuntime
wren romano [Wed, 9 Nov 2022 21:35:45 +0000 (13:35 -0800)]
[mlir][sparse] Macros to clean up StridedMemRefType in the SparseTensorRuntime

In particular, this silences warnings from [-Wsign-compare].

Depends On D137681

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137735

21 months ago[mlir][sparse] Making way for SparseTensorRuntime to support non-permutations
wren romano [Wed, 9 Nov 2022 21:33:01 +0000 (13:33 -0800)]
[mlir][sparse] Making way for SparseTensorRuntime to support non-permutations

Systematically updates the SparseTensorRuntime to properly distinguish tensor-dimensions from storage-levels (and their associated ranks, shapes, sizes, indices, etc).  With a few exceptions which are noted in the code, this ensures the runtime has all the **semantic** changes necessary to support non-permutations.

(Whereas **operationally**, since we're still using `std::vector<uint64_t>` to represent the mappings, there's no way to pass in any interesting non-permutations.  Changing the representation to `std::function` will be done in a separate differential.)

Depends On D137680

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137681

21 months ago[RISCV] Add PseudoCCMOVGPR to RISCVSExtWRemoval.
Craig Topper [Mon, 14 Nov 2022 21:34:24 +0000 (13:34 -0800)]
[RISCV] Add PseudoCCMOVGPR to RISCVSExtWRemoval.

This instruction is a conditional move. It propagates sign bits
from its inputs.

21 months agoRevert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"
Alexander Shaposhnikov [Mon, 14 Nov 2022 21:31:30 +0000 (21:31 +0000)]
Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"

This reverts commit ef9e624694c0f125c53f7d0d3472fd486bada57d
for further investigation offline.
It appears to break the buildbot
llvm-clang-x86_64-sie-ubuntu-fast.

21 months ago[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm
Alexander Shaposhnikov [Mon, 14 Nov 2022 21:10:24 +0000 (21:10 +0000)]
[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm

Enable using -module-summary with -S
(similarly to what currently can be achieved with opt <input> -o - | llvm-dis).

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D137768

21 months agoRevert "[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty"
Jonas Devlieghere [Mon, 14 Nov 2022 21:03:29 +0000 (13:03 -0800)]
Revert "[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty"

This reverts commit 68efb4772c0d0e60cbfb09ea619b58d80c31ff0f because the
test fails on some of the buildbots.

21 months ago[DirectX backend] Fix build and test error caused by out of sync with upstream change.
Xiang Li [Fri, 11 Nov 2022 08:00:11 +0000 (00:00 -0800)]
[DirectX backend] Fix build and test error caused by out of sync with upstream change.

Fix build and test error caused by
https://github.com/llvm/llvm-project/commit/a2620e00ffa232a406de3a1d8634beeda86956fd#
and
https://github.com/llvm/llvm-project/commit/304f1d59ca41872c094def3aee0a8689df6aa398

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D137815

21 months ago[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty
Jonas Devlieghere [Fri, 11 Nov 2022 19:50:45 +0000 (11:50 -0800)]
[dsymutil] Fix assertion in the Reproducer/FileCollector when TMPDIR is empty

Fix an assertion in dsymutil coming from the Reproducer/FileCollector.
When TMPDIR is empty, the root becomes a relative path, triggering an
assertion when adding a relative path to the VFS mapping. This patch
fixes the issue by resolving the relative path and also moves the
assertion up to make it easier to diagnose these issues in the future.

rdar://102170986

Differential revision: https://reviews.llvm.org/D137959

21 months ago[cmake] Fix _GNU_SOURCE being added unconditionally
Andreas Hollandt [Mon, 14 Nov 2022 20:27:12 +0000 (12:27 -0800)]
[cmake] Fix _GNU_SOURCE being added unconditionally

Reviewed By: tstellar

Differential Revision: https://reviews.llvm.org/D137917

21 months ago[COFF, Mach-O] Include -mllvm options in thinlto cache key
Nico Weber [Mon, 14 Nov 2022 19:23:56 +0000 (14:23 -0500)]
[COFF, Mach-O] Include -mllvm options in thinlto cache key

Like D134013, but for COFF and Mach-O.

Also expand the ELF test a bit. I at first didn't realize that `getValue()` for
`-mllvm -foo=bar` would return `-foo=bar` instead of just `bar`, and so
I wrote the test to check if we indeed get this wrong. We don't, but
having the test for it seems nice, so I'm including it.

Differential Revision: https://reviews.llvm.org/D137971

21 months ago[mlir][arith][spirv] Handle i1 sign extension in arith-to-spirv
Jakub Kuderski [Mon, 14 Nov 2022 20:07:18 +0000 (15:07 -0500)]
[mlir][arith][spirv] Handle i1 sign extension in arith-to-spirv

Also fix some surrounding nits.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D137974

21 months agoApply clang-tidy fixes for readability-identifier-naming in AlgebraicSimplification...
Mehdi Amini [Mon, 14 Nov 2022 06:24:54 +0000 (06:24 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in AlgebraicSimplification.cpp (NFC)

21 months agoApply clang-tidy fixes for readability-simplify-boolean-expr in GPUDialect.cpp (NFC)
Mehdi Amini [Mon, 14 Nov 2022 06:10:39 +0000 (06:10 +0000)]
Apply clang-tidy fixes for readability-simplify-boolean-expr in GPUDialect.cpp (NFC)

21 months ago[mlir][tosa] Remove zero-fill of tosa.concat outputs when lowering to linalg.
Rob Suderman [Mon, 14 Nov 2022 19:38:16 +0000 (11:38 -0800)]
[mlir][tosa] Remove zero-fill of tosa.concat outputs when lowering to linalg.

Since all output elements are known to be overwritten by construction, the fill is not required. This change makes the tosa lowering consistent with the MHLO and Torch lowerings of concat, which do not do the fill.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D137967

21 months ago[MachineCSE] Allow CSE for instructions with ignorable operands
Guozhi Wei [Mon, 14 Nov 2022 19:34:59 +0000 (19:34 +0000)]
[MachineCSE] Allow CSE for instructions with ignorable operands

Ignorable operands don't affect an instruction's behavior, so we can safely do
CSE on the instruction.

This is split from D130919. It has a big impact on some AMDGPU test cases.
For example, in atomic_optimizations_raw_buffer.ll, when checking whether the
following instruction can be CSEed

  %37:vgpr_32 = V_MOV_B32_e32 0, implicit $exec

the function isCallerPreservedOrConstPhysReg is called on the operand
"implicit $exec". This function is implemented as (change shown as a diff)

  -  return TRI.isCallerPreservedPhysReg(Reg, MF) ||
  +  return TRI.isCallerPreservedPhysReg(Reg, MF) || TII.isIgnorableUse(MO) ||
            (MRI.reservedRegsFrozen() && MRI.isConstantPhysReg(Reg));

Both TRI.isCallerPreservedPhysReg and MRI.isConstantPhysReg return false for this
operand, so without this patch isCallerPreservedOrConstPhysReg is also false,
which causes LLVM to fail to CSE this instruction.

With this patch, TII.isIgnorableUse returns true for the operand $exec, so
isCallerPreservedOrConstPhysReg also returns true, which allows this instruction
to be CSEed with the previous instruction

  %14:vgpr_32 = V_MOV_B32_e32 0, implicit $exec

So the result differs from this point on. AMDGPU's implementation of
isIgnorableUse is

  bool SIInstrInfo::isIgnorableUse(const MachineOperand &MO) const {
    // Any implicit use of exec by VALU is not a real register read.
    return MO.getReg() == AMDGPU::EXEC && MO.isImplicit() &&
           isVALU(*MO.getParent()) && !resultDependsOnExec(*MO.getParent());
  }

Since the operand $exec is not a real register read, my understanding is that
it is reasonable to do CSE on such instructions.

Because more instructions are CSEed, fewer instructions are generated for
these tests.
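
The effect of the extra disjunct can be modelled in isolation. The sketch below
is a simplified stand-in, not LLVM code: `PhysRegUse` is a hypothetical struct
whose three flags stand for the results of TRI.isCallerPreservedPhysReg,
TII.isIgnorableUse, and MRI.isConstantPhysReg for one operand.

```cpp
// Hypothetical stand-in for the three facts the real predicate queries.
struct PhysRegUse {
  bool callerPreserved; // models TRI.isCallerPreservedPhysReg(Reg, MF)
  bool ignorableUse;    // models TII.isIgnorableUse(MO)
  bool constantPhysReg; // models MRI.isConstantPhysReg(Reg)
};

// Mirrors the shape of the patched return expression quoted above.
bool isCallerPreservedOrConstPhysReg(const PhysRegUse &use,
                                     bool reservedRegsFrozen) {
  return use.callerPreserved || use.ignorableUse ||
         (reservedRegsFrozen && use.constantPhysReg);
}
```

With `ignorableUse` set and the other two flags clear, as described for
"implicit $exec", the predicate now returns true, so the operand no longer
blocks CSE.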

Differential Revision: https://reviews.llvm.org/D137222

21 months agoclang/AMDGPU: Use Support's wrapper around getenv
Matt Arsenault [Fri, 28 Oct 2022 23:10:41 +0000 (16:10 -0700)]
clang/AMDGPU: Use Support's wrapper around getenv

This does some extra stuff for Windows, so might as well
use it just in case.

21 months ago[mlir][MemRef] Change the anchor point of a reshapeLikeOp pattern
Quentin Colombet [Thu, 20 Oct 2022 22:18:58 +0000 (22:18 +0000)]
[mlir][MemRef] Change the anchor point of a reshapeLikeOp pattern

Essentially, this patch changes the anchor point of the
`extract_strided_metadata(reshapeLikeOp)` pattern from
`extract_strided_metadata` to `reshapeLikeOp`.

In detail, this means that instead of replacing:
```
base, offset, sizes, strides =
  extract_strided_metadata(reshapeLikeOp(src))
```
With
```
base, offset = extract_strided_metadata(src)
sizes = <some math>
strides = <some math>
```

We replace only the reshapeLikeOp part and connect it back with a
reinterpret_cast:
```
val = reshapeLikeOp(src)
```
=>
```
base, offset, ... = extract_strided_metadata(src)
sizes = <some math>
strides = <some math>
val = reinterpret_cast base, offset, sizes, strides
```

Differential Revision: https://reviews.llvm.org/D136386

21 months ago[mlir][MemRef] Change the anchor point of a subview pattern
Quentin Colombet [Wed, 12 Oct 2022 21:23:27 +0000 (21:23 +0000)]
[mlir][MemRef] Change the anchor point of a subview pattern

Essentially, this patch changes the anchor point of the
`extract_strided_metadata(subview)` pattern from
`extract_strided_metadata` to `subview`.

In detail, this means that instead of replacing:
```
base, offset, sizes, strides = extract_strided_metadata(subview(src))
```
With
```
base, ... = extract_strided_metadata(src)
offset = <some math>
sizes = subSizes
strides = <some math>
```

We replace only the subview part and connect it back with a
reinterpret_cast:
```
val = subview(src)
```
=>
```
base, ... = extract_strided_metadata(src)
offset = <some math>
sizes = subSizes
strides = <some math>
val = reinterpret_cast base, offset, sizes, strides
```

Differential Revision: https://reviews.llvm.org/D135839

21 months ago[mlir][MemRef] Simplify extract_strided_metadata(reinterpret_cast)
Quentin Colombet [Wed, 12 Oct 2022 21:18:53 +0000 (21:18 +0000)]
[mlir][MemRef] Simplify extract_strided_metadata(reinterpret_cast)

This patch adds a pattern to simplify
```
base, offset, sizes, strides =
  extract_strided_metadata(
    reinterpret_cast(src, srcOffset, srcSizes, srcStrides))
```

Into
```
base, baseOffset, ... = extract_strided_metadata(src)
offset = srcOffset
sizes = srcSizes
strides = srcStrides
```

Note: reinterpret_cast operations with unranked sources are not simplified,
since they cannot feed extract_strided_metadata operations.

Differential Revision: https://reviews.llvm.org/D135837

21 months ago[lto] Update function name in comment after 5f312ad45
Nico Weber [Mon, 14 Nov 2022 18:30:55 +0000 (13:30 -0500)]
[lto] Update function name in comment after 5f312ad45

21 months ago[RISCV] Add scalar FP compares to isSignExtendingOpW in RISCVSExtWRemoval.
Craig Topper [Mon, 14 Nov 2022 18:28:29 +0000 (10:28 -0800)]
[RISCV] Add scalar FP compares to isSignExtendingOpW in RISCVSExtWRemoval.

21 months ago[AMDGPU][MC][NFC] Rename VOP3 VOPC test files
Joe Nash [Mon, 14 Nov 2022 15:15:27 +0000 (10:15 -0500)]
[AMDGPU][MC][NFC] Rename VOP3 VOPC test files

D136149 and D136148 renamed the MC test files for VOP3 promoted from VOP1 and
VOP2 in a consistent way. Do the same for VOP3 coming from VOPC.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D137950

21 months ago[RISCV] Move FixableDef handling out of isSignExtendingOpW.
Craig Topper [Mon, 14 Nov 2022 17:59:03 +0000 (09:59 -0800)]
[RISCV] Move FixableDef handling out of isSignExtendingOpW.

We have two layers of opcode checks. The first is in
isSignExtendingOpW. If that returns false, a second switch is used
for looking through nodes by adding them to the worklist.

Move the FixableDef handling to the second switch. This simplifies
the interface of isSignExtendingOpW and makes that function more
accurate to its name.

21 months ago[GlobalIsel][AMDGPU] Changing legalize rule for G_{UADDO|UADDE|USUBO|USUBE|SADDE...
Yashwant Singh [Mon, 14 Nov 2022 17:57:08 +0000 (23:27 +0530)]
[GlobalIsel][AMDGPU] Changing legalize rule for G_{UADDO|UADDE|USUBO|USUBE|SADDE|SSUBE}

Generic add and sub with carry are now legalized in a way that explicitly calculates the carry/borrow output, i.e.
%6:_(s64), %7:_(s1) = G_UADDO %0, %1
becomes,
%13:_(s32), %14:_(s1) = G_UADDO %2, %4
%15:_(s32), %16:_(s1) = G_UADDE %3, %5, %14
%6:_(s64) = G_MERGE_VALUES %13(s32), %15(s32)
%7:_(s1) = G_ICMP intpred(ult), %6(s64), %1

Here the G_MERGE and G_ICMP instructions are redundant for recalculating the carry output. (The sub-with-borrow case is similar.)
This change fixes this.
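
To see why the extra compare is redundant, the two-halves scheme can be
modelled with plain integers. This is a sketch, not legalizer code: `uaddo32`,
`uadde32`, and `uaddo64ViaHalves` are hypothetical helpers that mimic G_UADDO
and G_UADDE on 32-bit halves. The carry-out of the high-half add is already the
carry of the full 64-bit add.

```cpp
#include <cstdint>

struct Add32 { uint32_t value; bool carry; };
struct Add64 { uint64_t value; bool carry; };

// Models G_UADDO on 32 bits: the sum wrapped when it is smaller than an input.
Add32 uaddo32(uint32_t a, uint32_t b) {
  uint32_t sum = a + b;
  return {sum, sum < a};
}

// Models G_UADDE on 32 bits: add with carry-in, computed in 64 bits.
Add32 uadde32(uint32_t a, uint32_t b, bool carryIn) {
  uint64_t wide = uint64_t{a} + uint64_t{b} + (carryIn ? 1u : 0u);
  return {static_cast<uint32_t>(wide), (wide >> 32) != 0};
}

// 64-bit add with overflow built from two 32-bit halves.
Add64 uaddo64ViaHalves(uint64_t a, uint64_t b) {
  Add32 lo = uaddo32(static_cast<uint32_t>(a), static_cast<uint32_t>(b));
  Add32 hi = uadde32(static_cast<uint32_t>(a >> 32),
                     static_cast<uint32_t>(b >> 32), lo.carry);
  // hi.carry is the 64-bit carry; no compare on the merged value is needed.
  return {(uint64_t{hi.value} << 32) | lo.value, hi.carry};
}
```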

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D137932

21 months ago[mlir][linalg] Add reduction tiling using scf.foreachthread
Thomas Raoux [Sun, 13 Nov 2022 18:52:03 +0000 (18:52 +0000)]
[mlir][linalg] Add reduction tiling using scf.foreachthread

This adds a transformation to tile reduction operations to partial
reduction using scf.foreachthread. This uses
PartialReductionOpInterface to create a merge operation of the partial
tiles.
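
The general shape of a tiled partial reduction can be sketched outside MLIR.
The example below is a plain C++ analogy under stated assumptions: `tiledSum`
is a hypothetical function, the reduction is an integer sum, and the outer loop
stands in for the independent iterations that scf.foreachthread would run. It
does not model PartialReductionOpInterface itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Reduces `input` in two steps: per-tile partial sums, then a merge.
int64_t tiledSum(const std::vector<int64_t> &input, size_t numTiles) {
  assert(numTiles > 0 && "need at least one tile");
  std::vector<int64_t> partial(numTiles, 0); // neutral element of the sum
  // Each outer iteration touches only partial[t], so iterations are independent.
  for (size_t t = 0; t < numTiles; ++t)
    for (size_t i = t; i < input.size(); i += numTiles)
      partial[t] += input[i];
  // Merge step: combine the partial results into the final value.
  return std::accumulate(partial.begin(), partial.end(), int64_t{0});
}
```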

Differential Revision: https://reviews.llvm.org/D137912

21 months ago[bazel] Add another missing dependency after D137833
Benjamin Kramer [Mon, 14 Nov 2022 18:02:56 +0000 (19:02 +0100)]
[bazel] Add another missing dependency after D137833

While there, run buildifier.

21 months ago[mlir][MemRef] Make reinterpret_cast(extract_strided_metadata) more robust
Quentin Colombet [Wed, 12 Oct 2022 00:29:39 +0000 (00:29 +0000)]
[mlir][MemRef] Make reinterpret_cast(extract_strided_metadata) more robust

Prior to this patch the canonicalization pattern that turns
`reinterpret_cast(extract_strided_metadata)` into cast was only applied
when all the input operands of the `reinterpret_cast` are exactly all the
output results of the `extract_strided_metadata`.

This missed simplification opportunities when the values would have held
the same constant values, yet came from different actual values.

E.g., prior to this patch, a pattern of the form:
```
%base, %offset = extract_strided_metadata %source : memref<i16>
reinterpret_cast %base to offset:[0]
```
Wouldn't have been simplified into a simple cast, because %offset is not
directly the same value object as 0.

This patch teaches this pattern how to check if the constant values
match what the results of the `extract_strided_metadata` operation would
have held.
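
The relaxed matching rule can be illustrated with a small model. This is a
sketch under assumptions, not the canonicalization pattern's code: `Operand` is
a hypothetical struct holding an SSA value identity plus an optional known
constant, and `matchesMetadataResult` is a hypothetical name.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical operand model: an SSA value identity and, if known, its constant.
struct Operand {
  int valueId;
  std::optional<int64_t> constant;
};

// Two operands match if they are the same value, or if both are known
// constants with equal values even though they are different values.
bool matchesMetadataResult(const Operand &castOperand,
                           const Operand &metadataResult) {
  if (castOperand.valueId == metadataResult.valueId)
    return true;
  return castOperand.constant.has_value() &&
         metadataResult.constant.has_value() &&
         *castOperand.constant == *metadataResult.constant;
}
```

In the example from this message, the literal offset 0 and the %offset result
are distinct values, but both are known to be 0, so they match under the second
rule.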

Differential Revision: https://reviews.llvm.org/D135736