platform/upstream/llvm.git
18 months ago[mlir][TilingInterface] Add an option to tile and fuse to yield replacement for the...
Mahesh Ravishankar [Thu, 5 Jan 2023 00:57:50 +0000 (00:57 +0000)]
[mlir][TilingInterface] Add an option to tile and fuse to yield replacement for the fused producer.

This patch adds an option to the method that fuses a producer with a
tiled consumer, to also yield from the tiled loops a value that can be
used to replace the original producer. This is only valid if it can be
ascertained that the slice of the producer computed within each
iteration of the tiled loop nest does not compute slices of the
producer redundantly. The analysis to derive this is very involved, so
this is left to the caller to ascertain. A test is added that mimics
the `scf::tileConsumerAndFuseProducersGreedilyUsingSCFForOp`, but also
yields the values of all fused producers. This can be used as a
reference for how a caller could use this functionality.

Differential Revision: https://reviews.llvm.org/D141028

18 months ago[Thumb2][MVE] Recognise shuffle truncation patterns suitable for ARMISD::MVETRUNC
Simon Pilgrim [Mon, 16 Jan 2023 17:59:38 +0000 (17:59 +0000)]
[Thumb2][MVE] Recognise shuffle truncation patterns suitable for ARMISD::MVETRUNC

I'm helping with the remaining regressions on D127115, and one of my candidate fixes caused some regressions with MVE interleaved shuffles due to poor handling of 'truncation' style shuffle masks (0,2,4,6,...).

This patch attempts to use the ARMISD::MVETRUNC node to handle these cases, based off existing code in LowerTruncate.

It handles both (0,2,4,6,...) and (1,3,5,7,....) 'top' style patterns (assuming no endian problems). I shift down the 'top' patterns - a basic search of ARM docs suggests MVE has some top/bottom truncation/narrowing instructions but I don't seem to be able to get them to be used.

Differential Revision: https://reviews.llvm.org/D141791
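
As a rough illustration of why the (0,2,4,6,...) and (1,3,5,7,...) masks behave like truncations, the sketch below models a little-endian vector of 32-bit lanes reinterpreted as 16-bit lanes. The helper names are hypothetical and this is not the ARM backend code; it only checks the arithmetic identity, including the "shift down" used for the 'top' pattern.

```python
import struct

def as_narrow_lanes(wide):
    """Reinterpret 32-bit lanes as little-endian 16-bit lanes."""
    raw = struct.pack("<%dI" % len(wide), *wide)
    return list(struct.unpack("<%dH" % (2 * len(wide)), raw))

def shuffle(lanes, mask):
    """Pick lanes by index, like a single-input shufflevector."""
    return [lanes[i] for i in mask]

wide = [0x11112222, 0x33334444, 0x55556666, 0x77778888]
narrow = as_narrow_lanes(wide)

# 'Bottom' mask (0,2,4,6): same as truncating each 32-bit lane to 16 bits.
bottom = shuffle(narrow, [0, 2, 4, 6])

# 'Top' mask (1,3,5,7): same as shifting each lane down by 16, then truncating.
top = shuffle(narrow, [1, 3, 5, 7])
```

On a big-endian layout the lane order inside each 32-bit element is swapped, which is why the message assumes no endian problems.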

18 months ago[InstCombine] canonicalize a signum (spaceship) that ends in add
Sanjay Patel [Mon, 16 Jan 2023 17:34:26 +0000 (12:34 -0500)]
[InstCombine] canonicalize a signum (spaceship) that ends in add

(A s>> (BW - 1)) + (zext (A s> 0)) --> (A s>> (BW - 1)) | (zext (A != 0))

https://alive2.llvm.org/ce/z/V-nM8N

This is not the form that we currently match as m_Signum(),
but I'm not sure if one is better than the other, so there's
a follow-up patch needed either way.

For this patch, it should be better for analysis to use a
not-null test and bitwise logic rather than >0 with add.
Codegen doesn't seem significantly different on any targets
that I looked at.

Also note that none of these variants is shown in issue #60012 -
those generally include at least one 'select', so that's likely
where these patterns will end up.
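
The equivalence can also be checked by brute force. The sketch below is not InstCombine code; it assumes an 8-bit width for brevity and models the signed values with ordinary Python integers, whose `>>` is already an arithmetic shift.

```python
BW = 8  # bit width for the exhaustive check (an assumption, chosen for brevity)

def before(a):
    # (A s>> (BW - 1)) + (zext (A s> 0))
    return (a >> (BW - 1)) + int(a > 0)

def after(a):
    # (A s>> (BW - 1)) | (zext (A != 0))
    return (a >> (BW - 1)) | int(a != 0)

signed_range = range(-(1 << (BW - 1)), 1 << (BW - 1))
mismatches = [a for a in signed_range if before(a) != after(a)]
```

Both forms yield -1, 0, or 1 for negative, zero, and positive inputs, so `mismatches` stays empty.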

18 months ago[InstCombine] add tests for signum (spaceship) variant; NFC
Sanjay Patel [Mon, 16 Jan 2023 16:46:12 +0000 (11:46 -0500)]
[InstCombine] add tests for signum (spaceship) variant; NFC

18 months agoFix format for `case` in .proto files
Matt Kulukundis [Mon, 16 Jan 2023 17:24:10 +0000 (17:24 +0000)]
Fix format for `case` in .proto files

Reviewed By: krasimir, echristo

Differential Revision: https://reviews.llvm.org/D141547

18 months ago[lld][COFF] Provide unwinding information for Chunk injected by /delayloaded
serge-sans-paille [Fri, 13 Jan 2023 13:25:54 +0000 (14:25 +0100)]
[lld][COFF] Provide unwinding information for Chunk injected by /delayloaded

For each symbol in a /delayloaded library, lld injects a small piece of
code to handle the symbol lazy loading. This code doesn't have unwind
information, which may be troublesome.

Provide this information for AMD64.

Thanks to Yannis Juglaret <yjuglaret@mozilla.com> for contributing the
unwinding info and for his support while crafting this patch.

Fix #59639

Differential Revision: https://reviews.llvm.org/D141691

18 months ago[libcxx] Add missing includes
Michael Buch [Mon, 16 Jan 2023 10:24:22 +0000 (10:24 +0000)]
[libcxx] Add missing includes

This fixes the remaining errors when building the llvm-project
with `LLVM_ENABLE_MODULES=ON` (and `LLVM_ENABLE_LOCAL_SUBMODULE_VISIBILITY=ON`,
which currently is the LLVM default).

Previously this would fail in the `CXX_SUPPORTS_MODULES` check.

Differential Revision: https://reviews.llvm.org/D141833

18 months ago[AArch64] Move default extensions from clang Driver to TargetParser
David Green [Mon, 16 Jan 2023 16:58:18 +0000 (16:58 +0000)]
[AArch64] Move default extensions from clang Driver to TargetParser

The default extensions would be better added in the TargetParser, not by
the driver. This removes the addition of +i8mm and +bf16 features in the
driver as they are already added in 8.6/9.1 architectures. AEK_MOPS and
AEK_HBC have been added to 8.8/9.3 architectures to replace the need for
+hbc and +mops features.

Differential Revision: https://reviews.llvm.org/D141518

18 months ago[mlir][vector] Add scalable vectors support to OuterProductOp
Frank (Fang) Gao [Fri, 13 Jan 2023 15:47:44 +0000 (10:47 -0500)]
[mlir][vector] Add scalable vectors support to OuterProductOp

This will probably be the first in a series of patches that tries to
enable code generation for ARM SME (extension of SVE).

Since SME's core operation is the outer product instruction, I figured
that it would probably be a good idea to enable the outer product
operation to properly accept and generate scalable vectors.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D138718

18 months agoRevert "[mlir][vector] Add scalable vectors support to OuterProductOp"
Prabhdeep Singh Soni [Mon, 16 Jan 2023 16:46:33 +0000 (11:46 -0500)]
Revert "[mlir][vector] Add scalable vectors support to OuterProductOp"

This reverts commit be4c5ad54c929f2d817ab4a55707f0beda73a05f.

This patch did not include the test case.

18 months agoCheck for FunctionOpInterface when looking up a parent function in GPU lowering
Mehdi Amini [Mon, 16 Jan 2023 16:38:43 +0000 (16:38 +0000)]
Check for FunctionOpInterface when looking up a parent function in GPU lowering

This makes the lookup more robust when expanding code in functions other
than func.func, such as spv.func.

Fixes #60072

18 months ago[AMDGPU][AsmParser][NFC] Refine defining single-bit custom operands.
Ivan Kosarev [Mon, 16 Jan 2023 16:11:52 +0000 (16:11 +0000)]
[AMDGPU][AsmParser][NFC] Refine defining single-bit custom operands.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D141301

18 months agoRevert "[codegen] Store address of indirect arguments on the stack"
Felipe de Azevedo Piovezan [Mon, 16 Jan 2023 16:05:22 +0000 (13:05 -0300)]
Revert "[codegen] Store address of indirect arguments on the stack"

This reverts commit 7e4447a17db4a070f01c8f8a87505a4b2a1b0e3a.

18 months ago[Support] Fix incorrect assertion in backref compilation
Nikita Popov [Mon, 16 Jan 2023 14:19:22 +0000 (15:19 +0100)]
[Support] Fix incorrect assertion in backref compilation

These should be == rather than !=.

18 months ago[Support] Fix REDEBUG compilation
Nikita Popov [Mon, 16 Jan 2023 14:11:12 +0000 (15:11 +0100)]
[Support] Fix REDEBUG compilation

18 months ago[DebugInfo] Add CIE::getAugmentationData() and FDE::getCIEPointer()
Benjamin Maxwell [Mon, 9 Jan 2023 16:24:07 +0000 (16:24 +0000)]
[DebugInfo] Add CIE::getAugmentationData() and FDE::getCIEPointer()

Public getters are provided for other similar members of both the CIE
and FDE, and these fields are also displayed by the llvm-dwarfdump tool,
so not exposing them was likely an oversight.

These are needed for tools based on LLVM that need access to all the
parsed DWARF data.

Differential Revision: https://reviews.llvm.org/D141475

18 months ago[mlir][NFC] GreedyPatternRewriteDriver: Consistent return values
Matthias Springer [Mon, 16 Jan 2023 15:23:58 +0000 (16:23 +0100)]
[mlir][NFC] GreedyPatternRewriteDriver: Consistent return values

All `apply...` functions now return a LogicalResult indicating whether the iterative process converged or not.

Differential Revision: https://reviews.llvm.org/D141845

18 months ago[mlir][NFC] GreedyPatternRewriteDriver: Remove overridden eraseOp
Matthias Springer [Mon, 16 Jan 2023 15:18:23 +0000 (16:18 +0100)]
[mlir][NFC] GreedyPatternRewriteDriver: Remove overridden eraseOp

It is not necessary to override `eraseOp`, we can use the existing `notifyOperationRemoved`.

Differential Revision: https://reviews.llvm.org/D141844

18 months agoExplicitly move Error when returning it (NFC)
Mehdi Amini [Mon, 16 Jan 2023 15:07:46 +0000 (15:07 +0000)]
Explicitly move Error when returning it (NFC)

This is an attempt to fix a build failure:

llvm/lib/Object/ELFObjectFile.cpp:300:12: error: call to deleted constructor of 'llvm::Error'
    return E;

18 months ago[docs] Expand example on stand-alone builds.
Francesco Petrogalli [Mon, 16 Jan 2023 09:29:11 +0000 (10:29 +0100)]
[docs] Expand example on stand-alone builds.

1. Make explicit that the folder in which a subproject is built in stand-alone mode cannot be the same folder where LLVM was built.
2. Add a cut 'n paste example for building stand-alone `clang`.

Differential Revision: https://reviews.llvm.org/D141825

18 months ago[X86] Prefer fpext(splat(X)) to splat(fpext(x)).
Freddy Ye [Mon, 16 Jan 2023 14:16:02 +0000 (22:16 +0800)]
[X86] Prefer fpext(splat(X)) to splat(fpext(x)).

This patch fixes a regression from D122875. X86 has fpext instructions
supporting the rmb form, which benefits more from fpext(splat(X)) than
from splat(fpext(X)).

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D141657

18 months ago[X86] Add more test cases for folding select on vXi1
Luo, Yuanke [Mon, 16 Jan 2023 14:03:39 +0000 (22:03 +0800)]
[X86] Add more test cases for folding select on vXi1

18 months agoDeprecate MemIntrinsicBase::getDestAlignment() and MemTransferBase::getSourceAlignment()
Guillaume Chatelet [Mon, 16 Jan 2023 12:34:40 +0000 (12:34 +0000)]
Deprecate MemIntrinsicBase::getDestAlignment() and MemTransferBase::getSourceAlignment()

Differential Revision: https://reviews.llvm.org/D141840

18 months ago[codegen] Store address of indirect arguments on the stack
Felipe de Azevedo Piovezan [Fri, 6 Jan 2023 18:52:22 +0000 (15:52 -0300)]
[codegen] Store address of indirect arguments on the stack

With codegen prior to this patch, truly indirect arguments -- i.e.
those that are not `byval` -- can have their debug information lost even
at O0. Because indirect arguments are passed by pointer, and this
pointer is likely placed in a register as per the function call ABI,
debug information is lost as soon as the register gets clobbered.

This patch solves the issue by storing the address of the parameter on
the stack, using a similar strategy employed when C++ references are
passed. In other words, this patch changes codegen from:

```
define @foo(ptr %arg) {
   call void @llvm.dbg.declare(%arg, [...], metadata !DIExpression())
```

To:

```
define @foo(ptr %arg) {
   %ptr_storage = alloca ptr
   store ptr %arg, ptr %ptr_storage
   call void @llvm.dbg.declare(%ptr_storage, [...], metadata !DIExpression(DW_OP_deref))
```

Some common cases where this may happen with C or C++ function calls:
  1. "Big enough" trivial structures passed by value under the ARM ABI.
  2. Structures that are non-trivial for the purposes of call (as per
  the Itanium ABI) when passed by value.

A few tests were matching the wrong alloca (matching against the new
alloca, instead of the old one), so they were updated to either match
both allocas or include a `,` right after the alloca type, to prevent
matching against a pointer type.

Differential Revision: https://reviews.llvm.org/D141381

18 months ago[llvm-objdump][RISCV] Use new common method to parse ARCH RISCV attribute
Elena Lepilkina [Wed, 7 Dec 2022 06:47:51 +0000 (09:47 +0300)]
[llvm-objdump][RISCV] Use new common method to parse ARCH RISCV attribute

Differential Revision: https://reviews.llvm.org/D139553

18 months ago[libc++] allow redefined macro in non_trivial_copy_move_ABI test
Ed Maste [Fri, 13 Jan 2023 01:05:42 +0000 (20:05 -0500)]
[libc++] allow redefined macro in non_trivial_copy_move_ABI test

__config defines _LIBCPP_DEPRECATED_ABI_DISABLE_PAIR_TRIVIAL_COPY_CTOR
on FreeBSD, which conflicts with a command-line definition used by the
non_trivial_copy_move_ABI test.

Add -Wno-macro-redefined to ADDITIONAL_COMPILE_FLAGS in this test.

Reviewed By: philnik

Differential Revision: https://reviews.llvm.org/D141774

18 months agoThis patch allows llvm-dwarfutil to utilize accelerator tables
Alexey Lapshin [Sun, 15 Jan 2023 21:31:35 +0000 (22:31 +0100)]
This patch allows llvm-dwarfutil to utilize accelerator tables
generation code from DWARFLinker. It adds command line option:

--build-accelerator [none,DWARF]
                        Build accelerator tables(default: none)
  =none - Do not build accelerators
  =DWARF - Build accelerator tables according to the resulting DWARF version
       DWARFv4: .debug_pubnames and .debug_pubtypes
       DWARFv5: .debug_names

Differential Revision: https://reviews.llvm.org/D139638

18 months ago[NFC] Use `llvm::enumerate` in llvm/unittests/Object
Sergei Barannikov [Sun, 15 Jan 2023 11:38:34 +0000 (14:38 +0300)]
[NFC] Use `llvm::enumerate` in llvm/unittests/Object

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D141788

18 months ago[NFC] Remove dead code
Guillaume Chatelet [Mon, 16 Jan 2023 12:57:13 +0000 (12:57 +0000)]
[NFC] Remove dead code

18 months agoDeprecate Argument::getParamAlignment()
Guillaume Chatelet [Mon, 16 Jan 2023 12:43:52 +0000 (12:43 +0000)]
Deprecate Argument::getParamAlignment()

18 months ago[LoopUnroll] Don't update DT for changeToUnreachable.
Florian Hahn [Mon, 16 Jan 2023 12:25:34 +0000 (12:25 +0000)]
[LoopUnroll] Don't update DT for changeToUnreachable.

There is no need to update the DT here, because there must be a unique
latch. Hence if the latch is not exiting it must directly branch back
to the original loop header and does not dominate any nodes.

Skipping a DT update here simplifies D141487.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D141810

18 months ago[bazel] Another blank-line format fix for the utils/bazel/configure.bzl, NFC
Haojian Wu [Mon, 16 Jan 2023 12:13:20 +0000 (13:13 +0100)]
[bazel] Another blank-line format fix for the utils/bazel/configure.bzl, NFC

18 months agoRevert "[GVN] Refactor handling of pointer-select in GVN pass"
Sergey Kachkov [Mon, 16 Jan 2023 12:13:17 +0000 (15:13 +0300)]
Revert "[GVN] Refactor handling of pointer-select in GVN pass"

This reverts commit fc7cdaa373308ce3d72218b4d80101ae19850a6c.

18 months ago[AArch64] Add tests for dotreduce to check for wider vectors.
Zain Jaffal [Mon, 16 Jan 2023 12:04:39 +0000 (12:04 +0000)]
[AArch64] Add tests for dotreduce to check for wider vectors.

Currently we only reduce vector.reduce.add to sdot if the vectors are either <8 x i8> or <16 x i8>.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D141692

18 months ago[JumpThreading] Preserve profile metadata during select unfolding, take 2
Max Kazantsev [Mon, 16 Jan 2023 11:53:34 +0000 (18:53 +0700)]
[JumpThreading] Preserve profile metadata during select unfolding, take 2

Jump threading can replace a select and an unconditional branch with a
conditional branch, but it loses profile information when doing so.

This destructive transform can eventually lead to a performance
degradation due to folding of branches in
shouldFoldCondBranchesToCommonDestination as branch probabilities
are no longer known.

The first version was reverted due to an assert caused by an i32 overflow,
which is fixed in this version.

Patch by Roman Paukner!

Differential Revision: https://reviews.llvm.org/D138132
Reviewed By: mkazantsev

18 months ago[bazel] Fix the format of utils/bazel/configure.bzl, NFC
Haojian Wu [Mon, 16 Jan 2023 11:57:57 +0000 (12:57 +0100)]
[bazel] Fix the format of utils/bazel/configure.bzl, NFC

18 months ago[AArch64][SME] Add an instruction mapping for SME pseudos
Kerry McLaughlin [Mon, 16 Jan 2023 11:36:37 +0000 (11:36 +0000)]
[AArch64][SME] Add an instruction mapping for SME pseudos

Adds an instruction mapping to SMEInstrFormats which matches SME
pseudos with the real instructions they are transformed to.
A new flag is also added to AArch64Inst (SMEMatrixType), which is
used to indicate the base register required when emitting many
of the SME instructions.

This reduces the number of pseudos handled by the switch statement
in EmitInstrWithCustomInserter.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D136856

18 months ago[NFC] Fixed a typo in clang help docs
Jolanta Jensen [Tue, 10 Jan 2023 13:53:28 +0000 (13:53 +0000)]
[NFC] Fixed a typo in clang help docs

Fixed minor typo in clang help docs.

Differential Revision: https://reviews.llvm.org/D141507

18 months ago[OpenCL] Allow undefining header-only features
Sven van Haastregt [Mon, 16 Jan 2023 11:32:12 +0000 (11:32 +0000)]
[OpenCL] Allow undefining header-only features

`opencl-c-base.h` always defines 5 particular feature macros for
SPIR-V, making it impossible to disable those features.

To allow disabling any of those features, let the header recognize
`__undef_<feature>` macros.  The user can then pass the
`-D__undef_<feature>` flag on the command line to disable a specific
feature.  The __undef macro could potentially also be set from
`-cl-ext=-feature`, but for now only change the header and only
provide __undef macros for the 5 features that are always enabled in
`opencl-c-base.h`.

Differential Revision: https://reviews.llvm.org/D141297

18 months agoAdd test for an invalid requirement in requires expr.
Utkarsh Saxena [Mon, 16 Jan 2023 06:29:38 +0000 (07:29 +0100)]
Add test for an invalid requirement in requires expr.

The one introduced in D140547 was brittle. Fixing max template depth to
a small value would still test the same issue without causing actual
stack exhaustion.

Differential Revision: https://reviews.llvm.org/D141818

18 months ago[Clang] Convert test to opaque pointers (NFC)
Nikita Popov [Mon, 16 Jan 2023 11:03:29 +0000 (12:03 +0100)]
[Clang] Convert test to opaque pointers (NFC)

A very annoying update, because some but not all of the zero-index
GEPs are omitted with opaque pointers.

18 months ago[GVN] Refactor handling of pointer-select in GVN pass
Sergey Kachkov [Thu, 22 Dec 2022 13:59:06 +0000 (16:59 +0300)]
[GVN] Refactor handling of pointer-select in GVN pass

This patch introduces a new type of memory dependency, Select, so that it
is handled consistently, like the Def/Clobber dependencies.

Differential Revision: https://reviews.llvm.org/D141619

18 months ago[mlir][NFC] Set `useFoldAPI` to `kEmitRawAttributesFolder` value for some dialects...
Markus Böck [Sun, 15 Jan 2023 13:52:05 +0000 (14:52 +0100)]
[mlir][NFC] Set `useFoldAPI` to `kEmitRawAttributesFolder` value for some dialects missed previously

Found these while working on https://reviews.llvm.org/D141604. These were previously not found due to the old implementation only emitting warnings if an Op has a `fold`.

Changing these values both avoids the deprecation warning and ensures that any new `fold`s added to ops of these dialects already use the new API.

Differential Revision: https://reviews.llvm.org/D141795

18 months ago[AArch64] Sink to umull if we know top bits are zero.
David Green [Mon, 16 Jan 2023 10:44:38 +0000 (10:44 +0000)]
[AArch64] Sink to umull if we know top bits are zero.

This is an extension to the code for sinking splats to multiplies, where
if we can detect that the top bits are known-zero we can treat the
instruction like a zext. The existing code was also adjusted in the
process to make it more precise about only sinking if both operands are
zext or sext.

Differential Revision: https://reviews.llvm.org/D141275
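
A small sketch of why known-zero top bits let an operand be treated like a zext: when the upper half of both wide operands is zero, a full-width multiply agrees with an unsigned widening multiply of the truncated halves. The 16/32-bit widths and the helper names are assumptions for illustration, not the AArch64 lowering code.

```python
NARROW = 16
WIDE = 32
NARROW_MASK = (1 << NARROW) - 1
WIDE_MASK = (1 << WIDE) - 1

def top_bits_known_zero(x):
    """True when the upper half of a WIDE-bit value is zero."""
    return (x >> NARROW) == 0

def wide_mul(a, b):
    """Ordinary WIDE-bit multiply, wrapping on overflow."""
    return (a * b) & WIDE_MASK

def umull(a, b):
    """Unsigned widening multiply of the low NARROW bits of each operand."""
    return (a & NARROW_MASK) * (b & NARROW_MASK)

samples = [0, 1, 0x00FF, 0x1234, 0xFFFF]
agree = all(
    wide_mul(a, b) == umull(a, b)
    for a in samples
    for b in samples
    if top_bits_known_zero(a) and top_bits_known_zero(b)
)
```

An operand such as `0x10001` fails the `top_bits_known_zero` check, and for it the two multiplies no longer agree.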

18 months ago[VPlan] Use VPDef prefix for VPDef IDs instead of VPRecipeBase (NFC).
Florian Hahn [Mon, 16 Jan 2023 10:23:51 +0000 (10:23 +0000)]
[VPlan] Use VPDef prefix for VPDef IDs instead of VPRecipeBase (NFC).

Various places in the code were still using the VPRecipeBase:: prefix
for VPDef IDs or no prefix at all. Now that the VPDef IDs have been
moved to VPDef, use this prefix instead and use it consistently.

18 months ago[SCEV][NFC] Make computeExitLimitFromCond public
Max Kazantsev [Mon, 16 Jan 2023 09:49:54 +0000 (16:49 +0700)]
[SCEV][NFC] Make computeExitLimitFromCond public

Make it available for external use.

Differential Revision: https://reviews.llvm.org/D141457
Reviewed By: lebedev.ri

18 months ago[AArch64][InstCombine] Simplify repeated complex patterns in dupqlane
Matt Devereau [Fri, 16 Dec 2022 11:19:28 +0000 (11:19 +0000)]
[AArch64][InstCombine] Simplify repeated complex patterns in dupqlane

Repeated floating-point complex patterns in dupqlane such as (f32 a, f32 b, f32
a, f32 b) can be simplified to shufflevector(f64(a, b), undef, 0)
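
The idea can be seen at the byte level: a repeated (a, b) pair of 32-bit floats is the same bit pattern as one 64-bit lane holding (a, b), broadcast across the vector. The sketch below only checks that layout claim with `struct`; the values are arbitrary and nothing here is the actual InstCombine code.

```python
import struct

def pack_f32(values):
    """Pack floats as consecutive little-endian 32-bit lanes."""
    return struct.pack("<%df" % len(values), *values)

a, b = 1.5, -2.25

# (f32 a, f32 b, f32 a, f32 b)
repeated = pack_f32([a, b, a, b])

# The (a, b) pair viewed as a single 64-bit lane, then splatted from lane 0.
pair_as_64 = pack_f32([a, b])
splat = pair_as_64 * 2
```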

18 months agoPartially revert 931d04be2fc8
Benjamin Kramer [Mon, 16 Jan 2023 09:37:53 +0000 (10:37 +0100)]
Partially revert 931d04be2fc8

This change makes StringRefTest fail in some configurations. This might be a
standard library issue or a miscompile.

18 months ago[clang][Interp][NFC] Use range for loop
Timm Bäder [Mon, 16 Jan 2023 07:39:37 +0000 (08:39 +0100)]
[clang][Interp][NFC] Use range for loop

18 months ago[LLDB] Fix help text for "platform settings"
David Spickett [Fri, 28 Oct 2022 09:13:13 +0000 (09:13 +0000)]
[LLDB] Fix help text for "platform settings"

This claims to take a platform name argument but doesn't.

That was probably the intent in fbb7634934d40548b650574a2f2a85ab41527674
but it has only ever worked with the current platform.

Reviewed By: clayborg

Differential Revision: https://reviews.llvm.org/D136928

18 months ago[WebAssembly] Convert test to opaque pointers (NFC)
Nikita Popov [Mon, 16 Jan 2023 09:27:46 +0000 (10:27 +0100)]
[WebAssembly] Convert test to opaque pointers (NFC)

This test was testing both typed and opaque pointers. Remove the
typed pointer check lines, and update the input IR to use opaque
pointers. Note that with opaque pointers, the "bitcast" is not
explicit, but rather just a mismatch in function type between
the call and the declaration.

18 months ago[WebAssembly] Remove redundant opaque pointers test (NFC)
Nikita Popov [Mon, 16 Jan 2023 09:26:11 +0000 (10:26 +0100)]
[WebAssembly] Remove redundant opaque pointers test (NFC)

add-prototype.ll has since been converted.

18 months ago[LoongArch] Convert tests to opaque pointers (NFC)
Nikita Popov [Mon, 16 Jan 2023 09:20:21 +0000 (10:20 +0100)]
[LoongArch] Convert tests to opaque pointers (NFC)

18 months ago[include-cleaner] FindHeaders respects IWYU export pragma for standard headers.
Haojian Wu [Fri, 13 Jan 2023 10:39:00 +0000 (11:39 +0100)]
[include-cleaner] FindHeaders respects IWYU export pragma for standard headers.

Fixes https://github.com/llvm/llvm-project/issues/59927

Differential Revision: https://reviews.llvm.org/D141670

18 months agoAdd Release Notes and Doc for -fmodule-output
Chuanqi Xu [Mon, 16 Jan 2023 08:58:13 +0000 (16:58 +0800)]
Add Release Notes and Doc for -fmodule-output

As explained in the summary of https://reviews.llvm.org/D137058,
the design of `-fmodule-output` changed relatively frequently,
so I skipped the release notes and docs for -fmodule-output in those
patches. Those patches have now been accepted and landed, so this patch adds
the related release notes and docs.

18 months agoRevert "[C2x] reject type definitions in offsetof"
Yingchi Long [Mon, 16 Jan 2023 08:52:50 +0000 (16:52 +0800)]
Revert "[C2x] reject type definitions in offsetof"

This reverts commit e327b52766ed497e4779f4e652b9ad237dfda8e6.

18 months ago[flang] Switch spread first argument lowering from asAddr to asBox
Valentin Clement [Mon, 16 Jan 2023 08:37:38 +0000 (09:37 +0100)]
[flang] Switch spread first argument lowering from asAddr to asBox

Use asBox so that non simply contiguous arguments do not issue a copy, and
so that polymorphic entities are supported out of the box.

Depends on D141667

Reviewed By: jeanPerier, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D141678

18 months ago[flang] Update createTempMutableBox to support polymorphic entities
Valentin Clement [Mon, 16 Jan 2023 08:36:28 +0000 (09:36 +0100)]
[flang] Update createTempMutableBox to support polymorphic entities

When creating a temporary from a polymorphic entity, its dynamic type
information must be carried over to the temporary.
This patch updates createTempMutableBox to support passing a source_box
from which the information will be carried over.
This is tested on the spread intrinsic, and follow-up patches will update
other temporary creation where needed.

Reviewed By: jeanPerier, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D141667

18 months ago[flang][NFC] centralize FreeMemOp generation in IntrinsicCall.cpp
Jean Perier [Mon, 16 Jan 2023 08:20:54 +0000 (09:20 +0100)]
[flang][NFC] centralize FreeMemOp generation in IntrinsicCall.cpp

The current intrinsic call lowering contains a lot of repetitive
patterns when it comes to dealing with temporary allocatable
results allocated by the runtime that need to be dereferenced and
for which a clean-up (free) must be scheduled in the StatementContext.

For HLFIR lowering, I will need to deal with the clean-up in a different
way since the results will be "moved" into expression nodes and
the clean-up will be inserted in bufferization after the last hlfir.expr
usage. Centralizing the clean-up code will make that easier, and is
regardless of this motivation a quality improvement.

Some static helpers had to be moved to IntrinsicBuilder method so that
they could call the readAndAddCleanUp code.

Differential Revision: https://reviews.llvm.org/D141669

18 months ago[MLIR] Fix tiling for `tensor.unpack` with outer permutations
Lorenzo Chelini [Fri, 13 Jan 2023 16:32:34 +0000 (17:32 +0100)]
[MLIR] Fix tiling for `tensor.unpack` with outer permutations

An outer dim permutation requires adjusting the offsets and sizes of the
`tensor.extract_slice` operations generated during tiling. Originally
this was done by computing an inverse permutation of the outer
permutation for both `tensor.pack` and `tensor.unpack`. For packing, the
tiling is applied on interchanged dimensions; thus, it is correct to
compute the inverse. For unpacking, on the other hand, tiling involves
the output tensor that does not have interchanged dimensions, and no
inverse is required.

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D141688
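
For readers unfamiliar with the term, the sketch below shows what computing and applying an inverse permutation does to a list of per-dimension offsets. It is a generic illustration with hypothetical helper names and placeholder offsets, not the MLIR tiling code; which of `tensor.pack` and `tensor.unpack` needs the inverse is as described in the message above.

```python
def invert_permutation(perm):
    """Return inv such that inv[perm[i]] == i for every i."""
    inv = [0] * len(perm)
    for i, p in enumerate(perm):
        inv[p] = i
    return inv

def apply_permutation(values, perm):
    """Reorder values so that result[i] == values[perm[i]]."""
    return [values[p] for p in perm]

outer_perm = [2, 0, 1]
offsets = ["o0", "o1", "o2"]  # placeholder per-dimension offsets

# Offsets expressed in the interchanged (permuted) dimension order.
permuted = apply_permutation(offsets, outer_perm)

# Applying the inverse permutation maps them back to the original order.
restored = apply_permutation(permuted, invert_permutation(outer_perm))
```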

18 months ago[lldb] Fix comments referring to BCR_M_IMVA_MATCH
Saagar Jha [Mon, 16 Jan 2023 07:38:41 +0000 (23:38 -0800)]
[lldb] Fix comments referring to BCR_M_IMVA_MATCH

It seems like these were copied from the single-step code and not
updated to match the new flags.

Differential revision: https://reviews.llvm.org/D141816

18 months ago[AMDGPU] Regenerate extend-phi-subrange-not-in-parent.mir
Pierre van Houtryve [Mon, 16 Jan 2023 07:28:08 +0000 (02:28 -0500)]
[AMDGPU] Regenerate extend-phi-subrange-not-in-parent.mir

18 months ago[RISCV] Invert an if statement in lowerSELECT to reduce nesting. NFC
Craig Topper [Mon, 16 Jan 2023 07:10:42 +0000 (23:10 -0800)]
[RISCV] Invert an if statement in lowerSELECT to reduce nesting. NFC

The if body returned at the end and contained more code than what
came after it. Reverse the condition and move the simpler code from
the end of the function into the new if body.

18 months ago[NFC] [Modules] Add test from PR60036
Chuanqi Xu [Mon, 16 Jan 2023 07:08:14 +0000 (15:08 +0800)]
[NFC] [Modules] Add test from PR60036

Although I failed to reproduce the problem in pr60036, it is
always good to have more tests.

18 months ago[ORC-RT] Fix a typo in file header.
Lang Hames [Mon, 16 Jan 2023 06:24:49 +0000 (22:24 -0800)]
[ORC-RT] Fix a typo in file header.

18 months ago[Driver] [C++20] [Modules] Don't emit unused-argument warning for '-fmodule-output...
Chuanqi Xu [Mon, 16 Jan 2023 06:07:58 +0000 (14:07 +0800)]
[Driver] [C++20] [Modules] Don't emit unused-argument warning for '-fmodule-output'  and '-fmodule-output='

Suppress the unused-argument warning for -fmodule-output according to
the discussion thread in
https://gcc.gnu.org/pipermail/gcc/2022-December/240239.html.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D140001

18 months ago[Driver] [C++20] [Modules] Support -fmodule-output= (2/2)
Chuanqi Xu [Mon, 16 Jan 2023 03:21:57 +0000 (11:21 +0800)]
[Driver] [C++20] [Modules] Support -fmodule-output= (2/2)

The patch implements `-fmodule-output=`. This is helpful if the build
system wants to generate these output files in a location other than
the one specified by -o or the directory containing the input file.

Reviewed By: dblaikie, iains

Differential Revision: https://reviews.llvm.org/D137059

18 months ago[NFC] Only run clang/test/Driver/module-output.cppm on x86 registered targets
Chuanqi Xu [Mon, 16 Jan 2023 05:42:36 +0000 (13:42 +0800)]
[NFC] Only run clang/test/Driver/module-output.cppm on x86 registered targets

On other targets (like ppc64-aix), the default output for `-c` may be `.s` instead of `.o`,
which makes the test fail. The patch requires the test to run only on
x86 registered targets to avoid the problem.

18 months ago[mlir][TilingInterface] NFC: Separate out a utility method to perform one step of...
Mahesh Ravishankar [Tue, 6 Dec 2022 06:24:54 +0000 (06:24 +0000)]
[mlir][TilingInterface] NFC: Separate out a utility method to perform one step of tile + fuse.

Differential Revision: https://reviews.llvm.org/D141027

18 months ago[mlir][TilingInterface] NFC: Consolidate yield handling.
Mahesh Ravishankar [Tue, 6 Dec 2022 06:07:05 +0000 (06:07 +0000)]
[mlir][TilingInterface] NFC: Consolidate yield handling.

Add a new utility method to yield the tiled value as well as
preserving destination passing style.

Differential Revision: https://reviews.llvm.org/D139392

18 months ago[mlir] Add a method to `RewriteBase` to replace a `Value` selectively.
Mahesh Ravishankar [Wed, 4 Jan 2023 20:32:18 +0000 (20:32 +0000)]
[mlir] Add a method to `RewriteBase` to replace a `Value` selectively.

This method lets the caller selectively control when to
replace the uses of a `Value`. It still notifies the rewriter that the
user is updated in-place.

Differential Revision: https://reviews.llvm.org/D141026

18 months ago[RISCV] Generate march string from target features
wangpc [Mon, 16 Jan 2023 03:07:34 +0000 (11:07 +0800)]
[RISCV] Generate march string from target features

As mentioned in D137517, this patch simplifies the
processor definitions in RISCV.td. We don't have to specify the march
string since we can generate it from target features.

Reviewed By: fpetrogalli, kito-cheng

Differential Revision: https://reviews.llvm.org/D141479

18 months ago[NFC] Require tests to skip on windows to avoid handling the different
Chuanqi Xu [Mon, 16 Jan 2023 03:39:04 +0000 (11:39 +0800)]
[NFC] Require tests to skip on windows to avoid handling the different
slash in the filesystem

The modified test fails on Windows because of the different slash direction ('/'
on Linux and '\' on Windows). The patch requires the test to be skipped on
Windows to avoid such differences.

18 months ago[InstCombine] Remove dead code from foldICmpShlOne. NFC
Craig Topper [Mon, 16 Jan 2023 02:55:15 +0000 (18:55 -0800)]
[InstCombine] Remove dead code from foldICmpShlOne. NFC

This code handles (icmp eq/ne (1 << Y), C) if C is a power of 2.

This case is also handled by the more general foldICmpShlConstConst
which is called before we reach foldICmpShlOne.
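
The equivalence behind this case is that `(1 << Y) == C` holds exactly when `Y == log2(C)` for a power-of-two `C`. The brute-force sketch below checks that for an assumed 8-bit width and in-range shift amounts; it is not the InstCombine implementation, and the function names are hypothetical.

```python
BW = 8  # bit width for the exhaustive check (an assumption)
MASK = (1 << BW) - 1

def shl_one(y):
    """1 << Y in BW-bit arithmetic."""
    return (1 << y) & MASK

def folded_eq(y, c):
    """Compare the shift amount with log2(c); valid only for powers of two."""
    return y == c.bit_length() - 1

powers_of_two = [1 << k for k in range(BW)]
counterexamples = [
    (y, c)
    for y in range(BW)  # shift amounts >= BW are not considered
    for c in powers_of_two
    if (shl_one(y) == c) != folded_eq(y, c)
]
```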

18 months ago[Driver] [Modules] Support -fmodule-output (1/2)
Chuanqi Xu [Wed, 28 Dec 2022 05:48:08 +0000 (13:48 +0800)]
[Driver] [Modules] Support -fmodule-output (1/2)

Patches to support the one-phase compilation model for modules.

The behavior:
(1) If -o and -c are specified, the module file is placed in the same
directory as the output specified by -o, with the same name and a new
suffix .pcm.
(2) Otherwise, the module file is placed in the working
directory, with the name of the input file and a new suffix
.pcm.

For example,

```
Hello.cppm Use.cpp
```

The example is trivial and the file contents are ignored. When we run:

```
clang++ -std=c++20 -fmodule-output Hello.cppm -c
```

The directory would look like:

```
Hello.cppm  Hello.o  Hello.pcm Use.cpp
```

And if we run:

```
clang++ -std=c++20 -fmodule-output Hello.cppm -c -o output/Hello.o
```

Then the `output` directory may look like:

```
Hello.o  Hello.pcm
```
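
The two naming rules can be sketched as a small path computation. This is an illustration under assumptions, not the clang driver code: `fileName`, `withExtension`, and `moduleOutputPath` are hypothetical helpers, only '/' separators are handled, and an empty string stands for "no -o given".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Strips any directory components, keeping only the file name.
inline std::string fileName(const std::string &path) {
  std::size_t slash = path.find_last_of('/');
  return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Replaces the extension of the last path component with `ext`.
inline std::string withExtension(const std::string &path,
                                 const std::string &ext) {
  std::size_t slash = path.find_last_of('/');
  std::size_t dot = path.find_last_of('.');
  bool hasExt = dot != std::string::npos &&
                (slash == std::string::npos || dot > slash);
  return (hasExt ? path.substr(0, dot) : path) + ext;
}

// `objectOutput` is the argument of -o, or empty when -o was not given.
inline std::string moduleOutputPath(const std::string &input,
                                    const std::string &objectOutput) {
  if (!objectOutput.empty())
    return withExtension(objectOutput, ".pcm");   // rule (1)
  return withExtension(fileName(input), ".pcm");  // rule (2)
}
```

With these helpers, `moduleOutputPath("Hello.cppm", "output/Hello.o")` yields `output/Hello.pcm`, matching the directory listing above.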

Reviewed By: dblaikie, iains, tahonermann

Differential Revision: https://reviews.llvm.org/D137058

18 months ago[DAG] Recombine (binop (shift x y))
Amaury Séchet [Sun, 15 Jan 2023 21:53:46 +0000 (21:53 +0000)]
[DAG] Recombine (binop (shift x y))

This helps address regressions in D127115.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D141809

18 months ago[X86] Add AVX512FP16 test coverage to splat(fpext) tests.
Freddy Ye [Mon, 16 Jan 2023 02:18:04 +0000 (10:18 +0800)]
[X86] Add AVX512FP16 test coverage to splat(fpext) tests.

18 months ago[FunctionComparator] Clamp StringRef compare output to [-1,1]
Benjamin Kramer [Mon, 16 Jan 2023 00:43:35 +0000 (01:43 +0100)]
[FunctionComparator] Clamp StringRef compare output to [-1,1]

The comparison can have different values (but same sign) on big endian
platforms, avoid that to make the unit test green there.
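
Clamping a three-way comparison result to [-1,1] can be sketched as below. It uses `std::string_view::compare` as a stand-in for the comparison being clamped; `clampedCompare` is a hypothetical name, not the function touched by the patch.

```cpp
#include <cassert>
#include <string_view>

// Reduces an arbitrary negative/zero/positive comparison result to
// exactly -1, 0, or 1, so the value no longer depends on the platform.
inline int clampedCompare(std::string_view lhs, std::string_view rhs) {
  int raw = lhs.compare(rhs);
  return (raw > 0) - (raw < 0);
}
```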

18 months ago[NFC][X86] Improve test coverage for shuffles-of-shifts
Roman Lebedev [Sun, 15 Jan 2023 21:52:06 +0000 (00:52 +0300)]
[NFC][X86] Improve test coverage for shuffles-of-shifts

18 months ago[InstCombine] Generalize (icmp sgt (1 << Y), -1) -> (icmp ne Y, BitWidth-1) to any...
Craig Topper [Sun, 15 Jan 2023 21:36:57 +0000 (13:36 -0800)]
[InstCombine] Generalize (icmp sgt (1 << Y), -1) -> (icmp ne Y, BitWidth-1) to any negative constant.

Similar for the sle version which will be canonicalized to slt first.
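
Besides the Alive2 proofs, the fold can be sanity-checked by brute force at a small width. A minimal sketch, assuming i8 and in-range shift amounts (out-of-range shifts are poison in IR and are skipped); the helper name is made up.

```cpp
#include <cassert>

// Checks, for every negative i8 constant c and every shift amount y in
// [0, 7], that ((1 << y) sgt c) agrees with (y != BitWidth - 1).
inline bool sgtNegativeConstantFoldHolds() {
  for (int c = -128; c <= -1; ++c) {
    for (unsigned y = 0; y < 8; ++y) {
      int shifted = 1 << y;
      if (shifted >= 128)
        shifted -= 256;  // reinterpret the 8-bit pattern as signed i8
      bool original = shifted > c;
      bool folded = y != 7;
      if (original != folded)
        return false;
    }
  }
  return true;
}
```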

Alive2 proof as implemented: https://alive2.llvm.org/ce/z/_YawdM

@spatel's original Alive2: https://alive2.llvm.org/ce/z/3YB2vs

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D141773

18 months ago[NFC][X86] Add tests for splat-in-disguise of shift-by-imm of splat
Roman Lebedev [Sun, 15 Jan 2023 18:45:32 +0000 (21:45 +0300)]
[NFC][X86] Add tests for splat-in-disguise of shift-by-imm of splat

18 months ago[NFC][TargetLowering] `isSplatValueForTargetNode()`: add `DAG` operand
Roman Lebedev [Sun, 15 Jan 2023 18:36:34 +0000 (21:36 +0300)]
[NFC][TargetLowering] `isSplatValueForTargetNode()`: add `DAG` operand

Without it we can't recurse further.

18 months ago[lldb] Unbreak test after 931d04be2fc8f3f0505b43e64297f75d526cb42a
Benjamin Kramer [Sun, 15 Jan 2023 20:39:31 +0000 (21:39 +0100)]
[lldb] Unbreak test after 931d04be2fc8f3f0505b43e64297f75d526cb42a

18 months ago[ADT] Forward some StringRef::find overloads to std::string_view
Benjamin Kramer [Sun, 15 Jan 2023 19:58:09 +0000 (20:58 +0100)]
[ADT] Forward some StringRef::find overloads to std::string_view

These are identical in terms of functionality and performance (checked
libc++ and libstdc++). We could do the same for rfind, but that actually
has an off-by-one on its position argument.

StringRef::find(StringRef) seems to be quite a bit more optimized than
the standard library one, so leave it alone.
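
The rfind off-by-one can be illustrated with a toy function. The sketch assumes that the existing `StringRef::rfind(char, From)` treats `From` as an exclusive upper bound, while `std::string_view::rfind(ch, pos)` treats `pos` as the last index it may match; `rfindExclusive` is a hypothetical stand-in, not LLVM code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// Searches backwards over indices strictly below `from` (assumed
// StringRef-style semantics).
inline std::size_t rfindExclusive(std::string_view s, char c,
                                  std::size_t from) {
  std::size_t i = std::min(from, s.size());
  while (i != 0) {
    --i;
    if (s[i] == c)
      return i;
  }
  return std::string_view::npos;
}
```

For `"abca"` and position 3, `std::string_view::rfind('a', 3)` matches index 3, whereas the exclusive variant only finds index 0.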

18 months ago[ADT] Make StringRef::compare like std::string_view::compare
Benjamin Kramer [Sun, 15 Jan 2023 19:56:34 +0000 (20:56 +0100)]
[ADT] Make StringRef::compare like std::string_view::compare

string_view has a slightly weaker contract, which only specifies whether
the value is bigger or smaller than 0. Adapt users accordingly and just
forward to the standard function (that also compiles down to memcmp)

18 months ago[OpenMP][JIT] Introduce more debugging configuration options
Johannes Doerfert [Sun, 15 Jan 2023 19:28:03 +0000 (11:28 -0800)]
[OpenMP][JIT] Introduce more debugging configuration options

The JIT is a great debugging tool since we can modify the IR manually
before launching it in an existing test case. The new flags allow
skipping optimizations, using the exact given IR, and providing a
finished object file. The latter is useful to try out different backend
options and to have complete freedom with pass pipelines.

Documentation is included. Minimal refactoring was performed to make the
second object fit in nicely.

18 months ago[OpenMP][JIT] Cleanup JIT interface, caching, and races
Johannes Doerfert [Wed, 4 Jan 2023 19:33:44 +0000 (11:33 -0800)]
[OpenMP][JIT] Cleanup JIT interface, caching, and races

The JIT interface was somewhat irregular as it used multiple global
functions. It also did not cache the results of the JIT, hence multiple
GPU systems would perform the work multiple times. Finally, there might
have been races on the state if we have multi-threaded initialization of
different embedded images, or one image initialized on multiple devices.

This patch tries to rectify all of the above. The JITEngine is now a
part of the GenericPluginTy and tied to one target triple. To support
multiple "ComputeUnitKind"s (previously confusingly called Arch or
[M]CPU) and to avoid re-jitting for the same ComputeUnitKind, we keep a
map of JIT results per ComputeUnitKind. All interaction with the JIT
happens through the JITEngine directly; two functions are exposed. Both
use (shared) locks to avoid races and cache the result. All JIT-related
environment variables are now defined together.
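
The caching scheme described above might look roughly like this. It is a simplified sketch under assumptions: `JitCache` and `getOrCompile` are hypothetical names, results are modeled as strings, and a plain mutex is used where the patch mentions (shared) locks.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Caches one JIT result per compute unit kind and guards the map with a
// lock so concurrent initializations do not race or repeat the work.
class JitCache {
public:
  explicit JitCache(std::function<std::string(const std::string &)> compile)
      : compile_(std::move(compile)) {}

  // Returns the cached result for `computeUnitKind`, invoking the
  // compile callback at most once per kind.
  const std::string &getOrCompile(const std::string &computeUnitKind) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = results_.find(computeUnitKind);
    if (it == results_.end())
      it = results_.emplace(computeUnitKind, compile_(computeUnitKind)).first;
    return it->second;
  }

private:
  std::mutex mutex_;
  std::map<std::string, std::string> results_;
  std::function<std::string(const std::string &)> compile_;
};
```

Holding the lock during compilation keeps the sketch short but serializes unrelated kinds; a real implementation could lock per entry instead.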

Differential Revision: https://reviews.llvm.org/D141081

18 months ago[OpenMP][NFC] Introduce helper functions to hide casts and such
Johannes Doerfert [Wed, 28 Dec 2022 06:02:47 +0000 (22:02 -0800)]
[OpenMP][NFC] Introduce helper functions to hide casts and such

Differential Revision: https://reviews.llvm.org/D140719

18 months ago[lldb] Fix typos and update "GDB To LLDB Command Map" to be a bit more clear
Saagar Jha [Sun, 15 Jan 2023 19:09:27 +0000 (11:09 -0800)]
[lldb] Fix typos and update "GDB To LLDB Command Map" to be a bit more clear

I've gone through the GDB To LLDB Command Map and tried to improve it:

 - Fix obvious typos (e.g. <cope> → <code>)
 - Wrap code and program names in <code> tags
 - Reword a couple parts where (IMHO) the phrasing could be a bit better

Differential revision: https://reviews.llvm.org/D28758

18 months ago[DAGCombiner] `combineShuffleOfSplatVal()`: try to canonicalize to a splat shuffle
Roman Lebedev [Sun, 15 Jan 2023 17:20:44 +0000 (20:20 +0300)]
[DAGCombiner] `combineShuffleOfSplatVal()`: try to canonicalize to a splat shuffle

As noted in https://reviews.llvm.org/D141778#inline-1369900,
we fail to produce splat shuffles from certain sequences
of shuffles that may have non-shuffles in the middle of the sequence.

There is a big pitfall to avoid here: just because `isSplatValue()`
says that all demanded elements are splat, we can't pick any random
one of them, because some of them could be undef! We must ignore those!
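
The pitfall can be sketched with a toy vector model in which `std::nullopt` stands for an undef lane. `pickSplatSourceLane` is a hypothetical helper for illustration, not the DAGCombiner code.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Returns the index of the first demanded lane that holds a defined
// value, or std::nullopt if every demanded lane is undef.
// A lane holding std::nullopt models an undef element.
inline std::optional<std::size_t>
pickSplatSourceLane(const std::vector<std::optional<int>> &lanes,
                    const std::vector<bool> &demanded) {
  for (std::size_t i = 0; i < lanes.size() && i < demanded.size(); ++i)
    if (demanded[i] && lanes[i].has_value())
      return i;
  return std::nullopt;
}
```

Picking lane 0 blindly in `{undef, 7, 7, undef}` would splat an undef value; skipping undef lanes selects lane 1 instead.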

18 months ago[gn build] Port 94461822c75d
LLVM GN Syncbot [Sun, 15 Jan 2023 17:42:57 +0000 (17:42 +0000)]
[gn build] Port 94461822c75d

18 months agoDAG: Avoid stack lowering if bitcast has an illegal vector result type
Matt Arsenault [Fri, 13 Jan 2023 21:15:52 +0000 (16:15 -0500)]
DAG: Avoid stack lowering if bitcast has an illegal vector result type

A bitcast of <10 x i32> to <5 x i64> was ending up on the
stack. Instead of doing that, handle the case where the new type
doesn't evenly divide but the elements do. Extract the individual
elements and pad with undef.

Avoids stack usage for bitcasts involving <5 x i64>. In some of these
cases, later optimizations actually eliminated the stack objects but
left behind the unused temporary stack object to final emission.

Fixes: SWDEV-377548

18 months ago[libc++][ranges] implement `std::views::elements_view`
Hui Xie [Fri, 4 Nov 2022 11:53:38 +0000 (11:53 +0000)]
[libc++][ranges] implement `std::views::elements_view`

`subrange` is also `tuple-like`. To avoid adding all of `subrange`'s dependencies to `tuple-like`, we need a forward declaration of `subrange`. However, the class template constraints of `subrange` currently require `__iterator/concepts.h`, which requires `<concepts>`. The problem is that `tuple-like` is currently used in several different places, including the libc++ extension for pair constructors, and we don't want to add `<concepts>` to pair and other headers. So this change also creates several small headers that `subrange`'s declaration needs inside `__iterator/concepts/`.

Differential Revision: https://reviews.llvm.org/D136268

18 months ago[MLIR] Simplify predicate in Matchers.h, NFC
Chris Lattner [Sun, 15 Jan 2023 06:05:54 +0000 (22:05 -0800)]
[MLIR] Simplify predicate in Matchers.h, NFC

The ConstantLike trait already static_asserts that operations
implementing it have a single result and zero operands, so we
don't need to redundantly check in Matchers.h

The static assert is in `class ConstantLike` in OpDefinition.h

Differential Revision: https://reviews.llvm.org/D141783

18 months ago[mlir][ods] Rework how transitive use of deprecated defs are handled
Markus Böck [Sun, 15 Jan 2023 13:52:14 +0000 (14:52 +0100)]
[mlir][ods] Rework how transitive use of deprecated defs are handled

The code currently attempting to recursively find uses of a deprecated def has a few deficiencies:
* It recurses into all def uses. This is problematic as it also causes any users of a def using a deprecated def to be considered deprecated, causing a transitive chain of deprecated defs (see `H_ButNotTransitivelyInNonAnonymousDef` in test case for reproducer)
* It did not recurse into other kinds of fields, such as lists and DAGs

This patch fixes the issue by reworking the code to properly recurse into inits and not to recurse into def uses unless they are anonymous defs. Since inits (including DAG, List and anonymous defs) are uniqued, the memoization is kept and remains profitable.

Differential Revision: https://reviews.llvm.org/D141794

18 months ago[Support] clang-format partMSB and partLSB (NFC)
Kazu Hirata [Sun, 15 Jan 2023 17:13:26 +0000 (09:13 -0800)]
[Support] clang-format partMSB and partLSB (NFC)

18 months agoUse the default parameters of countTrailingZeros and find{First,Last}Set (NFC)
Kazu Hirata [Sun, 15 Jan 2023 17:04:57 +0000 (09:04 -0800)]
Use the default parameters of countTrailingZeros and find{First,Last}Set (NFC)

This patch uses the default parameters of countTrailingZeros,
findFirstSet, and findLastSet, which are ZB_Width, ZB_Max, and ZB_Max,
respectively.
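
What those defaults mean for a zero input can be sketched with toy 32-bit versions. The `*Sketch` functions are hypothetical reimplementations for illustration, not the LLVM helpers: the width default returns the bit width for zero, and the max default returns the largest value of the type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Zero input returns the bit width (the ZB_Width behavior).
inline unsigned countTrailingZerosSketch(uint32_t v) {
  if (v == 0)
    return 32;
  unsigned n = 0;
  while ((v & 1u) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Index of the lowest set bit; zero input returns the maximum value
// of the type (the ZB_Max behavior).
inline uint32_t findFirstSetSketch(uint32_t v) {
  if (v == 0)
    return std::numeric_limits<uint32_t>::max();
  return countTrailingZerosSketch(v);
}

// Index of the highest set bit; zero input returns the maximum value
// of the type (the ZB_Max behavior).
inline uint32_t findLastSetSketch(uint32_t v) {
  if (v == 0)
    return std::numeric_limits<uint32_t>::max();
  uint32_t index = 0;
  while (v >>= 1)
    ++index;
  return index;
}
```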

18 months agoGlobalISel: Enable CSE for G_SEXT_INREG
Matt Arsenault [Tue, 21 Jul 2020 23:10:01 +0000 (19:10 -0400)]
GlobalISel: Enable CSE for G_SEXT_INREG

18 months ago[X86] Move isShuffleMaskInputInPlace to allow additional uses in a future patch....
Simon Pilgrim [Sun, 15 Jan 2023 15:43:19 +0000 (15:43 +0000)]
[X86] Move isShuffleMaskInputInPlace to allow additional uses in a future patch. NFCI.

A future patch needs isShuffleMaskInputInPlace defined earlier in the source file.

18 months agoValueTracking: Teach CannotBeOrderedLessThanZero about rounding intrinsics
Matt Arsenault [Sun, 4 Dec 2022 04:24:01 +0000 (23:24 -0500)]
ValueTracking: Teach CannotBeOrderedLessThanZero about rounding intrinsics

These should obviously preserve the sign although the variety of these
always confuses me.
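
The property can be spot-checked with the C++ standard library rounding functions, used here as stand-ins for the intrinsics. A minimal sketch; both helper names are made up, and "not ordered less than zero" is modeled as `!(v < 0.0)`, which also admits NaN and -0.0.

```cpp
#include <cassert>
#include <cmath>

// True when v is NOT ordered-less-than zero (v >= 0, -0.0, or NaN).
inline bool notOrderedLessThanZero(double v) { return !(v < 0.0); }

// If x is not ordered-less-than zero, checks that none of the rounding
// results is either. Returns true vacuously for negative x.
inline bool roundingPreservesProperty(double x) {
  if (!notOrderedLessThanZero(x))
    return true;
  return notOrderedLessThanZero(std::floor(x)) &&
         notOrderedLessThanZero(std::ceil(x)) &&
         notOrderedLessThanZero(std::trunc(x)) &&
         notOrderedLessThanZero(std::round(x)) &&
         notOrderedLessThanZero(std::rint(x)) &&
         notOrderedLessThanZero(std::nearbyint(x));
}
```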