platform/upstream/llvm.git
20 months ago[LinkerWrapper] Perform device linking steps in parallel
Joseph Huber [Tue, 25 Oct 2022 17:28:28 +0000 (12:28 -0500)]
[LinkerWrapper] Perform device linking steps in parallel

This patch changes the device linking steps to be performed in parallel
when multiple offloading architectures are being used. We use the LLVM
parallelism support to accomplish this by simply doing each individual
device linking job in a single thread. This change required re-parsing
the input arguments as these arguments have internal state that would
not be properly shared between the threads otherwise.

By default, the parallelism uses all available threads, but this can be
controlled with the `--wrapper-jobs=` option. This was required in a few
tests to ensure the ordering was still deterministic.
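
As an illustration only (this is not the code from the patch), the scheduling idea can be sketched in standalone C++: each device link job runs to completion on a single thread, a job limit plays the role of `--wrapper-jobs=`, and results are stored by input index so the output order stays deterministic. `LinkJob` and `runLinkJobs` are hypothetical names, and `std::thread` stands in for the LLVM parallelism support that the patch actually uses.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for one per-architecture device link job.
using LinkJob = std::function<std::string()>;

// Runs every job on at most `MaxJobs` worker threads (0 = use the hardware
// default) and returns the results in input order, whatever the completion
// order was.
std::vector<std::string> runLinkJobs(const std::vector<LinkJob> &Jobs,
                                     unsigned MaxJobs = 0) {
  if (MaxJobs == 0)
    MaxJobs = std::max(1u, std::thread::hardware_concurrency());

  std::vector<std::string> Results(Jobs.size());
  std::atomic<std::size_t> Next{0};

  auto Worker = [&] {
    // Each worker claims the next unclaimed job index until none remain.
    for (std::size_t I = Next++; I < Jobs.size(); I = Next++) {
      assert(Jobs[I] && "every job must be callable");
      Results[I] = Jobs[I](); // distinct index per job, so no data race
    }
  };

  std::vector<std::thread> Threads;
  const std::size_t NumThreads = std::min<std::size_t>(MaxJobs, Jobs.size());
  for (std::size_t T = 0; T < NumThreads; ++T)
    Threads.emplace_back(Worker);
  for (std::thread &T : Threads)
    T.join();
  return Results;
}
```

Passing a limit of 1 serializes the jobs, which is the kind of control the tests mentioned above need.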

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D136701

20 months ago[lldb] Update regex to be less fragile in TestDataFormatterGenericUnordered
Dave Lee [Fri, 11 Nov 2022 19:07:37 +0000 (11:07 -0800)]
[lldb] Update regex to be less fragile in TestDataFormatterGenericUnordered

Follow up to D129386 where libc++ naming conventions were made consistent.

This changes the pattern to not rely on the internal name (`__cc` or `__cc_`),
and instead uses a pattern to check that the child has the form:

```
[0] = {
  first = ...
```

Thanks to @rupprecht for pointing out this issue: https://reviews.llvm.org/D133259#3773120

Reviewed By: rupprecht

Differential Revision: https://reviews.llvm.org/D133395

20 months ago[lldb] [cmake] Fix typo in unittest directory path
Michał Górny [Fri, 11 Nov 2022 19:38:56 +0000 (20:38 +0100)]
[lldb] [cmake] Fix typo in unittest directory path

Fix a typo in a11cd0d94ed3cabf0998a0289aead05da94c86eb that resulted
in an additional "}" in the unittest directory path, e.g.:

    CMake Error at cmake/modules/LLDBStandalone.cmake:104 (add_subdirectory):
      add_subdirectory given source
      "/var/tmp/portage/dev-util/lldb-16.0.0_pre20221111/work/lldb/../third-party}/utils/unittest"
      which is not an existing directory.
    Call Stack (most recent call first):
      CMakeLists.txt:29 (include)

20 months agoRevert "[Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64 platfo...
Alex Brachet [Fri, 11 Nov 2022 19:40:08 +0000 (19:40 +0000)]
Revert "[Clang][AArch64][Darwin] Enable GlobalISel by default for Darwin ARM64 platforms."

This reverts commit f64802e8d3e9db299cad913ffcb734c8d35dc5f0.

20 months agoAdd a const version of SDUse::getUser [nfc]
Philip Reames [Fri, 11 Nov 2022 19:10:29 +0000 (11:10 -0800)]
Add a const version of SDUse::getUser [nfc]

20 months agoModel UB in integer division operations in the arith dialect
Sanjoy Das [Fri, 11 Nov 2022 05:31:33 +0000 (21:31 -0800)]
Model UB in integer division operations in the arith dialect

Before this commit `arith.{ceil}div{u|s}i` were marked `Pure` which is
incorrect because these operations invoke UB on certain inputs.
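
The message does not list the offending inputs. A minimal sketch, assuming the usual cases for fixed-width integer division (a zero divisor, and for signed division the minimum value divided by -1), could look like this in plain C++. `divSIHasUB` and `divUIHasUB` are hypothetical helper names, not part of MLIR.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumed UB cases for a 32-bit signed division: a zero divisor, or
// INT32_MIN / -1, whose true quotient (2^31) does not fit in 32 bits.
bool divSIHasUB(std::int32_t LHS, std::int32_t RHS) {
  return RHS == 0 ||
         (LHS == std::numeric_limits<std::int32_t>::min() && RHS == -1);
}

// Assumed UB case for a 32-bit unsigned division: a zero divisor only.
bool divUIHasUB(std::uint32_t /*LHS*/, std::uint32_t RHS) { return RHS == 0; }
```

An operation with such inputs cannot be treated as side-effect free, since hoisting or speculating it may introduce UB on a path that never executed it.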

Fixes: https://github.com/llvm/llvm-project/issues/58700

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D137814

20 months ago[ObjectYAML] Basic support for chained fixups.
Daniel Rodríguez Troitiño [Fri, 11 Nov 2022 18:13:37 +0000 (10:13 -0800)]
[ObjectYAML] Basic support for chained fixups.

Add basic binary support for chained fixups. This allows basic tests
with chained fixups without trying to create a format for them until the
work on the Object library is considered finished.

Reviewed By: pete

Differential Revision: https://reviews.llvm.org/D134250

20 months agoFix typo; NFC
Aaron Ballman [Fri, 11 Nov 2022 17:22:56 +0000 (12:22 -0500)]
Fix typo; NFC

Co-authored-by: Guillot Tony <tony.guillot@protonmail.com>
20 months agoRevert "[LLDB] Devirtualize coroutine promise types for `std::coroutine_handle`"
Adrian Vogelsgesang [Fri, 11 Nov 2022 17:59:08 +0000 (09:59 -0800)]
Revert "[LLDB] Devirtualize coroutine promise types for `std::coroutine_handle`"

This reverts commit 558db7787005348e2efaabb628ec36f1c461a741 due to
buildbot failures on ARM
* https://lab.llvm.org/buildbot/#/builders/96/builds/31416
* https://lab.llvm.org/buildbot/#/builders/17/builds/30086

20 months ago[NFC][AArch64]Call encoding functions for left-shift immediate (which is no-op in...
Mingming Liu [Thu, 10 Nov 2022 20:12:21 +0000 (12:12 -0800)]
[NFC][AArch64]Call encoding functions for left-shift immediate (which is no-op in terms of value but better code style)

Call encoding functions for left-shift immediate for consistency (and
easier tracking if the encoding ever changes in the future).

Differential Revision: https://reviews.llvm.org/D137797

20 months ago[mlir][sparse] Extend more integration to run on the codegen path.
bixia1 [Fri, 11 Nov 2022 16:46:55 +0000 (08:46 -0800)]
[mlir][sparse] Extend more integration to run on the codegen path.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D137850

20 months ago[X86] Split int2double and float2double scheduler classes on Haswell/Broadwell to...
Simon Pilgrim [Fri, 11 Nov 2022 17:39:14 +0000 (17:39 +0000)]
[X86] Split int2double and float2double scheduler classes on Haswell/Broadwell to remove overrides

Haswell/Broadwell have numerous conversion instructions that use different scheduler pipes for the reg-reg and reg-mem variants (and not an additional Port23 uop for memory folding) - so declare the classes separately instead of using the HWWriteResPair/BWWriteResPair helpers

20 months ago[LLDB] Devirtualize coroutine promise types for `std::coroutine_handle`
Adrian Vogelsgesang [Wed, 24 Aug 2022 03:53:00 +0000 (20:53 -0700)]
[LLDB] Devirtualize coroutine promise types for `std::coroutine_handle`

This commit teaches the `std::coroutine_handle` pretty-printer to
devirtualize type-erased promise types. This is particularly useful to
reconstruct call stacks, either of asynchronous control flow or of
recursive invocations of `std::generator`. For the example recently
introduced by https://reviews.llvm.org/D132451, printing the `__promise`
variable now shows

```
(std::__coroutine_traits_sfinae<task, void>::promise_type) __promise = {
  continuation = coro frame = 0x555555562430 {
    resume = 0x0000555555556310 (a.out`task detail::chain_fn<1>() at llvm-nested-example.cpp:66)
    destroy = 0x0000555555556700 (a.out`task detail::chain_fn<1>() at llvm-nested-example.cpp:66)
    promise = {
      continuation = coro frame = 0x5555555623e0 {
        resume = 0x0000555555557070 (a.out`task detail::chain_fn<2>() at llvm-nested-example.cpp:66)
        destroy = 0x0000555555557460 (a.out`task detail::chain_fn<2>() at llvm-nested-example.cpp:66)
        promise = {
          ...
        }
      }
      result = 0
    }
  }
  result = 0
}
```

(shortened to keep the commit message readable) instead of

```
(std::__coroutine_traits_sfinae<task, void>::promise_type) __promise = {
  continuation = coro frame = 0x555555562430 {
    resume = 0x0000555555556310 (a.out`task detail::chain_fn<1>() at llvm-nested-example.cpp:66)
    destroy = 0x0000555555556700 (a.out`task detail::chain_fn<1>() at llvm-nested-example.cpp:66)
  }
  result = 0
}
```

Note how the new debug output reveals the complete asynchronous call
stack: our own function resumes `chain_fn<1>` which in turn will resume
`chain_fn<2>` and so on. Thereby this change allows users of lldb to
inspect the logical coroutine call stack without using any custom debug
scripts (although the display is still a bit clumsy; it would be nicer
to also integrate this into lldb's backtrace feature, but I don't know
how to do so).

The devirtualization currently works by introspecting the function
pointed to by the `destroy` pointer. (The `resume` pointer is not worth
much, given that for the final suspend point `resume` is set to a
nullptr. We have to use the `destroy` pointer instead.) We then look
for a `__promise` variable inside the `destroy` function. This
`__promise` variable is synthetically generated by LLVM, and looking at
its type reveals the type-erased promise_type.

This approach only works for clang-generated code, though. While gcc
also adds a `_Coro_promise` variable to the `resume` function, it does
not do so for the `destroy` function. However, we can't use the `resume`
function, as it will be reset to a nullptr at the final suspension
point. For the time being, I am happy with de-virtualization only working
for clang. A follow-up commit will further improve devirtualization and
also expose the variables spilled to the coroutine frame. As part of
this, I will also revisit gcc support.

Differential Revision: https://reviews.llvm.org/D132624

20 months agoAMDGPU: Disable some class simplifications for strictfp
Matt Arsenault [Fri, 11 Nov 2022 16:51:24 +0000 (08:51 -0800)]
AMDGPU: Disable some class simplifications for strictfp

20 months ago[mlir][sparse] Fix a bug in rewriting dense2dense convert op.
bixia1 [Fri, 11 Nov 2022 07:13:20 +0000 (23:13 -0800)]
[mlir][sparse] Fix a bug in rewriting dense2dense convert op.

Permutation wasn't handled correctly. Add a test for the rewriting.

Extend an integration test to run with enable_runtime_library=false to
also test the rewriting.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D137845

20 months agoconsistency: use spaces instead of tabs
Sylvestre Ledru [Fri, 11 Nov 2022 16:36:07 +0000 (17:36 +0100)]
consistency: use spaces instead of tabs

20 months ago[NFC] Remove unused OrigLoopID vars
Jordan Rupprecht [Fri, 11 Nov 2022 15:51:40 +0000 (07:51 -0800)]
[NFC] Remove unused OrigLoopID vars

20 months ago[LV] Remove unused OrigLoopID argument from completeLoopSkeleton (NFC).
Florian Hahn [Fri, 11 Nov 2022 15:39:07 +0000 (15:39 +0000)]
[LV] Remove unused OrigLoopID argument from completeLoopSkeleton (NFC).

The argument is not used any longer and can be removed.

20 months agoPrecommit for redundant and after SVE load
Benjamin Maxwell [Tue, 8 Nov 2022 11:52:55 +0000 (11:52 +0000)]
Precommit for redundant and after SVE load

20 months agoThe handling of 'funsafe-math-optimizations' doesn't update the 'MathErrno'
Zahira Ammarguellat [Mon, 7 Nov 2022 18:54:42 +0000 (13:54 -0500)]
The handling of 'funsafe-math-optimizations' doesn't update the 'MathErrno'
flag. But the driver checks for 'fno-math-errno' before passing
'funsafe-math-optimizations' to the FE. In GCC, the option
'funsafe-math-optimizations' doesn't affect the 'fmath-errno' flag.
This patch aligns clang with GCC.

'-ffast-math' sets the FPContract to 'fast', but the driver doesn't
consider the FPContract when handling 'funsafe-math-optimizations'.
Unfortunately there are places in the BE that interpret unsafe math
mode as allowing FMA. This patch makes '-ffast-math' and
'funsafe-math-optimizations' behave similarly with regard to the setting of the
FPContract.

Differential Revision: https://reviews.llvm.org/D137578

20 months ago[X86] Replace unnecessary CVTPS2DQ folded overrides with better base class defs
Simon Pilgrim [Fri, 11 Nov 2022 14:51:05 +0000 (14:51 +0000)]
[X86] Replace unnecessary CVTPS2DQ folded overrides with better base class defs

Broadwell just needed the load latency to be tweaked for the overrides to be unnecessary - I think this was due to Issue #38536 (underestimation of most Broadwell load latencies)

20 months ago[InstSimplify] add test for fsub with inf operand; NFC
Sanjay Patel [Fri, 11 Nov 2022 13:51:13 +0000 (08:51 -0500)]
[InstSimplify] add test for fsub with inf operand; NFC

Verify that constant negation works with a partial undef vector.
Also, remove a bogus TODO comment on a related test.

20 months ago[MemCpyOpt] Avoid moving lifetime marker above def (PR58903)
Nikita Popov [Fri, 11 Nov 2022 14:05:11 +0000 (15:05 +0100)]
[MemCpyOpt] Avoid moving lifetime marker above def (PR58903)

This is unlikely to happen with opaque pointers, so just bail out
of the transform, rather than trying to move bitcasts/etc as well.

Fixes https://github.com/llvm/llvm-project/issues/58903.

20 months ago[include-cleaner] NFC, move the macro location fixme to findHeaders.
Haojian Wu [Fri, 11 Nov 2022 13:54:20 +0000 (14:54 +0100)]
[include-cleaner] NFC, move the macro location fixme to findHeaders.

20 months ago[InstSimplify] fold fsub nnan with Inf operand
Sanjay Patel [Thu, 10 Nov 2022 22:35:36 +0000 (17:35 -0500)]
[InstSimplify] fold fsub nnan with Inf operand

Similar to fbc2c8f2fbbb, but if we have a non-canonical
fsub with constant operand 1, then flip the sign of the
Infinity:
https://alive2.llvm.org/ce/z/vKWfhW

If Infinity is operand 0, then the sign remains:
https://alive2.llvm.org/ce/z/73d97C
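
The two folds can be checked against ordinary IEEE-754 arithmetic. The sketch below is only an illustration in standalone C++ (`foldSubInfRHS` and `foldSubInfLHS` are hypothetical names, not InstSimplify code). It relies on the same NaN-free assumption as `nnan`: if X were an infinity of the same sign, the subtraction would produce NaN, so that case is excluded.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

constexpr double PosInf = std::numeric_limits<double>::infinity();

// Folded value of `X - Inf` once NaNs are ruled out: the sign of the
// infinity flips.
double foldSubInfRHS(double Inf) {
  assert(std::isinf(Inf) && "operand 1 must be +/-Inf");
  return -Inf;
}

// Folded value of `Inf - X` once NaNs are ruled out: the sign of the
// infinity is kept.
double foldSubInfLHS(double Inf) {
  assert(std::isinf(Inf) && "operand 0 must be +/-Inf");
  return Inf;
}
```

For any finite X, `X - PosInf` compares equal to `foldSubInfRHS(PosInf)` and `PosInf - X` compares equal to `foldSubInfLHS(PosInf)`; the same holds with `-PosInf`.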

20 months ago[include-cleaner] NFC, correct a comment in
Haojian Wu [Fri, 11 Nov 2022 13:40:08 +0000 (14:40 +0100)]
[include-cleaner] NFC, correct a comment in PragmaIncludes::RecordPragma.

20 months ago[mlir][bufferize][NFC] Consolidate transform header files
Matthias Springer [Fri, 11 Nov 2022 12:49:02 +0000 (13:49 +0100)]
[mlir][bufferize][NFC] Consolidate transform header files

Differential Revision: https://reviews.llvm.org/D137830

20 months ago[compiler-rt] Mark $t* as clobbered for Linux/LoongArch syscalls
XingLi [Fri, 11 Nov 2022 13:22:45 +0000 (21:22 +0800)]
[compiler-rt] Mark $t* as clobbered for Linux/LoongArch syscalls

Linux/LoongArch doesn't preserve temporary registers across syscalls,
so we have to explicitly mark them as clobbered to avoid trashing local variables.

Reviewed By: xry111, xen0n, tangyouling, SixWeining

Differential Revision: https://reviews.llvm.org/D137396

20 months agofor Vignesh: land changes to disable two recent ompd random fails
Ron Lieberman [Fri, 11 Nov 2022 13:09:03 +0000 (07:09 -0600)]
for Vignesh: land changes to disable two recent ompd random fails

Differential Revision: https://reviews.llvm.org/D137831

20 months ago[clang-include-cleaner] make SymbolLocation a real class, move FindHeaders
Sam McCall [Fri, 11 Nov 2022 11:41:45 +0000 (12:41 +0100)]
[clang-include-cleaner] make SymbolLocation a real class, move FindHeaders

- replace SymbolLocation std::variant with enum-exposing version similar to
  those in types.cpp. There's no appropriate implementation file, added
  LocateSymbol.cpp in anticipation of locateDecl/locateMacro.
- FindHeaders is not part of the public Analysis interface, so should not
  be implemented/tested there (just code organization)
- rename findIncludeHeaders->findHeaders to avoid confusion with Include concept

Differential Revision: https://reviews.llvm.org/D137825

20 months ago[Clang][LoongArch] Remove duplicate declaration. NFC
wanglei [Fri, 11 Nov 2022 12:36:18 +0000 (20:36 +0800)]
[Clang][LoongArch] Remove duplicate declaration. NFC

20 months ago[include-cleaner] Provide public to_string of RefType (for HTMLReport), clean up...
Sam McCall [Fri, 11 Nov 2022 12:25:22 +0000 (13:25 +0100)]
[include-cleaner] Provide public to_string of RefType (for HTMLReport), clean up includes. NFC

20 months ago[openmp] [test] Set the right calling convention for the Windows thread start function
Martin Storsjö [Thu, 10 Nov 2022 10:37:00 +0000 (10:37 +0000)]
[openmp] [test] Set the right calling convention for the Windows thread start function

This is required on i386 Windows; this fixes 99 testcases in that
build configuration.

Differential Revision: https://reviews.llvm.org/D137776

20 months ago[openmp] [test] Use omp_testsuite.h instead of directly including pthread.h
Martin Storsjö [Wed, 2 Nov 2022 13:35:50 +0000 (13:35 +0000)]
[openmp] [test] Use omp_testsuite.h instead of directly including pthread.h

OpenMP tests that use pthread functions include this header instead.
On Unix systems, this header includes pthread.h, while it provides
minimal implementations of the used pthread functions for Windows.

Differential Revision: https://reviews.llvm.org/D137746

20 months ago[openmp] [test] Fix building the affinity/format/fields_values.c testcase on Windows
Martin Storsjö [Wed, 2 Nov 2022 11:55:39 +0000 (11:55 +0000)]
[openmp] [test] Fix building the affinity/format/fields_values.c testcase on Windows

Add a missing <process.h> include for _getpid. Don't typedef the
pid_t type on mingw, as mingw headers already provide a typedef for
it.

Differential Revision: https://reviews.llvm.org/D137745

20 months ago[openmp] Fix building in debug mode with mingw
Martin Storsjö [Sun, 6 Nov 2022 22:57:07 +0000 (00:57 +0200)]
[openmp] Fix building in debug mode with mingw

Mingw doesn't provide the _malloc_dbg/_free_dbg functions.

Differential Revision: https://reviews.llvm.org/D137743

20 months ago[include-cleaner] verbatimSpelling->verbatim, clean up some silly init-lists. NFC
Sam McCall [Fri, 11 Nov 2022 11:10:01 +0000 (12:10 +0100)]
[include-cleaner] verbatimSpelling->verbatim, clean up some silly init-lists. NFC

20 months ago[mlir] Fix asan errors in gpu transform dialect
Guray Ozen [Fri, 11 Nov 2022 10:57:00 +0000 (11:57 +0100)]
[mlir] Fix asan errors in gpu transform dialect

20 months ago[mlir][bufferize] Eliminate tensor.empty ops instead of bufferization.alloc_tensor ops
Matthias Springer [Fri, 11 Nov 2022 09:32:05 +0000 (10:32 +0100)]
[mlir][bufferize] Eliminate tensor.empty ops instead of bufferization.alloc_tensor ops

tensor.empty op elimination is an optimization that brings IR into a more bufferization-friendly form. E.g.:

```
%0 = tensor.empty()
%1 = linalg.fill(%cst, %0) {inplace = [true]}
%2 = tensor.insert_slice %1 into %t[10][20][1]
```

Is rewritten to:

```
%0 = tensor.extract_slice %t[10][20][1]
%1 = linalg.fill(%cst, %0) {inplace = [true]}
%2 = tensor.insert_slice %1 into %t[10][20][1]
```

This optimization used to operate on bufferization.alloc_tensor ops. This is not correct because the documentation of bufferization.alloc_tensor says that it always bufferizes to an allocation. Instead, this optimization should operate on tensor.empty ops, which can then be lowered to bufferization.alloc_tensor ops (if they don't get eliminated).

Differential Revision: https://reviews.llvm.org/D137162

20 months ago[LoongArch] Generate PCALAU12I + JIRL instruction pair for medium codemodel
wanglei [Fri, 11 Nov 2022 10:13:52 +0000 (18:13 +0800)]
[LoongArch] Generate PCALAU12I + JIRL instruction pair for medium codemodel

On LoongArch, `CodeModel=Medium` only increases the PC-relative reach
of function calls, from 2^28 to 2^32.
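
To make the numbers concrete, here is a small arithmetic sketch. It assumes (the message does not say so) that the exponents are the widths of a signed PC-relative byte offset, so 28 bits reach roughly +/-128 MiB and 32 bits reach roughly +/-2 GiB. `fitsSignedOffset` is a hypothetical helper, not LoongArch backend code.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if `Delta` (callee address minus PC, in bytes) is
// representable as a signed offset of `Bits` bits, i.e. lies in
// [-2^(Bits-1), 2^(Bits-1) - 1].
bool fitsSignedOffset(std::int64_t Delta, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 63 && "unsupported offset width");
  const std::int64_t Half = std::int64_t{1} << (Bits - 1);
  return Delta >= -Half && Delta < Half;
}
```

Under that assumption, a callee 128 MiB away fails `fitsSignedOffset(Delta, 28)` but passes `fitsSignedOffset(Delta, 32)`, which is the case the instruction pair is meant to cover.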

Depends on D137393

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137394

20 months ago[AMDGPU][MC] Disable SGPRs as src operands of VOP3 VINTRP instructions
Dmitry Preobrazhensky [Fri, 11 Nov 2022 10:14:42 +0000 (13:14 +0300)]
[AMDGPU][MC] Disable SGPRs as src operands of VOP3 VINTRP instructions

Differential Revision: https://reviews.llvm.org/D137575

20 months ago[LoongArch] Moved expansion of PseudoCALL to LoongArchPreRAExpandPseudo pass
wanglei [Fri, 11 Nov 2022 01:52:26 +0000 (09:52 +0800)]
[LoongArch] Moved expansion of PseudoCALL to LoongArchPreRAExpandPseudo pass

This patch moves the expansion of the `PseudoCALL` instruction to
`LoongArchPreRAExpandPseudo` pass. This helps to expand into different
instruction sequences according to different CodeModels.

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137393

20 months ago[Hexagon] Use default attributes for intrinsics
Nikita Popov [Tue, 8 Nov 2022 10:48:03 +0000 (11:48 +0100)]
[Hexagon] Use default attributes for intrinsics

This switches Hexagon intrinsics to use the default attributes
(nosync, nofree, nocallback and willreturn). Especially willreturn
is needed to prevent optimization regressions in the future.

The only intrinsics I've excluded here are the load/store locked
intrinsics, which presumably aren't nosync.

Differential Revision: https://reviews.llvm.org/D137623

20 months agoRevert "Revert "[mlir][linalg] Replace "string" iterator_types attr with enums in...
Oleg Shyshkov [Thu, 10 Nov 2022 10:42:48 +0000 (11:42 +0100)]
Revert "Revert "[mlir][linalg] Replace "string" iterator_types attr with enums in LinalgInterface.""

With python code fixed.

This reverts commit 41280908e43d47903960c66237ab49caa5641b4d.

20 months ago[mlir] Fix forward the fix for incorrect Optional<ArrayAttr> usage.
Alexander Belyaev [Fri, 11 Nov 2022 09:52:08 +0000 (10:52 +0100)]
[mlir] Fix forward the fix for incorrect Optional<ArrayAttr> usage.

20 months ago[Test] Add test for crash in IRCE when IV is AddRec for another loop
Dmitry Makogon [Fri, 11 Nov 2022 09:45:22 +0000 (16:45 +0700)]
[Test] Add test for crash in IRCE when IV is AddRec for another loop

This adds a test for https://github.com/llvm/llvm-project/issues/58912.
IRCE crashes when it tries to check whether it is possible to safely
calculate the bounds of a loop with IV AddRec which is in another loop.

20 months ago[mlir] Fix incorrect access to the Optional<ArrayAttr> underlying values.
Alexander Belyaev [Fri, 11 Nov 2022 09:46:04 +0000 (10:46 +0100)]
[mlir] Fix incorrect access to the Optional<ArrayAttr> underlying values.

20 months ago[include-cleaner] Initial version for the "Location=>Header" step
Haojian Wu [Fri, 11 Nov 2022 09:19:28 +0000 (10:19 +0100)]
[include-cleaner] Initial version for the "Location=>Header" step

This patch implements the initial version of "Location => Header" step:

- define the interface;
- integrate into the existing workflow, and use the PragmaIncludes;

Differential Revision: https://reviews.llvm.org/D137320

20 months ago[AArch64] Add smull sinking extract-and-splat tests and regenerate neon-vmull-high...
David Green [Fri, 11 Nov 2022 08:27:44 +0000 (08:27 +0000)]
[AArch64] Add smull sinking extract-and-splat tests and regenerate neon-vmull-high-p8.ll. NFC

20 months ago[opt] Remove support for using -O[0|1|2|3|s|z] with legacy PM in opt
Bjorn Pettersson [Tue, 8 Nov 2022 20:19:25 +0000 (21:19 +0100)]
[opt] Remove support for using -O[0|1|2|3|s|z] with legacy PM in opt

When running a default pipeline (for a specific O-level) in opt it is
now expected that the new PM should be used. The only reason to use the
legacy PM is when testing a pass that is locked to the legacy PM (or
when testing single passes, for example used by the llc backend).

If a test should run both a default pipeline plus some other passes,
the solution would be to invoke opt twice (separating the default
pipeline execution from the execution of individual passes).

Starting with this patch "opt -O0" etc. will result in an error.

Differential Revision: https://reviews.llvm.org/D137663

20 months ago[mlir] Introduce device mapper attribute for `thread_dim_map` and `mapped to dims`
Guray Ozen [Thu, 10 Nov 2022 16:55:49 +0000 (17:55 +0100)]
[mlir] Introduce device mapper attribute for `thread_dim_map` and `mapped to dims`

`scf.foreach_thread` defines the mapping of its loops to processors via an integer array; see the example below. A lowering can use this mapping. However, expressing the mapping as an integer array is confusing, especially when there are multiple levels of parallelism. In addition, the op does not verify the integer array. This change introduces a device mapping attribute to make the mapping descriptive and verifiable, and makes the GPU transform dialect use it.

```
scf.foreach_thread (%i, %j) in (%c1, %c2) {
scf.foreach_thread (%i2, %j2) in (%c1, %c2)
{...} { thread_dim_mapping = [0, 1]}
} { thread_dim_mapping = [0, 1]}
```

It first introduces a `DeviceMappingInterface`, which is an attribute interface. `scf.foreach_thread` defines its mapping via this interface. A lowering must define its attributes and implement this interface as well. This makes the mapping verifiable.

The change also introduces two new attributes (`#gpu.thread<x/y/z>` and `#gpu.block<x/y/z>`). After this change, the above code prints as shown below, which clarifies the loop mappings. The change also implements consumption of these two new attributes by the transform dialect, which binds the outermost loops to the thread blocks and the innermost loops to threads.

```
scf.foreach_thread (%i, %j) in (%c1, %c2) {
scf.foreach_thread (%i2, %j2) in (%c1, %c2)
{...} { thread_dim_mapping = [#gpu.thread<x>, #gpu.thread<y>]}
} { thread_dim_mapping = [#gpu.block<x>, #gpu.block<y>]}
```

Reviewed By: ftynse, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D137413

20 months ago[clang][Interp] Protect Record creation against infinite recursion
Timm Bäder [Thu, 27 Oct 2022 10:06:44 +0000 (12:06 +0200)]
[clang][Interp] Protect Record creation against infinite recursion

This happens only in error cases, but we need to handle it anyway.

Differential Revision: https://reviews.llvm.org/D136831

20 months ago[clang][Interp] Support alignof()
Timm Bäder [Wed, 2 Nov 2022 10:20:01 +0000 (11:20 +0100)]
[clang][Interp] Support alignof()

Support alignof() and __alignof() expressions.
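
For reference, these are the kinds of constant expressions involved. The snippet is a generic C++ illustration with hypothetical types (`Vec4`, `Wrapper`), not a test from the patch; it only uses the standard `alignof` spelling.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical over-aligned type, used only for illustration.
struct alignas(16) Vec4 {
  float X, Y, Z, W;
};

// A struct is at least as strictly aligned as its most aligned member.
struct Wrapper {
  char Tag;
  Vec4 Value;
};

// alignof is a constant expression, so it can be evaluated at compile time.
static_assert(alignof(char) == 1, "char has alignment 1");
static_assert(alignof(Vec4) == 16, "alignas(16) is honored");
static_assert(alignof(Wrapper) >= alignof(Vec4), "member alignment propagates");

constexpr std::size_t alignOfVec4() { return alignof(Vec4); }
constexpr std::size_t alignOfWrapper() { return alignof(Wrapper); }
```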

Fixes #58816

Differential Revision: https://reviews.llvm.org/D137240

20 months ago[clang][Interp] DerivedToBase casts
Timm Bäder [Mon, 7 Nov 2022 13:19:48 +0000 (14:19 +0100)]
[clang][Interp] DerivedToBase casts

Differential Revision: https://reviews.llvm.org/D137545

20 months agoAdd builtin_elementwise_sin and builtin_elementwise_cos
Joshua Batista [Fri, 11 Nov 2022 06:49:35 +0000 (22:49 -0800)]
Add builtin_elementwise_sin and builtin_elementwise_cos

Add codegen for the LLVM sin and cos elementwise builtins.
The sin and cos elementwise builtins are necessary for HLSL codegen.
Tests were added to make sure that the expected errors are encountered
when these functions are given inputs of incompatible types.
The new builtins are restricted to floating point types only.

Reviewed By: craig.topper, fhahn

Differential Revision: https://reviews.llvm.org/D135011

20 months ago[OpenMP] [OMPD] Testcases for libompd
Vignesh Balasubramanian [Fri, 11 Nov 2022 04:46:59 +0000 (10:16 +0530)]
[OpenMP] [OMPD] Testcases for libompd

This is part of the OMPD patch set that started with the review
https://reviews.llvm.org/D100181

Reviewed By: @jdoerfert, @dreachem

20 months ago[RISCV] Remove unused CHECK lines from test. NFC
Craig Topper [Fri, 11 Nov 2022 06:39:28 +0000 (22:39 -0800)]
[RISCV] Remove unused CHECK lines from test. NFC

These aren't included in the check-prefixes.

20 months ago[LangRef][LoongArch] Update inline asm constraint code and operand modifier
Xiaodong Liu [Fri, 11 Nov 2022 06:24:54 +0000 (14:24 +0800)]
[LangRef][LoongArch] Update inline asm constraint code and operand modifier

According to:
https://reviews.llvm.org/D134157
https://reviews.llvm.org/D136841
https://reviews.llvm.org/D136835

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137528

20 months agoAtomicExpand: Support cmpxchg expansion for small FP types
Matt Arsenault [Thu, 22 Sep 2022 14:51:33 +0000 (10:51 -0400)]
AtomicExpand: Support cmpxchg expansion for small FP types

Handles f16 atomics for AMDGPU.

20 months agoAvoid fallthrough after ffb109b6852d248c9d2e3202477dccf20aac7151
Jordan Rupprecht [Fri, 11 Nov 2022 06:05:09 +0000 (22:05 -0800)]
Avoid fallthrough after ffb109b6852d248c9d2e3202477dccf20aac7151

Fallthrough appears to be not intended here, as otherwise this is a completely dead store: `DOPRegIsUnique` will be overwritten by the next case.
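
The pattern can be sketched with a hypothetical switch (the `Kind` enum and `isUnique` function below are invented for illustration and are not the code touched by this commit). Without the marked `break`, control would fall into the next case and the first store would be overwritten, which is what makes it a dead store.

```cpp
#include <cassert>

// Hypothetical instruction kinds, for illustration only.
enum class Kind { Unary, Binary, Other };

bool isUnique(Kind K) {
  bool Unique = false;
  switch (K) {
  case Kind::Unary:
    Unique = true;
    break; // the fix: without this, the store above is dead because the
           // next case overwrites it
  case Kind::Binary:
    Unique = false;
    break;
  case Kind::Other:
    break;
  }
  return Unique;
}
```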

20 months ago[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally
Fangrui Song [Fri, 11 Nov 2022 05:54:43 +0000 (21:54 -0800)]
[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally

For a local linkage GlobalObject in a non-prevailing COMDAT, it remains defined while its
leader has been made available_externally. This violates the COMDAT rule that
its members must be retained or discarded as a unit.

To fix this, update the regular LTO change D34803 to track local linkage
GlobalValues, and port the code to ThinLTO (GlobalAliases are not handled.)

This fixes two problems.

(a) `__cxx_global_var_init` in a non-prevailing COMDAT group used to
linger around (unreferenced, hence benign), and is now correctly discarded.
```
int foo();
inline int v = foo();
```

(b) Fix https://github.com/llvm/llvm-project/issues/58215:
as a size optimization, we place private `__profd_` in a COMDAT with a
`__profc_` key. When FuncImport.cpp makes `__profc_` available_externally due to
a non-prevailing COMDAT, `__profd_` incorrectly remains private. This change
makes the `__profd_` available_externally.

```
cat > c.h <<'eof'
extern void bar();
inline __attribute__((noinline)) void foo() {}
eof
cat > m1.cc <<'eof'
#include "c.h"
int main() {
  bar();
  foo();
}
eof
cat > m2.cc <<'eof'
#include "c.h"
__attribute__((noinline)) void bar() {
  foo();
}
eof

clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto -fuse-ld=lld -o t_gen
rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw

clang -O2 -fprofile-generate=./t m1.cc m2.cc -flto=thin -fuse-ld=lld -o t_gen
rm -fr t && ./t_gen && llvm-profdata show -function=foo t/default_*.profraw
```

If a GlobalAlias references a GlobalValue which is just changed to
available_externally, change the GlobalAlias as well (e.g. C5/D5 comdats due to
cc1 -mconstructor-aliases). The GlobalAlias may be referenced by other
available_externally functions, so it cannot easily be removed.

Depends on D137441: we use available_externally to mark a GlobalAlias in a
non-prevailing COMDAT, similar to how we handle GlobalVariable/Function.
GlobalAlias may refer to a ConstantExpr; not changing GlobalAlias to
GlobalVariable gives flexibility for future extensions (the use case is niche,
so for simplicity we don't handle it yet). In addition, available_externally
GlobalAlias is the most straightforward implementation and retains the aliasee
information to help optimizers.

See windows-vftable.ll: Windows vftable uses an alias pointing to a
private constant where the alias is the COMDAT leader. The COMDAT use case
is questionable and ThinLTO does not discard the alias in the non-prevailing COMDAT.
This patch retains the behavior.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D135427

20 months ago[RISCV] Use OPCFG format record for vsetvli in tablegen. NFC
Craig Topper [Fri, 11 Nov 2022 02:00:35 +0000 (18:00 -0800)]
[RISCV] Use OPCFG format record for vsetvli in tablegen. NFC

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D137808

20 months ago[RISCV] Add OPCFG format of vector. NFC
Craig Topper [Fri, 11 Nov 2022 01:59:47 +0000 (17:59 -0800)]
[RISCV] Add OPCFG format of vector. NFC

Refer to https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#101-vector-arithmetic-instruction-encoding

Patch by Jiejie Rong

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D137694

20 months agoAMDGPU: Use generic is.fpclass enum instead of locally defined copy
Matt Arsenault [Thu, 10 Nov 2022 23:38:38 +0000 (15:38 -0800)]
AMDGPU: Use generic is.fpclass enum instead of locally defined copy

The generic intrinsic uses the same bit layout as the amdgcn intrinsic,
so re-use the enum.

20 months ago[lldb/test] Fix app_specific_backtrace_crashlog.test (NFC)
Med Ismail Bennani [Fri, 11 Nov 2022 02:28:53 +0000 (18:28 -0800)]
[lldb/test] Fix app_specific_backtrace_crashlog.test (NFC)

This patch fixes app_specific_backtrace_crashlog.test.

It was failing because one of the loaded images was built with
optimization which added a new warning message between the first
`CHECK` and the `CHECK-NEXT`, breaking the expected ordering.

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
20 months agoRevert "[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_exter...
Alan Zhao [Fri, 11 Nov 2022 01:48:18 +0000 (17:48 -0800)]
Revert "[LTO] Make local linkage GlobalValue in non-prevailing COMDAT available_externally"

This reverts commit 89ddcff1d2d6e9f4de78f3a563a8b1987bf7ea8f.

Reason: This breaks bootstrapping builds of LLVM on Windows using
ThinLTO; see https://crbug.com/1382839

20 months ago[Clang][LoongArch] Implement __builtin_loongarch_crc_w_d_w builtin and add diagnostics
gonglingqin [Thu, 10 Nov 2022 12:06:17 +0000 (20:06 +0800)]
[Clang][LoongArch] Implement __builtin_loongarch_crc_w_d_w builtin and add diagnostics

This patch adds support to prevent __builtin_loongarch_crc_w_d_w from compiling
on loongarch32 in the front end and adds diagnostics accordingly.

Reference: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/loongarch/larchintrin.h#L175-L184

Depends on D136906

Differential Revision: https://reviews.llvm.org/D137316

20 months ago[AArch64][SVE] Support logical operation BIC with DestructiveBinary patterns
zhongyunde [Fri, 11 Nov 2022 01:10:14 +0000 (09:10 +0800)]
[AArch64][SVE] Support logical operation BIC with DestructiveBinary patterns

Logical operation BIC with DestructiveBinary patterns was temporarily removed as
it caused an assert (commit 3c382ed71f15), so try to fix that.
The most significant issue is that pseudo instructions that do not have real instructions (including movprfx'd ones) covering all combinations of register allocation will have a broken expansion. This is the main reason the zeroing is an experimental feature: it has known bugs.
So we add an extra LSL when expanding BIC_ZPZZ_ZERO A, P, A, A with movprfx, when necessary:
  movprfx z0.s, p0/z, z0.s
  lsl z0.b, p0/m, z0.b, #0
  bic z0.s, p0/m, z0.s, z0.s

Depends on D88595

20 months agoAdd missing changes for "[Clang][LoongArch] Handle -march/-m{single,double,soft}...
Weining Lu [Fri, 11 Nov 2022 00:58:12 +0000 (08:58 +0800)]
Add missing changes for "[Clang][LoongArch] Handle -march/-m{single,double,soft}-float/-mfpu options"

Some changes in D136146 were lost by an accidental submit, so recover
them.

20 months ago[mlir][vector] Add insertOp src shape check for BubbleUpBitCastForStridedSliceInsert
stanley-nod [Fri, 11 Nov 2022 00:41:59 +0000 (16:41 -0800)]
[mlir][vector] Add insertOp src shape check for BubbleUpBitCastForStridedSliceInsert

Not all vector shapes can be cast to other types, so we add a check
to not fold insertOp into bitcast if the shape does not support it.

Examples of unsupported shape casts are f16 vectors to f32 if the
shape is not a multiple of 2, or int8 to int32 if the shape is not a
multiple of 4.
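
The divisibility rule behind these examples can be sketched as follows. This is an illustration of the arithmetic only, assuming the constraint applies to the innermost dimension; `can_bitcast_innermost` is a hypothetical helper and the actual check in the pattern may be written differently.

```python
def can_bitcast_innermost(shape, src_bits, dst_bits):
    """Return True if the innermost dimension can be reinterpreted.

    A vector with `shape` and `src_bits`-wide elements can only be bitcast
    to `dst_bits`-wide elements if the total number of bits in the
    innermost dimension is a whole number of destination elements.
    """
    if not shape or src_bits <= 0 or dst_bits <= 0:
        return False
    innermost_bits = shape[-1] * src_bits
    return innermost_bits % dst_bits == 0


# f16 -> f32 needs an even innermost dimension.
f16_ok = can_bitcast_innermost([4, 8], 16, 32)
f16_bad = can_bitcast_innermost([4, 3], 16, 32)

# int8 -> int32 needs an innermost dimension that is a multiple of 4.
i8_ok = can_bitcast_innermost([8], 8, 32)
i8_bad = can_bitcast_innermost([6], 8, 32)
```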

Reviewed By: antiagainst, ThomasRaoux

Differential Revision: https://reviews.llvm.org/D137802

20 months ago[libclang] Expose completion result kind in `CXCompletionResult`
Egor Zhdan [Mon, 31 Oct 2022 22:46:43 +0000 (15:46 -0700)]
[libclang] Expose completion result kind in `CXCompletionResult`

This allows clients of libclang to check whether a completion result is a keyword. Previously, keywords had `CursorKind == CXCursor_NotImplemented` and it wasn't trivial to distinguish a keyword from a pattern.

This change moves `CodeCompletionResult::ResultKind` to `clang-c` under a new name `CXCompletionResultKind`. It also tweaks `c-index-test` to print the result kind instead of `NotImplemented`, and adjusts the tests for the new output.

rdar://91852088

Differential Revision: https://reviews.llvm.org/D136844

20 months agoCheck m_dyld_up directly in LoadBinariesViaMetadata
Jason Molenda [Thu, 10 Nov 2022 23:46:32 +0000 (15:46 -0800)]
Check m_dyld_up directly in LoadBinariesViaMetadata

In the restructuring I did in https://reviews.llvm.org/D133680 , I
call ObjectFile::LoadBinariesViaMetadata, and the process m_dyld
may be set by a method under there -- in
ProcessMachCore::LoadBinariesViaMetadata I wanted to check to see
if m_dyld_up had been set.  I did this by calling the GetDynamicLoader()
method, but that method will call FindPlugin() if there is no
dynamic loader yet, and the static dynamic loader plugin was being
loaded, preventing the scan for userland binaries in a userland
corefile.

Differential Revision: https://reviews.llvm.org/D137807
rdar://102210820

20 months agoApply clang-tidy fixes for readability-identifier-naming in TosaOps.cpp (NFC)
Mehdi Amini [Thu, 3 Nov 2022 20:44:53 +0000 (20:44 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in TosaOps.cpp (NFC)

20 months agoApply clang-tidy fixes for performance-unnecessary-value-param in SparseTensorDialect...
Mehdi Amini [Thu, 3 Nov 2022 20:33:56 +0000 (20:33 +0000)]
Apply clang-tidy fixes for performance-unnecessary-value-param in SparseTensorDialect.cpp (NFC)

20 months ago[mlir][sparse] Fix a test to check all output coordinates.
bixia1 [Thu, 10 Nov 2022 23:01:28 +0000 (15:01 -0800)]
[mlir][sparse] Fix a test to check all output coordinates.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D137805

20 months agoApparently I moved the wrong one to "2", then Jason moved the right
Jim Ingham [Thu, 10 Nov 2022 23:23:51 +0000 (15:23 -0800)]
Apparently I moved the wrong one to "2", then Jason moved the right
one, so this commit moves the wrong one back to no-"2"...

20 months ago[SelectDagISEL] refactor HandlePHINodesInSuccessorBlocks NFC.
Nick Desaulniers [Thu, 10 Nov 2022 22:26:47 +0000 (14:26 -0800)]
[SelectDagISEL] refactor HandlePHINodesInSuccessorBlocks NFC.

While working on this code to support outputs from callbr along indirect
branches, I kept making these changes again and again. Precommit these.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137445

20 months ago[lld-macho] Set 4-byte alignment for `__init_offsets`
Daniel Bertalan [Thu, 10 Nov 2022 21:42:19 +0000 (22:42 +0100)]
[lld-macho] Set 4-byte alignment for `__init_offsets`

dyld refuses to run initializers if this section is unaligned.
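
As an illustration of what the alignment requirement means for a section's placement, here is a small sketch that rounds an offset up to a power-of-two boundary. The `align_up` helper is hypothetical and is not taken from lld.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


def is_aligned(offset: int, alignment: int) -> bool:
    """True if offset already sits on the requested boundary."""
    return align_up(offset, alignment) == offset
```

For example, an offset of `0x1001` is not 4-byte aligned, and `align_up(0x1001, 4)` moves it to `0x1004`.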

Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1383240

Differential Revision: https://reviews.llvm.org/D137803

20 months ago[mlir][sparse] Fix a bug in rewriting for the convert op.
bixia1 [Thu, 10 Nov 2022 21:07:57 +0000 (13:07 -0800)]
[mlir][sparse] Fix a bug in rewriting for the convert op.

The code to retrieve the number of entries isn't correct.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D137795

20 months ago[CodeGen][Test] simplify callbr-asm-outputs.ll with nounwind NFC
Nick Desaulniers [Thu, 10 Nov 2022 22:22:55 +0000 (14:22 -0800)]
[CodeGen][Test] simplify callbr-asm-outputs.ll with nounwind NFC

The CFI directives add noise to the test. Remove them via nounwind fn
attrs. Also remove clobbers.

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D137596

20 months ago[Flang] Allow registering plugin extensions with the pass builder
Usman Nadeem [Thu, 10 Nov 2022 22:09:51 +0000 (14:09 -0800)]
[Flang] Allow registering plugin extensions with the pass builder

Pass plugins are compiled and linked dynamically by default. Setting
`LLVM_${NAME}_LINK_INTO_TOOLS` to `ON` turns the project into a
statically linked extension. Projects like Polly can be used this way by
adding `-DLLVM_POLLY_LINK_INTO_TOOLS=ON` to the `cmake` command.

The changes in this patch make the PassBuilder in Flang aware of
statically linked pass plugins, see the documentation for more details:
https://github.com/llvm/llvm-project/blob/main/llvm/docs/WritingAnLLVMNewPMPass.rst#id21

Differential Revision: https://reviews.llvm.org/D137673

Change-Id: Id1aa501dcb4821d0ec779f375cc8e8d6b0b92fce

20 months ago[InstSimplify] fold X +nnan Inf
Sanjay Patel [Thu, 10 Nov 2022 22:10:46 +0000 (17:10 -0500)]
[InstSimplify] fold X +nnan Inf

If we exclude NaN (and therefore the opposite Inf),
anything plus Inf is Inf:
https://alive2.llvm.org/ce/z/og3dj9
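
The same reasoning can be checked informally with ordinary IEEE-754 doubles. This is only a sanity check over a few sample values, not a proof like the Alive2 link above; the helper name and the sample list are made up for the illustration.

```python
import math

INF = math.inf


def allowed_by_nnan(x: float) -> bool:
    """Inputs for which neither the operand nor x + INF is NaN."""
    return not math.isnan(x) and not math.isnan(x + INF)


samples = [0.0, -0.0, 1.5, -1e308, 5e-324, INF, -INF, math.nan]

# -INF is dropped because -INF + INF is NaN, and NaN is dropped directly.
allowed = [x for x in samples if allowed_by_nnan(x)]

# Every remaining input sums to +INF, which is what the fold relies on.
results = [x + INF for x in allowed]
```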

20 months ago[InstSimplify] add tests for fadd/fsub with inf constant operand; NFC
Sanjay Patel [Thu, 10 Nov 2022 21:47:36 +0000 (16:47 -0500)]
[InstSimplify] add tests for fadd/fsub with inf constant operand; NFC

20 months ago[lldb][test] TestConstStaticIntegralMember.py: fix for clang-{9,11,13}
Michael Buch [Thu, 10 Nov 2022 18:46:02 +0000 (10:46 -0800)]
[lldb][test] TestConstStaticIntegralMember.py: fix for clang-{9,11,13}

**Summary**

The public lldb matrix bot is failing for tests compiled with clang-9, clang-11, clang-13.

This patch addresses these failures by evaluating the enum case that
doesn't cause malformed DWARF in older versions of clang.

There was no particular reason we had to use the `true` enum case
to reproduce the bug in #58383, so simply switch to using `false`
to get all bots passing again.

**Details**

In older versions of clang, the following snippet:
```
enum EnumBool : bool {
  enum_bool_case1 = false,
  enum_bool_case2 = true,
};

struct A {
  const static EnumBool enum_bool_val = enum_bool_case2;
};
```

…results in the following DWARF:
```
0x00000052:   DW_TAG_structure_type
                DW_AT_calling_convention        (DW_CC_pass_by_value)
                DW_AT_name      ("A")
                DW_AT_byte_size (0x01)
                DW_AT_decl_file ("/Users/michaelbuch/Git/llvm-project/lldb/test/API/lang/cpp/const_static_integral_member/repro.cpp")
                DW_AT_decl_line (6)

0x0000005b:     DW_TAG_member
                  DW_AT_name    ("enum_bool_val")
                  DW_AT_type    (0x0000000000000068 "const EnumBool")
                  DW_AT_decl_file       ("/Users/michaelbuch/Git/llvm-project/lldb/test/API/lang/cpp/const_static_integral_member/repro.cpp")
                  DW_AT_decl_line       (7)
                  DW_AT_external        (true)
                  DW_AT_declaration     (true)
                  DW_AT_const_value     (-1)

```

Note the `DW_AT_const_value == -1`

When evaluating `A::enum_bool_val` in lldb we get:
```
(lldb) p A::enum_bool_val
error: expression failed to parse:
error: Couldn't lookup symbols:
  __ZN1A13enum_bool_valE
```

Enabling the DWARF logs we see:

```
(arm64) clang-13.out: DWARFASTParserClang::ParseTypeFromDWARF (die = 0x00000068, decl_ctx = 0x136ac1e30 (die 0x0000000b)) DW_TAG_const_type name = '(null)')
Failed to add const value to variable A::enum_bool_val: Can't store unsigned value 18446744073709551615 in integer with 1 bits.
```

This occurs because a boolean enum is considered an unsigned integer
type, but we try to initialize it with a `-1`.
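
The number in the log message follows directly from reinterpreting `-1` as a 64-bit unsigned value. A minimal sketch of that arithmetic (the helper names are illustrative and this is not lldb's code):

```python
def as_unsigned(value: int, width: int = 64) -> int:
    """Reinterpret a (possibly negative) integer as an unsigned value of
    the given bit width, as two's complement storage would."""
    return value & ((1 << width) - 1)


def fits_unsigned(value: int, bits: int) -> bool:
    """True if a non-negative value is representable in `bits` bits."""
    return 0 <= value < (1 << bits)


raw = as_unsigned(-1)        # 18446744073709551615, as in the log above
ok = fits_unsigned(raw, 1)   # False: a 1-bit integer holds only 0 or 1
```

A constant of `0`, which is what the `false` enum case produces, fits in one bit, which is why switching the test to that case avoids the failure.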

**Testing**

- Confirmed locally that top-of-tree lldb correctly
  evaluates the previously failing expression when
  the test program is compiled with clang-13

Differential Revision: https://reviews.llvm.org/D137793

20 months ago[clang-format][NFC] More sorting in getLLVMStyle()
Björn Schäpers [Thu, 10 Nov 2022 21:36:25 +0000 (22:36 +0100)]
[clang-format][NFC] More sorting in getLLVMStyle()

Seems I've missed that.

Amends 41a09a07ce4ddd1e97ce0430d1debe1dcc853890

20 months ago[clang-format] Add BreakBeforeInlineASMColon configuration
Anastasiia Lukianenko [Thu, 10 Nov 2022 21:28:15 +0000 (22:28 +0100)]
[clang-format] Add BreakBeforeInlineASMColon configuration

If true, colons in ASM parameters will be placed after line breaks.

true:
asm volatile("string",
                     :
                     : val);

false:
asm volatile("string", : : val);

Differential Revision: https://reviews.llvm.org/D91950

20 months agocmake: Inline the add_llvm_symbol_exports.py script
Tom Stellard [Thu, 10 Nov 2022 21:18:44 +0000 (13:18 -0800)]
cmake: Inline the add_llvm_symbol_exports.py script

This fixes stand-alone builds.

Reviewed By: andrewng

Differential Revision: https://reviews.llvm.org/D137611

20 months agodocs: Add instructions for stand-alone builds of clang
Tom Stellard [Thu, 10 Nov 2022 20:18:49 +0000 (12:18 -0800)]
docs: Add instructions for stand-alone builds of clang

More sub-projects will be added to the table once they have been verified
to be buildable in stand-alone mode.

Reviewed By: MaskRay, mgorny

Differential Revision: https://reviews.llvm.org/D123968

20 months ago[InstCombine] PR58901 - fix bug with swapping GEP of different types
William Huang [Thu, 10 Nov 2022 00:34:07 +0000 (00:34 +0000)]
[InstCombine] PR58901 - fix bug with swapping GEP of different types

Fix https://github.com/llvm/llvm-project/issues/58901 by adding a stricter check for whether a non-opaque GEP can be swapped. This will not affect the GEP swapping optimization in the future since we are switching to opaque GEP.

Reviewed By: clin1

Differential Revision: https://reviews.llvm.org/D137752

20 months ago[release] Add third-party tarball to release for standalone builds
Konrad Kleine [Thu, 10 Nov 2022 11:11:33 +0000 (12:11 +0100)]
[release] Add third-party tarball to release for standalone builds

With the advent of https://reviews.llvm.org/D131919 and
https://github.com/llvm/llvm-project/commit/a11cd0d94ed3cabf0998a0289aead05da94c86eb
the third-party directory is required to build LLVM and other packages, and in standalone
builds the third-party directory is not available from the llvm tarball anymore.

Differential Revision: https://reviews.llvm.org/D137777

20 months agoUpdated contact email address.
Anastasia Stulova [Thu, 10 Nov 2022 19:50:19 +0000 (19:50 +0000)]
Updated contact email address.

20 months ago[libc++] Documents details of the pre-commit CI.
Mark de Wever [Thu, 4 Aug 2022 16:31:03 +0000 (18:31 +0200)]
[libc++] Documents details of the pre-commit CI.

This documentation aims to make it clear how the libc++ pre-commit CI
works, both for libc++ developers and for other LLVM projects whose
changes can affect libc++.

This was discussed with @aaron.ballman as a follow-up to some confusion
in the Clang community about how the libc++ pre-commit CI works.

Note some parts depend on patches under review as commented in the
documentation.

Reviewed By: ldionne, #libc, philnik

Differential Revision: https://reviews.llvm.org/D133249

20 months ago[VectorCombine] widen a load with subvector insert
Sanjay Patel [Thu, 10 Nov 2022 19:09:57 +0000 (14:09 -0500)]
[VectorCombine] widen a load with subvector insert

This adapts/copies code from the existing fold that allows
widening of load scalar+insert. It can help in IR because
it removes a shuffle, and the backend can already narrow
loads if that is profitable in codegen.

We might be able to consolidate more of the logic, but
handling this basic pattern should be enough to make a small
difference on one of the motivating examples from issue #17113.
The final goal of combining loads on those patterns is not
solved though.

Differential Revision: https://reviews.llvm.org/D137341

20 months ago[SystemZ] add test for mergeTruncStores miscompile; NFC
Sanjay Patel [Thu, 10 Nov 2022 17:08:25 +0000 (12:08 -0500)]
[SystemZ] add test for mergeTruncStores miscompile; NFC

This is based on the example in issue #58883. I'm not sure
if the output currently shows the potential miscompile,
so we may want to adjust the test in a follow-up.

20 months agoAArch64/GlobalISel: Regenerate some test checks to include -NEXT
Matt Arsenault [Thu, 10 Nov 2022 17:09:25 +0000 (09:09 -0800)]
AArch64/GlobalISel: Regenerate some test checks to include -NEXT

20 months ago[SLP]Redesign vectorization of the gather nodes.
Alexey Bataev [Fri, 16 Sep 2022 20:57:04 +0000 (13:57 -0700)]
[SLP]Redesign vectorization of the gather nodes.

Gather nodes are vectorized simply as a vector of the scalars instead of
relying on the actual node. As a result, in some cases
we may miss an incorrect transformation (a non-matching set of scalars
just ends up as a gather node instead of a possible vector/gather node).
It is better to rely on the actual nodes; this improves stability and
allows better detection of missed cases.

Differential Revision: https://reviews.llvm.org/D135174

20 months ago[OpenCL] Fix diagnostics with templates in kernel args.
Anastasia Stulova [Thu, 10 Nov 2022 15:20:34 +0000 (15:20 +0000)]
[OpenCL] Fix diagnostics with templates in kernel args.

Improve checking for the standard layout type when diagnosing
the kernel argument with templated types. The check doesn't work
correctly for references or pointers due to the lazy template
instantiation.

Current fix only improves cases where nested types in the templates
do not depend on the template parameters.

Differential Revision: https://reviews.llvm.org/D134445

20 months ago[lldb] Make callback-based formatter matching available from the CLI.
Jorge Gorbe Moya [Thu, 10 Nov 2022 18:25:04 +0000 (10:25 -0800)]
[lldb] Make callback-based formatter matching available from the CLI.

This change adds a `--recognizer-function` (`-R`) to `type summary add`
and `type synth add` that allows users to specify that the names in
the command are not type names but python function names.

It also adds an example to lldb/examples, and a section in the data
formatters documentation on how to use recognizer functions.
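
A minimal sketch of what such a recognizer function might look like. It assumes the callback receives the candidate type as an `SBType` plus the usual `internal_dict` and returns a bool; the base-class name and the function name are hypothetical.

```python
# Hypothetical base class whose direct subclasses should be matched.
TARGET_BASE = "GeneratedObjectBase"


def is_generated_object(sbtype, internal_dict):
    """Recognizer: True if the type directly derives from TARGET_BASE.

    Only SBType.GetNumberOfDirectBaseClasses() and
    SBType.GetDirectBaseClassAtIndex() are used, so the function can also
    be exercised outside lldb with simple stand-in objects.
    """
    for i in range(sbtype.GetNumberOfDirectBaseClasses()):
        if sbtype.GetDirectBaseClassAtIndex(i).GetName() == TARGET_BASE:
            return True
    return False
```

After importing the module into lldb, the function name would be given where a type name normally goes, along the lines of `type summary add --python-function mymodule.my_summary --recognizer-function mymodule.is_generated_object`, where `mymodule` and `my_summary` are placeholder names.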

Differential Revision: https://reviews.llvm.org/D137000

20 months agoadd LoongArchTargetParser.def to LLVM_Utils module
Jason Molenda [Thu, 10 Nov 2022 18:21:29 +0000 (10:21 -0800)]
add LoongArchTargetParser.def to LLVM_Utils module

Weining Lu's change from https://reviews.llvm.org/D136146
fails to build with -DLLVM_ENABLE_MODULES=1 cmake builds
like the LLDB Incremental CI bot on greendragon; this entry
is sufficient to unblock that style of build, it seems.

20 months ago[SLP][NFC]Add a test for vectorization with scheduling blocks order
Alexey Bataev [Thu, 10 Nov 2022 18:12:51 +0000 (10:12 -0800)]
[SLP][NFC]Add a test for vectorization with scheduling blocks order
different than the instruction order, NFC.