platform/upstream/llvm.git
17 months ago[libc++] Improve binary size when using __transaction
Nikolas Klauser [Thu, 8 Dec 2022 08:40:54 +0000 (09:40 +0100)]
[libc++] Improve binary size when using __transaction

__exception_guard is a no-op in -fno-exceptions mode to produce better code-gen. This means that we don't provide the strong exception guarantees. However, Clang doesn't generate cleanup code with exceptions disabled, so even if we wanted to provide the strong exception guarantees we couldn't. This is also only relevant for constructs with a stack of -fexceptions > -fno-exceptions > -fexceptions code, since the exception can't be caught where exceptions are disabled. While -fexceptions > -fno-exceptions is quite common (e.g. libc++.dylib > -fno-exceptions), having another layer with exceptions enabled seems a lot less common, especially one that tries to catch an exception through -fno-exceptions code.
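
As a rough illustration of the idea (not libc++'s actual `__exception_guard`), the sketch below shows a guard that runs a rollback unless it is marked complete, plus a no-op variant selected when exceptions are disabled. All names here (`ExceptionGuardSketch`, `NoOpGuardSketch`, `GuardSketch`, `rollbacksAfter`) are hypothetical and chosen only for this example.

```cpp
#include <utility>

// Hypothetical sketch: runs the rollback in the destructor unless
// complete() was called first.
template <class Rollback>
class ExceptionGuardSketch {
public:
  explicit ExceptionGuardSketch(Rollback R) : Undo(std::move(R)) {}
  ExceptionGuardSketch(const ExceptionGuardSketch &) = delete;
  ExceptionGuardSketch &operator=(const ExceptionGuardSketch &) = delete;

  // Mark the guarded operation as having succeeded.
  void complete() noexcept { Completed = true; }

  ~ExceptionGuardSketch() {
    if (!Completed)
      Undo();
  }

private:
  Rollback Undo;
  bool Completed = false;
};

// Hypothetical no-op variant: nothing is stored and nothing runs,
// so the compiler has no cleanup code to emit.
template <class Rollback>
class NoOpGuardSketch {
public:
  explicit NoOpGuardSketch(Rollback) {}
  void complete() noexcept {}
};

// Pick the variant based on whether exceptions are enabled.
#if defined(__cpp_exceptions) || defined(_CPPUNWIND)
template <class Rollback> using GuardSketch = ExceptionGuardSketch<Rollback>;
#else
template <class Rollback> using GuardSketch = NoOpGuardSketch<Rollback>;
#endif

// Helper for demonstration: counts how often the rollback ran.
inline int rollbacksAfter(bool Succeeded) {
  int Rollbacks = 0;
  {
    auto Undo = [&Rollbacks] { ++Rollbacks; };
    ExceptionGuardSketch<decltype(Undo)> Guard(Undo);
    if (Succeeded)
      Guard.complete();
  }
  return Rollbacks;
}
```

With the real guard, `rollbacksAfter(false)` runs the rollback once and `rollbacksAfter(true)` does not run it; the no-op variant never runs it, which is the trade-off described above.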

Fixes https://github.com/llvm/llvm-project/issues/56783

Reviewed By: ldionne, Mordante, huixie90, #libc

Spies: EricWF, alexfh, hans, joanahalili, libcxx-commits

Differential Revision: https://reviews.llvm.org/D133661

17 months ago[DAGCombine]Expand usage of CreateBuildVecShuffle to make full use of vector ops
Wang, Xin10 [Mon, 23 Jan 2023 02:37:26 +0000 (10:37 +0800)]
[DAGCombine]Expand usage of CreateBuildVecShuffle to make full use of vector ops

Currently, when llc encounters a BUILD_VECTOR fed by many
extract_vector_elt nodes, it replaces them with a vector_shuffle to
reduce code size. This is done in createBuildVecShuffle in
DAGCombiner.cpp, but that code cannot handle the case where the source
vector register is more than twice the size of the destination.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D139685

17 months ago[Support] Use llvm::byteswap in SwapByteOrder.h (NFC)
Kazu Hirata [Mon, 23 Jan 2023 03:14:33 +0000 (19:14 -0800)]
[Support] Use llvm::byteswap in SwapByteOrder.h (NFC)

This patch defines ByteSwap_{32,64} and getSwappedBytes with
llvm::byteswap.

It's tempting to define something like:

  template <typename T,
            typename = std::enable_if_t<std::is_integral_v<T>>>
  inline T getSwappedBytes(T C) { return llvm::byteswap(C); }

But this doesn't work.  The host compiler would issue:

  error: call to 'getSwappedBytes' is ambiguous

while compiling lldb/source/Utility/UUID.cpp.

17 months ago[OpenMP][FIX] Adjust enum size to avoid assertion after D142320
Johannes Doerfert [Mon, 23 Jan 2023 02:30:46 +0000 (18:30 -0800)]
[OpenMP][FIX] Adjust enum size to avoid assertion after D142320

17 months ago[HIP] Change default offload arch to gfx906
Yaxun (Sam) Liu [Fri, 20 Jan 2023 19:42:58 +0000 (14:42 -0500)]
[HIP] Change default offload arch to gfx906

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D142246

17 months agoARM: Add baseline test for fneg + fcmp + select combine
Matt Arsenault [Thu, 15 Dec 2022 18:33:16 +0000 (13:33 -0500)]
ARM: Add baseline test for fneg + fcmp + select combine

17 months ago[MC] Replace single-case switch with an if (NFC)
Sergei Barannikov [Mon, 23 Jan 2023 00:53:22 +0000 (03:53 +0300)]
[MC] Replace single-case switch with an if (NFC)

Same as e5f746e9 but for MasmParser.

17 months ago[AVR] Emit 'eicall' for devices with large program memory
Ben Shi [Sun, 22 Jan 2023 05:47:57 +0000 (13:47 +0800)]
[AVR] Emit 'eicall' for devices with large program memory

Fixes https://github.com/llvm/llvm-project/issues/58856

Reviewed By: aykevl

Differential Revision: https://reviews.llvm.org/D142298

17 months ago[OpenMP] Merge barrier elimination into AAExecutionDomain
Johannes Doerfert [Wed, 14 Dec 2022 23:08:35 +0000 (15:08 -0800)]
[OpenMP] Merge barrier elimination into AAExecutionDomain

With this patch we track aligned barriers in AAExecutionDomain and also
delete unnecessary barriers there. This allows us to eliminate barriers
across blocks, across functions, and in the presence of complex accesses
that do not force a barrier. Further, we can use the collected
information to enable store-load forwarding in a threaded environment
(follow up patch).

Differential Revision: https://reviews.llvm.org/D140463

17 months ago[MC] Replace a switch with two 'if's (NFC)
Sergei Barannikov [Mon, 23 Jan 2023 00:10:02 +0000 (03:10 +0300)]
[MC] Replace a switch with two 'if's (NFC)

This simplifies logic a bit and helps to reduce the future diff.

17 months ago[OpenMP][DeviceRTL][NFC] Use `OMPTgtExecModeFlags` from `llvm/include/llvm/Frontend...
Shilei Tian [Mon, 23 Jan 2023 00:10:46 +0000 (19:10 -0500)]
[OpenMP][DeviceRTL][NFC] Use `OMPTgtExecModeFlags` from `llvm/include/llvm/Frontend/OpenMP/OMPDeviceConstants.h`

This patch makes preparation for a series that will enable per-kernel information
used in both host and device runtime. Some variables/enums, such as `OMPTgtExecModeFlags`,
have to be shared by both of them. A new header `OMPDeviceConstants.h` is added,
containing code that will be shared by them. We will introduce more variables soon.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142320

17 months ago[OpenMP] Guarding restrictions are required only for guarding
Johannes Doerfert [Fri, 23 Dec 2022 02:21:00 +0000 (18:21 -0800)]
[OpenMP] Guarding restrictions are required only for guarding

If we do not guard code during SPMDzation, we do not need to check
conditions for successful guarding. That is, even if some code is
executed in different modes, it does not prevent SPMDzation if there is
no guarded code in there.

17 months ago[OpenMP][FIX] Properly update ParallelLevels tracker
Johannes Doerfert [Sun, 22 Jan 2023 23:44:30 +0000 (15:44 -0800)]
[OpenMP][FIX] Properly update ParallelLevels tracker

17 months ago[OpenMP][FIX] Use thread id not team id for masked section
Johannes Doerfert [Sat, 14 Jan 2023 03:10:46 +0000 (19:10 -0800)]
[OpenMP][FIX] Use thread id not team id for masked section

17 months ago[Support] Use llvm::bit_floor in PowerOf2Floor (NFC)
Kazu Hirata [Sun, 22 Jan 2023 22:34:43 +0000 (14:34 -0800)]
[Support] Use llvm::bit_floor in PowerOf2Floor (NFC)

17 months ago[llvm] Use llvm::bit_ceil (NFC)
Kazu Hirata [Sun, 22 Jan 2023 22:05:14 +0000 (14:05 -0800)]
[llvm] Use llvm::bit_ceil (NFC)

In both of these cases, the arguments to Log2_32_Ceil are known to be
nonzero.

17 months ago[llvm] Use llvm::bit_floor (NFC)
Kazu Hirata [Sun, 22 Jan 2023 21:41:23 +0000 (13:41 -0800)]
[llvm] Use llvm::bit_floor (NFC)

In all these cases, the arguments to Log2_32 are known to be nonzero,
so we don't have to worry about "1 << -1".
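
A small sketch of why the nonzero precondition matters, using hypothetical stand-ins (`log2FloorNonZero`, `bitFloorSketch`, `agreeOnNonZero`) rather than the real `Log2_32` and `llvm::bit_floor`: for nonzero inputs, shifting 1 by the floor of the base-2 logarithm gives the same result as taking the largest power of two not exceeding the value.

```cpp
#include <cstdint>

// Hypothetical stand-in for a floor(log2) helper; only meaningful for V != 0.
constexpr unsigned log2FloorNonZero(std::uint32_t V) {
  unsigned L = 0;
  while (V >>= 1)
    ++L;
  return L;
}

// Hypothetical stand-in for a bit_floor-style helper: the largest power of
// two that is <= V, or 0 when V is 0.
constexpr std::uint32_t bitFloorSketch(std::uint32_t V) {
  if (V == 0)
    return 0;
  std::uint32_t R = 1;
  while (V >>= 1)
    R <<= 1;
  return R;
}

// The two formulations agree whenever V is nonzero.
inline bool agreeOnNonZero(std::uint32_t V) {
  return V != 0 &&
         (std::uint32_t{1} << log2FloorNonZero(V)) == bitFloorSketch(V);
}
```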

17 months ago[compiler-rt][builtins] Skip building (b)float16 support on i386-freebsd
Dimitry Andric [Sun, 16 Oct 2022 18:24:42 +0000 (20:24 +0200)]
[compiler-rt][builtins] Skip building (b)float16 support on i386-freebsd

Since bfloat16 and float16 support is not available for i386-freebsd,
the `truncdfbf2.c` and `truncsfbf2.c` builtin sources should be skipped
when targeting that platform, and `COMPILER_RT_HAS_FLOAT16` should not
be defined.

However, the CMake configuration stage runs its tests with the default
target, which normally is amd64-freebsd, so it will detect both bfloat16
and float16 support.

Move adding of the `COMPILER_RT_HAS_FLOAT16` define to the `foreach()`
loop where all the supported architectures are handled, and do not
enable it when targeting i386-freebsd.

Also remove the bfloat16 sources from the `i386_SOURCES` list, when
targeting i386-freebsd.

Differential Revision: https://reviews.llvm.org/D136044

17 months agoUse llvm::popcount instead of llvm::countPopulation(NFC)
Kazu Hirata [Sun, 22 Jan 2023 20:48:51 +0000 (12:48 -0800)]
Use llvm::popcount instead of llvm::countPopulation(NFC)

17 months ago[CMake] Look up target subcomponents in LLVM_AVAILABLE_LIBS
Aaron Puchert [Sun, 22 Jan 2023 20:35:09 +0000 (21:35 +0100)]
[CMake] Look up target subcomponents in LLVM_AVAILABLE_LIBS

In an installation using the all-contained libLLVM.so, individual
components are not available as targets, so we have to look them up in
LLVM_AVAILABLE_LIBS just like llvm_map_components_to_libnames does it.
Here I don't think we need the capitalized names though because we know
the right capitalization. But I might be wrong.

This is required by dragonffi, which calls llvm_map_components_to_libnames
on a list containing ${LLVM_NATIVE_ARCH}. Downstream bug report:
https://bugzilla.opensuse.org/show_bug.cgi?id=1180748.

Differential Revision: https://reviews.llvm.org/D96670

17 months ago[SCEV] `getRangeRefIter()`: don't forget to recurse into casts
Roman Lebedev [Sun, 22 Jan 2023 19:25:38 +0000 (22:25 +0300)]
[SCEV] `getRangeRefIter()`: don't forget to recurse into casts

I'm not really sure the problem can be nicely exposed via a lit test,
since we don't give up on range calculation for deeply nested ranges,
but if I add an assertion that those opcodes are never encountered,
the assertion fails in a number of existing tests.

In reality, the iterative approach is still pretty partial:
1. `Seen` should not be there. We want the last instance of expression, not the first one
2. There should be a check that `getRangeRefIter()` does not self-recurse

17 months ago[NFC][SCEV] Reflow `getRangeRefIter()` into an exhaustive switch
Roman Lebedev [Sun, 22 Jan 2023 18:49:04 +0000 (21:49 +0300)]
[NFC][SCEV] Reflow `getRangeRefIter()` into an exhaustive switch

And, this shows a bug in the original code:
why do we not recurse into casts?

If I add an assertion that those opcodes are never encountered,
the assertion fails in a number of existing tests.

17 months ago[NFC][SCEV] `GetMinTrailingZerosImpl()`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 18:55:28 +0000 (21:55 +0300)]
[NFC][SCEV] `GetMinTrailingZerosImpl()`: deduplicate handling

`scPtrToInt` receives the same treatment as normal n-ary ops.

17 months ago[NFC][SCEV] Reflow `GetMinTrailingZerosImpl()` into an exhaustive switch
Roman Lebedev [Sun, 22 Jan 2023 18:29:07 +0000 (21:29 +0300)]
[NFC][SCEV] Reflow `GetMinTrailingZerosImpl()` into an exhaustive switch

17 months ago[Dominators] Introduce DomTreeNodeTraits to allow customization. (NFC)
Florian Hahn [Sun, 22 Jan 2023 20:22:41 +0000 (20:22 +0000)]
[Dominators] Introduce DomTreeNodeTraits to allow customization. (NFC)

This patch introduces DomTreeNodeTraits for customization. Clients can implement
DomTreeNodeTraitsCustom to provide custom ParentPtr, getEntryNode and getParent.
There's also a default specialization if DomTreeNodeTraitsCustom is not implemented,
that assumes a Function-like NodeT. This is what is used for the existing DominatorTree
and MachineDominatorTree.

The main motivation for this patch is using DominatorTreeBase across all
regions of a VPlan, see D140513.

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D142162

17 months ago[NFC] Fix "form/from" typos
Piotr Fusik [Sun, 22 Jan 2023 18:59:52 +0000 (19:59 +0100)]
[NFC] Fix "form/from" typos

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D142007

17 months ago[Support] Use functions from bit.h (NFC)
Kazu Hirata [Sun, 22 Jan 2023 18:41:13 +0000 (10:41 -0800)]
[Support] Use functions from bit.h (NFC)

This patch makes the following replacements:

  countLeadingZeros  -> llvm::countl_zero
  countTrailingZeros -> llvm::countr_zero
  countPopulation    -> llvm::popcount
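
For readers unfamiliar with these operations, here is a portable sketch of what the three functions compute for 32-bit values. The helper names are hypothetical and the loops are written for clarity, not speed; they are not the implementations in bit.h. The zero-input results follow the C++20 `<bit>` convention of returning the bit width.

```cpp
#include <cstdint>

// Number of set bits (clears the lowest set bit on each iteration).
constexpr int popcountSketch(std::uint32_t V) {
  int N = 0;
  while (V) {
    V &= V - 1;
    ++N;
  }
  return N;
}

// Number of consecutive zero bits starting at the least significant bit.
constexpr int countrZeroSketch(std::uint32_t V) {
  if (V == 0)
    return 32;
  int N = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Number of consecutive zero bits starting at the most significant bit.
constexpr int countlZeroSketch(std::uint32_t V) {
  int N = 0;
  for (std::uint32_t Bit = 0x80000000u; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}
```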

17 months ago[ADT] llvm::bit_cast - use __builtin_bit_cast if available
Simon Pilgrim [Sun, 22 Jan 2023 18:21:08 +0000 (18:21 +0000)]
[ADT] llvm::bit_cast - use __builtin_bit_cast if available

If the compiler supports __builtin_bit_cast we should try to use it instead of std::memcpy (and avoid including the cstring header).
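
A minimal sketch of the pattern described here, assuming a compiler that provides `__has_builtin`; the function name `bitCastSketch` is hypothetical and this is not the code in llvm/ADT/bit.h. The `<cstring>` include is only pulled in on the fallback path.

```cpp
#include <cstdint>
#include <type_traits>

// Compilers without __has_builtin are treated as having no builtins.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if !__has_builtin(__builtin_bit_cast)
#include <cstring> // only needed for the memcpy fallback
#endif

template <typename To, typename From>
To bitCastSketch(const From &Src) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value &&
                    std::is_trivially_copyable<From>::value,
                "both types must be trivially copyable");
#if __has_builtin(__builtin_bit_cast)
  return __builtin_bit_cast(To, Src);
#else
  To Dst; // fallback requires To to be default constructible
  std::memcpy(&Dst, &Src, sizeof(To));
  return Dst;
#endif
}
```

On a platform with 32-bit IEEE-754 `float`, `bitCastSketch<std::uint32_t>(1.0f)` yields `0x3F800000` on either path.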

Differential Revision: https://reviews.llvm.org/D142305

17 months ago[ADT] Add llvm::byteswap to bit.h
Kazu Hirata [Sun, 22 Jan 2023 17:29:35 +0000 (09:29 -0800)]
[ADT] Add llvm::byteswap to bit.h

This patch adds C++23-style byteswap to bit.h.

The implementation and tests are largely taken from
llvm/include/llvm/Support/SwapByteOrder.h and
llvm/unittests/Support/SwapByteOrderTest.cpp, respectively.

Differential Revision: https://reviews.llvm.org/D142274
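
To illustrate what a byteswap does, the following is a portable sketch for unsigned integers. The name `byteswapSketch` is hypothetical, and the loop is not the implementation added to bit.h, which may well use compiler builtins instead.

```cpp
#include <cstdint>
#include <type_traits>

// Reverses the byte order of an unsigned integer, one byte per iteration.
template <typename T>
constexpr T byteswapSketch(T V) {
  static_assert(std::is_unsigned<T>::value,
                "this sketch handles unsigned types only");
  T Result = 0;
  for (unsigned I = 0; I < sizeof(T); ++I) {
    // Shift the accumulated bytes up and append the lowest byte of V.
    Result = static_cast<T>((Result << 8) | (V & 0xFF));
    V = static_cast<T>(V >> 8);
  }
  return Result;
}
```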

17 months ago[MC][test] Fix a typo
Sergei Barannikov [Sun, 22 Jan 2023 17:07:16 +0000 (20:07 +0300)]
[MC][test] Fix a typo

17 months ago[PowerPC] Regenerate vec_absd.ll test checks
Simon Pilgrim [Sun, 22 Jan 2023 17:09:46 +0000 (17:09 +0000)]
[PowerPC] Regenerate vec_absd.ll test checks

17 months ago[DAG] visitINSERT_VECTOR_ELT - use mergeEltWithShuffle to merge inserted vector eleme...
Simon Pilgrim [Sun, 22 Jan 2023 15:41:44 +0000 (15:41 +0000)]
[DAG] visitINSERT_VECTOR_ELT - use mergeEltWithShuffle to merge inserted vector element chain into base shuffle node

This allows us to merge insert_elt(insert_elt(shuffle(x,y),extract_elt(x,c1),c2),extract_elt(y,c3),c4) style insertion chains into a new shuffle node.

I had hoped to remove mergeInsertEltWithShuffle entirely, but that case doesn't have the one use limits so we would regress in a few other cases.

Fixes the vector-shuffle-combining.ll regressions in D127115

17 months ago[Flang][NFC] fix a copy-paste in fold-logical.cpp
Shivam Gupta [Sun, 22 Jan 2023 16:11:19 +0000 (21:41 +0530)]
[Flang][NFC] fix a copy-paste in fold-logical.cpp

Found by PVS-Studio.

17 months ago[NFC][SCEV] Reflow `impliesPoison()` into an exhaustive switch
Roman Lebedev [Sun, 22 Jan 2023 15:50:52 +0000 (18:50 +0300)]
[NFC][SCEV] Reflow `impliesPoison()` into an exhaustive switch

17 months ago[PVS-Studio][NFC] fix a typo in ShapeUtils.h
Shivam Gupta [Sun, 22 Jan 2023 15:54:52 +0000 (21:24 +0530)]
[PVS-Studio][NFC] fix a typo in ShapeUtils.h

17 months ago[libc++][test] Disable parts requiring locales.
Mark de Wever [Sun, 22 Jan 2023 15:49:39 +0000 (16:49 +0100)]
[libc++][test] Disable parts requiring locales.

This part should be guarded, but there are no proper guards yet.
Therefore disable the offending part. This was reported post commit in
D140653.

17 months ago[InstSimplify] (X || Y) && Y --> Y (for poison-safe logical ops)
Sanjay Patel [Sun, 22 Jan 2023 14:43:35 +0000 (09:43 -0500)]
[InstSimplify] (X || Y) && Y --> Y (for poison-safe logical ops)

https://alive2.llvm.org/ce/z/oT_tEh

This is the conjugate/sibling pattern suggested in post-commit
feedback for:
9444252a674df5952bb5af2b76348ae4b45

issue #60167
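
The plain boolean identity behind this fold can be checked exhaustively, as in the sketch below. It only covers ordinary true/false inputs; the poison-safe reasoning for the select-based logical ops is what the Alive2 link above verifies, and this example does not model it. The function name is hypothetical.

```cpp
// Checks (X || Y) && Y == Y for every combination of boolean inputs.
inline bool foldHoldsForAllInputs() {
  for (int XI = 0; XI < 2; ++XI) {
    for (int YI = 0; YI < 2; ++YI) {
      const bool X = XI != 0;
      const bool Y = YI != 0;
      if (((X || Y) && Y) != Y)
        return false;
    }
  }
  return true;
}
```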

17 months ago[InstSimplify] add tests for poison-safe variants of (X || Y) && Y; NFC
Sanjay Patel [Sun, 22 Jan 2023 14:19:55 +0000 (09:19 -0500)]
[InstSimplify] add tests for poison-safe variants of (X || Y) && Y; NFC

17 months ago[clang][doc] Fixes formatting of a text block.
Mark de Wever [Sun, 22 Jan 2023 15:21:11 +0000 (16:21 +0100)]
[clang][doc] Fixes formatting of a text block.

17 months ago[X86] avx2-vbroadcast.ll - use X86 check prefix instead of X32
Simon Pilgrim [Sun, 22 Jan 2023 15:19:17 +0000 (15:19 +0000)]
[X86] avx2-vbroadcast.ll - use X86 check prefix instead of X32

We try to use X32 for tests on gnux32 triples

17 months ago[mlir][ods] Simplify signature of `custom` printers and parsers of Attributes and...
Markus Böck [Sun, 22 Jan 2023 15:11:27 +0000 (16:11 +0100)]
[mlir][ods] Simplify signature of `custom` printers and parsers of Attributes and Types in presence of default constructible parameters

The vast majority of parameters of C++ types used as parameters for Attributes and Types are likely to be default constructible. Nevertheless, TableGen conservatively generates code for the custom directive, expecting signatures using FailureOr<T> for all parameter types T to accommodate them possibly not being default constructible. This, however, reduces the ergonomics of the likely case of default constructible parameters.

This patch fixes that issue, while barely changing the generated TableGen code, by using a helper function that is used to pass any parameters into custom parser methods. If the type is default constructible, as deemed by the C++ compiler, a default-constructed instance is created and passed into the parser method by reference. In all other cases it is a no-op and a FailureOr is passed as before.

Documentation was also updated to document the new behaviour.

Fixes https://github.com/llvm/llvm-project/issues/60178

Differential Revision: https://reviews.llvm.org/D142301

17 months ago[X86] commute-3dnow.ll - use X86 check prefix instead of X32
Simon Pilgrim [Sun, 22 Jan 2023 14:07:20 +0000 (14:07 +0000)]
[X86] commute-3dnow.ll - use X86 check prefix instead of X32

We try to use X32 for tests on gnux32 triples

17 months ago[X86] avx-vbroadcastf128.ll - use X86 check prefix instead of X32
Simon Pilgrim [Sun, 22 Jan 2023 14:01:04 +0000 (14:01 +0000)]
[X86] avx-vbroadcastf128.ll - use X86 check prefix instead of X32

We try to use X32 for tests on gnux32 triples

17 months ago[NFC][SCEVExpander] `CmpSelCost`: use the cost of the expression, not operand
Roman Lebedev [Sun, 22 Jan 2023 14:27:17 +0000 (17:27 +0300)]
[NFC][SCEVExpander] `CmpSelCost`: use the cost of the expression, not operand

Currently, for all invocations, it's equivalent, since that is literally
how `SCEVMinMaxExpr::getType()` is defined. But for e.g. `select`,
we'll want to ask about the hand type, and not the type of the operand
that happens to be first.

17 months ago[NFC][SCEV] Reflow `computeSCEVAtScope()` into an exhaustive switch
Roman Lebedev [Sun, 22 Jan 2023 14:35:25 +0000 (17:35 +0300)]
[NFC][SCEV] Reflow `computeSCEVAtScope()` into an exhaustive switch

17 months ago[NFC][SCEV] `getRelevantLoop()`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 14:15:16 +0000 (17:15 +0300)]
[NFC][SCEV] `getRelevantLoop()`: deduplicate handling

17 months ago[NFC][SCEV] `getBlockDisposition()`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 14:08:45 +0000 (17:08 +0300)]
[NFC][SCEV] `getBlockDisposition()`: deduplicate handling

17 months ago[NFC][SCEV] `getLoopDisposition()`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 13:56:28 +0000 (16:56 +0300)]
[NFC][SCEV] `getLoopDisposition()`: deduplicate handling

17 months ago[NFC][SCEV] `computeSCEVAtScope()`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 13:32:02 +0000 (16:32 +0300)]
[NFC][SCEV] `computeSCEVAtScope()`: deduplicate handling

Casts and udiv get exactly the same handling as n-ary expressions,
there is no point in special-handling anything.

17 months agoAMDGPU: Copy a source modifier test for f16/v2f16
Matt Arsenault [Fri, 16 Dec 2022 03:04:36 +0000 (22:04 -0500)]
AMDGPU: Copy a source modifier test for f16/v2f16

This is essentially a modernized copy of
select-fabs-fneg-extract.ll. Stop using kernels with loads and stores,
don't use fsub for fneg, and port the examples to half.

17 months agoAMDGPU: Add modern copy of fneg combines test
Matt Arsenault [Sun, 18 Dec 2022 12:25:56 +0000 (07:25 -0500)]
AMDGPU: Add modern copy of fneg combines test

17 months ago[DAG] mergeInsertEltWithShuffle - pull out mergeEltWithShuffle helper. NFCI.
Simon Pilgrim [Sun, 22 Jan 2023 13:57:49 +0000 (13:57 +0000)]
[DAG] mergeInsertEltWithShuffle - pull out mergeEltWithShuffle helper. NFCI.

This will allow us to reuse the code to merge an extracted scalar into an updated shuffle in a future patch.

Another step towards fixing some shuffle regressions in D127115.

17 months ago[NFC][X86] Fixup typo in `blend-of-shift.ll`
Roman Lebedev [Sun, 22 Jan 2023 13:14:06 +0000 (16:14 +0300)]
[NFC][X86] Fixup typo in `blend-of-shift.ll`

17 months ago[NFC][X86] Fixup `-mattr=<>` in one runline in `elementwise-store-of-scalar-splat.ll`
Roman Lebedev [Sun, 22 Jan 2023 13:13:16 +0000 (16:13 +0300)]
[NFC][X86] Fixup `-mattr=<>` in one runline in `elementwise-store-of-scalar-splat.ll`

17 months ago[NFC] Small indentation fix in lld/ELF/Relocations.cpp
Shivam Gupta [Sun, 22 Jan 2023 13:10:58 +0000 (18:40 +0530)]
[NFC] Small indentation fix in lld/ELF/Relocations.cpp

17 months ago[SVE] Add intrinsics for integer binops that explicitly undefine the result for inact...
Paul Walker [Fri, 13 Jan 2023 12:00:11 +0000 (12:00 +0000)]
[SVE] Add intrinsics for integer binops that explicitly undefine the result for inactive lanes.

The intent is to lower the clang X form SVE builtins to these
intrinsics. The suffix _x is already in use to signify unpredicated
SVE intrinsics hence my choice to use _u to signify those intrinsics
where the result for inactive lanes is undefined.

Differential Revision: https://reviews.llvm.org/D141937

17 months ago[Mips] Use MCInstrInfo::get in MipsAsmParser instead of reinventing it. NFC.
Jay Foad [Wed, 11 Jan 2023 15:34:37 +0000 (15:34 +0000)]
[Mips] Use MCInstrInfo::get in MipsAsmParser instead of reinventing it. NFC.

Differential Revision: https://reviews.llvm.org/D141503

17 months ago[clang-format][NFC] Add .clang-format to clang/tools/clang-format/
Owen Pan [Sun, 22 Jan 2023 10:59:23 +0000 (02:59 -0800)]
[clang-format][NFC] Add .clang-format to clang/tools/clang-format/

And reformat ClangFormat.cpp in the directory.

17 months ago[clang-format][NFC] Set LineEnding to LF in config files
Owen Pan [Sun, 22 Jan 2023 10:37:39 +0000 (02:37 -0800)]
[clang-format][NFC] Set LineEnding to LF in config files

To prevent \r\n line endings from getting into the source files.

Differential Revision: https://reviews.llvm.org/D141098

17 months ago[LoongArch] Allow %pc_lo12 relocs in JIRL's immediate operand position
WANG Xuerui [Sun, 22 Jan 2023 05:24:43 +0000 (13:24 +0800)]
[LoongArch] Allow %pc_lo12 relocs in JIRL's immediate operand position

Currently, gcc-13 will generate such assembly when `-mcmodel=medium`,
which is ostensibly a dirty hack to allow bigger offsets for extern
function calls without having to add more reloc types. This is not the
best way to accomplish the original goal, but such usages will appear
soon and we have to support it anyway.

Example:

```c
extern int foo(int);

int bar(int x) {
    return foo(x + 123);
}
```

will produce the following (simplified) assembly when compiled with
`-O2 -mcmodel=medium`:

```
    .globl  bar
    .type   bar, @function
bar:
    .cfi_startproc
    addi.w  $r4,$r4,123
    pcalau12i   $r12,%pc_hi20(foo)
    jirl    $r0,$r12,%pc_lo12(foo)
    .cfi_endproc
```

Reviewed By: SixWeining, wangleiat, MaskRay, xry111

Differential Revision: https://reviews.llvm.org/D142278

17 months ago[C++20][Modules] Fix named module import diagnostics.
Iain Sandoe [Tue, 13 Dec 2022 08:45:08 +0000 (08:45 +0000)]
[C++20][Modules] Fix named module import diagnostics.

We have been incorrectly disallowing imports of named modules in the
global and private module fragments.

This addresses: https://github.com/llvm/llvm-project/issues/59688

Differential Revision: https://reviews.llvm.org/D140927

17 months ago[bazel] Add missing dependencies for 4f1e244eb5
Benjamin Kramer [Sun, 22 Jan 2023 09:58:47 +0000 (10:58 +0100)]
[bazel] Add missing dependencies for 4f1e244eb5

17 months ago[OpenMP] Simplify `llvm.assume` operands in device code
Johannes Doerfert [Sun, 22 Jan 2023 09:27:41 +0000 (01:27 -0800)]
[OpenMP] Simplify `llvm.assume` operands in device code

17 months ago[Attributor] Handle constant icmp expressions in AAPotentialValues
Johannes Doerfert [Sun, 22 Jan 2023 09:13:24 +0000 (01:13 -0800)]
[Attributor] Handle constant icmp expressions in AAPotentialValues

A `ConstantExpr` ICmp is pretty much the same thing as an ICmpInst when
we want to simplify it. We just need to be less restrictive wrt. the
type and use the static helper functions directly.

Fixes: https://github.com/llvm/llvm-project/issues/59767

17 months ago[clang][Interp][NFCI] Make InitMap::isInitialized() const
Timm Bäder [Sat, 21 Jan 2023 18:48:37 +0000 (19:48 +0100)]
[clang][Interp][NFCI] Make InitMap::isInitialized() const

17 months ago[clang][Interp][NFC] Forward-declare Boolean in PrimTypes.h
Timm Bäder [Sat, 21 Jan 2023 18:32:02 +0000 (19:32 +0100)]
[clang][Interp][NFC] Forward-declare Boolean in PrimTypes.h

We don't need the full header file here.

17 months ago[clang][Interp][NFC] Fix header comment file name
Timm Bäder [Sat, 21 Jan 2023 16:25:57 +0000 (17:25 +0100)]
[clang][Interp][NFC] Fix header comment file name

17 months agoTransform ctpop(Pow2) -> icmp ne Pow2, 0
Noah Goldstein [Sun, 22 Jan 2023 06:00:14 +0000 (22:00 -0800)]
Transform ctpop(Pow2) -> icmp ne Pow2, 0

This makes folding to 0/1 later on easier and, regardless, `icmp ne` is
'probably' faster on most targets (especially for vectors).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D142253
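
A quick sanity check of the arithmetic behind this transform, under the assumption that "Pow2" means a value with at most one bit set (a power of two or zero): for such values the population count equals the result of comparing the value against zero. The helper names are hypothetical.

```cpp
#include <cstdint>

// Number of set bits in a 32-bit value.
constexpr int popcountSketch(std::uint32_t V) {
  int N = 0;
  while (V) {
    V &= V - 1;
    ++N;
  }
  return N;
}

// For zero and every 32-bit power of two, popcount(V) == (V != 0 ? 1 : 0).
inline bool ctpopEqualsIsNonZero() {
  if (popcountSketch(0) != 0)
    return false;
  for (unsigned Shift = 0; Shift < 32; ++Shift) {
    const std::uint32_t V = std::uint32_t{1} << Shift;
    if (popcountSketch(V) != static_cast<int>(V != 0))
      return false;
  }
  return true;
}
```

The precondition is essential: for a value such as 6, the population count is 2 while the comparison yields 1.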

17 months agoAdd tests for ctpop(Pow2); NFC
Noah Goldstein [Sun, 22 Jan 2023 05:59:58 +0000 (21:59 -0800)]
Add tests for ctpop(Pow2); NFC

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D142252

17 months ago[libc++] Rename take_while_view::__sentinel to __take_while_view_sentinel
Nikolas Klauser [Sat, 21 Jan 2023 07:34:47 +0000 (08:34 +0100)]
[libc++] Rename take_while_view::__sentinel to __take_while_view_sentinel

This makes it easier to specialize traits classes, like __segmented_iterator_traits.

Reviewed By: var-const, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D142276

17 months ago[BPF][Clang] Fix func argument pattern in bpf-stack-protector test
Yonghong Song [Sun, 22 Jan 2023 06:24:22 +0000 (22:24 -0800)]
[BPF][Clang] Fix func argument pattern in bpf-stack-protector test

Commit 56b038f887f3("[BPF][clang] Ignore stack protector options for BPF
target") added a test for its corresponding functionality.
Douglas Yung found that the test will fail with the release build
buildbot due to different func argument patterns (from %msg
to %0). This patch fixed the issue by using pattern [0-9a-z]+
which allows both %msg and %0.
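
The effect of the character class can be sketched with `std::regex`. The leading `%` in the pattern and the helper name `matchesArgName` are assumptions made for this illustration; the actual check line in the test is not reproduced here.

```cpp
#include <regex>
#include <string>

// Returns true if Name looks like "%" followed by lowercase letters/digits.
inline bool matchesArgName(const std::string &Name) {
  static const std::regex Pattern("%[0-9a-z]+");
  return std::regex_match(Name, Pattern);
}
```

Both `%msg` (named arguments) and `%0` (numbered arguments) satisfy this pattern.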

17 months ago[OpenMP] Try to fix Flang after new clause was added
Johannes Doerfert [Sun, 22 Jan 2023 04:24:43 +0000 (20:24 -0800)]
[OpenMP] Try to fix Flang after new clause was added

17 months ago[OpenMP][FIX] Split test into amdgpu and nvptx specific ones
Johannes Doerfert [Sun, 22 Jan 2023 03:54:35 +0000 (19:54 -0800)]
[OpenMP][FIX] Split test into amdgpu and nvptx specific ones

This avoids running the test for the host.

17 months ago[OpenMP][FIX] Add default clause to switch
Johannes Doerfert [Sun, 22 Jan 2023 03:50:22 +0000 (19:50 -0800)]
[OpenMP][FIX] Add default clause to switch

17 months ago[OpenMP] Introduce the `ompx_dyn_cgroup_mem(<N>)` clause
Johannes Doerfert [Sun, 8 Jan 2023 00:14:48 +0000 (16:14 -0800)]
[OpenMP] Introduce the `ompx_dyn_cgroup_mem(<N>)` clause

Dynamic memory allows users to allocate fast shared memory when a kernel
is launched. We support a single size for all kernels via the
`LIBOMPTARGET_SHARED_MEMORY_SIZE` environment variable but now we can
control it per kernel invocation, hence allow computed values.

Note: Only the nextgen plugins will allocate memory based on the clause,
      the old plugins will silently miscompile.

Differential Revision: https://reviews.llvm.org/D141233

17 months agoAdd the test dialect as dependent for the "test-legalize-patterns" test pass
Mehdi Amini [Sun, 22 Jan 2023 02:39:49 +0000 (02:39 +0000)]
Add the test dialect as dependent for the "test-legalize-patterns" test pass

Fixes #60183

17 months agoAdd missing dependent dialects to "convert-gpu-to-rocdl"
Mehdi Amini [Sun, 22 Jan 2023 02:26:34 +0000 (02:26 +0000)]
Add missing dependent dialects to "convert-gpu-to-rocdl"

Fixes #60198

17 months ago[NFC][SCEV] `CompareSCEVComplexity`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 01:45:38 +0000 (04:45 +0300)]
[NFC][SCEV] `CompareSCEVComplexity`: deduplicate handling

For all but unknown/constant/recurrences, the handling is identical,
there is no point in special-casing anything.

17 months ago[NFC][SCEV] `SCEVTraversal::visitAll()`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 01:12:11 +0000 (04:12 +0300)]
[NFC][SCEV] `SCEVTraversal::visitAll()`: deduplicate handling

They don't do anything different from what we do for n-ary expressions,
there is no point in special-casing them.

17 months ago[NFC][SCEV] `createNodeForSelectOrPHIInstWithICmpInstCond()`: directly take `Type...
Roman Lebedev [Sun, 22 Jan 2023 01:09:39 +0000 (04:09 +0300)]
[NFC][SCEV] `createNodeForSelectOrPHIInstWithICmpInstCond()`: directly take `Type`, not `Instruction`

We don't use the `Instruction` itself, only its type anyway.

17 months ago[NFC][SCEV] `createNodeForSelectOrPHIInstWithICmpInstCond()`: return optional
Roman Lebedev [Sun, 22 Jan 2023 01:07:26 +0000 (04:07 +0300)]
[NFC][SCEV] `createNodeForSelectOrPHIInstWithICmpInstCond()`: return optional

We only care about the result if it succeeds, and don't want `SCEVUnknown`.

17 months agoRemove trailing whitespace from comment
Noah Goldstein [Sat, 21 Jan 2023 19:33:38 +0000 (11:33 -0800)]
Remove trailing whitespace from comment

Differential Revision: https://reviews.llvm.org/D142289

17 months ago[llvm] Use llvm::bit_width (NFC)
Kazu Hirata [Sat, 21 Jan 2023 22:48:32 +0000 (14:48 -0800)]
[llvm] Use llvm::bit_width (NFC)

17 months ago[NFC][SCEV] Reflow `getRangeRef()` into an exhaustive switch
Roman Lebedev [Sat, 21 Jan 2023 22:34:22 +0000 (01:34 +0300)]
[NFC][SCEV] Reflow `getRangeRef()` into an exhaustive switch

17 months ago[NFC][SCEV] Reflow `getRelevantLoop()` into an exhaustive switch
Roman Lebedev [Sat, 21 Jan 2023 22:09:13 +0000 (01:09 +0300)]
[NFC][SCEV] Reflow `getRelevantLoop()` into an exhaustive switch

17 months ago[llvm] Use llvm::bit_width (NFC)
Kazu Hirata [Sat, 21 Jan 2023 21:56:47 +0000 (13:56 -0800)]
[llvm] Use llvm::bit_width (NFC)

17 months ago[OpenMP][FIX] Runtime args are not kernel args
Johannes Doerfert [Sat, 21 Jan 2023 21:43:10 +0000 (13:43 -0800)]
[OpenMP][FIX] Runtime args are not kernel args

Clang passes `KernelArgs.NumArgs` to the runtime but not all are kernel
arguments. This ensures we fall back to the old logic. In a follow-up we
should introduce a new `KernelArgs.NumKernelArgs` field and set it in
the runtime.

17 months ago[OpenMP][FIX] Remove version check lines in clang test
Johannes Doerfert [Sat, 21 Jan 2023 21:23:03 +0000 (13:23 -0800)]
[OpenMP][FIX] Remove version check lines in clang test

We really need a way to make the check line script deal with these
automatically.

17 months ago[X86] `X86TargetLowering`: override `allowsMemoryAccess()`
Roman Lebedev [Sat, 21 Jan 2023 21:12:27 +0000 (00:12 +0300)]
[X86] `X86TargetLowering`: override `allowsMemoryAccess()`

The baseline `allowsMemoryAccess()` is wrong for X86.
It assumes that aligned memory operations are always allowed,
but that is not true.

For example, we cannot perform a 32-byte aligned non-temporal load
of a 32-byte vector without AVX2, yet `allowsMemoryAccess()`
will say it is allowed, so we may end up merging non-temporal loads,
only to split them up to legalize them, and here we go again.

NOTE: the test changes here are superfluous. The main effect is that without this change,
in D141777, we'd get stuck endlessly merging and splitting non-temporal stores.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D141776

17 months ago[NFC][SCEV] `computeSCEVAtScope()`: reserve vector size upfront
Roman Lebedev [Sat, 21 Jan 2023 20:42:11 +0000 (23:42 +0300)]
[NFC][SCEV] `computeSCEVAtScope()`: reserve vector size upfront

17 months ago[NFC][SCEV] `computeSCEVAtScope()`: `scUnknown`: use early-returns
Roman Lebedev [Sat, 21 Jan 2023 20:27:51 +0000 (23:27 +0300)]
[NFC][SCEV] `computeSCEVAtScope()`: `scUnknown`: use early-returns

17 months ago[NFC][SCEV] Reflow `computeSCEVAtScope()` into an exhaustive switch
Roman Lebedev [Sat, 21 Jan 2023 20:23:59 +0000 (23:23 +0300)]
[NFC][SCEV] Reflow `computeSCEVAtScope()` into an exhaustive switch

Otherwise instead of a compile-time error that you forgot to modify it,
you'd get a run-time error, which happened every time I've added a new expr.

This is completely NFC, there are no other changes here.
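
A generic sketch of the exhaustive-switch pattern, using a hypothetical `ExprKind` enum rather than the real SCEV expression kinds: with no `default:` label, compilers such as Clang and GCC can warn via `-Wswitch` when a newly added enumerator is not handled, and `-Werror` turns that warning into a build failure.

```cpp
#include <string>

// Hypothetical expression kinds, for illustration only.
enum class ExprKind { Constant, AddExpr, MulExpr };

inline std::string describe(ExprKind K) {
  // No default label: adding a new ExprKind enumerator without a matching
  // case here is diagnosed by -Wswitch at compile time.
  switch (K) {
  case ExprKind::Constant:
    return "constant";
  case ExprKind::AddExpr:
    return "add";
  case ExprKind::MulExpr:
    return "mul";
  }
  // Only reached if K holds a value outside the enumerators.
  return "unknown";
}
```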

17 months ago[NFC][SCEV] `computeSCEVAtScope()`: clang-format
Roman Lebedev [Sat, 21 Jan 2023 20:23:20 +0000 (23:23 +0300)]
[NFC][SCEV] `computeSCEVAtScope()`: clang-format

17 months ago[clang/driver] Make sure that `-gno-modules` by itself doesn't enable debug info
Argyrios Kyrtzidis [Sat, 21 Jan 2023 19:31:21 +0000 (11:31 -0800)]
[clang/driver] Make sure that `-gno-modules` by itself doesn't enable debug info

17 months ago[OpenMP] Modernize the kernel launching interface and APIs
Johannes Doerfert [Thu, 19 Jan 2023 21:40:58 +0000 (13:40 -0800)]
[OpenMP] Modernize the kernel launching interface and APIs

We already created a versioned `__tgt_kernel_arguments` struct but it
was only briefly used and its content was passed in isolation anyway.
This makes it hard to add more information in the future. With this
patch we fully embrace the struct as the means to pass information from the
compiler to the plugin as part of a kernel launch.

The patch also extends and renames the struct, bumping the version
number to 2. Version 1 entries are auto-upgraded. This is in preparation
for "bare" kernel launches, per kernel dynamic shared memory, CUDA/HIP
lowering, etc.
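
As a rough illustration of what auto-upgrading a versioned argument
struct can look like, here is a sketch under loud assumptions: the
struct layout, field names, and `upgradeKernelArgs` are all hypothetical
and are not the libomptarget definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout; the field names are illustrative only.
struct KernelArgs {
  uint32_t Version;
  uint32_t NumArgs;
  uint32_t DynSharedMemBytes; // Assumed to be meaningful from version 2 on.
};

constexpr uint32_t CurrentVersion = 2;

// Upgrades older entries in place. Returns false for versions that are
// newer than this (imaginary) runtime understands.
inline bool upgradeKernelArgs(KernelArgs &Args) {
  if (Args.Version > CurrentVersion)
    return false;
  if (Args.Version < 2) {
    // Older producers did not set the newer field; give it a safe default.
    Args.DynSharedMemBytes = 0;
    Args.Version = 2;
  }
  return true;
}

// Runtime spot check: a version 1 entry is upgraded to version 2.
inline void checkUpgrade() {
  KernelArgs Old{1, 4, 999};
  assert(upgradeKernelArgs(Old));
  assert(Old.Version == 2 && Old.DynSharedMemBytes == 0);
}
```

Keeping the version as the first field lets the consumer decide how to
interpret the rest of the struct before touching any newer fields.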

The `__tgt_target_kernel_nowait` interface was deprecated as it was
unused. Once we actually implement support for something like that, we
can add an appropriate API.

Note: Only plugins with the `launch_kernel` interface are now supported.
      That means that a new clang won't be able to use an old runtime.
      An old clang can still use the new runtime since the libomptarget
      interface did not change.

Differential Revision: https://reviews.llvm.org/D141232

17 months ago[RISCV] Use llvm::bit_width (NFC)
Kazu Hirata [Sat, 21 Jan 2023 18:54:09 +0000 (10:54 -0800)]
[RISCV] Use llvm::bit_width (NFC)

I've verified that the arguments to llvm::bit_width are all of
type uint64_t with:

  static_assert(std::is_same_v<uint64_t, decltype(Mask)>)

17 months ago[DAG] Convert static combineABSToABD to DAGCombiner::foldABSToABD. NFCI.
Simon Pilgrim [Sat, 21 Jan 2023 18:23:36 +0000 (18:23 +0000)]
[DAG] Convert static combineABSToABD to DAGCombiner::foldABSToABD. NFCI.

This will make some future legality checks easier.

17 months ago[ARM] Cortex-M55 Scheduling Model
David Green [Sat, 21 Jan 2023 18:03:24 +0000 (18:03 +0000)]
[ARM] Cortex-M55 Scheduling Model

This adds an Arm Cortex-M55 scheduling model, using the information from
https://developer.arm.com/documentation/102692/latest/

Differential Revision: https://reviews.llvm.org/D141523

17 months ago[AArch64] Simplify isSeveralBitsExtractOpFromShr (NFC)
Kazu Hirata [Sat, 21 Jan 2023 17:23:39 +0000 (09:23 -0800)]
[AArch64] Simplify isSeveralBitsExtractOpFromShr (NFC)

This patch simplifies isSeveralBitsExtractOpFromShr.

The following statements are equivalent:

  unsigned BitWide = 64 - countLeadingOnes(~(AndMask >> SrlImm));
  unsigned BitWide = 64 - countLeadingZeros(AndMask >> SrlImm);

Now, consider:

  if (BitWide && isMask_64(AndMask >> SrlImm)) {

When isMask_64 returns true, AndMask >> SrlImm and BitWide must be
nonzero.  Since BitWide does not contribute to narrowing the
condition, we can simplify the condition as:

  if (isMask_64(AndMask >> SrlImm)) {

We can negate the condition for an early exit as recommended by the
LLVM Coding Standards.

Now, all of the following are equivalent if AndMask >> SrlImm is
nonzero:

  MSB = BitWide + SrlImm - 1
  MSB = (64 - countLeadingZeros(AndMask >> SrlImm)) + SrlImm - 1
  MSB = (63 - countLeadingZeros(AndMask >> SrlImm)) + SrlImm
  MSB = 63 - countLeadingZeros(AndMask)
  MSB = 63 ^ countLeadingZeros(AndMask)
  MSB = findLastSet(AndMask, ZB_Undefined)
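
The equivalences above can be spot-checked with a small standalone
sketch. The helpers below are simple loop-based stand-ins written for
this illustration, not the LLVM implementations, and the sample value
0x0FF0 with a shift of 4 is an arbitrary choice.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a 64-bit count-leading-zeros; returns 64 for V == 0.
constexpr int countLeadingZeros64(uint64_t V) {
  int N = 0;
  for (uint64_t Bit = uint64_t{1} << 63; Bit != 0 && (V & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// Stand-in for a 64-bit count-leading-ones; returns 64 for all-ones.
constexpr int countLeadingOnes64(uint64_t V) {
  int N = 0;
  for (uint64_t Bit = uint64_t{1} << 63; Bit != 0 && (V & Bit) != 0; Bit >>= 1)
    ++N;
  return N;
}

// Stand-in for isMask_64: a non-empty run of ones starting at bit 0.
constexpr bool isLowMask64(uint64_t V) { return V != 0 && ((V + 1) & V) == 0; }

// The first and last forms of MSB from the list above.
constexpr int msbViaBitWide(uint64_t AndMask, unsigned SrlImm) {
  int BitWide = 64 - countLeadingZeros64(AndMask >> SrlImm);
  return BitWide + static_cast<int>(SrlImm) - 1;
}
constexpr int msbViaSub(uint64_t AndMask) {
  return 63 - countLeadingZeros64(AndMask);
}
constexpr int msbViaXor(uint64_t AndMask) {
  return 63 ^ countLeadingZeros64(AndMask);
}

// The two BitWide statements agree, including for V == 0.
static_assert(countLeadingOnes64(~uint64_t{0x00FF}) ==
              countLeadingZeros64(uint64_t{0x00FF}));
static_assert(countLeadingOnes64(~uint64_t{0}) == countLeadingZeros64(0));

// AndMask = 0x0FF0, SrlImm = 4: the shifted value 0xFF is a mask.
static_assert(isLowMask64(uint64_t{0x0FF0} >> 4));
static_assert(msbViaBitWide(0x0FF0, 4) == 11);
static_assert(msbViaSub(0x0FF0) == 11);
static_assert(msbViaXor(0x0FF0) == 11);

// Runtime sweep over single-bit masks with SrlImm equal to the bit index.
inline void checkMsbForms() {
  for (unsigned I = 0; I < 64; ++I) {
    uint64_t AndMask = uint64_t{1} << I;
    assert(msbViaBitWide(AndMask, I) == msbViaSub(AndMask));
    assert(msbViaSub(AndMask) == msbViaXor(AndMask));
  }
}
```

The XOR form works because countLeadingZeros of a nonzero 64-bit value
lies in [0, 63], and subtracting such a value from 63 (all ones in six
bits) never borrows.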

17 months ago[VPlan] Consider all recipes in replicate blocks as sink candidates.
Florian Hahn [Sat, 21 Jan 2023 17:14:13 +0000 (17:14 +0000)]
[VPlan] Consider all recipes in replicate blocks as sink candidates.

Update sinkScalarOperands to consider all operands of recipes in
replicate blocks as sink candidates. This enables additional sinking
opportunities and is another step towards retiring LLVM IR-based
sinkScalarOperands.

This enables iterative sinking of operands for successive calls of
sinkScalarOperands.

Depends on D139788.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D139790