platform/upstream/llvm.git
22 months ago[M68k] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros
Brad Smith [Thu, 29 Dec 2022 09:54:39 +0000 (04:54 -0500)]
[M68k] Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros

Define __GCC_HAVE_SYNC_COMPARE_AND_SWAP macros

Fixes #58974

Reviewed By: myhsu, glaubitz, 0x59616e

Differential Revision: https://reviews.llvm.org/D140695

22 months ago[mlir] Add a newline character in the Linalg debug macro
Andrzej Warzynski [Thu, 29 Dec 2022 09:11:33 +0000 (09:11 +0000)]
[mlir] Add a newline character in the Linalg debug macro

Differential Revision: https://reviews.llvm.org/D140752

22 months ago[InstCombine] Fold (X << Z) / (X * Y) -> (1 << Z) / Y
Chenbing Zheng [Thu, 29 Dec 2022 09:30:49 +0000 (17:30 +0800)]
[InstCombine] Fold (X << Z) / (X * Y) -> (1 << Z) / Y

Alive2: https://alive2.llvm.org/ce/z/CBJLeP
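
For illustration only (not part of the patch): a minimal C++ sketch that checks the identity on small unsigned inputs. The helper names are hypothetical, and the input range is deliberately chosen so that neither the shift nor the multiply can wrap; the exact no-overflow preconditions the fold requires are given in the Alive2 proof above, not by this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: (X << Z) / (X * Y).
constexpr uint32_t beforeFold(uint32_t X, uint32_t Y, uint32_t Z) {
  return (X << Z) / (X * Y);
}

// Right-hand side of the fold: (1 << Z) / Y.
constexpr uint32_t afterFold(uint32_t Y, uint32_t Z) {
  return (uint32_t{1} << Z) / Y;
}

// Compare both sides on a small range where nothing can overflow 32 bits
// (X < 64, Y < 64, Z < 8, X and Y non-zero).
inline bool foldHoldsOnSmallInputs() {
  for (uint32_t X = 1; X < 64; ++X)
    for (uint32_t Y = 1; Y < 64; ++Y)
      for (uint32_t Z = 0; Z < 8; ++Z)
        if (beforeFold(X, Y, Z) != afterFold(Y, Z))
          return false;
  return true;
}

// (3 << 4) / (3 * 2) == 48 / 6 == 8, and (1 << 4) / 2 == 16 / 2 == 8.
static_assert(beforeFold(3, 2, 4) == afterFold(2, 4), "both sides agree");

inline void checkFold() { assert(foldHoldsOnSmallInputs()); }
```

The common factor X cancels because integer division of X*a by X*b yields the same quotient as a by b when X is non-zero and nothing wraps.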

22 months agoFix build of nvptx-arch with CLANG_LINK_CLANG_DYLIB
Jonas Hahnfeld [Thu, 29 Dec 2022 08:44:19 +0000 (09:44 +0100)]
Fix build of nvptx-arch with CLANG_LINK_CLANG_DYLIB

The function clang_target_link_libraries must only be used with real
Clang libraries; with CLANG_LINK_CLANG_DYLIB, it will instead link in
clang-cpp. We must use the standard CMake target_link_libraries for
the CUDA library.

22 months ago[RISCV] Add Svpbmt extension support.
Yeting Kuo [Tue, 27 Dec 2022 11:34:29 +0000 (03:34 -0800)]
[RISCV] Add Svpbmt extension support.

Spec of Svpbmt: https://github.com/riscv/riscv-isa-manual/blob/master/src/supervisor.tex#L2399

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D140692

22 months ago[RISCV] Add SH1ADD/SH2ADD/SH3ADD to RISCVDAGToDAGISel::hasAllNBitUsers.
Craig Topper [Thu, 29 Dec 2022 07:38:12 +0000 (23:38 -0800)]
[RISCV] Add SH1ADD/SH2ADD/SH3ADD to RISCVDAGToDAGISel::hasAllNBitUsers.

22 months ago[Clang][RISCV] Use poison instead of undef
eopXD [Tue, 27 Dec 2022 09:42:31 +0000 (01:42 -0800)]
[Clang][RISCV] Use poison instead of undef

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D140687

22 months ago[BOLT] Respect -function-order in lite mode
Amir Ayupov [Thu, 29 Dec 2022 04:49:30 +0000 (20:49 -0800)]
[BOLT] Respect -function-order in lite mode

Process functions listed in -function-order file even in lite mode.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D140435

22 months ago[RISCV] Prefer ADDI over ORI if the known bits are disjoint.
Craig Topper [Thu, 29 Dec 2022 03:43:18 +0000 (19:43 -0800)]
[RISCV] Prefer ADDI over ORI if the known bits are disjoint.

There is no compressed form of ORI but there is a compressed form
for ADDI.

This also works for XORI since DAGCombine will turn Xor with disjoint
bits into Or.

Note: The compressed forms require a simm6 immediate, but I'm doing
this for the full simm12 range.
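
For illustration only (not part of the patch): a minimal C++ sketch of the arithmetic fact the change relies on. When two values have no set bits in common, no carry is generated, so OR, XOR and ADD give the same result. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// True when no bit position is set in both values.
constexpr bool bitsDisjoint(uint64_t A, uint64_t B) { return (A & B) == 0; }

// True when OR, XOR and ADD all produce the same result for A and B.
constexpr bool orXorAddAgree(uint64_t A, uint64_t B) {
  return (A | B) == (A + B) && (A ^ B) == (A + B);
}

static_assert(bitsDisjoint(0xF0, 0x0F) && orXorAddAgree(0xF0, 0x0F),
              "disjoint operands: OR/XOR/ADD coincide");
static_assert(!bitsDisjoint(0x3, 0x1) && !orXorAddAgree(0x3, 0x1),
              "overlapping operands: 3 | 1 == 3 but 3 + 1 == 4");

// Exhaustive check over all pairs of 8-bit values.
inline void checkDisjointIdentity() {
  for (uint64_t A = 0; A < 256; ++A)
    for (uint64_t B = 0; B < 256; ++B)
      if (bitsDisjoint(A, B))
        assert(orXorAddAgree(A, B));
}
```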

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D140674

22 months ago[DFSan] Add `zeroext` attribute for callbacks with 8bit shadow variable arguments
Weining Lu [Thu, 29 Dec 2022 03:37:46 +0000 (11:37 +0800)]
[DFSan] Add `zeroext` attribute for callbacks with 8bit shadow variable arguments

Add the `zeroext` attribute to the first parameter (the 8-bit shadow
variable argument) of the callbacks below, to conform to many platforms'
ABI calling conventions and to some compiler behavior.
- __dfsan_load_callback
- __dfsan_store_callback
- __dfsan_cmp_callback
- __dfsan_conditional_callback
- __dfsan_conditional_callback_origin
- __dfsan_reaches_function_callback
- __dfsan_reaches_function_callback_origin

The type of these callbacks' first parameter is u8 (see the
definition of `dfsan_label`). First, many platforms' ABIs
require that unsigned integer data types (except unsigned int)
be zero-extended when stored in a general-purpose register.
Second, compiler optimizations may assume the arguments are
zero-extended and misbehave if they are not, e.g. by using
an `i8` argument to index into a jump table. If the
argument has non-zero high bits, the output executable may
crash at run-time. So we need to add the `zeroext` attribute
when declaring and calling them.
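
For illustration only (not part of the patch): a minimal C++ model of the hazard, treating a 64-bit integer as the register that carries an 8-bit label. All names are hypothetical, and real register contents depend on the target ABI; this is a sketch of the reasoning, not of the DFSan runtime.

```cpp
#include <cassert>
#include <cstdint>

// Model of passing an 8-bit label without zeroext: the caller only writes
// the low 8 bits, so whatever was in the upper bits stays there.
constexpr uint64_t passWithoutZeroext(uint8_t Label, uint64_t StaleBits) {
  return (StaleBits & ~uint64_t{0xFF}) | Label;
}

// Model of passing the label with zeroext: the upper bits are cleared.
constexpr uint64_t passWithZeroext(uint8_t Label) { return Label; }

// A callee that indexes a 256-entry table with the full register is only
// safe when the register value is below 256.
constexpr bool safeAsTableIndex(uint64_t Reg) { return Reg < 256; }

static_assert(safeAsTableIndex(passWithZeroext(5)),
              "zero-extended label stays in range");
static_assert(!safeAsTableIndex(passWithoutZeroext(5, 0xDEAD0000)),
              "stale upper bits push the index out of range");

inline void checkZeroextModel() {
  for (unsigned L = 0; L < 256; ++L)
    assert(safeAsTableIndex(passWithZeroext(static_cast<uint8_t>(L))));
}
```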

Reviewed By: browneee, MaskRay

Differential Revision: https://reviews.llvm.org/D140689

22 months ago[XRay] Unsupport version<2 sled entry
Fangrui Song [Thu, 29 Dec 2022 02:08:29 +0000 (18:08 -0800)]
[XRay] Unsupport version<2 sled entry

For many features we expect clang and compiler-rt to have a version lock
relation, yet for XRaySledEntry we have kept version<2 compatibility for more
than 2 years (I migrated away the last user mips in 2020-09 (D87977)).
I think it's fair to call an end to version<2 now. This should discourage more
work on version<2 (e.g. D140725).

Reviewed By: ianlevesque

Differential Revision: https://reviews.llvm.org/D140739

22 months agoRevert "[MLIR][Arith] Remove unused assertions"
Ben Shi [Thu, 29 Dec 2022 00:54:01 +0000 (08:54 +0800)]
Revert "[MLIR][Arith] Remove unused assertions"

This reverts commit 50e6c306b1cb03fe398aebc41d1bef5b6c9d9bb0.

22 months ago[NFC][Codegen][X86] Add exhaustive-ish test coverage for ZERO_EXTEND_VECTOR_INREG
Roman Lebedev [Wed, 28 Dec 2022 23:09:57 +0000 (02:09 +0300)]
[NFC][Codegen][X86] Add exhaustive-ish test coverage for ZERO_EXTEND_VECTOR_INREG

It should be possible to deduplicate AVX2 and AVX512F checklines,
but I'm not sure which combination of check prefixes would do that.

https://godbolt.org/z/sndT9n1nz

22 months ago[mlir][py] Add StrAttr convenience builder.
Jacques Pienaar [Thu, 29 Dec 2022 00:02:08 +0000 (16:02 -0800)]
[mlir][py] Add StrAttr convenience builder.

22 months ago[dfsan][test] Replace REQUIRES: x86_64-target-arch with lit.cfg.py check
Fangrui Song [Wed, 28 Dec 2022 23:35:09 +0000 (15:35 -0800)]
[dfsan][test] Replace REQUIRES: x86_64-target-arch with lit.cfg.py check

Make it easier to support a new architecture.

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D140744

22 months ago[RISCV] Fix mistakes in fixed-vectors-vreductions-mask.ll command lines. NFC
Craig Topper [Wed, 28 Dec 2022 23:16:13 +0000 (15:16 -0800)]
[RISCV] Fix mistakes in fixed-vectors-vreductions-mask.ll command lines. NFC

There were 4 RUN lines, but only 2 of them were unique. I believe
we were trying to test LMUL=1 and LMUL=8 with riscv32 and riscv64,
but we put riscv32 on both LMUL=1 lines and riscv64 on both LMUL=8 lines.

22 months ago[RISCV] Add RISCV::XORI to RISCVDAGToDAGISel::hasAllNBitUsers.
Craig Topper [Wed, 28 Dec 2022 21:39:33 +0000 (13:39 -0800)]
[RISCV] Add RISCV::XORI to RISCVDAGToDAGISel::hasAllNBitUsers.

22 months ago[Clang] Move AMDGPU IAS enabling to Generic_GCC::IsIntegratedAssemblerDefault, NFC
Brad Smith [Wed, 28 Dec 2022 22:51:52 +0000 (17:51 -0500)]
[Clang] Move AMDGPU IAS enabling to Generic_GCC::IsIntegratedAssemblerDefault, NFC

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D140657

22 months agoApply clang-tidy fixes for readability-identifier-naming in InferTypeOpInterface...
Mehdi Amini [Sat, 10 Dec 2022 12:53:27 +0000 (12:53 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in InferTypeOpInterface.cpp (NFC)

22 months agoApply clang-tidy fixes for readability-simplify-boolean-expr in BufferizableOpInterfa...
Mehdi Amini [Sat, 10 Dec 2022 12:25:09 +0000 (12:25 +0000)]
Apply clang-tidy fixes for readability-simplify-boolean-expr in BufferizableOpInterfaceImpl.cpp (NFC)

22 months ago[RISCV] Support SRLI in hasAllNBitUsers.
Craig Topper [Wed, 28 Dec 2022 21:08:28 +0000 (13:08 -0800)]
[RISCV] Support SRLI in hasAllNBitUsers.

We can recursively look through SRLI if the shift amount is less
than the demanded bits. We can reduce the demanded bit count by
the shift amount and check the users of the SRLI.
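
For illustration only (not part of the patch): a minimal C++ sketch of why the demanded bit count shrinks by the shift amount. If two inputs agree on their low `Bits` bits, their logical right shifts by `ShAmt` (with `ShAmt < Bits`) agree on the low `Bits - ShAmt` bits, and no more than that can be promised. The helper names are hypothetical and this does not mirror the code in RISCVISelDAGToDAG.

```cpp
#include <cassert>
#include <cstdint>

// Mask selecting the low N bits, for N in [0, 64].
constexpr uint64_t lowBitsMask(unsigned N) {
  return N >= 64 ? ~uint64_t{0} : ((uint64_t{1} << N) - 1);
}

// True when A and B are identical in their low N bits.
constexpr bool agreeOnLowBits(uint64_t A, uint64_t B, unsigned N) {
  return ((A ^ B) & lowBitsMask(N)) == 0;
}

// Two values that share their low 32 bits but differ in every upper bit.
constexpr uint64_t ValA = 0xAAAAAAAA12345678ULL;
constexpr uint64_t ValB = 0x5555555512345678ULL;

static_assert(agreeOnLowBits(ValA, ValB, 32), "inputs share the low 32 bits");
static_assert(agreeOnLowBits(ValA >> 8, ValB >> 8, 32 - 8),
              "after srli by 8 the low 24 bits still agree");
static_assert(!agreeOnLowBits(ValA >> 8, ValB >> 8, 32 - 8 + 1),
              "bit 24 of the shifted value comes from an undemanded input bit");

inline void checkSrliLookThrough() {
  for (unsigned Bits = 1; Bits <= 16; ++Bits)
    for (unsigned ShAmt = 0; ShAmt < Bits; ++ShAmt)
      for (uint64_t Low = 0; Low < 64; ++Low) {
        uint64_t A = Low;                          // upper bits mostly clear
        uint64_t B = Low | (~uint64_t{0} << Bits); // all bits >= Bits set
        assert(agreeOnLowBits(A, B, Bits));
        assert(agreeOnLowBits(A >> ShAmt, B >> ShAmt, Bits - ShAmt));
      }
}
```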

22 months ago[RISCV] Refactor RISCV::hasAllWUsers to hasAllNBitUsers similar to RISCVISelDAGToDAG...
Craig Topper [Wed, 28 Dec 2022 19:41:01 +0000 (11:41 -0800)]
[RISCV] Refactor RISCV::hasAllWUsers to hasAllNBitUsers similar to RISCVISelDAGToDAG's version. NFC

Move to RISCVInstrInfo since we need RISCVSubtarget now.

Instead of asking if only the lower 32 bits are used we can now
ask if the lower N bits are used. This will be needed by a future
patch.

22 months agoCodingStandards: restrict CamelCase variable names guideline to llvm/clang/clang...
Fangrui Song [Wed, 28 Dec 2022 20:48:13 +0000 (12:48 -0800)]
CodingStandards: restrict CamelCase variable names guideline to llvm/clang/clang-tools-extra/polly/bolt

See https://discourse.llvm.org/t/top-level-clang-tidy-options-and-variablename-suggestion-on-codingstandards/58783 ,
the CamelCase variable names guideline does not reflect the truth:
flang, libc, libclc, libcxx, libcxxabi, libunwind, lld, mlir, openmp,
and pstl use camelCase. lldb uses snake_case.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D140585

22 months ago[MLIR][Affine] Make fusion helper check method significantly more efficient
Uday Bondhugula [Wed, 28 Dec 2022 20:06:21 +0000 (01:36 +0530)]
[MLIR][Affine] Make fusion helper check method significantly more efficient

The `hasDependencePath` method in affine fusion is quite inefficient as
it does a DFS on the complete graph for what is a small part of the
checks before fusion can be performed. Make this efficient by using the
fact that the nodes involved are all at the top-level of the same block.
With this change, for large graphs with about 10,000 nodes, the check
runs in a few seconds instead of not terminating even in a few hours.

This is NFC from a functionality standpoint; it only leads to an
improvement in pass running time on large IR.

Differential Revision: https://reviews.llvm.org/D140522

22 months ago[XRay] Fix Hexagon sled version
Fangrui Song [Wed, 28 Dec 2022 20:03:09 +0000 (12:03 -0800)]
[XRay] Fix Hexagon sled version

D113638 emitted version 0 for XRaySledEntry, which will lead to an incorrect
address computation in the runtime.

While here, improve the test.

22 months ago[OpenMP][JIT] Fixed a couple of issues in the initial implementation of JIT
Shilei Tian [Wed, 28 Dec 2022 19:40:46 +0000 (14:40 -0500)]
[OpenMP][JIT] Fixed a couple of issues in the initial implementation of JIT

This patch fixes a couple of issues:
1. Instead of using `llvm_unreachable` in the base virtual functions, an unknown
   value is now returned. The previous approach could cause a runtime error on
   targets where the image is not compatible but JIT is not implemented.
2. Fixed the typo in CMake that caused the `Target` CMake variable to be undefined.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D140732

22 months ago[RISCV] Add const qualifiers to some function arguments. NFC
Craig Topper [Wed, 28 Dec 2022 19:07:31 +0000 (11:07 -0800)]
[RISCV] Add const qualifiers to some function arguments. NFC

22 months ago[X86] Emit RIP-relative access to local function in PIC medium code model
Thomas Köppe [Wed, 28 Dec 2022 19:14:39 +0000 (11:14 -0800)]
[X86] Emit RIP-relative access to local function in PIC medium code model

Currently, the medium code model for x86_64 emits position-dependent relocations (R_X86_64_64) for local functions, regardless of PIC or no-PIC mode. (This means generically that code compiled with the medium model cannot be linked into a position-independent executable.)

Example:

```
static int g(int n) {
  return 2 * n + 3;
}

void f(int(**p)(int)) {
  *p = g;
}
```

This results in:

```
Disassembly of section .text:

0000000000000000 <f>:
       0: 48 b8 00 00 00 00 00 00 00 00 movabs rax, 0x0
       a: 48 89 07                      mov qword ptr [rdi], rax
       d: c3                            ret
```

```
Relocation section '.rela.text' at offset 0xf0 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000002  0000000200000001 R_X86_64_64            0000000000000000 .text + 10
```

This patch changes the behaviour to unconditionally emit a RIP-relative access, both in PIC and non-PIC mode. This fixes PIC mode, and is perhaps an improvement in non-PIC mode, too, since it results in a shorter instruction. A 32-bit relocation should suffice since the medium memory model demands that all code fit within 2GiB.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D140593

22 months ago[InstSimplify] add tests for div exact; NFC
Sanjay Patel [Wed, 28 Dec 2022 17:06:57 +0000 (12:06 -0500)]
[InstSimplify] add tests for div exact; NFC

22 months ago[InstCombine] add tests for udiv-by-constant demanded bits; NFC
Sanjay Patel [Wed, 28 Dec 2022 16:04:02 +0000 (11:04 -0500)]
[InstCombine] add tests for udiv-by-constant demanded bits; NFC

22 months ago[libc++][CI] Improves clang-(tidy|query) selection.
Mark de Wever [Wed, 7 Dec 2022 16:43:59 +0000 (17:43 +0100)]
[libc++][CI] Improves clang-(tidy|query) selection.

Hardcode the version of the tools used in the test feature script
instead of the tests. By changing the hard-coded location it's
easier to make the location flexible in the future.

Drive-by change
- The minimum required version for clang-query is now 15, which matches
  our future idea as outlined in the Dockerfile.
- The minimum required version for clang-tidy is now 16, which enables
  the new clang-tidy ADL plugin. This plugin is disabled for C++03
  due to false positives when using `noexcept`, which is not an operator
  in C++03.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D139545

22 months ago[lld] Fix iwyu problems after 83d59e05b201760e3f364ff6316301d347cbad95
Fangrui Song [Wed, 28 Dec 2022 18:46:45 +0000 (10:46 -0800)]
[lld] Fix iwyu problems after 83d59e05b201760e3f364ff6316301d347cbad95

The commit transitively includes lld/include/lld/Common/ErrorHandler.h into
lld/include/lld/Common/Driver.h, which is not intended.

22 months ago[NVPTX] Emit .noreturn directive
Pavel Kopyl [Wed, 28 Dec 2022 15:17:07 +0000 (18:17 +0300)]
[NVPTX] Emit .noreturn directive

Differential Revision: https://reviews.llvm.org/D140238

22 months agoHandle simple diamond CFG hoisting in DivRemPairs.
Owen Anderson [Sat, 24 Dec 2022 04:22:36 +0000 (21:22 -0700)]
Handle simple diamond CFG hoisting in DivRemPairs.

Previously we only handled triangle CFGs. This patch expands that
to support diamonds, where the div and rem appear in the then/else
sides of a condition. In that case, we can hoist the div into the
shared predecessor.

This could be generalized further to use nearest common ancestors,
but some of the conditions for hoisting would then require
post-dominator information.
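
For illustration only (not part of the patch): a minimal source-level C++ sketch of the diamond shape described above, before and after hoisting the division into the shared predecessor. The function names are hypothetical. Expressing the remainder through the hoisted quotient is just one possible outcome; whether the pass emits a paired div/rem or a multiply and subtract depends on the target.

```cpp
#include <cassert>

// Diamond CFG before the transform: the division is on one side of the
// branch and the remainder (same operands) is on the other.
inline int beforeHoist(bool TakeQuotient, int X, int Y) {
  if (TakeQuotient)
    return X / Y;
  else
    return X % Y;
}

// After hoisting: the division lives in the shared predecessor. Both
// original paths already divided by Y, so no new trap is introduced.
inline int afterHoist(bool TakeQuotient, int X, int Y) {
  int Quot = X / Y; // hoisted into the block that dominates both sides
  if (TakeQuotient)
    return Quot;
  else
    return X - Quot * Y; // remainder recomputed from the hoisted quotient
}

inline void checkHoistEquivalence() {
  for (int X = -20; X <= 20; ++X)
    for (int Y = -5; Y <= 5; ++Y) {
      if (Y == 0)
        continue; // division by zero is undefined in both versions
      assert(beforeHoist(true, X, Y) == afterHoist(true, X, Y));
      assert(beforeHoist(false, X, Y) == afterHoist(false, X, Y));
    }
}
```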

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D140647

22 months ago[AArch64] Fix AArch64TargetParser.def includes for standalone builds.
Pavel Iliin [Wed, 28 Dec 2022 16:55:16 +0000 (16:55 +0000)]
[AArch64] Fix AArch64TargetParser.def includes for standalone builds.

22 months ago[NFC][libc++] Replaces tabs by spaces.
Mark de Wever [Wed, 28 Dec 2022 17:10:39 +0000 (18:10 +0100)]
[NFC][libc++] Replaces tabs by spaces.

22 months ago[test] Exclude //llvm/unittests:llvm_exegesis_tests due to buildkite environment.
Jordan Rupprecht [Wed, 28 Dec 2022 16:43:04 +0000 (08:43 -0800)]
[test] Exclude //llvm/unittests:llvm_exegesis_tests due to buildkite environment.

Buildkite does not allow user perf monitoring and fails: https://buildkite.com/llvm-project/upstream-bazel/builds/49579.

```
[ RUN      ] PerfHelperTest.FunctionalTest
Unable to open event. ERRNO: Permission denied. Make sure your kernel allows user space perf monitoring.
You may want to try:
$ sudo sh -c 'echo -1 > /proc/sys/kernel/perf_event_paranoid'
llvm_exegesis_tests: external/llvm-project/llvm/tools/llvm-exegesis/lib/PerfHelper.cpp:111: llvm::exegesis::pfm::Counter::Counter(llvm::exegesis::pfm::PerfEvent &&): Assertion `FileDescriptor != -1 && "Unable to open event"' failed.
```

22 months ago[mlir][sparse] Use DLT in the mangled function names for insertion.
bixia1 [Tue, 27 Dec 2022 20:28:29 +0000 (12:28 -0800)]
[mlir][sparse] Use DLT in the mangled function names for insertion.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140484

22 months ago[bazel] Restore libpfm as a conditional dependency for exegesis.
Jordan Rupprecht [Wed, 28 Dec 2022 16:13:20 +0000 (08:13 -0800)]
[bazel] Restore libpfm as a conditional dependency for exegesis.

We used to have `pfm` built into exegesis, although since it's an external dependency we marked it as a manual target. Because of this we didn't have buildbot coverage and so we removed it in D134510 after we had a few breakages that weren't caught. This adds it back, but with three possible states similar to the story with `mpfr`, i.e. it can either be disabled, built from external sources (git/make), or use whatever `-lpfm` is installed on the system.

This change is modeled after D119547. Like that patch, the default is off (matching the status quo), but unlike that patch we don't enable it for CI because IIRC we don't have the package installed there, and building from source might be expensive. We could enable it later either after installing it on buildbot machines or by measuring build cost and deeming it OK.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D138470

22 months ago[NFC][libc++] Fixes ADL calls.
Mark de Wever [Wed, 28 Dec 2022 16:04:43 +0000 (17:04 +0100)]
[NFC][libc++] Fixes ADL calls.

22 months ago[InstCombine] preserve signbit semantics of NAN with fold to fabs
Sanjay Patel [Wed, 28 Dec 2022 15:28:23 +0000 (10:28 -0500)]
[InstCombine] preserve signbit semantics of NAN with fold to fabs

As discussed in issue #59279, we want fneg/fabs to conform to the
IEEE-754 spec for signbit operations - quoting from section 5.5.1
of IEEE-754-2008:
"negate(x) copies a floating-point operand x to a destination in
the same format, reversing the sign bit"
"abs(x) copies a floating-point operand x to a destination in the
same format, setting the sign bit to 0 (positive)"
"The operations treat floating-point numbers and NaNs alike."

So we gate this transform with "nnan" in addition to "nsz":
(X > 0.0) ? X : -X --> fabs(X)

Without that restriction, we could have for example:
(+NaN > 0.0) ? +NaN : -NaN --> -NaN
(because an ordered compare with NaN is always false)
That would be different than fabs(+NaN) --> +NaN.
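
For illustration only (not part of the patch): a minimal C++ sketch of the NaN case above, assuming an IEEE-754 target and a build without fast-math style flags. The function name `selectAbs` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// The select pattern from the commit message: (X > 0.0) ? X : -X.
inline double selectAbs(double X) { return (X > 0.0) ? X : -X; }

inline void checkNaNSignBit() {
  // Build a NaN whose sign bit is known to be clear.
  const double PosNaN =
      std::copysign(std::numeric_limits<double>::quiet_NaN(), 1.0);

  // The ordered compare is false for NaN, so the select yields -X,
  // which is a NaN with the sign bit set.
  const double FromSelect = selectAbs(PosNaN);
  assert(std::isnan(FromSelect) && std::signbit(FromSelect));

  // fabs only clears the sign bit, so the result keeps a clear sign bit.
  const double FromFabs = std::fabs(PosNaN);
  assert(std::isnan(FromFabs) && !std::signbit(FromFabs));
}
```

The two results differ only in the sign bit of a NaN, which is why the fold is safe only when NaNs can be ignored.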

More fabs/fneg patterns demonstrated here:
https://godbolt.org/z/h8ecc659d
(without any FMF, these are correct independently of this patch -
no fabs should be created)

The code change is a one-liner, but we have lots of tests diffs
because there are many variations of the basic pattern.

Differential Revision: https://reviews.llvm.org/D139785

22 months ago[SLP]Use ShuffleInstructionBuilder for vector shrinking.
Alexey Bataev [Wed, 21 Dec 2022 21:44:30 +0000 (13:44 -0800)]
[SLP]Use ShuffleInstructionBuilder for vector shrinking.

We can now use ShuffleInstructionBuilder for shrinking shuffle emission.
It allows removing an extra shuffle from the emitted code and reusing
the original vector.

Part of D110978

Differential Revision: https://reviews.llvm.org/D140499

22 months ago[NFC][exegesis] By default, don't dump objects to disk
Roman Lebedev [Wed, 28 Dec 2022 13:56:20 +0000 (16:56 +0300)]
[NFC][exegesis] By default, don't dump objects to disk

It's a strictly-developer feature, which is useless most of the time.

Fixes https://github.com/llvm/llvm-project/issues/59082

Reviewed By: RKSimon, gchatelet

Differential Revision: https://reviews.llvm.org/D140700

22 months ago[clangd] Fix action `RemoveUsingNamespace` for inline namespace
v1nh1shungry [Wed, 28 Dec 2022 12:34:41 +0000 (13:34 +0100)]
[clangd] Fix action `RemoveUsingNamespace` for inline namespace

The existing version ignores symbols declared in an inline namespace `ns`
when removing `using namespace ns`.

Reviewed By: tom-anders

Differential Revision: https://reviews.llvm.org/D138028

22 months ago[mlir] NFC: work around gcc-aarch64 v8.3 compilation issue in getRegionBranchSuccesso...
Christian Sigg [Wed, 28 Dec 2022 11:08:37 +0000 (12:08 +0100)]
[mlir] NFC: work around gcc-aarch64 v8.3 compilation issue in getRegionBranchSuccessorOperands implementation.

22 months ago[clang][Interp][NFC] Fix typo in comment
Timm Bäder [Wed, 28 Dec 2022 11:08:29 +0000 (12:08 +0100)]
[clang][Interp][NFC] Fix typo in comment

22 months ago[RISCV][NFC] Remove redundant setOperationAction.
Hsiangkai Wang [Wed, 28 Dec 2022 03:42:53 +0000 (03:42 +0000)]
[RISCV][NFC] Remove redundant setOperationAction.

ISD::INSERT_VECTOR_ELT is already set above.

Differential Revision: https://reviews.llvm.org/D140716

22 months ago[X86] Rename CMPCCXADD intrinsics.
Freddy Ye [Wed, 28 Dec 2022 08:44:54 +0000 (16:44 +0800)]
[X86] Rename CMPCCXADD intrinsics.

"__cmpccxadd_epi*" -> "_cmpccxadd_epi*"
This aligns with other intrinsics, which follow the single leading "_" style. GCC
and the intrinsics guide website will also apply this change.

Reviewed By: LuoYuanke, skan

Differential Revision: https://reviews.llvm.org/D140281

22 months ago[RISCV] Fix typos in RISCVUsage.rst
Jie Fu [Wed, 28 Dec 2022 08:32:44 +0000 (16:32 +0800)]
[RISCV] Fix typos in RISCVUsage.rst

Fix typos `riscv-toolchai-convention` --> `riscv-toolchain-convention`

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D140717

22 months agoFix failure of ldst-16-byte.mir
Qiu Chaofan [Wed, 28 Dec 2022 06:23:32 +0000 (14:23 +0800)]
Fix failure of ldst-16-byte.mir

22 months ago[PowerPC] Enable track-subreg-liveness by default
Qiu Chaofan [Wed, 28 Dec 2022 06:06:01 +0000 (14:06 +0800)]
[PowerPC] Enable track-subreg-liveness by default

This option helps some MMA related cases to reduce unnecessary copies.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D108902

22 months ago[mlir][py] Fix negative cached value in attribute builder
Jacques Pienaar [Wed, 28 Dec 2022 05:56:58 +0000 (21:56 -0800)]
[mlir][py] Fix negative cached value in attribute builder

Previously this was incorrectly assigning py::none where a function was
expected, which resulted in a failure if one used a non-attribute for
an attribute without a registered builder.

22 months ago[MC][BPF] Add bpf guard for MC test data-section-prefix.ll
Yonghong Song [Wed, 28 Dec 2022 03:57:30 +0000 (19:57 -0800)]
[MC][BPF] Add bpf guard for MC test data-section-prefix.ll

Commit f27c4903c43b ("MC: Add .data. and .rodata. prefixes to MCContext
section classification") added a test assuming the bpf target. But it
is possible that the bpf target is not configured in the clang build.
Let us add an explicit bpf-target requirement for the test
so the test can be ignored properly for a clang build without the
bpf target enabled.

22 months ago[mlir][vector] Fix typo, NFC.
jacquesguan [Tue, 27 Dec 2022 04:45:06 +0000 (12:45 +0800)]
[mlir][vector] Fix typo, NFC.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140681

22 months ago[OpenMP] Introduce basic JIT support to OpenMP target offloading
Shilei Tian [Wed, 28 Dec 2022 03:18:57 +0000 (22:18 -0500)]
[OpenMP] Introduce basic JIT support to OpenMP target offloading

This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs.

The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (add a pre process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. https://github.com/shiltian/llvm-project/commit/02bc7effccc6ff2f5ab3fe5218336094c0485766#diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432 shows how it roughly works.

As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later.

In order to enable JIT mode, when compiling, `-foffload-lto` is needed, and when linking, `-foffload-lto -Wl,--embed-bitcode` is needed. This implies that LTO is required to enable JIT mode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D139287

22 months agoRevert "[OpenMP] Introduce basic JIT support to OpenMP target offloading"
Shilei Tian [Wed, 28 Dec 2022 02:52:07 +0000 (21:52 -0500)]
Revert "[OpenMP] Introduce basic JIT support to OpenMP target offloading"

This reverts commit 58906e4901ec5b7ed230d7fa96123654f6a974af because it breaks AMD's buildbot.

22 months ago[LV] Remove duplicate name set of vector header basic block. NFC
Michael Maitland [Fri, 16 Dec 2022 21:44:52 +0000 (13:44 -0800)]
[LV] Remove duplicate name set of vector header basic block. NFC

The preheader was named explicitly in 256c6b0ba14e8a7ab6373b61b7193ea8c0a3651c
which makes setting the name in prior commit 95b2aa511eea1f31e183a2a3aed4d2aa852d089c
unnecessary.

Differential Revision: https://reviews.llvm.org/D140246

22 months ago[LLDB][LoongArch] Optimize EmulateInstructionLoongArch related code
Hui Li [Wed, 28 Dec 2022 01:12:09 +0000 (09:12 +0800)]
[LLDB][LoongArch] Optimize EmulateInstructionLoongArch related code

This is a code optimization patch that does not include feature additions
or deletions.

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D140616

22 months ago[mlir][sparse] move emitter ownership into environment
Aart Bik [Tue, 27 Dec 2022 23:47:02 +0000 (15:47 -0800)]
[mlir][sparse] move emitter ownership into environment

last bits and pieces of the environment refactoring

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D140709

22 months ago[SCEV] Properly clean up duplicated FoldCacheUser ID entries.
Florian Hahn [Wed, 28 Dec 2022 00:09:52 +0000 (00:09 +0000)]
[SCEV] Properly clean up duplicated FoldCacheUser ID entries.

The current code did not properly handle duplicated FoldCacheUser ID
entries when overwriting an existing entry in the FoldCache.

This triggered verification failures reported by @uabelho and #59721.

The patch fixes that by removing stale IDs when overwriting an existing
entry in the cache.

Fixes #59721.

22 months ago[OpenMP] Introduce basic JIT support to OpenMP target offloading
Shilei Tian [Wed, 28 Dec 2022 00:07:24 +0000 (19:07 -0500)]
[OpenMP] Introduce basic JIT support to OpenMP target offloading

This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs.

The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (add a pre process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. https://github.com/shiltian/llvm-project/commit/02bc7effccc6ff2f5ab3fe5218336094c0485766#diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432 shows how it roughly works.

As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later.

In order to enable JIT mode, when compiling, `-foffload-lto` is needed, and when linking, `-foffload-lto -Wl,--embed-bitcode` is needed. This implies that LTO is required to enable JIT mode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D139287

22 months agoMC: Add .data. and .rodata. prefixes to MCContext section classification
Dave Marchevsky [Tue, 27 Dec 2022 23:56:47 +0000 (15:56 -0800)]
MC: Add .data. and .rodata. prefixes to MCContext section classification

Commit 463da422f019 ("MC: make section classification a bit more
thorough") changed MCContext::getELFSection section classification logic
to default to SectionKind::getText (previously default was
SectionKind::getReadOnly) and added some matching based on section name
to determine internal section classification.

The BPF runtime implements global variables using 'BPF map'
datastructures, specifically the arraymap BPF map type. Global variables
in a section are placed in a single arraymap value at predictable byte
offsets. Variables in different sections are placed in separate
arraymaps, so in this example:

  #define SEC(name) __attribute__((section(name)))
  SEC(".data.A") u32 one;
  SEC(".data.A") u32 two;
  SEC(".data.B") u32 three;
  SEC(".data.B") u32 four;

variables one and two would correspond to some byte offsets (probably 0
and 4) in one arraymap, while three and four would be in a separate
arraymap. Variables of a bpf_spin_lock type are considered to protect
next-generation BPF datastructure types in the same arraymap value and
there can only be a single bpf_spin_lock variable per arraymap value -
and thus per section.

As a result it's necessary to keep bpf_spin_locks and the datastructures
they guard in separate data sections. Before the aforementioned commit,
a section whose name starts with ".data." - like ".data.A" - would be
classified as SectionKind::getReadOnly, whereas after it is
SectionKind::getText. If 4-byte padding is required in such a section due to
alignment of some symbol within it, classification of the section as
SectionKind::getText will cause compilation of those variables for the
BPF backend to fail with an error like "unable to write nop sequence of
4 bytes". This is because the nop instruction emitted in
BPFAsmBackend::writeNopData is 8 bytes, so the function fails since
it cannot emit a 4-byte nop instruction.

Let's follow the pattern of matching section names starting with ".bss."
and ".tbss." prefixes resulting in proper classification of the section
as data by adding similar matches for ".data." and ".rodata." prefixes.
This will bring padding behavior for these sections back to what it was
before that commit and fix the crash.
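
For illustration only (not the actual MCContext code): a minimal C++17 sketch of prefix-based section classification as described above. The `SectionClass` enum, `hasPrefix` and `classifyByName` are hypothetical stand-ins for the real `SectionKind` logic, and the fall-through to text models the default the message attributes to commit 463da422f019.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical, simplified stand-in for LLVM's SectionKind.
enum class SectionClass { Text, Data, ReadOnly, BSS, ThreadBSS };

// True when Name begins with Prefix.
constexpr bool hasPrefix(std::string_view Name, std::string_view Prefix) {
  return Name.substr(0, Prefix.size()) == Prefix;
}

// Classify a section purely by its name prefix; anything unmatched falls
// through to Text.
constexpr SectionClass classifyByName(std::string_view Name) {
  if (hasPrefix(Name, ".bss."))
    return SectionClass::BSS;
  if (hasPrefix(Name, ".tbss."))
    return SectionClass::ThreadBSS;
  if (hasPrefix(Name, ".data.")) // match added in the spirit of this patch
    return SectionClass::Data;
  if (hasPrefix(Name, ".rodata.")) // match added in the spirit of this patch
    return SectionClass::ReadOnly;
  return SectionClass::Text;
}

static_assert(classifyByName(".data.A") == SectionClass::Data,
              ".data.A is treated as data, not text");
static_assert(classifyByName(".rodata.str") == SectionClass::ReadOnly,
              ".rodata.* is treated as read-only data");
static_assert(classifyByName(".mysection") == SectionClass::Text,
              "unmatched names fall back to text");

inline void checkClassification() {
  assert(classifyByName(".data.B") == SectionClass::Data);
  assert(classifyByName(".bss.x") == SectionClass::BSS);
}
```

With ".data.A" classified as data, padding in that section is plain fill bytes rather than nop instructions, which matches the behavior the message says is restored.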

Differential Revision: https://reviews.llvm.org/D138477

22 months ago[libc++] Remove self-include from header file NFC
Weverything [Tue, 27 Dec 2022 23:23:28 +0000 (15:23 -0800)]
[libc++] Remove self-include from header file NFC

22 months ago[IVUsers] Precommit test for zext SCEV invalidation issue.
Florian Hahn [Tue, 27 Dec 2022 23:24:21 +0000 (23:24 +0000)]
[IVUsers] Precommit test for zext SCEV invalidation issue.

Test case for issue reported by @uabelho and #59721

22 months ago[LV] Convert a few tests to use opaque pointers (NFC).
Florian Hahn [Tue, 27 Dec 2022 23:01:40 +0000 (23:01 +0000)]
[LV] Convert a few tests to use opaque pointers (NFC).

22 months agoReland "[AArch64] FMV support and necessary target features dependencies."
Pavel Iliin [Wed, 21 Dec 2022 11:29:53 +0000 (11:29 +0000)]
Reland "[AArch64] FMV support and necessary target features dependencies."

This relands commits e43924a75145d2f9e722f74b673145c3e62bfd07,
a43f36142c501e2d3f4797ef938db4e0c5e0eeec,
bf94eac6a3f7c5cd8941956d44c15524fa3751bd with MSan buildbot
https://lab.llvm.org/buildbot/#/builders/5/builds/30139
use-of-uninitialized-value errors fixed.

Differential Revision: https://reviews.llvm.org/D127812

22 months ago[mlir][sparse] refactoring loop emitter into its own files.
Peiming Liu [Tue, 27 Dec 2022 18:23:21 +0000 (18:23 +0000)]
[mlir][sparse] refactoring loop emitter into its own files.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D140701

22 months ago[sparse][mlir][vectorization] add support for shift-by-invariant
Aart Bik [Fri, 23 Dec 2022 01:20:52 +0000 (17:20 -0800)]
[sparse][mlir][vectorization] add support for shift-by-invariant

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D140596

22 months ago[LV] Sink scalar operands and merge regions repeatedly.
Florian Hahn [Tue, 27 Dec 2022 18:08:31 +0000 (18:08 +0000)]
[LV] Sink scalar operands and merge regions repeatedly.

Merging regions can enable new sinking opportunities (e.g. if users of a
scalar value are moved from different VPBBs into the same VPBB). Sinking
in turn can also enable new merging opportunities (e.g. if a recipe
between two merge-able regions is moved).

To enable more sinking opportunities, repeat sinking & merging if
regions could be merged.

Also fix mergeReplicateRegions to return the correct Changed status.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D139788
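
The repeat-until-stable idea described in this commit can be sketched on a toy model. This is only an illustration, not the VPlan code: the list-of-tuples "plan" and the helpers `sink_recipes`, `merge_regions` and `sink_and_merge` are hypothetical names invented for the example. Sinking is modelled as moving a recipe out from between two regions with the same mask, and the driver repeats both steps for as long as a merge happened.

```python
# Toy model (assumption): a plan is a list of ("region", mask) or ("recipe", name).

def sink_recipes(plan):
    """Move one recipe that sits between two same-mask regions to the end."""
    for i in range(1, len(plan) - 1):
        prev, cur, nxt = plan[i - 1], plan[i], plan[i + 1]
        if (cur[0] == "recipe" and prev[0] == "region"
                and nxt[0] == "region" and prev[1] == nxt[1]):
            plan.append(plan.pop(i))
            return True
    return False


def merge_regions(plan):
    """Merge adjacent regions with the same mask; report whether anything changed."""
    changed = False
    i = 0
    while i + 1 < len(plan):
        a, b = plan[i], plan[i + 1]
        if a[0] == "region" and b[0] == "region" and a[1] == b[1]:
            del plan[i + 1]
            changed = True
        else:
            i += 1
    return changed


def sink_and_merge(plan):
    """Repeat sinking and merging while merging keeps succeeding."""
    changed = False
    while True:
        changed |= sink_recipes(plan)
        merged = merge_regions(plan)
        changed |= merged
        if not merged:
            return changed


plan = [("region", "m"), ("recipe", "a"), ("region", "m"),
        ("recipe", "b"), ("region", "m")]
changed = sink_and_merge(plan)
```

In this toy run a single sink-then-merge round would leave two regions unmerged; the second round only becomes possible after the first merge, which is the interaction the commit message describes. The driver also returns an accurate changed flag, mirroring the fix to the Changed status.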

22 months ago[RISCV] Use SmallVector::append to replace some for loops in intrinsic creation. NFC
Craig Topper [Tue, 27 Dec 2022 17:41:29 +0000 (09:41 -0800)]
[RISCV] Use SmallVector::append to replace some for loops in intrinsic creation. NFC

Reviewed By: eopXD

Differential Revision: https://reviews.llvm.org/D140678

22 months ago[AArch64][MachineScheduler] Set no side effect for movprfx
zhongyunde [Tue, 27 Dec 2022 17:17:55 +0000 (01:17 +0800)]
[AArch64][MachineScheduler] Set no side effect for movprfx

movprfx is a vector copy, so it doesn't access memory. Set hasSideEffects
to 0 so that hasUnmodeledSideEffects() no longer returns true, which
blocks the machine scheduler from reordering it with load/store instructions.

Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D140680

22 months ago[mlir] Fix missing OpInterface docs newline
Schuyler Eldridge [Thu, 22 Dec 2022 22:34:27 +0000 (17:34 -0500)]
[mlir] Fix missing OpInterface docs newline

Fix incorrect markdown generated by mlir-tblgen for an InterfaceMethod
that includes a body.  Previously, this would cause the next method to
show up on the same line and produce incorrect markdown.  Newlines would
only be added if the method did _not_ provide a body.  E.g., previously
this was generating markdown like:

    some function comment#### `next method`

This change makes this generate as:

    some function comment

    #### `next method`

Signed-off-by: Schuyler Eldridge <schuyler.eldridge@sifive.com>
Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D140590
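
The formatting rule this commit describes can be illustrated with a small stand-alone sketch. It is not the mlir-tblgen implementation; `emit_interface_docs` and its `(name, description)` input are hypothetical and exist only to show the intended output shape, where every method description is followed by a blank line before the next `####` heading.

```python
def emit_interface_docs(methods):
    """Render (name, description) pairs as markdown sections.

    Hypothetical helper: a blank line is always emitted after the
    description, whether or not the method has a body.
    """
    out = []
    for name, description in methods:
        out.append(f"#### `{name}`\n\n")
        out.append(description.rstrip("\n"))
        out.append("\n\n")  # separator so the next heading starts on its own line
    return "".join(out)


docs = emit_interface_docs([
    ("first method", "some function comment"),
    ("next method", "another comment"),
])
```

Emitting the separator unconditionally avoids the failure mode above, where the newline depended on whether the method provided a body.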

22 months ago[InstCombine] Remove redundant evaluateGEPOffsetExpression() fold (NFCI)
Nikita Popov [Tue, 27 Dec 2022 16:00:24 +0000 (17:00 +0100)]
[InstCombine] Remove redundant evaluateGEPOffsetExpression() fold (NFCI)

If we go through the generic EmitGEPOffset code, the resulting
expression can be (and is) reduced in the same way this code did
manually. There are no changes in lit tests or llvm-test-suite.

This fold predates the time when we started adding nsw to the adds
created by EmitGEPOffset, so it was likely needed back then.

This might not actually be NFC due to worklist order changes etc.

22 months ago[InstCombine] Convert test to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 15:34:04 +0000 (16:34 +0100)]
[InstCombine] Convert test to opaque pointers (NFC)

Slightly adjust the test so it uses non-zero GEP indices, otherwise
these would get folded away with opaque pointers.

22 months ago[InstCombine] Convert some tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 15:22:10 +0000 (16:22 +0100)]
[InstCombine] Convert some tests to opaque pointers (NFC)

Check lines for these were regenerated, but without any
significant changes (mostly different GEP source element types).

22 months ago[InstCombine] Regenerate test checks (NFC)
Nikita Popov [Tue, 27 Dec 2022 15:21:31 +0000 (16:21 +0100)]
[InstCombine] Regenerate test checks (NFC)

22 months agoRevert "[InstCombine] Convert some tests to opaque pointers (NFC)"
Nikita Popov [Tue, 27 Dec 2022 15:19:13 +0000 (16:19 +0100)]
Revert "[InstCombine] Convert some tests to opaque pointers (NFC)"

This reverts commit 66cea84681e16f3d4ebdc69031824b114a0d5681.

I did not intend to commit all the changes in here, but only the
ones with no significant differences.

22 months ago[InstCombine] Convert some tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 14:49:19 +0000 (15:49 +0100)]
[InstCombine] Convert some tests to opaque pointers (NFC)

22 months ago[mlir][Linalg] Properly propagate transform result in ScalarizeOp
Nicolas Vasilache [Tue, 27 Dec 2022 14:14:58 +0000 (06:14 -0800)]
[mlir][Linalg] Properly propagate transform result in ScalarizeOp

22 months ago[RS4GC] Rematerialize derived pointers before uses.
Denis Antrushin [Tue, 29 Nov 2022 11:47:35 +0000 (18:47 +0700)]
[RS4GC] Rematerialize derived pointers before uses.

Introduce an option to rematerialize derived pointers immediately
before their uses instead of after every statepoint. This can be
beneficial when a pointer is live across many statepoints but has
few uses.
The initial implementation is simple and rematerializes the derived
pointer before every use, even if there are several uses in the same
block or the rematerialization instructions could be hoisted.
The transformation is considered profitable if we would insert fewer
instructions than we would insert after every live statepoint.

Depends on D138910, D138911

Reviewed By: anna, skatkov

Differential Revision: https://reviews.llvm.org/D138912
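
The profitability rule stated in this commit can be written down as a small comparison. This is a sketch under assumptions, not the RS4GC code: the function name, its parameters, and the cost model (each rematerialization re-emits the whole chain of `chain_length` instructions) are hypothetical.

```python
def remat_before_uses_is_profitable(chain_length, num_uses, num_live_statepoints):
    """Compare instruction counts for the two placement strategies.

    Assumption: every rematerialization costs `chain_length` instructions,
    with no hoisting or sharing between uses in the same block.
    """
    inserted_before_uses = chain_length * num_uses
    inserted_after_statepoints = chain_length * num_live_statepoints
    return inserted_before_uses < inserted_after_statepoints


few_uses = remat_before_uses_is_profitable(2, num_uses=1, num_live_statepoints=5)
many_uses = remat_before_uses_is_profitable(2, num_uses=6, num_live_statepoints=1)
```

Under this model a pointer with one use that is live across five statepoints qualifies, while one with six uses and a single live statepoint does not.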

22 months ago[SLP]Fix PR59693: Do not crash trying to set insert point for buildvector
Alexey Bataev [Tue, 27 Dec 2022 13:12:16 +0000 (05:12 -0800)]
[SLP]Fix PR59693: Do not crash trying to set insert point for buildvector
of extractvalues.

Only vectorized extractvalues can skip getting the last instruction;
for gathered extractvalues (a buildvector sequence), the insertion
point is still needed.

22 months ago[mlir] NFC - Expose scf::canonicalizeMinMaxOp
Nicolas Vasilache [Tue, 27 Dec 2022 13:47:02 +0000 (05:47 -0800)]
[mlir] NFC - Expose scf::canonicalizeMinMaxOp

22 months agoReapply [MergedLoadStoreMotion] Convert tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 11:20:05 +0000 (12:20 +0100)]
Reapply [MergedLoadStoreMotion] Convert tests to opaque pointers (NFC)

Reapply after reapplying dependent revision.

22 months ago[LoadStoreVectorizer] Convert tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 12:09:25 +0000 (13:09 +0100)]
[LoadStoreVectorizer] Convert tests to opaque pointers (NFC)

22 months ago[LoadStoreVectorize] Regenerate test checks (NFC)
Nikita Popov [Tue, 27 Dec 2022 11:57:52 +0000 (12:57 +0100)]
[LoadStoreVectorize] Regenerate test checks (NFC)

22 months ago[LoadStoreVectorizer] Convert some tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 11:53:30 +0000 (12:53 +0100)]
[LoadStoreVectorizer] Convert some tests to opaque pointers (NFC)

22 months ago[LoopBoundSplit] Convert tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 11:52:22 +0000 (12:52 +0100)]
[LoopBoundSplit] Convert tests to opaque pointers (NFC)

22 months agoReapply [MergeLoadStoreMotion] Don't require GEP for sinking
Nikita Popov [Tue, 27 Dec 2022 11:17:58 +0000 (12:17 +0100)]
Reapply [MergeLoadStoreMotion] Don't require GEP for sinking

Reapply with a fix for a failing debuginfo assignment tracking test.

-----

Allow sinking stores where both operands are the same; don't require
them to have an identical GEP in each block.

This came up when migrating tests to opaque pointers, where
zero-index GEPs are omitted.

22 months ago[GVNHoist] Make test more robust (NFC)
Nikita Popov [Tue, 27 Dec 2022 11:44:17 +0000 (12:44 +0100)]
[GVNHoist] Make test more robust (NFC)

Make sure these stores cannot be sunk, which might defeat the
intent of the test.

22 months agoRevert "[MergeLoadStoreMotion] Don't require GEP for sinking"
Nikita Popov [Tue, 27 Dec 2022 11:37:49 +0000 (12:37 +0100)]
Revert "[MergeLoadStoreMotion] Don't require GEP for sinking"

I missed a test failure in the DebugInfo directory.

This reverts commit 2c15b9d9e1a898cfd849db81b36d278eac3ef24e.
This reverts commit fb435e1cb5842e1437436e9e7378dfc4106fdad8.

22 months ago[MergedLoadStoreMotion] Convert tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 11:20:05 +0000 (12:20 +0100)]
[MergedLoadStoreMotion] Convert tests to opaque pointers (NFC)

22 months ago[MergeLoadStoreMotion] Don't require GEP for sinking
Nikita Popov [Tue, 27 Dec 2022 11:17:58 +0000 (12:17 +0100)]
[MergeLoadStoreMotion] Don't require GEP for sinking

Allow sinking stores where both operands are the same; don't require
them to have an identical GEP in each block.

This came up when migrating tests to opaque pointers, where
zero-index GEPs are omitted.

22 months ago[MergedLoadStoreMotion] Add tests for store without GEPs (NFC)
Nikita Popov [Tue, 27 Dec 2022 11:03:59 +0000 (12:03 +0100)]
[MergedLoadStoreMotion] Add tests for store without GEPs (NFC)

MergedLoadStoreMotion currently only handles the case where each
store has its own GEP. It fails to handle the case where the
store argument is exactly the same.

22 months ago[Tests] Rename InstMerge -> MergedLoadStoreMotion (NFC)
Nikita Popov [Tue, 27 Dec 2022 10:52:33 +0000 (11:52 +0100)]
[Tests] Rename InstMerge -> MergedLoadStoreMotion (NFC)

These are tests for the MergedLoadStoreMotion pass, so name them
accordingly.

22 months ago[reland][libc][NFC] Add -fno-lax-vector-conversions compilation flag
Guillaume Chatelet [Tue, 27 Dec 2022 08:25:32 +0000 (08:25 +0000)]
[reland][libc][NFC] Add -fno-lax-vector-conversions compilation flag

Now that a3d2c344773cc4fc95136fd67245880b34d8e335 has been submitted.

22 months ago[libc][NFC] Fix lax vector conversion for aarch64
Guillaume Chatelet [Tue, 27 Dec 2022 10:16:23 +0000 (10:16 +0000)]
[libc][NFC] Fix lax vector conversion for aarch64

22 months ago[InterleavedAccess] Convert tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 09:57:34 +0000 (10:57 +0100)]
[InterleavedAccess] Convert tests to opaque pointers (NFC)

22 months ago[LCSSA] Convert tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 09:56:49 +0000 (10:56 +0100)]
[LCSSA] Convert tests to opaque pointers (NFC)

22 months ago[Internalize] Convert tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 09:54:35 +0000 (10:54 +0100)]
[Internalize] Convert tests to opaque pointers (NFC)

22 months ago[InferFunctionAttrs] Convert tests to opaque pointers (NFC)
Nikita Popov [Tue, 27 Dec 2022 09:53:42 +0000 (10:53 +0100)]
[InferFunctionAttrs] Convert tests to opaque pointers (NFC)