platform/upstream/llvm.git
15 months ago[scudo] Explicit casting for u16 arithmetic operation
Chia-hung Duan [Fri, 7 Jul 2023 19:57:47 +0000 (19:57 +0000)]
[scudo] Explicit casting for u16 arithmetic operation

This fixes the werror from https://lab.llvm.org/buildbot/#/builders/165/builds/38829

Reviewed By: enh

Differential Revision: https://reviews.llvm.org/D154733
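
For context, a minimal sketch of why u16 arithmetic needs an explicit cast. The helper name `addU16` is hypothetical and is not taken from the scudo sources; the exact expression changed by the patch is not shown in this log. The sketch assumes a conversion warning such as -Wconversion promoted to an error by -Werror.

```cpp
#include <cstdint>

using u16 = std::uint16_t;

// Hypothetical helper, not scudo code. Both operands are promoted to int
// before the addition, so the sum has type int. Converting it back to u16
// implicitly is a narrowing conversion that conversion warnings report.
// The explicit cast states that truncation to 16 bits is intended.
inline u16 addU16(u16 A, u16 B) {
  return static_cast<u16>(A + B);
}
```

The cast does not change the generated result; it only makes the truncation visible to the compiler and to readers.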

15 months ago[LV] Skip VFs < iterations remaining for epilogue vectorization.
Florian Hahn [Fri, 7 Jul 2023 19:33:41 +0000 (20:33 +0100)]
[LV] Skip VFs < iterations remaining for epilogue vectorization.

If a candidate VF for epilogue vectorization is less than the number of
remaining iterations, the epilogue loop would be dead. Skip such factors.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D154264

15 months agoAMDGPU: Delete custom combine on class intrinsic
Matt Arsenault [Tue, 27 Jun 2023 14:59:57 +0000 (10:59 -0400)]
AMDGPU: Delete custom combine on class intrinsic

This is no longer necessary as class-with-constant will always be
transformed to the generic class intrinsic.

https://reviews.llvm.org/D153901

15 months agoclang: Stop emitting "strictfp"
Matt Arsenault [Wed, 7 Dec 2022 23:50:42 +0000 (18:50 -0500)]
clang: Stop emitting "strictfp"

The attribute is a proper enum attribute, strictfp. We were getting
strictfp and "strictfp" set on every function with
-fexperimental-strict-floating-point.

https://reviews.llvm.org/D139629

15 months agoclang: Regenerate test checks
Matt Arsenault [Fri, 7 Jul 2023 19:25:40 +0000 (15:25 -0400)]
clang: Regenerate test checks

15 months ago[Libomptarget] Refine logic for determining if we support RPC
Joseph Huber [Fri, 7 Jul 2023 18:18:41 +0000 (13:18 -0500)]
[Libomptarget] Refine logic for determining if we support RPC

Summary:
Add a requirement for the GPU libc to only be on if it's enabled
explicitly. Fix the logic around the pythonification of the variable.

15 months ago[mlir] add a simple gpu barrier elimination mechanism
Alex Zinenko [Fri, 7 Jul 2023 14:35:04 +0000 (14:35 +0000)]
[mlir] add a simple gpu barrier elimination mechanism

GPU code generation, and specifically the shared memory copy insertion
may introduce spurious barriers guarding read-after-read dependencies or
read-after-write on non-aliasing data, which degrades performance due to
unnecessary synchronization. Add a pattern and transform op that removes
such barriers by analyzing memory effects that the barrier actually
guards that are not also guarded by other barriers. The code is adapted
from the Polygeist incubator project.

Co-authored-by: William Moses <gh@wsmoses.com>
Co-authored-by: Ivan Radanov Ivanov <ivanov.i.aa@m.titech.ac.jp>
Reviewed By: nicolasvasilache, wsmoses

Differential Revision: https://reviews.llvm.org/D154720

15 months ago[ASAN] Disable mmap test on Windows.
Kirill Stoimenov [Fri, 7 Jul 2023 18:45:26 +0000 (18:45 +0000)]
[ASAN] Disable mmap test on Windows.

15 months agoReland "[PowerPC] Remove extend between shift and and"
Nemanja Ivanovic [Fri, 7 Jul 2023 16:03:28 +0000 (12:03 -0400)]
Reland "[PowerPC] Remove extend between shift and and"

The commit originally caused a bootstrap failure on the big endian
PPC bot as the combine was interfering with the legalizer when
applied on illegal types. This update restricts the combine to
the only types for which it is actually needed. Tested on PPC BE
bootstrap locally.

15 months agoNew features and bug fix in MLIR test generation tool
Rafael Ubal Tena [Fri, 7 Jul 2023 17:52:36 +0000 (10:52 -0700)]
New features and bug fix in MLIR test generation tool

- Option `--variable_names <names>` allows the user to pass names for FileCheck
  regexps representing variables. Variable names are separated by commas, and
  empty names can be used to generate specific variable names automatically.
  For example, `--variable_names arg_0,arg_1,,,result` will produce regexp names
  `ARG_0`, `ARG_1`, `VAR_0`, `VAR_1`, `RESULT`, `VAR_2`, `VAR_3`, ...

- Option '--attribute_names <names>' can be used to generate global regexp names
  to represent attributes. Useful for affine maps. Same behavior as
  '--variable_names'.

- Bug fixed for scope detection of SSA variables in ops with nested regions that
  return SSA values (e.g., 'linalg.generic'). Originally, returned SSA values were
  inserted in the nested scope.
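
The naming rule from the first bullet can be sketched as follows. This is an illustration only: the tool itself is a Python script, and the functions `splitNames` and `regexpNames` are hypothetical names that do not come from it. The sketch assumes that user-supplied names are upper-cased and that empty or missing entries fall back to an automatically numbered `VAR_n`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Split on commas, keeping empty fields so that positions are preserved.
std::vector<std::string> splitNames(const std::string &Csv) {
  std::vector<std::string> Out;
  std::string Cur;
  for (char C : Csv) {
    if (C == ',') {
      Out.push_back(Cur);
      Cur.clear();
    } else {
      Cur.push_back(C);
    }
  }
  Out.push_back(Cur);
  return Out;
}

// Return the regexp names for the first Count variables.
std::vector<std::string> regexpNames(const std::string &Csv,
                                     std::size_t Count) {
  std::vector<std::string> Requested = splitNames(Csv);
  std::vector<std::string> Result;
  unsigned NextAuto = 0;
  for (std::size_t I = 0; I < Count; ++I) {
    if (I < Requested.size() && !Requested[I].empty()) {
      // A user-supplied name is used in upper case.
      std::string Upper;
      for (unsigned char C : Requested[I])
        Upper.push_back(static_cast<char>(std::toupper(C)));
      Result.push_back(Upper);
    } else {
      // Empty or missing entries get the next automatic name.
      Result.push_back("VAR_" + std::to_string(NextAuto++));
    }
  }
  return Result;
}
```

Calling `regexpNames("arg_0,arg_1,,,result", 7)` reproduces the sequence given in the first bullet.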

This version of the tool has been used to generate unit tests for the following
patch: https://reviews.llvm.org/D153291

For example, the main body of the test named 'test_select_2d_one_dynamic' was
generated using the following command:

```
$ mlir-opt -pass-pipeline='builtin.module(func.func(tosa-to-linalg))' test_select_2d_one_dynamic.tosa.mlir | generate-test-checks.py --attribute_names map0,map1,map2 --variable_names arg0,arg1,arg2,const1,arg0_dim1,arg1_dim1,,arg2_dim1,max_dim1,,,arg0_broadcast,,,,,,,arg1_broadcast,,,,,,,arg2_broadcast,,,,,,result
```

Reviewed By: eric-k256

Differential Revision: https://reviews.llvm.org/D154458

15 months ago[gn build] Port b4a62b1fa546
LLVM GN Syncbot [Fri, 7 Jul 2023 17:56:47 +0000 (17:56 +0000)]
[gn build] Port b4a62b1fa546

15 months ago[Libomptarget] Fix test logic for optionally adding the libcgpu.a
Joseph Huber [Fri, 7 Jul 2023 17:49:09 +0000 (12:49 -0500)]
[Libomptarget] Fix test logic for optionally adding the libcgpu.a

Summary:
This was not operating as expected and was causing the build to fail on
non-configured systems.

15 months ago[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs
Christudasan Devadasan [Wed, 17 May 2023 05:46:58 +0000 (11:16 +0530)]
[AMDGPU][SILowerSGPRSpills] Spill SGPRs to virtual VGPRs

Currently, the custom SGPR spill lowering pass spills
SGPRs into physical VGPR lanes and the remaining VGPRs
are used by regalloc for vector regclass allocation.
This imposes many restrictions: SGPR spilling can fail
when there are not enough VGPRs, and we are then forced
to spill the leftover into memory during PEI. The custom
spill handling during PEI has many edge cases and breaks
the compiler from time to time.

This patch implements spilling SGPRs into virtual VGPR
lanes. Since we now split the register allocation for
SGPRs and VGPRs, the virtual registers introduced for
the spill lanes would get allocated automatically in
the subsequent regalloc invocation for VGPRs.

Spilling to virtual registers will always be successful,
even in high-pressure situations, and hence it avoids
most of the edge cases during PEI. We are now left with
only the custom SGPR spills during PEI for special registers
like the frame pointer, which is an unproblematic case.

Differential Revision: https://reviews.llvm.org/D124196

15 months ago[Libomptarget] Begin implementing support for RPC services
Joseph Huber [Sat, 1 Jul 2023 03:08:42 +0000 (22:08 -0500)]
[Libomptarget] Begin implementing support for RPC services

This patch adds the initial support for running an RPC server in
libomptarget to handle host services. We interface with the library
provided by the `libc` project to stand up a basic server. We introduce
a new type that is controlled by the plugin and has each device
initialize its interface. We then run a basic server to check the RPC
buffer.

This patch does not fully implement the interface. In the future each
plugin will want to define special handlers via the interface to support
things like malloc or H2D copies coming from RPC. We will also want to
allow the plugin to specify the number of ports. This is currently
capped in the implementation but will be adjusted soon.

Right now running the server is handled by whatever thread ends up doing
the waiting. This is probably not a completely sound solution but I am
not overly familiar with the behaviour of OpenMP tasks and what would be
required here. This works okay with synchronous regions, and somewhat
fine with `nowait` regions, but I've observed some weird behavior when
one of those regions calls `exit`.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D154312

15 months ago[PEI][Mips] Switch to backwards frame index elimination
Jay Foad [Thu, 8 Jun 2023 08:50:47 +0000 (09:50 +0100)]
[PEI][Mips] Switch to backwards frame index elimination

This adds support for running PEI::replaceFrameIndicesBackward with no
RegisterScavenger, and basic support for eliminating call frame pseudo
instructions.

Differential Revision: https://reviews.llvm.org/D154347

15 months ago[PEI] Simplify iterator handling in replaceFrameIndicesBackward. NFCI.
Jay Foad [Tue, 6 Jun 2023 14:14:53 +0000 (15:14 +0100)]
[PEI] Simplify iterator handling in replaceFrameIndicesBackward. NFCI.

Differential Revision: https://reviews.llvm.org/D154346

15 months ago[AMDGPU] Enable whole wave register copy
Christudasan Devadasan [Fri, 7 Jul 2023 17:28:06 +0000 (22:58 +0530)]
[AMDGPU] Enable whole wave register copy

So far, we haven't exposed the allocation of whole-wave
registers to regalloc. We hand-picked them for various
whole wave mode operations. With a future patch, we
want the allocator to efficiently allocate them rather
than using the custom pre-allocation pass.

Any liverange split of virtual registers involved in
whole-wave operations requires the resulting COPY
introduced with the split to be performed for all
lanes. It isn't implemented in the compiler yet.

This patch would identify all such copies and
manipulate the exec mask around them to enable all
lanes without affecting the value of exec mask
elsewhere.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D143762

15 months ago[scudo] Allow pushing single block to the freelist of BatchClass
Chia-hung Duan [Fri, 7 Jul 2023 17:27:46 +0000 (17:27 +0000)]
[scudo] Allow pushing single block to the freelist of BatchClass

This CL removes the restriction that pushing blocks into BatchClassId
can only be done when freelist is not empty. Without this constraint,
BatchClassId is also available for gathering blocks into groups.

Reviewed By: cferris

Differential Revision: https://reviews.llvm.org/D153492

15 months ago[AMDGPU] Implement whole wave register spill
Christudasan Devadasan [Fri, 7 Jul 2023 17:21:32 +0000 (22:51 +0530)]
[AMDGPU] Implement whole wave register spill

To reduce the register pressure during allocation,
when the allocator spills a virtual register that
corresponds to a whole wave mode operation, the
spill loads and restores should be activated for
all lanes by temporarily flipping all bits in exec
register to one just before the spills. It is not
implemented in the compiler as of today and this
patch enables the necessary support.

This is a pre-patch before the SGPR spill to virtual
VGPR lanes that would eventually cause the whole
wave register spills during allocation.

Reviewed By: arsenm, cdevadas

Differential Revision: https://reviews.llvm.org/D143759

15 months ago[lldb][NFCI] Remove use of Stream * from TypeSystem
Alex Langford [Wed, 5 Jul 2023 21:56:25 +0000 (14:56 -0700)]
[lldb][NFCI] Remove use of Stream * from TypeSystem

We always assume these streams are valid, might as well take references
instead of raw pointers.

Differential Revision: https://reviews.llvm.org/D154549

15 months ago[CodeGen]Allow targets to use target specific COPY instructions for live range splitting
Yashwant Singh [Fri, 7 Jul 2023 16:59:30 +0000 (22:29 +0530)]
[CodeGen]Allow targets to use target specific COPY instructions for live range splitting

Replacing D143754. Right now the LiveRangeSplitting during register allocation uses
the TargetOpcode::COPY instruction for splitting. For the AMDGPU target that creates a
problem as we have both vector and scalar copies. Vector copies perform a copy over
a vector register but only on the lanes (threads) that are active. This is mostly sufficient;
however, we do run into cases when we have to copy the entire vector register and
not just active lane data. One major place where we need that is live range splitting.

Allowing targets to use their own copy instructions (if defined) will provide a lot of
flexibility and make it easier to lower these pseudo instructions to correct MIR.

- Introduce getTargetCopyOpcode() virtual function and use it to generate the copy in live range
 splitting.
- Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline.

Reviewed By: arsenm, cdevadas, kparzysz

Differential Revision: https://reviews.llvm.org/D150388

15 months ago[LLVM-C] Use unwrapDI in LLVMDITypeGetName
Tamir Duberstein [Fri, 7 Jul 2023 16:50:52 +0000 (16:50 +0000)]
[LLVM-C] Use unwrapDI in LLVMDITypeGetName

This function otherwise crashes. This behavior has been incorrect since
its introduction in 260b581498bed0b847e7e086852e0082d266711d (D46627).

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D154630

15 months ago[Clang] Emit KCFI type hashes for member functions
Sami Tolvanen [Thu, 6 Jul 2023 23:42:01 +0000 (23:42 +0000)]
[Clang] Emit KCFI type hashes for member functions

With `-fsanitize=kcfi`, Clang currently won't emit type hashes for
C++ member functions, which leads to check failures if they are
indirectly called. As there's no reason to exclude member functions
in CodeGenModule::setKCFIType, emit type hashes also for them to fix
member function pointer calls with KCFI, and add a test to confirm
that types are emitted correctly.

15 months ago[libc] Enable aliasing on AMDGPU targets
Joseph Huber [Fri, 7 Jul 2023 11:43:08 +0000 (06:43 -0500)]
[libc] Enable aliasing on AMDGPU targets

AMDGPU supports aliases now, so we can drop this case and leave it only
for the NVPTX target. Unfortunately it's unlikely that NVPTX will be
able to support this in the future due to their PTX language being very
limited.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D154704

15 months agoAppease clang-tidy
Tamir Duberstein [Fri, 7 Jul 2023 16:28:22 +0000 (16:28 +0000)]
Appease clang-tidy

Address clang-tidy warnings, correct usage text.

Differential Revision: https://reviews.llvm.org/D154661

15 months agorevert filter_iterator_impl::operator++ changes
Alex Brachet [Fri, 7 Jul 2023 16:24:51 +0000 (16:24 +0000)]
revert filter_iterator_impl::operator++ changes

15 months ago[mlir][doc][transform] Fix link to documentation in tutorial. (NFC)
Ingo Müller [Fri, 7 Jul 2023 16:06:00 +0000 (16:06 +0000)]
[mlir][doc][transform] Fix link to documentation in tutorial. (NFC)

Reviewed By: ingomueller-net

Differential Revision: https://reviews.llvm.org/D154724

15 months ago[RISCV] Add a guard condition to orc_b/brev8 handling in ReplaceNodeResults.
Craig Topper [Fri, 7 Jul 2023 15:48:58 +0000 (08:48 -0700)]
[RISCV] Add a guard condition to orc_b/brev8 handling in ReplaceNodeResults.

The orc_b and brev8 intrinsics are type overloaded, but only
i32 and XLen are supported types. The type legalization code in
ReplaceNodeResults only handles the i32 case on RV64. Add some
checks so we will fail type legalization for other types.

15 months ago[mlir] Implement one-to-n structural conversion for ForOp
Spenser Bauman [Fri, 7 Jul 2023 15:50:24 +0000 (15:50 +0000)]
[mlir] Implement one-to-n structural conversion for ForOp

Add the missing one-to-n structural type conversion pattern for the
scf.for operation.

Reviewed By: ingomueller-net

Differential Revision: https://reviews.llvm.org/D154299

15 months ago[ASAN] Add mmap and munmap interceptor in ASAN
Kirill Stoimenov [Thu, 6 Jul 2023 21:17:48 +0000 (21:17 +0000)]
[ASAN] Add mmap and munmap interceptor in ASAN

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D154659

15 months ago[mlir] use shared pointer to prevent vector resizes from destroying ops
Ashay Rane [Fri, 7 Jul 2023 14:54:40 +0000 (09:54 -0500)]
[mlir] use shared pointer to prevent vector resizes from destroying ops

The `MapVector` type stores key-value pairs in a vector, which, when
resized, copies the entries and destroys the old ones.  This causes the
underlying operations to be deleted, subsequently causing segfaults.

This patch makes the `mappings` map type refer to a shared pointer
instead, so that resizing the vector doesn't call the operations'
destructors.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D154511
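
The effect described above can be reproduced with standard containers. The sketch below uses `std::vector` of pairs as a stand-in for `MapVector` storage and a hypothetical `Tracked` type in place of the owned operations; none of these names come from the patch. It counts how many destructors run when the vector reallocates.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an object whose destructor has side effects.
struct Tracked {
  int *Destroyed;
  explicit Tracked(int *Counter) : Destroyed(Counter) {}
  Tracked(const Tracked &) = default;
  ~Tracked() { ++*Destroyed; }
};

// Values stored directly: reallocation copies each entry into the new
// buffer and destroys the old one, so a destructor runs.
int destructorsOnGrowthByValue() {
  int Count = 0;
  std::vector<std::pair<int, Tracked>> Entries;
  Entries.reserve(1);
  Entries.emplace_back(0, Tracked(&Count));
  int Before = Count; // ignore the temporary destroyed by the line above
  Entries.reserve(Entries.capacity() + 1); // forces a reallocation
  return Count - Before;
}

// Values held through shared_ptr: reallocation only moves the pointers,
// so the pointee's destructor does not run.
int destructorsOnGrowthByPointer() {
  int Count = 0;
  std::vector<std::pair<int, std::shared_ptr<Tracked>>> Entries;
  Entries.reserve(1);
  Entries.emplace_back(0, std::make_shared<Tracked>(&Count));
  int Before = Count;
  Entries.reserve(Entries.capacity() + 1); // forces a reallocation
  return Count - Before;
}
```

With by-value storage the growth step runs one destructor; with `shared_ptr` storage it runs none, which is the property the patch relies on.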

15 months ago[RISCV] Remove pseudos for vwcvt.f.x(u) with rounding mode.
Craig Topper [Fri, 7 Jul 2023 15:15:16 +0000 (08:15 -0700)]
[RISCV] Remove pseudos for vwcvt.f.x(u) with rounding mode.

vwcvt.f.x doesn't use rounding mode. The integer value fits in
the mantissa of a 2x larger FP type so no rounding is required.

I've removed the Uses = [FRM], which is also not needed.

I deleted the isel patterns. Alternatively, we could keep them and
drop the rounding mode immediate. The patterns are currently untested
so I chose to delete them. If they become needed in the future, we
can decide then if we should have the patterns or teach the node
creation to use the non-RM form for widening.

This reverts part of D142102.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D154653
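
The exactness argument can be checked on scalars. The sketch below assumes IEEE-754 binary32 and binary64 for `float` and `double`, and the helper names are hypothetical. It contrasts a widening conversion (32-bit integer to 64-bit FP), which is always exact, with a same-width conversion, which may round.

```cpp
#include <cstdint>

// Widening: every 32-bit integer fits in the 53-bit significand of a
// double, so the conversion is exact and no rounding mode is consulted.
inline bool roundTripsThroughDouble(std::int32_t X) {
  return static_cast<std::int32_t>(static_cast<double>(X)) == X;
}

// Same width: float has a 24-bit significand, so large values may round.
// The result is compared as int64 to avoid overflow when converting back.
inline bool roundTripsThroughFloat(std::int32_t X) {
  return static_cast<std::int64_t>(static_cast<float>(X)) == X;
}
```

For example, 16777217 (2^24 + 1) survives the trip through `double` but not through `float`.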

15 months ago[ADT] fix postfix filter_iterator_impl::operator++
Alex Brachet [Fri, 7 Jul 2023 15:10:56 +0000 (15:10 +0000)]
[ADT] fix postfix filter_iterator_impl::operator++

15 months agoFix the Clang sphinx bot
Aaron Ballman [Fri, 7 Jul 2023 15:08:17 +0000 (11:08 -0400)]
Fix the Clang sphinx bot

This adds AMDGPUSupport to the index page to address the issue found by:
https://lab.llvm.org/buildbot/#/builders/92/builds/46936

15 months agoReland: "[ADT] fix filter_iterator_impl::operator++"
Alex Brachet [Fri, 7 Jul 2023 14:46:44 +0000 (14:46 +0000)]
Reland: "[ADT] fix filter_iterator_impl::operator++"

15 months ago[PowerPC] Update InputOps of Power10 SchedModel
Qiu Chaofan [Fri, 7 Jul 2023 14:42:46 +0000 (22:42 +0800)]
[PowerPC] Update InputOps of Power10 SchedModel

The count of input operands affects pipeline forwarding in the scheduling model.
The previous Power10 model definition arranges some instructions into
incorrect groups by counting the wrong number of input operands.

This patch updates the model, setting the input operand count correctly
by excluding irrelevant immediate operands and counting memory operands of
load instructions correctly.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D153842

15 months ago[mlir] generalize matchers to support batch matmul
Alex Zinenko [Wed, 5 Jul 2023 10:18:24 +0000 (10:18 +0000)]
[mlir] generalize matchers to support batch matmul

Mostly the same logic applies, with a different rank.

Additionally expose the logic to identify contraction dimensions and
contraction-like bodies as independent transform ops. This allows us to
recognize "generic" operations and not only the named ones.

Rework the contraction body matching logic to no longer rely on
contraction operations being uniquely named.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D154498

15 months ago[Clang][NFC] Fix the title of P2361 in the release notes
Corentin Jabot [Fri, 7 Jul 2023 14:41:17 +0000 (16:41 +0200)]
[Clang][NFC] Fix the title of P2361 in the release notes

15 months ago[RISCV] Update loop vectorizer interleaved access test output
Luke Lau [Fri, 7 Jul 2023 14:36:17 +0000 (15:36 +0100)]
[RISCV] Update loop vectorizer interleaved access test output

02bb33c3ce7a83d47244ae16c8b4c625aba187a2 changed it so it no longer unrolls the
loop.

15 months ago[Clang][NFC] Mark that P2280 was approved as a DR
cor3ntin [Fri, 7 Jul 2023 14:36:47 +0000 (16:36 +0200)]
[Clang][NFC] Mark that P2280 was approved as a DR

15 months ago[amdgpu] start documenting amdgpu support by clang
Yaxun (Sam) Liu [Thu, 29 Jun 2023 19:09:14 +0000 (15:09 -0400)]
[amdgpu] start documenting amdgpu support by clang

Reviewed by: Matt Arsenault, Johannes Doerfert, Jacob Lambert

Differential Revision: https://reviews.llvm.org/D154133

15 months ago[RISCV] Check for alignment when lowering interleaved/deinterleaved loads/stores
Luke Lau [Wed, 5 Jul 2023 18:14:58 +0000 (19:14 +0100)]
[RISCV] Check for alignment when lowering interleaved/deinterleaved loads/stores

As noted by @reames, we should be checking that the memory access is aligned to
the element size (or the unaligned vector memory access feature is enabled)
before lowering vlseg/vsseg intrinsics via the interleaved access pass.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D154536

15 months ago[RISCV] Add tests for unaligned segmented loads and stores
Luke Lau [Wed, 5 Jul 2023 17:37:28 +0000 (18:37 +0100)]
[RISCV] Add tests for unaligned segmented loads and stores

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D154535

15 months agoA new code layout algorithm for function reordering [1/3]
spupyrev [Tue, 13 Jun 2023 17:08:00 +0000 (10:08 -0700)]
A new code layout algorithm for function reordering [1/3]

We are bringing a new algorithm for function layout (reordering) based on the
call graph (extracted from profile data). The algorithm is an improvement on
top of a known heuristic, C^3. It tries to co-locate hot functions that are
frequently executed together in the resulting ordering. Unlike C^3, it explores a larger
search space and has an objective closely tied to the performance of
instruction and i-TLB caches. Hence, the name CDS = Cache-Directed Sort.
The algorithm can be used at the linking or post-linking (e.g., BOLT) stage.

This diff modifies the existing data structures to facilitate the implementation
(down the stack). This is a no-op change.

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D152833

15 months ago[InstCombine] Handle unreachable edge when branching to loop
Nikita Popov [Fri, 7 Jul 2023 13:17:02 +0000 (15:17 +0200)]
[InstCombine] Handle unreachable edge when branching to loop

The successor is unreachable if either this is the only edge, or
this is an edge into a loop, in which case other predecessors
don't matter. This is exactly what the edge dominance check does.

15 months agoRevert "[ADT] fix filter_iterator_impl::operator++"
Alex Brachet [Fri, 7 Jul 2023 14:03:58 +0000 (14:03 +0000)]
Revert "[ADT] fix filter_iterator_impl::operator++"

This reverts commit 5a67fa2b17c4db64ef00fbe672a4b59d26039828.

15 months agoSimpleLoopUnswitch: Restore uniform unswitch test
Matt Arsenault [Thu, 1 Jun 2023 23:04:47 +0000 (19:04 -0400)]
SimpleLoopUnswitch: Restore uniform unswitch test

This was supposed to document the new PM limitation but
was deleted in fb4113ef0c8b2c5e5e2817e9ca14fb57a6d252be

Switch to generated checks since that's more reliable than XFAIL, and
just preserve the preferred results as comments.

15 months ago[ADT] fix filter_iterator_impl::operator++
Alex Brachet [Fri, 7 Jul 2023 13:50:12 +0000 (13:50 +0000)]
[ADT] fix filter_iterator_impl::operator++

15 months agoTTI: Pass function to hasBranchDivergence in a few passes
Matt Arsenault [Fri, 2 Jun 2023 10:59:25 +0000 (06:59 -0400)]
TTI: Pass function to hasBranchDivergence in a few passes

https://reviews.llvm.org/D152033

15 months agoReland "[BOLT][Instrumentation] Put Allocator itself in shared memory by default"
Denis Revunov [Tue, 20 Jun 2023 15:00:36 +0000 (15:00 +0000)]
Reland "[BOLT][Instrumentation] Put Allocator itself in shared memory by default"

The issue was caused by the absence of placement new definition. It
worked for clang and thus passed Phabricator checks, but broke when
compiled with GCC on buildbot.
Full problem description: https://reviews.llvm.org/D153771#4468239

Original patch description:
In the absence of the instrumentation-file-append-pid option, the
global allocator uses shared pages for allocation. However, since it is a
global variable, it gets COW'd after fork if instrumentation-sleep-time
is used, or if a process forks by itself. This means it hands out the same
pages to every process, which causes hash table corruption. Thus, if we
want shared pages, we need to put the allocator itself in a shared page,
which we do in this commit in __bolt_instr_setup.
I also added a couple of assertions to sanity-check the hash table.

Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D153771

15 months ago[mlir][transform] Add VerifyOp
Matthias Springer [Fri, 7 Jul 2023 13:14:31 +0000 (15:14 +0200)]
[mlir][transform] Add VerifyOp

This transform op runs the verifier on the targeted payload ops. It is for debugging only.

Differential Revision: https://reviews.llvm.org/D154711

15 months agoHIP: Directly call round builtins
Matt Arsenault [Sun, 20 Nov 2022 16:43:08 +0000 (08:43 -0800)]
HIP: Directly call round builtins

15 months agoHIP: Directly call ceil builtins
Matt Arsenault [Tue, 22 Nov 2022 17:28:41 +0000 (12:28 -0500)]
HIP: Directly call ceil builtins

15 months agoAMDGPU: Remove attempt at simplifying the format string in printf lowering
Matt Arsenault [Wed, 28 Jun 2023 17:49:50 +0000 (13:49 -0400)]
AMDGPU: Remove attempt at simplifying the format string in printf lowering

This avoids computing the dominator tree by removing the
simplifyInstruction use.

This was applying simplification with some kind of questionable
load-store forwarding and looking for the global. This had to have
been an ancient hack copied from previous backends. In the OpenCL
case, this is always emitted as the required direct global reference
anyway.

15 months ago[InstCombine] Add test for unreachable loop (NFC)
Nikita Popov [Fri, 7 Jul 2023 13:19:21 +0000 (15:19 +0200)]
[InstCombine] Add test for unreachable loop (NFC)

We only catch this case on the second InstCombine iteration.

15 months ago[OpenMP][OMPT] Change OMPT kind for OpenMP test lock functions
Joachim Jenke [Fri, 7 Jul 2023 12:26:21 +0000 (14:26 +0200)]
[OpenMP][OMPT] Change OMPT kind for OpenMP test lock functions

The OpenMP specification mentions that omp_test_lock and
omp_test_nest_lock dispatch OMPT callbacks with ompt_mutex_test_lock
and ompt_mutex_test_nest_lock for their kind respectively. Previously,
the values ompt_mutex_lock and ompt_mutex_nest_lock were used. This
could cause issues in applications relying on the kind to correctly
determine lock states. This commit changes the kind to the expected
ones.

Also update callback.h and OMPT tests to reflect this change.

Patch prepared by Thyre

Differential Review: https://reviews.llvm.org/D153028
Differential Review: https://reviews.llvm.org/D153031
Differential Review: https://reviews.llvm.org/D153032

15 months ago[lldb][NFC] Factor out code from SymbolFileDWARF::ParseVariableDIE
Felipe de Azevedo Piovezan [Wed, 5 Jul 2023 15:10:00 +0000 (11:10 -0400)]
[lldb][NFC] Factor out code from SymbolFileDWARF::ParseVariableDIE

This function does a _lot_ of different things:

1. Parses a DIE,
2. Builds an ExpressionList
3. Figures out lifetime of variable
4. Remaps addresses for debug maps
5. Handles external variables
6. Figures out scope of variables

A lot of this functionality is coded in a complex nest of conditions, variables
that are declared and then initialized much later, variables that are updated in
multiple code paths. All of this makes the code really hard to follow.

This commit attempts to improve the state of things by factoring out (3), adding
documentation on how the expression list is built, and by reducing the scope of
variables.

Differential Revision: https://reviews.llvm.org/D154513

15 months ago[LoopVectorize] Regenerate test checks (NFC)
Nikita Popov [Fri, 7 Jul 2023 12:34:49 +0000 (14:34 +0200)]
[LoopVectorize] Regenerate test checks (NFC)

15 months agoRemove rdar links; NFC
Aaron Ballman [Fri, 7 Jul 2023 12:38:35 +0000 (08:38 -0400)]
Remove rdar links; NFC

This removes links to rdar, which is an internal bug tracker that the
community doesn't have visibility into.

See further discussion at:
https://discourse.llvm.org/t/code-review-reminder-about-links-in-code-commit-messages/71847

15 months ago[MLIR][Linalg] Add max named op to linalg
Renato Golin [Fri, 7 Jul 2023 11:04:30 +0000 (12:04 +0100)]
[MLIR][Linalg] Add max named op to linalg

I've been trying to come up with a simple and clean implementation for
ReLU. TOSA uses `clamp` which is probably the goal, but that means
table-gen to make it efficient (attributes, only lower `min` or `max`).

For now, `max` is a reasonable named op despite ReLU, so we can start
using it for tiling and fusion, and upon success, we create a more
complete op `clamp` that doesn't need a whole tensor filled with zeroes
or ones to implement the different activation functions.

As with other named ops, we start "requiring" type casts and broadcasts,
and zero filled constant tensors to a more complex pattern-matcher, and
can slowly simplify with attributes or structured matchers (ex. PDL) in
the future.

Differential Revision: https://reviews.llvm.org/D154703

15 months ago[dataflow] Remove [[deprecated]] from deprecated functions
Sam McCall [Fri, 7 Jul 2023 12:34:57 +0000 (14:34 +0200)]
[dataflow] Remove [[deprecated]] from deprecated functions

This fixes -Werror -Wdeprecated builds
See D153469

15 months agoInstCombine: Fold ldexp(ldexp(x, a), b) -> ldexp(x, a + b)
Matt Arsenault [Tue, 4 Jul 2023 16:50:41 +0000 (12:50 -0400)]
InstCombine: Fold ldexp(ldexp(x, a), b) -> ldexp(x, a + b)

The problem here is overflow or underflow which would have occurred in
the inner operation, which the exponent offsetting avoids. We can do
this if we know the two exponents are in the same direction, or
reassoc flags allow unsafe reassociates.
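
A small sketch of the hazard using `std::ldexp` on doubles (the helper names `nested` and `folded` are made up for illustration). When the two exponents point in opposite directions, the inner operation can overflow even though the combined exponent is harmless, so the folded form gives a different result.

```cpp
#include <cmath>

// The original form: two scalings applied one after the other.
inline double nested(double X, int A, int B) {
  return std::ldexp(std::ldexp(X, A), B);
}

// The folded form: a single scaling by the summed exponent.
inline double folded(double X, int A, int B) {
  return std::ldexp(X, A + B);
}
```

With exponents in the same direction, such as `nested(1.0, 3, 4)` and `folded(1.0, 3, 4)`, both forms agree. With `A = 2000` and `B = -2000`, the inner `ldexp` overflows to infinity and stays there, while the folded form returns the input unchanged.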

15 months agoInstCombine: Add baseline tests for ldexp reassociation combine
Matt Arsenault [Tue, 4 Jul 2023 16:55:25 +0000 (12:55 -0400)]
InstCombine: Add baseline tests for ldexp reassociation combine

15 months agoInstSimplify: Update another cannotBeOrderedLessThanZero use
Matt Arsenault [Tue, 25 Apr 2023 23:24:29 +0000 (19:24 -0400)]
InstSimplify: Update another cannotBeOrderedLessThanZero use

Pass all the optional arguments to enable assumes.

15 months ago[MLIR][Presburger] Implement composition for PresburgerRelation
iambrj [Fri, 7 Jul 2023 12:07:47 +0000 (17:37 +0530)]
[MLIR][Presburger] Implement composition for PresburgerRelation

This patch implements range and domain composition for PresburgerRelations

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D154444

15 months ago[gn build] Port fc9821a877d4
LLVM GN Syncbot [Fri, 7 Jul 2023 12:07:16 +0000 (12:07 +0000)]
[gn build] Port fc9821a877d4

15 months ago[gn build] Port c0221e006d47
LLVM GN Syncbot [Fri, 7 Jul 2023 12:07:15 +0000 (12:07 +0000)]
[gn build] Port c0221e006d47

15 months agoRevert "[DWARF][BOLT] Implement new mechanism for DWARFRewriter"
Nico Weber [Fri, 7 Jul 2023 12:01:41 +0000 (08:01 -0400)]
Revert "[DWARF][BOLT] Implement new mechanism for DWARFRewriter"

This reverts commit 460a2244430fae192298a5fd9fa2a269e540e8c1.
It breaks building on macOS, and it was landed with a review URL
pointing to some Facebook-internal service.

Also reverts a bunch of follow-ups:

Revert "[BOLT][DWARF] Don't check string offsets"
This reverts commit f9d6f48c8bf5acaac07502403c41cf0b0d89c8d2.

Revert "[BOLT][DWARF] Change to process and write out TUs first then CUs in batches"
This reverts commit 88e95c1e4bb6e2ad3bfd185b96341ad5c09eff6b.

Revert "[BOLT][DWARF] Output DWO files as they are being processed"
This reverts commit 46ca2e3fcd419b1246357ed3b9cd36630f16e64d.

Revert "[BOLT][DWARF] Don't check string offsets"
This reverts commit cfe4a4b04f219a9dbb4e3fc01883437b6ff0e702.

Revert "[BOLT][DWARF] Numerous fixes for a new DWARFRewriter"
This reverts commit 2701a661daa393ad5901ac88d420d7aa931eda0d.

15 months ago[OpenMP][OMPT] Rename callback master to masked in ompt-multiplex.h
Joachim Jenke [Fri, 7 Jul 2023 11:59:56 +0000 (13:59 +0200)]
[OpenMP][OMPT] Rename callback master to masked in ompt-multiplex.h

OpenMP 5.1 replaced callback ompt_callback_master_t by
ompt_callback_masked_t. In order to stick to the standard,
the implementation is updated accordingly.

Patch prepared by Semih Burak

Differential Revision: https://reviews.llvm.org/D112798

15 months ago[OpenMP][OMPT] Add two missing nullpointer checks in ompt-multiplex.h
Joachim Jenke [Fri, 7 Jul 2023 11:58:38 +0000 (13:58 +0200)]
[OpenMP][OMPT] Add two missing nullpointer checks in ompt-multiplex.h

In the functions ompt_multiplex_get_own_ompt_data
and ompt_multiplex_get_client_ompt_data, not only can
data be NULL, but the void pointer field "ptr" of
"data" can also be NULL, leading to a subsequent
segfault.
This patch adds the corresponding checks.
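
As a rough illustration of the double check described above, here is a minimal sketch. The types `tool_data_t` and `data_pair_t` and the function `get_own_data` are hypothetical stand-ins invented for this example; they are not the definitions from ompt-multiplex.h.

```cpp
#include <cstdint>

// Hypothetical stand-in for a tool data slot: an integer or a pointer.
union tool_data_t {
  std::uint64_t value;
  void *ptr;
};

// Hypothetical pair stored behind `ptr` when two tools share one slot.
struct data_pair_t {
  tool_data_t own_data;
  tool_data_t client_data;
};

// Returns the tool's own slot, or nullptr when either `data` itself or the
// pointer stored in it is null. Checking only `data` would dereference a
// null `ptr` in the second case.
inline tool_data_t *get_own_data(tool_data_t *data) {
  if (data == nullptr || data->ptr == nullptr)
    return nullptr;
  return &static_cast<data_pair_t *>(data->ptr)->own_data;
}
```

Callers then have to handle a null result instead of assuming the slot was already set up.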

Patch prepared by Semih Burak

Differential Revision: https://reviews.llvm.org/D112806

15 months ago[AST] Stop evaluate constant expression if the condition expression which in switch...
yronglin [Fri, 7 Jul 2023 11:55:04 +0000 (19:55 +0800)]
[AST] Stop evaluate constant expression if the condition expression which in switch statement contains errors

This fixes issue: https://github.com/llvm/llvm-project/issues/63453

```
constexpr int foo(unsigned char c) {
    switch (f) {
    case 0:
        return 7;
    default:
        break;
    }
    return 0;
}

static_assert(foo('d'));

```

Reviewed By: aaron.ballman, erichkeane, hokein

Differential Revision: https://reviews.llvm.org/D153296

15 months ago[OpenMP][Tools] Add omp_all_memory support for Archer
Joachim Jenke [Fri, 7 Jul 2023 11:52:55 +0000 (13:52 +0200)]
[OpenMP][Tools] Add omp_all_memory support for Archer

The semantics of depend(out:omp_all_memory) are quite similar to taskwait in
that it separates all tasks (with dependencies) created before an
all_memory-task from all tasks (with dependencies) created after an
all_memory-task.
Only one such task can execute at a time. Similar to taskwait, we
have a CV (AllMemory[1]) in the generating task to express the dependency
sink semantics of an all_memory-task. In addition, AllMemory[0] describes the
dependency source semantics of an all_memory-task. All tasks with dependencies
create an HB-arc towards the sink and terminate an HB-arc from the source.

Since we expect that not many applications will use such dependencies, the
support for handling the synchronization semantics is off by default and
can be turned on using ARCHER_OPTION="all_memory=1". The most costly part
is the precautionary posting of an HB-arc towards the sink, which represents
a potentially contentious write from all concurrently executing sibling tasks.
A warning is printed at runtime when the option is off while such a dependency
is observed. In most cases the lazy activation will still lead to false alerts.

Differential Revision: https://reviews.llvm.org/D111895

15 months ago[analyzer][NFC] Simplify CStringChecker strong types
Balazs Benics [Fri, 7 Jul 2023 11:48:18 +0000 (13:48 +0200)]
[analyzer][NFC] Simplify CStringChecker strong types

15 months ago[OpenMP] Add OMPT support for omp_all_memory task dependence
Joachim Jenke [Fri, 7 Jul 2023 11:19:09 +0000 (13:19 +0200)]
[OpenMP] Add OMPT support for omp_all_memory task dependence

omp_all_memory currently has no representation in OMPT.

Adding new dependency flags as suggested by omp-lang issue #3007.

Differential Revision: https://reviews.llvm.org/D111788

15 months agoValueTracking: Update another cannotBeOrderedLessThanZero use
Matt Arsenault [Tue, 25 Apr 2023 15:52:09 +0000 (11:52 -0400)]
ValueTracking: Update another cannotBeOrderedLessThanZero use

15 months agoValueTracking: Update a use of cannotBeOrderedLessThanZero
Matt Arsenault [Tue, 25 Apr 2023 15:50:42 +0000 (11:50 -0400)]
ValueTracking: Update a use of cannotBeOrderedLessThanZero

Makes assumes work.

15 months ago[Clang][AArch64] Implement ACLE feature macro for FEAT_LRCPC3
Lucas Prates [Wed, 14 Jun 2023 09:38:46 +0000 (10:38 +0100)]
[Clang][AArch64] Implement ACLE feature macro for FEAT_LRCPC3

This implements the new value for the `__ARM_FEATURE_RCPC` feature
macro, which was introduced to the ACLE to indicate the availability of
FEAT_LRCPC3.

More details can be found on:
https://github.com/ARM-software/acle/blob/main/main/acle.md#rcpc
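
A minimal sketch of how code might test the macro. It assumes that the new value is 3 and that lower non-zero values denote earlier RCpc levels; the linked ACLE section is the authority on the exact values. The helper names `rcpcLevel` and `hasLrcpc3` are made up for this example.

```cpp
// Returns the RCpc level advertised by the compiler, or 0 when the macro is
// not defined (for example on non-AArch64 targets).
constexpr int rcpcLevel() {
#if defined(__ARM_FEATURE_RCPC)
  return __ARM_FEATURE_RCPC;
#else
  return 0;
#endif
}

// Assumption: a value of 3 or more indicates FEAT_LRCPC3.
constexpr bool hasLrcpc3() { return rcpcLevel() >= 3; }
```

Wrapping the macro in a `constexpr` function keeps the preprocessor test in one place, so callers can use `if constexpr (hasLrcpc3())` to pick a code path.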

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D153130

15 months ago[AArch64][RCPC3] Instruction selection for LDAP1/STL1 instructions
Lucas Prates [Tue, 13 Jun 2023 13:48:35 +0000 (14:48 +0100)]
[AArch64][RCPC3] Instruction selection for LDAP1/STL1 instructions

This implements the DAG patterns to enable instruction selection for the
LDAP1 and STL1 instructions from FEAT_LRCPC3. The instructions should
match the following combinations:

* Acquiring atomic load + vector insert element for LDAP1.
* Vector extract element + releasing atomic store for STL1.

Patterns have also been added to cope with the DAG structure found when
dealing with 1-lane sub-vectors.

Reviewed By: tmatheson, efriedma

Differential Revision: https://reviews.llvm.org/D153129

15 months ago[AArch64][RCPC3] Add Neon intrinsics for LDAP1 and STL1
Lucas Prates [Fri, 9 Jun 2023 14:20:46 +0000 (15:20 +0100)]
[AArch64][RCPC3] Add Neon intrinsics for LDAP1 and STL1

This adds new intrinsics to support the LDAP1 and STL1 Advanced SIMD
(Neon) instructions introduced as part of FEAT_LRCPC3.
The new intrinsics `vldap1(q)_lane`/`vstl1(q)_lane` generate IR code
similar to the existing `vld1(q)_lane`/`vst1(q)_lane` ones, but capture
the difference in the atomic release/acquire memory model.

The LLVM code generation changes to ensure that this instruction pair
is lowered to the correct LDAP1/STL1 instructions will be covered in a
separate commit.

Based on a patch by Sam Elliott.

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D153128

15 months agoImplement P2361 Unevaluated string literals
Corentin Jabot [Sat, 10 Jul 2021 13:52:54 +0000 (15:52 +0200)]
Implement P2361 Unevaluated string literals

This patch proposes to handle in a uniform fashion
the parsing of strings that are never evaluated,
in asm statements, static_assert, attributes, extern,
etc.
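
For illustration, a few of the contexts listed above, written with plain (prefix-free) string literals so that they stay valid whether or not encoding prefixes are rejected. The function names are hypothetical.

```cpp
// Linkage specification: the string names a language linkage and is never
// evaluated as a character array.
extern "C" int answer();

// Attribute argument: only used for the diagnostic text.
[[deprecated("use answer() instead")]] int oldAnswer();

// static_assert message: only shown if the assertion fails.
static_assert(sizeof(int) >= 2, "int is too narrow");

int answer() { return 42; }
int oldAnswer() { return answer(); }
```

None of these strings ever becomes an object at run time, which is why they can be handled separately from ordinary narrow string literals.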

Unevaluated strings are UTF-8 internally and so currently
behave as narrow strings, but these things will diverge with
D93031.

The big question both for this patch and the P2361 paper
is whether we risk breaking code by disallowing
encoding prefixes in this context.
I hope this patch will help gather some data on that.

Future work:
Improve the rendering of Unicode characters, line breaks
and so forth in static_assert messages

Reviewed By: aaron.ballman, shafik

Differential Revision: https://reviews.llvm.org/D105759

15 months ago[analyzer] Remove deprecated analyzer-config options
Balazs Benics [Fri, 7 Jul 2023 11:24:33 +0000 (13:24 +0200)]
[analyzer] Remove deprecated analyzer-config options

The `consider-single-element-arrays-as-flexible-array-members` analyzer
option was deprecated in clang-16 and is now removed in clang-17, as
promised in
https://releases.llvm.org/16.0.0/tools/clang/docs/ReleaseNotes.html#static-analyzer

This shouldn't change observable behavior.

Differential Revision: https://reviews.llvm.org/D154481

15 months ago[LTO] Ensure LICM hoists expensive fdiv instructions introduced by InstCombine
David Sherwood [Tue, 21 Feb 2023 11:46:21 +0000 (11:46 +0000)]
[LTO] Ensure LICM hoists expensive fdiv instructions introduced by InstCombine

In the LTO pipeline we run InstCombine after LICM, which is
different to what we normally do without LTO. This has the
effect of undoing all the great work done by LICM to reduce
the cost of the loop when it hoists the fdiv out and replaces
it with fmul. When InstCombine runs after LICM it puts the
fdiv straight back which, on AArch64 at least, is darn
expensive. You can observe this problem in the SPEC2017
benchmark parest if you build with "-Ofast -flto" and the
loop-vectoriser uses an unroll factor of 1, which is what
often happens when tail-folding is enabled.
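
The two loop shapes discussed above can be sketched at the source level as follows. This is only an analogy for the IR-level transforms: the function names are hypothetical, and replacing a division by a multiplication with the reciprocal is generally only valid under fast-math style assumptions, because rounding can differ.

```cpp
#include <cstddef>
#include <vector>

// Shape with the fdiv inside the loop: one division per iteration.
inline void scaleWithDiv(std::vector<double> &v, double d) {
  for (std::size_t i = 0; i < v.size(); ++i)
    v[i] = v[i] / d;
}

// Shape after hoisting: a single division before the loop, then only
// multiplications inside it.
inline void scaleWithHoistedReciprocal(std::vector<double> &v, double d) {
  const double r = 1.0 / d;
  for (std::size_t i = 0; i < v.size(); ++i)
    v[i] = v[i] * r;
}
```

With a divisor such as 2.0 both functions give bit-identical results; for most other divisors the results may differ in the last bit, which is why the rewrite is tied to options like -Ofast.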

This is also a problem for scalar loops, or indeed any loop
where there is only one use of the preheader fdiv result in
the loop.

See InstCombinerImpl::visitFMul for the code that sinks the fdiv.

I've attempted to fix this by adding another LICM pass for Full
LTO after InstCombine. The alternative is to stop InstCombine
from sinking the fdiv into loops. See D87479 for a previous
discussion on this issue.

Differential Revision: https://reviews.llvm.org/D143631

15 months ago[mlir] Mark test-interpreter unsupported on Windows on Arm
David Spickett [Fri, 7 Jul 2023 10:44:00 +0000 (10:44 +0000)]
[mlir] Mark test-interpreter unsupported on Windows on Arm

This seems to fail every time there is some change in MLIR,
but not always.

For example: https://lab.llvm.org/buildbot/#/builders/65/builds/10415

15 months ago[libc] Adding a version of memcpy w/ software prefetching
Guillaume Chatelet [Wed, 5 Jul 2023 11:07:21 +0000 (11:07 +0000)]
[libc] Adding a version of memcpy w/ software prefetching

For machines with a lot of cores, hardware prefetchers can saturate the memory bus when utilization is high.
In this case it is desirable to turn off the hardware prefetcher completely.
This has a big impact on the performance of memory functions such as `memcpy` that rely on the fact that the next cache line will be readily available.

This patch adds the 'LIBC_COPT_MEMCPY_X86_USE_SOFTWARE_PREFETCHING' compile time option that generates a version of memcpy with software prefetching. While it does not fully restore the original performance, it mitigates the impact to an acceptable level.
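
To make the idea concrete, here is a minimal sketch of a copy loop that issues software prefetches ahead of the data it is about to read. It is not the libc implementation: the cache line size, the prefetch distance and the function name `copyWithPrefetch` are assumptions chosen for illustration, and `__builtin_prefetch` is a GCC/Clang builtin.

```cpp
#include <cstddef>
#include <cstring>

// Assumed values for illustration only; real tuning is target specific.
constexpr std::size_t kCacheLine = 64;
constexpr std::size_t kPrefetchDistance = 4 * kCacheLine;

// Copies n bytes in cache-line sized blocks, asking the hardware to start
// loading a later block of the source before the current one is copied.
inline void *copyWithPrefetch(void *dst, const void *src, std::size_t n) {
  auto *d = static_cast<unsigned char *>(dst);
  const auto *s = static_cast<const unsigned char *>(src);
  std::size_t off = 0;
  for (; off + kCacheLine <= n; off += kCacheLine) {
    if (off + kPrefetchDistance < n)
      __builtin_prefetch(s + off + kPrefetchDistance, /*rw=*/0, /*locality=*/3);
    std::memcpy(d + off, s + off, kCacheLine);
  }
  // Copy the tail that is shorter than one cache line.
  std::memcpy(d + off, s + off, n - off);
  return dst;
}
```

The prefetch is only a hint, so the function stays correct even on hardware that ignores it; what changes is how often the loop stalls waiting for the next line.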

Reviewed By: rtenneti

Differential Revision: https://reviews.llvm.org/D154494

15 months ago[LV] Do not add load to group if it moves across conflicting store.
Florian Hahn [Fri, 7 Jul 2023 10:06:30 +0000 (11:06 +0100)]
[LV] Do not add load to group if it moves across conflicting store.

This patch prevents invalid load groups from being formed, where a load
needs to be moved across a conflicting store.

Once we hit a store that conflicts with a load with an existing
interleave group, we need to stop adding earlier loads to the group, as
this would force hoisting the previous stores in the group across the
conflicting load.

To detect such cases, add a new CompletedLoadGroups set, which is used
to keep track of load groups to which no earlier loads can be added.

Fixes https://github.com/llvm/llvm-project/issues/63602

Reviewed By: anna

Differential Revision: https://reviews.llvm.org/D154309

15 months agoReland "[dataflow] Add dedicated representation of boolean formulas"
Sam McCall [Wed, 5 Jul 2023 09:35:06 +0000 (11:35 +0200)]
Reland "[dataflow] Add dedicated representation of boolean formulas"

This reverts commit 7a72ce98224be76d9328e65eee472381f7c8e7fe.

Test problems were due to unspecified order of function arg evaluation.
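
A generic illustration of this class of problem, not the code from the patch: when two arguments of one call both have side effects, C++ does not say which is evaluated first, so results can differ between compilers. The `Counter` type and both functions are hypothetical.

```cpp
#include <utility>

// Hypothetical counter with a side effect on every call.
struct Counter {
  int next = 0;
  int take() { return next++; }
};

// Order-dependent: the two take() calls may run in either order, so the
// result may be {0, 1} or {1, 0} depending on the compiler.
inline std::pair<int, int> unspecifiedOrder(Counter &c) {
  return std::make_pair(c.take(), c.take());
}

// Deterministic: named locals force the order of the two calls.
inline std::pair<int, int> fixedOrder(Counter &c) {
  const int first = c.take();
  const int second = c.take();
  return std::make_pair(first, second);
}
```

Tests that compare against a fixed expected value are only reliable with the second form.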

Reland "[dataflow] Replace most BoolValue subclasses with references to Formula (and AtomicBoolValue => Atom and BoolValue => Formula where appropriate)"

This properly frees the Value hierarchy from managing boolean formulas.

We still distinguish AtomicBoolValue; this type is used in client code.
However we expect to convert such uses to BoolValue (where the
distinction is not needed) or Atom (where atomic identity is intended),
and then fold AtomicBoolValue into FormulaBoolValue.

We also distinguish TopBoolValue; this has distinct rules for
widen/join/equivalence, and top-ness is not represented in Formula.
It'd be nice to find a cleaner representation (e.g. the absence of a
formula), but no immediate plans.

For now, BoolValues with the same Formula are deduplicated. This doesn't
seem desirable, as Values are mutable by their creators (properties).
We can probably drop this for FormulaBoolValue immediately (not in this
patch, to isolate changes). For AtomicBoolValue we first need to update
clients to stop using value pointers for atom identity.

The data structures around flow conditions are updated:
- flow condition tokens are Atom, rather than AtomicBoolValue*
- conditions are Formula, rather than BoolValue
Most APIs were changed directly, some with many clients had a
new version added and the existing one deprecated.

The factories for BoolValues in Environment keep their existing
signatures for now (e.g. makeOr(BoolValue, BoolValue) => BoolValue)
and are not deprecated. These have very many clients and finding the
most ergonomic API & migration path still needs some thought.

Differential Revision: https://reviews.llvm.org/D153469

15 months ago[RISCV] Add riscv_vsoxei_mask/riscv_vsuxei_mask to getTgtMemIntrinsic.
Yeting Kuo [Fri, 7 Jul 2023 09:00:27 +0000 (17:00 +0800)]
[RISCV] Add riscv_vsoxei_mask/riscv_vsuxei_mask to getTgtMemIntrinsic.

This constructs a proper memory operand for riscv_vsoxei_mask and riscv_vsuxei_mask.
I think they were missed in D147119.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D154694

15 months ago[compiler-rt][xray] Disable fdr-single-thread test on Arm
David Spickett [Fri, 7 Jul 2023 09:46:07 +0000 (09:46 +0000)]
[compiler-rt][xray] Disable fdr-single-thread test on Arm

For unknown reasons this causes a bus error.

See:
https://lab.llvm.org/buildbot/#/builders/178/builds/5157

15 months ago[MLIR][Linalg] Add unary named ops to linalg
Renato Golin [Thu, 6 Jul 2023 13:56:44 +0000 (14:56 +0100)]
[MLIR][Linalg] Add unary named ops to linalg

Following binary arithmetic in previous commits, this patch adds unary
maths ops to linalg.

It also fixes a few of the previous tests, and makes the binary ops call
BinaryFn.<op> directly instead of relying on Python to recognise the
operation.

Differential Revision: https://reviews.llvm.org/D154618

15 months ago[flang][hlfir] allow associate where the expr is also used by shape_of
Tom Eccles [Wed, 5 Jul 2023 16:09:08 +0000 (16:09 +0000)]
[flang][hlfir] allow associate where the expr is also used by shape_of

This fixes the majority of cases where we hit the "hlfir.associate of
hlfir.expr with more than one use" TODO. In particular, this allows cam4
to be built.

hlfir.shape_of is just a way to delay reading shape information until
after intrinsics have been lowered to FIR runtime calls. It gets the
shape information from reading existing SSA values (e.g. fetching the
shape used when hlfir.declare'ing the variable).

Therefore hlfir.shape_of doesn't affect decisions about when to
deallocate the buffer.

Differential Revision: https://reviews.llvm.org/D154521

15 months ago[mlir] Add InsertionGuards to OneToNPatternRewriter.
Ingo Müller [Fri, 7 Jul 2023 05:49:01 +0000 (05:49 +0000)]
[mlir] Add InsertionGuards to OneToNPatternRewriter.

This fixes bad behavior of that class that surfaced in
https://reviews.llvm.org/D154299, where calling applySignatureConversion
left the insertion point different from before the call, which broke a
subsequent call to replaceOp. This patch introduces a fix in both
functions, each of which is enough to fix the specific problem in the
aforementioned diff: (1) applySignatureConversion now resets the
insertion point with a guard for the whole function and (2) replace sets
the insertion point to the op that should be replaced (and resets it
with a guard).
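
The guard pattern described above can be sketched with a self-contained miniature. `MiniBuilder`, `MiniInsertionGuard` and `convertSignature` are hypothetical names made up for this example and only model the idea of a scoped save and restore; they are not the MLIR classes.

```cpp
// Hypothetical miniature builder; the insertion point is just an integer.
struct MiniBuilder {
  int insertionPoint = 0;
};

// RAII guard that restores the builder's insertion point on scope exit.
class MiniInsertionGuard {
public:
  explicit MiniInsertionGuard(MiniBuilder &b)
      : builder(b), saved(b.insertionPoint) {}
  ~MiniInsertionGuard() { builder.insertionPoint = saved; }
  MiniInsertionGuard(const MiniInsertionGuard &) = delete;
  MiniInsertionGuard &operator=(const MiniInsertionGuard &) = delete;

private:
  MiniBuilder &builder;
  int saved;
};

// Moves the insertion point while working, but leaves it unchanged for the
// caller because the guard restores it when the function returns.
inline void convertSignature(MiniBuilder &b) {
  MiniInsertionGuard guard(b);
  b.insertionPoint = 42; // temporary position for helper operations
}
```

Because the restore happens in the destructor, it also covers early returns, so a later call that relies on the previous insertion point sees it unchanged.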

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D154684

15 months ago[mlir] Avoid unnecessary copies in SCF's OneToNTypeConversions. (NFC)
Ingo Müller [Fri, 7 Jul 2023 06:03:50 +0000 (06:03 +0000)]
[mlir] Avoid unnecessary copies in SCF's OneToNTypeConversions. (NFC)

In two places, a ResultRange was copied into a SmallVector just to be
passed as a ValueRange argument. With this patch, the ResultRanges are
passed directly, avoiding a copy.

Reviewed By: ingomueller-net

Differential Revision: https://reviews.llvm.org/D154685

15 months ago[ARM][Driver] Change float-abi warning
Michael Platings [Fri, 7 Jul 2023 08:37:34 +0000 (09:37 +0100)]
[ARM][Driver] Change float-abi warning

Previously the warning stated "flag ignored" which is only partially
true - the invalid flag would prevent -feature +soft-float-abi from
being emitted which resulted in user-visible behaviour like
__ARM_PCS_VFP being defined.

Rather than attempt to coerce invalid flags into valid behaviour, don't
describe the expected behaviour.

Ideally the warning would be an error, as it is in GCC. However there
are tests in llvm-project that trigger the warning. Therefore one has to
assume that making the warning an error would break other code that
already exists in the wild.

Also apply test improvements suggested by @MaskRay on D150902.

Reviewed By: simon_tatham

Differential Revision: https://reviews.llvm.org/D154578

15 months ago[InstCombine] Preserve inbounds when folding select of GEP
Nikita Popov [Thu, 29 Jun 2023 08:28:28 +0000 (10:28 +0200)]
[InstCombine] Preserve inbounds when folding select of GEP

The select base, (gep base, offset) to gep base, select (0, offset)
fold used to drop inbounds, because the gep base, 0 this introduces
might not be inbounds. After the semantics change in D154051, such
a GEP is always considered inbounds, which allows us to preserve
the flag here.
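
In source-level terms, the fold corresponds roughly to the rewrite below. This is an analogy in C++ pointer arithmetic rather than LLVM IR, and the function names are hypothetical.

```cpp
#include <cstddef>

// Before the fold: select between the base pointer and an offset pointer.
inline const int *selectOfGep(bool c, const int *base, std::ptrdiff_t off) {
  return c ? base : base + off;
}

// After the fold: select the offset first, then do a single pointer add.
// The `base + 0` case is the zero-offset GEP the text refers to.
inline const int *gepOfSelect(bool c, const int *base, std::ptrdiff_t off) {
  return base + (c ? 0 : off);
}
```

Both functions return the same pointer for every input; the question the commit addresses is only whether the combined add may keep the inbounds flag.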

As the PhaseOrdering test demonstrates, this can result in major
optimization improvements in some cases.

Differential Revision: https://reviews.llvm.org/D154055

15 months ago[bazel] Port for 88e95c1e4bb6e2ad3bfd185b96341ad5c09eff6b
Haojian Wu [Fri, 7 Jul 2023 07:01:27 +0000 (09:01 +0200)]
[bazel] Port for 88e95c1e4bb6e2ad3bfd185b96341ad5c09eff6b

15 months ago[compiler-rt][RISCV] Fix __fe_getround and __fe_raise_inexact for Zfinx
Kito Cheng [Fri, 7 Jul 2023 06:24:30 +0000 (14:24 +0800)]
[compiler-rt][RISCV] Fix __fe_getround and __fe_raise_inexact for Zfinx

The Zfinx extension also provides a floating-point environment like the F
extension, so enable `__fe_getround` and `__fe_raise_inexact` for it too.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154570

15 months ago[RISCV] Add a pass to combine `cm.pop` and `ret` insts
WuXinlong [Fri, 7 Jul 2023 06:01:22 +0000 (14:01 +0800)]
[RISCV] Add a pass to combine `cm.pop` and `ret` insts

`RISCVPushPopOptimizer.cpp` combines `cm.pop` and `ret` to generate `cm.popretz` or `cm.popret`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D150416

15 months ago[X86] Support some Intel CPUs for cpu_specific/dispatch feature
Freddy Ye [Fri, 7 Jul 2023 05:46:46 +0000 (13:46 +0800)]
[X86] Support some Intel CPUs for cpu_specific/dispatch feature

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D154493

15 months ago[RISCV] Rename prefix `fixed-vector` to `fixed-vectors` to be the same with other...
Jim Lin [Fri, 7 Jul 2023 05:02:39 +0000 (13:02 +0800)]
[RISCV] Rename prefix `fixed-vector` to `fixed-vectors` to be the same with other testcases. NFC.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154679

15 months ago[Attributor] Check all NoFPClass attributes found in the IR
Johannes Doerfert [Fri, 30 Jun 2023 17:51:05 +0000 (10:51 -0700)]
[Attributor] Check all NoFPClass attributes found in the IR