platform/upstream/llvm.git
Shilei Tian [Wed, 14 Jun 2023 15:45:49 +0000 (11:45 -0400)]
[OpenMP] Fix the issue where `num_threads` still takes effect incorrectly

This patch fixes the issue that, if we have a compile-time serialized parallel
region (such as `if (0)`) with `num_threads`, followed by a regular parallel
region, the regular parallel region will incorrectly pick up the value set in
the serialized parallel region. The reason is that, in the front end, if we can
prove a parallel region has to be serialized, instead of emitting
`__kmpc_fork_call`, the front end directly emits `__kmpc_serialized_parallel`,
the body, and `__kmpc_end_serialized_parallel`. However, this "optimization"
doesn't consider the case where `num_threads` is used, such that
`__kmpc_push_num_threads` is still emitted. Since we don't reset the value in
`__kmpc_serialized_parallel`, it affects the parallel region that follows.
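
The stale-value behavior described above can be illustrated with a toy model.
This is only a sketch: `ToyRuntime`, its methods and the `reset_on_serialized`
switch are hypothetical names, not the libomp implementation, and it assumes the
fix amounts to clearing the pending request when the serialized region is entered.

```python
class ToyRuntime:
    """Toy model of a pending num_threads request; not the real libomp state."""

    def __init__(self, default_threads, reset_on_serialized):
        self.default_threads = default_threads
        self.reset_on_serialized = reset_on_serialized
        self.pending = None  # value recorded for the next parallel region

    def push_num_threads(self, n):
        # Stands in for __kmpc_push_num_threads.
        self.pending = n

    def serialized_parallel(self):
        # Stands in for __kmpc_serialized_parallel: always one thread.
        if self.reset_on_serialized:
            self.pending = None
        return 1

    def fork_call(self):
        # Stands in for __kmpc_fork_call: consumes the pending request, if any.
        size = self.pending if self.pending is not None else self.default_threads
        self.pending = None
        return size


def run_sequence(reset_on_serialized):
    """A serialized region with num_threads(2), then a plain parallel region."""
    rt = ToyRuntime(default_threads=8, reset_on_serialized=reset_on_serialized)
    rt.push_num_threads(2)
    rt.serialized_parallel()
    return rt.fork_call()
```

Without the reset, the second region inherits the request meant for the
serialized one; with the reset, it falls back to the default team size.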

Fix #63197.

Reviewed By: tlwilmar

Differential Revision: https://reviews.llvm.org/D152883

Adrian Prantl [Wed, 14 Jun 2023 00:31:32 +0000 (17:31 -0700)]
Add support for __debug_line_str in Mach-O

This patch resolves an issue that currently accounts for the vast
majority of failures on the matrix bot.

Differential Revision: https://reviews.llvm.org/D152872

zhongyunde [Wed, 14 Jun 2023 15:28:46 +0000 (23:28 +0800)]
[LegalizeTypes][AArch64] Use scalar_to_vector to eliminate bitcast

```
Legalize t3: v2i16 = bitcast i32
with   (v2i16 extract_subvector (v4i16 bitcast (v2i32 scalar_to_vector (i32 in))), 0)
```
Fix https://github.com/llvm/llvm-project/issues/61638

NOTE: Don't touch getPreferredVectorAction like X86 as this will touch
too many test cases.
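
As a rough check of why the replacement sequence computes the same value, the
sketch below models both forms on plain integers. It assumes a little-endian
lane layout and fills the undefined upper lane of `scalar_to_vector` with 0;
the function names are illustrative only.

```python
import struct


def bitcast_i32_to_v2i16(value):
    """Reference result: reinterpret one 32-bit integer as two 16-bit lanes."""
    return list(struct.unpack("<2H", struct.pack("<I", value)))


def legalized_form(value):
    """Model of extract_subvector(bitcast(scalar_to_vector(in)), 0)."""
    v2i32 = [value, 0]  # scalar_to_vector: lane 1 is undefined, 0 chosen here
    v4i16 = list(struct.unpack("<4H", struct.pack("<2I", *v2i32)))
    return v4i16[0:2]   # extract_subvector at index 0
```

Both functions return the low half-word first, for example `[0x5678, 0x1234]`
for the input `0x12345678`.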

Reviewed By: dmgreen, paulwalker-arm, efriedma
Differential Revision: https://reviews.llvm.org/D147678

zhongyunde [Wed, 14 Jun 2023 15:26:07 +0000 (23:26 +0800)]
[test] Update the checking base for LE and BE

Precommit tests for D147678, as we need the tests to cover BE too.

Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D152815

Nikita Popov [Wed, 14 Jun 2023 15:23:51 +0000 (17:23 +0200)]
[InstCombine] Add tests for binop of shift fold (NFC)

Kelvin Li [Fri, 9 Jun 2023 02:57:27 +0000 (22:57 -0400)]
[flang] rename PPC specific intrinsic modules (NFC)

Ricardo Jesus [Wed, 22 Feb 2023 16:02:47 +0000 (16:02 +0000)]
[AArch64] Neoverse V2 scheduling model

This adds a scheduling model for the Neoverse V2. All information is
taken from the Neoverse V2 Software Optimisation Guide:

https://developer.arm.com/documentation/PJDOC-466751330-593177/r0p2

Differential Revision: https://reviews.llvm.org/D151894

Valentin Clement [Wed, 14 Jun 2023 15:17:00 +0000 (08:17 -0700)]
[flang][openacc] Add lowering for min operator

Add lowering support for the min operator
in the reduction clause.

Depends on D151565

Reviewed By: jeanPerier

Differential Revision: https://reviews.llvm.org/D151671

Tue Ly [Wed, 14 Jun 2023 14:59:33 +0000 (10:59 -0400)]
[libc] Fix merging issue with test/src/math/exhaustive/expm1f_test

Kirill Stoimenov [Wed, 14 Jun 2023 14:55:31 +0000 (14:55 +0000)]
[HWASAN] Fix bot test failure caused by D152763 by switching to
unaligned memory tagging

Tue Ly [Wed, 14 Jun 2023 14:52:23 +0000 (10:52 -0400)]
[libc] Enable hermetic floating point tests again.

Fix an issue where LLVM libc's fenv.h defined the rounding mode macros
differently from the system libc, making get_round() return different values
from fegetround().  Also let math tests skip rounding modes that cannot be
set.  This should allow math tests to run on platforms on which fenv.h is not
implemented yet.

This allows us to re-enable hermetic floating point tests in
https://reviews.llvm.org/D151123 and revert https://reviews.llvm.org/D152742.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D152873

Kelvin Li [Tue, 30 May 2023 23:37:44 +0000 (19:37 -0400)]
[flang] semantic checking for unsupported features in PPC vector type

Assumed-shape, deferred-shape and assumed-rank entities of PPC vector
type are not supported.

Differential Revision: https://reviews.llvm.org/D152864

Simon Pilgrim [Wed, 14 Jun 2023 14:34:02 +0000 (15:34 +0100)]
[GlobalIsel][X86] Update legalization of G_UADDE

Replace the legacy legalizer versions. This is still WIP but matches the existing s32 handling; we should be able to add full scalar support for G_UADDO/G_USUBE/G_USUBO very easily as well.

Alex Brachet [Wed, 14 Jun 2023 14:07:58 +0000 (14:07 +0000)]
[libc][NFC] Fix some issues with LIBC_INLINE

We define LIBC_INLINE to include [[clang::internal_linkage]], and such
attributes must appear before other specifiers. Additionally, a missing
cast was causing warnings.

Differential Revision: https://reviews.llvm.org/D152865

Viktoriia Bakalova [Fri, 9 Jun 2023 15:11:13 +0000 (15:11 +0000)]
[clangd] Use include_cleaner spelling strategies in clangd.

Differential Revision: https://reviews.llvm.org/D152913

Simon Pilgrim [Wed, 14 Jun 2023 14:01:06 +0000 (15:01 +0100)]
[GlobalIsel][X86] Regenerate legalize-add.mir with common CHECK prefix

Guillaume Chatelet [Wed, 14 Jun 2023 11:55:29 +0000 (11:55 +0000)]
[libc] Enable custom logging in LibcTest

This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152630

David Spickett [Tue, 6 Jun 2023 08:18:12 +0000 (09:18 +0100)]
[lldb][AArch64] Add Scalable Matrix Extension option to QEMU launch script

The Scalable Matrix Extension (SME) does not require extra options
beyond setting the cpu to "max".

https://qemu-project.gitlab.io/qemu/system/arm/cpu-features.html#sme-cpu-property-examples

SME depends on SVE, so that will be enabled too even if you don't ask
for it by name.

--sve --sme -> SVE and SME
--sme       -> SVE and SME
--sve       -> Only SVE
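
The option table above boils down to "SME implies SVE". A minimal sketch of
that rule follows; `enabled_extensions` is a hypothetical helper written for
illustration, not part of the launch script.

```python
def enabled_extensions(sve=False, sme=False):
    """Return the set of extensions enabled for the given flags."""
    features = set()
    if sve or sme:  # SME depends on SVE, so asking for SME also enables SVE
        features.add("SVE")
    if sme:
        features.add("SME")
    return features
```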

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D152519

Krzysztof Parzyszek [Tue, 6 Jun 2023 19:20:37 +0000 (12:20 -0700)]
[RDF] Print something useful for NodeId == 0 instead of crashing

Krzysztof Parzyszek [Tue, 6 Jun 2023 14:51:27 +0000 (07:51 -0700)]
[RDF] Remove unused parameter AllRefs from buildPhis

Francesco Petrogalli [Wed, 14 Jun 2023 13:11:44 +0000 (15:11 +0200)]
[MISched] Fix non-debug builds.

As reported in https://github.com/llvm/llvm-project/issues/63225, we
need to make sure we can use the `&operator<<` on instances of the
`ResourceSegments` class for builds that set `NDEBUG`.

Reviewed By: sylvestre.ledru

Differential Revision: https://reviews.llvm.org/D152817

Amaury Séchet [Wed, 14 Jun 2023 12:57:02 +0000 (12:57 +0000)]
[NFC] Add tests cases for isTruncateOf for D151916

Nikita Popov [Wed, 14 Jun 2023 12:56:10 +0000 (14:56 +0200)]
[InstCombine] Avoid infinite loop in insert/extract combine

Fix the infinite loop reported on https://reviews.llvm.org/D151807#4420467.

collectShuffleElements() will widen vectors and replace extracts
via replaceExtractElements(), to allow the next call of
collectShuffleElements() to fold. However, it's possible for another
fold to run first, and break the expected sequence again. To ensure
this does not happen, directly rerun the collectShuffleElements()
fold if we have adjusted extracts.

Kristina Bessonova [Mon, 13 Mar 2023 12:29:13 +0000 (13:29 +0100)]
[DwarfDebug] Move emission of imported entities from beginModule() to endModule() (2/7)

RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544

!Note! Extracted from the following patch for review purposes only; should
be squashed with the next patch (D144004) before committing.

Currently the back-end emits imported entities in `DwarfDebug::beginModule()`.
However in case an imported declaration is a function, it must point to an
abstract subprogram if it exists (see PR51501). But in `DwarfDebug::beginModule()`
the DWARF generator doesn't have information to identify if an abstract
subprogram needs to be created.

Only upon entering `DwarfDebug::endModule()` have all subprograms been processed,
so it's clear which subprogram DIE should be referred to. Hence, the patch moves
the emission there.

The patch is needed to fix PR51501, but it only does the preliminary
work. Since it changes the order of debug entities in the emitted DWARF and
therefore affects many tests, it's separated from the fix for the sake of
simplifying review.

Note that there are other issues with handling an imported declaration in
`DwarfDebug::beginModule()`. They are described in more detail in D114705.

Differential Revision: https://reviews.llvm.org/D143985

Depends on D143984

Takuya Shimizu [Wed, 14 Jun 2023 12:43:03 +0000 (21:43 +0900)]
[clang][Sema] Fix diagnostic message for unused constant variable templates

Before this patch, unused const-qualified variable templates such as `template <typename T> const double var_t = 0;` were diagnosed as `unused variable 'var_t'`.
This patch changes the message to `unused variable template 'var_t'`.

Differential Revision: https://reviews.llvm.org/D152796

Krishna Narayanan [Wed, 14 Jun 2023 12:28:35 +0000 (08:28 -0400)]
Update with warning message for comparison to NULL pointer

The tautological comparison warning was not properly looking through
parenthesized expressions, which is now fixed.

Fixes https://github.com/llvm/llvm-project/issues/42992
Differential Revision: https://reviews.llvm.org/D149000

Jay Foad [Wed, 14 Jun 2023 10:44:41 +0000 (11:44 +0100)]
[update_mir_test_checks] Tolerate -simplify-mir output

D135579 added support for fixedStack, but did not cope with the output
of -simplify-mir which does not include the fixedStack section by
default.

Differential Revision: https://reviews.llvm.org/D152896

Simon Pilgrim [Wed, 14 Jun 2023 12:07:57 +0000 (13:07 +0100)]
[docs] Add missing label

Matt Arsenault [Sat, 10 Jun 2023 17:22:34 +0000 (13:22 -0400)]
LowerMemIntrinsics: Check address space aliasing for memmove expansion

For cases where we cannot insert an addrspacecast, we can still expand
like a memcpy if we know the address spaces cannot alias. Normally
non-aliasing memmoves are optimized to memcpy, but we cannot rely on
that for lowering. If a target has aliasing address spaces that cannot
be cast between, we still have to give up lowering this.

Simon Pilgrim [Wed, 14 Jun 2023 11:49:07 +0000 (12:49 +0100)]
[docs] Add missing empty line at start of code-block

Simon Pilgrim [Wed, 14 Jun 2023 11:25:59 +0000 (12:25 +0100)]
[X86] X86FixupVectorConstantsPass - attempt to replace full width integer vector constant loads with broadcasts on AVX2+ targets (REAPPLIED)

lowerBuildVectorAsBroadcast will not broadcast splat constants in all cases. This leaves many situations where a full width vector load has failed to fold but is loading splat constant values; such a load could use a broadcast load instruction just as cheaply and save constant pool space.

This is an updated commit of ab4b924832ce26c21b88d7f82fcf4992ea8906bb after being reverted at 78de45fd4a902066617fcc9bb88efee11f743bc6

Nemanja Ivanovic [Wed, 14 Jun 2023 11:45:05 +0000 (06:45 -0500)]
[clang-tidy] Fix build bot break after 474a2b9367ad

The commit added clang-tidy checks without adding
the required library to the link step.
This caused failures with shared library builds.

Ivan Kosarev [Wed, 14 Jun 2023 10:53:12 +0000 (11:53 +0100)]
[AMDGPU][AsmParser][NFC] Get rid of custom default operand handlers.

Removes the need to add and remove them manually depending on whether
they are used in cvt*() functions. Also removes the compiler warnings
about unused handlers when it happens to be the case.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D151688

Jay Foad [Wed, 14 Jun 2023 11:05:38 +0000 (12:05 +0100)]
[AMDGPU] Use a common check prefix in regbankselect-amdgcn.s.buffer.load.ll

Mikael Holmen [Mon, 12 Jun 2023 12:08:28 +0000 (14:08 +0200)]
[SLPVectorizer] Don't include isAssumeLikeIntrinsics in ScheduleRegionSize

We don't want the existence of debug instructions to affect codegen, so we now
ignore debug instructions and other "isAssumeLikeIntrinsics" in the
"extend schedule region" search loop in
BoUpSLP::BlockScheduling::extendSchedulingRegion.

Differential Revision: https://reviews.llvm.org/D152441

Ivan Kosarev [Wed, 14 Jun 2023 10:40:48 +0000 (11:40 +0100)]
[AMDGPU][GFX11] Add test coverage for 16-bit conversions, part 2.

Reviewed By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D152715

Simon Pilgrim [Wed, 14 Jun 2023 10:39:12 +0000 (11:39 +0100)]
[docs] Add missing empty line before lists

Guillaume Chatelet [Wed, 14 Jun 2023 10:31:49 +0000 (10:31 +0000)]
Revert D152630 "[libc] Enable custom logging in LibcTest"

Failing buildbot https://lab.llvm.org/buildbot/#/builders/73/builds/49707
This reverts commit 9a7b4c934893d6bc571e1ce8efab2127ae5f4e45.

Guillaume Chatelet [Wed, 14 Jun 2023 09:18:42 +0000 (09:18 +0000)]
[libc] Enable custom logging in LibcTest

This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152630

Simon Pilgrim [Wed, 14 Jun 2023 10:06:15 +0000 (11:06 +0100)]
[CostModel][X86] Tweak SSE2 v2i64 multiply costs based off D46276 script

It looks like we were trying to account for SLM costs, which are actually handled separately.

Fixes #62969

Simon Pilgrim [Tue, 13 Jun 2023 19:06:21 +0000 (20:06 +0100)]
[TTI][X86] Recognise PMULUDQ costs for vXi64 multiplies

Addresses part of Issue #62969 - if the upper 32 bits of the vXi64 elements are known to be zero, then a multiply simplifies to a single (fast) PMULUDQ instruction.

We still have the problem that minRequiredElementSize can't determine that the upper bits are zero for the test case from Issue #62969 - I'll take a look at that next.
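
The arithmetic behind this can be checked on a single lane. The sketch below
models a 64-bit wrapping multiply and a PMULUDQ-style lane (low 32 bits of each
operand multiplied into a 64-bit result); the helper names are made up for the
illustration.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mul_i64(a, b):
    """Full 64-bit multiply, wrapping modulo 2**64."""
    return (a * b) & MASK64


def pmuludq_lane(a, b):
    """One PMULUDQ-style lane: multiply the low 32 bits of each operand."""
    return (a & MASK32) * (b & MASK32)
```

When both inputs fit in 32 bits the two agree, because the 32x32 product always
fits in 64 bits; once an upper bit is set, the results can differ.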

Théo Degioanni [Wed, 14 Jun 2023 08:43:10 +0000 (08:43 +0000)]
[mlir][llvm] Add memset support for mem2reg/sroa

This revision introduces support for memset intrinsics in SROA and
mem2reg for the LLVM dialect. This is achieved for SROA by breaking
memsets of aggregates into multiple memsets of scalars, and for mem2reg
by promoting memsets of single integer slots into the value the memset
operation would yield.

The SROA logic supports breaking memsets of static size operating at the
start of a memory slot. The intended most common case is for memsets
covering the entirety of a struct, most often as a way to initialize it
to 0.

The mem2reg logic supports dynamic values and static sizes as input to
promotable memsets. This is achieved by lowering memsets into
`ceil(log_2(n))` LeftShift operations, `ceil(log_2(n))` Or operations
and up to one ZExt operation (for n the byte width of the integer),
computing in registers the integer value the memset would create. Only
byte-aligned integers are supported; more types could easily be added
afterwards.
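
The shift-and-or doubling described above can be sketched on plain integers.
This is an illustration of the idea rather than the patch's actual op sequence:
`memset_value` is a hypothetical helper, and the final mask (standing in for
truncation to the slot width) is an assumption of this sketch.

```python
def memset_value(byte, n_bytes):
    """Return (value, steps): the integer a memset of `byte` would create in an
    n_bytes-wide integer slot, and the number of shift/or pairs used."""
    value = byte & 0xFF  # plays the role of the single ZExt
    filled = 1           # how many low bytes already hold the pattern
    steps = 0
    while filled < n_bytes:
        value |= value << (8 * filled)  # one LeftShift and one Or
        filled *= 2
        steps += 1
    mask = (1 << (8 * n_bytes)) - 1     # drop bytes beyond the slot width
    return value & mask, steps
```

For a 4-byte integer this takes two shift/or pairs, matching
`ceil(log_2(4)) = 2`; a 3-byte integer also needs two, with the surplus byte
masked off.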

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D152367

Cullen Rhodes [Wed, 14 Jun 2023 09:02:53 +0000 (09:02 +0000)]
Revert "[mlir][ArmSME] Add initial dialect with basic lowering of vector.transfer write to zero"

Apologies, I shouldn't have committed this; we need to wait until the planned
MLIR ODM:

  https://discourse.llvm.org/t/rfc-creating-a-armsme-dialect/67208/76

This reverts commit a48fe898857c95a063fa6c201343dca969bc098a.

Cullen Rhodes [Wed, 14 Jun 2023 08:26:44 +0000 (08:26 +0000)]
[mlir][ArmSME] Add initial dialect with basic lowering of vector.transfer write to zero

This patch adds support for lowering a `vector.transfer_write` of zeroes
and type `vector<[16x16]xi8>` to the SME `zero {za}` instruction [1],
which zeroes the entire accumulator.

This contributes to supporting a path from `linalg.fill` to SME.

[1] https://developer.arm.com/documentation/ddi0602/2022-06/SME-Instructions/ZERO--Zero-a-list-of-64-bit-element-ZA-tiles-

Reviewed By: awarzynski, dcaballe

Differential Revision: https://reviews.llvm.org/D152508

Guillaume Chatelet [Tue, 13 Jun 2023 14:41:17 +0000 (14:41 +0000)]
[libc] Dispatch memmove to memcpy when buffers are disjoint

Most of the time `memmove` is called on buffers that are disjoint; in that case we can use `memcpy`, which is faster.
The additional test is branchless on x86, aarch64 and RISCV with the zbb extension (bitmanip).
On x86 this patch adds a latency of 2 to 3 cycles.
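
A minimal sketch of the dispatch idea, using offsets into one `bytearray` in
place of raw pointers. The names and the distance-based disjointness test are
assumptions made for illustration; the actual libc code may differ.

```python
def is_disjoint(dst, src, size):
    """Two ranges of `size` bytes do not overlap if their starts are at least
    `size` bytes apart."""
    return abs(dst - src) >= size


def memmove(buf, dst, src, size):
    """Copy `size` bytes inside `buf`; returns which path was taken."""
    if is_disjoint(dst, src, size):
        # memcpy-like path: a plain forward copy is safe.
        for i in range(size):
            buf[dst + i] = buf[src + i]
        return "memcpy"
    # Overlapping ranges: copy through a snapshot of the source.
    buf[dst:dst + size] = bytes(buf[src:src + size])
    return "overlap"
```

For example, moving 4 bytes from offset 0 to offset 4 of an 8-byte buffer takes
the `memcpy` path, while moving them to offset 2 takes the overlap-safe path.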

Before
```
--------------------------------------------------------------------------------
Benchmark                      Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------
BM_Memmove/0/0_median       5.00 ns         5.00 ns           10 bytes_per_cycle=1.25477/s bytes_per_second=2.62933G/s items_per_second=199.87M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median       6.21 ns         6.21 ns           10 bytes_per_cycle=3.22173/s bytes_per_second=6.75106G/s items_per_second=160.955M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median       8.09 ns         8.09 ns           10 bytes_per_cycle=5.31462/s bytes_per_second=11.1366G/s items_per_second=123.603M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median       5.95 ns         5.95 ns           10 bytes_per_cycle=2.71865/s bytes_per_second=5.69687G/s items_per_second=167.967M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median       5.63 ns         5.63 ns           10 bytes_per_cycle=2.28294/s bytes_per_second=4.78383G/s items_per_second=177.615M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median       5.68 ns         5.68 ns           10 bytes_per_cycle=2.16798/s bytes_per_second=4.54295G/s items_per_second=176.015M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median       7.46 ns         7.46 ns           10 bytes_per_cycle=3.97619/s bytes_per_second=8.332G/s items_per_second=134.044M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median       5.40 ns         5.40 ns           10 bytes_per_cycle=1.79695/s bytes_per_second=3.76546G/s items_per_second=185.211M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median       5.62 ns         5.62 ns           10 bytes_per_cycle=3.18747/s bytes_per_second=6.67927G/s items_per_second=177.983M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median        101 ns          101 ns           10 bytes_per_cycle=9.77359/s bytes_per_second=20.4803G/s items_per_second=9.9333M/s __llvm_libc::memmove,uniform 384 to 4096
```
After
```
BM_Memmove/0/0_median       3.57 ns         3.57 ns           10 bytes_per_cycle=1.71375/s bytes_per_second=3.59112G/s items_per_second=280.411M/s __llvm_libc::memmove,memmove Google A
BM_Memmove/1/0_median       4.52 ns         4.52 ns           10 bytes_per_cycle=4.47557/s bytes_per_second=9.37843G/s items_per_second=221.427M/s __llvm_libc::memmove,memmove Google B
BM_Memmove/2/0_median       5.70 ns         5.70 ns           10 bytes_per_cycle=7.37396/s bytes_per_second=15.4519G/s items_per_second=175.399M/s __llvm_libc::memmove,memmove Google D
BM_Memmove/3/0_median       4.47 ns         4.47 ns           10 bytes_per_cycle=3.4148/s bytes_per_second=7.15563G/s items_per_second=223.743M/s __llvm_libc::memmove,memmove Google L
BM_Memmove/4/0_median       4.53 ns         4.53 ns           10 bytes_per_cycle=2.86071/s bytes_per_second=5.99454G/s items_per_second=220.69M/s __llvm_libc::memmove,memmove Google M
BM_Memmove/5/0_median       4.19 ns         4.19 ns           10 bytes_per_cycle=2.5484/s bytes_per_second=5.3401G/s items_per_second=238.924M/s __llvm_libc::memmove,memmove Google Q
BM_Memmove/6/0_median       5.02 ns         5.02 ns           10 bytes_per_cycle=5.94164/s bytes_per_second=12.4505G/s items_per_second=199.14M/s __llvm_libc::memmove,memmove Google S
BM_Memmove/7/0_median       4.03 ns         4.03 ns           10 bytes_per_cycle=2.47028/s bytes_per_second=5.17641G/s items_per_second=247.906M/s __llvm_libc::memmove,memmove Google U
BM_Memmove/8/0_median       4.70 ns         4.70 ns           10 bytes_per_cycle=3.84975/s bytes_per_second=8.06706G/s items_per_second=212.72M/s __llvm_libc::memmove,memmove Google W
BM_Memmove/9/0_median       90.7 ns         90.7 ns           10 bytes_per_cycle=10.8681/s bytes_per_second=22.7739G/s items_per_second=11.02M/s __llvm_libc::memmove,uniform 384 to 4096
```

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D152811

Vitaly Buka [Wed, 14 Jun 2023 08:15:59 +0000 (01:15 -0700)]
[test][hwasan] Allow test for any platform with tagging

Carl Ritson [Wed, 14 Jun 2023 08:13:32 +0000 (17:13 +0900)]
[AMDGPU] Pre-commit test for D152892 (NFC)

Nikita Popov [Wed, 14 Jun 2023 08:08:46 +0000 (10:08 +0200)]
[PhaseOrdering] Regenerate test checks (NFC)

Just naming changes.

Nikita Popov [Wed, 14 Jun 2023 07:18:10 +0000 (09:18 +0200)]
[InstCombine] Handle use count decrement in more cases

These two helpers also decrement the use count of the replaced
operand, so give them the same treatment as eraseInstruction().

Vitaly Buka [Wed, 14 Jun 2023 07:58:05 +0000 (00:58 -0700)]
[test][hwasan] Rename constants in test

Nikita Popov [Wed, 14 Jun 2023 07:57:01 +0000 (09:57 +0200)]
[Clang] Rename getElementBitCast() -> withElementType() (NFC)

This no longer creates a bitcast, just changes the element type
of the ConstantAddress.

Joshua Cao [Tue, 30 May 2023 03:57:20 +0000 (20:57 -0700)]
[SimpleLoopUnswitch] Unswitch AND/OR conditions of selects

If a select's condition is an AND/OR, we can unswitch invariant operands.
This patch uses existing logic from unswitching AND/OR's for branch
conditions.

This patch fixes the Cost computation for unswitching selects to have
the cost of the entire loop, since unswitching selects do not remove
branches. This is required for this patch because otherwise, there are
cases where unswitching selects of AND/OR is beating out unswitching of
branches.

This patch also prevents unswitching of logical AND/OR selects. This
should instead be done by unswitching of AND/OR branch conditions.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D151677

Joshua Cao [Tue, 30 May 2023 04:06:20 +0000 (21:06 -0700)]
[SimpleLoopUnswitch][NFC] Add tests for and/or conditions of selects

Matthias Springer [Wed, 14 Jun 2023 07:31:13 +0000 (09:31 +0200)]
[mlir][vector][bufferize] Better analysis for vector.transfer_write

The destination operand does not bufferize to a memory read if it is completely overwritten.

Differential Revision: https://reviews.llvm.org/D152823

Michael Platings [Wed, 14 Jun 2023 07:28:15 +0000 (08:28 +0100)]
Fix test Driver/mips-mti-linux.c

Nikita Popov [Wed, 31 May 2023 14:09:06 +0000 (16:09 +0200)]
[InstCombine] Revisit user of newly one-use instructions

Many folds in InstCombine are limited to one-use instructions. For
that reason, if the use-count of an instruction drops to one, it
makes sense to revisit that one user. This is one of the most
common reasons why InstCombine fails to finish in a single iteration.

Doing this revisit actually slightly improves compile-time, because
we save an extra InstCombine iteration in enough cases to make a
visible difference.
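
The revisit rule can be sketched with a toy use list and worklist. The data
structures and the `drop_use` helper below are hypothetical and far simpler
than InstCombine's own worklist handling.

```python
def drop_use(users, value, user, worklist):
    """Remove `user` from the users of `value`. If exactly one user is left,
    queue it so that one-use folds get another chance."""
    users[value].remove(user)
    if len(users[value]) == 1:
        remaining = users[value][0]
        if remaining not in worklist:
            worklist.append(remaining)
```

Dropping a use while two or more users remain queues nothing; the remaining
user is queued only at the moment the count reaches one.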

This is conceptually NFC, but not NFC in practice, because differences
in worklist order can result in slightly different folding behavior.

The regressed tests in or-shifted-masks.ll now require a sequence of
instcombine,early-cse,instcombine to fold fully. D152876 would make
these fold in a single instcombine run again.

Differential Revision: https://reviews.llvm.org/D151807

eopXD [Sat, 3 Jun 2023 16:02:37 +0000 (09:02 -0700)]
[11/11][Clang][RISCV] Expand all variants for vset on tuple types

This is the 11th patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152078.

This patch also fixes the suffix for non-overloaded variants for
vset on tuple types.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152079

eopXD [Sat, 3 Jun 2023 15:41:40 +0000 (08:41 -0700)]
[10/11][Clang][RISCV] Expand all variants for vget on tuple types

This is the 10th patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152077.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152078

eopXD [Sat, 3 Jun 2023 03:25:51 +0000 (20:25 -0700)]
[9/11][Clang][RISCV] Expand all variants for indexed strided segment store

This is the 9th patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152076.

This patch expands all variants of indexed strided segment store.
This patch also fixes the trailing suffix in the intrinsics' function
name that represents the return type, adding `x{NF}`.

For the same reason mentioned in [3/11], full test cases are added only
for vsuxseg2ei32 and vsoxseg2ei32 for now.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152077

eopXD [Sat, 3 Jun 2023 03:20:04 +0000 (20:20 -0700)]
[8/11][Clang][RISCV] Expand all variants for indexed strided segment load

This is the 8th patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152075.

This patch expands all variants of indexed strided segment load,
including the policy variants. This patch also fixes the trailing suffix
in the intrinsics' function name that represents the return type,
adding `x{NF}`.

For the same reason mentioned in [3/11], full test cases are added only
for vluxseg2ei32 and vloxseg2ei32 for now.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152076

eopXD [Sat, 3 Jun 2023 03:07:08 +0000 (20:07 -0700)]
[7/11][Clang][RISCV] Expand all variants for strided segment store

This is the 7th patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152074.

This patch expands all variants for strided segment store. The store
intrinsics do not have any policy variants. This patch also fixes the
trailing suffix in the intrinsics' function name that represents the
return type, adding `x{NF}`.

For the same reason mentioned in [3/11], a full test case is added only
for vssseg2e32 for now.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152075

eopXD [Sat, 3 Jun 2023 02:58:24 +0000 (19:58 -0700)]
[6/11][Clang][RISCV] Expand all variants for strided segment load

This is the 6th patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152073.

This patch expands all variants of strided segment load, including the
policy variants. This patch also fixes the trailing suffix in the
intrinsics' function name that represents the return type, adding
`x{NF}`.

For the same reason mentioned in [3/11], a full test case is added only
for vlsseg2e32.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152074

Chuanqi Xu [Wed, 14 Jun 2023 07:04:39 +0000 (15:04 +0800)]
[NFC] skip the test modules-vtable.cppm on windows

The newly added test has problems on Windows since the patch is about the ABI
and the MSVC ABI is not covered. Skip the test on Windows to make the CI
green.

eopXD [Tue, 30 May 2023 17:14:06 +0000 (10:14 -0700)]
[5/11][Clang][RISCV] Expand all variants for unit stride fault-first segment load

This is the 5th patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152072.

This patch expands all variants of unit stride fault-first segment
load, including the policy variants. This patch also fixes the
trailing suffix in the intrinsics' function name that represents
the return type, adding `x{NF}`.

For the same reason mentioned in [3/11], full test cases are added only
for vlseg2e32ff.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152073

17 months ago[CSKY] Add support for half-precision floats
Zi Xuan Wu (Zeson) [Wed, 14 Jun 2023 06:58:48 +0000 (14:58 +0800)]
[CSKY] Add support for half-precision floats

Complete fp16 support by ensuring that load extension / truncate store operations are properly expanded.

17 months ago[NFC][RISCV] rename findFirstNonVersionCharacter with findLastNonVersionCharacter
Piyou Chen [Wed, 14 Jun 2023 06:05:09 +0000 (23:05 -0700)]
[NFC][RISCV] rename findFirstNonVersionCharacter with findLastNonVersionCharacter

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152506

17 months ago[mlir][IR] Improve listener notifications for ops without results
Matthias Springer [Wed, 14 Jun 2023 06:41:19 +0000 (08:41 +0200)]
[mlir][IR] Improve listener notifications for ops without results

`RewriterBase::Listener::notifyOperationReplaced` notifies observers that an op is about to be replaced with a range of values. This notification is not very useful for ops without results, because it does not specify the replacement op (and it cannot be deduced from the replacement values). It provides no additional information over the `notifyOperationRemoved` notification.

This revision adds an additional notification when a rewriter replaces an op with another op. By default, this notification triggers the original "op replaced with values" notification, so there is no functional change for existing code.
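
The default-forwarding behaviour described above can be sketched roughly as follows. This is a standalone illustration, not the MLIR code: `Value`, `Operation`, `Listener`, and the two derived listeners are hypothetical stand-ins, and only the hook name `notifyOperationReplaced` is taken from the commit message.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for MLIR's Value and Operation (not the real classes).
struct Value {
  std::string name;
};
struct Operation {
  std::string name;
  std::vector<Value> results;
};

struct Listener {
  virtual ~Listener() = default;

  // Existing-style hook: `op` is about to be replaced with a range of values.
  virtual void notifyOperationReplaced(Operation *op,
                                       const std::vector<Value> &replacement) {}

  // New-style hook: `op` is about to be replaced with another op. By default
  // it forwards to the value-based hook, so existing listeners are unaffected.
  virtual void notifyOperationReplaced(Operation *op, Operation *newOp) {
    notifyOperationReplaced(op, newOp->results);
  }
};

// A pre-existing listener that only knows about the value-based hook.
struct LegacyListener : Listener {
  using Listener::notifyOperationReplaced;
  int valueNotifications = 0;
  void notifyOperationReplaced(Operation *,
                               const std::vector<Value> &) override {
    ++valueNotifications;
  }
};

// A listener that wants the replacement op itself, e.g. to track handles.
struct TrackingListener : Listener {
  using Listener::notifyOperationReplaced;
  std::string lastReplacement;
  void notifyOperationReplaced(Operation *, Operation *newOp) override {
    lastReplacement = newOp->name;
  }
};
```

With this shape, a rewriter that always calls the op-based hook still reaches `LegacyListener` through the default implementation, while `TrackingListener` sees the replacement op even when it has no results.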

This new API is useful for the transform dialect, which needs to track op replacements. (Updated in a subsequent revision.)

Also includes minor documentation improvements.

Differential Revision: https://reviews.llvm.org/D152814

17 months ago[4/11][Clang][RISCV] Expand all variants for unit stride segment store
eopXD [Tue, 30 May 2023 16:50:51 +0000 (09:50 -0700)]
[4/11][Clang][RISCV] Expand all variants for unit stride segment store

This is the 4th patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152071.

This patch expands all variants for unit stride segment store. The
store intrinsics do not have any policy variants. This patch also
fixes the trailing suffix in the intrinsics' function name that
represents the return type, adding `x{NF}`.

For the same reason mentioned in [3/11], full test cases are added only
for vsseg2e32.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152072

17 months ago[3/11][Clang][RISCV] Expand all variants for unit stride segment load
eopXD [Tue, 30 May 2023 15:32:48 +0000 (08:32 -0700)]
[3/11][Clang][RISCV] Expand all variants for unit stride segment load

This is the 3rd patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152070.

This patch expands all variants of unit stride segment load, including
the policy variants. This patch also fixes the trailing suffix in the
intrinsics' function name that represents the return type, adding
`x{NF}`.

Currently the tuple type intrinsics co-exist with the non-tuple type
intrinsics. Since the co-existence is temporary, this patch only adds
test cases of all variants for vlseg2e32 to demonstrate the capability.

Test cases for other data types and NF values will be added later in the
patch-set, when the replacement happens.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152071

17 months ago[2/11][Clang][RISCV] Expand all variants of RVV intrinsic tuple types
eopXD [Sun, 28 May 2023 13:14:11 +0000 (06:14 -0700)]
[2/11][Clang][RISCV] Expand all variants of RVV intrinsic tuple types

This is the 2nd patch of the patch-set. For the cover letter, please
check out D152069.

Depends on D152069.

This patch also removes redundant checks related to tuples and
consolidates the check in `RVVType::verifyType`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152070

17 months ago[flang][openacc] Lower gang dim to MLIR
Valentin Clement [Wed, 14 Jun 2023 06:20:11 +0000 (23:20 -0700)]
[flang][openacc] Lower gang dim to MLIR

Lower gang dim from the parse tree to the new MLIR
representation.

Depends on D151972

Reviewed By: razvanlupusoru, jeanPerier

Differential Revision: https://reviews.llvm.org/D151973

17 months ago[Orc][Coff] Skip registration of voltbl sections
River Riddle [Fri, 9 Jun 2023 21:08:41 +0000 (14:08 -0700)]
[Orc][Coff] Skip registration of voltbl sections

We're getting asserts for duplicate section registration during
linking, which trace back to these sections. From previous
discussions, it seems like these are metadata sections that can
be dropped. See the discussion in D116474 and
https://bugs.llvm.org/show_bug.cgi?id=45111.

Differential Revision: https://reviews.llvm.org/D152574

17 months ago[Docs] Multilib design
Michael Platings [Tue, 6 Jun 2023 17:56:59 +0000 (18:56 +0100)]
[Docs] Multilib design

Reviewed By: peter.smith, MaskRay

Differential Revision: https://reviews.llvm.org/D143587

17 months ago[Driver] BareMetal ToolChain multilib layering
Michael Platings [Wed, 1 Feb 2023 15:48:46 +0000 (15:48 +0000)]
[Driver] BareMetal ToolChain multilib layering

This enables layering baremetal multilibs on top of each other.
For example a multilib containing only a no-exceptions libc++ could be
layered on top of a multilib containing C libs. This avoids the need
to duplicate the C library for every libc++ variant.
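
A minimal sketch of the layering idea, assuming a layer placed on top is searched before the layers beneath it. `MultilibLayer` and `resolveInLayers` are hypothetical names for illustration only and do not correspond to the Driver's actual data structures.

```cpp
#include <string>
#include <vector>

// Hypothetical model of one multilib layer: a directory and the files in it.
struct MultilibLayer {
  std::string dir;
  std::vector<std::string> files;
};

// Resolve a file by searching layers from the last (topmost) to the first
// (bottom). Returns an empty string if no layer provides the file.
std::string resolveInLayers(const std::vector<MultilibLayer> &layers,
                            const std::string &file) {
  for (auto it = layers.rbegin(); it != layers.rend(); ++it)
    for (const std::string &candidate : it->files)
      if (candidate == file)
        return it->dir + "/" + file;
  return std::string();
}
```

With a C library layer at the bottom and a libc++-only layer on top, `libc.a` resolves to the bottom layer and `libc++.a` to the top one, so the C library exists only once on disk.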

Differential Revision: https://reviews.llvm.org/D143075

17 months ago[Driver] Enable selecting multiple multilibs
Michael Platings [Tue, 31 Jan 2023 15:45:16 +0000 (15:45 +0000)]
[Driver] Enable selecting multiple multilibs

This will enable layering multilibs on top of each other.
For example a multilib containing only a no-exceptions libc++ could be
layered on top of a multilib containing C libs. This avoids the need
to duplicate the C library for every libc++ variant.

This change doesn't expose the functionality externally; it only opens
the functionality up to be potentially used by ToolChain classes.

Differential Revision: https://reviews.llvm.org/D143059

17 months ago[Driver] Enable multilib.yaml in the BareMetal ToolChain
Michael Platings [Tue, 7 Feb 2023 16:19:23 +0000 (16:19 +0000)]
[Driver] Enable multilib.yaml in the BareMetal ToolChain

The default location for multilib.yaml is lib/clang-runtimes, without
any target-specific suffix. This will allow multilibs for different
architectures to share a common include directory.

To avoid breaking the arm-execute-only.c CHECK-NO-EXECUTE-ONLY-ASM
test, add a ForMultilib argument to getARMTargetFeatures.

Since the presence of multilib.yaml can change the exact location of a
library, relax the baremetal.cpp test.

Differential Revision: https://reviews.llvm.org/D142986

17 months ago[Driver] Add -print-multi-flags-experimental option
Michael Platings [Thu, 9 Mar 2023 19:38:26 +0000 (19:38 +0000)]
[Driver] Add -print-multi-flags-experimental option

This option causes the flags used for selecting multilibs to be printed.
This is an experimental feature that is documented in detail in D143587.

Differential Revision: https://reviews.llvm.org/D142933

17 months ago[Driver] Multilib YAML parsing
Michael Platings [Tue, 6 Jun 2023 14:31:46 +0000 (15:31 +0100)]
[Driver] Multilib YAML parsing

The format includes a ClangMinimumVersion entry to avoid a potential
source of subtle errors if an older version of Clang were to be used
with a multilib.yaml that requires a newer Clang to work correctly.
This feature is comparable to CMake's cmake_minimum_required.
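
The kind of check a `ClangMinimumVersion` entry enables might look like the sketch below. The function names are hypothetical and the real parser is not shown here; the point is only that versions are compared component by component as numbers, not as strings.

```cpp
#include <array>
#include <cstdio>
#include <string>

// Parse "major[.minor[.patch]]" into three numbers. Components that are
// missing (or that fail to parse) are left at 0.
std::array<int, 3> parseVersion(const std::string &text) {
  std::array<int, 3> v{{0, 0, 0}};
  std::sscanf(text.c_str(), "%d.%d.%d", &v[0], &v[1], &v[2]);
  return v;
}

// True if the running compiler version is at least the required minimum.
// std::array compares lexicographically, i.e. major first, then minor, then
// patch.
bool satisfiesMinimum(const std::string &running, const std::string &minimum) {
  return parseVersion(running) >= parseVersion(minimum);
}
```

A numeric comparison matters here: compared as plain strings, "9.0.0" would sort after "10.0.0".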

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D142932

17 months ago[HWASAN] Implement munmap interceptor for HWASAN
Kirill Stoimenov [Sat, 10 Jun 2023 00:16:48 +0000 (00:16 +0000)]
[HWASAN] Implement munmap interceptor for HWASAN

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D152763

17 months ago[LoongArch] Ignore warnings when there is no environment in triple
Wang Rui [Wed, 14 Jun 2023 05:21:25 +0000 (13:21 +0800)]
[LoongArch] Ignore warnings when there is no environment in triple

In Rust bare-metal targets, there is no environment component in the triple name. This patch ignores warnings that look like:

```
warning: triple-implied ABI conflicts with provided target-abi 'lp64s', using target-abi
```

Reviewed By: SixWeining, xen0n

Differential Revision: https://reviews.llvm.org/D152778

17 months ago[ABI] [C++20] [Modules] Don't generate vtable if the class is defined in other module...
Chuanqi Xu [Wed, 14 Jun 2023 04:45:34 +0000 (12:45 +0800)]
[ABI] [C++20] [Modules] Don't generate vtable if the class is defined in other module unit

Close https://github.com/llvm/llvm-project/issues/61940.

The root cause is that clang currently generates the vtable as a strong
symbol even if the corresponding class is defined in another module unit.
After checking the wording in the Itanium ABI, I found that this is
inconsistent with it.
Itanium ABI 5.2.3
(https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vague-vtable) says:

> The virtual table for a class is emitted in the same object containing
> the definition of its key function, i.e. the first non-pure virtual
> function that is not inline at the point of class definition.

So the current behavior is incorrect. This patch tries to address this.
I also think we need a similar change for the MSVC ABI, but I could not
find the formal wording, so that is not addressed in this patch.
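
As an illustration of the quoted key-function rule only (not of the module-specific change in this patch), consider the hypothetical classes below. `Shape` and `Square` are made-up names.

```cpp
#include <string>

struct Shape {
  virtual ~Shape() = default;              // defaulted in class, hence inline
  virtual double area() const = 0;         // pure virtual, not a candidate
  virtual std::string name() const;        // first non-pure, non-inline
                                           // virtual: the key function
  virtual int sides() const { return 0; }  // defined in class, hence inline
};

// Under the Itanium ABI, the vtable for Shape is emitted in the object that
// contains this out-of-line definition of the key function.
std::string Shape::name() const { return "shape"; }

// Square has no key function (all of its virtuals are inline), so its vtable
// is emitted wherever it is needed, with vague linkage.
struct Square : Shape {
  double side;
  explicit Square(double s) : side(s) {}
  double area() const override { return side * side; }
  int sides() const override { return 4; }
};
```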

Reviewed By: rjmccall, iains, dblaikie

Differential Revision: https://reviews.llvm.org/D150023

17 months ago[scudo] Fix bound checks in MemMap and ReservedMemory methods
Fabio D'Urso [Wed, 14 Jun 2023 03:54:08 +0000 (03:54 +0000)]
[scudo] Fix bound checks in MemMap and ReservedMemory methods

Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D152690

17 months ago[lldb] Fix Debugger whitespace and formatting (NFC)
Jonas Devlieghere [Wed, 14 Jun 2023 03:48:05 +0000 (20:48 -0700)]
[lldb] Fix Debugger whitespace and formatting (NFC)

Remove trailing whitespace and fix formatting.

17 months ago[lldb] Include <atomic> in LLDBAssert
Jonas Devlieghere [Wed, 14 Jun 2023 03:50:14 +0000 (20:50 -0700)]
[lldb] Include <atomic> in LLDBAssert

17 months ago[lldb] Print lldbassert to debugger diagnostics
Jonas Devlieghere [Wed, 14 Jun 2023 03:24:18 +0000 (20:24 -0700)]
[lldb] Print lldbassert to debugger diagnostics

When hitting an lldbassert in a non-assert build, we emit a blurb
including the assertion, the triggering file and line and a pretty
backtrace leading up to the issue. Currently, this is all printed to
stderr. That's fine on the command line, but when used as a library, for
example from Xcode, this information doesn't make it to the user. This
patch uses the diagnostic infrastructure to report LLDB asserts as
diagnostic events.

The patch is slightly more complicated than I would've liked because of
layering. lldbassert is part of Utility while the debugger diagnostics
are implemented in Core.
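
One common way to work around this kind of layering constraint is a callback that the higher layer installs into the lower one. The sketch below shows that pattern in isolation; all names (`setAssertCallback`, `reportAssert`, `DiagnosticSink`, `installDiagnosticReporting`) are hypothetical, and the sketch makes no claim about how the patch itself is structured.

```cpp
#include <cstdio>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// ---- Lower layer (plays the role of "Utility"): knows nothing about the
// ---- diagnostics machinery, only about an optional callback.
using AssertCallback = std::function<void(const std::string &)>;

static AssertCallback &callbackSlot() {
  static AssertCallback callback;  // empty until a higher layer installs one
  return callback;
}

void setAssertCallback(AssertCallback callback) {
  callbackSlot() = std::move(callback);
}

void reportAssert(const std::string &message) {
  if (callbackSlot())
    callbackSlot()(message);  // route to whoever registered
  else
    std::fprintf(stderr, "assertion failed: %s\n", message.c_str());
}

// ---- Higher layer (plays the role of "Core"): owns the diagnostics and
// ---- registers itself with the lower layer.
struct DiagnosticSink {
  std::vector<std::string> events;
};

void installDiagnosticReporting(DiagnosticSink &sink) {
  setAssertCallback(
      [&sink](const std::string &message) { sink.events.push_back(message); });
}
```

The lower layer keeps its stderr fallback for command-line use, and the dependency arrow still points only from the higher layer to the lower one.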

Differential revision: https://reviews.llvm.org/D152866

17 months ago[flang][openacc][NFC] Remove unused genObjectList function
Valentin Clement [Wed, 14 Jun 2023 03:43:44 +0000 (20:43 -0700)]
[flang][openacc][NFC] Remove unused genObjectList function

genObjectList is not used anymore. Just remove it.

Depends on D151975

Reviewed By: razvanlupusoru

Differential Revision: https://reviews.llvm.org/D151976

17 months ago[flang][openacc] Add parsing support for dim in gang clause
Valentin Clement [Wed, 14 Jun 2023 03:33:20 +0000 (20:33 -0700)]
[flang][openacc] Add parsing support for dim in gang clause

Add parsing support for dim in gang clause.

Depends on D151971

Reviewed By: razvanlupusoru, jeanPerier

Differential Revision: https://reviews.llvm.org/D151972

17 months ago[mlir][flang][openacc] Use new firstprivate representation for compute construct
Valentin Clement [Wed, 14 Jun 2023 03:32:04 +0000 (20:32 -0700)]
[mlir][flang][openacc] Use new firstprivate representation for compute construct

Use the new firstprivate representation on the compute construct.

Reviewed By: razvanlupusoru, jeanPerier

Differential Revision: https://reviews.llvm.org/D151975

17 months ago[flang] Fix flang-aarch64-latest-gcc build failure
Kelvin Li [Wed, 14 Jun 2023 02:57:47 +0000 (22:57 -0400)]
[flang] Fix flang-aarch64-latest-gcc build failure

The failure is due to a mismatch between the SmallVector parameter and
the return when built with gcc.

17 months ago[gn] Fix case of directory I added in 9239cde390e
Nico Weber [Wed, 14 Jun 2023 03:08:43 +0000 (20:08 -0700)]
[gn] Fix case of directory I added in 9239cde390e

17 months ago[gn build] Port 2700da5fe28d (lld/unittests etc)
Nico Weber [Wed, 14 Jun 2023 02:41:34 +0000 (19:41 -0700)]
[gn build] Port 2700da5fe28d (lld/unittests etc)

17 months agoRevert "[RISCV] Fold binary op into select if profitable."
Craig Topper [Wed, 14 Jun 2023 01:01:22 +0000 (18:01 -0700)]
Revert "[RISCV] Fold binary op into select if profitable."

This reverts commit d0189584631e587279ee5f0af5feb94d8045bb31.

Build failures have been reported in the Linux kernel.

17 months ago[InstCombine] Transform `(binop1 (binop2 (lshift X,Amt),Mask),(lshift Y,Amt))`
Noah Goldstein [Wed, 14 Jun 2023 00:32:19 +0000 (19:32 -0500)]
[InstCombine] Transform `(binop1 (binop2 (lshift X,Amt),Mask),(lshift Y,Amt))`

If `Mask` and `Amt` are not constants and `binop1` and `binop2` are
the same we can transform to:
`(binop (lshift (binop X, Y), Amt), Mask)`

If `binop` is `add`, `lshift` must be `shl`.

If `Mask` and `Amt` are constants `C` and `C1` respectively, we can
transform to:
`(lshift1 (binop1 (binop2 X, (inv_lshift1 C, C1)), Y), C1)`

This saves an instruction iff:
`lshift1` is the same opcode as `lshift2`
Either `binop1` and/or `binop2` is `and`.

Proofs(1/2): https://alive2.llvm.org/ce/z/BjN-m_
Proofs(2/2): https://alive2.llvm.org/ce/z/bZn5QB
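
As a quick sanity check of the non-constant case, the identity can be brute-forced on 8-bit values. This is only a sketch: the helper names are made up, only a handful of masks are sampled, and an 8-bit check is no substitute for the Alive2 proofs above.

```cpp
#include <cstdint>

using Op = std::uint8_t (*)(std::uint8_t, std::uint8_t);

std::uint8_t opAnd(std::uint8_t a, std::uint8_t b) { return a & b; }
std::uint8_t opAdd(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(a + b);  // wraps modulo 256
}
std::uint8_t shl(std::uint8_t v, std::uint8_t amt) {
  return static_cast<std::uint8_t>(v << amt);
}
std::uint8_t lshr(std::uint8_t v, std::uint8_t amt) {
  return static_cast<std::uint8_t>(v >> amt);
}

// Checks, for all 8-bit X and Y, all shift amounts 0..7 and a few masks, that
//   (binop (binop (shift X, Amt), Mask), (shift Y, Amt))
// equals
//   (binop (shift (binop X, Y), Amt), Mask)
bool foldHolds(Op binop, Op shift) {
  const std::uint8_t masks[] = {0x00, 0x0F, 0xA5, 0xF0, 0xFF};
  for (unsigned amt = 0; amt < 8; ++amt)
    for (std::uint8_t mask : masks)
      for (unsigned xi = 0; xi < 256; ++xi)
        for (unsigned yi = 0; yi < 256; ++yi) {
          const std::uint8_t x = static_cast<std::uint8_t>(xi);
          const std::uint8_t y = static_cast<std::uint8_t>(yi);
          const std::uint8_t a = static_cast<std::uint8_t>(amt);
          const std::uint8_t before =
              binop(binop(shift(x, a), mask), shift(y, a));
          const std::uint8_t after = binop(shift(binop(x, y), a), mask);
          if (before != after)
            return false;
        }
  return true;
}
```

`foldHolds(opAnd, shl)`, `foldHolds(opAnd, lshr)`, and `foldHolds(opAdd, shl)` all return true, while `foldHolds(opAdd, lshr)` returns false (for example X = Y = 1, Amt = 1), which matches the restriction that `add` requires `shl`.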

This is to help fix the regression caused in D151807

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D152568

17 months ago[InstCombine] Add tests for (binop (binop (lshift X,Amt),Mask),(lshift Y,Amt)); NFC
Noah Goldstein [Fri, 9 Jun 2023 17:54:51 +0000 (12:54 -0500)]
[InstCombine] Add tests for (binop (binop (lshift X,Amt),Mask),(lshift Y,Amt)); NFC

Differential Revision: https://reviews.llvm.org/D152567

17 months agoTargetTransformInfo: Add addrspacesMayAlias
Matt Arsenault [Sat, 10 Jun 2023 17:03:22 +0000 (13:03 -0400)]
TargetTransformInfo: Add addrspacesMayAlias

For some reason we used to only handle address space aliasing through
chaining a target specific AA pass. We need never-fail simple queries
in order to lower memmove intrinsics based purely on the address
spaces.
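
A rough sketch of how such a never-fail query could drive memmove lowering. The types here are hypothetical stand-ins rather than LLVM's TargetTransformInfo, and the aliasing rule of `ToyTarget` is invented for the example; only the hook name `addrspacesMayAlias` comes from this commit.

```cpp
// Hypothetical stand-in for the target hook. The conservative default is that
// any two address spaces may alias.
struct TargetInfo {
  virtual ~TargetInfo() = default;
  virtual bool addrspacesMayAlias(unsigned as0, unsigned as1) const {
    return true;
  }
};

// Invented toy target: address space 0 is generic and may alias anything;
// distinct non-zero address spaces are disjoint.
struct ToyTarget : TargetInfo {
  bool addrspacesMayAlias(unsigned as0, unsigned as1) const override {
    return as0 == as1 || as0 == 0 || as1 == 0;
  }
};

enum class CopyLowering {
  ForwardOnly,  // memcpy-style loop; valid only if the ranges cannot overlap
  OverlapSafe   // picks the copy direction at run time
};

// Decide how to expand a memmove purely from the two address spaces.
CopyLowering lowerMemMove(const TargetInfo &target, unsigned dstAS,
                          unsigned srcAS) {
  return target.addrspacesMayAlias(srcAS, dstAS) ? CopyLowering::OverlapSafe
                                                 : CopyLowering::ForwardOnly;
}
```

Because the query always returns an answer, the lowering needs no alias-analysis pass to be available; a target that says nothing simply gets the overlap-safe expansion.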

I also think it would be better if BasicAA checked this, rather than
relying on the target AA passes. Currently we go through the more
expensive AA analyses before getting to the trivial address space
checks.

17 months agoDAG: Fix typo in GET_FPENV legality check
Matt Arsenault [Mon, 12 Jun 2023 12:15:42 +0000 (08:15 -0400)]
DAG: Fix typo in GET_FPENV legality check

This made GET_FPENV unusable since the DAG builder would always emit
the mem version.

17 months ago[RISCV] Minor style changes to performCombineVMergeAndVOps [nfc]
Philip Reames [Wed, 14 Jun 2023 00:07:17 +0000 (17:07 -0700)]
[RISCV] Minor style changes to performCombineVMergeAndVOps [nfc]

Make the code a bit easier to follow, so that merging an upcoming change is more straightforward.

17 months ago[mlir][Vector] Add basic scalable vectorization support to Linalg vectorizer
Diego Caballero [Sat, 10 Jun 2023 00:36:33 +0000 (00:36 +0000)]
[mlir][Vector] Add basic scalable vectorization support to Linalg vectorizer

For now, only elementwise operations are supported. Operations that perform any
kind of data permutation require changes in the representation of scalable
dimensions in VectorType.

Differential Revision: https://reviews.llvm.org/D152599

17 months ago[SLP][NFC] Precommit test that exposes a bug in ShuffleBuilder.
Vasileios Porpodas [Tue, 13 Jun 2023 19:39:23 +0000 (12:39 -0700)]
[SLP][NFC] Precommit test that exposes a bug in ShuffleBuilder.

ShuffleBuilder generates a zero mask here:
`[[TMP6:%.*]] = shufflevector <2 x float> [[TMP3]], <2 x float> poison, <4 x i32> zeroinitializer`
But the correct mask is `0,0,1,1`, or we should have reused `TMP4`.
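
To make the difference between the two masks concrete, here is a small emulation of a `shufflevector` that widens a 2-element vector to 4 elements. It is only a sketch: `shuffle` is a made-up helper and handles only mask indices that select from the first operand (0 or 1).

```cpp
#include <array>
#include <cstddef>

// Emulates `shufflevector <2 x float> src, <2 x float> poison, <4 x i32> mask`
// for masks whose indices are all 0 or 1 (i.e. never select a poison lane).
std::array<float, 4> shuffle(const std::array<float, 2> &src,
                             const std::array<int, 4> &mask) {
  std::array<float, 4> out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = src.at(static_cast<std::size_t>(mask[i]));
  return out;
}
```

For a source `{a, b}`, the zero mask yields `{a, a, a, a}`, whereas the mask `0,0,1,1` yields `{a, a, b, b}`, so the zero mask silently drops the second element.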

Differential Revision: https://reviews.llvm.org/D152868

17 months ago[Attributor][NFC] Make the MustBeExecutedContextExplorer optional
Johannes Doerfert [Tue, 13 Jun 2023 23:39:49 +0000 (16:39 -0700)]
[Attributor][NFC] Make the MustBeExecutedContextExplorer optional

For a lightweight pass we do not want to instantiate or use the
MustBeExecutedContextExplorer. This simply allows such a configuration.
While at it, the explorer is now allocated with the bump allocator.