platform/upstream/llvm.git
23 months ago[mlir][sparse] cleanup small vector constant hints
Aart Bik [Tue, 15 Nov 2022 22:15:35 +0000 (14:15 -0800)]
[mlir][sparse] cleanup small vector constant hints

Following advice from

https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h

This revision removes the size hints from SmallVector (unless we are
certain of the resulting number of elements). Also, this replaces
SmallVector references with SmallVectorImpl references.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D138063

23 months ago[libc] cleanup changes to gettimeofday.
Raman Tenneti [Tue, 15 Nov 2022 22:14:20 +0000 (14:14 -0800)]
[libc] cleanup changes to gettimeofday.

+ Deleted duplicate definitions of StructTimeVal and StructTimeValPtr.
+ Called syscall clock_gettime to get timespec data.
+ Added tests to test for sleeping 200 and 1000 microseconds.
+ Fixed comments from https://reviews.llvm.org/D137881
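
As a rough illustration of the clock_gettime item above, the sketch below
derives microsecond-resolution timeval data from nanosecond-resolution
timespec data. The helper names timespec_to_timeval and sketch_gettimeofday
are hypothetical, and the sketch calls the POSIX clock_gettime wrapper rather
than issuing the raw syscall as the libc implementation does.

```cpp
#include <sys/time.h> // timeval, suseconds_t (POSIX)
#include <time.h>     // timespec, clock_gettime, CLOCK_REALTIME (POSIX)

// Hypothetical helper: drop sub-microsecond precision from a timespec.
inline timeval timespec_to_timeval(const timespec &ts) {
  timeval tv;
  tv.tv_sec = ts.tv_sec;
  tv.tv_usec = static_cast<suseconds_t>(ts.tv_nsec / 1000);
  return tv;
}

// Hypothetical gettimeofday-like wrapper built on clock_gettime.
// Returns 0 on success and -1 if the clock could not be read.
inline int sketch_gettimeofday(timeval *tv) {
  if (tv == nullptr)
    return 0; // nothing to fill in
  timespec ts;
  if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
    return -1;
  *tv = timespec_to_timeval(ts);
  return 0;
}
```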

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D138064

23 months ago[BOLT-TESTS] Follow-up to D131919
Amir Ayupov [Tue, 15 Nov 2022 22:44:32 +0000 (14:44 -0800)]
[BOLT-TESTS] Follow-up to D131919

googletest was moved to third-party. Update path in BOLT's CMakeCache.

Reviewed By: #bolt, maksfb

Differential Revision: https://reviews.llvm.org/D138066

23 months ago[LSR] Check if terminating value is safe to expand before transformation
eopXD [Fri, 28 Oct 2022 09:07:17 +0000 (02:07 -0700)]
[LSR] Check if terminating value is safe to expand before transformation

According to a report by @JojoR, an assertion error was hit, hence we need
to perform this check before the actual transformation.

Reviewed By: Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D136415

23 months ago[gn build] Port a16bd4f9f25e
LLVM GN Syncbot [Tue, 15 Nov 2022 22:46:53 +0000 (22:46 +0000)]
[gn build] Port a16bd4f9f25e

23 months ago[gn build] Port 4be39288f506
LLVM GN Syncbot [Tue, 15 Nov 2022 22:46:52 +0000 (22:46 +0000)]
[gn build] Port 4be39288f506

23 months ago[mlir][sparse] Fix rewriting for convert op and concatenate op.
bixia1 [Tue, 15 Nov 2022 21:24:15 +0000 (13:24 -0800)]
[mlir][sparse] Fix rewriting for convert op and concatenate op.

Fix a problem in convert op rewriting where it used the original index for
ToIndicesOp.

Extend the concatenate op rewriting to handle dense destination and dynamic
shape destination.

Make the concatenate op integration test run on the codegen path.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D138057

23 months agoNFC test if rosetta is installed before running x86 binary on AS
Jason Molenda [Tue, 15 Nov 2022 22:43:20 +0000 (14:43 -0800)]
NFC test if rosetta is installed before running x86 binary on AS

Rosetta 2 is not installed by default in a fresh macOS installation
on Apple Silicon, so x86 binaries cannot be run.  CI bots are often
in this state.  Update this test to check for the rosetta debugserver,
which our debugserver also hardcodes the path of, before trying to
run an x86 process on AS systems.

23 months ago[ObjC] Fix an assertion failure in EvaluateLValue
Akira Hatanaka [Tue, 15 Nov 2022 21:55:12 +0000 (13:55 -0800)]
[ObjC] Fix an assertion failure in EvaluateLValue

Look through parentheses when determining whether the expression is a
@selector expression.

23 months ago[Clang][Sema] Refactor category declaration under CheckForIncompatibleAttributes...
eopXD [Mon, 7 Nov 2022 17:47:09 +0000 (09:47 -0800)]
[Clang][Sema] Refactor category declaration under CheckForIncompatibleAttributes. NFC

This change makes anyone who adds a new category aware that more code
needs to be added here.

This patch also updates the comments, which were originally missing the
vector predicate.

Reviewed By: mikerice

Differential Revision: https://reviews.llvm.org/D137570

23 months ago[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants...
Craig Topper [Tue, 15 Nov 2022 22:36:01 +0000 (14:36 -0800)]
[TargetLowering][RISCV][ARM][AArch64][Mips] Reduce the number of AND mask constants used by BSWAP expansion.

We can reuse constants if we use SRL followed by AND and AND followed by SHL.
A similar change was previously made to bitreverse.
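
The idea can be sketched on a 32-bit value in plain C++ (this is not the
SelectionDAG code itself, and the function names are hypothetical). Masking
before the left shift and after the right shift lets both middle bytes share
the single constant 0xFF00.

```cpp
#include <cstdint>

// Straightforward expansion: SHL then AND and SRL then AND need two
// distinct mask constants (0x00FF0000 and 0x0000FF00).
constexpr uint32_t bswap32_two_masks(uint32_t x) {
  return (x << 24) | ((x << 8) & 0x00FF0000u) |
         ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

// Reordered expansion: AND then SHL and SRL then AND share one constant.
constexpr uint32_t bswap32_one_mask(uint32_t x) {
  constexpr uint32_t Mask = 0x0000FF00u;
  return (x << 24) | ((x & Mask) << 8) |
         ((x >> 8) & Mask) | (x >> 24);
}

static_assert(bswap32_one_mask(0x11223344u) == 0x44332211u,
              "bytes are reversed");
```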

Differential Revision: https://reviews.llvm.org/D138045

23 months ago[AggressiveInstCombine] Remove legacy PM pass
Arthur Eubanks [Mon, 31 Oct 2022 21:50:38 +0000 (14:50 -0700)]
[AggressiveInstCombine] Remove legacy PM pass

As part of legacy PM optimization pipeline removal.

This shouldn't be used in codegen pipelines so it should be ok to remove.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D137116

23 months agoNFC test if rosetta debugserver exists before testing rosetta
Jason Molenda [Tue, 15 Nov 2022 22:33:50 +0000 (14:33 -0800)]
NFC test if rosetta debugserver exists before testing rosetta

A fresh install of macOS does not have Rosetta 2 installed by
default; the CI bots are often in this state, resulting in a
test failure.  debugserver already hardcodes the filepath of
the Rosetta 2 debugserver; test if that file exists before
running the Rosetta test.

23 months ago[opt] Print deprecation warning for use of legacy syntax with new pass manager
Arthur Eubanks [Sun, 23 Oct 2022 20:42:13 +0000 (13:42 -0700)]
[opt] Print deprecation warning for use of legacy syntax with new pass manager

Also print a possible opt invocation plus a link to more extensive documentation.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D136617

23 months ago[lldb/test] Fix app_specific_backtrace_crashlog.test (NFC)
Med Ismail Bennani [Tue, 15 Nov 2022 22:23:26 +0000 (14:23 -0800)]
[lldb/test] Fix app_specific_backtrace_crashlog.test (NFC)

This patch changes app_specific_backtrace_crashlog.test's crashlog file
extension from `ips` to `txt`. This should prevent the test from opening
Console.app when being run.

This should also fix a test failure caused by missing symbols.

Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>

23 months ago[RISCV] Move GlobalISEL specific files to sub-directory [nfc]
Philip Reames [Tue, 15 Nov 2022 21:50:39 +0000 (13:50 -0800)]
[RISCV] Move GlobalISEL specific files to sub-directory [nfc]

23 months ago[clang][deps] Remove checks that were just for exhaustiveness
Ben Langmuir [Tue, 15 Nov 2022 22:18:59 +0000 (14:18 -0800)]
[clang][deps] Remove checks that were just for exhaustiveness

Instead of checking all the paths, just ensure the one we care about is
correct. On a particular platform one of the paths seems to have been
more canonical than we were expecting, which is fine.

23 months ago[OpenMP] [OMPT] [2/8] Implemented a connector for communication of OMPT callbacks...
Dhruva Chakrabarti [Tue, 15 Nov 2022 20:27:04 +0000 (12:27 -0800)]
[OpenMP] [OMPT] [2/8] Implemented a connector for communication of OMPT callbacks between libraries.

This is part of a set of patches implementing OMPT target callback support and has been split out of the originally submitted https://reviews.llvm.org/D113728. The overall design can be found in https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc

The purpose of this patch is to provide a way to register tool-provided callbacks into libomp when libomptarget is loaded.

Introduced a cmake variable LIBOMPTARGET_OMPT_SUPPORT that can be used to control OMPT target support. It follows host OMPT support, controlled by LIBOMP_HAVE_OMPT_SUPPORT.

Added a connector that can be used to communicate between OMPT implementations in libomp and libomptarget or libomptarget and a plugin.

Added a global constructor in libomptarget that uses the connector to force registration of tool-provided callbacks in libomp. A pair of init and fini functions are provided to libomp as part of the connect process which will be used to register the tool-provided callbacks in libomptarget.

Patch from John Mellor-Crummey <johnmc@rice.edu>
(With contributions from Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>)

Reviewed By: dreachem, jhuber6

Differential Revision: https://reviews.llvm.org/D123572

23 months ago[OPENMP]Initial support for at clause
Jennifer Yu [Fri, 11 Nov 2022 02:12:35 +0000 (18:12 -0800)]
[OPENMP]Initial support for at clause

The error directive is allowed in both declared and executable contexts.
The function ActOnOpenMPAtClause is called in both places during
parsing.

Add a param "bool InExContext" to identify the context, which is used to
emit the error message.

Differential Revision: https://reviews.llvm.org/D137851

23 months ago[clang][deps] Avoid leaking modulemap paths across unrelated imports
Ben Langmuir [Mon, 14 Nov 2022 22:51:54 +0000 (14:51 -0800)]
[clang][deps] Avoid leaking modulemap paths across unrelated imports

Use a FileEntryRef when retrieving modulemap paths in the scanner so
that we use a path compatible with the original module import, rather
than a FileEntry which can allow unrelated modules to leak paths into
how we build a module due to FileManager mutating the path.

Note: the current change prevents an "unrelated" path, but does not
change how VFS mapped paths are handled (which would be calling
getNameAsRequested) nor canonicalize the path.

Differential Revision: https://reviews.llvm.org/D137989

23 months ago[SLP]Fix a crash on analysis of the vectorized node.
Alexey Bataev [Tue, 15 Nov 2022 20:52:03 +0000 (12:52 -0800)]
[SLP]Fix a crash on analysis of the vectorized node.

We need a more thorough check for the same vectorized node to avoid a
possible compiler crash. After graph node rotation we may have 2 similar
nodes (a vector one and a gather one), so extra checks are needed for an
exact match.

23 months ago[libc] re-enable assert
Michael Jones [Tue, 15 Nov 2022 20:05:37 +0000 (12:05 -0800)]
[libc] re-enable assert

The assert functions were disabled while the signal functions were being
fixed. This patch re-enables them.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D138056

23 months agoApply clang-tidy fixes for llvm-qualified-auto in TestTilingInterface.cpp (NFC)
Mehdi Amini [Mon, 14 Nov 2022 07:28:19 +0000 (07:28 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in TestTilingInterface.cpp (NFC)

23 months agoApply clang-tidy fixes for readability-identifier-naming in SparseTensorCodegen.cpp...
Mehdi Amini [Mon, 14 Nov 2022 06:38:25 +0000 (06:38 +0000)]
Apply clang-tidy fixes for readability-identifier-naming in SparseTensorCodegen.cpp (NFC)

23 months agoApply clang-tidy fixes for llvm-else-after-return in SparseTensorCodegen.cpp (NFC)
Mehdi Amini [Mon, 14 Nov 2022 06:37:42 +0000 (06:37 +0000)]
Apply clang-tidy fixes for llvm-else-after-return in SparseTensorCodegen.cpp (NFC)

23 months ago[libc++] Introduce helper functions __make_iter in vector and string
Louis Dionne [Mon, 14 Nov 2022 21:01:05 +0000 (11:01 -1000)]
[libc++] Introduce helper functions __make_iter in vector and string

This prepares the terrain for introducing a new type of bounded iterator
that can't be constructed like __wrap_iter. This reverts part of the
changes made to std::vector in 4eab04f84.

Differential Revision: https://reviews.llvm.org/D138036

23 months ago[mlir][sparse] avoid single default parameters in pass constructors
Aart Bik [Tue, 15 Nov 2022 19:47:53 +0000 (11:47 -0800)]
[mlir][sparse] avoid single default parameters in pass constructors

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D138054

23 months ago[mlir][SerializeToHsaco] Minimize dependencies of AMDGPU compilation
Krzysztof Drewniak [Mon, 14 Nov 2022 21:53:10 +0000 (21:53 +0000)]
[mlir][SerializeToHsaco] Minimize dependencies of AMDGPU compilation

The SerializeToHsaco uses functions from ExecutionEngineUtils to set
up LLVM pass pipelines, but does not otherwise depend on the execution
engine (except indirectly via a dependency on IPO). This commit
removes the dependency on the execution engine to prevent
unnecessary compilation.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D138041

23 months ago[libc] Include stddef.h after D137871
Fangrui Song [Tue, 15 Nov 2022 20:22:33 +0000 (20:22 +0000)]
[libc] Include stddef.h after D137871

The C standard does not require stdint.h to define size_t.
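
A minimal sketch of the portable pattern, with a hypothetical helper name:
code that uses both size_t and fixed-width integer types includes both headers
instead of relying on stdint.h to pull in size_t.

```cpp
#include <stddef.h> // guaranteed to define size_t
#include <stdint.h> // fixed-width integer types; need not define size_t

// Hypothetical helper that needs a type from each header.
inline size_t count_nonzero(const uint8_t *data, size_t len) {
  size_t n = 0;
  for (size_t i = 0; i < len; ++i)
    if (data[i] != 0)
      ++n;
  return n;
}
```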

23 months ago[mlir] Fix a warning
Kazu Hirata [Tue, 15 Nov 2022 20:21:20 +0000 (12:21 -0800)]
[mlir] Fix a warning

This patch fixes:

  mlir/lib/ExecutionEngine/SparseTensorRuntime.cpp:195:30: warning:
  cast from type ‘const long unsigned int*’ to type ‘void*’ casts away
  qualifiers [-Wcast-qual]

23 months ago[mlir] Fix warnings
Kazu Hirata [Tue, 15 Nov 2022 20:16:03 +0000 (12:16 -0800)]
[mlir] Fix warnings

This patch fixes:

  mlir/lib/ExecutionEngine/SparseTensorRuntime.cpp:296:31: error:
  comparison of integers of different signs: 'int64_t' (aka 'long')
  and 'const uint64_t' (aka 'const unsigned long')
  [-Werror,-Wsign-compare]

  mlir/lib/ExecutionEngine/SparseTensorRuntime.cpp:297:67: error:
  comparison of integers of different signs: 'int64_t' (aka 'long')
  and 'const uint64_t' (aka 'const unsigned long')
  [-Werror,-Wsign-compare]

  mlir/lib/ExecutionEngine/SparseTensorRuntime.cpp:298:31: error:
  comparison of integers of different signs: 'int64_t' (aka 'long')
  and 'const uint64_t' (aka 'const unsigned long')
  [-Werror,-Wsign-compare]

  mlir/lib/ExecutionEngine/SparseTensorRuntime.cpp:479:30: error:
  comparison of integers of different signs: 'int64_t' (aka 'long')
  and 'const uint64_t' (aka 'const unsigned long')
  [-Werror,-Wsign-compare]
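
For readers unfamiliar with this diagnostic, one common way to silence it
without changing behavior for valid inputs is sketched below. The function
in_bounds is hypothetical and is not taken from SparseTensorRuntime.cpp; the
actual fix in this patch may differ.

```cpp
#include <cstdint>

// Hypothetical bounds check mixing a signed index with an unsigned size.
// Writing `idx < size` directly would trigger -Wsign-compare, because idx
// is implicitly converted to unsigned and a negative index would become a
// very large value.
inline bool in_bounds(int64_t idx, uint64_t size) {
  // Reject negative indices first; after that the cast preserves the value.
  return idx >= 0 && static_cast<uint64_t>(idx) < size;
}
```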

23 months ago[mlir] Remove `Transforms/SideEffectUtils.h` and move the methods into `Interface...
Mahesh Ravishankar [Fri, 11 Nov 2022 17:10:38 +0000 (17:10 +0000)]
[mlir] Remove `Transforms/SideEffectUtils.h` and move the methods into `Interface/SideEffectInterfaces.h`.

The methods in `SideEffectUtils.h` (and their implementations in
`SideEffectUtils.cpp`) seem to have similar intent to methods already
existing in `SideEffectInterfaces.h`. Move the declaration (and
implementation) from `SideEffectUtils.h` (and `SideEffectUtils.cpp`)
into `SideEffectInterfaces.h` (and `SideEffectInterface.cpp`).

Also drop the `SideEffectInterface::hasNoEffect` method in favor of
`mlir::isMemoryEffectFree` which actually recurses into the operation
instead of just relying on the `hasRecursiveMemoryEffectTrait`
exclusively.

Differential Revision: https://reviews.llvm.org/D137857

23 months ago[Clang] Extend the number of case Sema::CheckForIntOverflow covers
Shafik Yaghmour [Tue, 15 Nov 2022 19:06:59 +0000 (11:06 -0800)]
[Clang] Extend the number of case Sema::CheckForIntOverflow covers

Currently Sema::CheckForIntOverflow misses several cases that other compilers
diagnose for overflow in integral constant expressions. This includes the
arguments of a CXXConstructExpr as well as the expressions used in an
ArraySubscriptExpr, CXXNewExpr and CompoundLiteralExpr.

This fixes https://github.com/llvm/llvm-project/issues/58944

Differential Revision: https://reviews.llvm.org/D137897

23 months ago[mlir] Fix warnings
Kazu Hirata [Tue, 15 Nov 2022 20:01:00 +0000 (12:01 -0800)]
[mlir] Fix warnings

This patch fixes:

  mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h:955:20:
  error: unused variable 'sz' [-Werror,-Wunused-variable]

  mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp:1460:2:
  error: extra ';' outside of a function is incompatible with C++98
  [-Werror,-Wc++98-compat-extra-semi]

23 months ago[mlir][sparse] Only insert non-zero values to the result of the concatenate operation.
bixia1 [Tue, 15 Nov 2022 18:52:57 +0000 (10:52 -0800)]
[mlir][sparse] Only insert non-zero values to the result of the concatenate operation.

Modify the integration test to check number_of_entries and use it to limit the
output of sparse tensor values.

Reviewed By: aartbik, Peiming

Differential Revision: https://reviews.llvm.org/D138046

23 months ago[mlir][sparse] fix bugs in concatenate rewriter.
Peiming Liu [Tue, 15 Nov 2022 19:45:22 +0000 (19:45 +0000)]
[mlir][sparse] fix bugs in concatenate rewriter.

Reviewed By: aartbik, bixia

Differential Revision: https://reviews.llvm.org/D138053

23 months agoAdd FP8 E4M3 support to APFloat.
Reed [Tue, 15 Nov 2022 19:11:50 +0000 (20:11 +0100)]
Add FP8 E4M3 support to APFloat.

NVIDIA, ARM, and Intel recently introduced two new FP8 formats, as described in the paper: https://arxiv.org/abs/2209.05433. The first of the two FP8 dtypes, E5M2, was added in https://reviews.llvm.org/D133823. This change adds the second of the two: E4M3.

There is an RFC for adding the FP8 dtypes here: https://discourse.llvm.org/t/rfc-add-apfloat-and-mlir-type-support-for-fp8-e5m2/65279. I spoke with the RFC's author, Stella, and she gave me the go ahead to implement the E4M3 type. The name of the E4M3 type in APFloat is Float8E4M3FN, as discussed in the RFC. The "FN" means only Finite and NaN values are supported.

Unlike E5M2, E4M3 has different behavior from IEEE types in regards to Inf and NaN values. There are no Inf values, and NaN is represented when the exponent and mantissa bits are all 1s. To represent these differences in APFloat, I added an enum field, fltNonfiniteBehavior, to the fltSemantics struct. The possible enum values are IEEE754 and NanOnly. Only Float8E4M3FN has the NanOnly behavior.
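
To make the encoding concrete, here is a standalone decoding sketch that does not use APFloat. The function name decode_e4m3fn is hypothetical, and the exponent bias of 7 is taken from the FP8 paper linked above rather than from this commit message. Note that the all-ones exponent still encodes finite values unless the mantissa is also all ones.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Decode one E4M3FN byte: 1 sign bit, 4 exponent bits, 3 mantissa bits.
// Assumes an exponent bias of 7 (per the FP8 paper), no Inf encodings,
// and NaN only when exponent and mantissa bits are all 1s.
inline float decode_e4m3fn(uint8_t bits) {
  const int sign = bits >> 7;
  const int exponent = (bits >> 3) & 0xF;
  const int mantissa = bits & 0x7;

  if (exponent == 0xF && mantissa == 0x7)
    return std::numeric_limits<float>::quiet_NaN();

  float magnitude;
  if (exponent == 0) {
    // Subnormal: (mantissa / 8) * 2^(1 - 7) == mantissa * 2^-9.
    magnitude = std::ldexp(static_cast<float>(mantissa), -9);
  } else {
    // Normal: (1 + mantissa / 8) * 2^(exponent - 7)
    //      == (8 + mantissa) * 2^(exponent - 10).
    magnitude = std::ldexp(static_cast<float>(8 + mantissa), exponent - 10);
  }
  return sign ? -magnitude : magnitude;
}
```

Under these assumptions the largest finite magnitude is 448 (byte 0x7E), and byte 0x78, which would be Inf in an IEEE-style layout, decodes to 256.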

After this change is submitted, I plan on adding the Float8E4M3FN type to MLIR, in the same way as E5M2 was added in https://reviews.llvm.org/D133823.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D137760

23 months ago[asan][darwin] This test is x86_64 specific, not non-ios in general.
Roy Sundahl [Tue, 15 Nov 2022 19:23:46 +0000 (11:23 -0800)]
[asan][darwin] This test is x86_64 specific, not non-ios in general.

This test was marked unsupported on iOS, when the more accurate condition is that the test is specific to the x86_64 architecture. This "fix" is the first in a series of updates intended to get asan arm64 tests fully functional.

Reviewed By: thetruestblue, vitalybuka

Differential Revision: https://reviews.llvm.org/D138001

23 months agoRevert D137574 "PEI should be able to use backward walk in replaceFrameIndicesBackward."
Fangrui Song [Tue, 15 Nov 2022 19:19:46 +0000 (19:19 +0000)]
Revert D137574 "PEI should be able to use backward walk in replaceFrameIndicesBackward."

This reverts commit e05ce03cfa0b36e9b99149e21afcb1fc039df813.

Caused asan use-after-poison failures in 4 DebugInfo/AMDGPU/ tests,
triggered when PEI::replaceFrameIndicesBackward called llvm::MachineInstr::getNumOperands.

23 months ago[libc][math] Improve the performance and error printing of UInt.
Tue Ly [Fri, 11 Nov 2022 23:04:56 +0000 (18:04 -0500)]
[libc][math] Improve the performance and error printing of UInt.

Use add_with_carry builtin to improve the performance of
addition and multiplication of UInt class.  For 128-bit, it is as
fast as using __uint128_t.
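
The carry-chaining pattern can be sketched as follows. Here add_with_carry is
a hypothetical portable stand-in written in plain C++, not the compiler
builtin the patch uses, and U128 is an illustrative two-word type rather than
the libc UInt class.

```cpp
#include <array>
#include <cstdint>

// Result of adding two 64-bit words plus an incoming carry.
struct SumCarry {
  uint64_t sum;
  uint64_t carry; // 0 or 1
};

// Hypothetical portable stand-in for an add-with-carry builtin.
// carry_in is expected to be 0 or 1.
inline SumCarry add_with_carry(uint64_t a, uint64_t b, uint64_t carry_in) {
  const uint64_t partial = a + b;               // may wrap around
  const uint64_t c1 = partial < a ? 1 : 0;      // wrapped in a + b
  const uint64_t sum = partial + carry_in;      // may wrap around
  const uint64_t c2 = sum < partial ? 1 : 0;    // wrapped adding the carry
  return {sum, c1 | c2};                        // at most one can be set
}

// Illustrative 128-bit value stored as little-endian 64-bit words.
using U128 = std::array<uint64_t, 2>;

// Add word by word, feeding each carry into the next word.
// The carry out of the top word is discarded (wrap-around semantics).
inline U128 add128(const U128 &a, const U128 &b) {
  const SumCarry lo = add_with_carry(a[0], b[0], 0);
  const SumCarry hi = add_with_carry(a[1], b[1], lo.carry);
  return U128{{lo.sum, hi.sum}};
}
```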

Microbenchmark for addition:
https://quick-bench.com/q/-5a6xM4T8rIXBhqMTtLE-DD2h8w

Microbenchmark for multiplication:
https://quick-bench.com/q/P2muLAzJ_W-VqWCuxEJ0CU0bLDg

Microbenchmark for shift right:
https://quick-bench.com/q/N-jkKXaVsGQ4AAv3k8VpsVkua5Y

Microbenchmark for shift left:
https://quick-bench.com/q/5-RzwF8UdslC-zuhNajXtXdzLRM

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D137871

23 months ago[AVR] Fix use-of-uninitialized-value after D137520
Fangrui Song [Tue, 15 Nov 2022 19:06:18 +0000 (11:06 -0800)]
[AVR] Fix use-of-uninitialized-value after D137520

23 months ago[NVPTX] Fix alignment for arguments of function pointer calls
Andrew Savonichev [Tue, 15 Nov 2022 18:43:06 +0000 (21:43 +0300)]
[NVPTX] Fix alignment for arguments of function pointer calls

Alignment of function arguments can be increased only if we can do
this for all call sites. Therefore we do not increase it for external
functions, and now we skip functions that have address taken, to avoid
any issues with function pointers.

Differential Revision: https://reviews.llvm.org/D135708

23 months ago[NVPTX] Fix pointer argument declaration for --nvptx-short-ptr
Andrew Savonichev [Tue, 15 Nov 2022 18:41:33 +0000 (21:41 +0300)]
[NVPTX] Fix pointer argument declaration for --nvptx-short-ptr

When --nvptx-short-ptr is set, local pointers are stored as 32-bit on
nvptx64 target.

Before this patch, arguments for a function declaration were always
emitted as b64 regardless of their address space, but they were set as
b32 for the corresponding call instruction:

   .extern .func test
   (
    .param .b64 test_param_0
   )
   [...]
    .param .b32 param0;
    st.param.b32 [param0+0], %r1;
    call.uni test, (param0);

This is not supported:

  ptxas: Type of argument does not match formal parameter
  'test_param_0'

Now short pointers in a function declaration are emitted as b32 if
--nvptx-short-ptr is set.

Differential Revision: https://reviews.llvm.org/D135674

23 months ago[NVPTX] Fix pointer type for short 32-bit pointers
Andrew Savonichev [Tue, 15 Nov 2022 18:39:34 +0000 (21:39 +0300)]
[NVPTX] Fix pointer type for short 32-bit pointers

Global variables used to be printed as u64/b64 even when
-nvptx-short-ptr is set.

Differential Revision: https://reviews.llvm.org/D127668

23 months ago[NFC][X86][Costmodel] Drop reduntant interleaved cost test coverage
Roman Lebedev [Tue, 15 Nov 2022 18:27:53 +0000 (21:27 +0300)]
[NFC][X86][Costmodel] Drop reduntant interleaved cost test coverage

These are already covered by the more general tests I've added.

23 months agoApply clang-tidy fixes for modernize-loop-convert in CodegenUtils.cpp (NFC)
Mehdi Amini [Mon, 14 Nov 2022 06:36:11 +0000 (06:36 +0000)]
Apply clang-tidy fixes for modernize-loop-convert in CodegenUtils.cpp (NFC)

23 months agoApply clang-tidy fixes for llvm-qualified-auto in TileUsingInterface.cpp (NFC)
Mehdi Amini [Mon, 14 Nov 2022 06:33:14 +0000 (06:33 +0000)]
Apply clang-tidy fixes for llvm-qualified-auto in TileUsingInterface.cpp (NFC)

23 months ago[libc] Fix tablegen when using a runtimes build
Joseph Huber [Tue, 15 Nov 2022 16:25:27 +0000 (10:25 -0600)]
[libc] Fix tablegen when using a runtimes build

When using `LLVM_ENABLE_RUNTIMES=libc` we need to perform a few extra
steps to include LLVM utilities similar to if we were performing a
standalone build. Libc depends on the tablegen utilities and the LLVM
libraries when performing a full build. When using an
`LLVM_ENABLE_PROJECTS=libc` build these are included as a part of the
greater LLVM build, but here we need to perform it manually. This patch
should allow using `LLVM_LIBC_FULL_BUILD=ON` when building with
runtimes.

Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D138040

23 months ago[X86] Rewrite `getScalarizationOverhead()`
Roman Lebedev [Tue, 15 Nov 2022 18:06:39 +0000 (21:06 +0300)]
[X86] Rewrite `getScalarizationOverhead()`

All of our insert/extract ops work on 128-bit lanes.

For `Insert`, we need to extract affected 128-bit lane,
unless it's being fully overwritten (FIXME: do we need to be
careful about legalization-induced padding that we obviously don't demand?),
perform insertions, and then insert the 128-bit lane back.

But hold on. If we are operating on a 256-bit legal vector,
and thus have two 128-bit subvectors, and are fully overwriting them both,
we don't actually need to insert *both* subvectors,
only the second one, into the implicitly-widened first one.

Also, `Insert` wasn't actually querying the costs,
but just assuming them to be `1`.

`getShuffleCost(TTI::SK_ExtractSubvector)` notes:
```
  // Note that in general, the insertion starting at the beginning of a vector
  // isn't free, because we need to preserve the rest of the wide vector.
```
... so as far as I can tell, we didn't account for that.

I was hoping this would allow vectorization at a higher VF in one case I looked at,
but the subvector insertion cost still advises against that.

The change for `Extract` is NFC, and is for consistency only;
I wanted to get rid of that weird explicit discounting of insertion of 0'th element,
since the general code should already deal with that.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D137913

23 months ago[Hexagon] Fix even/odd word shuffling
Krzysztof Parzyszek [Tue, 15 Nov 2022 17:52:24 +0000 (09:52 -0800)]
[Hexagon] Fix even/odd word shuffling

Used the wrong shuffle instruction... -_-

23 months ago[IR] Fix -Wreturn-type after D135714 "[MemProf] ThinLTO summary support"
Fangrui Song [Tue, 15 Nov 2022 17:55:28 +0000 (09:55 -0800)]
[IR] Fix -Wreturn-type after D135714 "[MemProf] ThinLTO summary support"

23 months ago[mlir][SCF] Adding custom builder to SCF::WhileOp.
Mohammed Anany [Tue, 15 Nov 2022 17:10:17 +0000 (18:10 +0100)]
[mlir][SCF] Adding custom builder to SCF::WhileOp.

This is a similar builder to the one for SCF::IfOp which allows users to pass region builders to it. Refer to the builders for IfOp.

Reviewed By: tpopp

Differential Revision: https://reviews.llvm.org/D137709

23 months ago[mlir][transform] Decouple GPUDeviceMapping attribute from the GPU transfrom dialect...
Guray Ozen [Tue, 15 Nov 2022 11:02:10 +0000 (12:02 +0100)]
[mlir][transform] Decouple GPUDeviceMapping attribute from the GPU transfrom dialect code generator

`DeviceMappingAttrInterface` is implemented as a unifying mechanism for thread mapping. A code generator could use any attribute that implements this interface to lower `scf.foreach_thread` to device specific code. It is allowed to choose its own mapping and interpretation.

Currently, the GPU transform dialect supports only `GPUThreadMapping` and `GPUBlockMapping`; however, other mappings should be supported as well. This change addresses this issue. It decouples the GPU transform dialect from `GPUThreadMapping` and `GPUBlockMapping`, so it can now work with any other mapping.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D138020

23 months ago[LegacyPM] remove unset variables in PassManagerBuilder
Mikhail Goncharov [Tue, 15 Nov 2022 14:16:31 +0000 (15:16 +0100)]
[LegacyPM] remove unset variables in PassManagerBuilder

D137915 stopped setting these variables, but NewGVN was still used and caused an asan failure.

Differential Revision: https://reviews.llvm.org/D138034

23 months agoRestore "[MemProf] ThinLTO summary support" with fixes
Teresa Johnson [Tue, 15 Nov 2022 15:59:37 +0000 (07:59 -0800)]
Restore "[MemProf] ThinLTO summary support" with fixes

This restores 47459455009db4790ffc3765a2ec0f8b4934c2a4, which was
reverted in commit 452a14efc84edf808d1e2953dad2c694972b312f, along with
fixes for a couple of bot failures.

23 months agoRevert "[libc++] Only include_next C library headers when they exist"
Nico Weber [Tue, 15 Nov 2022 16:35:00 +0000 (11:35 -0500)]
Revert "[libc++] Only include_next C library headers when they exist"

This reverts commit 226409c62879bf5ff9928cd23a4255cd7c614fe0.
Breaks check-clang on mac, see comments on https://reviews.llvm.org/D136683

23 months ago[libc++] Start classifying debug mode features with more granularity
Louis Dionne [Mon, 14 Nov 2022 20:33:03 +0000 (10:33 -1000)]
[libc++] Start classifying debug mode features with more granularity

I am starting to granularize debug-mode checks so they can be controlled
more individually. The goal is for vendors to eventually be able to select
which categories of checks they want embedded in their configuration of
the library with more granularity.

Note that this patch is a bit weird on its own because it does not touch
any of the containers that implement iterator bounds checking through the
__dereferenceable check of the legacy debug mode. However, I added TODOs
to string and vector to change that.

Differential Revision: https://reviews.llvm.org/D138033

23 months ago[pgo] Avoid introducing relocations by using private alias
Paul Kirth [Mon, 14 Nov 2022 18:12:56 +0000 (18:12 +0000)]
[pgo] Avoid introducing relocations by using private alias

Instead of using the public, interposable symbol, we can use a private
alias and avoid relocations and addends.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D137982

23 months agoRevert "[RISCV][llvm-mca] Use LMUL Instruments to provide more accurate reports on...
Michael Maitland [Tue, 15 Nov 2022 16:03:41 +0000 (08:03 -0800)]
Revert "[RISCV][llvm-mca] Use LMUL Instruments to provide more accurate reports on RISCV"

This reverts commit 5e82ee5373211db8522181054800ccd49461d9d8.

23 months ago[SDAG] avoid udiv/urem transform for vector/scalar type mismatches
Sanjay Patel [Tue, 15 Nov 2022 15:43:51 +0000 (10:43 -0500)]
[SDAG] avoid udiv/urem transform for vector/scalar type mismatches

This fixes the crash from issue #58994.
I don't know anything about VE, so I don't know if the output
is as expected or even correct.

23 months ago[SDAG] improve assert text for getSetCC type assumptions; NFC
Sanjay Patel [Tue, 15 Nov 2022 14:23:24 +0000 (09:23 -0500)]
[SDAG] improve assert text for getSetCC type assumptions; NFC

Having identical text for these 2 conditions made it harder
to find the root problem for issue #58994.

23 months ago[AArch64][SVE] Add instcombine to convert ptest.last/first to ptest.any
Bradley Smith [Fri, 11 Nov 2022 15:24:57 +0000 (15:24 +0000)]
[AArch64][SVE] Add instcombine to convert ptest.last/first to ptest.any

This allows for better optimization later in the backend.

This fixes the remaining missed optimizations in D137717.

Depends on D137930

Differential Revision: https://reviews.llvm.org/D137947

23 months ago[libc++] Only include_next C library headers when they exist
Louis Dionne [Tue, 25 Oct 2022 14:08:21 +0000 (10:08 -0400)]
[libc++] Only include_next C library headers when they exist

Some platforms don't provide all C library headers. In practice, libc++
only requires a few C library headers to exist, and only a few functions
on those headers. Missing functions that libc++ doesn't need for its own
implementation are handled properly by the using_if_exists attribute,
however a missing header is currently a hard error when we try to
do #include_next.
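
The general "probe before including" idea can be sketched with the standard
C++17 __has_include operator. This sketch deliberately swaps in
__has_include and a plain #include in place of #include_next, and the
SKETCH_HAVE_CTYPE_H macro and sketch_is_digit function are hypothetical, so it
only approximates what the libc++ wrapper headers do.

```cpp
// Include <ctype.h> only when the platform provides it, and record the
// outcome in a (hypothetical) feature macro.
#if defined(__has_include)
#  if __has_include(<ctype.h>)
#    include <ctype.h>
#    define SKETCH_HAVE_CTYPE_H 1
#  endif
#endif
#ifndef SKETCH_HAVE_CTYPE_H
#  define SKETCH_HAVE_CTYPE_H 0
#endif

// Code that needs the header's functionality falls back to a local
// implementation when the header is missing, instead of failing at the
// point of inclusion.
inline bool sketch_is_digit(char c) {
#if SKETCH_HAVE_CTYPE_H
  return isdigit(static_cast<unsigned char>(c)) != 0;
#else
  return c >= '0' && c <= '9';
#endif
}
```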

This patch should make libc++ more flexible on platforms that do not
provide C headers that libc++ doesn't actually require for its own
implementation. The only downside is that it may move some errors from
the #include_next point to later in the compilation if we actually try
to use something that isn't provided, which could be somewhat confusing.
However, these errors should be caught by folks trying to port libc++
over to a new platform (when running the libc++ test suite), not by end
users.

Differential Revision: https://reviews.llvm.org/D136683

23 months ago[gn build] Stop defining GTEST_LANG_CXX11, pass /Zc:__cplusplus with msvc
Nico Weber [Tue, 15 Nov 2022 15:55:50 +0000 (10:55 -0500)]
[gn build] Stop defining GTEST_LANG_CXX11, pass /Zc:__cplusplus with msvc

Ports:
* https://reviews.llvm.org/D84023
* https://reviews.llvm.org/rG4f5ccc72f6a6e
  (but see https://reviews.llvm.org/rG4901199f5b84b223)

No intended behavior change.

23 months ago[RISCV][llvm-mca] Use LMUL Instruments to provide more accurate reports on RISCV
Michael Maitland [Fri, 4 Nov 2022 15:51:39 +0000 (08:51 -0700)]
[RISCV][llvm-mca] Use LMUL Instruments to provide more accurate reports on RISCV

On x86 and AArch, SIMD instructions encode all of the scheduling information in the instruction
itself. For example, VADD.I16 q0, q1, q2 is a neon instruction that operates on 16-bit integer
elements stored in 128-bit Q registers, which leads to eight 16-bit lanes in parallel. This kind
of information impacts how long the instruction takes to execute and what dependencies this may cause.

On RISCV however, the data that impacts scheduling is encoded in CSR registers such as vtype or
vl, in addition to the instruction itself. But MCA does not track or use the data in these
registers. This patch fixes this problem by introducing Instruments into MCA.

* Replace `CodeRegions` with `AnalysisRegions`
* Add `Instrument` and `InstrumentManager`
* Add `InstrumentRegions`
* Add RISCV Instrument and `InstrumentManager`
* Parse `Instruments` in driver
* Use instruments to override schedule class
* RISCV use lmul instrument to override schedule class
* Fix unit tests to pass empty instruments
* Add -ignore-im clopt to disable this change

Differential Revision: https://reviews.llvm.org/D137440

23 months ago[AArch64][SVE] Add PTEST_ANY pseudo instruction
Bradley Smith [Fri, 11 Nov 2022 14:28:45 +0000 (14:28 +0000)]
[AArch64][SVE] Add PTEST_ANY pseudo instruction

This allows recognition of when a ptest was emitted as an any condition
and allows for extra optimization to be done later.

This addresses missing optimizations from D137716 and D137718, and
partially D137717.

Depends on D137716, D137717, D137718

Differential Revision: https://reviews.llvm.org/D137930

23 months ago[AST] Don't use WeakVH for unknown insts (NFCI)
Nikita Popov [Tue, 15 Nov 2022 15:44:35 +0000 (16:44 +0100)]
[AST] Don't use WeakVH for unknown insts (NFCI)

After D138014 we do not support using AST with IR that is being
mutated. As such, we also no longer need to track unknown
instructions using WeakVH. Replace with AssertingVH to make sure
that they are not invalidated.

23 months ago[gn build] Make libcxx_enable_debug_mode work better, maybe
Nico Weber [Tue, 15 Nov 2022 15:40:34 +0000 (10:40 -0500)]
[gn build] Make libcxx_enable_debug_mode work better, maybe

Refer to _LIBCPP_ENABLE_DEBUG_MODE instead of the old _LIBCPP_DEBUG
in a comment, and write that to __config_site correctly too.

See 13ea1343231fa4 and the comments in https://crbug.com/1358646.

Also change the default of libcxx_enable_debug_mode to false for now.
Since we used to not write _LIBCPP_ENABLE_DEBUG_MODE, the previous
default of true had no effect (except for compiling debug.cpp and
legacy_debug_handler.cpp, which we now no longer build by default).
So this (mostly) preserves previous behavior.

23 months agoRevert "[MemProf] ThinLTO summary support"
Teresa Johnson [Tue, 15 Nov 2022 15:39:40 +0000 (07:39 -0800)]
Revert "[MemProf] ThinLTO summary support"

This reverts commit 47459455009db4790ffc3765a2ec0f8b4934c2a4.

Revert while I try to fix a couple of non-Linux build failures.

23 months ago[AST] Restrict AliasSetTracker to immutable IR
Nikita Popov [Tue, 15 Nov 2022 10:00:17 +0000 (11:00 +0100)]
[AST] Restrict AliasSetTracker to immutable IR

This restricts usage of AliasSetTracker to IR that does not change.
We used to use it during LICM where the underlying IR could change,
but remaining uses all use AST as part of a separate analysis phase.

This is split out from D137955, which makes use of the new guarantee
to switch to BatchAA.

Differential Revision: https://reviews.llvm.org/D138014

23 months ago[Assignment Tracking][23/*] Account for assignment tracking in SLP Vectorizer
OCHyams [Tue, 15 Nov 2022 15:17:30 +0000 (15:17 +0000)]
[Assignment Tracking][23/*] Account for assignment tracking in SLP Vectorizer

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

The SLP-Vectorizer can merge a set of scalar stores into a single vectorized
store. Merge DIAssignID intrinsics from the scalar stores onto the new
vectorized store.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133320

23 months agoReapply [Hexagon] Use default attributes for intrinsics
Nikita Popov [Tue, 8 Nov 2022 10:48:03 +0000 (11:48 +0100)]
Reapply [Hexagon] Use default attributes for intrinsics

The issue that caused the revert has been fixed in:
44bd80751274a81c870882968ecd478b03af292a

-----

This switches Hexagon intrinsics to use the default attributes
(nosync, nofree, nocallback and willreturn). Especially willreturn
is needed to prevent optimization regressions in the future.

The only intrinsics I've excluded here are the load/store locked
intrinsics, which presumably aren't nosync.

Differential Revision: https://reviews.llvm.org/D137623

23 months ago[Assignment Tracking][22/*] Add loop-deletion test
OCHyams [Tue, 15 Nov 2022 14:40:38 +0000 (14:40 +0000)]
[Assignment Tracking][22/*] Add loop-deletion test

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

This test covers the NFC-for-normal-debug-info change D133303.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133319

23 months ago[NFC] Fix the typo and the format in the StandardCPlusPlusModules
Chuanqi Xu [Tue, 15 Nov 2022 14:52:21 +0000 (22:52 +0800)]
[NFC] Fix the typo and the format in the StandardCPlusPlusModules
document

23 months ago[Hexagon] Adjust handling of stack with variable-size and extra alignment
Krzysztof Parzyszek [Mon, 14 Nov 2022 16:23:03 +0000 (08:23 -0800)]
[Hexagon] Adjust handling of stack with variable-size and extra alignment

Make the stack alignment register (AP) reserved in the given function. This
will make it available everywhere in the function, and allow aligned access
to vector register spill slots.

23 months ago[libc++] Make it an error to define _LIBCPP_DEBUG
Louis Dionne [Mon, 14 Nov 2022 19:56:35 +0000 (09:56 -1000)]
[libc++] Make it an error to define _LIBCPP_DEBUG

We have been transitioning off of that macro since LLVM 15.

Differential Revision: https://reviews.llvm.org/D137975

23 months ago[MergeICmps][NFC] Fix a couple of typos in a comment
Fraser Cormack [Tue, 15 Nov 2022 14:46:23 +0000 (14:46 +0000)]
[MergeICmps][NFC] Fix a couple of typos in a comment

23 months ago[MemProf] ThinLTO summary support
Teresa Johnson [Tue, 11 Oct 2022 21:00:37 +0000 (14:00 -0700)]
[MemProf] ThinLTO summary support

Implements the ThinLTO summary support for memprof related metadata.

This includes support for the assembly format, and for building the
summary from IR during ModuleSummaryAnalysis.

To reduce space in both the bitcode format and the in memory index,
we do 2 things:
1. We keep a single vector of all unique stack id hashes, and record the
   index into this vector in the callsite and allocation memprof
   summaries.
2. When building the combined index during the LTO link, the callsite
   and allocation memprof summaries are only kept on the FunctionSummary
   of the prevailing copy.
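
The first point can be illustrated with a minimal sketch. This is a toy
model only, not the ModuleSummaryIndex API: `StackIdTable`, `intern`,
`lookup` and `encodeStack` are hypothetical names chosen for illustration,
and the 32-bit index width is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical toy model: every distinct stack id hash is stored once.
class StackIdTable {
public:
  // Returns the index of Hash in the shared vector, adding it on first use.
  std::uint32_t intern(std::uint64_t Hash) {
    auto [It, Inserted] =
        IndexOf.try_emplace(Hash, static_cast<std::uint32_t>(Ids.size()));
    if (Inserted)
      Ids.push_back(Hash);
    return It->second;
  }

  // Maps an index back to the hash it stands for.
  std::uint64_t lookup(std::uint32_t Index) const { return Ids.at(Index); }

  std::size_t size() const { return Ids.size(); }

private:
  std::vector<std::uint64_t> Ids;                            // unique hashes
  std::unordered_map<std::uint64_t, std::uint32_t> IndexOf;  // hash -> index
};

// A callsite or allocation record keeps indices into the table instead of
// repeating the hashes themselves.
std::vector<std::uint32_t>
encodeStack(StackIdTable &Table, const std::vector<std::uint64_t> &Hashes) {
  std::vector<std::uint32_t> Indices;
  Indices.reserve(Hashes.size());
  for (std::uint64_t H : Hashes)
    Indices.push_back(Table.intern(H));
  return Indices;
}
```

Two call stacks that share frames then share table entries: encoding
{0x10, 0x20, 0x30} followed by {0x20, 0x30, 0x40} leaves four hashes in the
table rather than six.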

Differential Revision: https://reviews.llvm.org/D135714

23 months ago[AVR] Add FeatureEIJMPCALL to FamilyAVR6
Ayke van Laethem [Mon, 7 Nov 2022 17:57:35 +0000 (18:57 +0100)]
[AVR] Add FeatureEIJMPCALL to FamilyAVR6

This feature was probably missed when adding FamilyAVR6, but should
definitely be there. I checked all four devices in the AVR6 family and
they all support eijmp/eicall.

Found while working on https://reviews.llvm.org/D137572.

Differential Revision: https://reviews.llvm.org/D137573

23 months ago[AVR][Clang] Implement __AVR_ARCH__ macro
Ayke van Laethem [Mon, 7 Nov 2022 02:36:08 +0000 (03:36 +0100)]
[AVR][Clang] Implement __AVR_ARCH__ macro

This macro is defined in avr-gcc, and is very useful especially in
assembly code to check whether particular instructions are supported. It
is also the basis for other macros like __AVR_HAVE_ELPM__.
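
As a minimal sketch of how code might consult the macro, the helper below
returns its value, or -1 when the translation unit is not being compiled for
AVR. `avrArch` is a hypothetical name used only for illustration.

```cpp
#include <cassert>

// Hypothetical helper: exposes __AVR_ARCH__ as a compile-time constant.
constexpr int avrArch() {
#if defined(__AVR_ARCH__)
  return __AVR_ARCH__; // defined by the compiler when targeting AVR
#else
  return -1;           // not an AVR target
#endif
}
```

The same `#if defined(__AVR_ARCH__)` test works in preprocessed assembly
files, which is where the commit message expects the macro to be most useful.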

Differential Revision: https://reviews.llvm.org/D137521

23 months ago[AVR][Clang] Move family names into MCU list
Ayke van Laethem [Mon, 7 Nov 2022 02:04:54 +0000 (03:04 +0100)]
[AVR][Clang] Move family names into MCU list

This simplifies the code by avoiding some special cases for family names
(as opposed to device names).

Differential Revision: https://reviews.llvm.org/D137520

23 months ago[clang-tidy] Optionally ignore findings in macros in `readability-const-return-type`.
Thomas Etter [Mon, 14 Nov 2022 19:08:11 +0000 (19:08 +0000)]
[clang-tidy] Optionally ignore findings in macros in `readability-const-return-type`.

Adds support for options-controlled configuration of the check to ignore results in macros.

Differential Revision: https://reviews.llvm.org/D137972

23 months ago[clang-tidy] Ignore overridden methods in `readability-const-return-type`.
Thomas Etter [Mon, 14 Nov 2022 18:56:07 +0000 (18:56 +0000)]
[clang-tidy] Ignore overridden methods in `readability-const-return-type`.

Overrides are constrained by the signature of the overridden method, so a
warning on an override is frequently unactionable.
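
A small sketch of the situation, using hypothetical `Shape` and `Triangle`
types: the const-qualified return type is chosen by the base class, and the
override has to repeat it to remain a valid override.

```cpp
#include <cassert>
#include <string>

struct Shape {
  virtual ~Shape() = default;
  // The base declaration decides the return type, including its `const`.
  virtual const std::string name() const { return "shape"; }
};

struct Triangle : Shape {
  // Returning plain std::string here would no longer match the overridden
  // function's return type, so the author of Triangle cannot act on a
  // warning at this declaration without also changing Shape.
  const std::string name() const override { return "triangle"; }
};

// Calls through the base interface to show that the override is used.
std::string describe(const Shape &S) { return S.name(); }
```

Fixing the finding means editing `Shape::name()`, which may live in code the
author of `Triangle` does not control.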

Differential Revision: https://reviews.llvm.org/D137968

23 months agoPEI should be able to use backward walk in replaceFrameIndicesBackward.
Alexander Timofeev [Fri, 4 Nov 2022 20:16:46 +0000 (21:16 +0100)]
PEI should be able to use backward walk in replaceFrameIndicesBackward.

The backward register scavenger has correct register
liveness information. PEI should leverage the backward register scavenger.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D137574

23 months ago[Assignment Tracking][20/*] Account for assignment tracking in DSE
OCHyams [Tue, 15 Nov 2022 13:38:03 +0000 (13:38 +0000)]
[Assignment Tracking][20/*] Account for assignment tracking in DSE

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

DeadStoreElimination shortens stores that are shadowed by later stores such
that the overlapping part of the earlier store is omitted. Insert an unlinked
dbg.assign intrinsic with a variable fragment that describes the omitted part
to signal that that fragment of the variable has a stale value in memory.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133315

23 months ago[AAPointerInfo] refactor how offsets and Access objects are tracked
Sameer Sahasrabuddhe [Tue, 15 Nov 2022 13:22:11 +0000 (18:52 +0530)]
[AAPointerInfo] refactor how offsets and Access objects are tracked

This restores commit b756096b0cbef0918394851644649b3c28a886e2, which was
originally reverted in 00b09a7b18abb253d36b3d3e1c546007288f6e89.

AAPointerInfo now maintains a list of all Access objects that it owns, along
with the following maps:

- OffsetBins: OffsetAndSize -> { Access }
- InstTupleMap: RemoteI x LocalI -> Access

A RemoteI is any instruction that accesses memory. RemoteI is different from
LocalI if and only if LocalI is a call; then RemoteI is some instruction in the
callgraph starting from LocalI.

Motivation: When AAPointerInfo recomputes the offset for an instruction, it sets
the value to Unknown if the new offset is not the same as the old offset. The
instruction must now be moved from its current bin to the bin corresponding to
the new offset. This happens for example, when:

- A PHINode has operands that result in different offsets.
- The same remote inst is reachable from the same local inst via different paths
  in the callgraph:

```
               A (local inst)
               |
               B
              / \
             C1  C2
              \ /
               D (remote inst)

```
This fixes a bug where a store is incorrectly eliminated in a lit test.
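
The bookkeeping described above can be sketched roughly as follows. This is
a toy model under stated assumptions, not the Attributor implementation:
`AccessTable`, `addAccess`, `binSize`, the integer instruction ids and the
{-1, -1} sentinel for an unknown range are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

using InstId = int;                                  // stand-in for an instruction
using Range = std::pair<std::int64_t, std::int64_t>; // (offset, size)
const Range UnknownRange{-1, -1};                    // assumed sentinel

class AccessTable {
public:
  // Records the access for (Remote, Local). If the pair was seen before with
  // a different range, the access moves to the bin for UnknownRange.
  void addAccess(InstId Remote, InstId Local, Range R) {
    auto Key = std::make_pair(Remote, Local);
    auto It = InstTupleMap.find(Key);
    if (It == InstTupleMap.end()) {
      std::size_t Id = Accesses.size();
      Accesses.push_back(R);
      InstTupleMap[Key] = Id;
      OffsetBins[R].insert(Id);
      return;
    }
    std::size_t Id = It->second;
    Range Old = Accesses[Id];
    if (Old == R)
      return;
    // Conflicting offsets: take the access out of its old bin.
    OffsetBins[Old].erase(Id);
    if (OffsetBins[Old].empty())
      OffsetBins.erase(Old);
    Accesses[Id] = UnknownRange;
    OffsetBins[UnknownRange].insert(Id);
  }

  std::size_t binSize(const Range &R) const {
    auto It = OffsetBins.find(R);
    return It == OffsetBins.end() ? 0 : It->second.size();
  }

  std::size_t numAccesses() const { return Accesses.size(); }

private:
  std::vector<Range> Accesses;                                  // owned accesses
  std::map<Range, std::set<std::size_t>> OffsetBins;            // range -> accesses
  std::map<std::pair<InstId, InstId>, std::size_t> InstTupleMap; // (remote, local)
};
```

In the diamond above, reaching D through C1 at one offset and through C2 at
another would call `addAccess` twice with the same (D, A) pair, leaving a
single access in the unknown bin instead of a stale entry in the first bin.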

Reviewed By: jdoerfert, ye-luo

Differential Revision: https://reviews.llvm.org/D136526

23 months ago[Assignment Tracking][19/*] Account for assignment tracking in ADCE
OCHyams [Tue, 15 Nov 2022 13:11:25 +0000 (13:11 +0000)]
[Assignment Tracking][19/*] Account for assignment tracking in ADCE

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

In an attempt to preserve more info, don't delete dbg.assign intrinsics that
are considered "out of scope" if they're linked to instructions.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133314

23 months ago[AArch64][SVE] Fix bad PTEST(PTRUE_ALL, PTEST_LIKE) optimization
Cullen Rhodes [Tue, 15 Nov 2022 12:00:00 +0000 (12:00 +0000)]
[AArch64][SVE] Fix bad PTEST(PTRUE_ALL, PTEST_LIKE) optimization

AArch64InstrInfo::optimizePTestInstr attempts to remove a PTEST of a
predicate generating operation that identically sets flags (implicitly).

When the mask is an all active of matching element size the PTEST is
currently removed. For while instructions this is correct since they
perform an implicit PTEST with an all active mask. However, for other
instructions such as compares the mask could be different.

This patch fixes this bug by only removing the PTEST if the same all
active mask is used by the predicate-generating instruction.

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D137718

23 months agoUse TI.hasBuiltinAtomic() when setting ATOMIC_*_LOCK_FREE values. NFCI
Alex Richardson [Tue, 15 Nov 2022 12:29:23 +0000 (12:29 +0000)]
Use TI.hasBuiltinAtomic() when setting ATOMIC_*_LOCK_FREE values. NFCI

I noticed that the values for __{CLANG,GCC}_ATOMIC_POINTER_LOCK_FREE were
incorrectly set to 1 instead of 2 in downstream CHERI targets because
pointers are handled specially there. While fixing this downstream, I
noticed that the existing code could be refactored to use
TargetInfo::hasBuiltinAtomic instead of repeating the almost identical
logic. In theory there could be a difference here since hasBuiltinAtomic() also
returns true for types less than 1 char in size, but since
InitializePredefinedMacros() never passes such a value this change should
not introduce any functional changes.
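
For context, the standard library macro ATOMIC_POINTER_LOCK_FREE takes the
values 0, 1 or 2, meaning never, sometimes or always lock-free. The sketch
below names those values; `LockFreedom` and `pointerLockFreedom` are
hypothetical names used only for illustration.

```cpp
#include <atomic>
#include <cassert>

// The three values an ATOMIC_*_LOCK_FREE macro can take.
enum class LockFreedom { Never = 0, Sometimes = 1, Always = 2 };

// Reports how the current target classifies atomic pointer operations.
constexpr LockFreedom pointerLockFreedom() {
  return static_cast<LockFreedom>(ATOMIC_POINTER_LOCK_FREE);
}
```

A target that reports `Sometimes` where `Always` is correct, as in the
downstream case described above, is being needlessly conservative about
atomic pointer operations.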

Reviewed By: rprichard, efriedma

Differential Revision: https://reviews.llvm.org/D135142

23 months ago[AArch64] Add some missing tests for FNMADD combine patterns. NFC.
Sjoerd Meijer [Tue, 15 Nov 2022 12:22:56 +0000 (17:52 +0530)]
[AArch64] Add some missing tests for FNMADD combine patterns. NFC.

23 months ago[clang][Tooling] Sort filenames in test
Kadir Cetinkaya [Tue, 15 Nov 2022 12:31:29 +0000 (13:31 +0100)]
[clang][Tooling] Sort filenames in test

23 months ago[Assignment Tracking][18/*] Account for assignment tracking in LICM
OCHyams [Tue, 15 Nov 2022 12:18:49 +0000 (12:18 +0000)]
[Assignment Tracking][18/*] Account for assignment tracking in LICM

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Merge DIAssignID attachments on stores that are merged and sunk out of
loops. The store may be sunk into multiple exit blocks, and in this case all
the copies of the store get the same DIAssignID.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133313

23 months ago[NFC][AArch64] SME2 Add instruction name convention and fix LookupTable number of...
Caroline Concatto [Tue, 8 Nov 2022 12:49:56 +0000 (12:49 +0000)]
[NFC][AArch64] SME2 Add instruction name convention and fix LookupTable number of registers

This patch adds the naming convention for SME instructions.
It also fixes the number of registers for LookUpTable in the AsmParser.
The number of registers is not used at the moment, but it is needed
because the switch in getNumRegsForRegKind has to cover every
RegKind enumerator.

23 months ago[NFC][SME2] Change instruction name for ADD/SUB array accumulator
Caroline Concatto [Tue, 15 Nov 2022 11:50:20 +0000 (11:50 +0000)]
[NFC][SME2] Change instruction name for ADD/SUB array accumulator

Now the names for all ADD, SUB, FADD and FSUB array accumulator instructions
are consistent with the developer's page and their operands.

23 months ago[AArch64][SVE] Fix bad PTEST(X, X) optimization
Cullen Rhodes [Tue, 15 Nov 2022 10:55:15 +0000 (10:55 +0000)]
[AArch64][SVE] Fix bad PTEST(X, X) optimization

AArch64InstrInfo::optimizePTestInstr attempts to remove a PTEST of a
predicate generating operation that identically sets flags (implicitly).

When the mask is the same as the input predicate the PTEST is currently
removed. This is incorrect since the mask for the implicit PTEST
performed by the flag-setting instruction differs from the mask
specified to the explicit PTEST and could set different flags.

For example, consider

  PG=<1, 1, x, x>
  Z0=<1, 2, x, x>
  Z1=<2, 1, x, x>

  X=CMPLE(PG, Z0, Z1)
   =<0, 1, x, x>       NZCV=0xxx
  PTEST(X, X),         NZCV=1xxx

where the first active flag (bit 'N' in NZCV) is set by the explicit
PTEST, but not by the implicit PTEST as part of the compare. Given the
PTEST mask and source are the same however, first is equivalent to any,
so the PTEST could be removed if the condition is changed. The same
applies to last active. It is safe to remove the PTEST for any active,
but this information isn't available in the current optimization.

This patch fixes the bad optimization; a later patch will implement the
optimization proposed above and fix the any active case.
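
The flag difference in the example can be checked with a small model of the
two conditions. This is a sketch under assumptions: only the two lanes with
known values are modelled (the 'x' lanes are left out), predicates are plain
`std::vector<bool>`, and `firstActive` and `anyActive` are hypothetical
helper names, not LLVM functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Pred = std::vector<bool>;

// "First active" (the N flag): true when the first lane that is active in
// Mask is also true in Src. False when Mask has no active lane.
bool firstActive(const Pred &Mask, const Pred &Src) {
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I])
      return Src[I];
  return false;
}

// "Any active": true when some lane is active in Mask and true in Src.
bool anyActive(const Pred &Mask, const Pred &Src) {
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I] && Src[I])
      return true;
  return false;
}
```

With PG = {true, true} and X = {false, true} as in the example,
`firstActive(PG, X)` is false while `firstActive(X, X)` is true, which is the
mismatch in the N flag. `anyActive(PG, X)` and `anyActive(X, X)` agree, and
`firstActive(X, X)` equals `anyActive(X, X)`, matching the observation that
first is equivalent to any when the mask and the source are the same.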

Reviewed By: bsmith

Differential Revision: https://reviews.llvm.org/D137717

23 months ago[Assignment Tracking][17/*] Account for assignment tracking in memcpyopt
OCHyams [Tue, 15 Nov 2022 11:51:10 +0000 (11:51 +0000)]
[Assignment Tracking][17/*] Account for assignment tracking in memcpyopt

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Maintain and propagate DIAssignID attachments in memcpyopt.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133312

23 months ago[Assignment Tracking][16/*] Account for assignment tracking in mldst-motion
OCHyams [Tue, 15 Nov 2022 11:28:20 +0000 (11:28 +0000)]
[Assignment Tracking][16/*] Account for assignment tracking in mldst-motion

The Assignment Tracking debug-info feature is outlined in this RFC:
https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

mldst-motion will merge and sink the stores in if-diamond branches into the
common successor. Attach a merged DIAssignID to the merged store.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133311

23 months ago[Assignment Tracking] Update mem2reg tests to use opaque pointers
OCHyams [Tue, 15 Nov 2022 11:20:59 +0000 (11:20 +0000)]
[Assignment Tracking] Update mem2reg tests to use opaque pointers

Follow up to 0946e463e8649896654b0dd39193db76a5789e11 (D133295).

23 months ago[Assignment Tracking][12/*] Account for assignment tracking in mem2reg
OCHyams [Tue, 15 Nov 2022 10:52:45 +0000 (10:52 +0000)]
[Assignment Tracking][12/*] Account for assignment tracking in mem2reg

The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

The changes for assignment tracking in mem2reg don't require much of a
deviation from existing behaviour. dbg.assign intrinsics linked to an alloca
are treated much in the same way as dbg.declare users of an alloca, except that
we don't insert dbg.value intrinsics to describe assignments when there is
already a dbg.assign intrinsic present, e.g. one linked to a store that is
going to be removed.

Reviewed By: jmorse

Differential Revision: https://reviews.llvm.org/D133295

23 months ago[MCA][X86] Ensure the avx512 gfni tests use the upper xmm/ymm registers
Simon Pilgrim [Tue, 15 Nov 2022 11:06:50 +0000 (11:06 +0000)]
[MCA][X86] Ensure the avx512 gfni tests use the upper xmm/ymm registers

Ensure we're testing the avx512vl gfni instructions and not the avx gfni instructions