platform/upstream/llvm.git
2 years ago"Re-apply 4b6c2cd642 "Deferred Concept Instantiation Implementation""""
Erich Keane [Thu, 5 May 2022 15:35:13 +0000 (08:35 -0700)]
"Re-apply 4b6c2cd642 "Deferred Concept Instantiation Implementation""""

This includes a fix for the libc++ issue I ran across with friend
declarations not properly being identified as overloads.

This reverts commit 45c07db31cc76802a1a2e41bed1ce9c1b8198181.

2 years ago[flang] Fix windows bot after D125140
Jean Perier [Mon, 9 May 2022 13:23:46 +0000 (15:23 +0200)]
[flang] Fix windows bot after D125140

The ifdef is not required in the header, common::int128_t is always
defined. The function declaration must be available in lowering
regardless of the host int128_t support.

Differential Revision: https://reviews.llvm.org/D125211

2 years ago[riscv] Fix state tracking bug on vsetvli (phi of vsetvli) peephole
Philip Reames [Mon, 9 May 2022 13:20:24 +0000 (06:20 -0700)]
[riscv] Fix state tracking bug on vsetvli (phi of vsetvli) peephole

This fixes the first of several cases where the state computed in phase 1 and 2 of the algorithm differs from the state computed during phase 3. Note that such differences can cause miscompiles by creating disagreements about contents of the VL and VTYPE registers at block boundaries.

In this particular case, we recognize that, for the first vsetvli in a block, if the AVL is a phi of GPR results from previous vsetvlis and the VTYPE field matches, we can avoid emitting a vsetvli because the register contents don't change. Unfortunately, the abstract state does change, and that update was lost.

As noted in the test change, this can actually improve results by preserving information until later state transitions in the block. However, this minor codegen improvement is not the motivation for the patch. The motivation is to avoid a case where we break a key internal correctness invariant.

Differential Revision: https://reviews.llvm.org/D125133

2 years ago[demangler] No need to space adjacent template closings
Nathan Sidwell [Mon, 28 Mar 2022 19:38:24 +0000 (12:38 -0700)]
[demangler] No need to space adjacent template closings

With the demangler parenthesizing 'a >> b' inside template parameters,
because of C++11 parsing of >> there, we don't really need to add spaces
between adjacent template arg closing '>' chars.  In 2022, that just
looks odd.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D123134

2 years ago[DAG] Use isAnyConstantBuildVector. NFC
David Green [Mon, 9 May 2022 13:13:03 +0000 (14:13 +0100)]
[DAG] Use isAnyConstantBuildVector. NFC

As suggested from 02f8519502447de, this uses the
isAnyConstantBuildVector method in lieu of separate
isBuildVectorOfConstantSDNodes calls. It should
otherwise be an NFC.

2 years ago[ScalarEvolution] Fold %x umin_seq %y if %x cannot be zero
Nikita Popov [Mon, 9 May 2022 13:01:27 +0000 (15:01 +0200)]
[ScalarEvolution] Fold %x umin_seq %y if %x cannot be zero

Fold %x umin_seq %y to %x umin %y if %x cannot be zero. They only
differ in semantics for %x==0.

More generally %x *_seq %y folds to %x * %y if %x cannot be the
saturation fold (though currently we only have umin_seq).
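
The difference at %x==0 can be illustrated with a toy model. This is only a
sketch, not ScalarEvolution code: std::nullopt stands in for a poison value,
and the names Val, umin and uminSeq are made up for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Toy model: std::nullopt plays the role of poison.
using Val = std::optional<uint32_t>;

// Plain umin: poison in either operand poisons the result.
Val umin(Val X, Val Y) {
  if (!X || !Y)
    return std::nullopt;
  return std::min(*X, *Y);
}

// Sequential umin: once X is 0 (the saturation value for umin),
// Y is never inspected, so poison in Y does not propagate.
Val uminSeq(Val X, Val Y) {
  if (!X)
    return std::nullopt;
  if (*X == 0)
    return uint32_t{0};
  return umin(X, Y);
}
```

In this model the two functions agree whenever X is known to be non-zero,
which is the condition the fold relies on.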

2 years ago[X86] Replace avx512f integer mul reduction builtins with generic builtin
Simon Pilgrim [Mon, 9 May 2022 13:10:17 +0000 (14:10 +0100)]
[X86] Replace avx512f integer mul reduction builtins with generic builtin

D117829 added the generic "__builtin_reduce_mul" which we can use to replace the x86 specific integer mul reduction builtins - internally these were mapping to the same intrinsic already so there are no test changes required.

Differential Revision: https://reviews.llvm.org/D125222

2 years ago[ScalarEvolution] Add tests for umin_seq with non-zero operand (NFC)
Nikita Popov [Mon, 9 May 2022 13:02:41 +0000 (15:02 +0200)]
[ScalarEvolution] Add tests for umin_seq with non-zero operand (NFC)

2 years ago[AArch64][SVE] Improve codegen when extracting first lane of active lane mask
Rosie Sumpter [Mon, 9 May 2022 08:35:13 +0000 (09:35 +0100)]
[AArch64][SVE] Improve codegen when extracting first lane of active lane mask

When extracting the first lane of a predicate created using the
llvm.get.active.lane.mask intrinsic, it should give the same codegen as
when the predicate is created using the llvm.aarch64.sve.whilelo
intrinsic, since get.active.lane.mask is lowered to whilelo. This patch
ensures the codegen is the same by recognizing
llvm.get.active.lane.mask as a flag-setting operation in this case.

Differential Revision: https://reviews.llvm.org/D125215

2 years ago[clangd] Rewrite TweakTesting helpers to avoid reparsing the same code. NFC
Sam McCall [Fri, 6 May 2022 17:51:59 +0000 (19:51 +0200)]
[clangd] Rewrite TweakTesting helpers to avoid reparsing the same code. NFC

Previously the EXPECT_AVAILABLE macros would rebuild the code at each marked
point, by expanding the cases textually.
There were often lots, and it's nice to have lots!

This reduces total unittest time by ~10% on my machine.
I did have to sacrifice a little apply() coverage in AddUsingTests (was calling
expandCases directly, which was otherwise unused), but we have
EXPECT_AVAILABLE tests covering that, so I don't think there's real risk here.

Differential Revision: https://reviews.llvm.org/D125109

2 years agoRecommit "[SimpleLoopUnswitch] Collect either logical ANDs/ORs but not both."
Florian Hahn [Mon, 9 May 2022 12:49:12 +0000 (13:49 +0100)]
Recommit "[SimpleLoopUnswitch] Collect either logical ANDs/ORs but not both."

This reverts commit 7211d5ce07830ebfa2cfc30818cd7155375f7e47.

This version fixes a crash that caused buildbot failures with the first
version.

2 years ago[SimpleLoopUnswitch] Add test case for crash with db7a87ed4fa7.
Florian Hahn [Mon, 9 May 2022 12:48:56 +0000 (13:48 +0100)]
[SimpleLoopUnswitch] Add test case for crash with db7a87ed4fa7.

2 years ago[clangd] Skip extra round-trip in parsing args in debug builds. NFC
Sam McCall [Sat, 7 May 2022 15:31:52 +0000 (17:31 +0200)]
[clangd] Skip extra round-trip in parsing args in debug builds. NFC

This is a clever cross-cutting sanity test for clang's arg parsing I suppose.
But clangd creates thousands of invocations, ~all with identical trivial
arguments, and problems with these would be caught by clang's tests.
This overhead accounts for 10% of total unittest time!

Differential Revision: https://reviews.llvm.org/D125169

2 years ago[clangd] Disable predefined macros in tests. NFC
Sam McCall [Sat, 7 May 2022 17:02:29 +0000 (19:02 +0200)]
[clangd] Disable predefined macros in tests. NFC

These aren't needed. With them the generated predefines buffer is 13KB.
For every TestTU, we must:
 - generate the buffer (3 times: parsing preamble, scanning preamble, main file)
 - parse the buffer (again 3 times)
 - serialize all the macros it defines in the PCH
 - compress the buffer itself to write it into the PCH
 - decompress it from the PCH

Avoiding this reduces unit test time by ~25%.

Differential Revision: https://reviews.llvm.org/D125172

2 years ago[NFC][LoopVectorize] Add SVE test for tail-folding combined with interleaving
David Sherwood [Thu, 5 May 2022 08:48:31 +0000 (09:48 +0100)]
[NFC][LoopVectorize] Add SVE test for tail-folding combined with interleaving

Differential Revision: https://reviews.llvm.org/D125001

2 years ago[demangler] Buffer peeking needs buffer
Nathan Sidwell [Mon, 28 Mar 2022 19:38:48 +0000 (12:38 -0700)]
[demangler] Buffer peeking needs buffer

The output buffer has a 'back' member, which returns NUL when you try
it with an empty buffer.  But there are no use cases that need that
additional functionality.  This makes the 'back' member behave more
like STL containers' back members.  (It still returns a value, not a
reference.)
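
A minimal sketch of the resulting contract, using a hypothetical
ToyOutputBuffer rather than the real demangler class: back() requires a
non-empty buffer, as std::string::back() does, but still returns by value.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the demangler's output buffer.
class ToyOutputBuffer {
  std::string Data;

public:
  void append(char C) { Data.push_back(C); }
  bool empty() const { return Data.empty(); }

  // Precondition: the buffer is non-empty. There is no special NUL
  // result for the empty case. Returns a value, not a reference.
  char back() const {
    assert(!Data.empty() && "back() called on an empty buffer");
    return Data.back();
  }
};
```

Callers that might see an empty buffer are expected to check empty() first.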

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D123201

2 years ago[Clang] Add integer mul reduction builtin
Simon Pilgrim [Mon, 9 May 2022 10:53:27 +0000 (11:53 +0100)]
[Clang] Add integer mul reduction builtin

Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.mul intrinsic call.

For other reductions, we've tried to share builtins for float/integer vectors, but the fmul reduction intrinsic also takes a starting value argument and can either do unordered or serialized, but not reduction-trees as specified for the builtins. However we address fmul support, this shouldn't affect the integer case.
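
As a rough portable model of what an integer multiply reduction computes
(this does not use the builtin itself; reduceMul is a made-up name):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Multiply all lanes together. Unsigned arithmetic wraps modulo 2^32,
// so the result does not depend on the order of the multiplications.
template <std::size_t N>
uint32_t reduceMul(const std::array<uint32_t, N> &V) {
  uint32_t Acc = 1;
  for (uint32_t E : V)
    Acc *= E;
  return Acc;
}
```

The order-independence is what separates the integer case from fmul, where
the evaluation order can change the result.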

Differential Revision: https://reviews.llvm.org/D117829

2 years ago[clang-tidy][NFC] Replace many instances of std::string where a StringRef would suffice.
Nathan James [Mon, 9 May 2022 11:01:45 +0000 (12:01 +0100)]
[clang-tidy][NFC] Replace many instances of std::string where a StringRef would suffice.

There are many instances in clang-tidy checks where owning strings are used when we already have a stable string from the options, so using a StringRef makes much more sense.
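
The idea, sketched with std::string_view as a stand-in for StringRef
(ToyOptions and ToyCheck are hypothetical names, not clang-tidy classes):

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical options holder; it owns the option string and is
// assumed to outlive every check that reads from it.
struct ToyOptions {
  std::string HeaderSuffix = ".h";
};

// The check only reads the option, so a non-owning view is enough:
// no allocation and no copy of the string.
struct ToyCheck {
  std::string_view Suffix;

  explicit ToyCheck(const ToyOptions &Opts) : Suffix(Opts.HeaderSuffix) {}

  bool matches(std::string_view FileName) const {
    return FileName.size() >= Suffix.size() &&
           FileName.substr(FileName.size() - Suffix.size()) == Suffix;
  }
};
```

This is only safe because the owning string is stable for the lifetime of
the view.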

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D124341

2 years agoFilter non-external static members from SBType::GetFieldAtIndex.
Sigurður Ásgeirsson [Mon, 9 May 2022 09:53:23 +0000 (11:53 +0200)]
Filter non-external static members from SBType::GetFieldAtIndex.

See [[ https://github.com/llvm/llvm-project/issues/55040 | issue 55040 ]] where static members of classes declared in the anonymous namespace are incorrectly returned as member fields from lldb::SBType::GetFieldAtIndex(). It appears that attrs.member_byte_offset contains a sentinel value for members that don't have a DW_AT_data_member_location.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D124409

2 years ago[SVE] Optimize new cases for lowerConvertToSVBool
Alban Bridonneau [Mon, 9 May 2022 10:15:54 +0000 (10:15 +0000)]
[SVE] Optimize new cases for lowerConvertToSVBool

Conversions to SVBool are already considered a nop if they
are converting an operand from a ptrue or a cmp, because
those zero the extra predicate lanes by construction.

This patch adds 2 similar cases:
- Wide cmps, which were not directly recognized by the test
for other forms of cmp
- Splats of 1, which will be generated as ptrue, and as such
will also zero the extra predicate lanes.

Reviewed By: paulwalker-arm, peterwaller-arm

Differential Revision: https://reviews.llvm.org/D124908

2 years ago[mlir][math] Promote (b)f16 to f32 when lowering to libm calls
Benjamin Kramer [Fri, 6 May 2022 13:34:44 +0000 (15:34 +0200)]
[mlir][math] Promote (b)f16 to f32 when lowering to libm calls

libm doesn't have overloads for the small types, so promote them to a
bigger type and use the f32 function.

Differential Revision: https://reviews.llvm.org/D125093

2 years ago[lldb/DWARF] Fix linking direction in CopyUniqueClassMethodTypes
Pavel Labath [Mon, 25 Apr 2022 09:24:45 +0000 (11:24 +0200)]
[lldb/DWARF] Fix linking direction in CopyUniqueClassMethodTypes

IIUC, the purpose of CopyUniqueClassMethodTypes is to link together
class definitions in two compile units so that we only have a single
definition of a class. It does this by adding entries to the die_to_type
and die_to_decl_ctx maps.

However, the direction of the linking seems to be reversed. It is taking
entries from the class that has not yet been parsed, and copying them to
the class which has been parsed already -- i.e., it is a very
complicated no-op.

Changing the linking order allows us to revert the changes in D13224
(while keeping the associated test case passing), and is sufficient to
fix PR54761, which was caused by an undesired interaction with that
patch.

Differential Revision: https://reviews.llvm.org/D124370

2 years ago[libcxx] [test] Fix the nasty_macros test on Windows on ARM/ARM64
Martin Storsjö [Wed, 4 May 2022 18:59:40 +0000 (21:59 +0300)]
[libcxx] [test] Fix the nasty_macros test on Windows on ARM/ARM64

Unfortunately, this isn't a configuration that we can practically
add to the CI at the moment, but I do run the tests
sporadically offline in this configuration.

Differential Revision: https://reviews.llvm.org/D124993

2 years ago[clang-format] Correctly handle SpaceBeforeParens for builtins.
Marek Kurdej [Fri, 6 May 2022 09:27:14 +0000 (11:27 +0200)]
[clang-format] Correctly handle SpaceBeforeParens for builtins.

That's a partial fix for https://github.com/llvm/llvm-project/issues/55292.

Before, known builtins behaved differently from other identifiers:
```
void f () { return F (__builtin_LINE() + __builtin_FOO ()); }
```
After:
```
void f () { return F (__builtin_LINE () + __builtin_FOO ()); }
```

Reviewed By: owenpan

Differential Revision: https://reviews.llvm.org/D125085

2 years ago[AArch64] Ampere1 does not support MTE
Philipp Tomsich [Sun, 8 May 2022 19:40:28 +0000 (21:40 +0200)]
[AArch64] Ampere1 does not support MTE

The initial support for the Ampere1 mistakenly signalled support for
the MTE feature.  However, the core does not include the optional MTE
functionality.

Update the target parser to not include MTE for Ampere1.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D125191

2 years ago[AArch64] Generate AND in place of CSEL for predicated CTTZ
Rahul Anand R [Mon, 9 May 2022 09:28:20 +0000 (10:28 +0100)]
[AArch64] Generate AND in place of CSEL for predicated CTTZ

This patch implements a target-specific optimization that replaces
the cmp and csel from cttz with an and mask.
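
A sketch of why a mask can stand in for the compare-and-select, assuming a
32-bit cttz that returns 32 for a zero input and a source pattern of the
form "x == 0 ? 0 : cttz(x)" (function names here are illustrative only):

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zeros; returns 32 for a zero input (assumed convention).
unsigned cttz32(uint32_t X) {
  if (X == 0)
    return 32;
  unsigned N = 0;
  while ((X & 1u) == 0) {
    X >>= 1;
    ++N;
  }
  return N;
}

// Select form: a compare plus a conditional select.
unsigned cttzOrZeroSelect(uint32_t X) { return X == 0 ? 0 : cttz32(X); }

// Mask form: 32 & 31 == 0, and the values 0..31 pass through unchanged.
unsigned cttzOrZeroMask(uint32_t X) { return cttz32(X) & 31u; }
```

Both forms return the same value for every input, so the select can be
dropped in favour of a single and.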

Differential Revision: https://reviews.llvm.org/D123782

2 years agoRevert "[lldb] parallelize calling of Module::PreloadSymbols()"
Pavel Labath [Mon, 9 May 2022 09:07:03 +0000 (11:07 +0200)]
Revert "[lldb] parallelize calling of Module::PreloadSymbols()"

This reverts commit b7d807dbcff0d9df466e0312b4fef57178d207be -- it
breaks TestMultipleDebuggers.py.

2 years ago[ConstraintElimination] Add initial ssub.with.overflow tests.
Florian Hahn [Mon, 9 May 2022 09:02:59 +0000 (10:02 +0100)]
[ConstraintElimination] Add initial ssub.with.overflow tests.

2 years ago[clang-format] Fix WhitespaceSensitiveMacros not being honoured when macro closing...
Marek Kurdej [Wed, 13 Apr 2022 12:06:57 +0000 (14:06 +0200)]
[clang-format] Fix WhitespaceSensitiveMacros not being honoured when macro closing parenthesis is followed by a newline.

Fixes https://github.com/llvm/llvm-project/issues/54522.

This fixes a regression introduced in https://github.com/llvm/llvm-project/commit/5e5efd8a91f2e340e79a73bedbc6ab66ad4a4281.

Before the culprit commit, macros in WhitespaceSensitiveMacros were correctly formatted even if their closing parenthesis wasn't followed by a semicolon (or, to be precise, when it was followed by a newline).
That commit changed the macro token type from TT_UntouchableMacroFunc to TT_FunctionLikeOrFreestandingMacro.

Correct formatting (with `WhitespaceSensitiveMacros = ['FOO']`):
```
FOO(1+2)
FOO(1+2);
```

Regressed formatting:
```
FOO(1 + 2)
FOO(1+2);
```

Reviewed By: HazardyKnusperkeks, owenpan, ksyx

Differential Revision: https://reviews.llvm.org/D123676

2 years ago[DAG] Prevent infinite loop combining bitcast shuffle
David Green [Mon, 9 May 2022 08:36:22 +0000 (09:36 +0100)]
[DAG] Prevent infinite loop combining bitcast shuffle

This prevents an infinite loop from D123801, where code trying to reduce
the total number of bitcasts, but also handling constants, could create
the opposite transform. Prevent the transform in these cases to let the
bitcast of a constant transform naturally.

Fixes #55345

2 years ago[AVR] Add PrintMethod for operand memspi
Ben Shi [Mon, 9 May 2022 08:31:24 +0000 (08:31 +0000)]
[AVR] Add PrintMethod for operand memspi

Reviewed By: Patryk27

Differential Revision: https://reviews.llvm.org/D124913

2 years ago[AMDGPU] Regenerate checks in a mir test
Abinav Puthan Purayil [Sun, 8 May 2022 18:16:43 +0000 (23:46 +0530)]
[AMDGPU] Regenerate checks in a mir test

2 years ago[flang] retain binding label of entry subprograms
Jean Perier [Mon, 9 May 2022 07:49:11 +0000 (09:49 +0200)]
[flang] retain binding label of entry subprograms

When processing an entry-stmt in name resolution, attrs_ was
reset before SetBindNameOn was called, causing the symbol to lose
the binding label information.

Differential Revision: https://reviews.llvm.org/D125097

2 years ago[CSSPGO][Preinliner] Use linear threshold to drive inline decision.
Hongtao Yu [Mon, 9 May 2022 05:05:54 +0000 (22:05 -0700)]
[CSSPGO][Preinliner] Use linear threshold to drive inline decision.

The per-callsite size threshold used today to drive preinline decision is based on hotness/coldness cutoff. The default setup is that callsites with a sample count above the hotness cutoff (99%) use a size threshold of 1500, while any callsite below the 99.99% coldness cutoff uses a zero threshold. This has a couple of issues:

1. While both cutoffs and size thresholds are configurable, different applications may need different setups, making a universal setup impractical.

2. The callsites between hotness cutoff and coldness cutoff are not considered as inline candidates, which could be a missing opportunity.

3. Hot callsites always use the same threshold. In reality we may want a bigger threshold for hotter callsites.

In this change we are introducing a linear threshold regardless of hot/cold cutoffs. Given a sample space, a threshold is computed for a callsite based on the position of that callsite sample in the whole space. With that we no longer need to define what's hot or cold. Callsites with different hotness will get a different threshold. This should overcome the above three issues.
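
One possible shape of such a linear threshold is sketched below. The formula,
the function name and the parameters are assumptions made for illustration;
the actual computation lives in the patch, not in this message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: Rank is the callsite's position in the sorted
// sample space (0 = coldest, Total - 1 = hottest), and MaxThreshold is
// an assumed upper bound on the size threshold.
uint64_t linearThreshold(uint64_t Rank, uint64_t Total,
                         uint64_t MaxThreshold) {
  if (Total <= 1)
    return MaxThreshold;
  // Scale linearly with the position; no hot/cold cutoff is involved.
  return MaxThreshold * Rank / (Total - 1);
}
```

With 101 samples and an example bound of 1500, the coldest callsite gets 0,
the median gets 750 and the hottest gets 1500.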

I have seen good results with a universal default setup for two of our internal services.

For one service, 0.2% to 0.5% perf improvement over a baseline with a previous default setup, on-par code size.
For the second service, 0.5% to 0.8% perf improvement over a baseline with a previous default setup, 0.2% code size increase; on-par performance and code size with a baseline that is with a carefully tuned cutoff to cover enough hot functions.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D125023

2 years ago[mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types
Christopher Bate [Thu, 5 May 2022 17:35:48 +0000 (11:35 -0600)]
[mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types

Adds missing logic in the lowering from NvGPU to NVVM to support fp32
(in an accumulator operand) and tf32 (in multiplicand operand) types.
Fixes logic in one of the helper functions for converting the result
of a mma.sync operation with multiple 8x256bit output tiles, which is
the case for f32 outputs.

Differential Revision: https://reviews.llvm.org/D124533

2 years ago[flang] Enforce a program not including more than one main program
Peixin-Qiao [Mon, 9 May 2022 02:47:47 +0000 (10:47 +0800)]
[flang] Enforce a program not including more than one main program

As Fortran 2018 5.2.2 states, a program shall consist of exactly one
main program. Add this semantic check.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D125186

2 years ago[lld] Fix typo for processAux; NFC
Xiaodong Liu [Mon, 9 May 2022 02:19:38 +0000 (10:19 +0800)]
[lld] Fix typo for processAux; NFC

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D125163

2 years ago[BOLT][DWARF] Fix assert for split dwarf.
Alexander Yermolovich [Sun, 8 May 2022 00:56:48 +0000 (17:56 -0700)]
[BOLT][DWARF] Fix assert for split dwarf.

Fixing a small bug where it would assert if a CU does not modify the .debug_addr section.

Differential Revision: https://reviews.llvm.org/D125181

2 years ago[SLP][X86] Add test coverage for PR50392 / Issue #49736
Simon Pilgrim [Sun, 8 May 2022 18:39:57 +0000 (19:39 +0100)]
[SLP][X86] Add test coverage for PR50392 / Issue #49736

2 years ago[libc][Obvious] Fix cmake usage of list PREPEND (unavailable pre-3.15).
Tue Ly [Sun, 8 May 2022 17:58:05 +0000 (13:58 -0400)]
[libc][Obvious] Fix cmake usage of list PREPEND (unavailable pre-3.15).

2 years ago[libc] Add LINK_LIBRARIES option to add_fp_unittest and add_libc_unittest.
Tue Ly [Thu, 5 May 2022 22:58:24 +0000 (22:58 +0000)]
[libc] Add LINK_LIBRARIES option to add_fp_unittest and add_libc_unittest.

This is needed to prepare for adding FLAGS option.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D125055

2 years ago[X86] Add test coverage for PR26515 / Issue #26889
Simon Pilgrim [Sun, 8 May 2022 17:19:04 +0000 (18:19 +0100)]
[X86] Add test coverage for PR26515 / Issue #26889

2 years ago[docs] Add Office Hours for Tobias Grosser
Groverkss [Sun, 8 May 2022 15:44:31 +0000 (21:14 +0530)]
[docs] Add Office Hours for Tobias Grosser

2 years ago[X86] Set some more plausible latencies for horizontal add/subs on znver1
Simon Pilgrim [Sun, 8 May 2022 14:48:42 +0000 (15:48 +0100)]
[X86] Set some more plausible latencies for horizontal add/subs on znver1

These are all microcoded/multi-pipe nightmares on Ryzen, but we shouldn't just be using the WriteMicrocoded class which is for REALLY bad microcoded nightmares - instead use the same approximate latencies as znver2 (Agner and uops.info both suggest similar values) - and make sure we use the FPU defs for both

Fixes #53242

2 years ago[DAG] Only perform the fold (A-B)+(C-D) --> (A+C)-(B+D) when both inner subs have...
Simon Pilgrim [Sun, 8 May 2022 12:51:58 +0000 (13:51 +0100)]
[DAG] Only perform the fold (A-B)+(C-D) --> (A+C)-(B+D) when both inner subs have one use

Fixes #51381
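
For reference, the two forms are equal in wrapping arithmetic, as this small
sketch checks (foldBefore and foldAfter are illustrative names). The one-use
restriction is presumably about profitability: if an inner sub has another
user it has to stay, so the rewrite would add work instead of removing it.

```cpp
#include <cassert>
#include <cstdint>

// Original form: two subs feeding an add.
uint32_t foldBefore(uint32_t A, uint32_t B, uint32_t C, uint32_t D) {
  return (A - B) + (C - D);
}

// Folded form: two adds feeding a sub. Equal modulo 2^32.
uint32_t foldAfter(uint32_t A, uint32_t B, uint32_t C, uint32_t D) {
  return (A + C) - (B + D);
}
```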

2 years ago[X86] combine-add.ll - add test case for PR52039 / Issue #51381
Simon Pilgrim [Sun, 8 May 2022 12:45:19 +0000 (13:45 +0100)]
[X86] combine-add.ll - add test case for PR52039 / Issue #51381

Also split AVX1/AVX2 test coverage

2 years ago[X86][AMX] Simplify AMX test case.
Luo, Yuanke [Sun, 8 May 2022 11:10:24 +0000 (19:10 +0800)]
[X86][AMX] Simplify AMX test case.

Extract test for zero tile configure into a small test case.

2 years ago[SLP][X86] Add test coverage for PR42652 / Issue #41997
Simon Pilgrim [Sun, 8 May 2022 11:09:14 +0000 (12:09 +0100)]
[SLP][X86] Add test coverage for PR42652 / Issue #41997

2 years ago[SLP][X86] Add test coverage for PR41892 / Issue #41237
Simon Pilgrim [Sun, 8 May 2022 10:40:53 +0000 (11:40 +0100)]
[SLP][X86] Add test coverage for PR41892 / Issue #41237

2 years ago[SLP][X86] Add test coverage for PR49934 / Issue #49278
Simon Pilgrim [Sun, 8 May 2022 10:33:01 +0000 (11:33 +0100)]
[SLP][X86] Add test coverage for PR49934 / Issue #49278

D124284 should help us vectorize the sub-128-bit vector cases

2 years ago[SLP][X86] Add test coverage for PR47491 / Issue #46835
Simon Pilgrim [Sun, 8 May 2022 10:24:46 +0000 (11:24 +0100)]
[SLP][X86] Add test coverage for PR47491 / Issue #46835

D124284 should help us vectorize the sub-128-bit vector cases

2 years ago[InstCombine] Add test coverage for PR43261 / Issue #42606
Simon Pilgrim [Sun, 8 May 2022 10:10:49 +0000 (11:10 +0100)]
[InstCombine] Add test coverage for PR43261 / Issue #42606

2 years ago[Headers][X86] Enable basic Wdocumentation testing on X86 headers
Simon Pilgrim [Sun, 8 May 2022 09:53:28 +0000 (10:53 +0100)]
[Headers][X86] Enable basic Wdocumentation testing on X86 headers

First part of Issue #35297 - we want to enable Wdocumentation-pedantic as well, but need '\n' support first which Issue #55319 is addressing

2 years ago[Headers][X86] Replace \operation with \code{.operation}
Simon Pilgrim [Sun, 8 May 2022 09:46:12 +0000 (10:46 +0100)]
[Headers][X86] Replace \operation with \code{.operation}

\operation ... \endoperation are not valid doxygen commands and cause issues when -Wdocumentation is enabled (Issue #35297)

This patch proposes to replace them with \code{.operation} ... \endcode blocks so that the pseudo-code is correctly retained in any documentation and downstream can use the ".operation" type for its own formatting.

Differential Revision: https://reviews.llvm.org/D125170

2 years ago[VectorCombine] Attempt to fold select shuffles from reductions
David Green [Sun, 8 May 2022 09:32:41 +0000 (10:32 +0100)]
[VectorCombine] Attempt to fold select shuffles from reductions

Given a commutative reduction leading from a shuffle, the order of the
lanes of the shuffle is not important for the result. This means we can
reorder the shuffle to something simpler, where we try shuffling the
first vector's lanes first. This was D123494.
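
The lane-order observation can be checked with a small model (a sketch with
made-up helper names, using a 4-lane add reduction as the commutative case):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>

using Vec4 = std::array<uint32_t, 4>;

// Single-source shuffle: Out[I] = In[Mask[I]].
Vec4 shuffle(const Vec4 &In, const std::array<int, 4> &Mask) {
  Vec4 Out{};
  for (std::size_t I = 0; I < 4; ++I)
    Out[I] = In[Mask[I]];
  return Out;
}

// Commutative (and associative) reduction over all lanes.
uint32_t reduceAdd(const Vec4 &V) {
  return std::accumulate(V.begin(), V.end(), uint32_t{0});
}
```

Any mask that is a permutation of the lanes gives the same reduction result
as the identity mask, so the simpler mask can be used.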

The new shuffle may not be profitable though, and if it is not we can
try the folding of select shuffles from D123911. This, with some
adjustment as the output lane ordering is now unimportant, can allow the
final shuffle to simplify given the inputs to the patterns from D123911.
Whereas each transformation on its own is not profitable, the
combination is.

We can only support a single shuffle when called from reductions, but we
are able to sort the ReconstructMask, potentially allowing it to
simplify to an identity or concat mask.

Differential Revision: https://reviews.llvm.org/D125086

2 years ago[X86] Fix some signedness errors in x86 headers
Simon Pilgrim [Sun, 8 May 2022 08:42:58 +0000 (09:42 +0100)]
[X86] Fix some signedness errors in x86 headers

Another step toward enabling full -Wsystem-headers testing across all x86 headers

Fix a number of cases where the arg / return value signedness doesn't match the C/C++ intrinsic.

So far I've just added explicit casts as necessary, but we might want to address some of the mismatches directly

Differential Revision: https://reviews.llvm.org/D125164

2 years ago[test][msan] Relax order of param shadow
Vitaly Buka [Sun, 8 May 2022 04:17:44 +0000 (21:17 -0700)]
[test][msan] Relax order of param shadow

Looks like different bots have them in a different order.

2 years ago[test][msa] Add more sse,avx intrinsics tests
Vitaly Buka [Sun, 8 May 2022 01:38:41 +0000 (18:38 -0700)]
[test][msa] Add more sse,avx intrinsics tests

2 years agoMake BinaryStreamWriter::padToAlignment write blocks vs bytes.
Stella Laurenzo [Mon, 2 May 2022 02:20:50 +0000 (19:20 -0700)]
Make BinaryStreamWriter::padToAlignment write blocks vs bytes.

While I think this is a performance improvement over the original, this actually fixes a correctness issue: For an appendable underlying stream, padToAlignment would fail if the additional padding would have caused the stream to grow since it was doing its own check on bounds. By deferring to the regular writeArray method this takes the same path as everything else, which does the correct bounds check in WritableBinaryStreamRef::checkOffsetForWrite (i.e. skips the extension check if BSF_Append is set). I had started to fix the existing bounds check in BinaryStreamWriter but deferred to this because it layered better and is more efficient/consistent.
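
The layering can be sketched with a toy append-only writer. ToyWriter is
hypothetical and is not the BinaryStreamWriter API; the point is only that
padding is written in blocks through the same write path as everything else,
so there is a single place for bounds handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical append-only writer.
class ToyWriter {
  std::vector<uint8_t> Buf;

public:
  std::size_t offset() const { return Buf.size(); }

  // The single write path; growth of the stream is handled here.
  void writeArray(const uint8_t *Data, std::size_t Len) {
    Buf.insert(Buf.end(), Data, Data + Len);
  }

  // Pad with zeros up to the next multiple of Align, one block at a
  // time instead of one byte at a time.
  void padToAlignment(std::size_t Align) {
    static const uint8_t Zeros[64] = {};
    std::size_t Target = (offset() + Align - 1) / Align * Align;
    while (offset() < Target) {
      std::size_t Chunk =
          std::min<std::size_t>(Target - offset(), sizeof(Zeros));
      writeArray(Zeros, Chunk);
    }
  }
};
```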

It didn't look like this method was tested at all, so I added a unit test.

Differential Revision: https://reviews.llvm.org/D124746

2 years ago[Frontend] Move, don't copy the predefines buffer into PP. NFC.
Sam McCall [Sat, 7 May 2022 23:03:35 +0000 (01:03 +0200)]
[Frontend] Move, don't copy the predefines buffer into PP. NFC.

It's not trivially small, >10kb.
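
The pattern in general terms, as a sketch (ToyPreprocessor is a made-up type,
not the clang Preprocessor):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical consumer that stores a large buffer.
struct ToyPreprocessor {
  std::string Predefines;

  // Taking the string by value and moving it into place lets a caller
  // hand over its buffer instead of having it copied.
  void setPredefines(std::string P) { Predefines = std::move(P); }
};
```

A caller that no longer needs its buffer passes std::move(Buffer), which
avoids duplicating the contents.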

2 years agoGenerate sse-intel-ocl.ll automatically. NFC
Amaury Séchet [Sat, 7 May 2022 22:44:40 +0000 (22:44 +0000)]
Generate sse-intel-ocl.ll automatically. NFC

2 years agoRegenerate avx512-regcall-NoMask.ll . NFC
Amaury Séchet [Sat, 7 May 2022 22:21:48 +0000 (22:21 +0000)]
Regenerate avx512-regcall-NoMask.ll . NFC

2 years ago[IROutliner] Accomodate blocks containing PHINodes with one entry outside the region...
Andrew Litteken [Sun, 1 May 2022 23:14:18 +0000 (18:14 -0500)]
[IROutliner] Accomodate blocks containing PHINodes with one entry outside the region and others inside the region.

When a PHINode has an incoming block from outside the region, it must be handled specially when assigning a global value number to each incoming value. A PHINode has multiple predecessors, and we must handle this case rather than only the single predecessor case.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D124777

2 years ago[RISCV] Regenerate rv32zbp-zbkb.ll
Simon Pilgrim [Sat, 7 May 2022 20:29:36 +0000 (21:29 +0100)]
[RISCV] Regenerate rv32zbp-zbkb.ll

Noticed in D124839

2 years ago[AArch64] Add missing NVCAST patterns.
David Green [Sat, 7 May 2022 20:08:14 +0000 (21:08 +0100)]
[AArch64] Add missing NVCAST patterns.

There were apparently some missing NVCAST patterns. This fills them in
using foreach, as opposed to having to specify them individually.

Fixes #55321

2 years ago[AMDGPU] lowerEXTRACT_VECTOR_ELT - fold from a SCALAR_TO_VECTOR source
Simon Pilgrim [Sat, 7 May 2022 19:23:24 +0000 (20:23 +0100)]
[AMDGPU] lowerEXTRACT_VECTOR_ELT - fold from a SCALAR_TO_VECTOR source

As suggested by @foad on D124839

If we're extracting a vector element that originally came from a scalar_to_vector, then avoid the bitcasting of a vector type and perform the shift masking on the (any-extended) scalar source directly, making use of the fact that the upper elements of a scalar_to_vector are all undef.

Differential Revision: https://reviews.llvm.org/D125173

2 years ago[LegalizeTypes] Make use of SelectionDAG::getShiftAmountConstant. NFC
Craig Topper [Sat, 7 May 2022 19:16:49 +0000 (12:16 -0700)]
[LegalizeTypes] Make use of SelectionDAG::getShiftAmountConstant. NFC

Instead of calling getShiftAmountTy and getConstant separately.

2 years ago[LegalizeTypes] Don't assume fshl/fshr shift amount type matches the other operands.
Craig Topper [Fri, 6 May 2022 17:31:18 +0000 (10:31 -0700)]
[LegalizeTypes] Don't assume fshl/fshr shift amount type matches the other operands.

Like other shifts, the type isn't required to match. We shouldn't
assume we can call ZExtPromotedInteger.

I tested the PromoteIntOp_FunnelShift locally by removing the promotion
of the shift amount from PromoteIntRes_FunnelShift. But with the final
version of this patch it is never executed on any tests.

Differential Revision: https://reviews.llvm.org/D125106

2 years ago[DAGCombine] Add node in the worklist in topological order in CombineTo
Amaury Séchet [Sat, 30 Apr 2022 20:36:40 +0000 (20:36 +0000)]
[DAGCombine] Add node in the worklist in topological order in CombineTo

This is part of an ongoing effort toward making DAGCombine process the nodes in topological order.

This is able to discover a couple of new optimizations, but also causes a couple of regressions. I nevertheless chose to submit this patch for review so as to start the discussion with people working on the backend, so we can find a good way forward.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D124743

2 years ago[ARM] Update ror.ll test to canonicalized IR
Simon Pilgrim [Sat, 7 May 2022 16:23:42 +0000 (17:23 +0100)]
[ARM] Update ror.ll test to canonicalized IR

As discussed on D124839, we're almost certainly only ever going to see this from IR directly - which now will create funnel shift intrinsics directly

I've also added a couple of rotl(rotr()) tests to check left/right rotation merging.

2 years ago[Headers][X86] amxintrin.h - fixed unknown parameter Wdocumentation warning. NFC
Simon Pilgrim [Sat, 7 May 2022 15:20:34 +0000 (16:20 +0100)]
[Headers][X86] amxintrin.h - fixed unknown parameter Wdocumentation warning. NFC

Noticed while triaging Issue #35297

2 years ago[Bitstream] Only consider flushing to file on block boundaries
Sam McCall [Fri, 6 May 2022 22:47:40 +0000 (00:47 +0200)]
[Bitstream] Only consider flushing to file on block boundaries

The goal of flushing to disk is to keep a reasonable bound on peak memory usage.
With a default threshold of 512MB (and most BitstreamWriters having no backing
file at all), checking after every byte whether to flush seems excessive.
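
A toy version of the idea, assuming a made-up ToyBitstream class rather than
the real BitstreamWriter: the per-byte path does no threshold check, and the
flush decision is made only when a block ends.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical writer that buffers bytes and "flushes" by clearing.
class ToyBitstream {
  std::vector<uint8_t> Buf;
  std::size_t Threshold;
  std::size_t Flushes = 0;

  void flushIfNeeded() {
    if (Buf.size() >= Threshold) {
      Buf.clear(); // stands in for writing the buffer out to a file
      ++Flushes;
    }
  }

public:
  explicit ToyBitstream(std::size_t T) : Threshold(T) {}

  // Hot path: no threshold check per byte.
  void emitByte(uint8_t B) { Buf.push_back(B); }

  // The threshold is only consulted on a block boundary.
  void exitBlock() { flushIfNeeded(); }

  std::size_t flushes() const { return Flushes; }
  std::size_t buffered() const { return Buf.size(); }
};
```

The buffer can overshoot the threshold within a block, which is the
trade-off for taking the check out of the hot path.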

This change makes clangd's unittests run 5% faster (in opt), so it's not
actually free even in the case with no backing file. Likely there are more
important workloads where it makes some difference.

Differential Revision: https://reviews.llvm.org/D125145

2 years agoconst char* for LLVMTargetMachineEmitToFile's argument
Amaury Séchet [Sat, 7 May 2022 14:26:21 +0000 (14:26 +0000)]
const char* for LLVMTargetMachineEmitToFile's argument

`LLVMTargetMachineEmitToFile` currently takes a `char* Filename`, but it doesn't modify it.
This is annoying to use in the case where you want to pass a const string, because you either have to remove the const, or copy it somewhere else and pass that. Either way, it's not very nice.

I added a const and clang formatted it. This shouldn't break any ABI in my opinion.
I'm sorry but I didn't know whom to put as reviewer for this, so I chose someone with a lot of commits from the .cpp file.
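
A small sketch of the inconvenience described above. `emitOld` and `emitNew`
are hypothetical stand-ins that only mirror the shape of the filename
parameter before and after the change; they are not the LLVM-C functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-ins: only the parameter type differs.
std::size_t emitOld(char *Filename) { return std::strlen(Filename); }
std::size_t emitNew(const char *Filename) { return std::strlen(Filename); }

std::size_t emitBothWays(const std::string &Path) {
  // Old signature: a const string needs a mutable copy (or a const_cast).
  std::string Copy = Path;
  std::size_t A = emitOld(&Copy[0]);

  // New signature: the const string can be passed directly.
  std::size_t B = emitNew(Path.c_str());

  assert(A == B && "both calls see the same file name");
  return B;
}
```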

Reviewed By: deadalnix

Differential Revision: https://reviews.llvm.org/D124453

2 years ago[X86] Add 32-bit target test coverage to clean header tests
Simon Pilgrim [Sat, 7 May 2022 14:23:29 +0000 (15:23 +0100)]
[X86] Add 32-bit target test coverage to clean header tests

2 years ago[SLP] Cluster ordering for loads
David Green [Sat, 7 May 2022 13:38:11 +0000 (14:38 +0100)]
[SLP] Cluster ordering for loads

Given a load without a better order, this patch partially sorts the
elements to form clusters of adjacent elements in memory. These clusters
can potentially be loaded in fewer loads, meaning less overall shuffling
(for example loading v4i8 clusters of a v16i8 as single f32 loads, as
opposed to multiple independent byte loads and inserts).
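
To illustrate what "clusters of adjacent elements" can mean, here is a
simplified sketch that groups lane indices whose element offsets are
consecutive. It is not the heuristic used by the patch; `clusterLoads` and
its offset representation are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Given the element offset of each lane's load from a common base, return
// groups of lane indices whose offsets are consecutive in memory.
std::vector<std::vector<std::size_t>>
clusterLoads(const std::vector<long> &Offsets) {
  // Order the lanes by the offset they load from.
  std::vector<std::size_t> Order(Offsets.size());
  std::iota(Order.begin(), Order.end(), std::size_t{0});
  std::stable_sort(Order.begin(), Order.end(),
                   [&](std::size_t A, std::size_t B) {
                     return Offsets[A] < Offsets[B];
                   });

  // Start a new cluster whenever the next offset is not the previous one + 1.
  std::vector<std::vector<std::size_t>> Clusters;
  for (std::size_t I = 0; I < Order.size(); ++I) {
    bool StartsNew =
        I == 0 || Offsets[Order[I]] != Offsets[Order[I - 1]] + 1;
    if (StartsNew)
      Clusters.emplace_back();
    assert(!Clusters.empty());
    Clusters.back().push_back(Order[I]);
  }
  return Clusters;
}
```

For the interleaved offsets `{8, 0, 9, 1, 10, 2, 11, 3}` this yields two
clusters of four lanes each, so each cluster could in principle be covered by
one wider load instead of four scalar loads.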

Differential Revision: https://reviews.llvm.org/D122145

2 years ago[X86] rdrand-builtins.c - add 32-bit target coverage and enable -Wall/-Werror
Simon Pilgrim [Sat, 7 May 2022 13:35:36 +0000 (14:35 +0100)]
[X86] rdrand-builtins.c - add 32-bit target coverage and enable -Wall/-Werror

2 years agoAutomatically generate aix32-cc-abi-vaarg.ll . NFC
Amaury Séchet [Sat, 7 May 2022 13:04:40 +0000 (13:04 +0000)]
Automatically generate aix32-cc-abi-vaarg.ll . NFC

2 years ago[InstCombine] fix miscompile when casting int->FP->int
Sanjay Patel [Sat, 7 May 2022 12:22:08 +0000 (08:22 -0400)]
[InstCombine] fix miscompile when casting int->FP->int

As shown in https://github.com/llvm/llvm-project/issues/55150 -
the existing fold may be wrong when converting to a signed value.
This is a quick fix to avoid the miscompile.

I added tests/comments for all of the signed/unsigned combinations
at either side of the boundary width, and tried to confirm with Alive2:
https://alive2.llvm.org/ce/z/3p9DSu

There are already some TODO items in the test file that suggest
possible refinements, so the regression with ui->FP->si is probably ok.
It seems unlikely that we'd see these kinds of edge cases with
non-byte-width integer types in real code. The potential miscompile
went undetected for several years.

This and 747c6a0c734e fix #55150.

Differential Revision: https://reviews.llvm.org/D124692

2 years ago[X86] Remove unused 'hint' argument from prefetch tests
Simon Pilgrim [Sat, 7 May 2022 12:38:40 +0000 (13:38 +0100)]
[X86] Remove unused 'hint' argument from prefetch tests

hint is a compile time constant and can't be passed in as a variable - we already hardcode

2 years ago[InstCombine] Remove side effect of replaced constrained intrinsics
Serge Pavlov [Thu, 5 May 2022 05:02:42 +0000 (12:02 +0700)]
[InstCombine] Remove side effect of replaced constrained intrinsics

If a constrained intrinsic call was replaced by some value, it was not
removed in some cases. The dangling call resulted in useless
instructions executed at run time. This happened because constrained
intrinsics usually have a side effect, which is used to model the interaction
with the floating-point environment. In some cases the side effect is actually
absent or can be ignored.

This change adds specific treatment of constrained intrinsics so that
they can be removed if the side effect is actually absent.

Differential Revision: https://reviews.llvm.org/D118426

2 years agoReland "[FuzzMutate] Split out FuzzerCLI library that doesn't depend on IR."
Sam McCall [Sat, 7 May 2022 11:44:42 +0000 (13:44 +0200)]
Reland "[FuzzMutate] Split out FuzzerCLI library that doesn't depend on IR."

This reverts commit a1bb952e833b34fdf03bd571e7f8c948191be018.

I'd somehow missed updating llvm-yaml-parser-fuzzer, now fixed.

2 years agoRevert "[FuzzMutate] Split out FuzzerCLI library that doesn't depend on IR."
Aaron Ballman [Sat, 7 May 2022 11:29:57 +0000 (07:29 -0400)]
Revert "[FuzzMutate] Split out FuzzerCLI library that doesn't depend on IR."

This reverts commit 1c5e85b3da649c89db87abecc53b42f6eaa574c2.

It broke a lot of bots with a link error:
https://lab.llvm.org/buildbot/#/builders/171/builds/14222
https://lab.llvm.org/buildbot/#/builders/188/builds/13748
https://lab.llvm.org/buildbot/#/builders/109/builds/38127

2 years agoFix underlining in docs to fix the sphinx build
Aaron Ballman [Sat, 7 May 2022 11:21:43 +0000 (07:21 -0400)]
Fix underlining in docs to fix the sphinx build

2 years ago[ISD::IndexType] Helper functions for common queries.
Paul Walker [Tue, 5 Apr 2022 16:48:22 +0000 (17:48 +0100)]
[ISD::IndexType] Helper functions for common queries.

Add helper functions to query the signed and scaled properties
of ISD::IndexType along with functions to change them.

Remove setIndexType from MaskedGatherSDNode because it only has
one usage and typically should only be changed alongside its
index operand.

Minimise the direct use of the enum values to lay the groundwork
for more refactoring.
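
A rough sketch of what such helpers can look like, assuming a four-value
enum that combines signedness and scaling. `IndexKind` and the function names
below are hypothetical and chosen for this example; they are not taken from
the patch.

```cpp
// Hypothetical enum combining two independent properties.
enum class IndexKind {
  SignedScaled,
  SignedUnscaled,
  UnsignedScaled,
  UnsignedUnscaled
};

// Query helpers: callers ask about a property instead of comparing enum values.
constexpr bool isSignedIndex(IndexKind K) {
  return K == IndexKind::SignedScaled || K == IndexKind::SignedUnscaled;
}
constexpr bool isScaledIndex(IndexKind K) {
  return K == IndexKind::SignedScaled || K == IndexKind::UnsignedScaled;
}

// Build a kind from the two properties.
constexpr IndexKind makeIndexKind(bool Signed, bool Scaled) {
  return Signed ? (Scaled ? IndexKind::SignedScaled : IndexKind::SignedUnscaled)
                : (Scaled ? IndexKind::UnsignedScaled
                          : IndexKind::UnsignedUnscaled);
}

// Change one property while preserving the other.
constexpr IndexKind withSignedness(IndexKind K, bool Signed) {
  return makeIndexKind(Signed, isScaledIndex(K));
}
constexpr IndexKind withScaling(IndexKind K, bool Scaled) {
  return makeIndexKind(isSignedIndex(K), Scaled);
}

static_assert(withSignedness(IndexKind::UnsignedScaled, true) ==
                  IndexKind::SignedScaled,
              "signedness changes, scaling is preserved");
```

Keeping direct enum comparisons inside a few helpers like these means a later
change to the representation only has to touch the helpers.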

Differential Revision: https://reviews.llvm.org/D123347

2 years ago[FuzzMutate] Split out FuzzerCLI library that doesn't depend on IR.
Sam McCall [Fri, 6 May 2022 08:36:34 +0000 (10:36 +0200)]
[FuzzMutate] Split out FuzzerCLI library that doesn't depend on IR.

All llvm-project fuzzers use this library to parse command-line arguments.
Many of them don't deal with LLVM IR or modules in any way. Bundling those
functions in one library forces build dependencies that don't need to be there.

Among other things, this means check-clang-pseudo no longer depends on most of
LLVM.

Differential Revision: https://reviews.llvm.org/D125081

2 years ago[FuzzMutate] Move LLVM module (de)serialization from FuzzerCLI -> IRMutator. NFC
Sam McCall [Fri, 6 May 2022 08:15:41 +0000 (10:15 +0200)]
[FuzzMutate] Move LLVM module (de)serialization from FuzzerCLI -> IRMutator. NFC

These are not directly related to the CLI, and are mostly (always?) used when
mutating the modules as part of fuzzing.

Motivation: split FuzzerCLI into its own library that does not depend on IR.
Subprojects that don't use IR should be fuzzed without the dependency.

Differential Revision: https://reviews.llvm.org/D125080

2 years ago[X86] Add description comments to SandyBridge for COPY/WriteZero/WriteVecMaskedGather...
Simon Pilgrim [Sat, 7 May 2022 09:42:12 +0000 (10:42 +0100)]
[X86] Add description comments to SandyBridge for COPY/WriteZero/WriteVecMaskedGatherWriteback cases. NFC.

Match other models.

Use X86WriteRes for WriteVecMaskedGatherWriteback like other models as well.

2 years ago[SLP] Add tests for awkward load orders from SLP. NFC
David Green [Sat, 7 May 2022 09:27:32 +0000 (10:27 +0100)]
[SLP] Add tests for awkward load orders from SLP. NFC

2 years ago[InstCombine] sub(add(X,Y),umin(Y,Z)) --> add(X,usub.sat(Y,Z))
Chenbing Zheng [Sat, 7 May 2022 09:17:48 +0000 (17:17 +0800)]
[InstCombine] sub(add(X,Y),umin(Y,Z)) --> add(X,usub.sat(Y,Z))

Alive2: https://alive2.llvm.org/ce/z/2UNVbp
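
As an informal cross-check of the identity in the title (the Alive2 link is
the actual proof), the sketch below tests it exhaustively for 8-bit values
with wrapping arithmetic. `usubSat` is a local helper written for this
example that models unsigned saturating subtraction.

```cpp
#include <algorithm>
#include <cassert> // for the usage shown below
#include <cstdint>

// Unsigned saturating subtraction: A - B, clamped at 0.
std::uint8_t usubSat(std::uint8_t A, std::uint8_t B) {
  return A > B ? static_cast<std::uint8_t>(A - B) : std::uint8_t{0};
}

// Checks sub(add(X,Y), umin(Y,Z)) == add(X, usub.sat(Y,Z)) for all i8 inputs.
bool foldHoldsForAllI8() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      for (unsigned Z = 0; Z < 256; ++Z) {
        // Truncating to uint8_t models the wrapping i8 add and sub.
        auto Before = static_cast<std::uint8_t>((X + Y) - std::min(Y, Z));
        auto After = static_cast<std::uint8_t>(
            X + usubSat(static_cast<std::uint8_t>(Y),
                        static_cast<std::uint8_t>(Z)));
        if (Before != After)
          return false;
      }
  return true;
}
```

The identity follows from usub.sat(Y,Z) = Y - umin(Y,Z), so a call such as
`assert(foldHoldsForAllI8());` is expected to pass.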

Reviewed By: RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D124503

2 years agoFix MLIR integration test after a8308020 (`func.` prefix is required by the parser...
Mehdi Amini [Sat, 7 May 2022 09:09:03 +0000 (09:09 +0000)]
Fix MLIR integration test after a8308020 (`func.` prefix is required by the parser now)

2 years ago[InstCombine] precommit some tests for reassociate add
Chenbing Zheng [Sat, 7 May 2022 07:52:28 +0000 (15:52 +0800)]
[InstCombine] precommit some tests for reassociate add

2 years ago[InstCombine] add casts from splat-a-bit pattern if necessary
Chenbing Zheng [Sat, 7 May 2022 07:34:57 +0000 (15:34 +0800)]
[InstCombine] add casts from splat-a-bit pattern if necessary

Splatting a bit of constant-index across a value:
sext (ashr (trunc iN X to iM), M-1) to iN --> ashr (shl X, N-M), N-1
If the dest type is different, use a cast (adjust use check).

https://alive2.llvm.org/ce/z/acAan3
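
A small sketch of the pattern for the concrete case N = 32, M = 8, where the
source and destination types match (the cast added by this patch covers the
case where they differ). The function names are made up for the example, and
it assumes two's-complement narrowing and an arithmetic right shift on signed
values, which C++20 guarantees and mainstream compilers provide earlier.

```cpp
#include <cassert> // for the usage shown below
#include <cstdint>

// sext (ashr (trunc i32 X to i8), 7) to i32
std::int32_t splatBefore(std::uint32_t X) {
  auto Trunc = static_cast<std::int8_t>(X);            // trunc to i8
  auto Shifted = static_cast<std::int8_t>(Trunc >> 7); // ashr by M-1
  return Shifted;                                      // sext to i32
}

// ashr (shl X, 24), 31
std::int32_t splatAfter(std::uint32_t X) {
  return static_cast<std::int32_t>(X << 24) >> 31; // shl by N-M, ashr by N-1
}

// Compares both forms for every low 16-bit pattern, with and without
// unrelated high bits set.
bool splatFoldHolds() {
  for (std::uint32_t Low = 0; Low <= 0xFFFFu; ++Low) {
    std::uint32_t WithHigh = Low | 0xABCD0000u;
    if (splatBefore(Low) != splatAfter(Low) ||
        splatBefore(WithHigh) != splatAfter(WithHigh))
      return false;
  }
  return true;
}
```

Both forms replicate bit 7 of `X` across all 32 bits, so `splatBefore(0x80)`
and `splatAfter(0x80)` both give -1, while an input of `0x7F` gives 0.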

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D124590

2 years agoRevert "[CMake][libcxx] Use target_include_directories for libc++ headers"
Petr Hosek [Sat, 7 May 2022 05:19:54 +0000 (22:19 -0700)]
Revert "[CMake][libcxx] Use target_include_directories for libc++ headers"

This reverts commit 203455c85ad03325ce2d77f067f6ac953f2a32ce since
it breaks the OpenMP builders for AMDGPU.

2 years ago[libcxx] Remove static inline and make use of _LIBCPP_HIDE_FROM_ABI in __support...
Brad Smith [Sat, 7 May 2022 05:06:32 +0000 (01:06 -0400)]
[libcxx] Remove static inline and make use of _LIBCPP_HIDE_FROM_ABI in __support headers

After feedback from D122861, do the same thing with some of the other headers. Try to move the
headers so they have a similar style and way of doing things.

Reviewed By: ldionne, daltenty

Differential Revision: https://reviews.llvm.org/D124227

2 years ago[libcxx] random_device, use arc4random() on Solaris
Brad Smith [Sat, 7 May 2022 04:57:41 +0000 (00:57 -0400)]
[libcxx] random_device, use arc4random() on Solaris

Reviewed By: ldionne

Differential Revision: https://reviews.llvm.org/D125068

2 years ago[test][ORC-RT] Disable elfnix_platform tests on non-x86_64 platforms
Peter S. Housel [Sat, 7 May 2022 02:59:56 +0000 (19:59 -0700)]
[test][ORC-RT] Disable elfnix_platform tests on non-x86_64 platforms

ORC ELFNixPlatform currently only supports x86_64.

2 years agoRevert "[runtime] Build compiler-rt with --unwindlib=none"
Petr Hosek [Sat, 7 May 2022 02:52:38 +0000 (19:52 -0700)]
Revert "[runtime] Build compiler-rt with --unwindlib=none"

This reverts commit 102bc634cb4129d9984a8da8515af945e8a5568b because
some tests are failing on sanitizer bots.

2 years agoUpstream support for POINTER assignment in FORALL.
Eric Schweitz [Fri, 22 Apr 2022 20:59:17 +0000 (13:59 -0700)]
Upstream support for POINTER assignment in FORALL.

Reviewed By: vdonaldson, PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D125140

2 years ago[runtime] Build compiler-rt with --unwindlib=none
Petr Hosek [Wed, 27 Apr 2022 06:23:53 +0000 (23:23 -0700)]
[runtime] Build compiler-rt with --unwindlib=none

This applies the change made to libunwind+libcxxabi+libcxx in D113253
to compiler-rt as well.

Differential Revision: https://reviews.llvm.org/D115674

2 years agoRevert "[runtime] Build compiler-rt with --unwindlib=none"
Petr Hosek [Sat, 7 May 2022 00:52:10 +0000 (17:52 -0700)]
Revert "[runtime] Build compiler-rt with --unwindlib=none"

This reverts commit fecad835fb4c6e65eb487fc626355686959605f6.