platform/upstream/llvm.git
3 years ago[CodeGen][WebAssembly] Better lowering for WASM_SYMBOL_TYPE_GLOBAL symbols
Andy Wingo [Wed, 5 May 2021 12:59:30 +0000 (14:59 +0200)]
[CodeGen][WebAssembly] Better lowering for WASM_SYMBOL_TYPE_GLOBAL symbols

As we have been missing support for WebAssembly globals on the IR level,
the lowering of WASM_SYMBOL_TYPE_GLOBAL to IR was incomplete.  This
commit fleshes out the lowering support, lowering references to and
definitions of addrspace(1) values to correctly typed
WASM_SYMBOL_TYPE_GLOBAL symbols.

Depends on D101608.

Differential Revision: https://reviews.llvm.org/D101913

3 years ago[VP] Improve the VP intrinsic unittests
Simon Moll [Tue, 11 May 2021 07:09:48 +0000 (09:09 +0200)]
[VP] Improve the VP intrinsic unittests

Test that all VP intrinsics are tested.
Test intrinsic id -> opcode -> intrinsic id round tripping.
Test property scopes in the include/llvm/IR/VPIntrinsics.def file.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D93534

3 years ago[WebAssembly] Support for WebAssembly globals in LLVM IR
Paulo Matos [Tue, 4 May 2021 12:13:08 +0000 (14:13 +0200)]
[WebAssembly] Support for WebAssembly globals in LLVM IR

This patch adds support for WebAssembly globals in LLVM IR, representing
them as pointers to global values, in a non-default, non-integral
address space.  Instruction selection legalizes loads and stores to
these pointers to new WebAssemblyISD nodes GLOBAL_GET and GLOBAL_SET.
Once the lowering creates the new nodes, tablegen pattern matches those
and converts them to Wasm global.get/set of the appropriate type.

Based on work by Paulo Matos in https://reviews.llvm.org/D95425.

Reviewed By: pmatos

Differential Revision: https://reviews.llvm.org/D101608

3 years ago[flang][cmake] Enable the new driver by default
Andrzej Warzynski [Tue, 4 May 2021 15:52:15 +0000 (15:52 +0000)]
[flang][cmake] Enable the new driver by default

With this patch, `FLANG_BUILD_NEW_DRIVER` is set to `On` by default
(i.e. the new driver is enabled). Note that the new driver depends on
Clang and hence with this change you will need to add `clang` to
`LLVM_ENABLE_PROJECTS`.

If you don't want to build the new driver, set `FLANG_BUILD_NEW_DRIVER`
to `Off`. This way you won't be required to include `clang` in
`LLVM_ENABLE_PROJECTS`.

Differential Revision: https://reviews.llvm.org/D101842

3 years ago* Add support for JSON output style to llvm-symbolizer
Alex Orlov [Tue, 11 May 2021 09:10:54 +0000 (13:10 +0400)]
* Add support for JSON output style to llvm-symbolizer

This patch adds a JSON output style to llvm-symbolizer, providing machine-readable output to better support CLI automation.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96883

3 years agoFix -Wdocumentation warnings. NFCI.
Simon Pilgrim [Tue, 11 May 2021 08:20:55 +0000 (09:20 +0100)]
Fix -Wdocumentation warnings. NFCI.

3 years ago[OpenCL] [NFC] Fixed underline being too short in rst
Ole Strohm [Tue, 11 May 2021 08:45:28 +0000 (09:45 +0100)]
[OpenCL] [NFC] Fixed underline being too short in rst

3 years agoSupport VectorTransfer splitting on writes also.
Tres Popp [Fri, 7 May 2021 14:19:22 +0000 (16:19 +0200)]
Support VectorTransfer splitting on writes also.

VectorTransfer splitting previously handled only read transfer ops. This adds
the same logic for write ops. The resulting code involves two
conditionals for write ops, whereas read ops need only one, but the created
ops are built upon the same patterns, so pattern matching and expectations
are consistent apart from the if/else ops.

Differential Revision: https://reviews.llvm.org/D102157

3 years ago[libcxx][test] Make string.modifiers/clear_and_shrink_db1.pass.cpp a regular mode...
Kristina Bessonova [Mon, 10 May 2021 18:25:46 +0000 (20:25 +0200)]
[libcxx][test] Make string.modifiers/clear_and_shrink_db1.pass.cpp a regular mode test

Turn this test into a normal-mode test, as it contains well-formed code and
checks for defined behavior. It can still be run in debug mode as of D100866.

Differential Revision: https://reviews.llvm.org/D102192

3 years ago[llvm-dwarfdump] Fix abstract origin vars location stats calculation
Djordje Todorovic [Mon, 10 May 2021 13:06:58 +0000 (06:06 -0700)]
[llvm-dwarfdump] Fix abstract origin vars location stats calculation

There are cases where a concrete DIE with DW_TAG_subprogram can have an
abstract_origin attribute, so we handle that situation as well.

Differential Revision: https://reviews.llvm.org/D101025

3 years ago[mlir][linalg] Remove IndexedGenericOp support from LinalgToLoops...
Tobias Gysi [Tue, 11 May 2021 06:52:44 +0000 (06:52 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from LinalgToLoops...

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102187

3 years ago[mlir][linalg] Remove IndexedGenericOp support from Fusion...
Tobias Gysi [Tue, 11 May 2021 06:27:38 +0000 (06:27 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from Fusion...

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102174

3 years ago[libcxx] deprecates/removes `std::raw_storage_iterator`
Christopher Di Bella [Sun, 2 May 2021 20:39:35 +0000 (20:39 +0000)]
[libcxx] deprecates/removes `std::raw_storage_iterator`

C++17 deprecates `std::raw_storage_iterator` and C++20 removes it.

Implements part of:
  * P0174R2 'Deprecating Vestigial Library Parts in C++17'
  * P0619R4 'Reviewing Deprecated Facilities of C++17 for C++20'

Differential Revision: https://reviews.llvm.org/D101730

3 years ago[libcxx] makes comparison operators for `std::*_ordering` types hidden friends
Christopher Di Bella [Sat, 1 May 2021 19:00:36 +0000 (19:00 +0000)]
[libcxx] makes comparison operators for `std::*_ordering` types hidden friends

The standard leaves it up to the implementation to decide whether or not
these operators are hidden friends. There are several (well-documented)
reasons to prefer hidden friends, as well as an argument for improved
readability.

Depends on D100342.

Differential Revision: https://reviews.llvm.org/D101707

3 years ago[libcxx] removes operator!= and globally guards against no spaceship operator
Christopher Di Bella [Mon, 12 Apr 2021 20:55:05 +0000 (20:55 +0000)]
[libcxx] removes operator!= and globally guards against no spaceship operator

* `operator!=` isn't in the spec
* `<compare>` is designed to work with `operator<=>` so it doesn't
  really make sense to have `operator<=>`-less friendly sections.

Depends on D100283.

Differential Revision: https://reviews.llvm.org/D100342

3 years ago[clangd][remote-client] Set HasMore to true for failure
Kadir Cetinkaya [Wed, 5 May 2021 15:46:46 +0000 (17:46 +0200)]
[clangd][remote-client] Set HasMore to true for failure

Previously, the client set HasMore to true if and only if the stream said so.
Hence, if the stream broke for whatever reason (e.g. hitting the deadline for a
huge response), HasMore would be false, which is semantically incorrect (e.g.
it would throw rename off).
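The intended client behavior can be sketched as follows. This is a minimal Python model, not clangd's actual C++ client: `StreamError`, `collect`, `broken_stream` and the `("item", ...)` / `("final", ...)` entry shapes are hypothetical names invented for illustration.

```python
class StreamError(Exception):
    """Hypothetical stand-in for a broken stream (e.g. a missed deadline)."""


def collect(stream):
    """Drain `stream` and return (results, has_more).

    `stream` yields ("item", value) entries and may end with a
    ("final", has_more) entry; it may also raise StreamError midway.
    """
    results = []
    has_more = False
    try:
        for kind, payload in stream:
            if kind == "final":
                has_more = payload  # trust the server when the stream is intact
            else:
                results.append(payload)
    except StreamError:
        # A broken stream means the result set may be incomplete, so
        # report that more results may exist.
        has_more = True
    return results, has_more


def broken_stream():
    """Yield one result, then fail the way a timed-out stream might."""
    yield ("item", "ref1")
    raise StreamError("deadline exceeded")
```

A caller such as rename can then treat `has_more == True` as "do not assume this list is complete", whether the server said so or the stream failed.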

Differential Revision: https://reviews.llvm.org/D101915

3 years ago[clangd][index-server] Limit results in response
Kadir Cetinkaya [Wed, 5 May 2021 15:36:20 +0000 (17:36 +0200)]
[clangd][index-server] Limit results in response

This is to prevent the server from being DoS'd by possibly malicious
parties issuing requests that can yield huge responses.

One possible drawback concerns the rename workflow, which requests all
occurrences but currently has an internal limit of 50 files.
We are putting the limit at 10000 elements per response, so for rename to regress,
one would need 10k refs to a symbol in fewer than 50 files. This seems unlikely,
and if there are complaints we can fix it by giving up on the response based on the
number of files covered instead.
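A per-response cap of this kind can be sketched as below. This is a simplified Python model, not the index server's code: `limited_response` and `LIMIT` are hypothetical names, and returning a `has_more` flag on truncation is an assumption made for the illustration.

```python
LIMIT = 10_000  # per-response cap mentioned in the commit message


def limited_response(matches, limit=LIMIT):
    """Return (results, has_more), keeping at most `limit` items."""
    results = []
    has_more = False
    for item in matches:
        if len(results) == limit:
            has_more = True  # at least one more result was available
            break
        results.append(item)
    return results, has_more
```

Because the loop stops as soon as the cap is reached, the cost of building one response is bounded regardless of how many matches the request could yield.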

Differential Revision: https://reviews.llvm.org/D101914

3 years ago[mlir][linalg] Remove IndexedGenericOp support from Tiling...
Tobias Gysi [Tue, 11 May 2021 05:51:45 +0000 (05:51 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from Tiling...

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102176

3 years ago[LLD] Improve reporting unresolved symbols in shared libraries
Igor Kudrin [Thu, 6 May 2021 13:45:29 +0000 (20:45 +0700)]
[LLD] Improve reporting unresolved symbols in shared libraries

Currently, when reporting unresolved symbols in shared libraries, if an
undefined symbol is first seen in a regular object file, that reference shadows
the reference to the same symbol in a shared object. As a result, the
error for the unresolved symbol in the shared library is not reported.
If the referencing sections in regular object files are discarded because of
'--gc-sections', no reports about such symbols are generated, and the
linker finishes successfully, generating an output image that fails at
run time.

The patch fixes the issue by keeping the symbols that should be checked
separately for each shared library.
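The idea of tracking the symbols per shared library can be sketched like this. It is a toy Python model, not LLD's data structures: `unresolved_in_shared_libs` and its arguments are hypothetical.

```python
def unresolved_in_shared_libs(shared_lib_undefs, defined):
    """Map each shared library to the symbols it needs that nobody defines.

    shared_lib_undefs: dict of library name -> set of referenced symbols.
    defined: set of symbols defined somewhere in the link.
    """
    report = {}
    for lib, undefs in shared_lib_undefs.items():
        missing = sorted(undefs - defined)
        if missing:
            report[lib] = missing
    return report
```

Since each library keeps its own set, the result does not depend on whether a regular object file referenced the same symbol first or on whether that object's sections were later discarded.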

Differential Revision: https://reviews.llvm.org/D101996

3 years ago[OpAsmParser] Refactor parseOptionalInteger to support wide integers, NFC.
Chris Lattner [Sun, 9 May 2021 01:46:30 +0000 (18:46 -0700)]
[OpAsmParser] Refactor parseOptionalInteger to support wide integers, NFC.

OpAsmParser (and DialectAsmParser) supports a pair of
parseInteger/parseOptionalInteger methods, which allow parsing a bare
integer into a C type of your choice (e.g. int8_t) using templates.  It
was implemented in terms of a virtual method call that is hard coded to
int64_t because "that should be big enough".

Change the virtual method hook to return an APInt instead.  This allows
asmparsers for custom ops to parse large integers if they want to, without
changing any of the clients of the fixed size C API.
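The split between a wide parsing hook and fixed-size wrappers can be sketched as follows. This is a Python analogy only: Python's unbounded `int` stands in for APInt, and `parse_wide_integer` and `narrow_to` are hypothetical names, not MLIR API. How the real parser reports an out-of-range value is not modeled here.

```python
def parse_wide_integer(text):
    """Parse a decimal or 0x-prefixed literal into an unbounded int."""
    return int(text, 0)


def narrow_to(value, bits, signed=True):
    """Return `value` if it fits in a `bits`-wide integer, else None."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return value if lo <= value <= hi else None
```

Clients that want a fixed-size result call the narrowing wrapper, while a custom op that needs, say, a 128-bit literal can consume the wide value directly.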

Differential Revision: https://reviews.llvm.org/D102120

3 years ago[AMDGPU] Pre-commit tests for D102211
Carl Ritson [Tue, 11 May 2021 03:14:01 +0000 (12:14 +0900)]
[AMDGPU] Pre-commit tests for D102211

3 years ago[RISCV] Fix the calculation of the offset of Zvlsseg spilling.
Hsiangkai Wang [Mon, 10 May 2021 14:29:00 +0000 (22:29 +0800)]
[RISCV] Fix the calculation of the offset of Zvlsseg spilling.

For Zvlsseg spilling, we need to convert the pseudo instructions
into multiple vector load/store instructions with appropriate offsets.
For example, for PseudoVSPILL3_M2, we need to convert it to

VS2R %v2, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v4, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v6, %base

We need to keep the size of the offset in the pseudo spilling instructions.
In this case, it is (vlenb x 2).

In the original implementation, we used the size of the frame object divided by the
number of vectors in the Zvlsseg type. The size of the frame object is not
necessarily the same as the size of the spilled data; it may be larger.
So, we change it to (VLENB x LMUL) in this patch. The calculation is
more direct and easier to understand.
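The offset arithmetic can be sketched as below. This is a small Python model of the expansion described above, not the backend code: `expand_segment_spill` is a hypothetical helper, and the VLENB value used in the usage note is only an example.

```python
def expand_segment_spill(nf, lmul, first_reg, vlenb):
    """Model the expansion of a PseudoVSPILL<nf>_M<lmul> pseudo.

    Returns (register_name, byte_offset_from_base) for each register
    group; consecutive groups are (vlenb * lmul) bytes apart.
    """
    stride = vlenb * lmul  # bytes occupied by one register group
    return [(f"v{first_reg + i * lmul}", i * stride) for i in range(nf)]
```

For PseudoVSPILL3_M2 starting at v2 with an assumed VLENB of 16 bytes, `expand_segment_spill(3, 2, 2, 16)` gives v2 at offset 0, v4 at offset 32 and v6 at offset 64, matching the VS2R/ADDI sequence in the message.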

Differential Revision: https://reviews.llvm.org/D101869

3 years agoEnable export of FIR includes into the install tree
Renaud-K [Mon, 10 May 2021 23:41:29 +0000 (16:41 -0700)]
Enable export of FIR includes into the install tree
https://reviews.llvm.org/D102040

3 years ago[NFC][LSAN] Fix flaky multithreaded test
Vitaly Buka [Tue, 11 May 2021 00:32:36 +0000 (17:32 -0700)]
[NFC][LSAN] Fix flaky multithreaded test

3 years ago[gn build] Port e5d483f28a3a
LLVM GN Syncbot [Tue, 11 May 2021 00:19:33 +0000 (00:19 +0000)]
[gn build] Port e5d483f28a3a

3 years ago[ORC-RT] Add unit test infrastructure, extensible_rtti implementation, unit test
Lang Hames [Fri, 7 May 2021 04:50:53 +0000 (21:50 -0700)]
[ORC-RT] Add unit test infrastructure, extensible_rtti implementation, unit test

Add unit test infrastructure for the ORC runtime, plus a cut-down
extensible_rtti system and extensible_rtti unit test.

Removes the placeholder.cpp source file.

Differential Revision: https://reviews.llvm.org/D102080

3 years ago[libcxx][ranges] Add ranges::empty CPO.
zoecarver [Fri, 23 Apr 2021 19:34:22 +0000 (12:34 -0700)]
[libcxx][ranges] Add ranges::empty CPO.

Depends on D101079. Refs D101189.

Differential Revision: https://reviews.llvm.org/D101193

3 years ago[mlir][linalg] remove the -now- obsolete sparse support in linalg
Aart Bik [Mon, 10 May 2021 19:56:15 +0000 (12:56 -0700)]
[mlir][linalg] remove the -now- obsolete sparse support in linalg

All glue and clutter in the linalg ops has been replaced by proper
sparse tensor type encoding. This code is no longer needed. Thanks
to ntv@ for giving us a temporary home in linalg.

So long, and thanks for all the fish.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102098

3 years ago[AMDGPU] Constant fold Intrinsic::amdgcn_perm
Stanislav Mekhanoshin [Mon, 10 May 2021 22:42:47 +0000 (15:42 -0700)]
[AMDGPU] Constant fold Intrinsic::amdgcn_perm

Differential Revision: https://reviews.llvm.org/D102203

3 years ago[gn build] Port 3b8d2be52725
LLVM GN Syncbot [Mon, 10 May 2021 23:06:37 +0000 (23:06 +0000)]
[gn build] Port 3b8d2be52725

3 years agoReland: "[lld][WebAssembly] Initial support merging string data"
Sam Clegg [Sat, 27 Feb 2021 00:09:32 +0000 (16:09 -0800)]
Reland: "[lld][WebAssembly] Initial support merging string data"

This change was originally landed in: 5000a1b4b9edeb9e994f2a5b36da8d48599bea49
It was reverted in: 061e071d8c9b98526f35cad55a918a4f1615afd4

This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.

Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)

Like the ELF linker merging is only performed at `-O1` and above.
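Tail merging of NUL-terminated, single-byte strings can be sketched as follows. This is a simple quadratic Python illustration of the idea, not the algorithm lld uses; `tail_merge` is a hypothetical helper.

```python
def tail_merge(strings):
    """Lay out byte strings so that a string which is a suffix of
    another shares its bytes.

    Returns (blob, offsets): `blob` holds the NUL-terminated data and
    `offsets` maps each input string to its start position in `blob`.
    """
    blob = bytearray()
    offsets = {}
    owners = []  # strings that own bytes in the blob
    # Longest first, so a possible owner is always placed before its suffixes.
    for s in sorted(set(strings), key=lambda b: (-len(b), b)):
        for owner in owners:
            if owner.endswith(s):
                # Point into the tail of the owner; the owner's NUL is shared.
                offsets[s] = offsets[owner] + len(owner) - len(s)
                break
        else:
            offsets[s] = len(blob)
            blob += s + b"\0"
            owners.append(s)
    return bytes(blob), offsets
```

With the inputs `b"hello"`, `b"lo"` and `b"world"`, "lo" takes no space of its own: it is placed at offset 3 inside "hello".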

This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it does not currently support debug sections
because they are not represented by data segments (they are custom
sections).

Differential Revision: https://reviews.llvm.org/D97657

3 years ago[mlir][Tensor] Add folding for tensor.from_elements
Benjamin Kramer [Mon, 10 May 2021 21:19:59 +0000 (23:19 +0200)]
[mlir][Tensor] Add folding for tensor.from_elements

This trivially folds into a constant when all operands are constant.

Differential Revision: https://reviews.llvm.org/D102199

3 years ago[AArch64][GlobalISel] Add post-legalizer lowering for NEON vector fcmps
Jessica Paquette [Mon, 3 May 2021 19:21:11 +0000 (12:21 -0700)]
[AArch64][GlobalISel] Add post-legalizer lowering for NEON vector fcmps

This is roughly equivalent to the floating point portion of
`AArch64TargetLowering::LowerVSETCC`. The main part that's missing is the v4s16 bit.

This also adds helpers equivalent to `EmitVectorComparison`, and
`changeVectorFPCCToAArch64CC`. This moves `changeFCMPPredToAArch64CC` out of
the selector into AArch64GlobalISelUtils for the sake of code reuse.

This is done in post-legalizer lowering with pseudos to simplify selection.
The imported patterns end up handling selection for us this way.

Differential Revision: https://reviews.llvm.org/D101782

3 years agoRevert "[lld][WebAssembly] Initial support merging string data"
Nico Weber [Mon, 10 May 2021 22:27:45 +0000 (18:27 -0400)]
Revert "[lld][WebAssembly] Initial support merging string data"

This reverts commit 5000a1b4b9edeb9e994f2a5b36da8d48599bea49.
Breaks tests, see https://reviews.llvm.org/D97657#2749151

Easily repros locally with `ninja check-llvm-mc-webassembly`.

3 years ago[AArch64][GlobalISel] Enable memcpy family combines on minsize functions
Jessica Paquette [Mon, 10 May 2021 21:06:42 +0000 (14:06 -0700)]
[AArch64][GlobalISel] Enable memcpy family combines on minsize functions

The combines in `tryCombineMemCpyFamily` have heuristics (e.g.
`TLI.getMaxStoresPerMemset`) which consider size. So, theoretically, enabling
these combines on minsize functions shouldn't be harmful.

With this enabled we save 0.9% geomean on CTMark at -Oz, and 5.1% on Bullet.
There are no code size regressions.

Differential Revision: https://reviews.llvm.org/D102198

3 years agoPre-commit test case for D101970
Guozhi Wei [Mon, 10 May 2021 21:47:54 +0000 (14:47 -0700)]
Pre-commit test case for D101970

This is a test case for D101970, which shows the optimization opportunity for

    lea (reg1, reg2), reg3
    sub reg3, reg4

to

    sub reg1, reg4
    sub reg2, reg4
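
That the two sequences leave the same value in reg4 can be checked with a short model. This Python sketch assumes AT&T operand order (source first, destination last) and 64-bit registers with wraparound; it does not model flags, and the function names are hypothetical.

```python
MASK = (1 << 64) - 1  # model 64-bit registers with wraparound


def lea_then_sub(reg1, reg2, reg4):
    reg3 = (reg1 + reg2) & MASK  # lea (reg1, reg2), reg3
    return (reg4 - reg3) & MASK  # sub reg3, reg4


def two_subs(reg1, reg2, reg4):
    reg4 = (reg4 - reg1) & MASK  # sub reg1, reg4
    return (reg4 - reg2) & MASK  # sub reg2, reg4
```

Both functions compute reg4 - reg1 - reg2 modulo 2^64; the second form does not need the temporary reg3.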

Differential Revision: https://reviews.llvm.org/D102010

3 years ago[Hexagon] Handle loads and stores of scalar predicate vectors
Krzysztof Parzyszek [Mon, 10 May 2021 20:26:57 +0000 (15:26 -0500)]
[Hexagon] Handle loads and stores of scalar predicate vectors

Handle v2i1, v4i1, and v8i1.

3 years agoClangd Matchers.h: Fix -Wdeprecated-copy by making the defaulted copy ctor and delete...
David Blaikie [Mon, 10 May 2021 21:30:22 +0000 (14:30 -0700)]
Clangd Matchers.h: Fix -Wdeprecated-copy by making the defaulted copy ctor and deleted copy assignment operators explicit

3 years agoRemove some unnecessary explicit defaulted copy ctors to cleanup -Wdeprecated-copy
David Blaikie [Mon, 10 May 2021 21:28:09 +0000 (14:28 -0700)]
Remove some unnecessary explicit defaulted copy ctors to cleanup -Wdeprecated-copy

These types also wanted to be/were copy assignable, and using the
implicit copy ctor is deprecated in the presence of an explicit copy
ctor.

Removing the explicit copy ctor provides the desired behavior - both
ctor and assignment operator are available implicitly.

Also while I was nearby there were some missing std::moves on shared
pointer parameters.

3 years ago[InstCombine] fold extract subvector of bitcast insertelt
Sanjay Patel [Mon, 10 May 2021 21:20:10 +0000 (17:20 -0400)]
[InstCombine] fold extract subvector of bitcast insertelt

This is visible in the original example from:
https://llvm.org/PR50055
(but this change doesn't solve the bug)

https://alive2.llvm.org/ce/z/vM_Yq-

3 years ago[InstCombine] add tests for extract-subvector of insert; NFC
Sanjay Patel [Mon, 10 May 2021 20:32:52 +0000 (16:32 -0400)]
[InstCombine] add tests for extract-subvector of insert; NFC

3 years ago[clang-tidy] Aliasing: Add support for aggregates with references.
Artem Dergachev [Mon, 3 May 2021 21:32:37 +0000 (14:32 -0700)]
[clang-tidy] Aliasing: Add support for aggregates with references.

When a variable is used in the initializer of an aggregate
to initialize a reference-type field, this counts as aliasing.

Differential Revision: https://reviews.llvm.org/D101791

3 years ago[clang-tidy] Aliasing: Add more support for captures.
Artem Dergachev [Mon, 26 Apr 2021 20:52:01 +0000 (13:52 -0700)]
[clang-tidy] Aliasing: Add more support for captures.

D96215 takes care of the situation where the variable is captured into
a nearby lambda. This patch takes care of the situation where
the current function is the lambda and the variable is one of its captures
from an enclosing scope.

The analogous problem for ^{blocks} is already handled automagically
by D96215.

Differential Revision: https://reviews.llvm.org/D101787

3 years ago[clang-tidy] Aliasing: Add support for captures.
Artem Dergachev [Mon, 26 Apr 2021 20:47:36 +0000 (13:47 -0700)]
[clang-tidy] Aliasing: Add support for captures.

The utility function clang::tidy::utils::hasPtrOrReferenceInFunc() scans the
function for pointer/reference aliases to a given variable. It currently scans
for operator & over that variable and for declarations of references to that
variable.

This patch makes it also scan for C++ lambda captures by reference
and for Objective-C block captures.

Differential Revision: https://reviews.llvm.org/D96215

3 years ago[X86] AMD Zen 3: same-reg AVX YMM VPCMP is dep breaking one-idiom
Roman Lebedev [Mon, 10 May 2021 20:44:53 +0000 (23:44 +0300)]
[X86] AMD Zen 3: same-reg AVX YMM VPCMP is dep breaking one-idiom

As measured by exegesis, and confirmed by ref docs.
Still not zero-cycle :)

3 years ago[NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX YMM VPCMP
Roman Lebedev [Mon, 10 May 2021 20:43:59 +0000 (23:43 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX YMM VPCMP

3 years ago[X86] AMD Zen 3: same-reg AVX XMM VPCMP is dep breaking one-idiom
Roman Lebedev [Mon, 10 May 2021 20:40:34 +0000 (23:40 +0300)]
[X86] AMD Zen 3: same-reg AVX XMM VPCMP is dep breaking one-idiom

As measured by exegesis, and confirmed by ref docs.
Again, it's not zero-cycle.

3 years ago[NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX XMM VPCMP
Roman Lebedev [Mon, 10 May 2021 20:36:28 +0000 (23:36 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX XMM VPCMP

3 years ago[X86] AMD Zen 3: same-reg SSE XMM PCMP is dep breaking one-idiom
Roman Lebedev [Mon, 10 May 2021 20:36:08 +0000 (23:36 +0300)]
[X86] AMD Zen 3: same-reg SSE XMM PCMP is dep breaking one-idiom

As measured by exegesis, and confirmed by ref docs.
Much like with MMX PCMP, it does actually have to execute, though.

3 years ago[NFC][X86][MCA] AMD Zen 3: add tests for same-reg XMM SSE PCMP
Roman Lebedev [Mon, 10 May 2021 20:28:27 +0000 (23:28 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg XMM SSE PCMP

3 years ago[X86] AMD Zen 3: same-reg PCMPEQ is an MMX all-ones dep breaking idiom
Roman Lebedev [Mon, 10 May 2021 19:52:15 +0000 (22:52 +0300)]
[X86] AMD Zen 3: same-reg PCMPEQ is an MMX all-ones dep breaking idiom

They are, however, not zero-cycle, and do actually execute.

As measured by exegesis, and confirmed by ref docs.

3 years ago[NFC][X86][MCA] AMD Zen 3: add tests for same-reg MMX PCMPEQ
Roman Lebedev [Mon, 10 May 2021 19:46:20 +0000 (22:46 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg MMX PCMPEQ

3 years ago[libcxx] removes `weak_equality` and `strong_equality` from <compare>
Christopher Di Bella [Mon, 12 Apr 2021 05:23:09 +0000 (05:23 +0000)]
[libcxx] removes `weak_equality` and `strong_equality` from <compare>

`weak_equality` and `strong_equality` were removed from the draft before being
standardised, so they need to be removed from libc++ too.

Also adjusts `common_comparison_category` since its test needed
adjusting due to the equality deletions.

Differential Revision: https://reviews.llvm.org/D100283

3 years ago[test] Put aix-xcoff-huge-relocs.ll under expensive checks
Arthur Eubanks [Mon, 10 May 2021 20:18:00 +0000 (13:18 -0700)]
[test] Put aix-xcoff-huge-relocs.ll under expensive checks

It is an order of magnitude slower than the second slowest test
according to obj/llvm/test/.lit_test_times.txt.

The two slowest are:
 2.870437e+02 CodeGen/PowerPC/aix-xcoff-huge-relocs.ll
 2.850697e+01 tools/llvm-readobj/ELF/file-header-machine-types.test

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D102190

3 years ago[mlir] Fix windows build bot break due to use of `alloca` in a test.
Stella Laurenzo [Mon, 10 May 2021 20:03:30 +0000 (20:03 +0000)]
[mlir] Fix windows build bot break due to use of `alloca` in a test.

Differential Revision: https://reviews.llvm.org/D102189

3 years ago[mlir][Python] Finish adding RankedTensorType support for encoding.
Stella Laurenzo [Mon, 10 May 2021 18:03:40 +0000 (18:03 +0000)]
[mlir][Python] Finish adding RankedTensorType support for encoding.

Differential Revision: https://reviews.llvm.org/D102184

3 years ago[InstCombine] Fold comparison of integers by parts
Nikita Popov [Sat, 24 Apr 2021 14:18:56 +0000 (16:18 +0200)]
[InstCombine] Fold comparison of integers by parts

Let's say you represent (i32, i32) as an i64 from which the parts
are extracted with lshr/trunc. Then, if you compare two tuples by
parts you get something like A[0] == B[0] && A[1] == B[1], just
that the part extraction happens by lshr/trunc and not a narrow
load or similar.

The fold implemented here reduces such equality comparisons by
converting them into a comparison on a larger part of the integer
(which might be the whole integer). It handles both the "and of eq"
and the conjugated "or of ne" case.

I'm being conservative with one-use for now, though this could be
relaxed if profitable (the base pattern converts 11 instructions
into 5 instructions, but there are quite a few variations on how it
can play out).
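The equivalence behind the fold can be checked with a small model. This Python sketch treats an i64 as a pair of i32 halves; the helper names are hypothetical, and it only demonstrates the arithmetic identity, not the InstCombine pattern matching.

```python
MASK32 = (1 << 32) - 1


def lo(x):
    return x & MASK32  # trunc i64 to i32


def hi(x):
    return (x >> 32) & MASK32  # lshr by 32, then trunc


def equal_by_parts(a, b):
    return lo(a) == lo(b) and hi(a) == hi(b)  # "and of eq"


def differ_by_parts(a, b):
    return lo(a) != lo(b) or hi(a) != hi(b)  # "or of ne"
```

For any two 64-bit values, `equal_by_parts(a, b)` equals `a == b` and `differ_by_parts(a, b)` equals `a != b`, which is why the part-wise comparisons can be replaced by a single wider comparison.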

Differential Revision: https://reviews.llvm.org/D101232

3 years ago[VecLib] Add support for vector fns from Darwin's libsystem.
Florian Hahn [Mon, 10 May 2021 19:49:19 +0000 (20:49 +0100)]
[VecLib] Add support for vector fns from Darwin's libsystem.

This patch adds support for Darwin's libsystem math vector functions to
TLI. Darwin's libsystem provides a range of vector functions for libm
functions.

This initial patch only adds the 2 x double and 4 x float versions,
which are available on both X86 and ARM64. On X86, wider vector versions
are supported as well.

Reviewed By: jroelofs

Differential Revision: https://reviews.llvm.org/D101856

3 years ago[lld][WebAssembly] Initial support merging string data
Sam Clegg [Sat, 27 Feb 2021 00:09:32 +0000 (16:09 -0800)]
[lld][WebAssembly] Initial support merging string data

This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.

Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)

Like the ELF linker merging is only performed at `-O1` and above.

This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it does not currently support debug sections
because they are not represented by data segments (they are custom
sections).

Differential Revision: https://reviews.llvm.org/D97657

3 years ago[NFC] Use ArgListEntry indirect types more in ISel lowering
Arthur Eubanks [Sun, 2 May 2021 04:27:47 +0000 (21:27 -0700)]
[NFC] Use ArgListEntry indirect types more in ISel lowering

For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().

A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.

The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.

Differential Revision: https://reviews.llvm.org/D101713

3 years ago[ORC] Use a unique_function rather than std::function for dispatchTask.
Lang Hames [Mon, 10 May 2021 19:34:52 +0000 (12:34 -0700)]
[ORC] Use a unique_function rather than std::function for dispatchTask.

3 years ago[Inliner] Fix noalias metadata handling for instructions simplified during cloning...
Nikita Popov [Sat, 8 May 2021 15:05:05 +0000 (17:05 +0200)]
[Inliner] Fix noalias metadata handling for instructions simplified during cloning (PR50270)

Instead of using VMap, which may include instructions from the
caller as a result of simplification, iterate over the
(FirstNewBlock, Caller->end()) range, which will only include new
instructions.

Fixes https://bugs.llvm.org/show_bug.cgi?id=50270.

Differential Revision: https://reviews.llvm.org/D102110

3 years ago[Scudo] Use GWP-ASan's aligned allocations and fixup postalloc hooks.
Mitch Phillips [Mon, 10 May 2021 19:19:19 +0000 (12:19 -0700)]
[Scudo] Use GWP-ASan's aligned allocations and fixup postalloc hooks.

This patch does a few cleanup things:
 1. The non-standalone scudo has a problem where GWP-ASan allocations
 may not meet alignment requirements where Scudo was requested to have
 alignment >= 16. Use the new GWP-ASan API to fix this.
 2. The standalone variant loses some debugging information inside of
 GWP-ASan because we ask GWP-ASan to allocate an aligned size in the
 frontend. This means reports end up with 'UaF on a 16-byte allocation'
 for a 1-byte allocation with 16-byte alignment. Also use the new API to
 fix this.
 3. Add post-alloc hooks for GWP-ASan intercepted allocations, and add
 stats tracking for GWP-ASan allocations.
 4. Add a small test that checks the alignment of the frontend
 allocator, so that it can be used under GWP-ASan torture mode.
 5. Add GWP-ASan torture mode as a testing configuration to catch these
 regressions.
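
The loss of debugging information described in item 2 can be sketched as follows. This is a Python illustration only: `round_up`, `record_old` and `record_new` are hypothetical helpers and do not mirror the GWP-ASan API.

```python
def round_up(size, alignment):
    """Round `size` up to a multiple of `alignment` (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)


def record_old(size, alignment):
    # The frontend pre-rounds the size, so the original request is lost.
    return {"size": round_up(size, alignment)}


def record_new(size, alignment):
    # Size and alignment travel separately, so a report can show both.
    return {"size": size, "alignment": alignment}
```

For a 1-byte request with 16-byte alignment, `record_old(1, 16)` remembers a 16-byte allocation, while `record_new(1, 16)` keeps the 1-byte size alongside the alignment.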

Depends on D94830, D95889.

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D95884

3 years ago[mlir][sparse] complete migration to sparse tensor type
Aart Bik [Mon, 10 May 2021 17:34:21 +0000 (10:34 -0700)]
[mlir][sparse] complete migration to sparse tensor type

A very elaborate, but also very fun revision because all
puzzle pieces are finally "falling into place".

1. replaces linalg annotations + flags with proper sparse tensor types
2. adds rigorous verification on sparse tensor type and sparse primitives
3. removes glue and clutter on opaque pointers in favor of sparse tensor types
4. migrates all tests to use sparse tensor types

NOTE: next CL will remove *all* obsoleted sparse code in Linalg

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102095

3 years ago[lld-macho] Fix order file arch filtering
Jez Ng [Mon, 10 May 2021 19:45:20 +0000 (15:45 -0400)]
[lld-macho] Fix order file arch filtering

We had a hardcoded check and a stale TODO, written back when we only had
support for one architecture.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D102154

3 years ago[lld-macho] Treat undefined symbols uniformly
Jez Ng [Mon, 10 May 2021 19:45:18 +0000 (15:45 -0400)]
[lld-macho] Treat undefined symbols uniformly

In particular, we should apply the `-undefined` behavior to all
such symbols, including those that are specified via the command line
(i.e. `-e`, `-u`, and `-exported_symbol`). ld64 supports this too.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D102143

3 years ago[lld-macho][nfc] Clean up tests
Jez Ng [Mon, 10 May 2021 02:09:17 +0000 (22:09 -0400)]
[lld-macho][nfc] Clean up tests

* Remove unnecessary `rm -rf %t`s
* Have lc-linker-option.ll use the right comment marker

3 years ago[PowerPC] Spilling to registers does not require frame index scavenging
Stefan Pintilie [Mon, 10 May 2021 18:10:11 +0000 (13:10 -0500)]
[PowerPC] Spilling to registers does not require frame index scavenging

If spills are to registers instead of to the stack then a copy will be used
and frame index scavenging is not required.

This patch adds debug info to frame index scavenging and makes sure that
spilling to registers does not cause frame index scavenging.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101360

3 years ago[TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Arthur Eubanks [Tue, 4 May 2021 01:00:50 +0000 (18:00 -0700)]
[TargetLowering] Only inspect attributes in the arguments for ArgListEntry

Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.

[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101806

3 years ago[mlir][linalg] Restrict distribution to parallel dims
Lei Zhang [Mon, 10 May 2021 19:17:14 +0000 (15:17 -0400)]
[mlir][linalg] Restrict distribution to parallel dims

According to the API contract, LinalgLoopDistributionOptions
expects to work on parallel iterators. When getting processor
information, only loop ranges for parallel dimensions should
be fed in. But right now after generating scf.for loop nests,
we feed in *all* loops, including the ones materialized for
reduction iterators. This can cause unexpected distribution
of reduction dimensions. This commit fixes it.
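
The filtering step can be sketched as below. This is a Python illustration with hypothetical names (`ranges_for_distribution`, plain tuples for loop ranges, strings for iterator types), not the Linalg implementation.

```python
def ranges_for_distribution(loop_ranges, iterator_types):
    """Keep only the ranges of loops whose iterator type is "parallel"."""
    return [
        rng
        for rng, kind in zip(loop_ranges, iterator_types)
        if kind == "parallel"
    ]
```

Only the returned ranges would be handed to the processor-information callback, so a loop materialized for a reduction iterator is never distributed.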

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D102079

3 years ago[libc] Revert "Simplifies multi implementations and benchmarks".
Siva Chandra Reddy [Mon, 10 May 2021 19:20:27 +0000 (19:20 +0000)]
[libc] Revert "Simplifies multi implementations and benchmarks".

This reverts commit 541f107871bc9c020925a6e5342542a47c902d12 as the bots
are failing with unknown architecture "x86-64-v*". Will let the original
author decide on the right course of action to correct the problem and
reland.

3 years ago[scudo] [GWP-ASan] Add GWP-ASan variant of scudo benchmarks.
Mitch Phillips [Mon, 10 May 2021 18:59:45 +0000 (11:59 -0700)]
[scudo] [GWP-ASan] Add GWP-ASan variant of scudo benchmarks.

GWP-ASan is the "production" variant as compiled by compiler-rt, and it's useful to be able to benchmark changes in GWP-ASan or Scudo's GWP-ASan hooks across versions. GWP-ASan is sampled, and sampled allocations are much slower, but given the number of allocations that happen under test here, we actually get a reasonable representation of GWP-ASan's negligible performance impact between runs.

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D101865

3 years ago[RISCV] Validate the SEW and LMUL operands to __builtin_rvv_vsetvli(max)
Craig Topper [Mon, 10 May 2021 18:30:45 +0000 (11:30 -0700)]
[RISCV] Validate the SEW and LMUL operands to __builtin_rvv_vsetvli(max)

These are required to be constants; this patch makes sure they
are in the accepted range of values.

These are usually created by wrappers in the riscv_vector.h header
which should always be correct. This patch protects against a user
using the builtin directly.
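
A minimal sketch of what such a range check could look like, assuming the two operands use the `vsew` and `vlmul` field encodings of the RVV 1.0 `vtype` register. The helper names are hypothetical, and the ranges the patch actually accepts are not stated here.

```cpp
#include <cstdint>

// Assumption: the SEW operand uses the vtype.vsew encoding
// (0 = e8, 1 = e16, 2 = e32, 3 = e64; 4..7 are reserved).
constexpr bool isValidSEWOperand(std::int64_t v) { return v >= 0 && v <= 3; }

// Assumption: the LMUL operand uses the vtype.vlmul encoding
// (0 = m1, 1 = m2, 2 = m4, 3 = m8, 4 is reserved,
//  5 = mf8, 6 = mf4, 7 = mf2).
constexpr bool isValidLMULOperand(std::int64_t v) {
  return (v >= 0 && v <= 3) || (v >= 5 && v <= 7);
}

// The checks are constexpr, so they can be exercised at compile time.
static_assert(isValidSEWOperand(2), "e32 is a valid SEW encoding");
static_assert(!isValidLMULOperand(4), "vlmul value 4 is reserved");
```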

Reviewed By: khchen

Differential Revision: https://reviews.llvm.org/D102086

3 years ago[GlobalISel][IRTranslator] Fix bit-test lowering dropping phi edges.
Amara Emerson [Mon, 10 May 2021 18:12:14 +0000 (11:12 -0700)]
[GlobalISel][IRTranslator] Fix bit-test lowering dropping phi edges.

For contiguous ranges we drop the last bit-test case but in doing so we skip
adding the new MBB PHI edges to the list of replacement PHI edges, and as a
result we incorrectly omit them in the G_PHI in finishPendingPhis().

Was found when bootstrapping clang with -O3 and GlobalISel enabled on Apple Silicon.

3 years ago[PassManager] add helper function to hold set of vector passes (2nd try)
Sanjay Patel [Mon, 10 May 2021 17:55:42 +0000 (13:55 -0400)]
[PassManager] add helper function to hold set of vector passes (2nd try)

This is closer to no-functional-change-intended than the 1st attempt.
As noted in D102002, there were at least 2 diffs that went
unchecked in pass manager regression tests: different pass
parameters (SimplifyCFG) and an extension point/callback.
Those should be lifted from the original code blocks correctly
now.

3 years ago[mlir][Python] Re-export cext sparse_tensor module to the public namespace.
Stella Laurenzo [Mon, 10 May 2021 17:42:24 +0000 (17:42 +0000)]
[mlir][Python] Re-export cext sparse_tensor module to the public namespace.

* This was left out of the previous commit accidentally.

Differential Revision: https://reviews.llvm.org/D102183

3 years ago[X86] AMD Zen 3: sub-32-bit CMP also break dependencies
Roman Lebedev [Mon, 10 May 2021 17:52:30 +0000 (20:52 +0300)]
[X86] AMD Zen 3: sub-32-bit CMP also break dependencies

They measure as having the same effect as 32-bit CMP.

3 years ago[NFC][X86][MCA] AMD Zen 3: add tests for sub-32-bit CMP dep breaking
Roman Lebedev [Mon, 10 May 2021 17:48:41 +0000 (20:48 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests for sub-32-bit CMP dep breaking

3 years ago[X86][AVX] Add example of failure to remove a 256-bit permute(hadd(hadd(),hadd()...
Simon Pilgrim [Mon, 10 May 2021 17:43:02 +0000 (18:43 +0100)]
[X86][AVX] Add example of failure to remove a 256-bit permute(hadd(hadd(),hadd())) shuffle by reordering the packed operands.

3 years ago[X86][SSE] canonicalizeShuffleMaskWithHorizOp - add TODO for better 256/512-bit shuff...
Simon Pilgrim [Mon, 10 May 2021 17:41:05 +0000 (18:41 +0100)]
[X86][SSE] canonicalizeShuffleMaskWithHorizOp - add TODO for better 256/512-bit shuffle+hop folding support. NFC.

3 years ago[lld-macho] Improve an external weak def test
Fangrui Song [Mon, 10 May 2021 17:35:44 +0000 (10:35 -0700)]
[lld-macho] Improve an external weak def test

The rebase table entry is untested.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D102150

3 years ago[Dependence Analysis] Enable delinearization of fixed sized arrays
Andy Kaylor [Mon, 10 May 2021 17:01:43 +0000 (10:01 -0700)]
[Dependence Analysis] Enable delinearization of fixed sized arrays

Patch by Artem Radzikhovskyy!

Allow delinearization of fixed sized arrays if we can prove that the GEP indices do not overflow the array dimensions. The checks applied are similar to the ones that are used for delinearization of parametric size arrays. Make sure that the GEP indices are non-negative and that they are smaller than the range of that dimension.
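
To see why the bounds check matters, here is a small sketch using concrete integers (the actual analysis reasons about symbolic ranges, and these helper names are hypothetical). Two different subscript tuples can linearize to the same offset when one of them overflows its dimension, so the linear offset can only be split back into per-dimension subscripts when every subscript is known to be in range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// True if every subscript k satisfies 0 <= subscripts[k] < dims[k].
bool subscriptsInBounds(const std::vector<std::int64_t> &subscripts,
                        const std::vector<std::int64_t> &dims) {
  if (subscripts.size() != dims.size())
    return false;
  for (std::size_t k = 0; k < subscripts.size(); ++k)
    if (subscripts[k] < 0 || subscripts[k] >= dims[k])
      return false;
  return true;
}

// Row-major offset of the given subscripts in an array with sizes dims.
std::int64_t linearize(const std::vector<std::int64_t> &subscripts,
                       const std::vector<std::int64_t> &dims) {
  assert(subscripts.size() == dims.size());
  std::int64_t offset = 0;
  for (std::size_t k = 0; k < subscripts.size(); ++k)
    offset = offset * dims[k] + subscripts[k];
  return offset;
}
```

For an array declared as `A[10][10]`, the subscripts `{0, 10}` and `{1, 0}` both linearize to offset 10, but only `{1, 0}` passes `subscriptsInBounds`.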

Changes Summary:

- Updated the LIT tests with more exact values, as we are able to delinearize and apply more exact tests
- profitability.ll - now able to delinearize in all cases, no need to use -da-disable-delinearization-checks flag and run the test twice
- loop-interchange-optimization-remarks.ll - in one of the cases we are able to delinearize without using -da-disable-delinearization-checks
- SimpleSIVNoValidityCheckFixedSize.ll - removed unnecessary "-da-disable-delinearization-checks" flag. Now can get the exact answer without it.
- SimpleSIVNoValidityCheckFixedSize.ll and PreliminaryNoValidityCheckFixedSize.ll - made negative tests more explicit, in order to demonstrate the need for "-da-disable-delinearization-checks" flag

Differential Revision: https://reviews.llvm.org/D101486

3 years ago[mlir][Python] Upstream the PybindAdaptors.h helpers and use it to implement sparse_t...
Stella Laurenzo [Mon, 10 May 2021 01:09:09 +0000 (18:09 -0700)]
[mlir][Python] Upstream the PybindAdaptors.h helpers and use it to implement sparse_tensor.encoding.

* The PybindAdaptors.h file has been evolving across different sub-projects (npcomp, circt) and has been successfully used for out of tree python API interop/extensions and defining custom types.
* Since sparse_tensor.encoding is the first in-tree custom attribute we are supporting, it seemed like the right time to upstream this header and use it to define the attribute in a way that we can support for both in-tree and out-of-tree use (prior, I had not wanted to upstream dead code which was not used in-tree).
* Adapted the circt version of `mlir_type_subclass`, also providing an `mlir_attribute_subclass`. As we get a bit of mileage on this, I would like to transition the builtin types/attributes to this mechanism and delete the old in-tree only `PyConcreteType` and `PyConcreteAttribute` template helpers (which cannot work reliably out of tree as they depend on internals).
* Added support for defaulting the MlirContext if none is passed so that we can support the same idioms as in-tree versions.

There is quite a bit going on here and I can split it up if needed, but would prefer to keep the first use and the header together so sending out in one patch.

Differential Revision: https://reviews.llvm.org/D102144

3 years ago[lld][WebAssembly] Disallow exporting of TLS symbols
Sam Clegg [Fri, 7 May 2021 03:29:05 +0000 (20:29 -0700)]
[lld][WebAssembly] Disallow exporting of TLS symbols

Cross-module TLS is currently not supported by our ABI.  This
change makes explicitly exporting a TLS symbol an error
and prevents implicit exporting (via --export-all).

See https://github.com/emscripten-core/emscripten/issues/14120

Differential Revision: https://reviews.llvm.org/D102044

3 years ago[cmake] Enable -Wmisleading-indentation
Dave Lee [Fri, 7 May 2021 21:16:47 +0000 (14:16 -0700)]
[cmake] Enable -Wmisleading-indentation

Enable `-Wmisleading-indentation` to balance with the LLVM style of optional parentheses.

Differential Revision: https://reviews.llvm.org/D102092

3 years ago[mlir][CAPI] Add CAPI bindings for the sparse_tensor dialect.
Stella Laurenzo [Sun, 9 May 2021 23:14:05 +0000 (16:14 -0700)]
[mlir][CAPI] Add CAPI bindings for the sparse_tensor dialect.

* Adds dialect registration, hand coded 'encoding' attribute and test.
* An MLIR CAPI tablegen backend for attributes does not exist, and this is a relatively complicated case. I opted to hand code it in a canonical way for now, which will provide a reasonable blueprint for building out the tablegen version in the future.
* Also added a (local) CMake function for declaring new CAPI tests, since it was getting repetitive/buggy.

Differential Revision: https://reviews.llvm.org/D102141

3 years ago[X86][SSE] Add examples of failures to remove a permute(pack(pack(),pack())) shuffle...
Simon Pilgrim [Mon, 10 May 2021 16:50:26 +0000 (17:50 +0100)]
[X86][SSE] Add examples of failures to remove a permute(pack(pack(),pack())) shuffle by reordering the packed operands.

3 years ago[RISCV] Correct VL for fixed length masked scatter.
Craig Topper [Mon, 10 May 2021 16:34:32 +0000 (09:34 -0700)]
[RISCV] Correct VL for fixed length masked scatter.

We were incorrectly calling getVectorNumElements on a scalable
vector type. This shouldn't be allowed. This gives a warning on
EVT, but not MVT.

3 years ago[Demangle][Rust] Parse basic types
Tomasz Miąsko [Mon, 10 May 2021 15:58:20 +0000 (08:58 -0700)]
[Demangle][Rust] Parse basic types

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102142

3 years ago[clang] Support -fpic -fno-semantic-interposition for AArch64
Fangrui Song [Mon, 10 May 2021 16:43:33 +0000 (09:43 -0700)]
[clang] Support -fpic -fno-semantic-interposition for AArch64

-fno-semantic-interposition (only effective with -fpic) can optimize default
visibility external linkage (non-ifunc-non-COMDAT) variable access and function
calls to avoid GOT/PLT, by using local aliases, e.g.
```
int var;
__attribute__((optnone)) int fun(int x) { return x * x; }
int test() { return fun(var); }
```

-fpic (var and fun are dso_preemptable)
```
test:                                   // @test
        adrp    x8, :got:var
        ldr     x8, [x8, :got_lo12:var]
        ldr     w0, [x8]
// fun is preemptible by default in ld -shared mode. ld will create a PLT.
        b       fun
```

vs -fpic -fno-semantic-interposition (var and fun are dso_local)
```
test:                                   // @test
.Ltest$local:
        adrp    x8, .Lvar$local
        ldr     w0, [x8, :lo12:.Lvar$local]
// The assembler either resolves .Lfun$local at assembly time, or produces a
// relocation referencing a non-preemptible section symbol (which can avoid PLT).
        b       .Lfun$local
```

Note: Clang's default -fpic is more aggressive than GCC -fpic: interprocedural
optimizations (including inlining) are available but local aliases are not used.
-fpic -fsemantic-interposition can disable interprocedural optimizations.

Depends on D101872

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D101873

3 years ago[ORC] Update SpeculativeJIT example for dispatchTask changes in 5344c88dcb2.
Lang Hames [Mon, 10 May 2021 16:20:59 +0000 (09:20 -0700)]
[ORC] Update SpeculativeJIT example for dispatchTask changes in 5344c88dcb2.

3 years ago[llvm-nm] Help option output should be consistent with the command guide
gbreynoo [Mon, 10 May 2021 16:25:41 +0000 (17:25 +0100)]
[llvm-nm] Help option output should be consistent with the command guide

The nm command guide shows the short options used as aliases, but these
are not found in the help text unless --show-hidden is used. Other tools
show aliases with --help. This change fixes the help output to be
consistent with the command guide.

Differential Revision: https://reviews.llvm.org/D102072

3 years ago[llvm-symbolizer] Update Command Guide
gbreynoo [Mon, 10 May 2021 16:19:05 +0000 (17:19 +0100)]
[llvm-symbolizer] Update Command Guide

The option --use-symbol-table is now a no-op and does not appear in the
help text; however, it still appears in the command guide. This change
removes it from the command guide and updates the description of
--output-style.

Differential Revision: https://reviews.llvm.org/D102078

3 years ago[X86][SSE] Add tests for missing shuffle(pack(x,y),pack(z,w)) -> permute(pack())...
Simon Pilgrim [Mon, 10 May 2021 16:17:21 +0000 (17:17 +0100)]
[X86][SSE] Add tests for missing shuffle(pack(x,y),pack(z,w)) -> permute(pack()) folds.

3 years ago[X86][SSE] Merge equal X32/X64 check prefixes. NFCI.
Simon Pilgrim [Mon, 10 May 2021 16:04:44 +0000 (17:04 +0100)]
[X86][SSE] Merge equal X32/X64 check prefixes. NFCI.

3 years ago[llvm-objdump][MachO] Print a newline before lazy bind/bind/weak/exports trie
Fangrui Song [Mon, 10 May 2021 16:16:18 +0000 (09:16 -0700)]
[llvm-objdump][MachO] Print a newline before lazy bind/bind/weak/exports trie

This adds a separator between two pieces of information.

Reviewed By: #lld-macho, alexshap

Differential Revision: https://reviews.llvm.org/D102114

3 years ago[libc++][NFC] Remove _VSTD:: when not needed.
Mark de Wever [Sun, 9 May 2021 16:22:52 +0000 (18:22 +0200)]
[libc++][NFC] Remove _VSTD:: when not needed.

Reviewed By: #libc, Quuxplusone

Differential Revision: https://reviews.llvm.org/D102133

3 years ago[X86] Fix position-independent TType encoding
Harald van Dijk [Mon, 10 May 2021 16:04:33 +0000 (17:04 +0100)]
[X86] Fix position-independent TType encoding

The logic for x86_64 position-independent TType encodings was backwards,
using 8 bytes where 4 were wanted and 4 where 8 were wanted. For regular
x86_64, this was mostly harmless: exception tables are allowed to use
8-byte encodings even when they are not needed. For the large code model,
and for X32, however, the generated exception tables were wrong. For the
large code model, we cannot assume that the address will fit in 4 bytes.
For X32, we cannot use 64-bit relocations.
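
The intended selection can be sketched roughly as follows. This is an illustration rather than the code from the patch: `picTTypeEncoding` is a hypothetical helper, the `DW_EH_PE_*` values are the standard DWARF EH pointer-encoding constants, and treating the medium code model like the small one is an assumption.

```cpp
#include <cstdint>

// Standard DWARF exception-handling pointer-encoding constants.
constexpr std::uint8_t DW_EH_PE_sdata4 = 0x0b;
constexpr std::uint8_t DW_EH_PE_sdata8 = 0x0c;
constexpr std::uint8_t DW_EH_PE_pcrel = 0x10;
constexpr std::uint8_t DW_EH_PE_indirect = 0x80;

enum class CodeModel { Small, Medium, Large };

// Hypothetical helper: pick the position-independent TType encoding.
// 8 bytes only when the address may not fit in 4 (large code model);
// X32 always gets 4 bytes because it cannot use 64-bit relocations.
constexpr std::uint8_t picTTypeEncoding(CodeModel cm, bool isX32) {
  const bool needs8Bytes = (cm == CodeModel::Large) && !isX32;
  return static_cast<std::uint8_t>(
      DW_EH_PE_indirect | DW_EH_PE_pcrel |
      (needs8Bytes ? DW_EH_PE_sdata8 : DW_EH_PE_sdata4));
}

static_assert(picTTypeEncoding(CodeModel::Small, false) == 0x9b,
              "small code model uses a 4-byte encoding");
static_assert(picTTypeEncoding(CodeModel::Large, false) == 0x9c,
              "large code model uses an 8-byte encoding");
```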

Fixes PR50148.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D102132

3 years ago[NFC] Synchronize reserved identifier code between macro and variables / symbols
serge-sans-paille [Mon, 10 May 2021 12:54:25 +0000 (14:54 +0200)]
[NFC] Synchronize reserved identifier code between macro and variables / symbols

Differential Revision: https://reviews.llvm.org/D102164

3 years ago[clang][AArch32] Correctly align HA arguments when passed on the stack
Momchil Velikov [Mon, 10 May 2021 13:53:21 +0000 (14:53 +0100)]
[clang][AArch32] Correctly align HA arguments when passed on the stack

Analogously to https://reviews.llvm.org/D98794 this patch uses the
`alignstack` attribute to fix incorrect passing of homogeneous
aggregate (HA) arguments on AArch32. The EABI/AAPCS was recently
updated to clarify how VFP co-processor candidates are aligned:
https://github.com/ARM-software/abi-aa/commit/4488e34998514dc7af5507236f279f6881eede62

Differential Revision: https://reviews.llvm.org/D100853