platform/upstream/llvm.git
2 years ago[libcxx] Throw correct exception from std::vector::reserve
Mikhail Maltsev [Thu, 21 Oct 2021 09:40:05 +0000 (10:40 +0100)]
[libcxx] Throw correct exception from std::vector::reserve

According to the standard [vector.capacity]/5, std::vector<T>::reserve
shall throw an exception of type std::length_error when the requested
capacity exceeds max_size().

This behavior is not implemented correctly: the function 'reserve'
simply propagates the exception from allocator<T>::allocate. Before
D110846 that exception used to be of type std::length_error (which is
correct for vector<T>::reserve, but incorrect for
allocator<T>::allocate).

This patch fixes the issue and adds regression tests.
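
The required behaviour can be checked with a small probe. This is only a sketch, not the regression test added by the patch; the function name is made up for illustration.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Returns true if reserving more than max_size() elements is reported as
// std::length_error, which is what [vector.capacity] asks for.
bool reserve_beyond_max_size_throws_length_error() {
  std::vector<int> v;
  // For int elements max_size() is well below SIZE_MAX on common
  // implementations, so adding 1 does not wrap around.
  const std::size_t too_many = v.max_size() + 1;
  try {
    v.reserve(too_many);
  } catch (const std::length_error&) {
    return true;   // conforming behaviour
  } catch (const std::bad_alloc&) {
    return false;  // an exception propagated from the allocator instead
  }
  return false;    // no exception at all would also be non-conforming
}
```

On a conforming standard library the function is expected to return true.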

Reviewed By: Quuxplusone, ldionne, #libc

Differential Revision: https://reviews.llvm.org/D112068

2 years ago[libcxx] Support allocators with explicit c-tors in vector<bool>
Mikhail Maltsev [Thu, 21 Oct 2021 09:38:56 +0000 (10:38 +0100)]
[libcxx] Support allocators with explicit c-tors in vector<bool>

std::vector<bool> rebinds the supplied allocator to construct objects
of type '__storage_type' rather than 'bool'. Allocators are allowed to
use explicit conversion constructors, so care must be taken when
performing conversions.
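
The following sketch illustrates why an explicit converting constructor matters. `ExplicitAlloc` and `rebind_for_storage` are hypothetical names, and `unsigned long` merely stands in for the internal storage type; this is not the libc++ code.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>

// Hypothetical minimal allocator whose converting constructor is explicit.
template <class T>
struct ExplicitAlloc {
  using value_type = T;
  ExplicitAlloc() = default;
  template <class U>
  explicit ExplicitAlloc(const ExplicitAlloc<U>&) noexcept {}
  T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
  void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

using BoolAlloc = ExplicitAlloc<bool>;
using WordAlloc = ExplicitAlloc<unsigned long>;  // stand-in for the rebound allocator

// Direct-initialization, as in `WordAlloc w(a);`, is allowed ...
static_assert(std::is_constructible<WordAlloc, const BoolAlloc&>::value,
              "explicit conversion must work");
// ... but implicit conversion, as in `WordAlloc w = a;`, is not.
static_assert(!std::is_convertible<const BoolAlloc&, WordAlloc>::value,
              "implicit conversion must be rejected");

WordAlloc rebind_for_storage(const BoolAlloc& a) {
  return WordAlloc(a);  // spelled out explicitly; `return a;` would not compile
}
```

Container code that converts the user's allocator therefore has to use direct-initialization everywhere, which is the kind of care the message refers to.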

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D112150

2 years agoRevert "[fir] Add Character helper"
Valentin Clement [Thu, 21 Oct 2021 09:36:10 +0000 (11:36 +0200)]
Revert "[fir] Add Character helper"

This reverts commit e4ce92245c96cea9492767d7149eb9e30dee0d16.

Buildbots not happy with the tests.

2 years ago[clang] Support __float128 on DragonFlyBSD.
Frederic Cambus [Thu, 21 Oct 2021 09:18:52 +0000 (11:18 +0200)]
[clang] Support __float128 on DragonFlyBSD.

Differential Revision: https://reviews.llvm.org/D111760

2 years ago[docs] Fix broken link rendering in the LLVM Coding Standards.
Frederic Cambus [Thu, 21 Oct 2021 09:10:18 +0000 (11:10 +0200)]
[docs] Fix broken link rendering in the LLVM Coding Standards.

2 years ago[lldb] [Host/SerialPort] Add std::moves for better compatibility
Michał Górny [Thu, 21 Oct 2021 09:08:05 +0000 (11:08 +0200)]
[lldb] [Host/SerialPort] Add std::moves for better compatibility

2 years ago[lldb] [Host/Terminal] Add missing #ifdef for baudRateToConst()
Michał Górny [Thu, 21 Oct 2021 09:00:17 +0000 (11:00 +0200)]
[lldb] [Host/Terminal] Add missing #ifdef for baudRateToConst()

2 years ago[lldb] [unittest] Disable SetParity() tests on Linux entirely
Michał Górny [Thu, 21 Oct 2021 08:54:02 +0000 (10:54 +0200)]
[lldb] [unittest] Disable SetParity() tests on Linux entirely

Attempting to enable PARENB causes tcsetattr() to fail on the Debian
and Ubuntu buildbots, so let's skip these tests on Linux entirely.

2 years ago[lldb] Add serial:// protocol for connecting to serial port
Michał Górny [Thu, 7 Oct 2021 21:14:23 +0000 (23:14 +0200)]
[lldb] Add serial:// protocol for connecting to serial port

Add a new serial:// protocol along with SerialPort that provides a new
API to open serial ports.  The URL consists of serial device path
followed by URL-style options, e.g.:

    serial:///dev/ttyS0?baud=115200&parity=even

If no options are provided, the serial port is only set to raw mode
and the other attributes remain unchanged.  Attributes provided via
options are modified to the specified values.  Upon closing the serial
port, its original attributes are restored.
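
To make the URL shape concrete, here is a minimal parsing sketch. `SerialUrl` and `parse_serial_url` are hypothetical names and this is not lldb's parser; it only splits the device path from the `key=value` options.

```cpp
#include <map>
#include <string>

struct SerialUrl {
  std::string path;                            // e.g. "/dev/ttyS0"
  std::map<std::string, std::string> options;  // e.g. {"baud": "115200"}
};

// Parses "serial://<path>[?key=value[&key=value...]]".
// Returns false if the scheme is not "serial://".
bool parse_serial_url(const std::string& url, SerialUrl& out) {
  const std::string scheme = "serial://";
  if (url.compare(0, scheme.size(), scheme) != 0) return false;

  const std::string rest = url.substr(scheme.size());
  const std::size_t q = rest.find('?');
  out.path = rest.substr(0, q);  // whole string when there is no '?'
  out.options.clear();
  if (q == std::string::npos) return true;

  std::size_t pos = q + 1;
  while (pos < rest.size()) {
    std::size_t amp = rest.find('&', pos);
    if (amp == std::string::npos) amp = rest.size();
    const std::string pair = rest.substr(pos, amp - pos);
    const std::size_t eq = pair.find('=');
    if (eq != std::string::npos)
      out.options[pair.substr(0, eq)] = pair.substr(eq + 1);
    pos = amp + 1;
  }
  return true;
}
```

For the example URL above, the sketch yields the path `/dev/ttyS0` and the options `baud=115200` and `parity=even`.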

Differential Revision: https://reviews.llvm.org/D111355

2 years ago[NARY-REASSOCIATE][NFC] Simplify min/max handling
Evgeniy Brevnov [Wed, 20 Oct 2021 09:42:19 +0000 (16:42 +0700)]
[NARY-REASSOCIATE][NFC] Simplify min/max handling

To explore different variants of reassociation, the current implementation uses a "swap in a loop" approach. Unfortunately, that implementation is more complicated than it needs to be. This patch streamlines the code: the core functionality is extracted into a helper function, which is called explicitly as many times as required.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D112128

2 years ago[mlir][linalg][bufferize][NFC] Change findValueInReverseUseDefChain signature
Matthias Springer [Thu, 21 Oct 2021 08:29:02 +0000 (17:29 +0900)]
[mlir][linalg][bufferize][NFC] Change findValueInReverseUseDefChain signature

This commit is in preparation for scf.if support.

* `condition` in findValueInReverseUseDefChain takes a Value instead of OpOperand*.
* Return a SetVector<Value> instead of a single Value. This SetVector always contains exactly one Value at the moment.

Differential Revision: https://reviews.llvm.org/D111928

2 years ago[SVE][Analysis] Tune the cost model according to the tune-cpu attribute
David Sherwood [Wed, 22 Sep 2021 09:54:05 +0000 (10:54 +0100)]
[SVE][Analysis] Tune the cost model according to the tune-cpu attribute

This patch introduces a new function:

  AArch64Subtarget::getVScaleForTuning

that returns a value for vscale that can be used for tuning the cost
model when using scalable vectors. The VScaleForTuning option in
AArch64Subtarget is initialised according to the following rules:

1. If the user has specified the CPU to tune for we use that, else
2. If the target CPU was specified we use that, else
3. The tuning is set to "generic".

For CPUs of type "generic" I have assumed that vscale=2.
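
The selection rules can be sketched as follows. The function names are hypothetical and this is not the AArch64Subtarget code; only the value for "generic" is taken from the message, and per-CPU values are deliberately left out.

```cpp
#include <string>

// Picks the CPU name used for tuning: the tune-cpu attribute if present,
// otherwise the target CPU, otherwise "generic".
std::string select_tuning_cpu(const std::string& tune_cpu,
                              const std::string& target_cpu) {
  if (!tune_cpu.empty()) return tune_cpu;
  if (!target_cpu.empty()) return target_cpu;
  return "generic";
}

// Returns the vscale assumed for tuning, or 0 when this sketch has no value
// for the CPU (the message only states the value for "generic").
unsigned vscale_for_tuning(const std::string& cpu) {
  if (cpu == "generic") return 2;
  return 0;
}
```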

New tests added here:

  Analysis/CostModel/AArch64/sve-gather.ll
  Analysis/CostModel/AArch64/sve-scatter.ll
  Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll

Differential Revision: https://reviews.llvm.org/D110259

2 years ago[lldb] [Host] Add setters for common teletype properties to Terminal
Michał Górny [Sun, 3 Oct 2021 18:25:01 +0000 (20:25 +0200)]
[lldb] [Host] Add setters for common teletype properties to Terminal

Add setters for common teletype properties to the Terminal class:

- SetRaw() to enable common raw mode options

- SetBaudRate() to set the baud rate

- SetStopBits() to select the number of stop bits

- SetParity() to control parity bit in the output

- SetHardwareControlFlow() to enable or disable hardware control flow
  (if supported)
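
As a rough illustration of what such setters map to on POSIX systems, the sketch below edits an in-memory `termios` structure. It is not the lldb Terminal implementation; the function name is hypothetical, and applying the settings to a real device would additionally need `tcsetattr()` on an open file descriptor.

```cpp
#include <termios.h>

// Configures even parity, one stop bit and 115200 baud in `t`.
// Only the structure is modified; no device is touched.
bool configure_even_parity_115200(struct termios& t) {
  t.c_cflag |= PARENB;    // enable parity generation and checking
  t.c_cflag &= ~PARODD;   // even parity rather than odd
  t.c_cflag &= ~CSTOPB;   // one stop bit (CSTOPB set would mean two)
  if (cfsetispeed(&t, B115200) != 0) return false;  // input baud rate
  if (cfsetospeed(&t, B115200) != 0) return false;  // output baud rate
  return true;
}
```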

Differential Revision: https://reviews.llvm.org/D111030

2 years ago[MLIR][OpenMP] Add support for ordered construct
Peixin-Qiao [Thu, 21 Oct 2021 08:30:46 +0000 (16:30 +0800)]
[MLIR][OpenMP] Add support for ordered construct

This patch supports the ordered construct in the OpenMP dialect, following
Section 2.19.9 of the OpenMP 5.1 standard, and lowers it to LLVM IR
using the OpenMP IRBuilder. Lowering the ordered simd directive to LLVM IR
is not supported yet because LLVM optimization passes do not support it
for now.

Reviewed By: kiranchandramohan, clementval, ftynse, shraiysh

Differential Revision: https://reviews.llvm.org/D110015

2 years ago[mlir][linalg][bufferize][NFC] Check return value of getResultBuffer
Matthias Springer [Thu, 21 Oct 2021 08:23:15 +0000 (17:23 +0900)]
[mlir][linalg][bufferize][NFC] Check return value of getResultBuffer

In a subsequent commit, getResultBuffer can return a "null" Value. This is the case when the returned buffer from an scf.if is not unique.

This commit is in preparation for scf.if support to keep the next commit smaller.

Differential Revision: https://reviews.llvm.org/D111927

2 years ago[mlir][linalg][bufferize] Bufferize using PostOrder traversal
Matthias Springer [Thu, 21 Oct 2021 08:00:31 +0000 (17:00 +0900)]
[mlir][linalg][bufferize] Bufferize using PostOrder traversal

This is required for bufferization of scf::IfOp, which is added in a subsequent commit.

Some ops (scf::ForOp, TiledLoopOp) require PreOrder traversal to make sure that bbArgs are mapped before bufferizing the loop body.

Differential Revision: https://reviews.llvm.org/D111924

2 years ago[lldb][NFC] clang-format CPlusPlusLanguage.cpp
Raphael Isemann [Thu, 21 Oct 2021 08:01:02 +0000 (10:01 +0200)]
[lldb][NFC] clang-format CPlusPlusLanguage.cpp

2 years ago[fir] Add Character helper
Valentin Clement [Thu, 21 Oct 2021 07:47:33 +0000 (09:47 +0200)]
[fir] Add Character helper

This patch is extracted from D111337. It introduces the
CharacterExprHelper, which helps deal with characters in FIR.

Reviewed By: schweitz, awarzynski

Differential Revision: https://reviews.llvm.org/D112140

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>

2 years ago[NFC][LoopIdiom] Add more test case to runtime-determined memset size
eopXD [Sat, 16 Oct 2021 16:11:44 +0000 (09:11 -0700)]
[NFC][LoopIdiom] Add more test case to runtime-determined memset size

This patch adds test cases that were missing from D107353.
- Fix wrong descriptions in the 64-bit mode test case
- Add a test case for 32-bit mode

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D108507

2 years ago[LLDB] [NFC] Typo fix in usage text for "type filter" command
Daniel Jalkut [Thu, 21 Oct 2021 06:52:07 +0000 (12:22 +0530)]
[LLDB] [NFC] Typo fix in usage text for "type filter" command

When you invoke "help type filter" the resulting help shows:

Syntax: type synthetic [<sub-command-options>]

This patch fixes the help so it says "type filter" instead of "type synthetic".

patch by: "Daniel Jalkut <jalkut@red-sweater.com>"

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D112199

2 years ago[opt-viewer] Use safe yaml load_all
Yi Kong [Thu, 21 Oct 2021 05:56:14 +0000 (13:56 +0800)]
[opt-viewer] Use safe yaml load_all

Differential Revision: https://reviews.llvm.org/D112075

2 years agoRevert "[MLIR][OpenMP] Add support for ordered construct"
Mehdi Amini [Thu, 21 Oct 2021 04:53:45 +0000 (04:53 +0000)]
Revert "[MLIR][OpenMP] Add support for ordered construct"

This reverts commit dc2be87ecf10f2f1cf05f638a72256387c78f1c1.

Seems like this broke all the CI bots.

2 years ago[ELF] Avoid adding an orphan section to a less suitable segment
Igor Kudrin [Thu, 21 Oct 2021 04:37:52 +0000 (11:37 +0700)]
[ELF] Avoid adding an orphan section to a less suitable segment

If segments are defined in a linker script, placing an orphan section
before the found closest-rank section can result in adding it in a
previous segment and changing flags of that segment. This happens if
the orphan section has a lower sort rank than the found section. To
avoid that, the patch forces orphan sections to be moved after the
found section if segments are explicitly defined.

Differential Revision: https://reviews.llvm.org/D111717

2 years ago[NFC][msan] Add NormalArgAfterNoUndef testcase
Vitaly Buka [Thu, 21 Oct 2021 03:44:41 +0000 (20:44 -0700)]
[NFC][msan] Add NormalArgAfterNoUndef testcase

2 years ago[NFC][msan] Rerun update_test_checks.py for a test
Vitaly Buka [Thu, 21 Oct 2021 03:24:11 +0000 (20:24 -0700)]
[NFC][msan] Rerun update_test_checks.py for a test

2 years ago[NFC][msan] Break the loop when done
Vitaly Buka [Thu, 21 Oct 2021 03:00:10 +0000 (20:00 -0700)]
[NFC][msan] Break the loop when done

We have nothing to do after the Argument
is found.

2 years ago[lld-macho][nfc] Added some notes on deliberate differences btw LD64 vs LLD-MACHO
Vy Nguyen [Sat, 25 Sep 2021 01:39:30 +0000 (21:39 -0400)]
[lld-macho][nfc] Added some notes on deliberate differences btw LD64 vs LLD-MACHO

This could be useful for future reference and to help with debugging crashes.

Differential Revision: https://reviews.llvm.org/D110464

2 years ago[Codegen] Set ARITH_FENCE as meta-instruction
Shengchen Kan [Wed, 20 Oct 2021 09:11:08 +0000 (17:11 +0800)]
[Codegen] Set ARITH_FENCE as meta-instruction

ARITH_FENCE, which was added by https://reviews.llvm.org/D99675,
should be a meta-instruction because it only emits the comment "ARITH_FENCE".

Reviewed By: pengfei, LuoYuanke

Differential Revision: https://reviews.llvm.org/D112127

2 years ago[modules] While merging ObjCInterfaceDecl definitions, merge them as decl contexts...
Volodymyr Sapsai [Wed, 22 Sep 2021 19:37:46 +0000 (12:37 -0700)]
[modules] While merging ObjCInterfaceDecl definitions, merge them as decl contexts too.

While working on https://reviews.llvm.org/D110280 I've tried to merge
decl contexts as it seems to be correct and matching our handling of
decl contexts from different modules. It's not required for the fix in
https://reviews.llvm.org/D110280 but it revealed a missing diagnostic,
so separating this change into a separate commit.

Renamed some variables to distinguish diagnostics like "declaration of
'x' does not match" for different cases.

Differential Revision: https://reviews.llvm.org/D110287

2 years ago[MLIR][OpenMP] Add support for ordered construct
Peixin-Qiao [Thu, 21 Oct 2021 01:16:04 +0000 (09:16 +0800)]
[MLIR][OpenMP] Add support for ordered construct

This patch supports the ordered construct in the OpenMP dialect, following
Section 2.19.9 of the OpenMP 5.1 standard, and lowers it to LLVM IR
using the OpenMP IRBuilder. Lowering the ordered simd directive to LLVM IR
is not supported yet because LLVM optimization passes do not support it
for now.

Reviewed By: kiranchandramohan, clementval, ftynse, shraiysh

Differential Revision: https://reviews.llvm.org/D110015

2 years ago[Driver][OpenBSD] Some improvements to the external assembler handling
Brad Smith [Thu, 21 Oct 2021 00:59:46 +0000 (20:59 -0400)]
[Driver][OpenBSD] Some improvements to the external assembler handling

- Pass CPU variant for ARM
- Pass MIPS CPU in addition to the ABI

2 years ago[ARM] Use correct name of floating point ceil intrinsic in test.
Craig Topper [Thu, 21 Oct 2021 00:29:02 +0000 (17:29 -0700)]
[ARM] Use correct name of floating point ceil intrinsic in test.

The intrinsic is called llvm.ceil not llvm.fceil. The checks weren't
strong enough to notice that a call to llvm.fceil was emitted in
the final assembly.

2 years ago[msan] Add stat-family interceptors on Linux
Nikita Malyavin [Wed, 20 Oct 2021 23:53:50 +0000 (16:53 -0700)]
[msan] Add stat-family interceptors on Linux

Add the following interceptors on Linux: stat, lstat, fstat, fstatat.

This fixes use-of-uninitialized value on platforms with GLIBC 2.33+.
In particular: Arch Linux, Ubuntu hirsute/impish.

The tests should have also been failing during the release on the mentioned platforms, but I cannot find any related discussion.

Most likely, the regression was introduced by glibc commit 8ed005daf0ab03e14250032 (https://github.com/bminor/glibc/commit/8ed005daf0ab03e142500324a34087ce179ae78e):
all stat-family functions are now exported as shared functions.

Before, some of them (namely stat, lstat, fstat, fstatat) were provided as a part of libc_nonshared.a and called their __xstat doppelgangers. This is still true for Debian Sid and earlier Ubuntu releases. The stat interceptors can safely be provided for them as well.

Closes https://github.com/google/sanitizers/issues/1452.
See also https://jira.mariadb.org/browse/MDEV-24841

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D111984

2 years ago[lld-macho] Temporarily disable lc-linker-option.ll on Windows
Jez Ng [Thu, 21 Oct 2021 00:04:37 +0000 (20:04 -0400)]
[lld-macho] Temporarily disable lc-linker-option.ll on Windows

It's currently using a symlink, which is not supported on Windows.

2 years ago[SelectionDAG] Bail out of mergeTruncStores when not optimizing
Arthur Eubanks [Tue, 12 Oct 2021 01:51:37 +0000 (18:51 -0700)]
[SelectionDAG] Bail out of mergeTruncStores when not optimizing

With unoptimized code, we may see lots of stores and spend too much time in mergeTruncStores.

Fixes PR51827.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D111596

2 years ago[ARM] Fix inline assembly referencing floating point registers on soft-float targets
Pavel Kosov [Wed, 20 Oct 2021 23:39:01 +0000 (02:39 +0300)]
[ARM] Fix inline assembly referencing floating point registers on soft-float targets

Fixes PR: https://bugs.llvm.org/show_bug.cgi?id=52230

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D112135

OS Laboratory, Huawei Russian Research Institute, Saint-Petersburg

2 years agoRevert "[ORC-RT] Configure the ORC runtime for more architectures and platforms"
Ben Langmuir [Wed, 20 Oct 2021 22:32:06 +0000 (15:32 -0700)]
Revert "[ORC-RT] Configure the ORC runtime for more architectures and platforms"

Broke on aarch64-linux. Reverting while I investigate.

This reverts commit 5692ed0cce8c9506eef40ffe6ca2d9629956c51c.

2 years ago[runtimes] Rename CI job from "Runtimes build" to "Bootstrapping build"
Louis Dionne [Wed, 20 Oct 2021 21:43:55 +0000 (17:43 -0400)]
[runtimes] Rename CI job from "Runtimes build" to "Bootstrapping build"

2 years ago[libunwind] Revert "Use the from-scratch testing configuration by default"
Louis Dionne [Wed, 20 Oct 2021 21:40:23 +0000 (17:40 -0400)]
[libunwind] Revert "Use the from-scratch testing configuration by default"

This reverts commit 5a8ad80b6fa5cbad58b78384f534b78fca863e7f, which broke
the Bootstrapping build. I'm reverting until we've fixed the issue.

Differential Revision: https://reviews.llvm.org/D112082

2 years ago[libc++abi] Guard include of <unistd.h> behind __has_include
Louis Dionne [Wed, 20 Oct 2021 21:36:13 +0000 (17:36 -0400)]
[libc++abi] Guard include of <unistd.h> behind __has_include

This doesn't change anything on platforms that have <unistd.h>, but
it will allow this file to compile on platforms that do not.
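
The guard pattern looks roughly like this. The macro `SKETCH_HAS_UNISTD_H` and the function are hypothetical and chosen only for illustration; the actual libc++abi change may be structured differently.

```cpp
// Include <unistd.h> only when the toolchain reports that it exists.
#if defined(__has_include)
#  if __has_include(<unistd.h>)
#    include <unistd.h>
#    define SKETCH_HAS_UNISTD_H 1
#  endif
#endif

// Returns the process id where <unistd.h> is available, or -1 otherwise.
long current_process_id_or_minus_one() {
#if defined(SKETCH_HAS_UNISTD_H)
  return static_cast<long>(getpid());
#else
  return -1;
#endif
}
```

Code that depends on the header is then compiled only when the include succeeded, so the file still builds on platforms without it.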

2 years ago[Tests] Add tests for non-speculatable ephemeral values
Nikita Popov [Wed, 20 Oct 2021 20:15:51 +0000 (22:15 +0200)]
[Tests] Add tests for non-speculatable ephemeral values

The loads in these examples are currently not considered ephemeral
because they are not speculatable.

2 years agoRevert "[fir] Add Character helper"
Valentin Clement [Wed, 20 Oct 2021 20:43:13 +0000 (22:43 +0200)]
Revert "[fir] Add Character helper"

This reverts commit 02d7089c239075a5c2e148087d2824d253fc3d5f.

2 years ago[x86] add special-case lowering for usubsat for AVX512
Sanjay Patel [Wed, 20 Oct 2021 20:09:15 +0000 (16:09 -0400)]
[x86] add special-case lowering for usubsat for AVX512

This is a small extension of D112095 to avoid another regression
seen with D112085.
In this case, we allow the same conversion from usubsat to ALU
ops if the target supports vpternlog.

That pattern will get converted later in X86DAGToDAGISel::tryVPTERNLOG().
This seems better than putting a magic immediate constant directly in
this code to create the exact vpternlog that we need. It's possible that
there are other special-cases along these lines, so we should try to
keep all of the vpternlog magic in one place.

Differential Revision: https://reviews.llvm.org/D112138

2 years ago[libc++] Fix incorrect main() signatures in the tests
Louis Dionne [Wed, 20 Oct 2021 20:24:55 +0000 (16:24 -0400)]
[libc++] Fix incorrect main() signatures in the tests

Those creep in from time to time. We need to use `int main(int, char**)`
because in freestanding mode, `main` doesn't get special treatment and
special mangling, so we set up a symbol alias from the mangled version of
`main(int, char**)` to `extern "C" main`. That only works if all the tests
are consistent about how they define their main function.

2 years ago[InstCombine] Fold `(a & ~b) & ~c` to `a & ~(b | c)`
Stanislav Mekhanoshin [Tue, 19 Oct 2021 23:11:02 +0000 (16:11 -0700)]
[InstCombine] Fold `(a & ~b) & ~c` to `a & ~(b | c)`

  %not1 = xor i32 %b, -1
  %not2 = xor i32 %c, -1
  %and1 = and i32 %a, %not1
  %and2 = and i32 %and1, %not2
=>
  %i1 = or i32 %b, %c
  %i2 = xor i32 %i1, -1
  %and2 = and i32 %i2, %a
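
The fold relies on De Morgan's law, `~b & ~c == ~(b | c)`. As a sanity check, and not as part of the patch, the identity can be verified exhaustively on 8-bit values; the function name is made up for illustration.

```cpp
#include <cstdint>

// Checks (a & ~b) & ~c == a & ~(b | c) for every combination of 8-bit inputs.
bool and_not_fold_holds_for_all_bytes() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      for (unsigned c = 0; c < 256; ++c) {
        const std::uint8_t lhs = static_cast<std::uint8_t>((a & ~b) & ~c);
        const std::uint8_t rhs = static_cast<std::uint8_t>(a & ~(b | c));
        if (lhs != rhs) return false;
      }
  return true;
}
```

Because the operations are bitwise, each bit position behaves independently, so agreement on 8-bit values carries over to wider integers.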

Differential Revision: https://reviews.llvm.org/D112108

2 years agoRemove include of 'type_info' from ext-int test.
Erich Keane [Wed, 20 Oct 2021 19:52:40 +0000 (12:52 -0700)]
Remove include of 'type_info' from ext-int test.

Originally I believed I needed a #include to trick the compiler into
letting me use typeid, but Aaron explained that
it was just looking for the type_info type.  I had to give it some
public/private members to make it emit the same as before, but this
ought to be a 'perfect' replacement.

2 years agoPrecommit updated InstCombine/and-xor-or.ll test. NFC.
Stanislav Mekhanoshin [Wed, 20 Oct 2021 19:50:23 +0000 (12:50 -0700)]
Precommit updated InstCombine/and-xor-or.ll test. NFC.

2 years ago[IndVars] Invalidate SCEV when IR is changed in rewriteLoopExitValue.
Florian Hahn [Wed, 20 Oct 2021 19:25:07 +0000 (20:25 +0100)]
[IndVars] Invalidate SCEV when IR is changed in rewriteLoopExitValue.

At the moment, rewriteLoopExitValue forgets the current phi node in the
loop that collects phis to rewrite. A few lines after the value is
forgotten, SCEV is used again to analyze incoming values and
potentially expand SCEV expression. This means that another SCEV is
created for PN, before the IR is actually updated in the next loop.

This leads to accessing invalid cached expression in combination with
D71539.

PN should only be changed once the actual incoming exit value is set in
the next loop. Moving invalidation there should ensure that PN is
invalidated in all relevant cases.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D111495

2 years ago[mlir][sparse] make index type explicit in public API of support library
Aart Bik [Wed, 20 Oct 2021 04:43:03 +0000 (21:43 -0700)]
[mlir][sparse] make index type explicit in public API of support library

The current implementation used explicit index->int64_t casts for some, but
not all instances of passing values of type "index" in and from the sparse
support library. This revision makes the situation more consistent by
using a new "index_t" type at all such places (which allows for less trivial
casting in the generated MLIR code).  Note that the current revision still
assumes that "index" is 64-bit wide. If we want to support targets with
alternative "index" bit widths, we need to build the support library differently.
But the current revision is a step forward by making this requirement explicit
and more visible.

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D112122

2 years agoMake dr177x.cpp test work with Windows-32 bit platforms with 'thiscall'.
Erich Keane [Wed, 20 Oct 2021 19:37:19 +0000 (12:37 -0700)]
Make dr177x.cpp test work with Windows-32 bit platforms with 'thiscall'.

My downstream noticed that the test failed on windows-32 bit machines
since the types have __attribute__((thiscall)) on them in a few places.
This patch just adds a wildcard to handle that, since it isn't
particularly important to the test.

2 years ago[fir] Add Character helper
Valentin Clement [Wed, 20 Oct 2021 19:33:48 +0000 (21:33 +0200)]
[fir] Add Character helper

This patch is extracted from D111337. It introduces the
CharacterExprHelper, which helps deal with characters in FIR.

Reviewed By: schweitz, awarzynski

Differential Revision: https://reviews.llvm.org/D112140

Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>

2 years agoUpdate ext-int test to have x86 linux/windows before ABI Impl
Erich Keane [Wed, 20 Oct 2021 19:25:49 +0000 (12:25 -0700)]
Update ext-int test to have x86 linux/windows before ABI Impl

Writing a quick test to make sure we are aware of the change to the
_ExtInt/_BitInt ABI on x86 (32bit) OSes.

2 years agoDrop transfer_read inner most unit dimensions
Ahmed S. Taei [Wed, 20 Oct 2021 17:32:56 +0000 (17:32 +0000)]
Drop transfer_read inner most unit dimensions

Add a pattern to take a rank-reducing subview and drop inner most
contiguous unit dim.
This is useful when lowering vector to backends with 1d vector types.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D111561

2 years ago[AArch64][GlobalISel] combine (and (or x, c1), c2) => (and x, c2) iff c1 & c2 == 0
Jon Roelofs [Thu, 14 Oct 2021 19:15:35 +0000 (12:15 -0700)]
[AArch64][GlobalISel] combine (and (or x, c1), c2) => (and x, c2) iff c1 & c2 == 0

https://godbolt.org/z/h8ejrG4hb
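
The combine is sound because `(x | c1) & c2` expands to `(x & c2) | (c1 & c2)`, and the second term vanishes when `c1 & c2 == 0`. A brute-force check over 8-bit values, written here only as an illustration with a made-up function name:

```cpp
// Checks (x | c1) & c2 == x & c2 for all 8-bit x, c1, c2 with c1 & c2 == 0.
bool and_of_or_combine_holds_for_all_bytes() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c1 = 0; c1 < 256; ++c1)
      for (unsigned c2 = 0; c2 < 256; ++c2) {
        if ((c1 & c2) != 0) continue;  // precondition of the combine
        if (((x | c1) & c2) != (x & c2)) return false;
      }
  return true;
}
```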

rdar://83597585

Differential Revision: https://reviews.llvm.org/D111856

2 years ago[lldb] Remove variable "any" which is set but not used
Jonas Devlieghere [Wed, 20 Oct 2021 19:08:18 +0000 (12:08 -0700)]
[lldb] Remove variable "any" which is set but not used

2 years ago[AMDGPU] MachineLICM cannot hoist VALU
Stanislav Mekhanoshin [Fri, 23 Jul 2021 19:28:55 +0000 (12:28 -0700)]
[AMDGPU] MachineLICM cannot hoist VALU

MachineLoop::isLoopInvariant() returns false for all VALU
because of the exec use. Check TII::isIgnorableUse() to
allow hoisting.

That unfortunately results in higher register consumption
since MachineLICM does not adequately estimate pressure.
Therefore I think it shall only be enabled after D107677 even
though it does not depend on it.

Differential Revision: https://reviews.llvm.org/D107859

2 years ago[AMDGPU] Allow rematerialization of SOP with virtual registers
Stanislav Mekhanoshin [Fri, 24 Sep 2021 18:12:59 +0000 (11:12 -0700)]
[AMDGPU] Allow rematerialization of SOP with virtual registers

D106408 was doing this for all targets, although it was
reverted due to a couple of performance regressions on some targets.
The difference for AMDGPU is the ability to rematerialize SOP
instructions with virtual register uses like we already do for VOP.

Differential Revision: https://reviews.llvm.org/D110743

2 years ago[MC] Recursively calculate symbol offset
Leonard Grey [Wed, 20 Oct 2021 18:29:43 +0000 (14:29 -0400)]
[MC] Recursively calculate symbol offset

This is speculative since I'm not sure if there's some implicit contract that a
variable symbol must not have another variable symbol in its evaluation tree.
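
A toy model of the idea, with hypothetical types that are unrelated to the real MC classes: a variable symbol is defined as another symbol plus an addend, and its offset is found by resolving the base symbol first, recursively.

```cpp
#include <cstdint>
#include <map>
#include <string>

struct ToySymbol {
  bool is_variable;    // true: defined as `base + value`
  std::string base;    // name of the base symbol (variable symbols only)
  std::int64_t value;  // section offset, or the addend for variable symbols
};

// Resolves the offset of `name`, following chains of variable symbols.
// Returns false for unknown symbols or chains deeper than 64 (cycle guard).
bool symbol_offset(const std::map<std::string, ToySymbol>& symbols,
                   const std::string& name, std::int64_t& out, int depth = 0) {
  if (depth > 64) return false;
  const auto it = symbols.find(name);
  if (it == symbols.end()) return false;
  if (!it->second.is_variable) {
    out = it->second.value;
    return true;
  }
  std::int64_t base_offset = 0;
  if (!symbol_offset(symbols, it->second.base, base_offset, depth + 1))
    return false;
  out = base_offset + it->second.value;
  return true;
}
```

With `a` at offset 16, `b = a + 4` and `c = b + 8`, the sketch resolves `c` to 28.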

Downstream bug: https://bugs.chromium.org/p/chromium/issues/detail?id=471146#c23.

Test is based on alias.s (removed checks since we just need to know it didn't
crash).

Differential Revision: https://reviews.llvm.org/D109109

2 years ago[lld/mac] Remove else-after-return in ICF code
Nico Weber [Sun, 17 Oct 2021 19:42:27 +0000 (15:42 -0400)]
[lld/mac] Remove else-after-return in ICF code

No behavior change.

2 years ago[InstCombine] fold fake vector insert to bit-logic
Sanjay Patel [Wed, 20 Oct 2021 17:25:00 +0000 (13:25 -0400)]
[InstCombine] fold fake vector insert to bit-logic

bitcast (inselt (bitcast X), Y, 0) --> or (and X, MaskC), (zext Y)
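
For intuition, the equivalence can be modelled in plain C++ for an i32 viewed as four i8 lanes. This is a sketch that assumes a little-endian host, where lane 0 is the low byte and MaskC is 0xFFFFFF00; the function names are made up.

```cpp
#include <cstdint>
#include <cstring>

// Models bitcast (inselt (bitcast X to <4 x i8>), Y, 0) by going through memory.
std::uint32_t insert_lane0_via_vector(std::uint32_t x, std::uint8_t y) {
  std::uint8_t lanes[4];
  std::memcpy(lanes, &x, sizeof(lanes));
  lanes[0] = y;  // overwrite element 0
  std::uint32_t result;
  std::memcpy(&result, lanes, sizeof(result));
  return result;
}

// Models or (and X, MaskC), (zext Y) with MaskC clearing the low byte.
std::uint32_t insert_lane0_via_bit_logic(std::uint32_t x, std::uint8_t y) {
  const std::uint32_t mask_c = 0xFFFFFF00u;
  return (x & mask_c) | static_cast<std::uint32_t>(y);
}

// True when the first byte in memory is the least significant one.
bool host_is_little_endian() {
  const std::uint32_t probe = 1;
  std::uint8_t first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}
```

On a little-endian host both functions agree for any inputs, for example turning 0x12345678 and 0xAB into 0x123456AB.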

https://alive2.llvm.org/ce/z/Ux-662

Similar to D111082 / db231ebdb07f :
We want to avoid relatively opaque vector ops on types that are
likely supported by the backend as scalar integers. The bitwise
logic ops are more likely to allow further combining.

We probably want to generalize this to allow a shift too, but
that would oppose instcombine's general rule of not creating
extra instructions, so that's left as a potential follow-up.
Alternatively, we could do that transform in VectorCombine
with the help of the TTI cost model.

This is part of solving:
https://llvm.org/PR52057

2 years ago[ORC-RT] Configure the ORC runtime for more architectures and platforms
Ben Langmuir [Wed, 20 Oct 2021 17:37:32 +0000 (10:37 -0700)]
[ORC-RT] Configure the ORC runtime for more architectures and platforms

Enable building the ORC runtime for 64-bit and 32-bit ARM architectures,
and for all Darwin embedded platforms (iOS, tvOS, and watchOS). This
covers building the cross-platform code, but does not add TLV runtime
support for the new architectures, which can be added independently.

Incidentally, stop building the Mach-O TLS support file unnecessarily on
other platforms.

Differential Revision: https://reviews.llvm.org/D112111

2 years ago[clang] Disallow mixing SEH and Objective-C exceptions
Nico Weber [Wed, 20 Oct 2021 16:47:15 +0000 (12:47 -0400)]
[clang] Disallow mixing SEH and Objective-C exceptions

We already disallow mixing SEH and C++ exceptions, and
mixing SEH and Objective-C exceptions seems to not work (see PR52233).
Emitting an error is friendlier than crashing.

Differential Revision: https://reviews.llvm.org/D112157

2 years agoPrecommit InstCombine/and-xor-or.ll test. NFC.
Stanislav Mekhanoshin [Wed, 20 Oct 2021 17:47:24 +0000 (10:47 -0700)]
Precommit InstCombine/and-xor-or.ll test. NFC.

2 years agoRaise compile error when using unimplemented functions
Muiez Ahmed [Wed, 20 Oct 2021 17:55:50 +0000 (13:55 -0400)]
Raise compile error when using unimplemented functions

The path functions in this patch are unimplemented (as per the TODO comment from upstream). To avoid running into a linker error (missing symbol), this patch raises a compile error by commenting out the functions, which is more user friendly.

Differential Revision: https://reviews.llvm.org/D111892

2 years ago[NFC] De-template LazyCallGraph::visitReferences() and move into .cpp file
Arthur Eubanks [Wed, 20 Oct 2021 17:49:21 +0000 (10:49 -0700)]
[NFC] De-template LazyCallGraph::visitReferences() and move into .cpp file

This makes changing it and recompiling it much faster.

2 years ago[Clang][AST] Resolve FIXME: Remove ObjCObjectPointer from isSpecifierType
Alfonso Gregory [Wed, 20 Oct 2021 17:29:02 +0000 (10:29 -0700)]
[Clang][AST] Resolve FIXME: Remove ObjCObjectPointer from isSpecifierType

There is no reason to have this here (all tests still pass), and it
isn't even a specifier anyway. We can just treat it as a pointer
instead.

Differential Revision: https://reviews.llvm.org/D110068

2 years ago[IR] Refactor GlobalIFunc to inherit from GlobalObject, Remove GlobalIndirectSymbol
Itay Bookstein [Wed, 20 Oct 2021 17:29:47 +0000 (10:29 -0700)]
[IR] Refactor GlobalIFunc to inherit from GlobalObject, Remove GlobalIndirectSymbol

As discussed in:
* https://reviews.llvm.org/D94166
* https://lists.llvm.org/pipermail/llvm-dev/2020-September/145031.html

The GlobalIndirectSymbol class lost most of its meaning in
https://reviews.llvm.org/D109792, which disambiguated getBaseObject
(now getAliaseeObject) between GlobalIFunc and everything else.
In addition, as long as GlobalIFunc is not a GlobalObject and
getAliaseeObject returns GlobalObjects, a GlobalAlias whose aliasee
is a GlobalIFunc cannot currently be modeled properly. Creating
aliases for GlobalIFuncs does happen in the wild (e.g. glibc). In addition,
calling getAliaseeObject on a GlobalIFunc will currently return nullptr,
which is undesirable because it should return the object itself for
non-aliases.

This patch refactors the GlobalIFunc class to inherit directly from
GlobalObject, and removes GlobalIndirectSymbol (while inlining the
relevant parts into GlobalAlias and GlobalIFunc). This allows for
calling getAliaseeObject() on a GlobalIFunc to return the GlobalIFunc
itself, making getAliaseeObject() more consistent and enabling
alias-to-ifunc to be properly modeled in the IR.

I exercised some judgement in the API clients of GlobalIndirectSymbol:
some were 'monomorphized' for GlobalAlias and GlobalIFunc, and
some remained shared (with the type adapted to become GlobalValue).

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D108872

2 years agoInvalidPtrChecker - don't dereference a dyn_cast<> - use cast<> instead.
Simon Pilgrim [Wed, 20 Oct 2021 17:05:46 +0000 (18:05 +0100)]
InvalidPtrChecker - don't dereference a dyn_cast<> - use cast<> instead.

Avoid dereferencing a nullptr returned by dyn_cast<>, by using cast<> instead which asserts that the cast is valid.
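
The difference can be illustrated with standard C++ only. The sketch below uses `dynamic_cast` and a hypothetical `checked_cast` helper as an analogy; LLVM's own `cast<>` and `dyn_cast<>` do not rely on C++ RTTI, so this is not their implementation.

```cpp
#include <cassert>

struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { double radius = 1.0; };
struct Square : Shape { double side = 2.0; };

// Hypothetical checked cast: asserts on a type mismatch instead of handing
// back a null pointer that the caller might dereference.
template <class To, class From>
To* checked_cast(From* from) {
  To* result = dynamic_cast<To*>(from);
  assert(result && "checked_cast<> applied to an incompatible type");
  return result;
}

double radius_of(Shape* s) {
  // Writing dynamic_cast<Circle*>(s)->radius here would dereference null
  // whenever s is not a Circle; the checked variant fails loudly instead.
  return checked_cast<Circle>(s)->radius;
}
```

When the caller already knows the dynamic type, the asserting form documents that expectation; the null-returning form is only appropriate when the result is actually tested.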

2 years ago[mlir] fix region property generation in python bindings
Alex Zinenko [Wed, 20 Oct 2021 16:56:20 +0000 (18:56 +0200)]
[mlir] fix region property generation in python bindings

2 years agoFix unused variable warning.
Sterling Augustine [Wed, 20 Oct 2021 16:59:16 +0000 (09:59 -0700)]
Fix unused variable warning.

2 years ago[WebAssembly] Add prototype relaxed float min max instructions
Zhi An Ng [Wed, 20 Oct 2021 16:41:50 +0000 (09:41 -0700)]
[WebAssembly] Add prototype relaxed float min max instructions

Add relaxed f32x4.min, f32x4.max, f64x2.min, and f64x2.max. These are only
exposed as builtins, and require user opt-in.

Differential Revision: https://reviews.llvm.org/D112146

2 years ago[OpenMP] Add GOMP allocator functions
Nawrin Sultana [Tue, 12 Oct 2021 19:54:46 +0000 (14:54 -0500)]
[OpenMP] Add GOMP allocator functions

This patch adds GOMP_alloc and GOMP_free functions of LIBGOMP.

Differential revision: https://reviews.llvm.org/D111673

2 years ago[InstCombine] add tests for casted insertelement; NFC
Sanjay Patel [Wed, 20 Oct 2021 16:14:45 +0000 (12:14 -0400)]
[InstCombine] add tests for casted insertelement; NFC

2 years ago[MLIR][OpenMP] Shifted hint from CriticalOp to CriticalDeclareOp
Shraiysh Vaishay [Wed, 20 Oct 2021 12:32:21 +0000 (18:02 +0530)]
[MLIR][OpenMP] Shifted hint from CriticalOp to CriticalDeclareOp

According to the OpenMP 5.0 standard, the names and hints of critical
operations are closely related. The following restrictions apply to them:
 - Unless the effect is as if `hint(omp_sync_hint_none)` was specified, the
   critical construct must specify a name.
 - If the hint clause is specified, each of the critical constructs with the
   same name must have a hint clause for which the hint-expression evaluates to
   the same value.

These restrictions will be enforced by design if the hint expression is a part
of the `omp.critical.declare` operation.
 - Any operation with no "name" will be considered to have
   `hint(omp_sync_hint_none)`.
 - All the operations with the same "name" will have the same hint value.
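
Why this holds "by design" can be sketched with a small model in plain C++. The names SyncHint, CriticalDeclarations and CriticalRegion are hypothetical and do not correspond to the MLIR operations; the sketch only shows that when the hint is stored once per name, two regions with the same name cannot disagree.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class SyncHint { None, Uncontended, Contended };

// Hypothetical registry playing the role of the declarations: the hint is
// stored once per name rather than on every critical region.
class CriticalDeclarations {
public:
  void declare(const std::string &name, SyncHint hint) { hints[name] = hint; }

  // A region without a name, or with an undeclared name, is treated as
  // having no hint.
  SyncHint hintFor(const std::string &name) const {
    if (name.empty())
      return SyncHint::None;
    auto it = hints.find(name);
    return it == hints.end() ? SyncHint::None : it->second;
  }

private:
  std::map<std::string, SyncHint> hints;
};

// A region carries only a (possibly empty) name; its hint is looked up.
struct CriticalRegion {
  std::string name;
};
```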

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D112134

2 years ago[clang] Add plugin ActionType to run command line plugin before main action
Arthur Eubanks [Tue, 19 Oct 2021 21:50:44 +0000 (14:50 -0700)]
[clang] Add plugin ActionType to run command line plugin before main action

Currently we have a way to run a plugin if specified on the command line
after the main action, and ways to unconditionally run the plugin before
or after the main action, but no way to run a plugin if specified on the
command line before the main action.

This introduces the missing option.

This is helpful because -clear-ast-before-backend clears the AST before
codegen, while some plugins may want access to the AST.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D112096

2 years ago[CodeGenPrepare] Avoid a scalable-vector crash in ctlz/cttz
Fraser Cormack [Wed, 20 Oct 2021 14:51:36 +0000 (15:51 +0100)]
[CodeGenPrepare] Avoid a scalable-vector crash in ctlz/cttz

This patch fixes a crash when despeculating ctlz/cttz intrinsics with
scalable-vector types. It is not safe to speculatively get the size of
the vector type in bits in case the vector type is not a fixed-length type. As
it happens this isn't required as vector types are skipped anyway.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D112141

2 years ago[RISCV] Use clang_builtin_alias for all RISCV vector intrinsics.
Craig Topper [Wed, 20 Oct 2021 14:54:05 +0000 (07:54 -0700)]
[RISCV] Use clang_builtin_alias for all RISCV vector intrinsics.

Previously we used builtin_alias for overloaded intrinsics, but
macros for the non-overloaded version. This patch changes the
non-overloaded versions to also use builtin_alias, but without
the overloadable attribute.

Reviewed By: khchen, HsiangKai

Differential Revision: https://reviews.llvm.org/D112020

2 years ago[analyzer][NFC] Refactor llvm::isa<> usages in the StaticAnalyzer
Balazs Benics [Wed, 20 Oct 2021 15:43:31 +0000 (17:43 +0200)]
[analyzer][NFC] Refactor llvm::isa<> usages in the StaticAnalyzer

It turns out llvm::isa<> is variadic, and we could have used this in a
lot of places.

The following patterns:
  x && isa<T1>(x) || isa<T2>(x) ...
Will be replaced by:
  isa_and_non_null<T1, T2, ...>(x)

In some places this enabled further simplifications, where the old form
would have caused even more code smell.

Aside from this, keep in mind that within `assert()` or any other
function-like macro, we need to wrap the isa<> expression in parentheses,
because the preprocessor would otherwise split it at the comma.
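
Both points can be sketched in standard C++ without LLVM headers. Node, Add, Mul, Load, isaOneOf and isaAndNonNull below are hypothetical names, and dynamic_cast stands in for the LLVM type test; this is not how llvm::isa<> is implemented.

```cpp
#include <cassert>

struct Node { virtual ~Node() = default; };
struct Add : Node {};
struct Mul : Node {};
struct Load : Node {};

// Base case: test a single candidate type.
template <typename T>
bool isaOneOf(const Node *node) {
  return dynamic_cast<const T *>(node) != nullptr;
}

// Recursive case: true if the node matches any of the listed types.
template <typename T, typename U, typename... Rest>
bool isaOneOf(const Node *node) {
  return isaOneOf<T>(node) || isaOneOf<U, Rest...>(node);
}

// Variadic test that also tolerates a null pointer.
template <typename... Ts>
bool isaAndNonNull(const Node *node) {
  return node && isaOneOf<Ts...>(node);
}

inline void checkIsArithmetic(const Node *node) {
  // The extra parentheses stop the preprocessor from splitting the template
  // argument list at the comma into two macro arguments.
  assert((isaAndNonNull<Add, Mul>(node)));
  (void)node; // keeps the parameter used when assertions are compiled out
}
```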

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D111982

2 years ago[lld-macho] Fix dangling string reference when adding frameworks
Kaining Zhong [Wed, 20 Oct 2021 15:20:21 +0000 (11:20 -0400)]
[lld-macho] Fix dangling string reference when adding frameworks

In Driver.cpp, addFramework used a std::string instance to represent the path of a framework, which is freed after the function returns. However, this string is stored in loadedArchive, which is used later to compare against the paths of newly added frameworks. This caused https://bugs.llvm.org/show_bug.cgi?id=52133. A test is included in this commit to reproduce this bug.

Now resolveDylibPath returns a StringRef whose data is saved with StringSaver, and that StringRef is passed back up to the calling functions. This ensures the resolved framework path is still valid after LC_LINKER_OPTION is parsed.
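
The ownership pattern can be sketched in standard C++17, with std::string_view standing in for StringRef. PathSaver, resolveFrameworkPath and the path layout are hypothetical and are not the lld code; the sketch only shows a view being backed by storage that outlives the function that built the string.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <string_view>

// Hypothetical stand-in for a string saver: it owns copies of strings and
// hands out views that stay valid for as long as the saver itself lives.
class PathSaver {
public:
  std::string_view save(const std::string &text) {
    // std::deque does not relocate existing elements on push_back, so views
    // into previously saved strings remain valid.
    storage.push_back(text);
    return storage.back();
  }

private:
  std::deque<std::string> storage;
};

// Hypothetical resolver: the path is built in a local std::string and then
// saved, so the returned view outlives this function.
inline std::string_view resolveFrameworkPath(PathSaver &saver,
                                             const std::string &name) {
  std::string path = "/Library/Frameworks/" + name + ".framework/" + name;
  return saver.save(path); // a view of `path` itself would dangle on return
}
```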

Reviewed By: int3, #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D111706

2 years ago[NewPM][test] Strictly use -passes in some more lit tests
Bjorn Pettersson [Wed, 20 Oct 2021 15:06:25 +0000 (17:06 +0200)]
[NewPM][test] Strictly use -passes in some more lit tests

Removed/replaced RUN lines using legacy PM syntax in favor of using
-passes in lit tests for Float2Int, MetaRenamer, StripDeadPrototypes
and StripSymbols.

2 years ago[NewPM][test] Use -passes syntax in Mem2Reg lit tests
Bjorn Pettersson [Wed, 20 Oct 2021 13:45:29 +0000 (15:45 +0200)]
[NewPM][test] Use -passes syntax in Mem2Reg lit tests

The legacy PM is deprecated, so use the new PM syntax in lit tests
verifying the mem2reg pass.

2 years ago[Sema, StaticAnalyzer] Use StringRef::contains (NFC)
Kazu Hirata [Wed, 20 Oct 2021 15:02:36 +0000 (08:02 -0700)]
[Sema, StaticAnalyzer] Use StringRef::contains (NFC)

2 years ago[RISCV][WebAssembly][TargetLowering] Allow expandCTLZ/expandCTTZ to rely on CTPOP...
Craig Topper [Tue, 19 Oct 2021 23:10:02 +0000 (16:10 -0700)]
[RISCV][WebAssembly][TargetLowering] Allow expandCTLZ/expandCTTZ to rely on CTPOP expansion for vectors.

Our fallback expansion for CTLZ/CTTZ relies on CTPOP. If CTPOP
isn't legal or custom for a vector type we would scalarize the
CTLZ/CTTZ. This is different than CTPOP itself which would use a
vector expansion.

This patch teaches expandCTLZ/CTTZ to rely on the vector CTPOP
expansion instead of scalarizing. To do this I had to add additional
checks to make sure the operations used by CTPOP expansions are all
supported. Some of the operations were already needed for the CTLZ/CTTZ
expansion.
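
For reference, the textbook identities that express CTTZ and CTLZ through a population count look like this for a 32-bit scalar. This is a sketch only: popcount32, cttzViaCtpop and ctlzViaCtpop are hypothetical names, the patch itself concerns vector types, and it is an assumption that the TargetLowering expansions have this shape.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Population count of a 32-bit value, using the standard library.
inline unsigned popcount32(uint32_t x) {
  return static_cast<unsigned>(std::bitset<32>(x).count());
}

// cttz(x) = popcount(~x & (x - 1)): the mask selects exactly the zero bits
// below the lowest set bit. For x == 0 the mask is all ones, giving 32.
inline unsigned cttzViaCtpop(uint32_t x) {
  return popcount32(~x & (x - 1));
}

// ctlz(x): smear the highest set bit into every lower position, then count
// the bits that are still zero.
inline unsigned ctlzViaCtpop(uint32_t x) {
  x |= x >> 1;
  x |= x >> 2;
  x |= x >> 4;
  x |= x >> 8;
  x |= x >> 16;
  return popcount32(~x);
}
```

Besides the population count, these identities need only shifts, bitwise logic and a subtraction.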

This is a huge improvement for RISCV, which doesn't have a scalar
ctlz or cttz in the base ISA.

For WebAssembly, I've added Custom lowering to keep the scalarizing
behavior. I've also extended the scalarizing to CTPOP.

Differential Revision: https://reviews.llvm.org/D111919

2 years ago[clangd] Fix use-after-free in HeaderIncluderCache
Kadir Cetinkaya [Wed, 20 Oct 2021 10:52:49 +0000 (12:52 +0200)]
[clangd] Fix use-after-free in HeaderIncluderCache

The includer cache could get into a bad state when a main file went bad
and was added back afterwards. This patch adds a check to invalidate to
prevent that.

Differential Revision: https://reviews.llvm.org/D112130

2 years ago[clangd] Only publish preamble after rebuilds
Kadir Cetinkaya [Wed, 20 Oct 2021 13:12:25 +0000 (15:12 +0200)]
[clangd] Only publish preamble after rebuilds

Don't invoke parsing callback for preamble if clangd is using a
previously built one.

Differential Revision: https://reviews.llvm.org/D112137

2 years ago[x86] make helper for useVPTERNLOG; NFC
Sanjay Patel [Wed, 20 Oct 2021 13:01:11 +0000 (09:01 -0400)]
[x86] make helper for useVPTERNLOG; NFC

See D112085 for another use case.

2 years ago[mlir] Expand prefixing to OpFormatGen
Jacques Pienaar [Wed, 20 Oct 2021 14:08:36 +0000 (07:08 -0700)]
[mlir] Expand prefixing to OpFormatGen

Follow up to also use the prefixed emitters in OpFormatGen (moved
getGetterName(s) and getSetterName(s) to Operator as that is most
convenient usage wise even though it just depends on Dialect). Prefix
accessors in Test dialect and follow up on missed changes in
OpDefinitionsGen.

Differential Revision: https://reviews.llvm.org/D112118

2 years ago[DebugInfo][InstrRef] Track a single variable at a time
Jeremy Morse [Wed, 20 Oct 2021 13:57:07 +0000 (14:57 +0100)]
[DebugInfo][InstrRef] Track a single variable at a time

Here's another performance patch for InstrRefBasedLDV: rather than
processing all variable values in a scope at a time, instead, process one
variable at a time. The benefits are twofold:
 * It's easier to reason about one variable at a time.
 * It improves performance, apparently from increased locality.

The downside is that the value-propagation code gets indented one level
further, plus there's some churn in the unit tests.

Differential Revision: https://reviews.llvm.org/D111799

2 years ago[mlir][Linalg] Add a first vectorization pattern for conv1d in NWCxWCF format.
Nicolas Vasilache [Wed, 20 Oct 2021 13:53:56 +0000 (13:53 +0000)]
[mlir][Linalg] Add a first vectorization pattern for conv1d in NWCxWCF format.

This revision uses the newly refactored StructuredGenerator to create a simple vectorization for conv1d_nwc_wcf.

Note that the pattern is not specific to the op and is technically not even specific to the ConvolutionOpInterface (modulo minor details related to dilations and strides).

The overall design follows the same ideas as the lowering of vector::ContractionOp -> vector::OuterProduct: it seeks to be minimally complex, composable and extensible while avoiding inference analysis. Instead, we metaprogram the maps/indexings we expect and we match against them.

This is just a first stab and still needs to be evaluated for performance.
Other tradeoffs are possible that should be explored.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D111894

2 years ago[libFuzzer] Update InputInfo.TimeOfUnit when replacing it in the corpus.
PZ Read [Wed, 20 Oct 2021 13:14:22 +0000 (06:14 -0700)]
[libFuzzer] Update InputInfo.TimeOfUnit when replacing it in the corpus.

Previously, when the fuzzing loop replaced an input in the corpus, it didn't update the execution time of the input. Therefore, some schedulers (e.g. Entropic) would adjust weights based on the incorrect execution time.

This patch updates the execution time of the input when replacing it.
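
A minimal sketch of the idea, not the libFuzzer code: InputInfoSketch and replaceInput are hypothetical, and only the field name timeOfUnit echoes the TimeOfUnit member named in the title.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical corpus entry: the input bytes plus the measured run time.
struct InputInfoSketch {
  std::vector<uint8_t> data;
  std::chrono::microseconds timeOfUnit{0};
};

// Replace the bytes of a corpus entry and refresh its execution time in the
// same step, so that time-based weighting sees the time of the new input.
inline void replaceInput(InputInfoSketch &info, std::vector<uint8_t> newData,
                         std::chrono::microseconds newTime) {
  info.data = std::move(newData);
  info.timeOfUnit = newTime;
}
```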

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D111479

2 years ago[libc++] Move LIBCXX-DEBUG-FIXME to params.py
Louis Dionne [Mon, 18 Oct 2021 20:19:52 +0000 (16:19 -0400)]
[libc++] Move LIBCXX-DEBUG-FIXME to params.py

This temporary FIXME really belongs to the testing config, not to the
specific CMake cache that enables that configuration.

Differential Revision: https://reviews.llvm.org/D112031

2 years ago[NewPM][test] Only use -passes syntax in Scalarizer lit tests
Bjorn Pettersson [Wed, 20 Oct 2021 13:00:07 +0000 (15:00 +0200)]
[NewPM][test] Only use -passes syntax in Scalarizer lit tests

With legacy PM being deprecated it should be enough to verify the
scalarizer pass using the new-PM syntax when invoking opt.

2 years ago[NewPM][test] Use -passes syntax in VectorCombine lit tests
Bjorn Pettersson [Wed, 20 Oct 2021 12:47:49 +0000 (14:47 +0200)]
[NewPM][test] Use -passes syntax in VectorCombine lit tests

The legacy PM is deprecated, so use the new PM syntax in lit tests
running the vector-combine pass.

2 years ago[NewPM][test] Use -passes syntax in BoundsChecking lit tests
Bjorn Pettersson [Wed, 20 Oct 2021 10:34:45 +0000 (12:34 +0200)]
[NewPM][test] Use -passes syntax in BoundsChecking lit tests

The legacy PM is deprecated, so use the new PM syntax in lit tests
running the bounds-checking pass.

2 years ago[NewPM][test] Use -passes syntax in SpeculativeExecution lit tests
Bjorn Pettersson [Wed, 20 Oct 2021 10:32:08 +0000 (12:32 +0200)]
[NewPM][test] Use -passes syntax in SpeculativeExecution lit tests

The legacy PM is deprecated, so use the new PM syntax in lit tests
running the speculative-execution pass.

2 years ago[NewPM][test] Avoid using -enable-new-pm=1 since -passes implies new PM
Bjorn Pettersson [Wed, 20 Oct 2021 09:28:07 +0000 (11:28 +0200)]
[NewPM][test] Avoid using -enable-new-pm=1 since -passes implies new PM

2 years ago[lldb] [ABI/X86] Support combining xmm* and ymm*h regs into ymm*
Michał Górny [Fri, 27 Aug 2021 16:55:37 +0000 (18:55 +0200)]
[lldb] [ABI/X86] Support combining xmm* and ymm*h regs into ymm*

gdbserver does not expose combined ymm* registers but rather XSAVE-style
split xmm* and ymm*h portions.  Extend value_regs to support combining
multiple registers and use it to create user-friendly ymm* registers
that are combined from split xmm* and ymm*h portions.
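
The combination itself amounts to concatenating two 128-bit halves. The sketch below uses hypothetical names (Xmm, Ymm, combineYmm) and is not the lldb implementation; it assumes the usual little-endian layout in which the xmm part forms the low 16 bytes of the 32-byte value.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

using Xmm = std::array<uint8_t, 16>; // also used for the ymm*h upper half
using Ymm = std::array<uint8_t, 32>;

// Build the 256-bit value: low 128 bits from xmm, high 128 bits from ymmh.
inline Ymm combineYmm(const Xmm &xmm, const Xmm &ymmh) {
  Ymm ymm{};
  std::memcpy(ymm.data(), xmm.data(), xmm.size());
  std::memcpy(ymm.data() + xmm.size(), ymmh.data(), ymmh.size());
  return ymm;
}
```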

Differential Revision: https://reviews.llvm.org/D108937

2 years ago[lldb] [Process/Utility] Fix value_regs/invalidate_regs for ARM
Michał Górny [Tue, 19 Oct 2021 12:17:20 +0000 (14:17 +0200)]
[lldb] [Process/Utility] Fix value_regs/invalidate_regs for ARM

Fix incorrect values for value_regs, and incomplete values for
invalidate_regs in RegisterInfos_arm.  The value_regs entry needs
to list only one base (i.e. larger) register that needs to be read
to get the value for this register, while invalidate_regs needs to list
all other registers (including pseudo-registers) whose values would
change when this register is written to.

7a8ba4ffbeecb5070926b80bb839a4d80539f1ac fixed a similar problem
for ARM64.

Differential Revision: https://reviews.llvm.org/D112066

2 years ago[lldb] [Process/Linux] Support arbitrarily-sized FPR writes on ARM
Michał Górny [Wed, 20 Oct 2021 11:35:35 +0000 (13:35 +0200)]
[lldb] [Process/Linux] Support arbitrarily-sized FPR writes on ARM

Support arbitrarily-sized FPR writes on ARM in order to fix writing qN
registers directly.  Currently, writing them works only by accident
due to value_regs splitting them into smaller writes via dN and sN
registers.

Differential Revision: https://reviews.llvm.org/D112131

2 years ago[SelectionDAG] Fix getVectorSubVecPointer for scalable subvectors.
Sander de Smalen [Tue, 12 Oct 2021 11:37:42 +0000 (12:37 +0100)]
[SelectionDAG] Fix getVectorSubVecPointer for scalable subvectors.

When inserting a scalable subvector into a scalable vector through
the stack, the index to store to needs to be scaled by vscale.
Before this patch, that didn't yet happen, so it would generate the
wrong offset, thus storing a subvector to the incorrect address
and overwriting the wrong lanes.

For some insert:
  nxv8f16 insert_subvector(nxv8f16 %vec, nxv2f16 %subvec, i64 2)

The offset was not scaled by vscale:
  orr     x8, x8, #0x4
  st1h    { z0.h }, p0, [sp]
  st1h    { z1.d }, p1, [x8]
  ld1h    { z0.h }, p0/z, [sp]

And is changed to:
  mov x8, sp
  st1h { z0.h }, p0, [sp]
  st1h { z1.d }, p1, [x8, #1, mul vl]
  ld1h { z0.h }, p0/z, [sp]
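
The arithmetic behind the fix can be sketched as follows. The function names are hypothetical and this is not the SelectionDAG code. With 2-byte f16 elements and index 2, the unscaled offset is 4 bytes, matching the #0x4 in the first listing, while the correct offset also grows with vscale.

```cpp
#include <cassert>
#include <cstdint>

// Offset as computed before the fix: the index is not scaled by vscale.
inline uint64_t unscaledOffsetBytes(uint64_t index, uint64_t elemBytes) {
  return index * elemBytes;
}

// Offset for a scalable subvector: the index counts elements per unit of
// vscale, so the byte offset is multiplied by vscale as well.
inline uint64_t scaledOffsetBytes(uint64_t index, uint64_t elemBytes,
                                  uint64_t vscale) {
  return index * elemBytes * vscale;
}
```

The two results agree only when vscale is 1.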

Differential Revision: https://reviews.llvm.org/D111633