platform/upstream/llvm.git
3 years ago[JITLink] Minor fix to avoid Windows compiler warning for static-cast
David Stuttard [Wed, 7 Apr 2021 13:19:21 +0000 (14:19 +0100)]
[JITLink] Minor fix to avoid Windows compiler warning for static-cast

Change-Id: Id0c1d5535b53e2aebe314151c0efa585e763f3f6

Differential Revision: https://reviews.llvm.org/D100093

3 years ago[AArch64] Change __ARM_FEATURE_FP16FML macro name to __ARM_FEATURE_FP16_FML
Keith Walker [Thu, 29 Apr 2021 12:46:25 +0000 (13:46 +0100)]
[AArch64] Change __ARM_FEATURE_FP16FML macro name to __ARM_FEATURE_FP16_FML

The "Arm C Language extensions" document (the current version can be
found at https://developer.arm.com/documentation/101028/0012/?lang=en)
states that the name of the feature test macro for the FP16 FML extension
is __ARM_FEATURE_FP16_FML.

Differential Revision: https://reviews.llvm.org/D101532

3 years ago[lldb] Add tests for DumpDataExtractor formats
David Spickett [Tue, 13 Apr 2021 14:42:02 +0000 (15:42 +0100)]
[lldb] Add tests for DumpDataExtractor formats

Covering basic cases where you have 1 item on 1 line.

Apart from eFormatCharArray, where using multiple lines
highlights the difference between it and eFormatVectorOfChar.

Reviewed By: #lldb, teemperor

Differential Revision: https://reviews.llvm.org/D101453

3 years ago[RISCV][NFC] Merge RV32/RV64 test checks with a common prefix
Fraser Cormack [Fri, 30 Apr 2021 08:39:25 +0000 (09:39 +0100)]
[RISCV][NFC] Merge RV32/RV64 test checks with a common prefix

3 years ago[RISCV] Support STEP_VECTOR with a step greater than one
Fraser Cormack [Tue, 20 Apr 2021 14:23:30 +0000 (15:23 +0100)]
[RISCV] Support STEP_VECTOR with a step greater than one

DAGCombiner was recently taught how to combine STEP_VECTOR nodes,
meaning the step value is no longer guaranteed to be one by the time it
reaches the backend for lowering.

This patch supports such cases on RISC-V by lowering other step
values to a multiply following the vid.v instruction. It includes a
small optimization for common cases where the multiply can be expressed
as a shift left.
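
As a rough illustration of the idea, here is a scalar sketch in C++. It is not the backend's lowering code: the function names are hypothetical, and the loop index stands in for the element values produced by vid.v. A power-of-two step uses a shift, any other step uses a multiply.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical scalar model of the lowering; not the actual backend code.
static bool isPowerOfTwo(uint64_t Step) {
  return Step != 0 && (Step & (Step - 1)) == 0;
}

// Builds <0, Step, 2*Step, ...> from the index sequence <0, 1, 2, ...>
// (the values vid.v would produce), shifting when Step is a power of two
// and multiplying otherwise.
std::vector<uint64_t> stepVector(unsigned NumElts, uint64_t Step) {
  std::vector<uint64_t> Result(NumElts);
  const bool UseShift = isPowerOfTwo(Step);
  unsigned Shift = 0;
  if (UseShift)
    while ((uint64_t{1} << Shift) != Step)
      ++Shift;
  for (unsigned I = 0; I < NumElts; ++I) {
    const uint64_t Vid = I; // element I of the vid.v result
    Result[I] = UseShift ? (Vid << Shift) : (Vid * Step);
  }
  return Result;
}
```

Both paths compute the same values; the shift is only a cheaper way to express the multiply.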

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D100856

3 years ago[llvm][Support][NFC] Fix fallthrough attribute indentation
Timm Bäder [Fri, 30 Apr 2021 08:17:03 +0000 (10:17 +0200)]
[llvm][Support][NFC] Fix fallthrough attribute indentation

The attribute does not belong to the if statement before and trips up
gcc's indentation checker.
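
A small sketch of the kind of layout this is about, using the standard C++17 [[fallthrough]] attribute rather than any LLVM macro. The function and its names are made up for illustration.

```cpp
// Hypothetical example; names are illustrative only.
int classify(int Kind, bool Verbose) {
  int Score = 0;
  switch (Kind) {
  case 0:
    if (Verbose)
      Score += 10;
    // The attribute is a statement of its own at the case-body level, so it
    // is indented like the `if`, not like the body of the `if`.
    [[fallthrough]];
  case 1:
    Score += 1;
    break;
  default:
    break;
  }
  return Score;
}
```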

3 years agotsan: fix fork syscall test
Dmitry Vyukov [Fri, 30 Apr 2021 08:20:01 +0000 (10:20 +0200)]
tsan: fix fork syscall test

Arm64 builders failed with:
error: use of undeclared identifier 'SYS_fork'
https://lab.llvm.org/buildbot/#/builders/7/builds/2575

Indeed, not all arches have fork syscall.
Implement fork via clone on these arches.
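
A minimal Linux-only sketch of that fallback, assuming clone with SIGCHLD as its only flag and a null stack is an acceptable stand-in for fork. The helper names are hypothetical and this is not the test's actual code.

```cpp
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Sketch (Linux only): fork through a raw syscall, falling back to clone on
// architectures that do not define SYS_fork.
static pid_t rawFork() {
#ifdef SYS_fork
  return static_cast<pid_t>(syscall(SYS_fork));
#else
  // With only SIGCHLD in the flags and a null stack, clone behaves like fork.
  return static_cast<pid_t>(syscall(SYS_clone, SIGCHLD, 0, 0, 0, 0));
#endif
}

// Forks, lets the child exit with Code, and returns the exit status observed
// by the parent (or -1 on any failure).
int forkAndReap(int Code) {
  const pid_t Pid = rawFork();
  if (Pid < 0)
    return -1;
  if (Pid == 0)
    _exit(Code);
  int Status = 0;
  if (waitpid(Pid, &Status, 0) != Pid)
    return -1;
  return WIFEXITED(Status) ? WEXITSTATUS(Status) : -1;
}
```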

Differential Revision: https://reviews.llvm.org/D101603

3 years ago[GISel] Teach TableGen to check predicates of immediate operands in patterns
Dominik Montada [Mon, 29 Mar 2021 13:21:46 +0000 (15:21 +0200)]
[GISel] Teach TableGen to check predicates of immediate operands in patterns

Reviewed By: dsanders

Differential Revision: https://reviews.llvm.org/D91703

3 years ago[AMDGPU] Simplify getWaitStatesSince. NFC.
Jay Foad [Fri, 30 Apr 2021 07:58:20 +0000 (08:58 +0100)]
[AMDGPU] Simplify getWaitStatesSince. NFC.

3 years ago[cmake] Use -ffunction-sections and -Wl,--gc-sections on MinGW targets
Martin Storsjö [Wed, 21 Apr 2021 05:35:40 +0000 (08:35 +0300)]
[cmake] Use -ffunction-sections and -Wl,--gc-sections on MinGW targets

If compiling with GCC or linking with ld.bfd, these options have little
effect, but if built with Clang and linked with LLD, they provide a
quite notable size decrease - this shrinks an entire llvm-mingw
distribution package by 22%.

If building with BUILD_SHARED_LIBS or LLVM_BUILD_LLVM_DYLIB with LLD,
this requires a version of LLD that contains a fix for auto exporting
symbols from comdats, 2b01a417d7ccb001ccc1185ef5fdc967c9fac8d7.

Differential Revision: https://reviews.llvm.org/D101568

3 years agoFix -fdebug-pass-structure test case
Evgeny Leviant [Fri, 30 Apr 2021 07:18:23 +0000 (10:18 +0300)]
Fix -fdebug-pass-structure test case

Pass structure can change when -O0 is given and extensions are used.

3 years agoReapply [llvm-readobj] [ARMWinEH] Fix handling of relocations and symbol offsets
Martin Storsjö [Mon, 12 Apr 2021 09:49:17 +0000 (12:49 +0300)]
Reapply [llvm-readobj] [ARMWinEH] Fix handling of relocations and symbol offsets

When looking up data referenced from pdata/xdata structures, the
referenced data can be found in two different ways:
- For an unrelocated object file, it's located via a relocation
- For a relocated, linked image, the data is referenced with an
  (image relative) absolute address

For the latter case, the absolute address can optionally be
described with a symbol.

For the case of an object file, there are two offsets involved; one
immediate offset encoded in the data location that is modified by
the relocation, and a section offset in the symbol.

Previously, for the ExceptionRecord field, we printed the offset
from the symbol (only) but used the immediate offset ignoring
the symbol's address (using only the symbol's section) for printing
the exception data.

Add a helper method for doing the lookup and address calculation,
for simplifying the calling code and making all the cases consistent.

This addresses an existing FIXME comment, fixing printing of the
exception data for cases where relocations point at individual
symbols in the xdata section (which is what MSVC generates) instead of
all relocations pointing at the start of the xdata section (which is
what LLVM generates).

This also fixes printing of the function name for packed entries in
linked images.

Relanded with a format string fix in the formatSymbol function; one
can't use %X as format string for an uint64_t. That bug has been
present since this code was added in e6971cab306cd.
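
To illustrate the format string point, here is a hedged sketch that formats a 64-bit value portably with the PRIX64 macro from <cinttypes>. The helper name formatOffset is hypothetical and is not the formatSymbol function mentioned above.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 64-bit offset in upper-case hex.
// PRIX64 expands to the conversion specifier that matches uint64_t on the
// current platform, whereas plain "%X" expects an unsigned int argument.
std::string formatOffset(uint64_t Offset) {
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "+0x%" PRIX64, Offset);
  return Buf;
}
```

Passing a uint64_t where "%X" expects an unsigned int is undefined behaviour, and on some ABIs it can also shift how the following variadic arguments are read.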

Differential Revision: https://reviews.llvm.org/D100305

3 years agotsan: refactor fork handling
Dmitry Vyukov [Fri, 30 Apr 2021 06:32:52 +0000 (08:32 +0200)]
tsan: refactor fork handling

Commit efd254b6362 ("tsan: fix deadlock in pthread_atfork callbacks")
fixed another deadlock related to atfork handling.
But builders with DCHECKs enabled reported failures of
pthread_atfork_deadlock2.c and pthread_atfork_deadlock3.c tests
related to the fact that we hold runtime locks on interceptor exit:
https://lab.llvm.org/buildbot/#/builders/70/builds/6727
This issue is somewhat inherent to the current approach,
we indeed execute user code (atfork callbacks) with runtime lock held.

Refactor fork handling to not run user code (atfork callbacks)
with runtime locks held. This change does this by installing
own atfork callbacks during runtime initialization.
Atfork callbacks run in LIFO order, so the expectation is that
our callbacks run last, right before the actual fork.
This way we lock runtime mutexes around fork, but not around
user callbacks.
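
The ordering argument can be checked with a small POSIX sketch. The handler names are hypothetical stand-ins, not the tsan runtime's code: a handler registered first (playing the runtime) should have its prepare callback run after one registered later (playing user code).

```cpp
#include <pthread.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Records the order in which the prepare handlers run in the parent.
static std::string Order;

static void runtimePrepare() { Order += "R"; } // stands in for the runtime
static void userPrepare() { Order += "U"; }    // stands in for user code

// Registers the "runtime" handler first and the "user" handler second, forks
// once, and returns the observed prepare order. Call it only once per
// process: every call registers the handlers again.
std::string prepareOrder() {
  Order.clear();
  pthread_atfork(runtimePrepare, nullptr, nullptr);
  pthread_atfork(userPrepare, nullptr, nullptr);
  const pid_t Pid = fork();
  if (Pid == 0)
    _exit(0); // child: nothing to do
  if (Pid > 0)
    waitpid(Pid, nullptr, 0);
  return Order;
}
```

POSIX specifies that prepare handlers run in reverse order of registration, so the handler registered earliest runs last, immediately before the fork itself.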

Extend tests to also install after fork callbacks just to cover
more scenarios. Some tests also started reporting real races
that we previously suppressed.

Also extend tests to cover fork syscall support.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101517

3 years ago[debugserver] Use add_lldb_library instead of add_library
Jonas Devlieghere [Fri, 30 Apr 2021 05:07:46 +0000 (22:07 -0700)]
[debugserver] Use add_lldb_library instead of add_library

Use add_lldb_library to ensure debugserver inherits the defines set by
llvm and lldb.

Differential revision: https://reviews.llvm.org/D101596

3 years ago[msan] Add static to some msan allocator functions
Jianzhou Zhao [Thu, 29 Apr 2021 23:10:19 +0000 (23:10 +0000)]
[msan] Add static to some msan allocator functions

This helps with reviewing the refactoring of the allocator code,
making it easy to see which are the real public interfaces.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101586

3 years agoPre-commit test for PPC vector extraction test
Qiu Chaofan [Fri, 30 Apr 2021 04:02:37 +0000 (12:02 +0800)]
Pre-commit test for PPC vector extraction test

3 years ago[InlineCost] Remove visitUnaryInstruction()
Arthur Eubanks [Thu, 29 Apr 2021 21:19:14 +0000 (14:19 -0700)]
[InlineCost] Remove visitUnaryInstruction()

The simplifyInstruction() in visitUnaryInstruction() does not trigger
for all of check-llvm. Looking at all delegates to UnaryInstruction in
InstVisitor, the only instructions that either don't have a visitor in
CallAnalyzer, or redirect to UnaryInstruction, are VAArgInst and Alloca.
VAArgInst will never get simplified, and visitUnaryInstruction(Alloca)
would always return false anyway.

Reviewed By: mtrofin, lebedev.ri

Differential Revision: https://reviews.llvm.org/D101577

3 years ago[AMDGPU] Skip promote-alloca for insertelement/insertvalue users
Christudasan Devadasan [Thu, 29 Apr 2021 18:52:45 +0000 (00:22 +0530)]
[AMDGPU] Skip promote-alloca for insertelement/insertvalue users

It is difficult to track the users of vector and aggregate types.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D101562

3 years ago[RISCV] Fix StackOffset calculation when using sp to access the fixed stack object...
luxufan [Mon, 12 Apr 2021 05:28:00 +0000 (13:28 +0800)]
[RISCV] Fix StackOffset calculation when using sp to access the fixed stack object in the case of rvv vector objects existed

When RVV vector objects exist, using sp to access a fixed stack object has to skip past the RVV vector objects area. So the StackOffset needs an additional scalable offset equal to the size of the RVV vector objects area.

Differential Revision: https://reviews.llvm.org/D100286

3 years ago[RISCV] Precommit a test case that test accessing a fixed object when has rvv vector...
luxufan [Mon, 12 Apr 2021 05:28:00 +0000 (13:28 +0800)]
[RISCV] Precommit a test case that test accessing a fixed object when has rvv vector object existed

Differential Revision: https://reviews.llvm.org/D100284

3 years ago[CMake][compiler-rt] avoid conflict with builtin check_linker_flag
Steven Wu [Thu, 29 Apr 2021 22:48:46 +0000 (15:48 -0700)]
[CMake][compiler-rt] avoid conflict with builtin check_linker_flag

Rename `check_linker_flag` in compiler_rt to avoid the conflict. This
follows up on the fix in D100901.

Patched by radford.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D101581

3 years ago[MS] Preserve base register %rbx around cpuid
Wang, Pengfei [Fri, 30 Apr 2021 01:34:06 +0000 (09:34 +0800)]
[MS] Preserve base register %rbx around cpuid

This patch copies the implementation from cpuid.h, which preserves the base register %rbx around cpuid. It fixes PR50133.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101338

3 years ago[AArch64][GlobalISel] Fix width value for G_SBFX/G_UBFX
Brendon Cahoon [Thu, 29 Apr 2021 15:03:24 +0000 (11:03 -0400)]
[AArch64][GlobalISel] Fix width value for G_SBFX/G_UBFX

When creating G_SBFX/G_UBFX opcodes, the last operand is the
width instead of the bit position. The bit position is used
for the AArch64 SBFM and UBFM instructions. The bit position
is converted to a width if the SBFX/UBFX aliases are generated.
For other SBFM/UBFM aliases, such as shifts, the bit position
is used.
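
The relationship between the two encodings can be sketched as follows. The struct and function names are hypothetical, and the sketch relies on the architectural alias UBFX/SBFX Rd, Rn, #lsb, #width == UBFM/SBFM Rd, Rn, #lsb, #(lsb + width - 1).

```cpp
#include <cstdint>

// Immediate operands of the underlying UBFM/SBFM instruction.
struct BitfieldMoveOperands {
  unsigned Immr; // for the extract aliases: the lsb
  unsigned Imms; // for the extract aliases: the position of the last bit
};

// Converts the (lsb, width) form used by G_UBFX/G_SBFX into the
// (immr, imms) form used by UBFM/SBFM.
BitfieldMoveOperands extractToBitfieldMove(unsigned Lsb, unsigned Width) {
  return {Lsb, Lsb + Width - 1};
}

// Converts back from the bit position to a width.
unsigned widthFromBitfieldMove(unsigned Immr, unsigned Imms) {
  return Imms - Immr + 1;
}

// Reference semantics of an unsigned bitfield extract on a 64-bit value.
uint64_t ubfx(uint64_t Value, unsigned Lsb, unsigned Width) {
  const uint64_t Mask =
      Width >= 64 ? ~uint64_t{0} : ((uint64_t{1} << Width) - 1);
  return (Value >> Lsb) & Mask;
}
```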

Differential Revision: https://reviews.llvm.org/D101543

3 years agoVirtRegMap: Support partially allocated virtual registers
Matt Arsenault [Thu, 25 Oct 2018 21:47:57 +0000 (14:47 -0700)]
VirtRegMap: Support partially allocated virtual registers

Don't assert if there are unassigned virtual registers.  Maintain
LiveIntervals by removing the RegUnits for allocated registers, since
they should no longer be necessary.

One part I find somewhat questionable is the special handling
necessary for handleIdentityCopy. The LiveIntervals for the relevant
regunits needs to be removed.

3 years ago[lldb-vscode] Follow up of D99989 - store some strings more safely
Walter Erquinigo [Tue, 27 Apr 2021 23:02:38 +0000 (16:02 -0700)]
[lldb-vscode] Follow up of D99989 - store some strings more safely

As a follow up of https://reviews.llvm.org/D99989#inline-953343, I'm now
storing std::string instead of char *. I know it might never break as char *,
but if it does, chasing that bug might be daunting.
Besides, I'm also checking whether the strings obtained through the SB API are
null.

3 years agoVirtRegMap: Add pass option to not clear virt regs
Matt Arsenault [Thu, 25 Oct 2018 21:45:55 +0000 (14:45 -0700)]
VirtRegMap: Add pass option to not clear virt regs

In a future change it will be possible to run register
allocation with a specific set of register classes,
so some of the remaining virtual registers will still
be meaningful.

3 years agoAMDGPU: Add missing runline to test
Matt Arsenault [Wed, 28 Apr 2021 23:36:09 +0000 (19:36 -0400)]
AMDGPU: Add missing runline to test

There are checks for gfx908, but this wasn't actually running with it.

3 years ago[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn
Carl Ritson [Fri, 30 Apr 2021 00:01:10 +0000 (09:01 +0900)]
[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn

Refactor IsHazardFn and IsExpiredFn to use constant references as these should not be mutating the instructions visited and the instruction can never be null.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D101430

3 years ago[AMDGPU] Remove dead early-out in GCNHazardRecognizer
Carl Ritson [Thu, 29 Apr 2021 23:55:42 +0000 (08:55 +0900)]
[AMDGPU] Remove dead early-out in GCNHazardRecognizer

Remove an early-out in wait state counting which can never be
taken.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D101520

3 years ago[Sema] Don't set BlockDecl's DoesNotEscape bit if the parameter type of
Akira Hatanaka [Thu, 22 Apr 2021 16:48:54 +0000 (09:48 -0700)]
[Sema] Don't set BlockDecl's DoesNotEscape bit if the parameter type of
the function the block is passed to isn't a block pointer type

This patch fixes a bug where a block passed to a function taking a
parameter that doesn't have a block pointer type (e.g., id or reference
to a block pointer) was marked as noescape.

This partially fixes PR50043.

rdar://77030453

Differential Revision: https://reviews.llvm.org/D101097

3 years ago[msan] Remove dead function/fields
Jianzhou Zhao [Thu, 29 Apr 2021 18:47:04 +0000 (18:47 +0000)]
[msan] Remove dead function/fields

To see how to extract a shared allocator interface for D101204,
found some unused code. Tests passed. Are they safe to remove?

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D101559

3 years ago[ObjC][ARC] Don't enter the cleanup scope if the initializer expression
Akira Hatanaka [Tue, 27 Apr 2021 01:38:58 +0000 (18:38 -0700)]
[ObjC][ARC] Don't enter the cleanup scope if the initializer expression
isn't an ExprWithCleanups

This patch fixes a bug where a temporary ObjC pointer is released before
the end of the full expression.

This fixes PR50043.

rdar://77030453

Differential Revision: https://reviews.llvm.org/D101502

3 years agoReland "[lld-link] Enable addrsig table in COFF lto"
Zequan Wu [Thu, 29 Apr 2021 22:52:24 +0000 (15:52 -0700)]
Reland "[lld-link] Enable addrsig table in COFF lto"

This reverts commit a78fa73bcf986cf5912d665ecd9620535f480607.

The commit cab48e2f0e00648ef0494ce114f4e00a3ded330f fixes the issue on eabd55b1b2c5e322c3b36cb44348f178692890c8.

3 years ago[mlir][sparse] migrate sparse operations into new sparse tensor dialect
Aart Bik [Thu, 29 Apr 2021 21:31:18 +0000 (14:31 -0700)]
[mlir][sparse] migrate sparse operations into new sparse tensor dialect

This is the very first step toward removing the glue and clutter from linalg and
replacing it with proper sparse tensor types. This revision migrates the LinalgSparseOps
into SparseTensorOps of a sparse tensor dialect. This also provides a new home for
sparse tensor related transformation.

NOTE: the actual replacement with sparse tensor types (and removal of linalg glue/clutter)
will follow but I am trying to keep the amount of changes per revision manageable.

Differential Revision: https://reviews.llvm.org/D101573

3 years ago[CodeGen] don't emit addrsig symbol if it's used only by metadata
Zequan Wu [Thu, 29 Apr 2021 04:25:51 +0000 (21:25 -0700)]
[CodeGen] don't emit addrsig symbol if it's used only by metadata

Value only used by metadata can be removed from .addrsig table.
This solves the undefined symbol error when enabling addrsig table on COFF LTO.

Differential Revision: https://reviews.llvm.org/D101512

3 years ago[mlir][tosa] Remove constant-0 dim expr values from TOSA lowerings
Rob Suderman [Fri, 23 Apr 2021 00:40:35 +0000 (17:40 -0700)]
[mlir][tosa] Remove constant-0 dim expr values from TOSA lowerings

Constant-0 dim expr values should be avoided for linalg as it can prevent
fusion. This includes adding support for rank-0 reshapes.

Differential Revision: https://reviews.llvm.org/D101418

3 years ago[XCOFF] Handle the case when personality routine is an alias
jasonliu [Thu, 29 Apr 2021 20:39:43 +0000 (20:39 +0000)]
[XCOFF] Handle the case when personality routine is an alias

Summary:
Personality routine could be an alias to another personality routine.
Fix the situation when we compile the file that contains the personality
routine and the file also has functions that need to refer to the
personality routine.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D101401

3 years agoRecommit "[clang][driver] Use the provided arch name for a Darwin target triple
Alex Lorenz [Mon, 26 Apr 2021 21:56:56 +0000 (14:56 -0700)]
Recommit "[clang][driver] Use the provided arch name for a Darwin target triple

This ensures that the Darwin driver uses a consistent target triple
representation when the triple is printed out to the user.

This reverts the revert commit ab0df6c0346e515291a381467527621ab0ccf953.

Differential Revision: https://reviews.llvm.org/D100807

3 years ago[libcxx][ranges] Fix tests for stdlib types that conform to sized_sentinel_for.
zoecarver [Tue, 27 Apr 2021 16:03:52 +0000 (09:03 -0700)]
[libcxx][ranges] Fix tests for stdlib types that conform to sized_sentinel_for.

Differential Revision: https://reviews.llvm.org/D101371

3 years ago[GlobalISel][Legalizer] Bump up a smallvector size that was found to be too small...
Amara Emerson [Thu, 29 Apr 2021 21:35:02 +0000 (14:35 -0700)]
[GlobalISel][Legalizer] Bump up a smallvector size that was found to be too small. NFC.

3 years ago[CMake] Stop using c++ subdirectory for libc++ on Win to ARM Linux cross builds. NFC
Vladimir Vereschaka [Thu, 29 Apr 2021 21:19:24 +0000 (14:19 -0700)]
[CMake] Stop using c++ subdirectory for libc++ on Win to ARM Linux cross builds. NFC

Updated the cross Win-x-ARM Linux toolchain cmake cache file in accordance with
the following changes: https://reviews.llvm.org/D100869

Stop using the c++ subdirectory for the libc++ library

3 years ago[ORC] JITDylib::addDependencies should be run under the session lock.
Lang Hames [Thu, 29 Apr 2021 20:47:44 +0000 (13:47 -0700)]
[ORC] JITDylib::addDependencies should be run under the session lock.

3 years agoRevert "[llvm-readobj] [ARMWinEH] Fix handling of relocations and symbol offsets"
Martin Storsjö [Thu, 29 Apr 2021 21:03:40 +0000 (00:03 +0300)]
Revert "[llvm-readobj] [ARMWinEH] Fix handling of relocations and symbol offsets"

This reverts commit 37789240882bfacd951767acdb4c088fcbf53385.

The added test fails on at least one buildbot, by printing a reversed
combination, printing "func3_xdata +0x18 (0x8)" while it's supposed to
be "func3_xdata +0x8 (0x18)", see e.g.
https://lab.llvm.org/buildbot/#/builders/107/builds/7269. Currently
no idea how that could happen, but reverting until it can be figured
out.

3 years ago[AArch64][GlobalISel] Simplify out of range rotate amount.
Amara Emerson [Mon, 26 Apr 2021 15:29:59 +0000 (08:29 -0700)]
[AArch64][GlobalISel] Simplify out of range rotate amount.

Differential Revision: https://reviews.llvm.org/D101005

3 years agoRevert "[mlir][sparse] migrate sparse operations into new sparse tensor dialect"
Mehdi Amini [Thu, 29 Apr 2021 20:59:41 +0000 (20:59 +0000)]
Revert "[mlir][sparse] migrate sparse operations into new sparse tensor dialect"

This reverts commit a6d92a971175d727873a9e7644913ee02d7232a8.

The build with -DBUILD_SHARED_LIBS=ON is broken.

3 years ago[llvm-readobj] [ARMWinEH] Fix handling of relocations and symbol offsets
Martin Storsjö [Mon, 12 Apr 2021 09:49:17 +0000 (12:49 +0300)]
[llvm-readobj] [ARMWinEH] Fix handling of relocations and symbol offsets

When looking up data referenced from pdata/xdata structures, the
referenced data can be found in two different ways:
- For an unrelocated object file, it's located via a relocation
- For a relocated, linked image, the data is referenced with an
  (image relative) absolute address

For the latter case, the absolute address can optionally be
described with a symbol.

For the case of an object file, there are two offsets involved; one
immediate offset encoded in the data location that is modified by
the relocation, and a section offset in the symbol.

Previously, for the ExceptionRecord field, we printed the offset
from the symbol (only) but used the immediate offset ignoring
the symbol's address (using only the symbol's section) for printing
the exception data.

Add a helper method for doing the lookup and address calculation,
for simplifying the calling code and making all the cases consistent.

This addresses an existing FIXME comment, fixing printing of the
exception data for cases where relocations point at individual
symbols in the xdata section (which is what MSVC generates) instead of
all relocations pointing at the start of the xdata section (which is
what LLVM generates).

This also fixes printing of the function name for packed entries in
linked images.

Differential Revision: https://reviews.llvm.org/D100305

3 years ago[LLD] [COFF] Fix the mingw --export-all-symbols behaviour with comdat symbols
Martin Storsjö [Thu, 29 Apr 2021 08:57:33 +0000 (11:57 +0300)]
[LLD] [COFF] Fix the mingw --export-all-symbols behaviour with comdat symbols

When looking for the "all" symbols that are supposed to be exported,
we can't look at the live flag - the symbols we mark as to be
exported will become GC roots even if they aren't yet marked as live.

With this in place, building an LLVM library with BUILD_SHARED_LIBS
produces the same set of symbols exported regardless of whether the
--gc-sections flag is specified, both with and without being built
with -ffunction-sections.

Differential Revision: https://reviews.llvm.org/D101522

3 years ago[flang][OpenMP][FIX] Fix the worksharing nesting check with inclusion of more constru...
Arnamoy Bhattacharyya [Thu, 29 Apr 2021 20:12:28 +0000 (16:12 -0400)]
[flang][OpenMP][FIX] Fix the worksharing nesting check with inclusion of more constructs to cover combined constructs.

3 years agoRevert "Generalize getInvertibleOperand recurrence handling slightly"
Philip Reames [Thu, 29 Apr 2021 20:06:26 +0000 (13:06 -0700)]
Revert "Generalize getInvertibleOperand recurrence handling slightly"

This reverts commit 0c01b37eeb18a51a7e9c9153330d8009de0f600e while a problem reported is investigated.

3 years ago[mlir] Fix lowering of multi-dimensional vector log1p to LLVM
Benjamin Kramer [Thu, 29 Apr 2021 13:26:10 +0000 (15:26 +0200)]
[mlir] Fix lowering of multi-dimensional vector log1p to LLVM

This was using the untransformed operand, leading to invalid IR.

Differential Revision: https://reviews.llvm.org/D101531

3 years ago[AMDGPU] Fix v_swap_b32 formation on physical registers
Jay Foad [Thu, 29 Apr 2021 16:48:54 +0000 (17:48 +0100)]
[AMDGPU] Fix v_swap_b32 formation on physical registers

As explained in the comments, matchSwap matches:

// mov t, x
// mov x, y
// mov y, t

and turns it into:

// mov t, x (t is potentially dead and move eliminated)
// v_swap_b32 x, y

On physical registers we don't have full use-def chains so the check
for T being live-out was not working properly with subregs/superregs.

Differential Revision: https://reviews.llvm.org/D101546

3 years ago[COST] Improve shuffle kind detection if shuffle mask is provided.
Alexey Bataev [Thu, 29 Apr 2021 19:46:59 +0000 (12:46 -0700)]
[COST] Improve shuffle kind detection if shuffle mask is provided.

Added an extra analysis for better choosing of shuffle kind in
getShuffleCost functions for better cost estimation if mask was
provided.

Differential Revision: https://reviews.llvm.org/D100865

3 years agoRevert "[COST] Improve shuffle kind detection if shuffle mask is provided."
Alexey Bataev [Thu, 29 Apr 2021 19:39:48 +0000 (12:39 -0700)]
Revert "[COST] Improve shuffle kind detection if shuffle mask is provided."

This reverts commit 92399322217917e67c0d72a55ec51ddc82251cf6 to fix
a compiler crash on mask checks.

3 years ago[lld-macho] Remove stray file
Jez Ng [Thu, 29 Apr 2021 19:32:43 +0000 (15:32 -0400)]
[lld-macho] Remove stray file

3 years agoBasic block sections for functions with implicit-section-name attribute
Sriraman Tallam [Thu, 29 Apr 2021 18:48:11 +0000 (11:48 -0700)]
Basic block sections for functions with implicit-section-name attribute

Functions can have section names set via #pragma or section attributes,
basic block sections should be correctly named for such functions.

With #pragma, the expectation is that all functions in that file are placed
in the same section in the final binary. Basic block sections should be
correctly named with the unique flag set so that the final binary has all the
basic blocks of the function in that named section. This patch fixes the bug
by calling getExplicitSectionGlobal when the implicit-section-name attribute is set
to make sure the function's basic blocks get the correct section name.

Differential Revision: https://reviews.llvm.org/D101311

3 years ago[lld-macho][nfc] Clean up header.s test
Jez Ng [Thu, 29 Apr 2021 02:45:03 +0000 (22:45 -0400)]
[lld-macho][nfc] Clean up header.s test

I don't think it's super worthwhile to test the dylib headers outputs of
all the different archs when x86_64 is the only one that has interesting
behavior.

Motivated by my upcoming addition of arm32...

3 years ago[lld-macho] Make everything PIE by default
Jez Ng [Thu, 29 Apr 2021 19:09:01 +0000 (15:09 -0400)]
[lld-macho] Make everything PIE by default

Modern versions of macOS (>= 10.7) and in general all modern Mach-O
target archs want PIEs by default. ld64 defaults to PIE for iOS >= 4.3,
as well as for all versions of watchOS and simulators. Basically all the
platforms LLD is likely to target want PIE. So instead of cluttering LLD's
code with legacy version checks, I think it's simpler to just default to
PIE for everything.

Note that `-no_pie` still works, so users can still opt out of it.

Reviewed By: #lld-macho, thakis, MaskRay

Differential Revision: https://reviews.llvm.org/D101513

3 years ago[mlir][sparse] migrate sparse operations into new sparse tensor dialect
Aart Bik [Thu, 29 Apr 2021 01:15:11 +0000 (18:15 -0700)]
[mlir][sparse] migrate sparse operations into new sparse tensor dialect

This is the very first step toward removing the glue and clutter from linalg and
replacing it with proper sparse tensor types. This revision migrates the LinalgSparseOps
into SparseTensorOps of a sparse tensor dialect. This also provides a new home for
sparse tensor related transformation.

NOTE: the actual replacement with sparse tensor types (and removal of linalg glue/clutter)
will follow but I am trying to keep the amount of changes per revision manageable.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D101488

3 years ago[InstCombine] narrow popcount with zext operand
Sanjay Patel [Thu, 29 Apr 2021 18:02:50 +0000 (14:02 -0400)]
[InstCombine] narrow popcount with zext operand

https://llvm.org/PR50141
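
The identity behind the narrowing is that zero-extension only adds zero bits, so counting bits before or after the extension gives the same result. A scalar C++ sketch with hypothetical helper names, checking this for every 8-bit value:

```cpp
#include <cstdint>

// Population count of the value after zero-extending it to 32 bits.
unsigned popcountWide(uint8_t X) {
  const uint32_t W = X; // zero-extension
  unsigned N = 0;
  for (int I = 0; I < 32; ++I)
    N += (W >> I) & 1u;
  return N;
}

// Population count computed on the narrow 8-bit value only.
unsigned popcountNarrow(uint8_t X) {
  unsigned N = 0;
  for (int I = 0; I < 8; ++I)
    N += (static_cast<unsigned>(X) >> I) & 1u;
  return N;
}

// Returns true if both forms agree for every possible 8-bit input.
bool popcountFormsAgree() {
  for (unsigned V = 0; V < 256; ++V)
    if (popcountWide(static_cast<uint8_t>(V)) !=
        popcountNarrow(static_cast<uint8_t>(V)))
      return false;
  return true;
}
```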

3 years ago[InstCombine] add tests for popcount with zext operand; NFC
Sanjay Patel [Thu, 29 Apr 2021 17:46:10 +0000 (13:46 -0400)]
[InstCombine] add tests for popcount with zext operand; NFC

PR50141

3 years agoRevert "RegAlloc: do not consider liveins to EH-pad successors as liveout."
Tim Northover [Thu, 29 Apr 2021 18:58:51 +0000 (19:58 +0100)]
Revert "RegAlloc: do not consider liveins to EH-pad successors as liveout."

Some liveins *can* come from this block (e.g. any SSA value except the call),
it's only the ones that produce `landingpad` values that can't and I didn't
think it through properly.

3 years agoAMDGPU/GlobalISel: Fix selection of image intrinsics with unused return
Petar Avramovic [Wed, 28 Apr 2021 11:11:44 +0000 (13:11 +0200)]
AMDGPU/GlobalISel: Fix selection of image intrinsics with unused return

When atomic image intrinsic return value is unused, register class for
destination of a sub-register copy of return value ends up not being set.
This copy then hits 'Register class not set' assert later.
If return value has uses, register class is determined by use instruction.
Fix is to not create sub-register copy when image intrinsic destination has
no uses because it would be deleted by dead-mi-elimination later anyway.

Differential Revision: https://reviews.llvm.org/D101448

3 years ago[ASan] Rename `-fsanitize-address-destructor-kind=` to drop the `-kind` suffix.
Dan Liew [Wed, 28 Apr 2021 21:22:08 +0000 (14:22 -0700)]
[ASan] Rename `-fsanitize-address-destructor-kind=` to drop the `-kind` suffix.

Renaming the option is based on discussions in https://reviews.llvm.org/D101122.

It is normally not a good idea to rename driver flags but this flag is
new enough and obscure enough that it is very unlikely to have adopters.

While we're here also drop the `<kind>` metavar. It's not necessary and
is actually inconsistent with the documentation in
`clang/docs/ClangCommandLineReference.rst`.

Differential Revision: https://reviews.llvm.org/D101491

3 years agoRegAlloc: do not consider liveins to EH-pad successors as liveout.
Tim Northover [Thu, 29 Apr 2021 11:31:22 +0000 (12:31 +0100)]
RegAlloc: do not consider liveins to EH-pad successors as liveout.

These registers get defined by the runtime, not the block being allocated, and
treating them as preassigned in RegAllocFast adds extra pressure, sometimes
enough to make the function unallocatable.

3 years ago[AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects
Victor Huang [Wed, 28 Apr 2021 18:57:16 +0000 (13:57 -0500)]
[AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects

- Add new variantKinds for the symbol's variable offset and region handle
- Print the proper relocation specifier @gd in the asm streamer when emitting
  the TC Entry for the variable offset for the symbol
- Fix the switch section failure between the TC Entry of variable offset and
  region handle
- Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property

Reviewed by: sfertile

Differential Revision: https://reviews.llvm.org/D100956

3 years ago[SimplifyCFG] Common code sinking: fix application of profitability check
Roman Lebedev [Thu, 29 Apr 2021 16:20:06 +0000 (19:20 +0300)]
[SimplifyCFG] Common code sinking: fix application of profitability check

The profitability check is: we don't want to create more than a single PHI
per instruction sunk. We need to create the PHI unless we'll sink
all of its would-be incoming values.

But there is a caveat there.
This profitability check doesn't converge on the first iteration!
If we first decide that we want to sink 10 instructions,
but then determine that the 5th one is unprofitable to sink,
that may result in us not sinking some instructions, which in turn
can make some other instruction that we had determined
to be profitable to sink become unprofitable.

So we need to iterate until we converge, as in determine
that all leftover instructions are profitable to sink.

But, the direct approach of just re-iterating seems dumb,
because in the worst case we'd find that the last instruction
is unprofitable, which would result in revisiting instructions
many many times.

Instead, I think we can get away with just two passes - forward and backward.
However then it isn't obvious what is the most performant way to update
InstructionsToSink.

3 years ago[lld][WebAssembly] Add `--export-if-defined`
Sam Clegg [Mon, 5 Apr 2021 15:00:30 +0000 (08:00 -0700)]
[lld][WebAssembly] Add `--export-if-defined`

Unlike the existing `--export` option this will not cause errors
or warnings if the specified symbol is not defined.

See: https://github.com/emscripten-core/emscripten/issues/13736

Differential Revision: https://reviews.llvm.org/D99887

3 years ago[libc++] Fixes std::to_chars for bases != 10.
Mark de Wever [Sat, 27 Feb 2021 15:52:39 +0000 (16:52 +0100)]
[libc++] Fixes std::to_chars for bases != 10.

While working on D70631, Microsoft's unit tests discovered an issue.
Our `std::to_chars` implementation for bases != 10 uses the range
`[first,last)` as a temporary buffer. This violates the contract for
to_chars:
[charconv.to.chars]/1 http://eel.is/c++draft/charconv#to.chars-1
`to_chars_result to_chars(char* first, char* last, see below value, int base = 10);`
"If the member ec of the return value is such that the value is equal to
the value of a value-initialized errc, the conversion was successful and
the member ptr is the one-past-the-end pointer of the characters
written."

Our implementation modifies the range `[member ptr, last)`, which causes
Microsoft's test to fail. Their test verifies the buffer
`[member ptr, last)` is unchanged. (The test is only done when the
conversion is successful.)

While looking at the code I noticed the performance for bases != 10 also
is suboptimal. This is tracked in D97705.

This patch fixes the issue and adds a benchmark. This benchmark will be
used as baseline for D97705.
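
The property that Microsoft's test checks can be sketched as follows. This is a
minimal illustration, not the actual test: the helper names `tail_untouched` and
`digits_in_base`, the sentinel character, and the 64-byte buffer are assumptions
made for the example.

```cpp
#include <array>
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper (not part of libc++ or Microsoft's test suite):
// converts `value` in `base` into a buffer pre-filled with a sentinel and
// reports whether every byte in [result.ptr, last) still holds the sentinel.
inline bool tail_untouched(unsigned value, int base) {
    constexpr char sentinel = '#';  // never produced by to_chars
    std::array<char, 64> buf;
    buf.fill(sentinel);

    char* first = buf.data();
    char* last = buf.data() + buf.size();
    const std::to_chars_result r = std::to_chars(first, last, value, base);
    if (r.ec != std::errc{})
        return false;  // only successful conversions are checked

    for (const char* p = r.ptr; p != last; ++p)
        if (*p != sentinel)
            return false;  // the implementation used the tail as scratch space
    return true;
}

// Hypothetical helper: returns the digits written for `value` in `base`,
// or an empty string if the conversion failed.
inline std::string digits_in_base(unsigned value, int base) {
    std::array<char, 64> buf{};
    const std::to_chars_result r =
        std::to_chars(buf.data(), buf.data() + buf.size(), value, base);
    if (r.ec != std::errc{})
        return std::string();
    return std::string(buf.data(), r.ptr);
}
```

With an implementation that writes only `[first, member ptr)`,
`tail_untouched(255u, 16)` is expected to return true.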

Reviewed By: #libc, Quuxplusone, zoecarver

Differential Revision: https://reviews.llvm.org/D100722

3 years ago[CMake] Set correct CXX_FLAGS for relative-vtables variants
Petr Hosek [Thu, 29 Apr 2021 17:18:02 +0000 (10:18 -0700)]
[CMake] Set correct CXX_FLAGS for relative-vtables variants

We override CXX_FLAGS to enable relative vtables, but doing so
overwrites generic Fuchsia CXX_FLAGS leading to a build failure
on Windows.

Differential Revision: https://reviews.llvm.org/D101551

3 years ago[lldb] Make the NSSet formatter faster and less prone to infinite recursion
Raphael Isemann [Thu, 29 Apr 2021 17:13:21 +0000 (19:13 +0200)]
[lldb] Make the NSSet formatter faster and less prone to infinite recursion

Right now, to get the `NSSet *` pointer value we first dereference it and then take
the address of the result.

Besides being inefficient, this can potentially cause infinite recursion if the
`pointer` value we get is a pointer of a type that the TypeSystem can't
dereference. If the pointer is, for example, some form of `void *` that the dynamic
type resolution can't resolve to an actual type, then the `Dereference` call goes
back to asking the formatters how to dereference it. If the NSSet formatter then
checks whether it's an NSSet variation under the hood, we end up recursing
infinitely.

In practice this seems to happen with some form of Builtin.RawPointer we get
from an NSDictionary in Swift.

FWIW, no other formatter is doing the same deref->addressOf as here and there
doesn't seem to be any specific reason to do so in the git history (it's just
part of the initial formatter commit).

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D101537

3 years ago[gn build] Port df323ba445f7
LLVM GN Syncbot [Thu, 29 Apr 2021 16:59:58 +0000 (16:59 +0000)]
[gn build] Port df323ba445f7

3 years agoRevert "[X86] Support AMX fast register allocation"
Benjamin Kramer [Thu, 29 Apr 2021 16:51:34 +0000 (18:51 +0200)]
Revert "[X86] Support AMX fast register allocation"

This reverts commit 3b8ec86fd576b9808dc63da620d9a4f7bbe04372.

Revert "[X86] Refine AMX fast register allocation"

This reverts commit c3f95e9197643b699b891ca416ce7d72cf89f5fc.

This pass breaks using LLVM in a multi-threaded environment by
introducing global state.

3 years agoRevert "[scudo] Use require_constant_initialization"
Vitaly Buka [Thu, 29 Apr 2021 16:55:28 +0000 (09:55 -0700)]
Revert "[scudo] Use require_constant_initialization"

This reverts commit 7ad4dee3e733d820115f44cecce73ceb64c76450.

3 years ago[ConstantFolding] propagate poison through vector reduction intrinsics
Sanjay Patel [Thu, 29 Apr 2021 16:53:26 +0000 (12:53 -0400)]
[ConstantFolding] propagate poison through vector reduction intrinsics

3 years ago[libcxx] [test] Include more libraries that normally are linked automatically
Martin Storsjö [Wed, 28 Apr 2021 08:08:12 +0000 (11:08 +0300)]
[libcxx] [test] Include more libraries that normally are linked automatically

As the libcxx tests link with -nostdlib, libraries that normally
are added by default by the compiler driver have to be added
manually.

The "oldnames" library is automatically added when driving linking
with clang-cl. When linking with the plain clang driver, as the
libcxx tests do, the clang driver does the same (but only since Clang
12.0). But when linking with -nostdlib, like the libcxx tests do,
the driver defaults aren't added at all, and we need to specify the
defaults manually.

This allows removing a TODO from the Windows CI setup; it turns out
that upgrading to Clang 12.0 didn't help here as expected, sorry about
that mixup.

Differential Revision: https://reviews.llvm.org/D101434

3 years ago[scudo] Use require_constant_initialization
Vitaly Buka [Thu, 29 Apr 2021 08:19:51 +0000 (01:19 -0700)]
[scudo] Use require_constant_initialization

The attribute guarantees safe static initialization of globals.
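
A minimal sketch of how the Clang attribute is typically applied, assuming a
hypothetical `Counter` type and `REQUIRE_CONSTANT_INIT` macro (neither is taken
from scudo). On compilers that do not know the attribute, the macro expands to
nothing.

```cpp
#include <cassert>

// Hypothetical wrapper macro: use the Clang attribute when the compiler
// reports support for it, otherwise expand to nothing.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::require_constant_initialization)
#    define REQUIRE_CONSTANT_INIT [[clang::require_constant_initialization]]
#  endif
#endif
#ifndef REQUIRE_CONSTANT_INIT
#  define REQUIRE_CONSTANT_INIT
#endif

// Hypothetical type with a constexpr constructor, so a global of this type
// can be constant-initialized (no dynamic initializer runs at startup).
struct Counter {
    constexpr Counter() : value(0) {}
    int value;
};

// With the attribute, Clang rejects this declaration at compile time if the
// initializer would require dynamic initialization.
REQUIRE_CONSTANT_INIT Counter g_counter;

// Increments the global counter and returns the new value.
inline int bump() {
    assert(g_counter.value >= 0);  // the counter starts at 0 and only grows
    return ++g_counter.value;
}
```

If the `constexpr` were dropped from the constructor, Clang would be expected to
diagnose the declaration of `g_counter` instead of silently emitting a dynamic
initializer.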

Differential Revision: https://reviews.llvm.org/D101514

3 years ago[RISCV] Teach DAG combine to fold (and (select_cc lhs, rhs, cc, -1, c), x) -> (select...
Craig Topper [Thu, 29 Apr 2021 16:39:21 +0000 (09:39 -0700)]
[RISCV] Teach DAG combine to fold (and (select_cc lhs, rhs, cc, -1, c), x) -> (select_cc lhs, rhs, cc, x, (and, x, c))

Similar for or/xor with 0 in place of -1.

This is the canonical form produced by InstCombine for something like `c ? x & y : x;`. Since we have to use control flow to expand select, we'll usually end up with a mv in a basic block. By folding this we may be able to pull the and/or/xor into the block instead and avoid a mv instruction.
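
The algebraic identity behind the fold can be checked with ordinary integer
code. This is only a sketch of the arithmetic on 32-bit values, not the
SelectionDAG implementation; all function names are made up for the example,
and `cond` stands in for the result of comparing lhs and rhs with cc.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

using u32 = std::uint32_t;

constexpr u32 kAllOnes = ~u32{0};  // the -1 constant in the pattern

// Before the fold: and(select(cond, -1, c), x).
constexpr u32 and_of_select(bool cond, u32 c, u32 x) { return (cond ? kAllOnes : c) & x; }
// After the fold: select(cond, x, and(x, c)).
constexpr u32 select_of_and(bool cond, u32 c, u32 x) { return cond ? x : (x & c); }

// The or/xor variants use 0 in place of -1.
constexpr u32 or_of_select(bool cond, u32 c, u32 x) { return (cond ? u32{0} : c) | x; }
constexpr u32 select_of_or(bool cond, u32 c, u32 x) { return cond ? x : (x | c); }
constexpr u32 xor_of_select(bool cond, u32 c, u32 x) { return (cond ? u32{0} : c) ^ x; }
constexpr u32 select_of_xor(bool cond, u32 c, u32 x) { return cond ? x : (x ^ c); }

// Returns true if both forms agree for this (c, x) pair under both conditions.
inline bool folds_agree(u32 c, u32 x) {
    for (bool cond : {false, true}) {
        if (and_of_select(cond, c, x) != select_of_and(cond, c, x)) return false;
        if (or_of_select(cond, c, x) != select_of_or(cond, c, x)) return false;
        if (xor_of_select(cond, c, x) != select_of_xor(cond, c, x)) return false;
    }
    return true;
}

static_assert(and_of_select(true, 0x0F, 0xABCD) == 0xABCD, "and with -1 is the identity");
static_assert(or_of_select(true, 0x0F, 0xABCD) == 0xABCD, "or with 0 is the identity");
```

The fold works because -1 is the identity for and, and 0 is the identity for
or/xor, so the "true" arm of the select collapses to x itself.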

The code here is based on code from ARM that uses this to create predicated instructions. I'm doing it on SELECT_CC so it happens late, but we could do it on select earlier which is what ARM does. I'm not sure if we lose any combine opportunities if we do it earlier.

I left out add and sub because this can separate sext.w from the add/sub. It also made a conditional i64 addition/subtraction on RV32 worse. I guess both of those would be fixed by doing this earlier on select.

The select-binop-identity.ll test has not been committed yet, but I made the diff show the changes to it.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D101485

3 years ago[RISCV] Add test cases for D101485. NFC
Craig Topper [Thu, 29 Apr 2021 16:15:08 +0000 (09:15 -0700)]
[RISCV] Add test cases for D101485. NFC

3 years ago[COST] Improve shuffle kind detection if shuffle mask is provided.
Alexey Bataev [Tue, 20 Apr 2021 14:05:30 +0000 (07:05 -0700)]
[COST] Improve shuffle kind detection if shuffle mask is provided.

Added an extra analysis to better choose the shuffle kind in
getShuffleCost functions, for better cost estimation when a mask is
provided.

Differential Revision: https://reviews.llvm.org/D100865

3 years ago[unittest] Fix Frontend/OpenMPIRBuilderTest.cpp -Wsign-compare after D89671
Fangrui Song [Thu, 29 Apr 2021 16:37:58 +0000 (09:37 -0700)]
[unittest] Fix Frontend/OpenMPIRBuilderTest.cpp -Wsign-compare after D89671

3 years ago[DebugInfo] Add tests that we emit .eh_frame instead of .debug_frame
Fangrui Song [Thu, 29 Apr 2021 16:35:48 +0000 (09:35 -0700)]
[DebugInfo] Add tests that we emit .eh_frame instead of .debug_frame

Add tests which can catch the issue in 0ce723cb228bc1d1a0f5718f3862fb836145a333
(If any function needs CFISection::EH, the module should use CFISection::EH).

Reviewed By: echristo

Differential Revision: https://reviews.llvm.org/D101339

3 years ago[ConstProp] add tests for vector reductions of poison; NFC
Sanjay Patel [Thu, 29 Apr 2021 16:20:59 +0000 (12:20 -0400)]
[ConstProp] add tests for vector reductions of poison; NFC

3 years ago[ConstantFolding] refactor helper for vector reductions; NFC
Sanjay Patel [Thu, 29 Apr 2021 16:07:51 +0000 (12:07 -0400)]
[ConstantFolding] refactor helper for vector reductions; NFC

We should handle other cases (undef/poison), so reduce
the duplication of repeated switches.

3 years ago[ADT] fix typo in code block comment; NFC
Sanjay Patel [Thu, 29 Apr 2021 14:43:11 +0000 (10:43 -0400)]
[ADT] fix typo in code block comment; NFC

3 years ago[AsmParser][SystemZ][z/OS] Reject "Dot" as current PC on z/OS
Anirudh Prasad [Thu, 29 Apr 2021 15:27:56 +0000 (11:27 -0400)]
[AsmParser][SystemZ][z/OS] Reject "Dot" as current PC on z/OS

- Currently, the "." (Dot) character, when not identifying an Identifier or a Constant, refers to the current PC (Program Counter)
- However, in z/OS, for the HLASM dialect, it strictly accepts only the "*" as the current PC (Support for this will be put up in a follow-up patch)
- The changes in this patch allow individual platforms to choose whether they would like to use the "." (Dot) character as a marker for the current PC or not.
- It is achieved by introducing a new field in MCAsmInfo.h called `DotIsPC` (similar to `DollarIsPC`)

Reviewed By: abhina.sreeskantharajan

Differential Revision: https://reviews.llvm.org/D100975

3 years ago[ELF] Support .rela.eh_frame with unordered r_offset values
Fangrui Song [Thu, 29 Apr 2021 15:51:09 +0000 (08:51 -0700)]
[ELF] Support .rela.eh_frame with unordered r_offset values

GNU ld -r can create .rela.eh_frame with unordered r_offset values.
(With LLD, we can craft such a case by reordering sections in .eh_frame.)
This is currently unsupported and will trigger
`assert(pieces[i].inputOff <= off ...` in `OffsetGetter::get`
(the content is corrupted in a -DLLVM_ENABLE_ASSERTIONS=off build).
This patch supports this case.
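
A minimal sketch of why the ordering matters, assuming a hypothetical simplified
`Rela` record and a single forward pass over section pieces. Sorting a copy of
the relocations by r_offset is one way to restore the precondition; this is not
claimed to be the approach the patch takes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified relocation record (not LLD's real type).
struct Rela {
    std::uint64_t r_offset;
    std::uint32_t sym;
};

// Returns the relocations ordered by r_offset, sorting only when needed.
inline std::vector<Rela> sorted_by_offset(std::vector<Rela> rels) {
    auto by_offset = [](const Rela& a, const Rela& b) { return a.r_offset < b.r_offset; };
    if (!std::is_sorted(rels.begin(), rels.end(), by_offset))
        std::stable_sort(rels.begin(), rels.end(), by_offset);
    return rels;
}

// Counts the relocations in each piece [starts[k], starts[k+1]) using one
// forward pass. Precondition: `rels` is sorted by r_offset and `starts` is
// ascending; `end` is the end offset of the last piece.
inline std::vector<std::size_t> count_per_piece(const std::vector<Rela>& rels,
                                                const std::vector<std::uint64_t>& starts,
                                                std::uint64_t end) {
    std::vector<std::size_t> counts(starts.size(), 0);
    std::size_t i = 0;
    for (std::size_t k = 0; k < starts.size(); ++k) {
        const std::uint64_t hi = (k + 1 < starts.size()) ? starts[k + 1] : end;
        while (i < rels.size() && rels[i].r_offset < hi) {
            // This is the kind of check that fires on unordered input.
            assert(rels[i].r_offset >= starts[k]);
            ++counts[k];
            ++i;
        }
    }
    return counts;
}
```

Feeding the unsorted relocations straight into `count_per_piece` would trip the
assertion (or miscount in a build without assertions), which mirrors the failure
mode described above.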

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D101116

3 years ago[RISCV] Enable SPLAT_VECTOR for fixed vXi64 types on RV32.
Craig Topper [Thu, 29 Apr 2021 15:10:39 +0000 (08:10 -0700)]
[RISCV] Enable SPLAT_VECTOR for fixed vXi64 types on RV32.

This replaces D98479.

This allows type legalization to form SPLAT_VECTOR_PARTS so we don't
lose the splattedness when the scalar type is split.

I'm handling SPLAT_VECTOR_PARTS for fixed vectors separately so
we can continue using non-VL nodes for scalable vectors.

I limited to RV32+vXi64 because DAGCombiner::visitBUILD_VECTOR likes
to form SPLAT_VECTOR before seeing if it can replace the BUILD_VECTOR
with other operations. Especially interesting is a splat BUILD_VECTOR of
the extract_vector_elt which can become a splat shuffle, but won't if
we form SPLAT_VECTOR first. We either need to reorder visitBUILD_VECTOR
or add visitSPLAT_VECTOR.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D100803

3 years ago[RISCV] Teach computeKnownBits that vsetvli returns number less than 2^31.
Craig Topper [Thu, 29 Apr 2021 15:00:10 +0000 (08:00 -0700)]
[RISCV] Teach computeKnownBits that vsetvli returns number less than 2^31.

This seems like a reasonable upper bound on VL. WG discussions for
the V spec would probably allow us to use 2^16 as an upper bound
on VLEN, but this is good enough for now.

This allows us to remove sext and zext if the user happens to assign
the size_t result to an int and then uses it as a VL intrinsic
argument, which is size_t.
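
The reasoning can be sketched with plain integer arithmetic: if a 64-bit value
is known to be below 2^31, truncating it to 32 bits and then sign- or
zero-extending it back yields the same value, so the extension is redundant.
The helper names below are made up for the example and model the extensions
with unsigned arithmetic only.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// Models "truncate to 32 bits, then zero-extend to 64 bits".
constexpr u64 zext32(u64 v) { return v & 0xFFFFFFFFull; }

// Models "truncate to 32 bits, then sign-extend to 64 bits".
constexpr u64 sext32(u64 v) {
    const u64 low = v & 0xFFFFFFFFull;
    return (low & 0x80000000ull) ? (low | 0xFFFFFFFF00000000ull) : low;
}

// True if both extensions give back the original value, i.e. the sext/zext
// after the int round trip can be dropped.
constexpr bool extension_is_noop(u64 vl) {
    return sext32(vl) == vl && zext32(vl) == vl;
}

static_assert(extension_is_noop(0), "zero survives the round trip");
static_assert(extension_is_noop((u64{1} << 31) - 1), "largest value below 2^31 survives");
static_assert(!extension_is_noop(u64{1} << 31), "bit 31 set: sign extension changes the value");
```

Knowing that bits 31 and above are zero is exactly what makes both extensions
no-ops, which is the fact the known-bits analysis needs.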

Reviewed By: frasercrmck, rogfer01, arcbbb

Differential Revision: https://reviews.llvm.org/D101472

3 years agoRevert "[LV] Calculate max feasible scalable VF."
Sander de Smalen [Thu, 29 Apr 2021 14:37:57 +0000 (15:37 +0100)]
Revert "[LV] Calculate max feasible scalable VF."

Temporarily reverting this patch due to some unexpected issue found
by one of the PPC buildbots.

This reverts commit 584e9b6e4b4987b882719923e640eed854613d91.

3 years ago[AMDGPU] Add a v_swap_b32 test case to be fixed
Jay Foad [Thu, 29 Apr 2021 15:03:00 +0000 (16:03 +0100)]
[AMDGPU] Add a v_swap_b32 test case to be fixed

3 years ago[Clang][OpenMP] Frontend work for sections - D89671
Chirag Khandelwal [Thu, 29 Apr 2021 13:36:07 +0000 (19:06 +0530)]
[Clang][OpenMP] Frontend work for sections - D89671

This patch is a child of D89671 and contains the clang
implementation to use the OpenMP IRBuilder's section
construct.

Co-author: @anchu-rajendran

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D91054

3 years agoUnbreak no-asserts testing
David Zarzycki [Thu, 29 Apr 2021 14:01:37 +0000 (10:01 -0400)]
Unbreak no-asserts testing

3 years ago[OpenCL][Docs] Misc updates to C++ for OpenCL and offline compilation
Anastasia Stulova [Thu, 29 Apr 2021 13:08:21 +0000 (14:08 +0100)]
[OpenCL][Docs] Misc updates to C++ for OpenCL and offline compilation

Differential Revision: https://reviews.llvm.org/D101092

3 years ago[LLVM][OpenMP] Adding support for OpenMP sections construct in OpenMPIRBuilder
Chirag Khandelwal [Thu, 29 Apr 2021 13:08:24 +0000 (18:38 +0530)]
[LLVM][OpenMP] Adding support for OpenMP sections construct in OpenMPIRBuilder

This patch adds section support in the OpenMP IRBuilder module, along with a test for the same.

Reviewed By: fghanim

Differential Revision: https://reviews.llvm.org/D89671

3 years ago[OpenCL][Docs] Describe extension for legacy atomics with generic addr space.
Anastasia Stulova [Thu, 29 Apr 2021 13:02:29 +0000 (14:02 +0100)]
[OpenCL][Docs] Describe extension for legacy atomics with generic addr space.

This extension primarily targets the SPIR-V compilation flow
as the IR translation is the same between 1.x and 2.x atomics.

Differential Revision: https://reviews.llvm.org/D101089

3 years ago[VPlan] Add getVPSingleValue helper.
Florian Hahn [Thu, 29 Apr 2021 12:17:37 +0000 (13:17 +0100)]
[VPlan] Add getVPSingleValue helper.

As suggested in D99294, this adds a getVPSingleValue helper to use for
recipes that are guaranteed to define a single value. This replaces uses
of getVPValue() which used to default to I = 0.

3 years ago[flang][OpenMP] Add semantic checks for strict nesting inside `teams` construct.
Arnamoy Bhattacharyya [Thu, 29 Apr 2021 12:29:58 +0000 (08:29 -0400)]
[flang][OpenMP] Add semantic checks for strict nesting inside `teams` construct.

3 years ago[mlir] fix shared-lib build
Alex Zinenko [Thu, 29 Apr 2021 11:26:54 +0000 (13:26 +0200)]
[mlir] fix shared-lib build

3 years ago[AArch64][SVE] Use SIMD variant of INSR when scalar is the result of a vector extract
Bradley Smith [Fri, 23 Apr 2021 15:34:26 +0000 (16:34 +0100)]
[AArch64][SVE] Use SIMD variant of INSR when scalar is the result of a vector extract

At the intrinsic layer the sve.insr operation takes a scalar. When this
scalar is an integer we are forcing a data transition between GPRs and
ZPRs that is potentially costly.

Often the integer scalar is the result of a vector extract, when
performing a reduction for example. In such cases we should keep all
data within the ZPRs.

Co-authored-by: Paul Walker <paul.walker@arm.com>
Differential Revision: https://reviews.llvm.org/D101169

3 years ago[AArch64][SVE] Convert svdup(vec, SV_VL1, elm) to insertelement(vec, elm, 0)
Bradley Smith [Fri, 23 Apr 2021 12:55:42 +0000 (13:55 +0100)]
[AArch64][SVE] Convert svdup(vec, SV_VL1, elm) to insertelement(vec, elm, 0)

By converting the SVE intrinsic to a normal LLVM insertelement we give
the code generator a better chance to remove transitions between GPRs
and VPRs.

Co-authored-by: Paul Walker <paul.walker@arm.com>
Depends on D101302

Differential Revision: https://reviews.llvm.org/D101167