platform/upstream/llvm.git
2 years ago[AMDGPU] Add test for no waitcnt before issuing LDS DMA. NFC.
Stanislav Mekhanoshin [Mon, 16 May 2022 22:58:28 +0000 (15:58 -0700)]
[AMDGPU] Add test for no waitcnt before issuing LDS DMA. NFC.

A wait is only needed after the DMA before LDS can be read.

2 years ago[X86] Rename combineCONCAT_VECTORS\INSERT_SUBVECTOR\EXTRACT_SUBVECTOR to match Opcode...
Simon Pilgrim [Tue, 17 May 2022 17:15:30 +0000 (18:15 +0100)]
[X86] Rename combineCONCAT_VECTORS\INSERT_SUBVECTOR\EXTRACT_SUBVECTOR to match Opcode name. NFCI.

It's a lot easier to quickly search for the combine when it actually contains the name of the opcode it combines.

2 years ago[AMDGPU] Add intrinsics llvm.amdgcn.{raw|struct}.buffer.load.lds
Stanislav Mekhanoshin [Fri, 13 May 2022 20:31:38 +0000 (13:31 -0700)]
[AMDGPU] Add intrinsics llvm.amdgcn.{raw|struct}.buffer.load.lds

Differential Revision: https://reviews.llvm.org/D124884

2 years ago[mlir][LLVMIR] Use a new way to verify GEPOp indices
Min-Yih Hsu [Thu, 21 Apr 2022 00:46:39 +0000 (17:46 -0700)]
[mlir][LLVMIR] Use a new way to verify GEPOp indices

Previously, GEPOp relied on `findKnownStructIndices` to check if a GEP
index should be static. In fact, `findKnownStructIndices` can only
tell you that a GEP index _might_ be indexing into a struct (which should
use a static GEP index). But GEPOp::build and GEPOp::verify wrongly
treated this information as definitive, which created many false
alarms like the one depicted in
`test/Target/LLVMIR/Import/dynamic-gep-index.ll`.

The solution presented here adopts a new verification scheme: When we're
recursively checking the child element types of a struct type, instead
of checking every child type, we only check the one dictated by the
(static) GEP index value. We also fold the "refinement" logic --
refining/promoting a struct index mlir::Value into a constant -- into the
verification process itself, since the two have a lot of logic in common.
The resulting code is more concise and less brittle.
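
The scheme described above can be sketched on a toy type model. Everything
here is hypothetical and for illustration only: `Type`, `Index` and
`verifyIndices` are not the MLIR LLVM dialect API, and the model ignores
pointers and vectors. The point is that a struct level requires a static,
in-range index and that only the selected field is descended into.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified type model (not the MLIR LLVM dialect types).
struct Type {
  enum Kind { Scalar, Array, Struct };
  Kind K;
  std::vector<Type> Elems; // Array: one element type; Struct: one per field.
};

// std::nullopt models a dynamic (non-constant) index.
using Index = std::optional<unsigned>;

// Returns true if the index list is valid for T, starting at position Pos.
bool verifyIndices(const Type &T, const std::vector<Index> &Indices,
                   std::size_t Pos = 0) {
  if (Pos == Indices.size())
    return true; // All indices consumed.
  switch (T.K) {
  case Type::Scalar:
    return false; // Cannot index into a scalar.
  case Type::Array:
    // Any index (static or dynamic) is fine; there is one element type.
    return verifyIndices(T.Elems[0], Indices, Pos + 1);
  case Type::Struct: {
    const Index &I = Indices[Pos];
    if (!I || *I >= T.Elems.size())
      return false; // Struct indices must be static and in range.
    // Only the field selected by the static index is checked.
    return verifyIndices(T.Elems[*I], Indices, Pos + 1);
  }
  }
  return false;
}
```

With this shape, a dynamic index is rejected only when it actually lands on a
struct level, rather than whenever some sibling type happens to contain a
struct.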

We also hide GEPOp::findKnownStructIndices: since most of the
aforementioned logic is already encapsulated within GEPOp::build and
GEPOp::verify, we found little reason for findKnownStructIndices (or the
new findStructIndices) to be public.

Differential Revision: https://reviews.llvm.org/D124935

2 years agofix typo error in DivergenceAnalysis.h
Ruobing Han [Tue, 17 May 2022 16:54:36 +0000 (16:54 +0000)]
fix typo error in DivergenceAnalysis.h

Fix a typo error in the comment in DivergenceAnalysis.h

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D125808

2 years ago[AArch64] Teach perfect shuffles tables about D-lane movs
David Green [Tue, 17 May 2022 17:16:45 +0000 (18:16 +0100)]
[AArch64] Teach perfect shuffles tables about D-lane movs

Similar to D123386, this adds D-Movs to the AArch64 perfect shuffle
tables, lowering the costs a little more. This is a rough
improvement in general, especially if you ignore mov v0.16b, v2.16b type
moves that are often artefacts of the calling convention.

The D register movs are encoded as (0x4 | LaneIdx), and to generate a D
register move we are required to bitcast into a higher type, but it is
otherwise very similar to the S-lane movs already supported.

Differential Revision: https://reviews.llvm.org/D125477

2 years ago[Polly] Mark classes as final by default. NFC.
Michael Kruse [Tue, 17 May 2022 15:55:27 +0000 (10:55 -0500)]
[Polly] Mark classes as final by default. NFC.

This makes it obvious that a class was not intended to be derived from.

NPM analysis passes can unfortunately not be marked as final because they
are derived from an llvm::Checker<T> template internally by the NPM.

Also normalize the use of classes/structs
 * NPM passes are structs
 * Legacy passes are classes
 * structs that have methods and are not a visitor pattern are classes
 * structs have public inheritance by default, remove "public" keyword
 * Use typedef'ed type instead of inline forward declaration

2 years ago[LV] Regenerate check lines for some tests.
Florian Hahn [Tue, 17 May 2022 16:44:54 +0000 (17:44 +0100)]
[LV] Regenerate check lines for some tests.

Make sure the auto-generated check lines are up-to-date for some files,
to reduce the test diff in upcoming changes

2 years ago[clang-cl] Add /Zc:wchar_t- option
Pengxuan Zheng [Fri, 13 May 2022 02:50:18 +0000 (19:50 -0700)]
[clang-cl] Add /Zc:wchar_t- option

Map /Zc:wchar_t- to the cc1 flag -fno-wchar which is already supported.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D125513

2 years ago[AMDGPU] Add a MIR test for D125567
Jay Foad [Tue, 17 May 2022 15:50:08 +0000 (16:50 +0100)]
[AMDGPU] Add a MIR test for D125567

2 years ago[llvm][json] Fix UINT64 json parsing
Walter Erquinigo [Tue, 10 May 2022 15:16:32 +0000 (08:16 -0700)]
[llvm][json] Fix UINT64 json parsing

https://reviews.llvm.org/D109347 added support for UINT64 json numeric
types. However, it seems that it didn't properly test uint64_t numbers
larger than the int64_t maximum, because the number parsing logic doesn't
have any special handling for these large numbers.

This diff adds a handler for large numbers and, besides that, fixes the
parsing of signed types by checking for errno ERANGE, which is the
recommended way to check if parsing fails because of out-of-bounds
errors. Before this diff, strtoll was always returning a number within
the bounds of an int64_t, so the bounds check it was doing was completely
superfluous.

As an interesting fact about the old implementation, when calling strtoll
with "18446744073709551615", the largest uint64_t, End was S.end(), even
though the returned value did not reflect all the digits. This means that
this check can only be used to identify whether the numeric string is
malformed or not.
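
A minimal sketch of the checks described above, assuming hypothetical
helpers `parseInt64` and `parseUInt64` (these are not the llvm::json
functions): the end pointer only detects malformed input, while errno
ERANGE is what detects a value that does not fit.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helpers for illustration; not the llvm::json parser itself.
// Both require the whole string to be consumed and reject out-of-range input.
std::optional<int64_t> parseInt64(const std::string &S) {
  if (S.empty())
    return std::nullopt;
  char *End = nullptr;
  errno = 0;
  long long V = std::strtoll(S.c_str(), &End, 10);
  if (End != S.c_str() + S.size())
    return std::nullopt; // Malformed: trailing characters were left over.
  if (errno == ERANGE)
    return std::nullopt; // Out of range: strtoll clamped the result.
  return static_cast<int64_t>(V);
}

std::optional<uint64_t> parseUInt64(const std::string &S) {
  // strtoull accepts a leading '-' or whitespace, so insist on a digit first.
  if (S.empty() || S[0] < '0' || S[0] > '9')
    return std::nullopt;
  char *End = nullptr;
  errno = 0;
  unsigned long long V = std::strtoull(S.c_str(), &End, 10);
  if (End != S.c_str() + S.size())
    return std::nullopt; // Malformed input.
  if (errno == ERANGE)
    return std::nullopt; // Larger than the uint64_t maximum.
  return static_cast<uint64_t>(V);
}
```

A caller could try `parseInt64` first and fall back to `parseUInt64` only when
the signed parse reports out of range.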

This patch also adds additional tests for extreme cases.

Differential Revision: https://reviews.llvm.org/D125322

2 years ago[lldb-vscode] Fix data race in lldb-vscode when running with ThreadSanitizer
Walter Erquinigo [Tue, 17 May 2022 15:53:51 +0000 (08:53 -0700)]
[lldb-vscode] Fix data race in lldb-vscode when running with ThreadSanitizer

This patch fixes https://github.com/llvm/llvm-project/issues/54768. A ProgressEventReporter creates a dedicated thread that keeps checking whether there are new events that need to be sent to the IDE as long as m_thread_should_exit is false. When the VSCode instance is destructed, it sets m_thread_should_exit to true, which caused a data race because at the same time its ProgressEventReporter is reading this value to determine whether it should quit. This fix simply uses a mutex to ensure they cannot read and write this value at the same time.
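
The mutex-based fix can be sketched as follows. This is a hypothetical
stand-in, not the lldb-vscode class: `Reporter`, `ShouldExit` and `Polls` are
made-up names, and `ShouldExit` here is true once the worker should stop.
Every read and write of the shared state happens with the mutex held.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the reporter described above.
class Reporter {
public:
  Reporter() : Worker([this] { run(); }) {}

  ~Reporter() {
    {
      // Write the flag under the lock so the worker never reads it mid-write.
      std::lock_guard<std::mutex> Lock(Mutex);
      ShouldExit = true;
    }
    Worker.join();
  }

  // Number of polling iterations the worker has completed so far.
  unsigned polls() {
    std::lock_guard<std::mutex> Lock(Mutex);
    return Polls;
  }

private:
  bool shouldExit() {
    std::lock_guard<std::mutex> Lock(Mutex);
    return ShouldExit;
  }

  void run() {
    while (!shouldExit()) {
      {
        std::lock_guard<std::mutex> Lock(Mutex);
        ++Polls; // Stands in for "check for new events and report them".
      }
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
  }

  std::mutex Mutex;
  bool ShouldExit = false;
  unsigned Polls = 0;
  std::thread Worker; // Declared last so the members above exist before it starts.
};
```

A `std::atomic<bool>` would be another way to make the flag safe; the mutex is
used here only because that is what the commit message describes.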

Committed on behalf of PRESIDENT810

Reviewed By: clayborg, wallace

Differential Revision: https://reviews.llvm.org/D125073

2 years agoRevert "[llvm-objcopy][test] Add cmp after copy"
Keith Smiley [Tue, 17 May 2022 16:06:58 +0000 (09:06 -0700)]
Revert "[llvm-objcopy][test] Add cmp after copy"

This reverts commit 0d863b5b90a2f11e58b0b54d7183cb1577fd3a0b.

Broke a test https://reviews.llvm.org/D125478#3519509

2 years ago[OpaquePtr][BitcodeReader] Explicitly turn off opaque pointers if we see a typed...
Arthur Eubanks [Tue, 17 May 2022 00:49:59 +0000 (17:49 -0700)]
[OpaquePtr][BitcodeReader] Explicitly turn off opaque pointers if we see a typed pointer

Followup to D125735 on the bitcode reader side.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D125736

2 years ago[OpaquePtr][LLParser] Explicitly turn off opaque pointers if we see a star
Arthur Eubanks [Tue, 17 May 2022 00:01:09 +0000 (17:01 -0700)]
[OpaquePtr][LLParser] Explicitly turn off opaque pointers if we see a star

If we turn on --opaque-pointers, tests with '*' would use opaque pointers.

Can't really test this without flipping the default value for --opaque-pointers.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D125735

2 years agoFix the std::string formatter to report errors in the case where the
Jim Ingham [Tue, 17 May 2022 15:21:09 +0000 (08:21 -0700)]
Fix the std::string formatter to report errors in the case where the
string points to inaccessible memory.

The formatter tries to get the data field of the std::string, and to
check whether that fails it just checks that the ValueObjectSP
returned is not empty. But we never return empty ValueObjectSP's to
indicate failure, since doing so would lose the Error object that
tells you why fetching the ValueObject failed.

This patch adds a check for ValueObject::GetError().Success().

I also added a test case for this failure, and reworked the test case
a bit (to use run_to_source_breakpoint). I also renamed a couple of
single letter locals which don't follow the lldb coding conventions.

Differential Revision: https://reviews.llvm.org/D108228

2 years ago[gn build] Port 76ddbb1ca747
LLVM GN Syncbot [Tue, 17 May 2022 15:17:39 +0000 (15:17 +0000)]
[gn build] Port 76ddbb1ca747

2 years agoRevert "[clangd] Indexing of standard library"
Sam McCall [Tue, 17 May 2022 15:16:40 +0000 (17:16 +0200)]
Revert "[clangd] Indexing of standard library"

This reverts commit ecaa4d9662c9a6ac013ac40a8ad72a2c75e3fd3b.

2 years ago[InstCombine] remove cast-of-signbit to shift transform
Sanjay Patel [Tue, 17 May 2022 14:21:02 +0000 (10:21 -0400)]
[InstCombine] remove cast-of-signbit to shift transform

The transform was wrong in 3 ways:

1. It created an extra instruction when the source and dest types don't match.
2. It did not account for an extra use of the icmp, so could create 2 extra insts.
3. It favored bit hacks over icmp (icmp generally has better analysis).
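
To make point 3 concrete, here is a sketch of a sign-bit test written both
ways, assuming the pattern at issue is the "is not negative" test (the
message mentions "is negative" as the sibling pattern). The function names
are made up, and this is plain C++ rather than IR.

```cpp
#include <cstdint>

// "Is not negative" as a compare whose result is then zero-extended.
constexpr uint32_t viaCompare(int32_t X) { return X > -1 ? 1u : 0u; }

// The same value as a bit hack: invert, then shift the sign bit down.
constexpr uint32_t viaShift(int32_t X) {
  return (~static_cast<uint32_t>(X)) >> 31;
}

static_assert(viaCompare(0) == viaShift(0), "zero is not negative");
static_assert(viaCompare(7) == viaShift(7), "positive input");
static_assert(viaCompare(-7) == viaShift(-7), "negative input");
static_assert(viaCompare(INT32_MIN) == viaShift(INT32_MIN), "minimum value");
```

Both forms compute the same value; the compare form simply keeps the intent
visible to later analyses.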

This fixes #54692 (modeled by the PhaseOrdering tests).

This is a minimal step to fix the bug, but we should likely invert
this and the sibling transform for the "is negative" pattern too.

The backend should be able to invert this back to a shift if that
leads to better codegen.

This is a reduced try of 3794cc0e9964 - that was reverted because
it could cause infinite loops by conflicting with the related
transforms in this block that create shifts.

2 years ago[RISCV] Add a test showing incorrect RVV stack alignment
Fraser Cormack [Fri, 1 Oct 2021 11:45:43 +0000 (12:45 +0100)]
[RISCV] Add a test showing incorrect RVV stack alignment

The RISC-V stack is assumed to be aligned to 16 bytes and can handle stack
realignment for larger objects, but the "RVV stack" is only ensured to be
aligned to 8 bytes. This means that objects specified at a larger alignment may
be misaligned, not only for 16-byte-aligned RVV objects that don't trigger
realignment, but also for 32-byte-and-larger-aligned objects which do.

The new test checks a variety of alignment configurations, showing the
misaligned cases.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D110933

2 years ago[LV] Fetch vector loop region once and remember it (NFC).
Florian Hahn [Tue, 17 May 2022 14:57:23 +0000 (15:57 +0100)]
[LV] Fetch vector loop region once and remember it (NFC).

This avoids an unnecessary lookup and makes the code slightly more
compact.

2 years ago[gn build] Port ecaa4d9662c9
LLVM GN Syncbot [Tue, 17 May 2022 14:51:11 +0000 (14:51 +0000)]
[gn build] Port ecaa4d9662c9

2 years ago[clangd] Indexing of standard library
Sam McCall [Sun, 28 Nov 2021 23:09:41 +0000 (00:09 +0100)]
[clangd] Indexing of standard library

This provides a nice "warm start" with all headers indexed, not just
those included so far.

The standard library is indexed after a preamble is parsed, using that
file's configuration. The result is pushed into the dynamic index.
If we later see a higher language version, we reindex it.

It's configurable as Index.StandardLibrary, off by default for now.

Based on D105177 by @kuhnel

Fixes https://github.com/clangd/clangd/issues/618

Differential Revision: https://reviews.llvm.org/D115232

2 years ago[RISCV] Drop notion of "strict" vsetvli compatibility
Fraser Cormack [Tue, 17 May 2022 07:52:20 +0000 (08:52 +0100)]
[RISCV] Drop notion of "strict" vsetvli compatibility

With recent fixes to the dataflow in place, we now never pass
Strict=true to isCompatible, so remove the parameter completely.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D125748

2 years ago[NFC][AIX] Reenable mri1.test
Jake Egan [Tue, 17 May 2022 14:26:59 +0000 (10:26 -0400)]
[NFC][AIX] Reenable mri1.test

This test is passing now because of D124017 and D123949.

Reviewed By: DiggerLin

Differential Revision: https://reviews.llvm.org/D125772

2 years ago[IRBuilder] Move insertvalue/extractvalue to fold infrastructure
Nikita Popov [Tue, 17 May 2022 13:55:42 +0000 (15:55 +0200)]
[IRBuilder] Move insertvalue/extractvalue to fold infrastructure

Move from the old CreateXYZ() to the new FoldXYZ() mechanism.

This change is likely NFC in practice, because I don't think that
the places using InstSimplifyFolder use insertvalue/extractvalue.

2 years agoFix release note typo from 6da3d66f
Erich Keane [Tue, 17 May 2022 13:35:06 +0000 (06:35 -0700)]
Fix release note typo from 6da3d66f

2 years ago[mlir] vim: add bf16 type
Cullen Rhodes [Fri, 13 May 2022 15:11:25 +0000 (15:11 +0000)]
[mlir] vim: add bf16 type

2 years ago[mlir][licm] Fix debug output with newlines
Cullen Rhodes [Thu, 5 May 2022 14:25:44 +0000 (14:25 +0000)]
[mlir][licm] Fix debug output with newlines

2 years agoFix an unused variable warning in no-asserts build mode
Dmitri Gribenko [Tue, 17 May 2022 13:27:44 +0000 (15:27 +0200)]
Fix an unused variable warning in no-asserts build mode

2 years ago[concepts] Implement dcl.decl.general p4: No constraints on non-template funcs
Erich Keane [Mon, 16 May 2022 14:55:35 +0000 (07:55 -0700)]
[concepts] Implement dcl.decl.general p4: No constraints on non-template funcs

The standard says:
The optional requires-clause ([temp.pre]) in an init-declarator or
member-declarator shall be present only if the declarator declares a
templated function ([dcl.fct]).

This implements that limitation, and updates the tests to the best of my
ability to capture the intent of the original checks.

Differential Revision: https://reviews.llvm.org/D125711

2 years ago[pseudo] Add the missing ; terminal for module-declaration rule.
Haojian Wu [Tue, 17 May 2022 13:13:51 +0000 (15:13 +0200)]
[pseudo] Add the missing ; terminal for module-declaration rule.

2 years ago[SLP]Add an extra check for select minmax reduction to avoid crash.
Alexey Bataev [Tue, 17 May 2022 12:32:01 +0000 (05:32 -0700)]
[SLP]Add an extra check for select minmax reduction to avoid crash.

We need to check whether the reduction is still a (non-)cmp-select pattern
min/max reduction to avoid a compiler crash while building the list of
reduction operations. The cmp-select pattern provides 2 reduction
operations, while intrinsics provide just one.

2 years ago[pgo] Fix doc typo: thingswith -> things with
Konrad Kleine [Tue, 17 May 2022 10:48:03 +0000 (10:48 +0000)]
[pgo] Fix doc typo: thingswith -> things with

The title says it all.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D125763

2 years agoEnabling the detection of devtoolset-11 toolchain.
Kamau Bridgeman [Thu, 12 May 2022 20:02:00 +0000 (15:02 -0500)]
Enabling the detection of devtoolset-11 toolchain.

This patch allows systems to build the llvm-project with the devtoolset-11
toolchain.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D125499

2 years ago[DAG] Fold (shl (srl x, c), c) -> and(x, m) even if srl has other uses
Simon Pilgrim [Tue, 17 May 2022 12:40:03 +0000 (13:40 +0100)]
[DAG] Fold (shl (srl x, c), c) -> and(x, m) even if srl has other uses

If we're using shift pairs to mask, then relax the one use limit if the shift amounts are equal - we'll only be generating a single AND node.
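
The identity behind the fold can be checked with a small sketch, assuming
32-bit unsigned values and a shift amount below 32 (the function names are
made up for illustration):

```cpp
#include <cstdint>

// Shift right then left by the same amount: clears the low C bits.
constexpr uint32_t shiftPair(uint32_t X, unsigned C) { return (X >> C) << C; }

// The equivalent single AND with a mask that has the low C bits cleared.
constexpr uint32_t masked(uint32_t X, unsigned C) {
  return X & (~uint32_t{0} << C);
}

static_assert(shiftPair(0xDEADBEEFu, 8) == masked(0xDEADBEEFu, 8), "C = 8");
static_assert(shiftPair(0xDEADBEEFu, 0) == masked(0xDEADBEEFu, 0), "C = 0");
static_assert(shiftPair(0xDEADBEEFu, 31) == masked(0xDEADBEEFu, 31), "C = 31");
```

Because the replacement is a single AND on x, it does not matter whether the
srl result has other users.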

AArch64 has a couple of regressions due to this, so I've enforced the existing one use limit inside an AArch64TargetLowering::shouldFoldConstantShiftPairToMask callback.

Part of the work to fix the regressions in D77804

Differential Revision: https://reviews.llvm.org/D125607

2 years ago[libc++] Introduce LIBCXX_LIBRARY_VERSION
Louis Dionne [Mon, 16 May 2022 13:50:56 +0000 (09:50 -0400)]
[libc++] Introduce LIBCXX_LIBRARY_VERSION

This allows controlling the current_version linker property on Apple
platforms.

Differential Revision: https://reviews.llvm.org/D125686

2 years ago[clang] Expose CoawaitExpr's operand in the AST
Nathan Ridge [Mon, 4 Apr 2022 06:29:21 +0000 (02:29 -0400)]
[clang] Expose CoawaitExpr's operand in the AST

Previously the Expr returned by getOperand() was actually the
subexpression common to the "ready", "suspend", and "resume"
expressions, which often isn't just the operand but e.g.
await_transform() called on the operand.

It's important for the AST to expose the operand as written
in the source for traversals and tools like clangd to work
correctly.

Fixes https://github.com/clangd/clangd/issues/939

Differential Revision: https://reviews.llvm.org/D115187

2 years ago[RegAllocGreedy] New hook regClassPriorityTrumpsGlobalness
Jay Foad [Wed, 4 May 2022 15:33:32 +0000 (16:33 +0100)]
[RegAllocGreedy] New hook regClassPriorityTrumpsGlobalness

Add a new TargetRegisterInfo hook to allow targets to tweak the
priority of live ranges, so that AllocationPriority of the register
class will be treated as more important than whether the range is local
to a basic block or global. This is determined per-MachineFunction.

Differential Revision: https://reviews.llvm.org/D125102

2 years ago[mlir][Tablegen-LSP] Don't link with llvm dylib
David Spickett [Tue, 17 May 2022 11:00:34 +0000 (11:00 +0000)]
[mlir][Tablegen-LSP] Don't link with llvm dylib

This updates 5de12bb703c5104b3fd64ee51c6900d6171d826a
to not link with the dylib since that does not include
the tablegen library.

Should fix flang dylib build failures:
https://lab.llvm.org/buildbot/#/builders/177/builds/5120

2 years ago[VPlan] Move usesScalars/onlyFirstLaneUsed to VPUser.
Florian Hahn [Tue, 17 May 2022 10:20:06 +0000 (11:20 +0100)]
[VPlan] Move usesScalars/onlyFirstLaneUsed to VPUser.

Those helpers model properties of a user and they should also be
available to non-recipe users. This will be used in D123537 for a new
exit value user.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D124936

2 years ago[AArch64] Extra tests useful for D-lane shuffles. NFC
David Green [Tue, 17 May 2022 10:15:55 +0000 (11:15 +0100)]
[AArch64] Extra tests useful for D-lane shuffles. NFC

2 years ago[JumpThreading] Regenerate test checks (NFC)
Nikita Popov [Tue, 17 May 2022 10:12:46 +0000 (12:12 +0200)]
[JumpThreading] Regenerate test checks (NFC)

2 years ago[WebAssembly][NFC] Convert IsWasm64 instruction field to 'bit' from string
Alex Bradbury [Tue, 17 May 2022 10:06:40 +0000 (11:06 +0100)]
[WebAssembly][NFC] Convert IsWasm64 instruction field to 'bit' from string

Extends the cleanup in D125713 to IsWasm64.

Differential Revision: https://reviews.llvm.org/D125714

2 years ago[WebAssembly][NFC] Convert StackBased instruction field to 'bit' from string
Alex Bradbury [Tue, 17 May 2022 10:02:30 +0000 (11:02 +0100)]
[WebAssembly][NFC] Convert StackBased instruction field to 'bit' from string

This is (IMHO) cleaner and (objectively) more strongly typed than using strings.

A follow-on patch will do the same for IsWasm64.

Differential Revision: https://reviews.llvm.org/D125713

2 years ago[X86] Attempt to fold EFLAGS into X86ISD::ADD/SUB ops
Simon Pilgrim [Tue, 17 May 2022 09:59:14 +0000 (10:59 +0100)]
[X86] Attempt to fold EFLAGS into X86ISD::ADD/SUB ops

We already use combineAddOrSubToADCOrSBB to fold extended EFLAGS results into ISD::ADD/SUB ops as X86ISD::ADC/SBB carry ops.

This patch extends this to also try to fold EFLAGS results with X86ISD::ADD/SUB ops

Differential Revision: https://reviews.llvm.org/D125642

2 years ago[OpenCL] Do not guard vload/store_half builtins
Sven van Haastregt [Tue, 17 May 2022 09:57:23 +0000 (10:57 +0100)]
[OpenCL] Do not guard vload/store_half builtins

The vload*_half* and vstore*_half* builtins do not require the
cl_khr_fp16 extension: pointers to `half` can be declared without the
extension and the _half variants of vload and vstore should be
available without the extension.

This aligns the guards for these builtins for
`-fdeclare-opencl-builtins` with `opencl-c.h`.

Fixes https://github.com/llvm/llvm-project/issues/55275

Differential Revision: https://reviews.llvm.org/D125401

2 years ago[JumpThreading] Don't pass DT to isGuaranteedNotToBeUndefOrPoison()
Nikita Popov [Tue, 17 May 2022 09:51:24 +0000 (11:51 +0200)]
[JumpThreading] Don't pass DT to isGuaranteedNotToBeUndefOrPoison()

JumpThreading intentionally does not force updating of the DT
during optimization, because this may be expensive when many CFG
updates and DT calculations are interleaved.

We shouldn't be fetching the DT just for the purpose of calling
isGuaranteedNotToBeUndefOrPoison(), especially as DT availability
doesn't even show benefit in tests.

2 years ago[DWARFLinker][NFC] Add None value to the DwarfLinkerAccelTableKind enum.
Alexey Lapshin [Thu, 12 May 2022 16:01:53 +0000 (19:01 +0300)]
[DWARFLinker][NFC] Add None value to the DwarfLinkerAccelTableKind enum.

This review is extracted from D86539.

1. Rename AccelTableKind to DwarfLinkerAccelTableKind
   (to differentiate from AccelTableKind from CodeGen/AsmPrinter/DwarfDebug.h)

2. Add None value to the DwarfLinkerAccelTableKind.

3. Add 'None' value for the 'accelerator' option of dsymutil.

Differential Revision: https://reviews.llvm.org/D125474

2 years ago[SROA] Avoid postponing rewriting load/store by ignoring lifetime intrinsics in parti...
Dmitry Vassiliev [Tue, 17 May 2022 09:25:59 +0000 (11:25 +0200)]
[SROA] Avoid postponing rewriting load/store by ignoring lifetime intrinsics in partition's promotability checking

This patch fixes a bug that generates unnecessary packing/unpacking structure code because of incorrect handling of lifetime intrinsics.
For example, a partition of an alloca may contain many slices:
```
Partition [0, 4):
  Slice0: [0, 4) used by: load i32 addr;
  Slice1: [0, 4) used by: store i32 v, addr;
  Slice2: [0, 16) used by lifetime.start(16, addr);
```
When SROA determines if the partition can be promoted, lifetime.start is currently treated as a whole-alloca load/store, so Slice0 and Slice1 cannot be promoted at this attempt,
but the packing/unpacking code for Slice0 and Slice1 has already been generated.
After rewriting the lifetime.start/end intrinsics, SROA tries again with Slice0 and Slice1 and finally promotes them, but the redundant packing/unpacking code remains in the IR.
This patch changes the promotability check to ignore lifetime intrinsics (they will be rewritten to the correct sizes later), so we can promote the real users (load/store) at the first attempt with optimal code.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D124967

2 years ago[SROA] Precommit test for D124967
Dmitry Vassiliev [Tue, 17 May 2022 09:23:31 +0000 (11:23 +0200)]
[SROA] Precommit test for D124967

2 years ago[RISCV][NFC] Reword split SP adjustment comments
Fraser Cormack [Tue, 17 May 2022 09:01:15 +0000 (10:01 +0100)]
[RISCV][NFC] Reword split SP adjustment comments

2 years ago[mlir] support isa/cast/dyn_cast<Operation *>(operation) again
Alex Zinenko [Fri, 13 May 2022 13:06:02 +0000 (15:06 +0200)]
[mlir] support isa/cast/dyn_cast<Operation *>(operation) again

The support for this has been added by 946311b8938114a37db5c9d42fb9f5a1481ccae1
but then ignored by bc22b5c9a2f729460ffdf7627b3534a8d9f3f767.

This enables one to write generic code that can be instantiated for both
specific operation classes and the common base class without
specialization. Examples include functions that take/return ops, such
as:

```cpp
template <typename FnTy>
void applyIf(FnTy &&lambda, ...) {
  for (Operation *op : ...) {
    // `typename` is needed because arg_t<0> is a dependent type.
    auto specific =
        dyn_cast<typename function_traits<FnTy>::template arg_t<0>>(op);
    if (specific)
      lambda(specific);
  }
}
```

that would otherwise need to rely on template specialization to support
lambdas that take specific operations and those that take `Operation *`.

Differential Revision: https://reviews.llvm.org/D125543

Reviewed by: rriddle

2 years ago[SelectionDAG] Support more VP reduction mask operation.
jacquesguan [Thu, 5 May 2022 11:13:05 +0000 (11:13 +0000)]
[SelectionDAG] Support more VP reduction mask operation.

This patch uses VP_REDUCE_AND and VP_REDUCE_OR to replace VP_REDUCE_SMAX, VP_REDUCE_SMIN, VP_REDUCE_UMAX and VP_REDUCE_UMIN for mask vector types.
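
Why min/max on mask elements reduces to AND/OR can be checked with a scalar
sketch. This is a hypothetical model of a single reduction step on i1 values;
it ignores the VP mask and vector-length operands, and the exact pairing used
by the patch is an assumption derived from the arithmetic below.

```cpp
#include <algorithm>

// An i1 "true" is -1 when read as signed and 1 when read as unsigned.
inline int asSigned(bool B) { return B ? -1 : 0; }
inline int asUnsigned(bool B) { return B ? 1 : 0; }

inline bool smax(bool A, bool B) {
  return std::max(asSigned(A), asSigned(B)) != 0;
}
inline bool smin(bool A, bool B) {
  return std::min(asSigned(A), asSigned(B)) != 0;
}
inline bool umax(bool A, bool B) {
  return std::max(asUnsigned(A), asUnsigned(B)) != 0;
}
inline bool umin(bool A, bool B) {
  return std::min(asUnsigned(A), asUnsigned(B)) != 0;
}

// Returns true if every min/max step matches AND/OR on all four inputs.
inline bool mappingHolds() {
  for (int A = 0; A < 2; ++A)
    for (int B = 0; B < 2; ++B) {
      bool X = A != 0, Y = B != 0;
      if (smax(X, Y) != (X && Y)) return false; // smax behaves like AND
      if (smin(X, Y) != (X || Y)) return false; // smin behaves like OR
      if (umax(X, Y) != (X || Y)) return false; // umax behaves like OR
      if (umin(X, Y) != (X && Y)) return false; // umin behaves like AND
    }
  return true;
}
```

The signed and unsigned cases swap roles because "true" is the smallest
signed i1 value but the largest unsigned one.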

Differential Revision: https://reviews.llvm.org/D125002

2 years ago[RISCV][NFC] Fix comment typos in split SP adjustment
Fraser Cormack [Tue, 17 May 2022 08:56:54 +0000 (09:56 +0100)]
[RISCV][NFC] Fix comment typos in split SP adjustment

2 years ago[InstCombine] precommit tests for foldSelectToCopysign
Chenbing Zheng [Tue, 17 May 2022 08:42:42 +0000 (16:42 +0800)]
[InstCombine] precommit tests for foldSelectToCopysign

2 years ago[llvm] Fix typo for libxml2 detection
Samuel Thibault [Tue, 17 May 2022 08:44:07 +0000 (08:44 +0000)]
[llvm] Fix typo for libxml2 detection

This seems to be a copy-paste from the similar zlib detection code.

Patch By: sthibaul

Differential Revision: https://reviews.llvm.org/D117052

2 years ago[XCOFF] support writing sections, relocations and symbols for XCOFF64.
esmeyi [Tue, 17 May 2022 08:27:47 +0000 (04:27 -0400)]
[XCOFF] support writing sections, relocations and symbols for XCOFF64.

This is the second patch to enable the XCOFF64 object writer.

Reviewed By: jhenderson, shchenz

Differential Revision: https://reviews.llvm.org/D122287

2 years ago[LVI] Compute range for xor
Nikita Popov [Tue, 17 May 2022 08:18:38 +0000 (10:18 +0200)]
[LVI] Compute range for xor

We do have a non-trivial implementation for binaryXor() now.

2 years ago[CVP] Add test for xor (NFC)
Nikita Popov [Tue, 17 May 2022 08:17:34 +0000 (10:17 +0200)]
[CVP] Add test for xor (NFC)

2 years ago[ConstantRange] Implement binaryXor() using known bits
Nikita Popov [Tue, 17 May 2022 08:02:50 +0000 (10:02 +0200)]
[ConstantRange] Implement binaryXor() using known bits

This allows us to compute known high bits. It's not optimal, but
better than nothing.
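
One way to picture the known-bits approach is the sketch below. It is not
the ConstantRange implementation: `Known`, `fromRange`, `xorKnown` and the
32-bit unsigned, non-wrapping ranges are assumptions made for illustration.
The common high-bit prefix of a range's bounds is known for every value in
the range, and XOR of two known bits is itself known.

```cpp
#include <cstdint>

// Hypothetical known-bits pair: a set bit in Zero/One means that bit is
// known to be 0/1 in every possible value.
struct Known {
  uint32_t Zero;
  uint32_t One;
};

// Known bits for every value in the unsigned range [Lo, Hi], Lo <= Hi:
// the bits above the highest position where Lo and Hi differ.
inline Known fromRange(uint32_t Lo, uint32_t Hi) {
  uint32_t Diff = Lo ^ Hi;
  uint32_t Mask = ~uint32_t{0};
  while (Diff != 0) { // Drop one low bit per bit of Diff.
    Mask <<= 1;
    Diff >>= 1;
  }
  return Known{~Lo & Mask, Lo & Mask};
}

// A result bit is known only where both input bits are known.
inline Known xorKnown(Known A, Known B) {
  return Known{(A.Zero & B.Zero) | (A.One & B.One),
               (A.Zero & B.One) | (A.One & B.Zero)};
}

// Smallest and largest values consistent with the known bits.
inline uint32_t minValue(Known K) { return K.One; }
inline uint32_t maxValue(Known K) { return ~K.Zero; }
```

For example, XOR of a value in [0x10, 0x13] with a value in [0x20, 0x21]
keeps the high bits 0x30 known, which bounds the result to [0x30, 0x33]. Only
high bits survive this round trip, which matches the "not optimal" caveat
above.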

2 years ago[RISCV] Add a test w/ RVV stack objects misaligning non-RVV ones
Fraser Cormack [Wed, 11 May 2022 13:08:41 +0000 (14:08 +0100)]
[RISCV] Add a test w/ RVV stack objects misaligning non-RVV ones

This patch adds a simple test which demonstrates a miscompilation of
16-byte-aligned scalar (non-RVV) objects when combined with RVV stack
objects.

The RISCV stack is assumed to be aligned to 16 bytes, and this is
guaranteed/assumed to be true when setting up the stack. However, when
the stack contains RVV objects, we decrement the stack pointer by some
multiple of vlenb, which is only guaranteed to be aligned to 8 bytes.
This means that non-RVV objects specifically requiring 16-byte alignment
fall through the cracks and are misaligned. Objects requiring larger
alignment trigger stack realignment and thus should be okay.
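
The arithmetic can be sketched as follows, assuming a downward-growing stack,
a stack pointer that starts 16-byte aligned, and vlenb = 8 (that is, VLEN =
64). The addresses and helper names are made up for illustration.

```cpp
#include <cstdint>

constexpr bool isAligned(uint64_t Addr, uint64_t Align) {
  return Addr % Align == 0;
}

// Stack pointer after reserving N * vlenb bytes for RVV objects.
constexpr uint64_t spAfterRVV(uint64_t SP, uint64_t VLenB, uint64_t N) {
  return SP - N * VLenB;
}

// One vlenb-sized slot keeps 8-byte alignment but breaks 16-byte alignment.
static_assert(isAligned(spAfterRVV(0x1000, 8, 1), 8), "still 8-byte aligned");
static_assert(!isAligned(spAfterRVV(0x1000, 8, 1), 16), "16-byte alignment lost");
// An even multiple happens to preserve it, so the bug is not always visible.
static_assert(isAligned(spAfterRVV(0x1000, 8, 2), 16), "even multiple is fine");
```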

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125382

2 years ago[StackColoring] Don't merge slots with differing StackIDs
Fraser Cormack [Mon, 16 May 2022 15:38:52 +0000 (16:38 +0100)]
[StackColoring] Don't merge slots with differing StackIDs

The documentation for this specifically mentions that this should not
happen. We could think about adding target hooks to permit it (and how
to merge IDs) in the future if that is desirable.

This specific test case was merging a scalable-vector slot into a
non-scalable one and dropping the notion of scalability, meaning we
failed to allocate enough stack space for the object.

Reviewed By: arsenm, MaskRay, sdesmalen

Differential Revision: https://reviews.llvm.org/D125699

2 years ago[KnownBits] Add operator==
Nikita Popov [Mon, 16 May 2022 15:20:19 +0000 (17:20 +0200)]
[KnownBits] Add operator==

Checking whether two KnownBits are the same is somewhat common,
mainly in test code.

I don't think there is a lot of room for confusion with "determine
what the KnownBits for an icmp eq would be", as that has a
different result type (this is what the eq() method implements,
which returns Optional<bool>).

Differential Revision: https://reviews.llvm.org/D125692

2 years ago[flang] Add one semantic check for elemental call arguments
Peixin-Qiao [Tue, 17 May 2022 07:11:46 +0000 (15:11 +0800)]
[flang] Add one semantic check for elemental call arguments

Per Fortran 2018 15.8.1(3), in a reference to an elemental procedure, if
any argument is an array, each actual argument that corresponds to an
INTENT (OUT) or INTENT (INOUT) dummy argument shall be an array. Add
this semantic check.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D125685

2 years ago[flang][OpenMP] Support lowering to MLIR for ordered clause
Peixin-Qiao [Tue, 17 May 2022 07:07:52 +0000 (15:07 +0800)]
[flang][OpenMP] Support lowering to MLIR for ordered clause

This supports lowering the parse tree to MLIR for the ordered clause in
the worksharing-loop directive. It also adds a test case for the operation
conversion.

Part of this patch is from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project.

Co-authored-by: Sourabh Singh Tomar <SourabhSingh.Tomar@amd.com>
Reviewed By: kiranchandramohan, NimishMishra

Differential Revision: https://reviews.llvm.org/D125456

2 years ago[RISCV] Support getHostCpuName for sifive-u74
luxufan [Tue, 17 May 2022 06:06:42 +0000 (14:06 +0800)]
[RISCV] Support getHostCpuName for sifive-u74

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D123978

2 years ago[mlir][LLVMIR] Add support for translating insertelement/extractelement.
jacquesguan [Mon, 16 May 2022 09:19:17 +0000 (09:19 +0000)]
[mlir][LLVMIR] Add support for translating insertelement/extractelement.

Add support for translating llvm::InsertElement and llvm::ExtractElement.

Differential Revision: https://reviews.llvm.org/D125674

2 years ago[Frontend] [Coroutines] Emit error when we found incompatible allocation
Chuanqi Xu [Thu, 12 May 2022 10:00:45 +0000 (18:00 +0800)]
[Frontend] [Coroutines] Emit error when we found incompatible allocation
function in promise_type

According to https://cplusplus.github.io/CWG/issues/2585.html, this
fixes https://github.com/llvm/llvm-project/issues/54881

Put simply, clang tried to find an allocation function (that is, perform
lookup and overload resolution) in both promise_type and the global
scope. However, this is not consistent with the standard. The standard
behavior is that the compiler shouldn't look in the global scope when
lookup finds the allocation function name in promise_type. In other
words, the program is ill-formed if there is an incompatible allocation
function in promise_type.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D125517

2 years agoRevert "[dwarf] Emit a DIGlobalVariable for constant strings."
Mitch Phillips [Tue, 17 May 2022 02:07:22 +0000 (19:07 -0700)]
Revert "[dwarf] Emit a DIGlobalVariable for constant strings."

This reverts commit 4680982b36a84770a1600fc438be8ec090671724.

Broke a fuchsia windows bot. More details in the review:
https://reviews.llvm.org/D123534

2 years ago[nfc][lld-macho] Follow up fixes to bd9e46815d73e4236c207bad8b5c54e7188154d7
Vy Nguyen [Tue, 17 May 2022 00:53:27 +0000 (20:53 -0400)]
[nfc][lld-macho] Follow up fixes to bd9e46815d73e4236c207bad8b5c54e7188154d7

Need -DAG in the first expect statement too

2 years ago[mlir][sparse] Moved _mlir_ciface_newSparseTensor closer to its macros
wren romano [Mon, 16 May 2022 23:45:50 +0000 (16:45 -0700)]
[mlir][sparse] Moved _mlir_ciface_newSparseTensor closer to its macros

This is a followup to D125431, to keep from confusing the machinery that generates diffs (since combining these two changes into one would obfuscate the changes actually made in the previous differential).

Depends On D125431

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D125432

2 years ago[WebAssembly] Update relaxed SIMD opcodes and names
Thomas Lively [Tue, 17 May 2022 00:51:45 +0000 (17:51 -0700)]
[WebAssembly] Update relaxed SIMD opcodes and names

to reflect the latest state of the proposal:
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#binary-format.
Moves code around to match the instruction order from the proposal, but the only
functional changes are to the names and opcodes.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D125726

2 years ago[nfc][lld-macho] Fixed test from https://reviews.llvm.org/D125732
Vy Nguyen [Tue, 17 May 2022 00:46:15 +0000 (20:46 -0400)]
[nfc][lld-macho] Fixed test from https://reviews.llvm.org/D125732

Details:
The test was incorrectly expecting the error messages for the export symbols to have a particular order.
It shouldn't because the export symbol list is processed concurrently.

2 years ago[lld-macho] Temporarily disable test on windows
Vy Nguyen [Tue, 17 May 2022 00:36:49 +0000 (20:36 -0400)]
[lld-macho] Temporarily disable test on windows
The metadata seems to be demangled differently

2 years ago[RISCV] Use classic dataflow for VSETVLI insertion
Philip Reames [Mon, 16 May 2022 23:43:13 +0000 (16:43 -0700)]
[RISCV] Use classic dataflow for VSETVLI insertion

Our current implementation of the InsertVSETVLI dataflow allows phase 3 to arrive at a different block end state than the dataflow in phase 1/2 computed. This arises because a block containing instructions (e.g. loads or stores) that don't consume all the incoming bits of VL/VTYPE can be compatible with multiple incoming states. The algorithm effectively changes the SEW on such instructions, and propagates the prior state forward. As phase 3 uses the block input state for this propagation, but phase 1/2 doesn't, this can result in different block end states.

If we don't correct for it, this discrepancy can result in miscompiles. This was the source of multiple recent bugs. However, by now we have fixes for all known correctness issues.

The basic strategy we use is to insert a compensation vsetvli to bring the block state leaving the block back into consistency with the one computed. This is correct, but results in extra vsetvlis being placed at the end of blocks.

This change adjusts the phase 1/2 algorithm to propagate the incoming block state through the block, allowing the compatibility rules to modify the end state. The algorithm may need to run slightly more iterations, but the end result is consistent with what phase 3 does.

The benefit of doing this is twofold.

First, we reverse some of the code quality regressions introduced in the functional fixes.

Second, we simplify the invariants, and allow the strict assertions to be enabled. Several humans, myself included, have found it quite surprising that the invariant didn't hold already, and arguably that confusion is the cause of several of our recent miscompiles in this code.

The downside to this patch is that the dataflow may require additional iterations to stabilize. In the worst case, we go from O(Edges) to O(E + UniquePaths) as the incoming state (and thus the outgoing one) can now change once for each path from the entry block.
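
The idea of propagating the incoming block state through the block can be illustrated with a deliberately simplified worklist dataflow. This is only a sketch under stated assumptions, not the InsertVSETVLI code: the string state labels, the `UNKNOWN` marker, the merge rule, and the names `merge`, `transfer` and `solve` are all hypothetical. A block whose effect is `None` stands in for a "transparent" block, whose exit state equals whatever state flows into it.

```python
from collections import deque

UNKNOWN = "unknown"  # hypothetical marker: predecessors disagree


def merge(states):
    """Combine predecessor exit states; None means 'not computed yet'."""
    known = [s for s in states if s is not None]
    if not known:
        return None
    first = known[0]
    return first if all(s == first for s in known) else UNKNOWN


def transfer(effect, incoming):
    """A transparent block (effect None) forwards its incoming state."""
    return incoming if effect is None else effect


def solve(cfg, effects, entry, entry_state=UNKNOWN):
    """Iterate to a fixed point; returns the exit state of every block."""
    preds = {block: [] for block in cfg}
    for block, succs in cfg.items():
        for succ in succs:
            preds[succ].append(block)

    out_state = {block: None for block in cfg}
    worklist = deque([entry])
    queued = {entry}
    while worklist:
        block = worklist.popleft()
        queued.discard(block)
        if block == entry:
            incoming = entry_state
        else:
            incoming = merge(out_state[p] for p in preds[block])
        new_out = transfer(effects[block], incoming)
        if new_out != out_state[block]:
            # The exit state changed, so successors must be revisited.
            out_state[block] = new_out
            for succ in cfg[block]:
                if succ not in queued:
                    worklist.append(succ)
                    queued.add(succ)
    return out_state


# A diamond CFG: "left" and "join" are transparent blocks.
cfg = {"entry": ["left", "right"], "left": ["join"], "right": ["join"], "join": []}
effects = {"entry": "e32", "left": None, "right": "e64", "join": None}
result = solve(cfg, effects, "entry")
```

In this toy model the exit state of a transparent block depends on its input, so a block may be revisited whenever a predecessor's exit state changes, which loosely mirrors the extra iterations mentioned above.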

Differential Revision: https://reviews.llvm.org/D125232

2 years ago[RISCV] Fix missing vsetvli in transparent block case
Philip Reames [Mon, 16 May 2022 23:40:35 +0000 (16:40 -0700)]
[RISCV] Fix missing vsetvli in transparent block case

We've got a lurking problem with our data flow implementation where different phases disagree, resulting in possible miscompiles. D119518 introduced a workaround, but failed to consider blocks which only contain load/stores compatible with their incoming state.

When I went to rebase and simplify D125232, it turned out that not all of the correctness issues had been fixed yet after all. This is the correctness fix accidentally embedded in the original more complicated version.

Note that the test changes here are mostly regressions. The simplified version of D125232 exactly reverses all the non-functional diffs in the tests caused here, and should be the immediately following commit.

Differential Revision: https://reviews.llvm.org/D125703

2 years ago[test-suite][cmake] sort unit test targets
Grace Jennings [Mon, 16 May 2022 23:50:49 +0000 (16:50 -0700)]
[test-suite][cmake] sort unit test targets

This patch sorts unit test targets into directories corresponding to the
test source file directories to improve target navigation.

Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D124810

2 years ago[dwarf] Emit a DIGlobalVariable for constant strings.
Mitch Phillips [Mon, 16 May 2022 23:03:47 +0000 (16:03 -0700)]
[dwarf] Emit a DIGlobalVariable for constant strings.

An upcoming patch will extend llvm-symbolizer to provide the source line
information for global variables. The goal is to move AddressSanitizer
off of internal debug info for symbolization onto the DWARF standard
(and doing a clean-up in the process). Currently, ASan reports the line
information for constant strings if a memory safety bug happens around
them. We want to keep this behaviour, so we need to emit debuginfo for
these variables as well.

Reviewed By: dblaikie, rnk, aprantl

Differential Revision: https://reviews.llvm.org/D123534

2 years ago[lld-macho] Demangle symbol names in export-symbol error messages when -demangle...
Vy Nguyen [Mon, 16 May 2022 23:19:32 +0000 (19:19 -0400)]
[lld-macho] Demangle symbol names in export-symbol error messages when -demangle is specified.
PR/55512

Reviewed By: keith

Differential Revision: https://reviews.llvm.org/D125732

2 years ago[mlir][NFC] Fix the tags for various doc code blocks
River Riddle [Mon, 16 May 2022 23:45:51 +0000 (16:45 -0700)]
[mlir][NFC] Fix the tags for various doc code blocks

2 years ago[mlir][PDLL] Tweak the grammar to highlight partial code better
River Riddle [Mon, 16 May 2022 23:37:31 +0000 (16:37 -0700)]
[mlir][PDLL] Tweak the grammar to highlight partial code better

This commit enables proper highlighting when inner statements are
outside of a constraint/pattern/etc. This shouldn't really happen in
actual code, but can happen in documentation (which uses the same
syntax grammar).

2 years ago[mlir][sparse] Restyling macros in the runtime library
wren romano [Wed, 11 May 2022 23:32:54 +0000 (16:32 -0700)]
[mlir][sparse] Restyling macros in the runtime library

In addition to reducing code repetition, this also helps ensure that the various API functions follow the naming convention of mlir::sparse_tensor::primaryTypeFunctionSuffix (e.g., by avoiding typos in the repetitious code).

Depends On D125428

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D125431

2 years agoTeach PeepholeOpt to eliminate redundant copy from constant physreg (e.g VLENB on...
Philip Reames [Mon, 16 May 2022 23:27:39 +0000 (16:27 -0700)]
Teach PeepholeOpt to eliminate redundant copy from constant physreg (e.g VLENB on RISCV)

The existing redundant copy elimination required a virtual register source, but the same logic works for any physreg where we don't have to worry about clobbers.  On RISCV, this helps eliminate redundant CSR reads from VLENB.

Differential Revision: https://reviews.llvm.org/D125564

2 years ago[llvm-ar][NFC] Address post-commit comments on D125439.
Ben Dunbobbin [Mon, 16 May 2022 23:11:18 +0000 (00:11 +0100)]
[llvm-ar][NFC] Address post-commit comments on D125439.

Remove errant whitespace.

AIX uses big archive format so check for both !<arch> and <bigaf>.

Only the "gnu" format has thin archives; specify --format=gnu for
thin archive test-cases.

2 years ago[mlir][NFC] Fix a few langref typos
River Riddle [Mon, 16 May 2022 23:23:01 +0000 (16:23 -0700)]
[mlir][NFC] Fix a few langref typos

2 years ago[llvm-objcopy][test] Add cmp after copy
Keith Smiley [Thu, 12 May 2022 17:28:57 +0000 (10:28 -0700)]
[llvm-objcopy][test] Add cmp after copy

All of the other tests here either check that the copy fails or that
the resulting binary is the same; it seems this check was omitted
for the universal object case.

Differential Revision: https://reviews.llvm.org/D125478

2 years ago[mlir][Tablegen-LSP] Add support for a basic TableGen language server
River Riddle [Mon, 9 May 2022 17:36:48 +0000 (10:36 -0700)]
[mlir][Tablegen-LSP] Add support for a basic TableGen language server

This follows the same general structure as the MLIR and PDLL language
servers. This commit adds the basic functionality for setting up the server,
and initially only supports providing diagnostics. Follow-on commits will
build out more comprehensive behavior.

Realistically this should eventually live in llvm/, but building in MLIR is an easier
initial step given that:
* All of the necessary LSP functionality is already here
* It allows for proving out useful language features (e.g. compilation databases)
  without affecting wider scale tablegen users
* MLIR has a vscode extension that can immediately take advantage of it

Differential Revision: https://reviews.llvm.org/D125440

2 years ago[clang] Avoid suggesting typoed directives in `.S` files
Ken Matsui [Mon, 16 May 2022 22:36:54 +0000 (15:36 -0700)]
[clang] Avoid suggesting typoed directives in `.S` files

This patch is intended to avoid suggesting typoed directives in `.S`
files, to support the cases where `#` directives are treated as comments
or as various pseudo-ops. The feature was implemented in
https://reviews.llvm.org/D124726.

Fixes: https://reviews.llvm.org/D124726#3516346.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D125727

2 years ago[mlir][sparse] Adding "final" keyword wherever appropriate
wren romano [Wed, 11 May 2022 23:10:22 +0000 (16:10 -0700)]
[mlir][sparse] Adding "final" keyword wherever appropriate

This enables the compiler to perform devirtualization.  And benchmarks
indicate devirtualization can sometimes give considerable speedup.

Depends On D122061

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D125428

2 years ago[mlir][sparse] Enhancing sparse=>sparse conversion.
wren romano [Wed, 11 May 2022 23:05:13 +0000 (16:05 -0700)]
[mlir][sparse] Enhancing sparse=>sparse conversion.

Fixes: https://github.com/llvm/llvm-project/issues/51652

Depends On D122060

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D122061

2 years ago[mlir] Restrict dialect doc gen to a single dialect
River Riddle [Sun, 15 May 2022 23:40:18 +0000 (16:40 -0700)]
[mlir] Restrict dialect doc gen to a single dialect

In the overwhelming majority of cases only one dialect is generated at a time
anyway, and this restriction more easily catches user error when multiple
dialects might be generated. We hit this semi-recently with the PDL dialect,
and circt and other downstream users are actively hitting this as well.

Differential Revision: https://reviews.llvm.org/D125651

2 years ago[NFC] Don't bother with unstripped binary w/ dSYM, don't DebugSymbols twice
Jason Molenda [Mon, 16 May 2022 22:27:21 +0000 (15:27 -0700)]
[NFC] Don't bother with unstripped binary w/ dSYM, don't DebugSymbols twice

This patch addresses two perf issues when we find a dSYM on macOS
after calling into the DebugSymbols framework.  First, when we have
a local (probably stripped) binary, we find the dSYM and we may
be told about the location of the symbol-rich binary (probably
unstripped), which may be on a remote filesystem.  We don't need the
unstripped binary; use the local binary we already have.
Second, after we've found the path to the dSYM, save that in the Module
so we don't call into DebugSymbols a second time later on to
rediscover it.  If the user has a DBGShellCommands set, we need to
exec that process twice, serially, which can add up.

Differential Revision: https://reviews.llvm.org/D125616
rdar://84576917

2 years ago[OpenMP] Don't build the offloading driver without a source input
Joseph Huber [Mon, 16 May 2022 16:26:32 +0000 (12:26 -0400)]
[OpenMP] Don't build the offloading driver without a source input

The Clang driver adds extra stages to build a complete offloading
program for applications using CUDA or OpenMP offloading. This normally
requires either a source file input or a valid object file to be
handled. This caused problems when trying to compile an assembly or
LLVM IR file through clang with flags that would enable offloading. This
patch simply adds a check to prevent the offloading toolchain from being
used if we don't have a valid source file.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125705

2 years ago[Libomptarget] Don't build the device runtime without a new Clang
Joseph Huber [Mon, 16 May 2022 15:49:51 +0000 (11:49 -0400)]
[Libomptarget] Don't build the device runtime without a new Clang

The OpenMP device offloading library is a bitcode library and thus is only
expected to be built and linked with the same version of Clang that was used
to create it. This somewhat complicates the build process, as we
require the Clang that was just built to be used to create the library.
This is done either with a two-step build, where OpenMP is built with
the Clang that was just installed, or through the
`-DLLVM_ENABLE_RUNTIMES=openmp` option. This has always been the case,
but recent changes have made it difficult to build the
rest of OpenMP. This patch adds a check to not build the OpenMP device
runtime if the current compiler is not Clang with the same version as
the LLVM installation. This should allow users to build OpenMP as a
project using any compiler without it erroring out due to the bitcode
library, but users who require the library will need to use one of the
above methods to compile it.

Reviewed By: jdoerfert, tianshilei1992, ye-luo

Differential Revision: https://reviews.llvm.org/D125698

2 years ago[mlir] allow for re-registering extension ops
Alex Zinenko [Fri, 13 May 2022 16:03:08 +0000 (18:03 +0200)]
[mlir] allow for re-registering extension ops

The op registration mechanism does not allow ops with the same name to be
re-registered. This is useful for avoiding name conflicts and debugging
double-registration, but may be problematic for dialect extensions that may get
registered several times (unlike dialects, which are deduplicated in the
registry). When registering ops through the Transform dialect extension
mechanism, check first if the ops are already registered and only complain in
the case of repeated registration with the same name but a different TypeID.
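
The registration rule described above can be sketched in a few lines. This is an illustration only, not the MLIR implementation: `OpRegistry`, `register`, the op name and the string type identifiers are hypothetical stand-ins for the real registry and TypeID.

```python
class OpRegistry:
    """Toy registry: repeated registration is fine if the identity matches."""

    def __init__(self):
        self._ops = {}  # op name -> type identifier (stand-in for TypeID)

    def register(self, name, type_id):
        """Return True if newly registered, False if it was a harmless repeat."""
        existing = self._ops.get(name)
        if existing is None:
            self._ops[name] = type_id
            return True
        if existing == type_id:
            # Same name and same identity: the extension was loaded again.
            return False
        raise ValueError(
            f"op '{name}' is already registered with a different type id"
        )


registry = OpRegistry()
first = registry.register("transform.my_op", "TypeID-A")
second = registry.register("transform.my_op", "TypeID-A")
```

Only a clash between two different identities under one name is treated as an error, which keeps the conflict check while tolerating an extension being registered several times.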

Differential Revision: https://reviews.llvm.org/D125554

2 years ago[lldb] Prevent underflow in crashlog.py
Jonas Devlieghere [Mon, 16 May 2022 21:51:27 +0000 (14:51 -0700)]
[lldb] Prevent underflow in crashlog.py

Avoid an OverflowError (an underflow really) when the pc is zero. This
can happen for "unknown frames" where the crashlog generator reports a
zero pc. We could omit them altogether, but if they're part of the
crashlog it seems fair to display them in lldb as well.
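
A minimal sketch of the failure mode, assuming the underflow comes from subtracting one from a frame's pc before the value is treated as an unsigned address. The helper names `symbolication_address` and `as_unsigned64` are hypothetical and are not taken from crashlog.py.

```python
def symbolication_address(pc: int, frame_index: int) -> int:
    """Pick the address to look up for a frame (hypothetical helper).

    For frames above frame 0 the pc is a return address, so the lookup
    uses pc - 1. A zero pc is left alone so the result never goes negative.
    """
    if frame_index > 0 and pc != 0:
        return pc - 1
    return pc


def as_unsigned64(address: int) -> bytes:
    """Encode an address as an unsigned 64-bit little-endian value."""
    # int.to_bytes raises OverflowError for negative values by default.
    return address.to_bytes(8, "little")


unknown_frame = as_unsigned64(symbolication_address(0, frame_index=3))
normal_frame = symbolication_address(0x1000, frame_index=3)
```

Without the zero check, the "unknown frame" case would compute -1, and encoding -1 as an unsigned value is what raises OverflowError in this sketch.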

rdar://92686666

Differential revision: https://reviews.llvm.org/D125716

2 years agoRevert "[InstCombine] invert canonicalization for cast of signbit test"
Sanjay Patel [Mon, 16 May 2022 21:47:02 +0000 (17:47 -0400)]
Revert "[InstCombine] invert canonicalization for cast of signbit test"

This reverts commit 3794cc0e996481e10307b67c8436aa44e0d65d22.
This change is suspected of causing bots to hang at stage 2
compiles, so reverting to confirm and investigate.

2 years ago[MC] [Win64EH] Check for matches between epilogs and the prolog on ARM64
Martin Storsjö [Sat, 14 May 2022 22:59:14 +0000 (01:59 +0300)]
[MC] [Win64EH] Check for matches between epilogs and the prolog on ARM64

This allows sharing opcodes between prolog and epilog even when there
is more than one epilog.

I didn't make any handcrafted special MC level testcases for this (yet
at least), but it does seem to have the expected effect on two existing
CodeGen level testcases.

Differential Revision: https://reviews.llvm.org/D125619

2 years ago[MC] [Win64EH] Try writing an ARM64 "packed epilog" even if the epilog doesn't share...
Martin Storsjö [Fri, 13 May 2022 07:42:56 +0000 (10:42 +0300)]
[MC] [Win64EH] Try writing an ARM64 "packed epilog" even if the epilog doesn't share opcodes with the prolog

The "packed epilog" form only implies that the epilog is located
exactly at the end of the function (so the location of the epilog
is implicit from the epilog opcodes), but it doesn't have to share
opcodes with the prolog - as long as the total number of opcode
bytes and the offset to the epilog fit within the bitfields.

This avoids writing a 4 byte epilog scope in many cases. (I haven't
measured how much this shrinks actual xdata sections in practice
though.)

Differential Revision: https://reviews.llvm.org/D125536