platform/upstream/llvm.git
2 years ago[MLIR][LLVMIR] Add round intrinsic
lorenzo chelini [Thu, 2 Jun 2022 13:56:01 +0000 (15:56 +0200)]
[MLIR][LLVMIR] Add round intrinsic

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126879

2 years ago[pseudo] Fix noptr-abstract-declarator rule.
Haojian Wu [Fri, 3 Jun 2022 19:13:26 +0000 (21:13 +0200)]
[pseudo] Fix noptr-abstract-declarator rule.

The const-expression in the [] can be empty.
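
A few standard C++ declarations where the brackets of an abstract
declarator hold no constant-expression. The alias and function names are
made up for illustration and are not taken from the pseudo parser's tests:

```cpp
#include <type_traits>

// An abstract declarator describes a type without naming an entity.
// In each declaration below the brackets contain no constant-expression.
using UnboundedInts = int[];       // type-id with abstract declarator "[]"
using PtrToUnbounded = int (*)[];  // pointer to an array of unknown bound
void takesArray(int[]);            // unnamed parameter, adjusted to int*

static_assert(std::is_array<UnboundedInts>::value, "int[] is an array type");
static_assert(std::extent<UnboundedInts>::value == 0, "the bound is unknown");
```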

Differential Revision: https://reviews.llvm.org/D126992

2 years ago[SCCP] Don't mark ranges from branch conditions as potentially undef
Nikita Popov [Mon, 30 May 2022 13:05:04 +0000 (15:05 +0200)]
[SCCP] Don't mark ranges from branch conditions as potentially undef

Now that transforms introducing branch on poison have been removed,
we can stop marking ranges that have been derived from branch
conditions as containing undef. The existing comment explains why
this is legal. I've checked that alive2 is happy with SCCP tests
after this change.

Differential Revision: https://reviews.llvm.org/D126647

2 years ago[pseudo] Fix the member-specification grammar rule.
Haojian Wu [Fri, 3 Jun 2022 19:02:25 +0000 (21:02 +0200)]
[pseudo] Fix the member-specification grammar rule.

The grammar rule is wrong: it doesn't match the one in the standard.

Differential Revision: https://reviews.llvm.org/D126991

2 years ago[flang][docs] Remove the out-dated note on Windows support
Andrzej Warzynski [Tue, 7 Jun 2022 08:09:26 +0000 (08:09 +0000)]
[flang][docs] Remove the out-dated note on Windows support

Building Flang on Windows *is supported*. It's been tested there for
quite a while now:
  * https://lab.llvm.org/buildbot/#/builders/172

Submitting this without a review as the current note in the readme file
is clearly incorrect.

2 years ago[DAGCombiner] Remove overzealous assertion when folding assert+trunc+assert (PR55846)
Nikita Popov [Fri, 3 Jun 2022 09:11:03 +0000 (11:11 +0200)]
[DAGCombiner] Remove overzealous assertion when folding assert+trunc+assert (PR55846)

These assert that there are no "useless" assertzext/assertsext nodes
(that assert a wider width than a following trunc), but I don't think
there is anything preventing such nodes from reaching this code.
I don't think the assertion is relevant for correctness of this
transform either -- if such an assert is present, then the other
one will always be to a smaller width, and we'll pick that one.
The assertion dates back to D37017.

Fixes https://github.com/llvm/llvm-project/issues/55846.

Differential Revision: https://reviews.llvm.org/D126952

2 years ago[mlir][complex] Add complex.conj op
lewuathe [Tue, 7 Jun 2022 07:37:20 +0000 (09:37 +0200)]
[mlir][complex] Add complex.conj op

Add a complex.conj op that computes the complex conjugate, which is widely used in mathematical operations on complex numbers.
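
For reference, the operation itself can be shown with the C++ standard
library rather than MLIR. The helper names below are hypothetical and the
snippet says nothing about how the op is lowered:

```cpp
#include <complex>

// conj(a + bi) = a - bi: the real part is kept, the imaginary part negated.
std::complex<double> conjugate(std::complex<double> z) {
  return std::conj(z);
}

// z * conj(z) equals |z|^2, a common use of the conjugate.
double squaredMagnitude(std::complex<double> z) {
  return (z * std::conj(z)).real();
}
```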

Reviewed By: pifon2a

Differential Revision: https://reviews.llvm.org/D127181

2 years ago[MC] De-capitalize MCStreamer functions
Fangrui Song [Tue, 7 Jun 2022 07:31:02 +0000 (00:31 -0700)]
[MC] De-capitalize MCStreamer functions

Follow-up to c031378ce01b8485ba0ef486654bc9393c4ac024 .
The class is mostly consistent now.

2 years ago[flang][OpenMP] Support lowering parse-tree to MLIR for threadprivate directive
Peixin-Qiao [Tue, 7 Jun 2022 07:08:17 +0000 (15:08 +0800)]
[flang][OpenMP] Support lowering parse-tree to MLIR for threadprivate directive

This supports lowering parse-tree to MLIR for threadprivate directive
following the OpenMP 5.1 [2.21.2] standard. Take the following as an
example:

```
program m
  integer, save :: i
  !$omp threadprivate(i)
  call sub(i)
  !$omp parallel
    call sub(i)
  !$omp end parallel
end
```
```
func.func @_QQmain() {
  %0 = fir.address_of(@_QFEi) : !fir.ref<i32>
  %1 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
  fir.call @_QPsub(%1) : (!fir.ref<i32>) -> ()
  omp.parallel   {
    %2 = omp.threadprivate %0 : !fir.ref<i32> -> !fir.ref<i32>
    fir.call @_QPsub(%2) : (!fir.ref<i32>) -> ()
    omp.terminator
  }
  return
}
```

A threadprivate operation (omp.threadprivate) is created for all
references to a threadprivate variable. The runtime will appropriately
return a threadprivate var (%1 as above) or its copy (%2 as above)
depending on whether it is outside or inside a parallel region. For
threadprivate access outside the parallel region, the threadprivate
operation is created in instantiateVar. Inside the parallel region, it
is created in createBodyOfOp.

A new utility function, collectSymbolSet, collects all the variables
with a given property within an evaluation, which may be a Fortran,
OpenMP, or OpenACC construct.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D124226

2 years ago[flang] Fix XArrayCoorOp conversion for index type slices
Peixin-Qiao [Tue, 7 Jun 2022 06:58:44 +0000 (14:58 +0800)]
[flang] Fix XArrayCoorOp conversion for index type slices

The previous XArrayCoorOp conversion did not take the upper bound and
step of the slice from the OpAdaptor operands. This led to a type
mismatch in codegen when slices are of index type.

Reviewed By: kiranchandramohan, schweitz

Differential Revision: https://reviews.llvm.org/D125967

2 years ago[flang] Fix semantic checks for C919
Peixin-Qiao [Tue, 7 Jun 2022 06:55:31 +0000 (14:55 +0800)]
[flang] Fix semantic checks for C919

The previous semantic analysis did not handle the case where the last
part-ref is a scalar or a complex part. Refactor the previous code and
bring all the checks into one place. The check starts from the
designator, extracts the wrapped data-ref (including any substring or
complex part), and recursively checks the base objects.

Co-authored-by: Peter Klausler <pklausler@nvidia.com>
Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D126595

2 years agoReplace Goals and Why section with Introduction
Jeff Bailey [Tue, 7 Jun 2022 06:52:02 +0000 (06:52 +0000)]
Replace Goals and Why section with Introduction

Rewrite the introduction of the page to state clearly the goals of
LLVM's libc project.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D127174

2 years ago[MLIR][SCF] Fix top-level comment (NFC)
lorenzo chelini [Sat, 4 Jun 2022 16:30:03 +0000 (18:30 +0200)]
[MLIR][SCF] Fix top-level comment (NFC)

2 years ago[analyzer] Track assume call stack to detect fixpoint
Gabor Marton [Fri, 27 May 2022 18:44:43 +0000 (20:44 +0200)]
[analyzer] Track assume call stack to detect fixpoint

Assume functions might recurse (see `reAssume` or `tryRearrange`).
During the recursion, the State might stop changing, which means we have
reached a fixpoint. In this patch, we avoid infinite recursion of assume
calls by checking already visited States on the stack of assume function
calls. This patch renders the previous "workaround" solution (D47155)
unnecessary. Note that this is not an NFC patch. If we were to limit the
maximum stack depth of the assume calls to 1, then it would be equivalent
to the previous solution in D47155.
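
The idea of detecting a fixpoint by remembering the States on the assume
call stack can be sketched roughly as follows. `State`, `refine` and the
function names are hypothetical stand-ins, not the analyzer's actual
types or API:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the analyzer's program state.
using State = int;

// Hypothetical transition: each assume step may refine the state until
// nothing changes any more. Here refinement saturates at 3.
State refine(State s) { return std::min(s + 1, 3); }

// Recursive assume that tracks the states currently on the call stack.
// Returns the number of assume calls that did real work.
int assumeRec(State s, std::vector<State> &stack) {
  // A state that is already being processed further up the stack cannot
  // yield anything new: a fixpoint has been reached, so stop recursing.
  if (std::find(stack.begin(), stack.end(), s) != stack.end())
    return 0;
  stack.push_back(s);
  int calls = 1 + assumeRec(refine(s), stack);
  stack.pop_back();
  return calls;
}

int countAssumeCalls(State start) {
  std::vector<State> stack;
  return assumeRec(start, stack);
}
```

Without the lookup in `stack`, the call on the saturated state would
recurse forever in this sketch.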

Additionally, in D113753, we simplify the symbols right at the beginning
of evalBinOpNN. So, a call to `simplifySVal` in `getKnownValue` (added
in D51252) is no longer needed.

Fixes https://github.com/llvm/llvm-project/issues/55851

Differential Revision: https://reviews.llvm.org/D126560

2 years ago[MC][ARM] Reuse symbol value in constant pool
luxufan [Mon, 6 Jun 2022 13:35:33 +0000 (21:35 +0800)]
[MC][ARM] Reuse symbol value in constant pool

Fix https://github.com/llvm/llvm-project/issues/55816

Before this patch, MCConstantExpr were reused, but MCSymbolExpr were
not. To reuse symbol value, this patch added a DenseMap to record the
symbol value.
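
A minimal sketch of the caching idea, using `std::unordered_map` in
place of LLVM's DenseMap and strings in place of MC expressions. The
class and member names are hypothetical and do not mirror the real
constant pool code:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified constant pool.
class ConstantPoolSketch {
public:
  // Returns the index of the pool entry for `symbol`, creating the entry
  // only the first time the symbol is requested.
  std::size_t addSymbol(const std::string &symbol) {
    auto it = cachedSymbols.find(symbol);
    if (it != cachedSymbols.end())
      return it->second; // reuse the existing entry
    std::size_t index = entries.size();
    entries.push_back(symbol);
    cachedSymbols.emplace(symbol, index);
    return index;
  }

  std::size_t size() const { return entries.size(); }

private:
  std::vector<std::string> entries;
  std::unordered_map<std::string, std::size_t> cachedSymbols;
};
```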

Differential Revision: https://reviews.llvm.org/D127113

2 years ago[vscode-mlir] Bump to version 0.9
River Riddle [Tue, 7 Jun 2022 03:19:59 +0000 (20:19 -0700)]
[vscode-mlir] Bump to version 0.9

Since version 0.8 we've added:

* Switched PDLL and TableGen to use incremental doc updates
* Added support to PDLL for inlay hints

2 years ago[mlir:PDLL] Add support for inlay hints
River Riddle [Wed, 18 May 2022 00:56:17 +0000 (17:56 -0700)]
[mlir:PDLL] Add support for inlay hints

These allow for displaying additional inline information,
such as the types of variables, names of operands/results,
constraint/rewrite arguments, etc. This requires a bump in the
vscode extension to a newer version, as inlay hints are a new LSP feature.

Differential Revision: https://reviews.llvm.org/D126033

2 years ago[mlir:LSP] Switch document sync mode to Incremental
River Riddle [Tue, 17 May 2022 22:16:24 +0000 (15:16 -0700)]
[mlir:LSP] Switch document sync mode to Incremental

This is much more efficient than the full mode, as it only requires sending
small chunks of files. It also works around a weird command ordering
issue (full document updates are being sent after other commands like
code completion) in newer versions of vscode.
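
To illustrate why incremental sync sends less data, here is a rough
sketch of applying one ranged change to a document held in memory. The
types and function names are hypothetical, and the sketch assumes '\n'
line endings and single-byte characters, which the LSP does not
guarantee:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical zero-based position, loosely modelled on an LSP position.
struct Position {
  int line;
  int character;
};

// Converts a (line, character) position to an offset into `text`.
std::size_t toOffset(const std::string &text, Position pos) {
  std::size_t offset = 0;
  for (int line = 0; line < pos.line; ++line) {
    std::size_t newline = text.find('\n', offset);
    if (newline == std::string::npos)
      return text.size(); // position is past the last line
    offset = newline + 1;
  }
  return std::min(offset + static_cast<std::size_t>(pos.character),
                  text.size());
}

// Replaces the range [start, end) with `newText`. Only the range and the
// new text need to be sent, not the whole document.
void applyIncrementalChange(std::string &text, Position start, Position end,
                            const std::string &newText) {
  std::size_t from = toOffset(text, start);
  std::size_t to = toOffset(text, end);
  if (to < from)
    return; // ignore malformed ranges in this sketch
  text.replace(from, to - from, newText);
}
```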

Differential Revision: https://reviews.llvm.org/D126032

2 years ago[ASan] Skip any instruction inserted by another instrumentation.
Enna1 [Tue, 7 Jun 2022 03:15:12 +0000 (11:15 +0800)]
[ASan] Skip any instruction inserted by another instrumentation.

Currently, we only check !nosanitize metadata for instructions passed to `getInterestingMemoryOperands()` and for call instructions that cannot return.
This patch adds the check to every instruction.

E.g. ASan shouldn't instrument the instruction inserted by UBSan/pointer-overflow.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D126269

2 years ago[clang] Allow consteval functions in default arguments
Evgeny Shulgin [Tue, 7 Jun 2022 02:51:59 +0000 (10:51 +0800)]
[clang] Allow consteval functions in default arguments

We should not mark a function as "referenced" if we call it within a
ConstantExpr, because the expression will be folded to a value in LLVM
IR. To prevent emitting consteval function declarations, we should not "jump
over" a ConstantExpr when it is a top-level ParmVarDecl's subexpression.

Fixes https://github.com/llvm/llvm-project/issues/48230

Reviewed By: erichkeane, aaron.ballman, ChuanqiXu

Differential Revision: https://reviews.llvm.org/D119646

2 years ago[NFC] Properly suppress unused argument warning in __isOSVersionAtLeast()
Arthur Eubanks [Tue, 7 Jun 2022 02:38:50 +0000 (19:38 -0700)]
[NFC] Properly suppress unused argument warning in __isOSVersionAtLeast()

Casting to non-void causes
  expression result unused [-Wunused-value]
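
A small illustration of the difference, using a hypothetical function
rather than the actual compiler-rt code:

```cpp
// Hypothetical helper for illustration; not the compiler-rt function.
int isVersionAtLeast(int major, int minor, int subminor) {
  // A cast to void discards the value explicitly and marks the parameter
  // as used. A cast to any other type, e.g. `(int)subminor;`, still
  // yields a value that is then dropped, which -Wunused-value reports.
  (void)subminor;
  return (major > 10 || (major == 10 && minor >= 15)) ? 1 : 0;
}
```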

2 years ago[NFC] Use predecessors to replace make_range.
jacquesguan [Mon, 6 Jun 2022 04:32:35 +0000 (04:32 +0000)]
[NFC] Use predecessors to replace make_range.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127085

2 years ago[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in
Akira Hatanaka [Tue, 17 May 2022 16:37:29 +0000 (09:37 -0700)]
[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in
DebugTypeVisitor

This recommits d1346e2. I've added a line to the test case to enable it
only on assert builds.

Differential Revision: https://reviews.llvm.org/D125839

2 years agoRevert "[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in"
Akira Hatanaka [Tue, 7 Jun 2022 01:48:24 +0000 (18:48 -0700)]
Revert "[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in"

This reverts commit d1346e2ee2741919a8cc1b1ffe400001e76a6d06.

The commit broke a few bots.

2 years ago[mlir] Add documentation for TableGen LSP features and setup
River Riddle [Tue, 7 Jun 2022 01:29:20 +0000 (18:29 -0700)]
[mlir] Add documentation for TableGen LSP features and setup

This commit beefs up the documentation for MLIR language servers by
adding proper documentations/examples/etc for the provided TableGen
language server capabilities. Given that this documentation is also used
for the vscode extension, this commit also updates the user facing vscode
extension documentation.

Note that the images referenced in the new documentation are hosted on
the website, and will be committed to mlir-www shortly after this commit
lands.

2 years ago[WebAssembly][NFC] RelaxedBinary tablegen multiclass for relaxed SIMD
Thomas Lively [Tue, 7 Jun 2022 00:56:39 +0000 (17:56 -0700)]
[WebAssembly][NFC] RelaxedBinary tablegen multiclass for relaxed SIMD

Refactor the tablegen definitions for relaxed SIMD min/max instructions to use a
shared RelaxedBinary multiclass modeled on the existing SIMDBinary multiclass. A
future commit will add further instruction definitions that use RelaxedBinary.

Also rename the SIMD_RELAXED_CONVERT multiclass to RelaxedConvert to better fit
existing naming conventions.

Reviewed By: aheejin

Differential Revision: https://reviews.llvm.org/D127157

2 years ago[NFC] Fix spelling error M->L
Chris Bieneman [Tue, 7 Jun 2022 00:38:15 +0000 (19:38 -0500)]
[NFC] Fix spelling error M->L

Clearly I cannot spell...

2 years agoFix big endian build bots
Chris Bieneman [Tue, 7 Jun 2022 00:25:25 +0000 (19:25 -0500)]
Fix big endian build bots

Another case of reading a value from a struct that has been byte
swapped to write out. This should address the failure on the ppcbe bot.
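
The class of bug can be sketched as follows. The struct and helpers are
hypothetical and are not the DXContainer code touched by this commit:

```cpp
#include <cstdint>

// Hypothetical header type for illustration.
struct PartHeader {
  std::uint32_t size;
};

constexpr std::uint32_t byteSwap32(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Returns a copy of the header laid out for the opposite byte order,
// ready to be written out.
constexpr PartHeader swappedForWrite(PartHeader h) {
  return PartHeader{byteSwap32(h.size)};
}

// Reading `size` from the swapped copy yields 0x10000000 instead of 16,
// so any length computed from it is wrong. Read from the original.
static_assert(swappedForWrite(PartHeader{16}).size == 0x10000000u,
              "swapped copy no longer holds the host value");
```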

2 years ago[Object][Archive] Support a new archive member /<ECSYMBOLS>/
Pengxuan Zheng [Sat, 4 Jun 2022 19:22:38 +0000 (12:22 -0700)]
[Object][Archive] Support a new archive member /<ECSYMBOLS>/

Some libraries (e.g., arm64rt.lib) from the Windows WDK (version 10.0.22000.0)
contain an undocumented special member '/<ECSYMBOLS>/'. This causes llvm-lib to
fail with the following error:

"truncated or malformed archive (long name offset characters after the '/' are
not all decimal numbers: '<ECSYMBOLS>/' for archive member header at offset 162)"

The '/<ECSYMBOLS>/' member does not seem to be documented anywhere, but might be
related to the ARM64EC ABI Microsoft announced last year.

https://blogs.windows.com/windowsdeveloper/2021/06/28/announcing-arm64ec-building-native-and-interoperable-apps-for-windows-11-on-arm/
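
A rough sketch of how a reader might tell the special members apart from
long-name offsets. The enum and function are hypothetical and greatly
simplified compared to LLVM's archive parser:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum class MemberKind {
  SymbolTable,    // "/"
  LongNameTable,  // "//"
  ECSymbols,      // "/<ECSYMBOLS>/"
  LongNameOffset, // "/" followed by decimal digits
  Regular,        // anything not starting with '/'
  Malformed
};

MemberKind classifyMemberName(const std::string &name) {
  if (name == "/")
    return MemberKind::SymbolTable;
  if (name == "//")
    return MemberKind::LongNameTable;
  // Recognise the special member before trying to parse an offset.
  if (name == "/<ECSYMBOLS>/")
    return MemberKind::ECSymbols;
  if (name.empty() || name[0] != '/')
    return MemberKind::Regular;
  // Everything after the '/' must be a decimal long-name offset.
  for (std::size_t i = 1; i < name.size(); ++i)
    if (!std::isdigit(static_cast<unsigned char>(name[i])))
      return MemberKind::Malformed;
  return MemberKind::LongNameOffset;
}
```

Without the explicit `/<ECSYMBOLS>/` case, the name would fall through
to the offset check and be rejected, which matches the error quoted
above.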

Reviewed By: thieta, thakis

Differential Revision: https://reviews.llvm.org/D127135

2 years ago[DX][ObjYAML] Support for parsing DXIL part
Chris Bieneman [Thu, 19 May 2022 18:13:56 +0000 (13:13 -0500)]
[DX][ObjYAML] Support for parsing DXIL part

This patch adds support for parsing the DXIL part data into the
ObjectYAML tooling.

The DXIL part has additional headers describing the shader and bitcode
data and stores serialized bitcode after the headers.

Depends on D124945

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D126795

2 years ago[NFC] Remove trailing whitespace
Shilei Tian [Mon, 6 Jun 2022 22:59:13 +0000 (18:59 -0400)]
[NFC] Remove trailing whitespace

2 years agoFix typo in wasm objcopy's only-keep-debug.test
Derek Schuff [Mon, 6 Jun 2022 22:48:12 +0000 (15:48 -0700)]
Fix typo in wasm objcopy's only-keep-debug.test

Leftover from review for https://reviews.llvm.org/D126509#inline-1218795

2 years ago[Driver] add -lresolv for all but Android.
Kevin Athey [Mon, 6 Jun 2022 22:43:00 +0000 (15:43 -0700)]
[Driver] add -lresolv for all but Android.

As there are 3 intercepts that depend on libresolv, link tests in ./configure scripts may be confused by the presence of resolv symbols (e.g. dn_expand) even with -lresolv and get a runtime error.

Android provides the functionality in libc.

https://reviews.llvm.org/D122849
https://reviews.llvm.org/D126851

Reviewed By: eugenis, MaskRay

Differential Revision: https://reviews.llvm.org/D127145

2 years ago[mlir][tosa] Moves constant folding operations out of the Canonicalizer
Georgios Pinitas [Mon, 6 Jun 2022 22:10:08 +0000 (22:10 +0000)]
[mlir][tosa] Moves constant folding operations out of the Canonicalizer

Transpose operations on constant data were getting folded during the
canonicalization process. This has compile time cost proportional to
the constant size. Moving this to a separate pass to enable optionality
and flexibility of how such scenarios can be handled.

Reviewed By: rsuderman, jpienaar, stellaraccident

Differential Revision: https://reviews.llvm.org/D124685

2 years ago[mlir][vector] fix typo in vector unroll transform
Christopher Bate [Mon, 6 Jun 2022 22:08:23 +0000 (16:08 -0600)]
[mlir][vector] fix typo in vector unroll transform

2 years ago[clang] P2266: apply move elision rules on throw expr nested in function prototypes
Matheus Izvekov [Sun, 5 Jun 2022 19:50:00 +0000 (21:50 +0200)]
[clang] P2266: apply move elision rules on throw expr nested in function prototypes

Our rules to determine if the throw expression is within the variable
scope were giving a false negative result when the throw expression
appeared within a decltype in a nested function declaration.

Per P2266R3, the relevant rule is: [expr.prim.id.unqual]/2
```
    if the id-expression (possibly parenthesized) is the operand of a throw-expression, and names an implicitly movable entity that belongs to a scope that does not contain the compound-statement of the innermost lambda-expression, try-block, or function-try-block (if any) whose compound-statement or ctor-initializer encloses the throw-expression.
```

This fixes PR54341.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D127075

2 years ago[mlir][bazel] fix bazel build on VectorTransforms
Mogball [Mon, 6 Jun 2022 21:51:11 +0000 (21:51 +0000)]
[mlir][bazel] fix bazel build on VectorTransforms

2 years ago[libc] Align the new thread stack as required by the target ABI.
Siva Chandra [Mon, 6 Jun 2022 21:44:35 +0000 (14:44 -0700)]
[libc] Align the new thread stack as required by the target ABI.

2 years agoRevert "[AMDGPU] gfx11 vop3dpp instructions"
Joe Nash [Mon, 6 Jun 2022 21:12:09 +0000 (17:12 -0400)]
Revert "[AMDGPU] gfx11 vop3dpp instructions"

This reverts commit 99a83b1286748501e0ccf199a582dc3ec5451ef5.

2 years agoRevert "[AMDGPU] gfx11 VOP1+VOP2 Instruction MC support"
Joe Nash [Mon, 6 Jun 2022 21:05:11 +0000 (17:05 -0400)]
Revert "[AMDGPU] gfx11 VOP1+VOP2 Instruction MC support"

This reverts commit 6079804498be497f52f97d1e3ef398d680b37f79.

2 years ago[RISCV] Add cost model test coverage of scalable reductions
Philip Reames [Mon, 6 Jun 2022 21:32:30 +0000 (14:32 -0700)]
[RISCV] Add cost model test coverage of scalable reductions

2 years ago[BasicTTI] Add missing scalable vector handling
Philip Reames [Mon, 6 Jun 2022 21:20:27 +0000 (14:20 -0700)]
[BasicTTI] Add missing scalable vector handling

BasicTTI needs to return an invalid cost for scalable vectors instead of crashing. Without this, it is impossible to write tests for missing functionality in a target.

2 years ago[WebAssembly] Remove restriction on main name mangling
Sam Clegg [Wed, 10 Jun 2020 22:48:35 +0000 (15:48 -0700)]
[WebAssembly] Remove restriction on main name mangling

Summary: Emscripten now handles/supports this new mode.

Subscribers: dschuff, jgravelle-google, aheejin, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75277

2 years ago[Objcopy][Wasm] Allow selecting known sections by name
Derek Schuff [Mon, 6 Jun 2022 20:40:43 +0000 (13:40 -0700)]
[Objcopy][Wasm] Allow selecting known sections by name

Currently, only custom sections can be selected by operations that use section
names, because only custom sections have explicit names (whereas known sections
have names defined by the spec and only use their indices in the binary format).
This CL makes objcopy use the spec-defined names for these sections, allowing
them to be used in operations such as dumping and removal.
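
The mapping from known section IDs to names could look roughly like
this. The IDs follow the WebAssembly binary format; the exact spelling
and case of the names that objcopy accepts is an assumption here, and
both functions are hypothetical:

```cpp
#include <cstdint>
#include <string>

// Names for known (non-custom) section IDs. Returns an empty string for
// IDs this sketch does not know. The spelling is an assumption.
std::string knownSectionName(std::uint8_t id) {
  switch (id) {
  case 1:  return "type";
  case 2:  return "import";
  case 3:  return "function";
  case 4:  return "table";
  case 5:  return "memory";
  case 6:  return "global";
  case 7:  return "export";
  case 8:  return "start";
  case 9:  return "element";
  case 10: return "code";
  case 11: return "data";
  case 12: return "datacount";
  default: return "";
  }
}

// Resolves a name given on the command line to a known section ID,
// or -1 if the name is not a known section.
int findKnownSection(const std::string &name) {
  for (int id = 1; id <= 12; ++id)
    if (knownSectionName(static_cast<std::uint8_t>(id)) == name)
      return id;
  return -1;
}
```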

This is a prerequisite for fixing
https://github.com/emscripten-core/emscripten/issues/13084

Differential Revision: https://reviews.llvm.org/D126509

2 years ago[ARM] Use llvm::dbgs() to print debug info (NFC)
ksyx [Mon, 6 Jun 2022 20:39:28 +0000 (16:39 -0400)]
[ARM] Use llvm::dbgs() to print debug info (NFC)

For consistency with other parts of code.

Approved by efriedma in differential revision
https://reviews.llvm.org/D127055

2 years ago[gn build] Port b79b2b677256
LLVM GN Syncbot [Mon, 6 Jun 2022 20:32:38 +0000 (20:32 +0000)]
[gn build] Port b79b2b677256

2 years ago[mlir][vector] Allow unroll of contraction in arbitrary order
Christopher Bate [Fri, 3 Jun 2022 22:34:47 +0000 (16:34 -0600)]
[mlir][vector] Allow unroll of contraction in arbitrary order

Adds support for vector unroll transformations to unroll in different
orders. For example, the `vector.contract` can be unrolled into a
smaller set of contractions. There is a choice of how to unroll the
decomposition based on the traversal order of (dim0, dim1, dim2).
The choice of traversal order can now be specified by a callback which
is given by the caller of the transform. For now, only the
`vector.contract` and `vector.transfer_read/transfer_write` operations
support the callback.
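
To illustrate what a traversal order means here, the sketch below
enumerates tile offsets of a 3-D iteration space in an order chosen by a
caller-supplied callback. The names and the callback signature are
hypothetical and do not match the MLIR unrolling options. Tile sizes are
assumed to be positive:

```cpp
#include <array>
#include <functional>
#include <vector>

using Index3 = std::array<int, 3>;
// Returns the dimensions from slowest- to fastest-varying.
using OrderCallback = std::function<Index3()>;

std::vector<Index3> tileOffsets(const Index3 &shape, const Index3 &tile,
                                const OrderCallback &traversalOrder) {
  // Fall back to (dim0, dim1, dim2) when no callback is given.
  const Index3 order = traversalOrder ? traversalOrder() : Index3{{0, 1, 2}};
  const int d0 = order[0], d1 = order[1], d2 = order[2];
  std::vector<Index3> offsets;
  for (int a = 0; a < shape[d0]; a += tile[d0])
    for (int b = 0; b < shape[d1]; b += tile[d1])
      for (int c = 0; c < shape[d2]; c += tile[d2]) {
        Index3 offset{};
        offset[d0] = a;
        offset[d1] = b;
        offset[d2] = c;
        offsets.push_back(offset);
      }
  return offsets;
}
```

The same set of tiles is produced for every order; only the sequence in
which they are visited changes.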

Differential Revision: https://reviews.llvm.org/D127004

2 years ago[libc++] Implement ranges::find_first_of
Nikolas Klauser [Mon, 6 Jun 2022 11:57:34 +0000 (13:57 +0200)]
[libc++] Implement ranges::find_first_of

Reviewed By: Mordante, var-const, #libc

Spies: libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D126529

2 years ago[clang] [MinGW] Default to WinEH (SEH) exception handling instead of Dwarf
Martin Storsjö [Tue, 3 May 2022 14:45:36 +0000 (17:45 +0300)]
[clang] [MinGW] Default to WinEH (SEH) exception handling instead of Dwarf

The relevant runtime libraries have been updated to support this
now.

Differential Revision: https://reviews.llvm.org/D126871

2 years ago[ARM] [MinGW] Default to WinEH exception handling instead of Dwarf
Martin Storsjö [Fri, 13 May 2022 08:45:52 +0000 (11:45 +0300)]
[ARM] [MinGW] Default to WinEH exception handling instead of Dwarf

Switching this target to WinEH also seems to affect the `-windows-itanium`
target.

Differential Revision: https://reviews.llvm.org/D126870

2 years ago[libunwind] Don't store a predecremented PC when using SEH
Martin Storsjö [Wed, 11 May 2022 08:20:38 +0000 (11:20 +0300)]
[libunwind] Don't store a predecremented PC when using SEH

This fixes unwinding in boundary cases on ARM with SEH.

In the case of ARM/Thumb, disp->ControlPc points at the following
instruction, with the thumb bit set. Thus by decrementing 1,
it still points at the next instruction. To achieve the desired
effect of pointing at the previous instruction, one first has to strip
out the thumb bit, then do the decrement by 1 to reach the previous
instruction.
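
A worked example of the arithmetic, with hypothetical helper names that
are not libunwind functions. Suppose the following Thumb instruction
starts at 0x1004, so the reported address is 0x1005:

```cpp
#include <cstdint>

// Naive approach: decrement an address that has the Thumb bit set.
constexpr std::uintptr_t naiveDecrement(std::uintptr_t controlPc) {
  return controlPc - 1;
}

// Strip the Thumb bit first, then step back by one byte.
constexpr std::uintptr_t stripThenDecrement(std::uintptr_t controlPc) {
  return (controlPc & ~static_cast<std::uintptr_t>(1)) - 1;
}

// 0x1005 - 1 = 0x1004: still the start of the following instruction.
static_assert(naiveDecrement(0x1005) == 0x1004, "still the next instruction");
// (0x1005 & ~1) - 1 = 0x1003: inside the previous instruction.
static_assert(stripThenDecrement(0x1005) == 0x1003, "previous instruction");
```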

When libcxxabi looks for call site ranges, it already does
`_Unwind_GetIP(context) - 1` (in `scan_eh_tab` in
libcxxabi/src/cxa_personality.cpp), so we shouldn't do the
corresponding `- 1` multiple times.

In the case of libcxxabi on Thumb, `funcStart` (still in `scan_eh_tab`)
may have the thumb bit set. If the program counter address is
decremented both in libunwind (first removing the thumb bit, then
decremented), and then libcxxabi decrements it further, and compares
with a `funcStart` with the thumb bit set, it could point to one byte
before the start of the call site.

Thus: This modification makes libunwind with SEH work with libcxxabi
on Thumb, in settings where libunwind and libcxxabi worked fine with
Dwarf before.

For existing cases with libunwind with SEH (on x86_64 and aarch64),
this modification doesn't break any of my testcases.

Differential Revision: https://reviews.llvm.org/D126869

2 years ago[libunwind] Remove unused ARM SEH placeholder code
Martin Storsjö [Tue, 10 May 2022 10:02:59 +0000 (13:02 +0300)]
[libunwind] Remove unused ARM SEH placeholder code

There's no such corresponding code for ARM64 (which has been working
in production for years). The SEH version of the Unwind functions
(e.g. `_Unwind_GetLanguageSpecificData`) doesn't use these fields.

The `_Unwind_ForcedUnwind` function would need these bits though,
but that's not used in normal C++ exception unwinding.

Differential Revision: https://reviews.llvm.org/D126868

2 years ago[libunwind] Fix SEH unwinding on ARM
Martin Storsjö [Fri, 6 May 2022 21:26:45 +0000 (00:26 +0300)]
[libunwind] Fix SEH unwinding on ARM

Check `__SEH__` when checking if ARM EHABI should be implied,
similarly to 4a3722a2c3dff1fe885cc38bf43d3c095c9851e7 / D126866.

Fix a warning by using the right format specifier (PRIxPTR instead
of PRIx64), and add a double->float cast in a codepath that hasn't
been built so far.

This is enough to make SEH unwinding of itanium ABI exceptions on
ARM mostly work - one specific issue is fixed in a separate follow-up
patch.

Differential Revision: https://reviews.llvm.org/D126867

2 years ago[builtins] Check __SEH__, when checking if ARM EHABI is implied
Martin Storsjö [Fri, 6 May 2022 21:19:05 +0000 (00:19 +0300)]
[builtins] Check __SEH__, when checking if ARM EHABI is implied

ARM EHABI isn't signalled by any specific compiler builtin define,
but is implied by the lack of defines specifying any other
exception handling mechanism, `__USING_SJLJ_EXCEPTIONS__` or
`__ARM_DWARF_EH__`.

As Windows SEH also can be used for unwinding, check for the
`__SEH__` define too, in the same way.

This is the same change as 4a3722a2c3dff1fe885cc38bf43d3c095c9851e7 /
D126866, applied on the compiler-rt builtins gcc_personality_v0
function.

Differential Revision: https://reviews.llvm.org/D126863

2 years ago[clang] [Headers] Check __SEH__, when checking if ARM EHABI is implied
Martin Storsjö [Fri, 6 May 2022 21:23:34 +0000 (00:23 +0300)]
[clang] [Headers] Check __SEH__, when checking if ARM EHABI is implied

ARM EHABI isn't signalled by any specific compiler builtin define,
but is implied by the lack of defines specifying any other
exception handling mechanism, `__USING_SJLJ_EXCEPTIONS__` or
`__ARM_DWARF_EH__`.

As Windows SEH also can be used for unwinding, check for the
`__SEH__` define too, in the same way.

This is the same change as 4a3722a2c3dff1fe885cc38bf43d3c095c9851e7 /
D126866, applied on the clang headers.

Differential Revision: https://reviews.llvm.org/D126865

2 years ago[libcxxabi] Check __SEH__, when checking if ARM EHABI is implied
Martin Storsjö [Fri, 6 May 2022 21:37:40 +0000 (00:37 +0300)]
[libcxxabi] Check __SEH__, when checking if ARM EHABI is implied

ARM EHABI isn't signalled by any specific compiler builtin define,
but is implied by the lack of defines specifying any other
exception handling mechanism, `__USING_SJLJ_EXCEPTIONS__` or
`__ARM_DWARF_EH__`.

As Windows SEH also can be used for unwinding, check for the
`__SEH__` define too, in the same way.
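
The shape of such a check can be sketched as below. The macro name
`SKETCH_ARM_EHABI` and the `defined(__arm__)` guard are assumptions for
illustration; each project uses its own macro and surrounding
conditions:

```cpp
// ARM EHABI is implied only when no other exception handling mechanism
// is announced by the compiler, now including Windows SEH.
#if defined(__arm__) && !defined(__USING_SJLJ_EXCEPTIONS__) && \
    !defined(__ARM_DWARF_EH__) && !defined(__SEH__)
#define SKETCH_ARM_EHABI 1
#else
#define SKETCH_ARM_EHABI 0
#endif

constexpr bool usesArmEhabi() { return SKETCH_ARM_EHABI != 0; }
```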

Differential Revision: https://reviews.llvm.org/D126866

2 years ago[libcxx] Omit dllimport in public headers in MinGW mode
Martin Storsjö [Sat, 9 Apr 2022 21:32:03 +0000 (00:32 +0300)]
[libcxx] Omit dllimport in public headers in MinGW mode

In MinGW environments, thanks to slightly different code generation
and linker tricks, it's possible to link against a DLL C++ standard
library without dllimport attributes.

This allows using one single set of headers for linking against
either the DLL or a static library, leaving the decision entirely
up to the linking stage (where it can be switched with options like
-static-libstdc++).

This matches how libstdc++ headers work; there's no dllimport attributes
by default (unless the user has defined _GLIBCXX_DLL when including
headers).

This allows using one single set of headers while linking against
either a DLL or a static library, just like on Unix platforms.

This matches how libc++ has been used in MinGW configurations for
years (by first building the DLL, then configuring a static-only
build and installing on top, overwriting the libc++ config file
with one for static linking) by multiple MinGW toolchains, making
the dllimport-less use the de-facto tested configuration in the wild.

This also allows building all of libc++ in one single CMake
configuration, instead of having to do two separate builds on top of
each other.

(Linking against a DLL without dllimport can break if e.g. templates
use inconsistent visibility attributes - in cases where it still
works when using explicit dllimport; such a case was fixed in
948dd664c3ed30dd853df03cb931436f280bad4a / D99932. With this as the
default configuration, we can catch such issues in CI.)

Differential Revision: https://reviews.llvm.org/D125924

2 years ago[libcxx] [test] Don't use header defines for detecting linking against a DLL
Martin Storsjö [Sun, 10 Apr 2022 21:32:11 +0000 (00:32 +0300)]
[libcxx] [test] Don't use header defines for detecting linking against a DLL

In clang-cl/MSVC environments, linking against a DLL C++ standard
library requires having dllimport attributes in the headers; this
has been used for detecting whether the tests link against a DLL,
by looking at the libc++ specific define
_LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS.

In mingw environments, thanks to slightly different code generation
and a couple linker tricks, it's possible to link against a DLL C++
standard library without dllimport attributes. Therefore, don't
rely on the libc++ specific header define for the detection.

Replace the detection with a runtime test.

Differential Revision: https://reviews.llvm.org/D125922

2 years ago[mlir] Add documentation for PDLL LSP features and setup
River Riddle [Sun, 15 May 2022 22:23:35 +0000 (15:23 -0700)]
[mlir] Add documentation for PDLL LSP features and setup

This commit beefs up the documentation for MLIR language servers by
adding proper documentations/examples/etc for the provided PDLL
language server capabilities. Given that this documentation is also used
for the vscode extension, this commit also updates the user facing vscode
extension documentation.

Note that the images referenced in the new documentation are hosted on
the website, and will be committed to mlir-www shortly after this commit
lands.

Differential Revision: https://reviews.llvm.org/D125650

2 years ago[DirectX] Embed DXIL in LLVM Module
Chris Bieneman [Tue, 10 May 2022 19:58:01 +0000 (14:58 -0500)]
[DirectX] Embed DXIL in LLVM Module

At the end of the codegen pipeline for DXIL we will emit the DXIL into
a global variable in the Module annotated for the "DXIL" section.

This will be used by the MCDXContainerStreamer to emit the DXIL into a
DXContainer DXIL part.

Other parts of the DXContainer will be constructed similarly by
serializing their values into GlobalVariables.

This will allow DXIL to flow into DXContainers through the normal
MCStreamer flow used in the MC layer.

Depends on D122270

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D125334

2 years ago[NFC] Change lit test for print-changed=dot-cfg to use regular expression
Jamie Schmeiser [Mon, 6 Jun 2022 19:51:48 +0000 (15:51 -0400)]
[NFC] Change lit test for print-changed=dot-cfg to use regular expression

Summary:

Issue 55761:
Change the lit test for print-changed=dot-cfg to have a regular expression
for the template arguments portion of the name for a pass manager pass.
This part of the name can change because it is based on the name provided
by the compiler, which is implementation-dependent. This mimics the
other change printer tests.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: mgorny (Michal Gorny)
Differential Revision: https://reviews.llvm.org/D126876

2 years ago[ModuloSchedule] Fix terminator update when peeling.
Hendrik Greving [Wed, 25 May 2022 15:18:55 +0000 (15:18 +0000)]
[ModuloSchedule] Fix terminator update when peeling.

Fixes a bug where the terminator of the loop's preheader was not updated
correctly if multiple terminating branch instructions are present.

This is tested through existing tests. The bug itself is hard or
impossible to expose with the upstream Hexagon backend, because
the machine pipeliner checks for an existing preheader, which is
defined as a block with only 1 edge into the header.

The condition of this bug is a block into the loop with more than 1
edge, and not every downstream target checks for an existing preheader.

Differential Revision: https://reviews.llvm.org/D126386

2 years ago[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in
Akira Hatanaka [Tue, 17 May 2022 16:37:29 +0000 (09:37 -0700)]
[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in
DebugTypeVisitor

Differential Revision: https://reviews.llvm.org/D125839

2 years ago[RISCV] Autogen a test for ease of update
Philip Reames [Mon, 6 Jun 2022 19:44:05 +0000 (12:44 -0700)]
[RISCV] Autogen a test for ease of update

2 years ago[docs] Fix style and typo in HowToSetUpLLVMStyleRTTI.rst after D126943
Fangrui Song [Mon, 6 Jun 2022 19:41:21 +0000 (12:41 -0700)]
[docs] Fix style and typo in HowToSetUpLLVMStyleRTTI.rst after D126943

2 years ago[mlir][linalg] add conv_2d_nhwc_fhwc named op
Christopher Bate [Fri, 3 Jun 2022 02:06:24 +0000 (20:06 -0600)]
[mlir][linalg] add conv_2d_nhwc_fhwc named op

This operation should be supported as a named op because
when the operands are viewed as having canonical layouts
with decreasing strides, then the "reduction" dimensions
of the filter (h, w, and c) are contiguous relative to each
output channel. When lowered to a matrix multiplication,
this layout is the simplest to deal with, and thus future
transforms/vectorizations of `conv2d` may find using this
named op convenient.

Differential Revision: https://reviews.llvm.org/D126995

2 years ago[lld-macho] Demangle symbol names in duplicate-symbol error when -demangle is specified
Vy Nguyen [Fri, 3 Jun 2022 20:30:06 +0000 (16:30 -0400)]
[lld-macho] Demangle symbol names in duplicate-symbol error when -demangle is specified

Differential Revision: https://reviews.llvm.org/D127110

2 years ago[BPF] Add BTF 64bit enum value support
Yonghong Song [Wed, 20 Apr 2022 22:07:08 +0000 (15:07 -0700)]
[BPF] Add BTF 64bit enum value support

Currently, BTF only supports 32-bit values. For example,
  enum T { VAL = 0xffffFFFF00000008 };
the generated BTF looks like
        .long   16                              # BTF_KIND_ENUM(id = 4)
        .long   100663297                       # 0x6000001
        .long   8
        .long   18
        .long   8
The encoded value is 8, which equals (uint32_t)0xffffFFFF00000008,
and this is incorrect.

This patch introduces BTF_KIND_ENUM64, which permits encoding
64-bit values. The format for each enumerator looks like:
        .long   name_offset
        .long   (uint32_t)value # lower-32 bit value
        .long   value >> 32     # high-32 bit value

We use two 32-bit values to represent a 64-bit value because the current
BTF type subsection has 4-byte alignment and gaps are not permitted
in the subsection.

This patch also added support for kflag (the bit 31 of CommonType.Info)
such that kflag = 1 implies the value is signed and kflag = 0
implies the value is unsigned. The kernel UAPI enumerator definition is
  struct btf_enum {
        __u32   name_off;
        __s32   val;
  };
so kflag = 0 with an unsigned value provides backward compatibility.

With this patch, for
  enum T { VAL = 0xffffFFFF00000008 };
the generated BTF looks like
        .long   16                              # BTF_KIND_ENUM64(id = 4)
        .long   318767105                       # 0x13000001
        .long   8
        .long   18
        .long   8                               # 0x8
        .long   4294967295                      # 0xffffffff
and the enumerator value and signedness are encoded correctly.
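
The encoding described above can be sketched as follows. This is only an
illustration, not the code from the patch: the names Enum64Words,
splitEnum64, joinEnum64 and makeInfo are hypothetical, and the positions of
the kind (bits 24-28) and vlen (bits 0-15) fields in the info word are an
assumption taken from the kernel's BTF layout rather than from this commit
message. Only the kflag position (bit 31) is stated here.

```cpp
#include <cstdint>

// Hypothetical helper type: the two 32-bit words that follow name_offset
// in a BTF_KIND_ENUM64 enumerator record.
struct Enum64Words {
  std::uint32_t lo32; // (uint32_t)value
  std::uint32_t hi32; // value >> 32
};

// Split a 64-bit enumerator value into the two 32-bit words that are emitted.
constexpr Enum64Words splitEnum64(std::uint64_t value) {
  return {static_cast<std::uint32_t>(value),
          static_cast<std::uint32_t>(value >> 32)};
}

// Reassemble the 64-bit value, as a consumer of the section would.
constexpr std::uint64_t joinEnum64(Enum64Words w) {
  return (static_cast<std::uint64_t>(w.hi32) << 32) | w.lo32;
}

// Build the info word. Assumed layout: vlen in bits 0-15, kind in
// bits 24-28, kflag (1 = signed, 0 = unsigned) in bit 31.
constexpr std::uint32_t makeInfo(std::uint32_t kind, std::uint32_t vlen,
                                 bool kflag) {
  return (kflag ? 0x80000000u : 0u) | ((kind & 0x1fu) << 24) |
         (vlen & 0xffffu);
}
```

For VAL = 0xffffFFFF00000008, splitEnum64 yields the words 8 and 0xffffffff
shown in the listing, and makeInfo(19, 1, false) yields 0x13000001, assuming
19 is the numeric value of BTF_KIND_ENUM64.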

Differential Revision: https://reviews.llvm.org/D124641

2 years ago[gn build] Port 352c395fb685
LLVM GN Syncbot [Mon, 6 Jun 2022 18:25:14 +0000 (18:25 +0000)]
[gn build] Port 352c395fb685

2 years ago[ObjectYAML][DX] Add dxcontainer2yaml support
Chris Bieneman [Tue, 3 May 2022 18:17:15 +0000 (13:17 -0500)]
[ObjectYAML][DX] Add dxcontainer2yaml support

This change finishes fleshing out the ObjectYAML tools to support
converting DXContainer files into yaml representations.

Depends on D124944

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D124945

2 years ago[GISel] Add new combines for G_ADD
Michael Kitzan [Wed, 18 May 2022 21:41:48 +0000 (14:41 -0700)]
[GISel] Add new combines for G_ADD

Patch adds new GICombineRules for G_ADD:

G_ADD(x, G_SUB(y, x)) -> y
G_ADD(G_SUB(y, x), x) -> y

Patch additionally adds new combine tests for AArch64 target for
these new rules.
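
As a sanity check on why these rewrites are legal, here is a scalar model
of the two patterns using wrapping 32-bit arithmetic. It is a sketch of the
algebraic identity only, not of the GICombineRule itself, and the function
names are hypothetical.

```cpp
#include <cstdint>

// Scalar model of G_ADD(x, G_SUB(y, x)) on 32-bit values.
// Unsigned arithmetic wraps modulo 2^32, matching G_ADD/G_SUB semantics.
constexpr std::uint32_t addOfSub(std::uint32_t x, std::uint32_t y) {
  return static_cast<std::uint32_t>(x + static_cast<std::uint32_t>(y - x));
}

// Scalar model of the commuted form G_ADD(G_SUB(y, x), x).
constexpr std::uint32_t subThenAdd(std::uint32_t x, std::uint32_t y) {
  return static_cast<std::uint32_t>(static_cast<std::uint32_t>(y - x) + x);
}
```

Both functions return y for every x, including cases where the
intermediate subtraction wraps, which is why the combine needs no overflow
precondition.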

Reviewed by: paquette

Differential Revision: https://reviews.llvm.org/D87936

2 years ago[mlir][linalg] fix crash when promoting rank-reducing memref.subviews
Christopher Bate [Mon, 6 Jun 2022 04:54:11 +0000 (22:54 -0600)]
[mlir][linalg] fix crash when promoting rank-reducing memref.subviews

This change adds support for promoting `linalg` operation operands that
are produced by rank-reducing `memref.subview` ops.

Differential Revision: https://reviews.llvm.org/D127086

2 years ago[libc++][NFC] Move span tests under views.span
Louis Dionne [Mon, 6 Jun 2022 18:02:41 +0000 (14:02 -0400)]
[libc++][NFC] Move span tests under views.span

2 years ago[libc++][NFC] Fix outdated comment in span test
Louis Dionne [Mon, 6 Jun 2022 18:02:05 +0000 (14:02 -0400)]
[libc++][NFC] Fix outdated comment in span test

2 years ago[libc] Fix cmake compatibility issue with list(POP_FRONT).
Tue Ly [Mon, 6 Jun 2022 17:30:13 +0000 (13:30 -0400)]
[libc] Fix cmake compatibility issue with list(POP_FRONT).

list(POP_FRONT) was only added to CMake in 3.15, while our baseline
version is 3.13.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D127129

2 years ago[RISCV] Reorganize getShuffleCost to make it more clear what's going on [nfc]
Philip Reames [Mon, 6 Jun 2022 17:11:23 +0000 (10:11 -0700)]
[RISCV] Reorganize getShuffleCost to make it more clear what's going on [nfc]

2 years ago[SLP][NFC] Precommit test for followup patch that fixes vector phi poison input.
Vasileios Porpodas [Fri, 3 Jun 2022 02:01:57 +0000 (19:01 -0700)]
[SLP][NFC] Precommit test for followup patch that fixes vector phi poison input.

Differential Revision: https://reviews.llvm.org/D126938

2 years ago[SelectionDAG] Further improve computeKnownBits for (smax X, C) where C is non-negative.
Craig Topper [Mon, 6 Jun 2022 16:45:52 +0000 (09:45 -0700)]
[SelectionDAG] Further improve computeKnownBits for (smax X, C) where C is non-negative.

Move the code that was added for D126896 after the normal recursive calls
to computeKnownBits. This allows us to calculate trailing zeros.
Previously we would break out of the switch before the recursive calls.

2 years ago[libc++][NFC] Add missing includes
Louis Dionne [Mon, 6 Jun 2022 16:57:42 +0000 (12:57 -0400)]
[libc++][NFC] Add missing includes

2 years ago[libc++] Avoid creating temporaries in unary expressions involving valarray
Louis Dionne [Thu, 5 May 2022 16:24:43 +0000 (12:24 -0400)]
[libc++] Avoid creating temporaries in unary expressions involving valarray

Currently, unary expressions involving valarray will create a temporary.
This leads to dangling references in expressions like `-a * b`, because
`-a` is a temporary and the resulting expression will refer to it. This
patch fixes the problem by creating a lazy expression to perform the unary
operation instead of eagerly creating a temporary valarray. This is
permitted by the Standard, which does not specify the exact type of
most expressions involving valarrays.

This is technically an ABI break, however I believe the actual potential
for breakage is very low.

rdar://90152242

Differential Revision: https://reviews.llvm.org/D125019

2 years agoSupport converting pointers from opaque to typed
Chris Bieneman [Tue, 24 May 2022 16:56:12 +0000 (11:56 -0500)]
Support converting pointers from opaque to typed

Using the pointer type analysis we can re-constitute typed pointers and
populate the correct types in the bitcasts throughout the IR.

This doesn't yet handle all cases, but this should be illustrative as
to the direction and feasibility of the solution.

Reviewed By: pete

Differential Revision: https://reviews.llvm.org/D122270

2 years agoFix overflow bug impacting 32-bit testing
Chris Bieneman [Mon, 6 Jun 2022 16:16:16 +0000 (11:16 -0500)]
Fix overflow bug impacting 32-bit testing

This test was failing on 32-bit arm builders due to an integer
overflow. This changes the math to avoid overflow and should resolve
the test failure.

2 years ago[AMDGPU][GFX9+] Support base+soffset+offset s_atc_probe's.
Ivan Kosarev [Mon, 6 Jun 2022 15:46:22 +0000 (16:46 +0100)]
[AMDGPU][GFX9+] Support base+soffset+offset s_atc_probe's.

Resolves part of
https://github.com/llvm/llvm-project/issues/38652

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D126791

2 years ago[InstCombine] add/move tests for opposite direction shifts; NFC
Sanjay Patel [Mon, 6 Jun 2022 15:13:17 +0000 (11:13 -0400)]
[InstCombine] add/move tests for opposite direction shifts; NFC

2 years ago[AMDGPU][GFX9][GFX10] Support base+soffset+offset s_dcache_discard's.
Ivan Kosarev [Mon, 6 Jun 2022 15:32:16 +0000 (16:32 +0100)]
[AMDGPU][GFX9][GFX10] Support base+soffset+offset s_dcache_discard's.

Resolves part of
https://github.com/llvm/llvm-project/issues/38652

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D126766

2 years ago[AMDGPU][GFX8][DOC][NFC] Update assembler syntax description
Dmitry Preobrazhensky [Mon, 6 Jun 2022 13:08:16 +0000 (16:08 +0300)]
[AMDGPU][GFX8][DOC][NFC] Update assembler syntax description

Summary of changes:
- Updated MUBUF lds syntax (see https://reviews.llvm.org/D124485).
- Enabled literals with src0 for v_madak*, v_madmk* (see https://reviews.llvm.org/D111067).
- Minor bug fixing.

2 years agoDon't warn when 'llvm' isn't found
Alex Brachet [Mon, 6 Jun 2022 14:29:14 +0000 (14:29 +0000)]
Don't warn when 'llvm' isn't found

2 years ago[AMDGPU] gfx11 VOP1+VOP2 Instruction MC support
Joe Nash [Mon, 23 May 2022 14:26:02 +0000 (10:26 -0400)]
[AMDGPU] gfx11 VOP1+VOP2 Instruction MC support

Includes dpp instructions and vop1/vop2 promoted to vop3

Patch 17/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126483

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126917

2 years ago[AMDGPU] gfx11 vop3dpp instructions
Joe Nash [Wed, 18 May 2022 19:01:20 +0000 (15:01 -0400)]
[AMDGPU] gfx11 vop3dpp instructions

gfx11 adds the ability to use dpp modifiers on vop3 instructions.
This patch adds machine code layer support for that. The MCCodeEmitter
is changed to use APInt instead of uint64_t to support these wider
instructions.

Patch 16/N for upstreaming of AMDGPU gfx11 architecture

Depends on D126475

Reviewed By: rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D126483

2 years ago[libc++] Remove a bunch of conditionals on _LIBCPP_DEBUG_LEVEL
Louis Dionne [Fri, 3 Jun 2022 19:17:03 +0000 (15:17 -0400)]
[libc++] Remove a bunch of conditionals on _LIBCPP_DEBUG_LEVEL

Instead of providing two different constructors for iterators that
support the debug mode, provide a single constructor but leave the
container parameter unused when the debug mode is not enabled.

This allows simplifying all the call sites to unconditionally pass
the container, which removes a bunch of duplication in the container's
implementation.

Note that this patch does add some complexity to std::span, however
that is only because std::span has the ability to use raw pointers
as iterators instead of __wrap_iter. In retrospect, I believe it was
a mistake to provide that capability, and so it will be removed in a
future patch, along with the complexity added by this patch.

Differential Revision: https://reviews.llvm.org/D126993

2 years ago[IPSCCP] Switch away from Instruction::isSafeToRemove()
Kevin P. Neal [Fri, 3 Jun 2022 18:28:02 +0000 (14:28 -0400)]
[IPSCCP] Switch away from Instruction::isSafeToRemove()

In D115737 I found that I needed to teach Instruction::isSafeToRemove()
about strictfp/constrained intrinsics. It was pointed out that this is
probably the wrong function to use and that isInstructionTriviallyDead()
should be used instead. It doesn't make sense to have a "second, worse
implementation".

I also believe that the Instruction class is the wrong place for this
functionality. The information about whether or not an instruction can be
removed is in the transform passes and should stay there.

Differential Revision: https://reviews.llvm.org/D118387

2 years ago[flang][driver] Remove references to the `flang` bash script
Andrzej Warzynski [Mon, 6 Jun 2022 10:03:21 +0000 (10:03 +0000)]
[flang][driver] Remove references to the `flang` bash script

This is a follow-up of https://reviews.llvm.org/D125832
(see also https://reviews.llvm.org/D125788 for more context). It simply
removes any remaining references to the `flang` bash script. Note that
`flang-to-external-fc` remains intact.

This felt worthwhile mentioning in the release notes, which have not
been updated since LLVM 12 (we are approaching LLVM 15 now). I took the
liberty of removing all of the out-dated content and added a note about
the renaming.

Differential Revision: https://reviews.llvm.org/D127094

2 years ago[AMDGPU][GFX7][DOC][NFC] Update assembler syntax description
Dmitry Preobrazhensky [Mon, 6 Jun 2022 12:50:10 +0000 (15:50 +0300)]
[AMDGPU][GFX7][DOC][NFC] Update assembler syntax description

Summary of changes:
- Updated MUBUF lds syntax (see https://reviews.llvm.org/D124485).
- Enabled literals with src0 of v_madak_f32, v_madmk_f32 (see https://reviews.llvm.org/D111067).
- Corrected LGKM_CNT description.
- Minor bug fixing.

2 years agoFix a -Wlogical-op-parentheses warning; NFC
Aaron Ballman [Mon, 6 Jun 2022 11:51:24 +0000 (07:51 -0400)]
Fix a -Wlogical-op-parentheses warning; NFC

This should address bot failures like:
https://lab.llvm.org/buildbot/#/builders/77/builds/18317

2 years ago[gn build] Port 8171586176ee
LLVM GN Syncbot [Mon, 6 Jun 2022 11:33:45 +0000 (11:33 +0000)]
[gn build] Port 8171586176ee

2 years ago[libc++][ranges] Implement ranges::binary_search and ranges::{lower, upper}_bound
Nikolas Klauser [Sun, 5 Jun 2022 19:15:16 +0000 (21:15 +0200)]
[libc++][ranges] Implement ranges::binary_search and ranges::{lower, upper}_bound

Reviewed By: Mordante, var-const, ldionne, #libc

Spies: sstefan1, libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D121964

2 years agoAllow use of an elaborated type specifier in a _Generic association in C++
Aaron Ballman [Mon, 6 Jun 2022 11:17:35 +0000 (07:17 -0400)]
Allow use of an elaborated type specifier in a _Generic association in C++

Currently, Clang accepts this code in C mode (where the tag is required
to be used) but rejects it in C++ mode thinking that the association is
defining a new type.

void foo(void) {
  struct S { int a; };
  _Generic(something, struct S : 1);
}
Clang thinks this in C++ because it sees struct S : when parsing the
class specifier and decides that must be a type definition (because the
colon signifies the presence of a base class type). This patch adds a
new declarator context to represent a _Generic association so that we
can distinguish these situations properly.

Fixes #55562

Differential Revision: https://reviews.llvm.org/D126969

2 years ago[AArch64] Generate ADDP from shuffled add
David Green [Mon, 6 Jun 2022 10:39:51 +0000 (11:39 +0100)]
[AArch64] Generate ADDP from shuffled add

This adds a fold of add(x, shuffle(x, <1,0,3,2,5,4,...>)) into
shuffle(addp(x), <0,0,1,1,2,2,...>). The ADDP instruction takes two
vectors and returns one, adding adjacent pairs. So we match x in a
custom combine as it is lowered from a v8i32. The original code
would be 2 rev64 and 2 add, with the new code being a single addp
with a zip1;zip2 shuffle, producing smaller code.
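
The equivalence behind this fold can be checked with a small scalar model.
This is a sketch under simplifying assumptions, not the combine itself: V8
is a hypothetical stand-in for a v8i32 value, and the pairwise add is
modelled over all eight lanes at once rather than over the two split
halves that the real ADDP instruction takes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a v8i32 value.
struct V8 {
  std::uint32_t lane[8];
};

// Original pattern: add(x, shuffle(x, <1,0,3,2,5,4,7,6>)).
// Lane i is added to its neighbour within the same adjacent pair.
constexpr V8 addSwappedPairs(V8 x) {
  V8 r{};
  for (std::size_t i = 0; i < 8; ++i)
    r.lane[i] = x.lane[i] + x.lane[i ^ 1];
  return r;
}

// Folded pattern: shuffle(addp(x), <0,0,1,1,2,2,3,3>).
// The pairwise add produces four sums, each of which is then duplicated.
constexpr V8 addpThenDuplicate(V8 x) {
  std::uint32_t pairSum[4] = {};
  for (std::size_t i = 0; i < 4; ++i)
    pairSum[i] = x.lane[2 * i] + x.lane[2 * i + 1];
  V8 r{};
  for (std::size_t i = 0; i < 8; ++i)
    r.lane[i] = pairSum[i / 2];
  return r;
}

// Lane-wise equality, used to compare the two forms.
constexpr bool sameLanes(V8 a, V8 b) {
  for (std::size_t i = 0; i < 8; ++i)
    if (a.lane[i] != b.lane[i])
      return false;
  return true;
}
```

Both functions produce the same lanes for any input, since each output
lane is the sum of the adjacent pair it belongs to.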

Differential Revision: https://reviews.llvm.org/D126686

2 years agoFix "not all control paths return a value" MSVC warning. NFC.
Simon Pilgrim [Mon, 6 Jun 2022 10:31:46 +0000 (11:31 +0100)]
Fix "not all control paths return a value" MSVC warning. NFC.

2 years ago[gn build] port f06abbb39380 a bit (create main() functions for GENERATE_DRIVER targets)
Nico Weber [Mon, 6 Jun 2022 10:18:13 +0000 (06:18 -0400)]
[gn build] port f06abbb39380 a bit (create main() functions for GENERATE_DRIVER targets)