platform/upstream/llvm.git
3 years ago[libc++] NFC: Normalize `#endif //` comment indentation
Louis Dionne [Tue, 20 Apr 2021 16:03:32 +0000 (12:03 -0400)]
[libc++] NFC: Normalize `#endif //` comment indentation

3 years agoGlobalISel: Defer register creation in handleAssignments
Matt Arsenault [Tue, 13 Apr 2021 01:40:23 +0000 (21:40 -0400)]
GlobalISel: Defer register creation in handleAssignments

This is currently built on top of the SelectionDAG call lowering, but
does not use it the same way. SelectionDAG passes legalized types to
the assignment functions, and the TableGen-generated assignment functions
may change the value types expected for registers. This does not
change the types used, just moves the register creation to help fix
this in the future.

Defer the register creation until after all of the assignment
decisions have been made. This will also help have correct tail call
compatibility checking in a future change. Currently it does not work
as expected for any arguments split across multiple registers.

3 years ago[AMDGPU] Allow multiple uses of the same literal
Jay Foad [Thu, 8 Apr 2021 11:58:49 +0000 (12:58 +0100)]
[AMDGPU] Allow multiple uses of the same literal

In GFX10 VOP3 can have a literal, which opens up the possibility of two
operands using the same literal value, which is allowed and only counts
as one use of the constant bus.
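
As a rough illustration of the counting rule described above, the sketch below counts each distinct literal value once. It is not the actual verifier logic: the function name and the operand encoding (Python ints for literals, strings for registers) are hypothetical.

```python
def count_literal_bus_uses(operands):
    """Count constant-bus uses contributed by literal operands.

    Hypothetical model: ints are literals, strings are registers.
    Repeated uses of the same literal value are counted once.
    """
    literals = {op for op in operands if isinstance(op, int)}
    return len(literals)


# Two operands sharing one literal versus two different literals.
same = count_literal_bus_uses(["v0", 0x12345678, 0x12345678])
different = count_literal_bus_uses(["v0", 0x12345678, 0x9ABCDEF0])
```

Under this model the first instruction uses the constant bus once for its literal, while the second uses it twice.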

AMDGPUAsmParser::validateConstantBusLimitations already knew about this
but SIInstrInfo::verifyInstruction did not.

Differential Revision: https://reviews.llvm.org/D100770

3 years ago[AArch64] Bump apple-latest CPU alias to apple-a14.
Ahmed Bougacha [Fri, 16 Apr 2021 20:54:13 +0000 (13:54 -0700)]
[AArch64] Bump apple-latest CPU alias to apple-a14.

3 years ago[AArch64] Don't always override CPU for arm64e.
Ahmed Bougacha [Fri, 16 Apr 2021 20:59:08 +0000 (13:59 -0700)]
[AArch64] Don't always override CPU for arm64e.

This demotes the apple-a12 CPU selection for arm64e to just be the
last-resort default.  Concretely, this means:
- an explicitly-specified -mcpu will override the arm64e default;
  a user could potentially pick an invalid CPU that doesn't have
  v8.3a support, but that's not a major problem anymore
- arm64e-apple-macos (and variants) will pick apple-m1 instead of
  being forced to apple-a12.

3 years ago[AArch64] Add apple-m1 CPU, and default to it for macOS.
Ahmed Bougacha [Thu, 15 Apr 2021 02:34:55 +0000 (19:34 -0700)]
[AArch64] Add apple-m1 CPU, and default to it for macOS.

apple-m1 has the same level of ISA support as apple-a14,
so this is a straightforward mechanical change.  However, that
also means this inherits apple-a14's v8.5a+nobti quirkiness.

rdar://68287159

3 years ago[gn build] Port 120fa8293e22
LLVM GN Syncbot [Tue, 20 Apr 2021 15:33:43 +0000 (15:33 +0000)]
[gn build] Port 120fa8293e22

3 years ago[libc++][nfc] Move iterator_traits and related into __iterator/iterator_traits.h.
zoecarver [Mon, 19 Apr 2021 21:44:42 +0000 (14:44 -0700)]
[libc++][nfc] Move iterator_traits and related into __iterator/iterator_traits.h.

Based on D100682 and D99855.

(Note: I originally was going to just make this part of D99855, but I decided not to because this patch moves lots of unrelated code around, and I didn't want to make D99855 harder to review because of unrelated code-changes/moves.)

Differential Revision: https://reviews.llvm.org/D100686

3 years agoGlobalISel: Check for powers of 2 for inverse funnel shift lowering
Matt Arsenault [Mon, 29 Mar 2021 21:26:49 +0000 (17:26 -0400)]
GlobalISel: Check for powers of 2 for inverse funnel shift lowering

This doesn't make a practical difference since it would only be broken
if a target actually had a legal non-power-of-2 inverse shift.
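
One way to see why the bit width matters: rewriting fshl as fshr with a negated amount relies on the negation wrapping consistently modulo the bit width. The sketch below models that with plain integers. The 8-bit shift-amount type (`amt_bits`) and the helper names are assumptions made for illustration, and the rewrite is only exercised for amounts that are non-zero modulo the bit width.

```python
def fshl(x, y, z, bw):
    """Funnel shift left: high bw bits of (x:y) << (z mod bw)."""
    mask = (1 << bw) - 1
    s = z % bw
    return ((((x << bw) | y) << s) >> bw) & mask


def fshr(x, y, z, bw):
    """Funnel shift right: low bw bits of (x:y) >> (z mod bw)."""
    mask = (1 << bw) - 1
    s = z % bw
    return (((x << bw) | y) >> s) & mask


def fshl_via_fshr(x, y, z, bw, amt_bits=8):
    """Rewrite fshl as fshr with a negated shift amount.

    The negation wraps in a fixed-width amount type (amt_bits wide, an
    assumption of this sketch). Reducing the wrapped value modulo bw gives
    bw - (z mod bw) only when bw divides 2**amt_bits, i.e. bw is a power of 2.
    """
    neg = (-z) % (1 << amt_bits)
    return fshr(x, y, neg, bw)


# Power-of-2 width: the rewrite agrees with the direct form.
pow2_ok = all(
    fshl(0xB1, 0x2C, z, 8) == fshl_via_fshr(0xB1, 0x2C, z, 8) for z in range(1, 8)
)
# Non-power-of-2 width: the same rewrite gives a different result.
non_pow2_differs = fshl(44, 11, 1, 6) != fshl_via_fshr(44, 11, 1, 6)
```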

3 years ago[libcxx] makes `iterator_traits` C++20-aware
zoecarver [Tue, 20 Apr 2021 12:50:11 +0000 (08:50 -0400)]
[libcxx] makes `iterator_traits` C++20-aware

* adds `iterator_traits` specialisation that supports all expected
  member aliases except for `pointer`
* adds `iterator_traits` specialisations for iterators that meet the
  legacy iterator requirements but might lack multiple member aliases
* makes pointer `iterator_traits` specialisation require objects

Depends on D99854.

Differential Revision: https://reviews.llvm.org/D99855

3 years agoRevert "[SLP] Add detection of shuffled/perfect matching of tree entries."
Alexey Bataev [Tue, 20 Apr 2021 15:29:07 +0000 (08:29 -0700)]
Revert "[SLP] Add detection of shuffled/perfect matching of tree entries."

This reverts commit daf6e18c55c2ac56bbf0f9de233fb2a1150ee331 to fix the
compiler crash.

3 years ago[ARM] Limit PerformExtractEltToVMOVRRD to when f64 is legal.
David Green [Tue, 20 Apr 2021 15:24:36 +0000 (16:24 +0100)]
[ARM] Limit PerformExtractEltToVMOVRRD to when f64 is legal.

The generic SoftFloatVectorExtract.ll test was failing when run on arm
machines, as it tries to create a f64 under soft float. Limit the
transform to when f64 is legal.

Also add a missing override, as reported in D100244.

3 years agoAMDGPU/GlobalISel: Fix uitofp/sitofp with non-power-of-2 integers
Matt Arsenault [Sat, 27 Mar 2021 15:14:15 +0000 (11:14 -0400)]
AMDGPU/GlobalISel: Fix uitofp/sitofp with non-power-of-2 integers

3 years agoEnsure target-multiversioning emits deferred declarations
Erich Keane [Tue, 20 Apr 2021 14:35:57 +0000 (07:35 -0700)]
Ensure target-multiversioning emits deferred declarations

As reported in PR50025, sometimes we would end up not emitting functions
needed by inline multiversioned variants. This is because we typically
use the 'deferred decl' mechanism to emit these.  However, the variants
are emitted after that typically happens.  This fixes that by ensuring
we re-run deferred decls after this happens. Also, the multiversion
emission is done recursively to ensure that MV functions that require
other MV functions to be emitted get emitted.

3 years agoGlobalISel: Restrict narrow scalar for fptoui/fptosi results
Matt Arsenault [Fri, 26 Mar 2021 21:29:36 +0000 (17:29 -0400)]
GlobalISel: Restrict narrow scalar for fptoui/fptosi results

This practically only works for the f16 case AMDGPU uses, not wider
types.

Fixes bug 49710 by failing legalization.

3 years agoMachineVerifier: Continue reporting errors for copies
Matt Arsenault [Wed, 31 Mar 2021 20:45:19 +0000 (16:45 -0400)]
MachineVerifier: Continue reporting errors for copies

This was skipping verification of later copies, but generally the
verifier tries to report as many things wrong as possible in the
function.

3 years ago[SLP] Add detection of shuffled/perfect matching of tree entries.
Alexey Bataev [Tue, 20 Apr 2021 14:27:32 +0000 (07:27 -0700)]
[SLP] Add detection of shuffled/perfect matching of tree entries.

SLP supports perfect diamond matching for the vectorized tree entries
but does not support it for gathered entries and does not support
non-perfect (shuffled) matching with 1 or 2 tree entries. Patch adds
support for this matching to improve cost of the vectorized tree.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D100495

3 years ago[mlir][StandardToSPIRV] Add support for lowering std.xor on bool to SPIR-V
Hanhan Wang [Tue, 20 Apr 2021 14:34:32 +0000 (07:34 -0700)]
[mlir][StandardToSPIRV] Add support for lowering std.xor on bool to SPIR-V

std.xor ops on bool are lowered to spv.LogicalNotEqual. For Boolean values, xor
and not-equal are the same thing.
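
The equivalence can be checked exhaustively, since there are only four input combinations. This is a plain truth-table check in Python, not MLIR or SPIR-V code.

```python
from itertools import product

# Each row is (a, b, a xor b, a != b) for Boolean inputs.
table = [(a, b, a ^ b, a != b) for a, b in product([False, True], repeat=2)]

# xor and not-equal agree on every row of the truth table.
xor_matches_ne = all(x == ne for _, _, x, ne in table)
```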

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D100817

3 years ago[gn build] reformat all gn files
Nico Weber [Tue, 20 Apr 2021 14:33:35 +0000 (10:33 -0400)]
[gn build] reformat all gn files

$ git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format

(and manually wrap two comments)

3 years ago[AArch64][SVE] Lower MULHU/MULHS nodes to umulh/smulh instructions
Bradley Smith [Tue, 13 Apr 2021 14:19:59 +0000 (15:19 +0100)]
[AArch64][SVE] Lower MULHU/MULHS nodes to umulh/smulh instructions

Mark MULHS/MULHU nodes as legal for both scalable and fixed SVE types,
and lower them to the appropriate SVE instructions.

Additionally now that the MULH nodes are legal, integer divides can be
expanded into a more performant code sequence.
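
To illustrate the kind of expansion that a multiply-high operation enables, the sketch below divides an unsigned 32-bit value by 3 using a multiply-high and a shift. The divisor, the magic constant and the helper names are chosen for this example only; the sequence the backend actually emits depends on the divisor and type.

```python
MASK32 = (1 << 32) - 1


def mulhu32(a, b):
    """High 32 bits of the 64-bit product of two unsigned 32-bit values."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def udiv3(x):
    """Unsigned 32-bit divide by 3 as multiply-high plus shift.

    0xAAAAAAAB is ceil(2**33 / 3); the constant and the shift of 1 are
    specific to the divisor 3.
    """
    return mulhu32(x, 0xAAAAAAAB) >> 1


samples = [0, 1, 2, 3, 7, 1000, 0x80000000, 0xFFFFFFFF]
all_match = all(udiv3(x) == x // 3 for x in samples)
```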

Differential Revision: https://reviews.llvm.org/D100487

3 years agoRevert "[SLP] Add detection of shuffled/perfect matching of tree entries."
Alexey Bataev [Tue, 20 Apr 2021 14:15:25 +0000 (07:15 -0700)]
Revert "[SLP] Add detection of shuffled/perfect matching of tree entries."

This reverts commit b232771acad6225574a2eaf9f860a0fed7ef0804 to fix
buildbots.

3 years ago[ARM] Create VMOVRRD from adjacent vector extracts
David Green [Tue, 20 Apr 2021 14:15:43 +0000 (15:15 +0100)]
[ARM] Create VMOVRRD from adjacent vector extracts

This adds a combine for extract(x, n); extract(x, n+1)  ->
VMOVRRD(extract x, n/2). This allows two vector lanes to be moved at the
same time in a single instruction and, thanks to the other VMOVRRD folds
we have added recently, can help reduce the number of executed
instructions. Floating point types are very similar, but will include a
bitcast to an integer type.
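
The idea behind the combine can be modelled on raw bytes: two adjacent 32-bit lanes starting at an even index are the two halves of one 64-bit lane. The sketch below assumes little-endian lane order and an even `n`, and the helper names are hypothetical.

```python
import struct


def extract_i32(vec_bytes, n):
    """Extract 32-bit lane n from a little-endian byte vector."""
    return struct.unpack_from("<I", vec_bytes, 4 * n)[0]


def extract_pair_via_i64(vec_bytes, n):
    """Model of the combined form: read 64-bit lane n // 2, then split it.

    Assumes n is even and little-endian lane order (assumptions of this sketch).
    """
    wide = struct.unpack_from("<Q", vec_bytes, 8 * (n // 2))[0]
    return wide & 0xFFFFFFFF, wide >> 32


# A 128-bit vector holding four 32-bit lanes.
vec = struct.pack("<4I", 0x11111111, 0x22222222, 0x33333333, 0x44444444)
```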

This also adds a shouldRewriteCopySrc, to prevent copy propagation from
DPR to SPR, which can break as not all DPR regs can be extracted from
directly.  Otherwise the machine verifier is unhappy.

Differential Revision: https://reviews.llvm.org/D100244

3 years ago[flang][driver] Refactor methods for parsing options (nfc)
Andrzej Warzynski [Wed, 14 Apr 2021 11:42:11 +0000 (11:42 +0000)]
[flang][driver] Refactor methods for parsing options (nfc)

This is just a small update that makes sure that errors arising from
parsing command-line options are captured more visibly. Also, all
parsing methods will now consistently return either a bool ("may fail")
or void ("never fails").

An instance of `InputKind` coming from `-x` is added to
`FrontendOptions` rather than being returned from `ParseFrontendArgs`.
It's currently not used, but we will require it shortly. In particular,
once code-generation is available we will use it to differentiate
between LLVM IR and Fortran input. `FrontendOptions` is a very suitable
place to keep it.

These changes don't affect the error reporting in the driver. In this
respect these are non-functional changes. However, they will simplify
things in the forthcoming patches in which we may need a better error
tracking/recovery mechanism.

Differential Revision: https://reviews.llvm.org/D100556

3 years ago[SLP] Add detection of shuffled/perfect matching of tree entries.
Alexey Bataev [Tue, 20 Apr 2021 12:47:55 +0000 (05:47 -0700)]
[SLP] Add detection of shuffled/perfect matching of tree entries.

SLP supports perfect diamond matching for the vectorized tree entries
but does not support it for gathered entries and does not support
non-perfect (shuffled) matching with 1 or 2 tree entries. Patch adds
support for this matching to improve cost of the vectorized tree.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D100495

3 years ago[AArch64][AsmParser] NFC: Remove unused ExtendOp struct
Cullen Rhodes [Tue, 20 Apr 2021 12:22:37 +0000 (12:22 +0000)]
[AArch64][AsmParser] NFC: Remove unused ExtendOp struct

Left over from 2625a993f926 when extend and shift were merged.

3 years agoFix PR46880: Fail CHECK-NOT with undefined variable
Thomas Preud'homme [Fri, 28 Aug 2020 10:30:01 +0000 (11:30 +0100)]
Fix PR46880: Fail CHECK-NOT with undefined variable

Currently a CHECK-NOT directive succeeds whenever the corresponding
match fails. However match can fail due to an error rather than a lack
of match, for instance if a variable is undefined. This commit makes a match
error a failure for CHECK-NOT.
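
The distinction between "no match" and "match error" can be sketched with a toy matcher. This is not FileCheck's implementation: the `[[NAME]]` substitution below is a simplified stand-in, and `MatchError`, `match` and `check_not` are hypothetical names.

```python
import re


class MatchError(Exception):
    """Raised when a pattern cannot be evaluated (e.g. undefined variable)."""


def match(pattern, text, variables):
    """Toy matcher: substitutes [[NAME]] references, then searches text."""

    def substitute(m):
        name = m.group(1)
        if name not in variables:
            raise MatchError(f"undefined variable: {name}")
        return re.escape(variables[name])

    regex = re.sub(r"\[\[(\w+)\]\]", substitute, pattern)
    return re.search(regex, text) is not None


def check_not(pattern, text, variables):
    """Pass only when the pattern was evaluated and is genuinely absent."""
    try:
        return not match(pattern, text, variables)
    except MatchError:
        return False  # an error is a failure, not a successful non-match
```

With this shape, a negative check that references an undefined variable fails instead of passing silently.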

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D86222

3 years ago[AMDGPU] Add TransVALU to gfx10
Sebastian Neubauer [Thu, 8 Apr 2021 15:22:32 +0000 (17:22 +0200)]
[AMDGPU] Add TransVALU to gfx10

Instructions on the transcendental unit are executed in parallel to the
normal VALU, so add this as an extra resource.

This doesn't seem to have any effect, but it should be more correct.

Differential Revision: https://reviews.llvm.org/D100123

3 years ago[RISCV][NFC] Add tests for scalable-vector DAGCombiner improvements
Fraser Cormack [Tue, 20 Apr 2021 12:53:10 +0000 (13:53 +0100)]
[RISCV][NFC] Add tests for scalable-vector DAGCombiner improvements

These will all be improved by future patches.

3 years ago[AMDGPU] Use if instead of foreach in a few places. NFC.
Jay Foad [Tue, 20 Apr 2021 13:19:51 +0000 (14:19 +0100)]
[AMDGPU] Use if instead of foreach in a few places. NFC.

3 years ago[flang][nfc] Port 2 tests to use the new driver when enabled
Andrzej Warzynski [Fri, 16 Apr 2021 15:23:47 +0000 (15:23 +0000)]
[flang][nfc] Port 2 tests to use the new driver when enabled

This is similar to https://reviews.llvm.org/D100309, i.e. `%f18` is
replaced with `%flang_new`.

resolve105.f90 wasn't in tree when D100309 was worked on, so it's
updated here instead.

label14.f90 requires `-fsyntax-only`. I didn't notice that when
submitting D100309, hence updating it now instead. `-fsyntax-only` is
required to prevent `%f18` from calling an external compiler (which then
fails and returns a non-zero exit code).

Differential Revision: https://reviews.llvm.org/D100655

3 years ago[libc++][ci] Re-split the CI pipeline to try and reduce load on more builders
Louis Dionne [Tue, 20 Apr 2021 12:35:39 +0000 (08:35 -0400)]
[libc++][ci] Re-split the CI pipeline to try and reduce load on more builders

3 years ago[MCA][LSUnit] Fix a potential use after free in the logic that updates memory groups.
Andrea Di Biagio [Tue, 20 Apr 2021 11:57:20 +0000 (12:57 +0100)]
[MCA][LSUnit] Fix a potential use after free in the logic that updates memory groups.

Make sure that the `CriticalMemoryInstruction` of a memory group is invalidated
if it references an already executed instruction.  This avoids a potential
use-after-free if the critical memory info becomes stale, and the value is
read after the instruction has executed.

3 years ago[PowerPC] Canonicalize shuffles on big endian targets as well
Nemanja Ivanovic [Tue, 20 Apr 2021 11:25:18 +0000 (06:25 -0500)]
[PowerPC] Canonicalize shuffles on big endian targets as well

Extend shuffle canonicalization and conversion of shuffles fed by vectorized
scalars to big endian subtargets. For big endian subtargets, loads and direct
moves of scalars into vector registers put the data in the correct element for
SCALAR_TO_VECTOR if the data type is 8 bytes wide. However, if the data type is
narrower, the value still ends up in the wrong place - although a different
wrong place than on little endian targets.

This patch extends the combine that keeps values where they are if they feed a
shuffle to big endian targets.

Differential revision: https://reviews.llvm.org/D100478

3 years ago[llvm-objdump] Add an llvm-otool tool
Nico Weber [Thu, 15 Apr 2021 14:55:22 +0000 (10:55 -0400)]
[llvm-objdump] Add an llvm-otool tool

This implements an LLVM tool that's flag- and output-compatible
with macOS's `otool` -- except for bugs, but from testing with both
`otool` and `xcrun otool-classic`, llvm-otool matches vanilla
otool's behavior very well already. It's not 100% perfect, but
it's a very solid start.

This uses the same approach as llvm-objcopy: llvm-objdump uses
a different OptTable when it's invoked as llvm-otool. This
is possible thanks to D100433.

Differential Revision: https://reviews.llvm.org/D100583

3 years ago[ValueTypes] Fix sizes of v256i32 and v256f32 (8182 -> 8192)
Cullen Rhodes [Tue, 20 Apr 2021 11:31:43 +0000 (11:31 +0000)]
[ValueTypes] Fix sizes of v256i32 and v256f32 (8182 -> 8192)

3 years ago[AMDGPU] Use simpler alternatives to !foldl. NFC.
Jay Foad [Tue, 20 Apr 2021 11:37:16 +0000 (12:37 +0100)]
[AMDGPU] Use simpler alternatives to !foldl. NFC.

3 years ago[mlir][linalg] lower index operations during linalg to vector lowering.
Tobias Gysi [Tue, 20 Apr 2021 11:26:44 +0000 (11:26 +0000)]
[mlir][linalg] lower index operations during linalg to vector lowering.

The patch extends the vectorization pass to lower linalg index operations to vector code. It allocates constant 1d vectors that enumerate the indexes along the iteration dimensions and broadcasts/transposes these 1d vectors to the iteration space.
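
A minimal model of the described scheme for a 2-D iteration space, using nested Python lists in place of vectors. The function name is hypothetical and the sketch only mirrors the enumerate, broadcast and transpose steps, not the MLIR operations the pass emits.

```python
def index_vectors_2d(rows, cols):
    """Build per-dimension index values over a rows x cols iteration space."""
    row_ramp = list(range(rows))  # constant 1-D vector for dimension 0
    col_ramp = list(range(cols))  # constant 1-D vector for dimension 1

    # Dimension 1: broadcast the ramp across every row.
    col_index = [list(col_ramp) for _ in range(rows)]

    # Dimension 0: broadcast across columns (cols x rows), then transpose
    # so the result has the shape of the iteration space (rows x cols).
    broadcast = [list(row_ramp) for _ in range(cols)]
    row_index = [list(r) for r in zip(*broadcast)]
    return row_index, col_index


row_index, col_index = index_vectors_2d(2, 3)
```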

Differential Revision: https://reviews.llvm.org/D100373

3 years ago[DAG] SelectionDAG.cpp - breakup if-else chains where each block returns. NFCI.
Simon Pilgrim [Tue, 20 Apr 2021 10:59:23 +0000 (11:59 +0100)]
[DAG] SelectionDAG.cpp - breakup if-else chains where each block returns. NFCI.

Match style guide that requests that if+return blocks are separate.

3 years agoFix Wdocumentation warning by consistently using '///' comment blocks. NFCI.
Simon Pilgrim [Tue, 20 Apr 2021 10:41:04 +0000 (11:41 +0100)]
Fix Wdocumentation warning by consistently using '///' comment blocks. NFCI.

3 years ago[mlir] test gather/scatter index vector of type index.
Tobias Gysi [Tue, 20 Apr 2021 09:49:06 +0000 (09:49 +0000)]
[mlir] test gather/scatter index vector of type index.

Test the vector to llvm lowering of index vectors with index element type.

Differential Revision: https://reviews.llvm.org/D100827

3 years ago[lit, test] Fix test cancellation feature detection
Thomas Preud'homme [Thu, 1 Apr 2021 14:02:36 +0000 (15:02 +0100)]
[lit, test] Fix test cancellation feature detection

A lit feature guards tests for the lit timeout functionality because on
most systems it depends on the availability of the psutil Python module.
However, that feature is defined based on the ability of the testing lit
to cancel tests, which does not necessarily apply to the ability of the
tested lit.

In particular, RUN commands have a cleared PYTHONPATH and user site
packages are disabled. In the case where psutil is found by the testing
lit from one of those two sources of Python paths, the tested lit would
not be able to find it, causing timeout tests to fail.

This commit fixes the issue by testing the ability to cancel tests in
the RUN command environment.
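
The general idea of probing in the child environment rather than the parent one can be sketched as follows. This is not the code from the patch: the function name is hypothetical, and it models the RUN environment simply as "PYTHONPATH removed and user site-packages disabled via `-s`".

```python
import os
import subprocess
import sys


def module_importable_in_run_env(module):
    """Check whether `module` imports in a RUN-like child environment:
    PYTHONPATH cleared and user site-packages disabled (-s)."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)
    result = subprocess.run(
        [sys.executable, "-s", "-c", f"import {module}"],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A feature would then be enabled only if `module_importable_in_run_env("psutil")` returns true, which reflects what the tested process can actually see.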

Reviewed By: yln

Differential Revision: https://reviews.llvm.org/D99728

3 years agoclang-format: [JS] do not merge imports and exports.
Martin Probst [Mon, 19 Apr 2021 11:31:06 +0000 (13:31 +0200)]
clang-format: [JS] do not merge imports and exports.

Previously, clang-format would erroneously merge import and export
statements. These need to be kept separate, as the semantics differ.

Differential Revision: https://reviews.llvm.org/D100752

3 years ago[C++, test] Fix typo in NSS* vars
Thomas Preud'homme [Sat, 3 Apr 2021 10:49:50 +0000 (11:49 +0100)]
[C++, test] Fix typo in NSS* vars

The NSS FileCheck variables at the end of the
CodeGenCXX/split-stacks.cpp clang testcase are off by 1, resulting in
the use of an undefined variable (NSS3). One of the CHECK-NOT is also
redundant because _Z8tnosplitIiEiv uses the same attribute as _Z3foov
without split stack. This commit fixes that.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D99839

3 years ago[AMDGPU] Re-arrange ds_read/ds_write ISel pattern for better readability.
hsmahesha [Tue, 20 Apr 2021 10:47:05 +0000 (16:17 +0530)]
[AMDGPU] Re-arrange ds_read/ds_write ISel pattern for better readability.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D100773

3 years ago[MemoryBuiltins] Added support for memalign
Dávid Bolvanský [Tue, 20 Apr 2021 10:38:55 +0000 (12:38 +0200)]
[MemoryBuiltins] Added support for memalign

memalign is an older equivalent of aligned_alloc.

3 years ago[Support] APInt.h - remove <algorithm> include. NFCI.
Simon Pilgrim [Tue, 20 Apr 2021 10:21:26 +0000 (11:21 +0100)]
[Support] APInt.h - remove <algorithm> include. NFCI.

Replace std::min use which should allow us to avoid including the <algorithm> header in every include of APInt.h.

3 years ago[CodeGen] CodeGenPassBuilder.h - remove unnecessary <string> include. NFCI.
Simon Pilgrim [Tue, 20 Apr 2021 09:43:45 +0000 (10:43 +0100)]
[CodeGen] CodeGenPassBuilder.h - remove unnecessary <string> include. NFCI.

We only use StringRef so include that.

3 years ago[RISCV] Refactor an optimization of addition with immediate
Ben Shi [Tue, 20 Apr 2021 10:04:25 +0000 (18:04 +0800)]
[RISCV] Refactor an optimization of addition with immediate

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100769

3 years ago[AArch64] Constant fold sve_convert_from_svbool(zero) to zero
Joe Ellis [Thu, 15 Apr 2021 08:12:11 +0000 (08:12 +0000)]
[AArch64] Constant fold sve_convert_from_svbool(zero) to zero

Co-authored-by: Paul Walker <paul.walker@arm.com>
Differential Revision: https://reviews.llvm.org/D100463

3 years ago[AArch64][SVE][InstCombine] Replace last{a,b} intrinsics with extracts...
Joe Ellis [Fri, 16 Apr 2021 10:05:05 +0000 (10:05 +0000)]
[AArch64][SVE][InstCombine] Replace last{a,b} intrinsics with extracts...

when the predicate used by last{a,b} specifies a known vector length.

For example:
  aarch64_sve_lasta(VL1, D) -> extractelement(D, #1)
  aarch64_sve_lastb(VL1, D) -> extractelement(D, #0)
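
The mapping can be checked against a small model of the two intrinsics. The sketch below follows the usual description of LASTA and LASTB (element after the last active one, and the last active element, respectively). The helper names are hypothetical, and the predicate helper assumes the vector has at least `n` lanes.

```python
def lastb(pred, data):
    """Last active element; highest-numbered element if none are active."""
    active = [i for i, p in enumerate(pred) if p]
    return data[active[-1]] if active else data[-1]


def lasta(pred, data):
    """Element after the last active one (wrapping); element 0 if none."""
    active = [i for i, p in enumerate(pred) if p]
    return data[(active[-1] + 1) % len(data)] if active else data[0]


def ptrue_vl(n, lanes):
    """Predicate with the first n lanes active (assumes lanes >= n)."""
    return [i < n for i in range(lanes)]


data = [10, 20, 30, 40]
vl1 = ptrue_vl(1, len(data))
vl2 = ptrue_vl(2, len(data))
```

With a VL1 predicate, `lasta` selects `data[1]` and `lastb` selects `data[0]`, matching the two rewrites listed above.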

Co-authored-by: Paul Walker <paul.walker@arm.com>
Differential Revision: https://reviews.llvm.org/D100476

3 years ago[libcxx][test] Split off debug mode tests
Kristina Bessonova [Tue, 20 Apr 2021 09:58:41 +0000 (11:58 +0200)]
[libcxx][test] Split off debug mode tests

This continues the work started by @ldionne in 2908eb20ba7.

The debug mode tests from

- libcxx/containers/sequences/vector/
- libcxx/strings/basic.string/string.access/
- libcxx/strings/basic.string/string.iterators/

similarly contain two tests in every file, so the second test never
runs. The patch splits the tests into separate files.

Reviewed By: Quuxplusone, ldionne

Differential Revision: https://reviews.llvm.org/D100592

3 years ago[ARM] Regenerate a couple of tests. NFC
David Green [Tue, 20 Apr 2021 09:54:41 +0000 (10:54 +0100)]
[ARM] Regenerate a couple of tests. NFC

3 years ago[mlir] Progressively lower vector to SCF
Matthias Springer [Tue, 20 Apr 2021 08:17:56 +0000 (17:17 +0900)]
[mlir] Progressively lower vector to SCF

Add a new ProgressiveVectorToSCF pass that lowers vector transfer ops to SCF by gradually unpacking one dimension at a time. Unpacking stops at 1D, but can be configured to stop earlier, should the HW support (N>1)-d vectors.

The current implementation cannot handle permutation maps, masks, tensor types and unrolling yet. These will be added in subsequent commits. Once features are on par with VectorToSCF, this implementation will replace VectorToSCF.

Differential Revision: https://reviews.llvm.org/D100622

3 years ago[mlir] Add patterns to lower Math operations to LLVM based libm calls.
Tres Popp [Tue, 13 Apr 2021 08:18:34 +0000 (10:18 +0200)]
[mlir] Add patterns to lower Math operations to LLVM based libm calls.

Some Math operations do not have an equivalent in LLVM. In these cases,
allow a low priority fallback of calling the libm functions. This is to
give functionality and is not a performant option.

Differential Revision: https://reviews.llvm.org/D100367

3 years ago[Support] BinaryStreamReader.h - remove unnecessary <string> include. NFCI.
Simon Pilgrim [Mon, 19 Apr 2021 16:56:42 +0000 (17:56 +0100)]
[Support] BinaryStreamReader.h - remove unnecessary <string> include. NFCI.

We only use StringRef so include that.

3 years agoRe-land [GreedyRA ORE] Add Cost of spill locations into remark
Serguei Katkov [Tue, 20 Apr 2021 05:59:44 +0000 (12:59 +0700)]
Re-land [GreedyRA ORE] Add Cost of spill locations into remark

Re-land the patch with a fix of clang test.

The cost of a spill location is computed based on the relative branch
frequency where the corresponding spill/reload/copy is located.

While the number itself depends heavily on the incoming IR,
the total cost can be used when making changes in RA.

Revert "Revert "[GreedyRA ORE] Add Cost of spill locations into remark""
This reverts commit 680f3d6de79f7dd75ee0cda256a541d18e504a22.

3 years ago[RISCV] Fix missing emergency slots for scalable stack offsets
Fraser Cormack [Thu, 15 Apr 2021 16:02:20 +0000 (17:02 +0100)]
[RISCV] Fix missing emergency slots for scalable stack offsets

This patch adds an additional emergency spill slot to RVV code. This is
required as RVV stack offsets may require an additional register to compute.

This patch includes an optimization by @HsiangKai <kai.wang@sifive.com>
to reduce the number of registers required for the computation of stack
offsets from 3 to 2. Otherwise we'd need two additional emergency spill
slots.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D100574

3 years ago[LV] Let selectVectorizationFactor reason directly on VectorizationFactor.
Sander de Smalen [Mon, 19 Apr 2021 09:59:30 +0000 (10:59 +0100)]
[LV] Let selectVectorizationFactor reason directly on VectorizationFactor.

Rather than maintaining two separate values, a `float` for the per-lane
cost and a Width for the VF, maintain a single VectorizationFactor which
comprises the two and also removes the need for converting an integer value
to float.

This simplifies the query when asking if one VF is more profitable than
another when we want to extend this for scalable vectors (which may
require additional options to determine if e.g. a scalable VF of the
same cost is more profitable than a fixed VF of the same cost).

The patch isn't entirely NFC because it also fixes an issue in
selectEpilogueVectorizationFactor, where the cost passed to ProfitableVFs
no longer truncates the floating-point cost from `float` to `unsigned` to
then perform the calculation on the truncated cost. It now does
a cost comparison with the correct precision.
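
The effect of truncating before comparing can be shown with two made-up per-lane costs. The function names and numbers below are illustrative only and are not taken from the vectorizer.

```python
def more_profitable_truncated(cost_a, cost_b):
    """Compare after truncating to an integer (the fraction is lost)."""
    return int(cost_a) < int(cost_b)


def more_profitable_exact(cost_a, cost_b):
    """Compare the floating-point costs directly."""
    return cost_a < cost_b


# Hypothetical per-lane costs: 9 / 4 lanes versus 23 / 8 lanes.
cost_vf4 = 9 / 4   # 2.25
cost_vf8 = 23 / 8  # 2.875
```

Both costs truncate to 2, so the truncated comparison cannot tell them apart, while the exact comparison sees that the first is cheaper.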

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D100121

3 years ago[PowerPC] Use mtvsrdd to put callee-saved GPR into VSR
Qiu Chaofan [Tue, 20 Apr 2021 08:32:24 +0000 (16:32 +0800)]
[PowerPC] Use mtvsrdd to put callee-saved GPR into VSR

This patch exploits the mtvsrdd instruction (available in ISA3.0+) to save
two callee-saved GPR registers into a single VSR, making it more
efficient.

Reviewed By: jsji, nemanjai

Differential Revision: https://reviews.llvm.org/D62565

3 years ago[DAGCombiner] Support fold zero scalar vector.
Jun Ma [Tue, 20 Apr 2021 03:37:06 +0000 (11:37 +0800)]
[DAGCombiner] Support fold zero scalar vector.

This patch changes ISD::isBuildVectorAllZeros to
ISD::isConstantSplatVectorAllZeros which handles zero scalar vectors.

TestPlan: check-llvm

Differential Revision: https://reviews.llvm.org/D100813

3 years ago[AMDGPU] GCNDPPCombine: don't shrink V_ADD_CO_U32 if carry out is used
Jay Foad [Mon, 19 Apr 2021 13:48:20 +0000 (14:48 +0100)]
[AMDGPU] GCNDPPCombine: don't shrink V_ADD_CO_U32 if carry out is used

Don't shrink VOP3 instructions if there are any uses of a carry-out
operand, because the shrunken form of the instruction would write the
carry-out to vcc instead of to a virtual register.

Differential Revision: https://reviews.llvm.org/D100760

3 years ago[X86][AMX] Verify illegal types or instructions for x86_amx.
Luo, Yuanke [Tue, 20 Apr 2021 07:52:29 +0000 (15:52 +0800)]
[X86][AMX] Verify illegal types or instructions for x86_amx.

This patch is related to https://reviews.llvm.org/D100032 which defines
some illegal types or operations for x86_amx. There are no arguments,
arrays, pointers, vectors or constants of x86_amx.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D100472

3 years agoExplicitly pass type to cast load constant folding result
Arthur Eubanks [Sun, 18 Apr 2021 08:35:52 +0000 (01:35 -0700)]
Explicitly pass type to cast load constant folding result

Previously we would use the type of the pointee to determine what to
cast the result of constant folding a load. To aid with opaque pointer
types, we should explicitly pass the type of the load rather than
looking at pointee types.

ConstantFoldLoadThroughBitcast() converts the const prop'd value to the
proper load type (e.g. [1 x i32] -> i32). Instead of calling this in
every intermediate step like bitcasts, we only call this when we
actually see the global initializer value.

In some existing uses of this API, we don't know the exact type we're
loading from immediately (e.g. first we visit a bitcast, then we visit
the load using the bitcast). In those cases we have to manually call
ConstantFoldLoadThroughBitcast() when simplifying the load to make sure
that we cast to the proper type.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D100718

3 years ago[PowerPC] Support f128 under VSX
Qiu Chaofan [Tue, 20 Apr 2021 07:47:54 +0000 (15:47 +0800)]
[PowerPC] Support f128 under VSX

This patch is the last one in backend to support fp128 type in
pre-POWER9 subtargets with VSX, removing temporary option and updating
remaining tests.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D92374

3 years ago[SelectionDAG] Relax constraints on STEP_VECTOR step operand
Fraser Cormack [Wed, 14 Apr 2021 08:03:27 +0000 (09:03 +0100)]
[SelectionDAG] Relax constraints on STEP_VECTOR step operand

This patch relaxes the requirement that the STEP_VECTOR step constant
must be of a type at least as large as the vector element type. That
requirement does not permit its use on targets which have legal vector
element types larger than the largest legal scalar type, such as i64
vectors on RV32.

As such, the requirement has been loosened so that the step operand may
be any scalar type so long as the constant immediate is non-negative and
the value fits inside the vector element type.

This limits combining optimizations in certain circumstances but in
practice it's unlikely to be a hindrance.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D100660

3 years ago[CSKY 6/n] Add support branch and symbol series instruction
Zi Xuan Wu [Tue, 20 Apr 2021 06:42:26 +0000 (14:42 +0800)]
[CSKY 6/n] Add support branch and symbol series instruction

This patch adds basic CSKY branch instructions and symbol address series instructions.
These two kinds of instructions are related to each other, and supporting them involves a lot of work on fixups.

For now, basic instructions are enabled except for disassembler support.
We will first support generating basic codegen asm and defer the disassembler work until later.

Differential Revision: https://reviews.llvm.org/D95029

3 years ago[CSKY 5/n] Add support for all CSKY basic integer instructions except for branch...
Zi Xuan Wu [Tue, 20 Apr 2021 06:40:46 +0000 (14:40 +0800)]
[CSKY 5/n] Add support for all CSKY basic integer instructions except for branch series

This patch adds basic CSKY integer instructions except for branch series such as bsr, br.
It mainly includes basic ALU, load & store, compare and data move instructions.

Branch series instructions need to handle complex symbol operands and will be added in a following patch.

Differential Revision: https://reviews.llvm.org/D94007

3 years ago[CSKY 4/n] Add basic CSKYAsmParser and CSKYInstPrinter
Zi Xuan Wu [Tue, 20 Apr 2021 06:06:36 +0000 (14:06 +0800)]
[CSKY 4/n] Add basic CSKYAsmParser and CSKYInstPrinter

This basic parser will handle basic instructions with register or immediate operands.
With the addition of CSKYInstPrinter, we can now make use of lit tests.

Differential Revision: https://reviews.llvm.org/D93798

3 years ago[NFC] Restructure code to make it possible to insert other GCs
Max Kazantsev [Tue, 20 Apr 2021 06:59:03 +0000 (13:59 +0700)]
[NFC] Restructure code to make it possible to insert other GCs

3 years ago[MLIR][LinAlg] Detensoring CF cost-model: look forward.
KareemErgawy-TomTom [Tue, 20 Apr 2021 07:01:43 +0000 (09:01 +0200)]
[MLIR][LinAlg] Detensoring CF cost-model: look forward.

This patch extends the control-flow cost-model for detensoring by
implementing a forward-looking pass on block arguments that should be
detensored. This makes sure that if a (to-be-detensored) block argument
"escapes" its block through the terminator, then the successor arguments
are also detensored.

Reviewed By: silvas

Differential Revision: https://reviews.llvm.org/D100457

3 years ago[X86][AMX] Add description of x86_amx to LangRef.
Luo, Yuanke [Wed, 7 Apr 2021 12:57:21 +0000 (20:57 +0800)]
[X86][AMX] Add description of x86_amx to LangRef.

Differential Revision: https://reviews.llvm.org/D100032

3 years ago[Test] Add -lcssa run to force LI in GVN
Max Kazantsev [Tue, 20 Apr 2021 06:07:33 +0000 (13:07 +0700)]
[Test] Add -lcssa run to force LI in GVN

3 years ago[llvm-rc] Fix handling of the /X option to match its documentation and rc.exe
Martin Storsjö [Fri, 16 Apr 2021 10:30:47 +0000 (13:30 +0300)]
[llvm-rc] Fix handling of the /X option to match its documentation and rc.exe

This matches how it's documented in the option listing.

Differential Revision: https://reviews.llvm.org/D100754

3 years ago[llvm-rc] Simplify Opts.td to avoid repetition. NFC.
Martin Storsjö [Wed, 14 Apr 2021 13:23:50 +0000 (16:23 +0300)]
[llvm-rc] Simplify Opts.td to avoid repetition. NFC.

Differential Revision: https://reviews.llvm.org/D100753

3 years ago[mlir][linalg] update fusion on tensors to support linalg index operations.
Tobias Gysi [Tue, 20 Apr 2021 05:28:26 +0000 (05:28 +0000)]
[mlir][linalg] update fusion on tensors to support linalg index operations.

The patch replaces the index operations in the body of fused producers and linearizes the indices after expansion.

Differential Revision: https://reviews.llvm.org/D100479

3 years ago[RISCV] Handle PseudoVRELOAD and PseudoVSPILL in getInstSizeInBytes.
Zakk Chen [Fri, 9 Apr 2021 13:48:29 +0000 (06:48 -0700)]
[RISCV] Handle PseudoVRELOAD and PseudoVSPILL in getInstSizeInBytes.

It's necessary to calculate the correct instruction size because
PseudoVRELOAD and PseudoVSPILL will be expanded into multiple
instructions.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D100702

3 years ago[mlir][linalg] update drop unit dims to support linalg index operations.
Tobias Gysi [Tue, 20 Apr 2021 04:50:43 +0000 (04:50 +0000)]
[mlir][linalg] update drop unit dims to support linalg index operations.

Update the dimensions of the index operations to account for dropped dimensions and replace the index operations of dropped dimensions by zero.
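
A minimal sketch of the remapping described above, assuming an index
operation can be modelled by the integer dimension it reads. The function name
and the tuple-based result are hypothetical and exist only for illustration.

```python
def remap_index_op(dim, dropped):
    """Rewrite an index operation on `dim` after removing the `dropped` dims.

    Returns ("const", 0) when the dimension itself was dropped, otherwise
    ("index", new_dim) with the dimension renumbered.
    """
    if dim in dropped:
        # A unit dimension has exactly one iteration, so its index is 0.
        return ("const", 0)
    # Shift down by the number of dropped dimensions that preceded `dim`.
    shift = sum(1 for d in dropped if d < dim)
    return ("index", dim - shift)
```

For example, with dimension 1 dropped from a four-dimensional op, an index
operation on dimension 1 becomes the constant zero, while the ones on
dimensions 2 and 3 are renumbered to 1 and 2.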

Differential Revision: https://reviews.llvm.org/D100395

3 years ago[Test] Add loop load PRE test with GC pointers
Max Kazantsev [Tue, 20 Apr 2021 04:30:17 +0000 (11:30 +0700)]
[Test] Add loop load PRE test with GC pointers

3 years ago[RISCV][test] Add a new test of addition
Ben Shi [Tue, 20 Apr 2021 04:11:56 +0000 (12:11 +0800)]
[RISCV][test] Add a new test of addition

Reviewed by: craig.topper

Differential Revision: https://reviews.llvm.org/D100767

3 years agoRevert "[GreedyRA ORE] Add Cost of spill locations into remark"
Serguei Katkov [Tue, 20 Apr 2021 04:08:24 +0000 (11:08 +0700)]
Revert "[GreedyRA ORE] Add Cost of spill locations into remark"

This reverts commit 328377307ad2da961b3be0f2bbf1814a6f1f4ed3.

This commit causes buildbot failures because some clang tests were not updated.
Temporarily revert it in order to fix the clang tests.

3 years ago[Docs] Mention LLVM_EXPERIMENTAL_TARGETS_TO_BUILD variable in CMake.rst
xgupta [Tue, 20 Apr 2021 03:57:57 +0000 (09:27 +0530)]
[Docs] Mention LLVM_EXPERIMENTAL_TARGETS_TO_BUILD variable in CMake.rst

Beginners who want to try a new experimental target might not be aware of this variable.

Although this variable is mentioned in the Writing a Backend documentation, it is easier to find when listed in CMake.rst, where most variables are documented.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D100729

3 years ago[GreedyRA ORE] Add Cost of spill locations into remark
Serguei Katkov [Fri, 9 Apr 2021 10:22:36 +0000 (17:22 +0700)]
[GreedyRA ORE] Add Cost of spill locations into remark

The cost of a spill location is computed based on the relative branch
frequency of the block where the corresponding spill/reload/copy is located.

While the number itself highly depends on the incoming IR,
the total cost can be used when making changes in RA.
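
As a rough illustration of the idea, the total could be accumulated as
follows. This is a sketch under assumptions, not the remark's actual
computation: each spill, reload, or copy is assumed to contribute the relative
frequency of the block that contains it, and the data layout is hypothetical.

```python
def total_spill_cost(events, rel_freq):
    """Sum the relative frequency of the block holding each event.

    events:   list of (kind, block) pairs, kind in {"spill", "reload", "copy"}.
    rel_freq: hypothetical map from block name to relative frequency.
    """
    return sum(rel_freq[block] for _kind, block in events)
```

Under this model a reload inside a hot loop weighs far more than a spill in
the entry block, which is what makes the total useful for comparing two
allocations of the same input.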

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100020

3 years ago[AArch64][SVE] Combine add and index_vector
Jun Ma [Thu, 8 Apr 2021 13:21:42 +0000 (21:21 +0800)]
[AArch64][SVE] Combine add and index_vector

This patch tries to combine the pattern add(index_vector(zero, step), dup(X)) into index_vector(X, step).
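
The equivalence behind this combine can be checked with a small model. The
sketch below assumes that lane i of index_vector(base, step) holds
base + i * step and represents vectors as plain Python lists; the helper names
mirror the pattern above but are not LLVM APIs.

```python
def index_vector(base, step, lanes):
    """Lane i holds base + i * step (assumed semantics)."""
    return [base + i * step for i in range(lanes)]

def dup(x, lanes):
    """Broadcast the scalar x to every lane."""
    return [x] * lanes

def add(a, b):
    """Lane-wise addition of two equally sized vectors."""
    return [p + q for p, q in zip(a, b)]

def combined_matches(x, step, lanes=4):
    """Compare the original pattern against the combined form."""
    before = add(index_vector(0, step, lanes), dup(x, lanes))
    after = index_vector(x, step, lanes)
    return before == after
```

Each lane of the original pattern is i * step + X, which is exactly lane i of
index_vector(X, step), so the add and the dup can be folded away.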

TestPlan: check-llvm

Differential Revision: https://reviews.llvm.org/D100107

3 years ago[lldb] Fix one leak in reproducer
Fangrui Song [Tue, 20 Apr 2021 02:39:10 +0000 (19:39 -0700)]
[lldb] Fix one leak in reproducer

Use a variable of static storage duration to reference an intentionally
leaked variable. A static data area is in the GC-set of various leak
checkers.

This fixes 3 `check-lldb-shell` tests in a `-DLLVM_USE_SANITIZER={Leaks,Address}` build,
e.g. `test/Shell/Reproducer/TestHomeDir.test`

Differential Revision: https://reviews.llvm.org/D100806

3 years ago[mlir][llvm] Add UnnamedAddr attribute to GlobalOp
clementval [Tue, 20 Apr 2021 01:45:01 +0000 (21:45 -0400)]
[mlir][llvm] Add UnnamedAddr attribute to GlobalOp

This patch adds the UnnamedAddr attribute for the GlobalOp in the LLVM
dialect. The attribute is also handled when translating to and from LLVM IR.

This is meant to be used in a follow-up patch to lower OpenACC/OpenMP ops to
kmp and tgt runtime calls (D100678).

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D100677

3 years ago[CSSPGO] Exclude pseudo probes from slot index
Hongtao Yu [Tue, 13 Apr 2021 06:51:44 +0000 (23:51 -0700)]
[CSSPGO] Exclude pseudo probes from slot index

Pseudo probes are currently given a slot index like other regular instructions. This affects register pressure and lifetime weight computation because pseudo probe instructions enlarge the lifetime length. As a consequence, a program could get different code generated with and without pseudo probes. I'm closing the gap by excluding pseudo probes from the slot index and from downstream register-allocation-related passes.
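
The effect can be sketched with a toy numbering scheme. Everything here is an
assumption made for illustration: instructions are strings, "PSEUDO_PROBE"
stands in for a pseudo probe, and the spacing of 16 between slots is an
arbitrary choice for this example.

```python
def assign_slot_indexes(instructions, skip_pseudo_probes=True, step=16):
    """Map instruction positions to slot indexes, optionally skipping probes."""
    slots = {}
    next_slot = 0
    for pos, inst in enumerate(instructions):
        if skip_pseudo_probes and inst == "PSEUDO_PROBE":
            continue  # No slot is assigned, so the probe adds no length.
        slots[pos] = next_slot
        next_slot += step
    return slots

def live_range_length(instructions, def_pos, use_pos, **kwargs):
    """Distance in slot indexes between a definition and its use."""
    slots = assign_slot_indexes(instructions, **kwargs)
    return slots[use_pos] - slots[def_pos]
```

With probes skipped, a definition and its use are the same distance apart
whether or not a probe sits between them, so length-based heuristics see the
same input in both builds.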

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D100334

3 years ago[CSSPGO] Flip SkipPseudoOp to true for MIR APIs.
Hongtao Yu [Mon, 19 Apr 2021 15:51:57 +0000 (08:51 -0700)]
[CSSPGO] Flip SkipPseudoOp to true for MIR APIs.

Flip the default value of SkipPseudoOp to true for those MIR APIs to favor maximum performance. Note that it is disabled in certain spots, such as branch folding and MIR if-conversion, for better counts quality. For these two optimizations, this is a no-diff change.

The counts quality with SPEC2017 before/after this change is unchanged.

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D100332

3 years agoFix android-x86 library name in asan_device_setup.
Evgenii Stepanov [Wed, 14 Apr 2021 20:11:38 +0000 (13:11 -0700)]
Fix android-x86 library name in asan_device_setup.

https://reviews.llvm.org/D26764 removed i686 variants of compiler-rt
libraries and canonicalized the i386 name.

https://reviews.llvm.org/D37278 partially reverted the previous change
to keep i686 name on Android, but did not update asan_device_setup
script.

This change fixes asan_device_setup.

Differential Revision: https://reviews.llvm.org/D100505

3 years ago[InstCombine] Enhance deduction of alignment for aligned_alloc
Dávid Bolvanský [Tue, 20 Apr 2021 00:03:38 +0000 (02:03 +0200)]
[InstCombine] Enhance deduction of alignment for aligned_alloc

This patch improves https://reviews.llvm.org/D76971 (Deduce attributes for aligned_alloc in InstCombine) and implements the "TODO" item mentioned in the review of that patch.

> The function aligned_alloc() is the same as memalign(), except for the added restriction that size should be a multiple of alignment.

Currently, we simply bail out if we see a non-constant size; this patch changes that.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D100785

3 years ago[lldb] Fix demangler leaks in the DWARF AST parser
Fangrui Song [Mon, 19 Apr 2021 23:36:54 +0000 (16:36 -0700)]
[lldb] Fix demangler leaks in the DWARF AST parser

This fixes 6 check-lldb-shell failures in a `-DLLVM_USE_SANITIZER=Leaks` build.

Differential Revision: https://reviews.llvm.org/D100800

3 years ago[libc++] [C++20] [P0586] Implement safe integral comparisons
Kamlesh Kumar [Mon, 19 Apr 2021 23:18:34 +0000 (04:48 +0530)]
[libc++] [C++20] [P0586] Implement safe integral comparisons

* https://wg21.link/P0586

Reviewed By: #libc, curdeius, Quuxplusone

Differential Revision: https://reviews.llvm.org/D94511

3 years agoAdd a cache of checked AttributeLists.
Nick Lewycky [Mon, 19 Apr 2021 03:51:43 +0000 (20:51 -0700)]
Add a cache of checked AttributeLists.

Differential Revision: https://reviews.llvm.org/D100738

3 years ago[M68k] Put M68kDesc as the direct library dependency for disassembler
Min-Yih Hsu [Mon, 19 Apr 2021 22:53:52 +0000 (15:53 -0700)]
[M68k] Put M68kDesc as the direct library dependency for disassembler

M68kDisassembler should put M68kDesc as its direct library dependency
since it uses logic related to code beads. Otherwise, the build will
fail when building LLVM libraries as shared objects (building LLVM
libraries statically does not have this problem).

3 years agoReset NextFnNum in MachineModuleInfo::initialize
Roman Tereshin [Tue, 11 Aug 2020 06:29:39 +0000 (23:29 -0700)]
Reset NextFnNum in MachineModuleInfo::initialize

In an env that reuses compiler instances for multiple compilations, this
omission results in non-deterministic assembly output (names of the
auto-generated labels) if the order or full set of Modules compiled
varies.
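
A toy model of the problem, assuming a per-instance function counter that
feeds into auto-generated label names. The class, its methods, and the label
format are hypothetical stand-ins and not the MachineModuleInfo API.

```python
class ModuleInfo:
    """Stand-in for state kept by a reused compiler instance."""

    def __init__(self, reset_on_initialize):
        self.reset_on_initialize = reset_on_initialize
        self.next_fn_num = 0

    def initialize(self):
        # The fix corresponds to resetting the counter at this point.
        if self.reset_on_initialize:
            self.next_fn_num = 0

    def compile_module(self, functions):
        """Return one auto-generated label per function in the module."""
        self.initialize()
        labels = []
        for _ in functions:
            labels.append(f".LBB{self.next_fn_num}_0")
            self.next_fn_num += 1
        return labels


def labels_for_last_module(reset_on_initialize, modules_before):
    """Compile `modules_before` first, then a fixed two-function module."""
    info = ModuleInfo(reset_on_initialize)
    for module in modules_before:
        info.compile_module(module)
    return info.compile_module(["f", "g"])
```

Without the reset, the labels of the last module depend on what was compiled
earlier in the same instance; with the reset, they are the same regardless of
history.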

Differential Revision: https://reviews.llvm.org/D100797

3 years ago[libc++][gardening] Replace instances of `\x{AD}`.
zoecarver [Mon, 19 Apr 2021 21:59:32 +0000 (14:59 -0700)]
[libc++][gardening] Replace instances of `\x{AD}`.

This is an NFC change.

Differential Revision: https://reviews.llvm.org/D100799

3 years agoRevert "[clang-scan-deps] Add support for clang-cl"
Alexandre Ganea [Mon, 19 Apr 2021 21:43:24 +0000 (17:43 -0400)]
Revert "[clang-scan-deps] Add support for clang-cl"

This reverts commit bb26fa8c286bf524ed9235c3e293ad22ecf3e984.

3 years ago[PhaseOrdering] add test to show unintended code sinking; NFC
Sanjay Patel [Mon, 19 Apr 2021 20:41:27 +0000 (16:41 -0400)]
[PhaseOrdering] add test to show unintended code sinking; NFC

See D87479 for discussion.

3 years ago[M68k] Implement Disassembler
Ricky Taylor [Thu, 11 Mar 2021 22:12:47 +0000 (22:12 +0000)]
[M68k] Implement Disassembler

This is an implementation of a disassembler for M68k.

Differential Revision: https://reviews.llvm.org/D98540

3 years ago[M68k] Change printing of absolute memory references
Ricky Taylor [Sat, 17 Apr 2021 10:03:34 +0000 (11:03 +0100)]
[M68k] Change printing of absolute memory references

This also includes PC-relative addresses since they are still
referenced as absolute addresses in assembly and converted to
relative addresses by the assembler.

This changes, for example:
- `bra #-2` -> `bra $100`
- `jsr #16` -> `jsr $10`

Differential Revision: https://reviews.llvm.org/D100697

3 years agoRevert "[SLP]Add detection of shuffled/perfect matching of tree entries."
Alexey Bataev [Mon, 19 Apr 2021 21:09:40 +0000 (14:09 -0700)]
Revert "[SLP]Add detection of shuffled/perfect matching of tree entries."

This reverts commit d6fde913790db898e72e27b51defbc7442f3418a to fix
compiler crashes.