platform/upstream/llvm.git
4 years ago[NFC] Refactor SimplifyCFG to make propagating information easier.
Tyker [Fri, 24 Apr 2020 19:38:12 +0000 (21:38 +0200)]
[NFC] Refactor SimplifyCFG to make propagating information easier.

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77742

4 years agoAdd an internal bit to the XcodeSDK class.
Adrian Prantl [Wed, 22 Apr 2020 21:22:54 +0000 (14:22 -0700)]
Add an internal bit to the XcodeSDK class.

For developing the OS itself there exists an "internal" variant of
each SDK. This patch adds support for these SDK directories to the
XcodeSDK class.

Differential Revision: https://reviews.llvm.org/D78675

4 years agoAMDGPU: Break read2/write2 search range on a memory fence
Matt Arsenault [Fri, 24 Apr 2020 14:06:00 +0000 (10:06 -0400)]
AMDGPU: Break read2/write2 search range on a memory fence

This is to fix performance regressions introduced by
86c944d790728891801778b8d98c2c65a83f36a5.

The old search would collect all potentially mergeable instructions in
the entire block. In this case, the same address is written in
multiple places in the block on the other side of a fence. When sorted
by offset, the two unmergeable, identical addresses would be next to
each other and the merge would give up.

Break the search space when we encounter an instruction we won't be
able to merge across. This will keep the identical addresses in
different merge attempts.

This may also improve compile time by reducing the merge list size.
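
The idea of breaking the search range can be sketched as follows. This is only an illustration under simplified assumptions, not the pass's actual code: `Inst`, `IsFence`, and `collectMergeLists` are hypothetical names, and instructions are reduced to an offset plus a flag marking something the merge cannot cross.

```cpp
#include <vector>

// Hypothetical, simplified instruction: an address offset plus a flag that
// marks a fence (or any other instruction we cannot merge across).
struct Inst {
  int Offset;
  bool IsFence;
};

// Split the block into separate merge lists at every fence, so that
// identical offsets on opposite sides of a fence never share a list.
std::vector<std::vector<int>> collectMergeLists(const std::vector<Inst> &Block) {
  std::vector<std::vector<int>> Lists;
  std::vector<int> Current;
  for (const Inst &I : Block) {
    if (I.IsFence) {
      // Close the current search range and start a new one after the fence.
      if (!Current.empty())
        Lists.push_back(Current);
      Current.clear();
      continue;
    }
    Current.push_back(I.Offset);
  }
  if (!Current.empty())
    Lists.push_back(Current);
  return Lists;
}
```

With two writes to offsets 0 and 4 on each side of a fence, this yields two lists of two entries each, rather than one list in which the duplicate offsets end up adjacent after sorting.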

4 years ago[lldb/Driver] Remove level of indentation (NFC)
Jonas Devlieghere [Fri, 24 Apr 2020 19:50:06 +0000 (12:50 -0700)]
[lldb/Driver] Remove level of indentation (NFC)

Use an early return for when we couldn't create a pipe to source the
commands.

4 years ago[libc++] Properly import lit.formats from the new format
Louis Dionne [Fri, 24 Apr 2020 19:44:02 +0000 (15:44 -0400)]
[libc++] Properly import lit.formats from the new format

4 years ago[gold] Simplify with StringRef::consume_front. NFC
Fangrui Song [Fri, 24 Apr 2020 17:41:28 +0000 (10:41 -0700)]
[gold] Simplify with StringRef::consume_front. NFC
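
For readers unfamiliar with the idiom: `StringRef::consume_front(Prefix)` returns true and drops the prefix when the string starts with `Prefix`, and otherwise returns false and leaves the string unchanged. The sketch below mimics that behavior with `std::string_view` so that it compiles without LLVM headers. `consumeFront`, `parseOption`, and the option names are hypothetical and are not taken from the gold plugin.

```cpp
#include <string>
#include <string_view>

// Stand-in for llvm::StringRef::consume_front, written against
// std::string_view. Returns true and strips Prefix when S starts with it.
bool consumeFront(std::string_view &S, std::string_view Prefix) {
  if (S.substr(0, Prefix.size()) != Prefix)
    return false;
  S.remove_prefix(Prefix.size());
  return true;
}

// Hypothetical option parser: a single call both tests for the prefix and
// strips it, replacing a startswith() check followed by a manual substr().
std::string parseOption(std::string_view Opt) {
  if (consumeFront(Opt, "mcpu="))
    return "cpu:" + std::string(Opt);
  if (consumeFront(Opt, "O"))
    return "opt-level:" + std::string(Opt);
  return "unknown";
}
```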

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D78819

4 years ago[mlir][DialectConversion] Add support for properly tracking replaceUsesOfBlockArgument
River Riddle [Fri, 24 Apr 2020 19:25:05 +0000 (12:25 -0700)]
[mlir][DialectConversion] Add support for properly tracking replaceUsesOfBlockArgument

The current implementation of this method performs the replacement directly, and thus doesn't support proper backtracking.

Differential Revision: https://reviews.llvm.org/D78790

4 years ago[libc++] NFC: Refactor the new format substitutions into its own method
Louis Dionne [Fri, 24 Apr 2020 15:30:10 +0000 (11:30 -0400)]
[libc++] NFC: Refactor the new format substitutions into its own method

This way, we can reuse the substitution logic in the new DSL.

4 years agoRevert "[mlir][drr] NFC: avoid SmallVector when collecting substitution values"
Lei Zhang [Fri, 24 Apr 2020 19:33:03 +0000 (15:33 -0400)]
Revert "[mlir][drr] NFC: avoid SmallVector when collecting substitution values"

This reverts commit 2f8b164ca220f8cd29d70c8359ed91e8fb8d9959, which
causes a breakage on Clang 5.

4 years ago[libc++] Get rid of pipe in command to check whether verify is supported
Louis Dionne [Fri, 24 Apr 2020 19:29:42 +0000 (15:29 -0400)]
[libc++] Get rid of pipe in command to check whether verify is supported

4 years ago[llvm-cov] Prevent llvm-cov from using too many threads
Alexandre Ganea [Fri, 24 Apr 2020 19:28:01 +0000 (15:28 -0400)]
[llvm-cov] Prevent llvm-cov from using too many threads

As reported here: https://reviews.llvm.org/D75153#1987272

Before, each instance of llvm-cov created one thread per hardware core, which was probably unnecessary because the number of inputs was small. This was probably causing a thread rlimit issue on systems with a large core count.

After this patch, the previous behavior is restored (to what was before rG8404aeb5):

If --num-threads is not specified, we create one thread per input, up to num.cores.
When specified, --num-threads indicates any number of threads, with no upper limit.
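
The policy described above could be written roughly as below. This is a sketch, not the llvm-cov source: `pickThreadCount` is a hypothetical helper, a `Requested` value of 0 stands in for "--num-threads not specified", and the lower bound of one thread is an added assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring the policy in the commit message.
// Requested == 0 models "--num-threads was not given on the command line".
std::size_t pickThreadCount(std::size_t Requested, std::size_t NumInputs,
                            std::size_t NumCores) {
  assert(NumCores > 0 && "need at least one hardware core");
  if (Requested != 0)
    return Requested; // Explicit request: honored as-is, no upper limit.
  // Default: one thread per input, capped at the core count. The floor of
  // one thread is an assumption of this sketch.
  return std::max<std::size_t>(1, std::min(NumInputs, NumCores));
}
```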

Differential Revision: https://reviews.llvm.org/D78408

4 years ago[mlir][DictionaryAttr] Add a new getWithSorted and use it when possible
River Riddle [Fri, 24 Apr 2020 19:18:18 +0000 (12:18 -0700)]
[mlir][DictionaryAttr] Add a new getWithSorted and use it when possible

The elements of a DictionaryAttr are sorted by name. In many situations, e.g. NamedAttributeList, we can guarantee that the elements are already sorted on construction and remove the need to perform extra checks. In places with lots of calls to attribute methods, this leads to a good performance improvement.
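
A minimal sketch of the trade-off, using plain standard-library containers rather than MLIR types. `NamedAttr`, `makeDict`, `makeDictWithSorted`, and `lookup` are hypothetical names chosen for illustration: the general constructor sorts its input, while the "with sorted" variant trusts the caller and only checks the invariant in assert-enabled builds.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a named attribute: (name, value).
using NamedAttr = std::pair<std::string, int>;

// General path: the input order is unknown, so sort by name first.
std::vector<NamedAttr> makeDict(std::vector<NamedAttr> Attrs) {
  std::sort(Attrs.begin(), Attrs.end());
  return Attrs;
}

// Fast path: the caller guarantees sorted input, so skip the sort and only
// verify the invariant when assertions are enabled.
std::vector<NamedAttr> makeDictWithSorted(std::vector<NamedAttr> Attrs) {
  assert(std::is_sorted(Attrs.begin(), Attrs.end()) && "input must be sorted");
  return Attrs;
}

// Sorted storage allows binary search by name.
const int *lookup(const std::vector<NamedAttr> &Dict, const std::string &Name) {
  auto It = std::lower_bound(
      Dict.begin(), Dict.end(), Name,
      [](const NamedAttr &A, const std::string &N) { return A.first < N; });
  return (It != Dict.end() && It->first == Name) ? &It->second : nullptr;
}
```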

Differential Revision: https://reviews.llvm.org/D78781

4 years ago[Fuchsia] Build compiler-rt builtins for 32-bit x86
Petr Hosek [Wed, 22 Apr 2020 23:41:03 +0000 (16:41 -0700)]
[Fuchsia] Build compiler-rt builtins for 32-bit x86

While we don't support 32-bit architectures in Fuchsia, these are needed
in the early boot phase on x86, so we build just these to satisfy that
use case.

Differential Revision: https://reviews.llvm.org/D78687

4 years agoFix `-Wparentheses` warnings. NFC.
Michael Liao [Fri, 24 Apr 2020 19:03:48 +0000 (15:03 -0400)]
Fix `-Wparentheses` warnings. NFC.

4 years agoRevert "[CUDA][HIP] Fix host/device based overload resolution"
Yaxun (Sam) Liu [Fri, 24 Apr 2020 18:57:10 +0000 (14:57 -0400)]
Revert "[CUDA][HIP] Fix host/device based overload resolution"

This reverts commit c77a4078e01033aa2206c31a579d217c8a07569b.

4 years ago[CUDA][HIP] Fix host/device based overload resolution
Yaxun (Sam) Liu [Sat, 11 Apr 2020 16:45:00 +0000 (12:45 -0400)]
[CUDA][HIP] Fix host/device based overload resolution

Currently clang fails to compile the following CUDA program in device compilation:

__host__ int foo(int x) {
     return 1;
}

template<class T>
__device__ __host__ int foo(T x) {
    return 2;
}

__device__ __host__ int bar() {
    return foo(1);
}

__global__ void test(int *a) {
    *a = bar();
}

This is because foo is resolved to the __host__ foo instead of the __device__ __host__ foo.
This seems to be a bug, since the __device__ __host__ foo is a viable callee for foo, yet
clang is unable to choose it.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D77954

4 years agoDelete cargo-cult code that doesn't affect the testsuite.
Adrian Prantl [Fri, 24 Apr 2020 18:11:23 +0000 (11:11 -0700)]
Delete cargo-cult code that doesn't affect the testsuite.

4 years ago[libc++] Quietly scp tarballs over with the remote executor
Louis Dionne [Fri, 24 Apr 2020 18:47:09 +0000 (14:47 -0400)]
[libc++] Quietly scp tarballs over with the remote executor

Otherwise, the progress-meter is printed.

4 years ago[AssumeBundles] Use assume bundles in isKnownNonZero
Tyker [Fri, 24 Apr 2020 17:46:18 +0000 (19:46 +0200)]
[AssumeBundles] Use assume bundles in isKnownNonZero

Summary: Use nonnull and dereferenceable from an assume bundle in isKnownNonZero

Reviewers: jdoerfert, nikic, lebedev.ri, reames, fhahn, sstefan1

Reviewed By: jdoerfert

Subscribers: fhahn, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76149

4 years agoAArch64: Remove reversedInstructionsWithoutDebug helper
Vedant Kumar [Fri, 24 Apr 2020 18:16:43 +0000 (11:16 -0700)]
AArch64: Remove reversedInstructionsWithoutDebug helper

When using reversedInstructionsWithoutDebug to construct a range from a
pair of MachineInstrBundleIterators, the range unexpectedly leaves out an
element. This results in mis-optimization as @mstorsjo points out in
https://reviews.llvm.org/D78157.

The problem is that when we convert a MachineInstrBundleIterator to a
reverse iterator, the result gets incremented:

  MachineInstrBundleIterator(++I.getReverse())

The comment there explains that the "resulting iterator will dereference
... to the previous node, which is somewhat unexpected; but converting
the two endpoints in a range will give the same range in reverse". This
makes it hard to understand what reversedInstructionsWithoutDebug will
do: I've removed the helper to prevent similar mistakes in the future.
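
The same off-by-one convention exists in the standard library, which may make the pitfall easier to see. The sketch below uses `std::reverse_iterator` over a `std::vector<int>` purely as an analogy; it does not use MachineInstrBundleIterator, and `reversedRange` and `derefConverted` are hypothetical helper names.

```cpp
#include <iterator>
#include <vector>

using ConstIt = std::vector<int>::const_iterator;

// Walk [First, Last) backwards. Converting both endpoints yields the same
// range in reverse: reverse_iterator(Last) dereferences to *(Last - 1), and
// reverse_iterator(First) acts as the end of the reversed range.
std::vector<int> reversedRange(ConstIt First, ConstIt Last) {
  std::reverse_iterator<ConstIt> RBegin(Last), REnd(First);
  return std::vector<int>(RBegin, REnd);
}

// Dereferencing a single converted iterator gives the *previous* element,
// not the one the original iterator pointed at. Requires It != begin().
int derefConverted(ConstIt It) {
  return *std::reverse_iterator<ConstIt>(It);
}
```

Converting both endpoints of a range is consistent, but converting a single iterator and then treating it as pointing at the same element silently skips one element, which is the kind of mistake the removed helper invited.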

4 years agoAdd Objective-C property accessors loaded from Clang module DWARF to lookup
Adrian Prantl [Thu, 16 Apr 2020 22:20:07 +0000 (15:20 -0700)]
Add Objective-C property accessors loaded from Clang module DWARF to lookup

This patch fixes a bug when synthesizing an ObjC property from
-gmodules debug info. Because the method declaration that is injected
via the non-modular property implementation is not added to the
ObjCInterfaceDecl's lookup pointer, a second copy of the accessor
would be generated when processing the ObjCPropertyDecl. This can be
avoided by finding the existing method decl in
ClangExternalASTSourceCallbacks::FindExternalVisibleDeclsByName() and
adding it to the LookupPtr.

Differential Revision: https://reviews.llvm.org/D78333

4 years ago[llvm][NFC][CallSite] Remove {Immutable}CallSite and CallSiteBase
Mircea Trofin [Fri, 24 Apr 2020 05:59:12 +0000 (22:59 -0700)]
[llvm][NFC][CallSite] Remove {Immutable}CallSite and CallSiteBase

Reviewers: dblaikie, craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78794

4 years ago[X86] Don't use types when getting the intrinsic declaration for x86_avx512_mask_vcvt...
Craig Topper [Fri, 24 Apr 2020 17:59:20 +0000 (10:59 -0700)]
[X86] Don't use types when getting the intrinsic declaration for x86_avx512_mask_vcvtph2ps_512.

This intrinsic isn't overloaded, so we shouldn't query with types.
Doing so causes the backend to miss the intrinsic and not codegen it.
This eventually leads to a linker error.

4 years ago[InstCombine] regenerate test checks; NFC
Sanjay Patel [Fri, 24 Apr 2020 17:47:56 +0000 (13:47 -0400)]
[InstCombine] regenerate test checks; NFC

Values named 'tmp' can cause problems for the auto-generated check script.

4 years agoAdd constructor to ShapedTypeComponents for unranked with element type.
Stella Laurenzo [Thu, 23 Apr 2020 23:52:16 +0000 (16:52 -0700)]
Add constructor to ShapedTypeComponents for unranked with element type.

Subscribers: mehdi_amini, rriddle, jpienaar, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, grosul1, frgossen, Kayjukh, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78774

4 years ago[cmake] Add headers in TextAPI/Elf and TextAPI/MachO subdirectories
Simon Pilgrim [Fri, 24 Apr 2020 17:36:12 +0000 (18:36 +0100)]
[cmake] Add headers in TextAPI/Elf and TextAPI/MachO subdirectories

4 years agoAllocationOrder.h - split MCRegisterInfo.h include. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 17:16:53 +0000 (18:16 +0100)]
AllocationOrder.h - split MCRegisterInfo.h include. NFC.
We only need to include MCRegister.h and SmallVector.h.

4 years ago[SVE] Do not store a bool for Scalable in VectorType
Christopher Tetreault [Fri, 24 Apr 2020 17:11:24 +0000 (10:11 -0700)]
[SVE] Do not store a bool for Scalable in VectorType

Summary:
- Whether or not a vector is scalable is a function of its type. Since
all instances of ScalableVectorType will have true for this value and
all instances of FixedVectorType will have false for this value, there
is no need to store it as a class member.
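
In outline, the design looks like the sketch below. The class names echo those in the summary, but this is a simplified, hypothetical model (a two-value `TypeID` and an element count), not the real LLVM class hierarchy.

```cpp
// Simplified model: the type ID already distinguishes the two kinds of
// vector, so scalability is derived from it instead of stored as a bool.
class VectorType {
public:
  enum TypeID { FixedVectorTyID, ScalableVectorTyID };

  TypeID getTypeID() const { return ID; }
  unsigned getMinNumElements() const { return MinElts; }

  // Computed from the type ID; no separate "Scalable" member is needed.
  bool isScalable() const { return ID == ScalableVectorTyID; }

protected:
  VectorType(TypeID ID, unsigned MinElts) : ID(ID), MinElts(MinElts) {}

private:
  TypeID ID;
  unsigned MinElts;
};

class FixedVectorType : public VectorType {
public:
  explicit FixedVectorType(unsigned NumElts)
      : VectorType(FixedVectorTyID, NumElts) {}
};

class ScalableVectorType : public VectorType {
public:
  explicit ScalableVectorType(unsigned MinElts)
      : VectorType(ScalableVectorTyID, MinElts) {}
};
```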

Reviewers: efriedma, fpetrogalli, kmclaughlin

Reviewed By: fpetrogalli

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78601

4 years ago[CostModel][X86] Account for splitting cost when vector zext/sext type legalize to...
Craig Topper [Thu, 23 Apr 2020 23:54:07 +0000 (16:54 -0700)]
[CostModel][X86] Account for splitting cost when vector zext/sext type legalize to the same size vector.

4 years ago[DSE,MSSA] Improve debug output (NFC).
Florian Hahn [Fri, 24 Apr 2020 16:48:03 +0000 (17:48 +0100)]
[DSE,MSSA] Improve debug output (NFC).

This patch slightly improves the formatting of the debug output, adds a
few missing outputs and makes some existing outputs more consistent with
the rest.

4 years ago[MC] Fix quadratic behavior in addPendingLabel()
Alexandre Ganea [Fri, 24 Apr 2020 16:48:39 +0000 (12:48 -0400)]
[MC] Fix quadratic behavior in addPendingLabel()

Differential Revision: https://reviews.llvm.org/D78775

4 years ago[Driver] Move GCC multilib/multiarch paths support from Linux.cpp to Gnu.cpp
Samuel Thibault [Fri, 24 Apr 2020 01:23:44 +0000 (18:23 -0700)]
[Driver] Move GCC multilib/multiarch paths support from Linux.cpp to Gnu.cpp

The current code for GNU/Linux is actually completely generic, and can be moved to ToolChains/Gnu.cpp,
so that it can benefit GNU/Hurd and GNU/kFreeBSD.

Reviewed By: MaskRay, phosek

Differential Revision: https://reviews.llvm.org/D73845

4 years ago[llvm][NFC] Factor out inlining pipeline as a module pipeline.
Mircea Trofin [Mon, 20 Apr 2020 18:05:29 +0000 (11:05 -0700)]
[llvm][NFC] Factor out inlining pipeline as a module pipeline.

Summary:
This simplifies testing in scenarios where we want to set up module-wide
analyses for inlining. The patch enables treating inlining and its
function cleanups, as a module pass. The alternative would be for tests
to describe the pipeline, which is tedious and adds maintenance
overhead.

Reviewers: davidxl, dblaikie, jdoerfert, sstefan1

Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78512

4 years ago[DSE,MSSA] Skip checking write clobber for DomAccess (NFC).
Florian Hahn [Thu, 23 Apr 2020 19:37:26 +0000 (20:37 +0100)]
[DSE,MSSA] Skip checking write clobber for DomAccess (NFC).

There is no need to check if the starting access is a write clobber,
and all of its uses have already been checked.

4 years ago[InstCombine] intersect FMF when reassociating FP min/max intrinsics
Sanjay Patel [Fri, 24 Apr 2020 15:59:15 +0000 (11:59 -0400)]
[InstCombine] intersect FMF when reassociating FP min/max intrinsics

As discussed in PR45478:
https://bugs.llvm.org/show_bug.cgi?id=45478
...propagating FMF from the outer (second) call is not correct,
so intersect them instead.
I suspect we could do better (see TODO comment), but mismatched
FMF is probably too rare to care about.
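
Intersection here means that a flag survives on the reassociated call only if both original calls carried it. A toy model with a bitmask, where the enum and `intersectFMF` are hypothetical and cover only three of the flags:

```cpp
#include <cstdint>

// Hypothetical bitmask model of a few fast-math flags.
enum FastMathFlag : std::uint8_t {
  NoNaNs = 1,
  NoInfs = 2,
  NoSignedZeros = 4
};

// Flags that are safe on the reassociated call: only those present on both
// the inner and the outer call.
std::uint8_t intersectFMF(std::uint8_t Inner, std::uint8_t Outer) {
  return Inner & Outer;
}
```

For example, if the inner call has `NoNaNs | NoInfs` and the outer call has `NoNaNs | NoSignedZeros`, only `NoNaNs` is kept. Copying the outer call's flags instead would wrongly claim `NoSignedZeros` for an operation that never had it.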

Differential Revision: https://reviews.llvm.org/D78631

4 years ago[lldb/Core] Don't crash in GetSoftwareBreakpointTrapOpcode for unknown triples
Jonas Devlieghere [Fri, 24 Apr 2020 16:03:51 +0000 (09:03 -0700)]
[lldb/Core] Don't crash in GetSoftwareBreakpointTrapOpcode for unknown triples

This patch ensures we don't crash in GetSoftwareBreakpointTrapOpcode for
not-yet-supported architectures but rather continue with degraded
behavior.

I found the issue in the context of an invalid ArchSpec, which should be
handled further up the chain. In this patch I've also added an assert to
cover that, so we can still catch those issues.

Differential revision: https://reviews.llvm.org/D78588

4 years ago[AArch64] Allow PAC mnemonics in the HINT space with PAC disabled
Pablo Barrio [Tue, 10 Mar 2020 15:05:49 +0000 (15:05 +0000)]
[AArch64] Allow PAC mnemonics in the HINT space with PAC disabled

Summary:
It is important to emit HINT instructions instead of PAC ones when
PAC is disabled. This allows compatibility with other assemblers
(e.g. GAS). This was implemented in commit da33762de853.

Still, developers of assembly code will want to write code that is
compatible with both pre- and post-PAC CPUs. They could use HINT
mnemonics, but the new mnemonics are a lot more readable (e.g.
paciaz instead of hint #24), and they will result in the same
encodings. So, while LLVM should not *emit* the new mnemonics when
PAC is disabled, this patch will at least make LLVM *accept*
assembly code that uses them.

Reviewers: danielkiss, chill, olista01, LukeCheeseman, simon_tatham

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78372

4 years ago[gn build] update two comments
Nico Weber [Fri, 24 Apr 2020 15:52:42 +0000 (11:52 -0400)]
[gn build] update two comments

4 years ago[XRay] Change ARM/AArch64/powerpc64le to use version 2 sled (PC-relative address)
Fangrui Song [Tue, 21 Apr 2020 18:31:15 +0000 (11:31 -0700)]
[XRay] Change ARM/AArch64/powerpc64le to use version 2 sled (PC-relative address)

Follow-up of D78082 (x86-64).

This change avoids dynamic relocations in `xray_instr_map` for ARM/AArch64/powerpc64le.

MIPS64 cannot use 64-bit PC-relative addresses because R_MIPS_PC64 is not defined.
Because MIPS32 shares the same code, for simplicity, we don't use PC-relative addresses for MIPS32 as well.

Tested on AArch64 Linux and ppc64le Linux.

Reviewed By: ianlevesque

Differential Revision: https://reviews.llvm.org/D78590

4 years agoValueEnumerator.h - remove unnecessary includes. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 15:21:25 +0000 (16:21 +0100)]
ValueEnumerator.h - remove unnecessary includes. NFC.
The forward declarations are already present in the header.

4 years agoMipsTargetStreamer.h - remove unnecessary MipsABIFlagsSection forward declaration...
Simon Pilgrim [Fri, 24 Apr 2020 15:19:00 +0000 (16:19 +0100)]
MipsTargetStreamer.h - remove unnecessary MipsABIFlagsSection forward declaration. NFC.
We need to include MipsABIFlagsSection.h already

4 years agoAMDGPUArgumentUsageInfo.h - cleanup includes and forward declarations. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 15:18:10 +0000 (16:18 +0100)]
AMDGPUArgumentUsageInfo.h - cleanup includes and forward declarations. NFC.
Reduce Function.h include to (already existing) forward declaration.
Remove unused GCNSubtarget/TargetMachine forward declarations.

4 years ago[gn build] Port 7aaff8fd2da
LLVM GN Syncbot [Fri, 24 Apr 2020 15:06:14 +0000 (15:06 +0000)]
[gn build] Port 7aaff8fd2da

4 years ago[gn build] minimally merge 67b2dbd5a33583fe148fd12 even more
Nico Weber [Fri, 24 Apr 2020 15:04:22 +0000 (11:04 -0400)]
[gn build] minimally merge 67b2dbd5a33583fe148fd12 even more

4 years ago[libc++] NFC: Remove unused parameters in the new test format
Louis Dionne [Fri, 24 Apr 2020 15:04:18 +0000 (11:04 -0400)]
[libc++] NFC: Remove unused parameters in the new test format

4 years ago[ARM] Armv8.6-a Matrix Mul cmd line support
Luke Geeson [Thu, 9 Apr 2020 22:00:22 +0000 (23:00 +0100)]
[ARM] Armv8.6-a Matrix Mul cmd line support

This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Command line options to enable these features with +i8mm, +f32mm, or +f64mm

Note: +f32mm and +f64mm are optional and so are not enabled by default

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Reviewers: t.p.northover, DavidSpickett

Reviewed By: DavidSpickett

Subscribers: DavidSpickett, ostannard, kristof.beyls, danielkiss,
cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77875

4 years ago[AArch32] Armv8.6a Matrix Mul Assembly Parsing Support
Luke Geeson [Thu, 9 Apr 2020 20:38:37 +0000 (21:38 +0100)]
[AArch32] Armv8.6a Matrix Mul Assembly Parsing Support

This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch32 and Assembly Parsing

D77872 has already added the MC representations of the instructions so that
they can be used in code gen; this patch fills in the details needed to
make assembly parsing work, and adds tests for asm and disasm

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Reviewers: t.p.northover, simon_tatham

Reviewed By: simon_tatham

Subscribers: simon_tatham, ostannard, kristof.beyls, hiraditya,
danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77874

4 years ago[AArch64] Armv8.6-A Mat Mul SVE Assembly
Luke Geeson [Thu, 9 Apr 2020 19:25:27 +0000 (20:25 +0100)]
[AArch64] Armv8.6-A Mat Mul SVE Assembly

This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch64 Scalable Vector Instructions (in line
  with the Scalable Vector Extension - SVE)

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Reviewers: t.p.northover, rengolin, c-rhodes

Reviewed By: c-rhodes

Subscribers: c-rhodes, ostannard, tschuett, kristof.beyls, hiraditya,
danielkiss, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77873

4 years ago[AArch32] Armv8.6-a Matrix Mult Assembly + Intrinsics
Luke Geeson [Thu, 9 Apr 2020 18:29:19 +0000 (19:29 +0100)]
[AArch32] Armv8.6-a Matrix Mult Assembly + Intrinsics

This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch32
- Intrinsics Support for AArch32 Neon Intrinsics for Matrix
  Multiplication

Note: these extensions are optional in the 8.6a architecture and so have
to be enabled by default

No additional IR types or C Types are needed for this extension.

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Reviewers: t.p.northover, miyuki

Reviewed By: miyuki

Subscribers: miyuki, ostannard, kristof.beyls, hiraditya, danielkiss,
cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77872

4 years ago[AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics
Luke Geeson [Thu, 9 Apr 2020 16:21:19 +0000 (17:21 +0100)]
[AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics

This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:

https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a

This patch includes:

- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)

No IR types or C Types are needed for this extension.

This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)

Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman

Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin

Reviewed By: kmclaughlin

Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss,
cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D77871

4 years ago[MLIR] Add RecursiveSideEffects to Loops::ParallelOp.
Tres Popp [Thu, 23 Apr 2020 12:03:11 +0000 (14:03 +0200)]
[MLIR] Add RecursiveSideEffects to Loops::ParallelOp.

Summary:
This is to specify that ParallelOp does not have side effects on its own
but has the effects of all operations executed in its region.

Differential Revision: https://reviews.llvm.org/D78707

4 years agoDwarfDebug.h - remove unnecessary forward declarations. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 14:34:54 +0000 (15:34 +0100)]
DwarfDebug.h - remove unnecessary forward declarations. NFC.
We include their headers already.

4 years agoMetadataLoader.h - remove unnecessary Error forward declaration. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 14:32:57 +0000 (15:32 +0100)]
MetadataLoader.h - remove unnecessary Error forward declaration. NFC.
We need to include Error.h already

4 years agoLLParser.h - remove unnecessary Module.h include. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 13:49:45 +0000 (14:49 +0100)]
LLParser.h - remove unnecessary Module.h include. NFC.

4 years ago[gn build] minimally merge 67b2dbd5a33583fe148fd12 more
Nico Weber [Fri, 24 Apr 2020 14:23:22 +0000 (10:23 -0400)]
[gn build] minimally merge 67b2dbd5a33583fe148fd12 more

4 years ago[mlir] Add a ViewLikeOpInterface
Lei Zhang [Wed, 22 Apr 2020 15:16:34 +0000 (11:16 -0400)]
[mlir] Add a ViewLikeOpInterface

This can help provide a common interface for view-like
ops so that for example Linalg's dependency analysis
can avoid relying on concrete ops.

Differential Revision: https://reviews.llvm.org/D78645

4 years ago[gn build] minimally merge 67b2dbd5a33583fe148fd12
Nico Weber [Fri, 24 Apr 2020 13:58:19 +0000 (09:58 -0400)]
[gn build] minimally merge 67b2dbd5a33583fe148fd12

4 years ago[OPENMP]Use new interface for task reduction.
Alexey Bataev [Thu, 23 Apr 2020 17:27:03 +0000 (13:27 -0400)]
[OPENMP]Use new interface for task reduction.

Summary:
Patch forces codegen to use the new runtime functions for task reductions where
the issue with passing the address of the original variables to the UDR
initializers is fixed. Also, this patch is required for upcoming
support of the task modifier in the reduction clause.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78733

4 years ago[mlir][drr] NFC: avoid SmallVector when collecting substitution values
Lei Zhang [Thu, 23 Apr 2020 22:28:22 +0000 (18:28 -0400)]
[mlir][drr] NFC: avoid SmallVector when collecting substitution values

Now both Operation::operand_range and Operation::result_range have
.begin() and .end() for range-based for loops, and we have
ValueRange for wrapping a single Value. We can remove the SmallVector
materialization!

Differential Revision: https://reviews.llvm.org/D78766

4 years ago[Debuginfo] Remove redundant variable from getAttributeValue()
Alexey Lapshin [Wed, 22 Apr 2020 21:05:56 +0000 (00:05 +0300)]
[Debuginfo] Remove redundant variable from getAttributeValue()

Summary: AttrIndex could be removed from DWARFAbbreviationDeclaration::getAttributeValue.

Reviewers: clayborg, dblaikie

Differential Revision: https://reviews.llvm.org/D78672

4 years ago[SveEmitter] Add builtins for compares and ReverseCompare flag.
Sander de Smalen [Fri, 24 Apr 2020 10:31:34 +0000 (11:31 +0100)]
[SveEmitter] Add builtins for compares and ReverseCompare flag.

The IsReverseCompare flag tells CGBuiltin to swap the operands,
so that LT/LE intrinsics can be expressed in terms of GE/GT
intrinsics.
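
The operand swap relies on the identities a < b ⇔ b > a and a <= b ⇔ b >= a. A scalar sketch, where the function names are hypothetical and plain `int` comparisons stand in for the SVE intrinsics:

```cpp
// Hypothetical stand-ins for the "native" GT/GE compares.
bool cmpGT(int A, int B) { return A > B; }
bool cmpGE(int A, int B) { return A >= B; }

// LT and LE need no implementation of their own: swap the operands and
// reuse GT/GE, which is what a reverse-compare flag requests.
bool cmpLT(int A, int B) { return cmpGT(B, A); }
bool cmpLE(int A, int B) { return cmpGE(B, A); }
```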

This patch also adds builtins for the wide-variants of the compares.

Reviewers: SjoerdMeijer, efriedma, ctetreau

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78747

4 years ago[gn build] (manually) merge 8f766e382b77eef in a minimal way
Nico Weber [Fri, 24 Apr 2020 13:32:10 +0000 (09:32 -0400)]
[gn build] (manually) merge 8f766e382b77eef in a minimal way

4 years agoARCRuntimeEntryPoints.h - remove unnecessary includes. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 13:16:31 +0000 (14:16 +0100)]
ARCRuntimeEntryPoints.h - remove unnecessary includes. NFC.

4 years ago[LLD][ELF][ARM] recommit Fix ARM Exidx order for non monotonic section order
Peter Smith [Fri, 24 Apr 2020 10:23:23 +0000 (11:23 +0100)]
[LLD][ELF][ARM] recommit Fix ARM Exidx order for non monotonic section order

Fixed error detected by msan. The size field of the .ARM.exidx synthetic
section needs to be initialized to at least estimation level before
calling assignAddresses as that will use the size field.

This was previously reverted in 1ca16fc4f5146b90512d4740cfcc4d4c34640853.

Differential Revision: https://reviews.llvm.org/D78422

4 years agoFix minor bug in CommonArgs.cpp
Yaxun (Sam) Liu [Fri, 24 Apr 2020 12:46:27 +0000 (08:46 -0400)]
Fix minor bug in CommonArgs.cpp

Change-Id: Ibe87b1633cc7516479bb08bf51b6860a1585a94f

4 years agoLLVMContextImpl.h - remove defunct getOrAddScope* helpers declarations. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 12:27:42 +0000 (13:27 +0100)]
LLVMContextImpl.h - remove defunct getOrAddScope* helpers declarations. NFC.
The implementation and uses were removed back at rL223802 (IR: Split Metadata from Value) but these were missed.

4 years agoLLVMContextImpl.h - cleanup includes and forward declarations. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 12:07:02 +0000 (13:07 +0100)]
LLVMContextImpl.h - cleanup includes and forward declarations. NFC.
Reduce StringRef.h include to forward declaration.
Remove unnecessary ConstantFP/ConstantInt forward declarations as we have to include Constants.h

4 years agoFileCheckImpl.h - remove unnecessary FileCheckDiag forward declaration. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 11:49:12 +0000 (12:49 +0100)]
FileCheckImpl.h - remove unnecessary FileCheckDiag forward declaration. NFC.

4 years agoSIRegisterInfo.h - remove unnecessary MachineRegisterInfo forward declaration. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 11:43:46 +0000 (12:43 +0100)]
SIRegisterInfo.h - remove unnecessary MachineRegisterInfo forward declaration. NFC.
We already need to include MachineRegisterInfo.h

4 years agoLLLexer.h - reduce SourceMgr.h include to SMLoc.h. NFC
Simon Pilgrim [Fri, 24 Apr 2020 11:37:25 +0000 (12:37 +0100)]
LLLexer.h - reduce SourceMgr.h include to SMLoc.h. NFC
We only need the SMLoc definition and the SourceMgr forward declaration.

4 years agoHexagonShuffler.h - remove duplicate STLExtras.h include. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 11:35:59 +0000 (12:35 +0100)]
HexagonShuffler.h - remove duplicate STLExtras.h include. NFC.

4 years ago[clangd] NFC: Omit deduced template parameters
Kirill Bobyrev [Fri, 24 Apr 2020 12:26:35 +0000 (14:26 +0200)]
[clangd] NFC: Omit deduced template parameters

Related revision: D78521

4 years ago[obj2yaml] - Program headers: simplify the computation of p_filesz.
Georgii Rymar [Wed, 22 Apr 2020 11:00:33 +0000 (14:00 +0300)]
[obj2yaml] - Program headers: simplify the computation of p_filesz.

Currently we have computations of `p_filesz` and `p_memsz` mixed together
with the use of a loop over fragments. After recent changes it is possible to
avoid using a loop for the computation of `p_filesz`, since we know that fragments
are sorted by their file offsets.

The main benefit of this change is that it splits the computation of `p_filesz`
from that of `p_memsz`, which is simpler and allows us to fix the computation of
`p_memsz` independently (D78005 shows the issue that we have currently).
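
Once the fragments are sorted by file offset, the file size follows from the last fragment alone, which is why no loop is needed. The sketch below illustrates that reasoning only. `Fragment`, `IsNoBits`, and `computeFileSize` are hypothetical, and treating a NOBITS-style fragment as occupying no file space is an assumption of this example rather than something stated in the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical fragment of a segment: where it sits in the file, how large
// it is, and whether it occupies file space at all.
struct Fragment {
  std::uint64_t Offset;
  std::uint64_t Size;
  bool IsNoBits;
};

// With fragments sorted by file offset, the segment's file size is the end
// of the last fragment minus the segment's starting offset.
std::uint64_t computeFileSize(std::uint64_t SegmentOffset,
                              const std::vector<Fragment> &Frags) {
  if (Frags.empty())
    return 0;
  const Fragment &Last = Frags.back();
  // Assumption: a NOBITS-style fragment contributes no bytes to the file.
  std::uint64_t End = Last.Offset + (Last.IsNoBits ? 0 : Last.Size);
  assert(End >= SegmentOffset && "fragments must not precede the segment");
  return End - SegmentOffset;
}
```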

Differential revision: https://reviews.llvm.org/D78628

4 years ago[clangd] Fix build when CLANGD_REMOTE is not enabled
Kirill Bobyrev [Fri, 24 Apr 2020 12:07:03 +0000 (14:07 +0200)]
[clangd] Fix build when CLANGD_REMOTE is not enabled

4 years ago[mlir] Add missing llvm::iterator_facade_base<...>::operator++ for
Haojian Wu [Fri, 24 Apr 2020 12:02:58 +0000 (14:02 +0200)]
[mlir] Add missing llvm::iterator_facade_base<...>::operator++ for
UseIterator;

This would fix our internal build.

4 years ago[clangd] Extend dexp to support remote index
Kirill Bobyrev [Fri, 24 Apr 2020 11:46:45 +0000 (13:46 +0200)]
[clangd] Extend dexp to support remote index

Summary:
* Merge clangd-remote-client into dexp
* Implement `clangd::remote::IndexClient` that is derived from `SymbolIndex`
* Upgrade remote mode-related CMake infrastructure

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: mgorny, ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78521

4 years ago[AMDGPU] Skip generating cache invalidating instructions on AMDPAL
Piotr Sobczak [Fri, 24 Apr 2020 07:56:41 +0000 (09:56 +0200)]
[AMDGPU] Skip generating cache invalidating instructions on AMDPAL

Summary:
Frontend guarantees that coherent accesses have
corresponding cache policy bits set (glc, dlc).
Therefore there is no need for extra instructions
that invalidate cache.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78800

4 years ago[ADT] Move allocate_buffer to MemAlloc.h and out of line
Benjamin Kramer [Fri, 24 Apr 2020 11:22:40 +0000 (13:22 +0200)]
[ADT] Move allocate_buffer to MemAlloc.h and out of line

There's an ABI breakage here if LLVM is compiled in C++14 without
aligned allocation and a user tries to use the result with aligned
allocation. If DenseMap or unique_function is used across that ABI
boundary it will break (PR45413). Moving it out of line is a bit of
a band-aid and LLVM doesn't really give ABI guarantees at this level,
but given the number of complaints I've received over this it still
seems worth fixing.
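
A minimal sketch of the out-of-line idea, assuming a C++17 toolchain with
aligned allocation. The point is that the choice of allocation function is made
once, inside the library's own translation unit, rather than in every user that
includes the header. The names `allocateBufferSketch`,
`deallocateBufferSketch` and `isAlignedTo` are hypothetical and do not mirror
the real MemAlloc.h signatures.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// In a real library these two definitions would live in a single .cpp file,
// with only declarations in the header, so that allocation and deallocation
// always agree regardless of the flags a client is compiled with.
// Alignment is assumed to be a power of two.
void *allocateBufferSketch(std::size_t Size, std::size_t Alignment) {
  return ::operator new(Size, std::align_val_t(Alignment));
}

void deallocateBufferSketch(void *Ptr, std::size_t Alignment) {
  ::operator delete(Ptr, std::align_val_t(Alignment));
}

// Helper for checking that a pointer honours the requested alignment.
bool isAlignedTo(const void *Ptr, std::size_t Alignment) {
  return reinterpret_cast<std::uintptr_t>(Ptr) % Alignment == 0;
}
```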

4 years agoPassAnalysisSupport.h - reduce StringRef.h include to forward declaration. NFC.
Simon Pilgrim [Fri, 24 Apr 2020 11:13:28 +0000 (12:13 +0100)]
PassAnalysisSupport.h - reduce StringRef.h include to forward declaration. NFC.

4 years ago[ARM] Various tests for MVE and FP16 codegen. NFC
David Green [Thu, 23 Apr 2020 20:58:00 +0000 (21:58 +0100)]
[ARM] Various tests for MVE and FP16 codegen. NFC

4 years ago[libc++] Improve the detection of whether the blocks runtime is available
Louis Dionne [Thu, 23 Apr 2020 20:47:52 +0000 (16:47 -0400)]
[libc++] Improve the detection of whether the blocks runtime is available

The runtime for Blocks may not be available even though the Blocks
language extension _is_ available. Instead of potentially failing,
this commit is much more conservative and assumes the runtime for
Blocks is only provided on Apple platforms.

Differential Revision: https://reviews.llvm.org/D78757

4 years ago[SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics
Kerry McLaughlin [Fri, 24 Apr 2020 09:45:25 +0000 (10:45 +0100)]
[SVE][CodeGen] Lower SDIV & UDIV to SVE intrinsics

Summary:
This patch maps IR operations for sdiv & udiv to the
@llvm.aarch64.sve.[s|u]div intrinsics.

A ptrue must be created during lowering as the div instructions
have only a predicated form.

Patch contains changes by Andrzej Warzynski.

Reviewers: sdesmalen, c-rhodes, efriedma, cameron.mcinally, rengolin

Reviewed By: efriedma

Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, andwar, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D78569

4 years ago[SveEmitter] Add builtins for contiguous prefetches
Sander de Smalen [Fri, 24 Apr 2020 10:31:34 +0000 (11:31 +0100)]
[SveEmitter] Add builtins for contiguous prefetches

This patch also adds the enum `sv_prfop` for the prefetch operation specifier
and checks to ensure the passed enum values are valid.

Reviewers: SjoerdMeijer, efriedma, ctetreau

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78674

4 years ago[LoopVectorize] Preserve CFG analyses if CFG wasn't modified
Max Kazantsev [Fri, 24 Apr 2020 10:02:37 +0000 (17:02 +0700)]
[LoopVectorize] Preserve CFG analyses if CFG wasn't modified

One of the transforms the loop vectorizer makes is LCSSA formation. In some cases it
is the only transform it makes. We should not drop CFG analyses if only LCSSA was
formed and no actual CFG changes were made.

We should think of expanding this logic to other passes as well, and maybe make
it a part of the PM framework.

Reviewed By: Florian Hahn
Differential Revision: https://reviews.llvm.org/D78360

4 years ago[SveEmitter] Add builtins for svld1rq
Sander de Smalen [Fri, 24 Apr 2020 10:10:28 +0000 (11:10 +0100)]
[SveEmitter] Add builtins for svld1rq

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78748

4 years ago[SveEmitter] Add builtins for scatter stores
Sander de Smalen [Fri, 24 Apr 2020 09:57:10 +0000 (10:57 +0100)]
[SveEmitter] Add builtins for scatter stores

D77735 only added scatters for the non-temporal variants.

Reviewers: SjoerdMeijer, efriedma, andwar

Reviewed By: efriedma

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78751

4 years agoDo not declare compiler extension member as const
serge-sans-paille [Fri, 24 Apr 2020 09:44:42 +0000 (11:44 +0200)]
Do not declare compiler extension member as const

It keeps them default constructible.
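
One plausible reading of this, sketched with hypothetical struct names that are
not from the patch: a class with a const data member and no initializer for it
has its implicit default constructor deleted, while the same class with a
non-const member stays default constructible.

```cpp
#include <type_traits>

// Hypothetical structs for illustration only.
struct ExtensionWithConstMember {
  const int Priority; // const and uninitialized: default constructor is deleted
};

struct ExtensionWithPlainMember {
  int Priority; // non-const: implicit default constructor is available
};

static_assert(
    !std::is_default_constructible<ExtensionWithConstMember>::value,
    "a const member without an initializer blocks default construction");
static_assert(std::is_default_constructible<ExtensionWithPlainMember>::value,
              "a non-const member keeps the type default constructible");
```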

4 years ago[clangd] Fix modernize-loop-convert "multiple diag in flight" crash.
Haojian Wu [Thu, 23 Apr 2020 14:20:56 +0000 (16:20 +0200)]
[clangd] Fix modernize-loop-convert "multiple diag in flight" crash.

Summary:
This may not be ideal, but it is trivial and does fix the crash.

Fixes https://github.com/clangd/clangd/issues/156.

Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D78715

4 years ago[MC][mips] Replace setRType## methods by single setRTypes function. NFC
Simon Atanasyan [Fri, 24 Apr 2020 08:54:23 +0000 (11:54 +0300)]
[MC][mips] Replace setRType## methods by single setRTypes function. NFC

MCELFObjectWriter::setRType## methods are always used all together to
build a complete MIPS N64 ABI "chain" of relocations. Using a single
function for this task makes the code less verbose.
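
A sketch of what a single packing function can look like, assuming the three
relocation types of an N64 chain occupy consecutive 8-bit fields of one word.
The names `packRTypes` and `unpackRType` and the exact field layout are
assumptions for illustration, not the signatures from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Packs three 8-bit relocation types into one value: Type1 in bits 0-7,
// Type2 in bits 8-15, Type3 in bits 16-23 (assumed layout).
std::uint32_t packRTypes(std::uint32_t Type1, std::uint32_t Type2,
                         std::uint32_t Type3) {
  return (Type1 & 0xffu) | ((Type2 & 0xffu) << 8) | ((Type3 & 0xffu) << 16);
}

// Extracts the relocation type stored at Index (0, 1 or 2).
std::uint32_t unpackRType(std::uint32_t Packed, unsigned Index) {
  assert(Index < 3 && "only three relocation types are stored");
  return (Packed >> (8 * Index)) & 0xffu;
}
```

One call such as `packRTypes(A, B, C)` then replaces three chained setter
calls, one per field.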

4 years ago[VE] Update floating-point arithmetic instructions
Kazushi (Jam) Marukawa [Fri, 24 Apr 2020 09:11:33 +0000 (11:11 +0200)]
[VE] Update floating-point arithmetic instructions

Summary:
Change all mnemonics to match assembly instructions to simplify the mnemonic
naming rules. This time, update all floating-point arithmetic instructions.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D78768

4 years agoFix -Wunused-variable warning, NFC.
Haojian Wu [Fri, 24 Apr 2020 09:03:53 +0000 (11:03 +0200)]
Fix -Wunused-variable warning, NFC.

4 years ago[NFC][PowerPC] Fix the liveins for 3 mir test cases
Kang Zhang [Fri, 24 Apr 2020 08:03:02 +0000 (08:03 +0000)]
[NFC][PowerPC] Fix the liveins for 3 mir test cases

4 years agoRevert "[Attributor][NFC] Encode IRPositions in the bits of a single pointer"
Johannes Doerfert [Fri, 24 Apr 2020 07:53:51 +0000 (02:53 -0500)]
Revert "[Attributor][NFC] Encode IRPositions in the bits of a single pointer"

A dependent patch has been reverted [0]. Until it goes back in, this one
has to stay out.

[0] ebdb89399499cfca56fbf98c5f97d892d5976237

This reverts commit d254b50b2b5b22368780c6003c419ffa1e23fa93.

4 years agoUpdate compiler extension integration into the build system
serge-sans-paille [Mon, 20 Apr 2020 10:39:32 +0000 (12:39 +0200)]
Update compiler extension integration into the build system

The approach here is to create a new (empty) component, `Extensions`, where all
statically compiled extensions dynamically register their dependencies. That way
we're more natively compatible with LLVMBuild and llvm-config.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=44870

Differential Revision: https://reviews.llvm.org/D78192

4 years agoRevert "[Attributor][NFC] Let AbstractAttribute be an IRPosition"
Johannes Doerfert [Fri, 24 Apr 2020 07:23:13 +0000 (02:23 -0500)]
Revert "[Attributor][NFC] Let AbstractAttribute be an IRPosition"

It seems this breaks the windows builds:
  http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/7454/steps/build-llvm-project/logs/stdio

This reverts commit 6782635e90c11a4535e5b08212c8bbd3b3486f8d.

4 years ago[MLIR] Ensure `gpu.func` must be inside a `gpu.module`.
Frederik Gossen [Tue, 21 Apr 2020 10:05:17 +0000 (10:05 +0000)]
[MLIR] Ensure `gpu.func` must be inside a `gpu.module`.

Ensure that `gpu.func` is only used within the dedicated `gpu.module`.
Implement the constraint in the GPU dialect and adapt test cases.

Differential Revision: https://reviews.llvm.org/D78541

4 years ago[Attributor][NFC] Encode IRPositions in the bits of a single pointer
Johannes Doerfert [Fri, 17 Apr 2020 23:03:04 +0000 (18:03 -0500)]
[Attributor][NFC] Encode IRPositions in the bits of a single pointer

This reduces memory consumption for IRPositions by eliminating the
vtable pointer and the `KindOrArgNo` integer. Since each abstract
attribute has an associated IRPosition, the 12-16 bytes we save add up
quickly.
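
The general technique can be sketched as follows, assuming the anchor pointer
is at least 4-byte aligned so its two low bits are free to hold a kind. The
class `TaggedPosition` and the `PositionKind` values are hypothetical and much
simpler than the real IRPosition encoding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical position kinds; two bits are enough for four values.
enum class PositionKind : std::uintptr_t {
  Value = 0,
  Returned = 1,
  Argument = 2,
  CallSite = 3
};

// Stores a pointer and a 2-bit kind in a single pointer-sized word, with no
// vtable pointer and no separate integer member.
class TaggedPosition {
  static constexpr std::uintptr_t KindMask = 0x3;
  std::uintptr_t Bits;

public:
  TaggedPosition(void *Anchor, PositionKind Kind)
      : Bits(reinterpret_cast<std::uintptr_t>(Anchor) |
             static_cast<std::uintptr_t>(Kind)) {
    assert((reinterpret_cast<std::uintptr_t>(Anchor) & KindMask) == 0 &&
           "anchor must be at least 4-byte aligned");
  }

  void *getAnchor() const { return reinterpret_cast<void *>(Bits & ~KindMask); }
  PositionKind getKind() const {
    return static_cast<PositionKind>(Bits & KindMask);
  }
};

static_assert(sizeof(TaggedPosition) == sizeof(void *),
              "the whole position fits in one pointer");
```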

No functional change is intended.

---

Single run of the Attributor module and then CGSCC pass (oldPM)
for SPASS/clause.c (~10k LLVM-IR loc):

Before:
```
calls to allocation functions: 469545 (260135/s)
temporary memory allocations: 77137 (42735/s)
peak heap memory consumption: 30.50MB
peak RSS (including heaptrack overhead): 119.50MB
total memory leaked: 269.07KB
```

After:
```
calls to allocation functions: 468999 (274108/s)
temporary memory allocations: 77002 (45004/s)
peak heap memory consumption: 28.83MB
peak RSS (including heaptrack overhead): 118.05MB
total memory leaked: 269.07KB
```

Difference:
```
calls to allocation functions: -546 (5808/s)
temporary memory allocations: -135 (1436/s)
peak heap memory consumption: -1.67MB
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B
```

---

CTMark 15 runs

Metric: compile_time

Program                                        lhs    rhs    diff
 test-suite...:: CTMark/sqlite3/sqlite3.test    25.07  24.09 -3.9%
 test-suite...Mark/mafft/pairlocalalign.test    14.58  14.14 -3.0%
 test-suite...-typeset/consumer-typeset.test    21.78  21.58 -0.9%
 test-suite :: CTMark/SPASS/SPASS.test          21.95  22.03  0.4%
 test-suite :: CTMark/lencod/lencod.test        25.43  25.50  0.3%
 test-suite...ark/tramp3d-v4/tramp3d-v4.test    23.88  23.83 -0.2%
 test-suite...TMark/7zip/7zip-benchmark.test    60.24  60.11 -0.2%
 test-suite :: CTMark/kimwitu++/kc.test         15.69  15.69 -0.0%
 test-suite...:: CTMark/ClamAV/clamscan.test    25.43  25.42 -0.0%
 test-suite :: CTMark/Bullet/bullet.test        37.63  37.62 -0.0%
 Geomean difference                                          -0.8%
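
For reference, the geomean figure can be reproduced from the rows above by
taking the geometric mean of the rhs/lhs ratios. This is a sketch under the
assumption that the reporting tool computes it this way; the function names
are made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Timing {
  double Lhs; // compile time before the change
  double Rhs; // compile time after the change
};

// Geometric mean of Rhs/Lhs over all rows, expressed as a percentage change.
double geomeanDifferencePercent(const std::vector<Timing> &Rows) {
  assert(!Rows.empty() && "need at least one row");
  double LogSum = 0.0;
  for (const Timing &Row : Rows)
    LogSum += std::log(Row.Rhs / Row.Lhs);
  return (std::exp(LogSum / static_cast<double>(Rows.size())) - 1.0) * 100.0;
}

// The ten CTMark rows from the table above, in the same order.
std::vector<Timing> ctmarkRows() {
  return {{25.07, 24.09}, {14.58, 14.14}, {21.78, 21.58}, {21.95, 22.03},
          {25.43, 25.50}, {23.88, 23.83}, {60.24, 60.11}, {15.69, 15.69},
          {25.43, 25.42}, {37.63, 37.62}};
}
```

Evaluating `geomeanDifferencePercent(ctmarkRows())` gives roughly -0.8,
consistent with the last row of the table.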

---

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D78722

4 years ago[Attributor][NFC] Let AbstractAttribute be an IRPosition
Johannes Doerfert [Thu, 23 Apr 2020 03:03:44 +0000 (22:03 -0500)]
[Attributor][NFC] Let AbstractAttribute be an IRPosition

Since every AbstractAttribute so far, and for the foreseeable future,
corresponds to a single IRPosition we can simplify the class structure.
We already did this for IRAttribute but there is no reason to stop
there.

4 years ago[AMDGPU] Add the SGPR used for FP copy to block livein lists.
Christudasan Devadasan [Wed, 22 Apr 2020 19:20:45 +0000 (00:50 +0530)]
[AMDGPU] Add the SGPR used for FP copy to block livein lists.

The temporary register used for FP copy
should be live throughout the function.

4 years ago[Attributor][NFC] Remove and update old check lines
Johannes Doerfert [Thu, 23 Apr 2020 02:00:12 +0000 (21:00 -0500)]
[Attributor][NFC] Remove and update old check lines