platform/upstream/llvm.git
4 years ago[Docs] remove unused arguments in documentation examples on vectorization passes
Afanasyev Ivan [Mon, 27 Jul 2020 09:19:55 +0000 (10:19 +0100)]
[Docs] remove unused arguments in documentation examples on vectorization passes

Reviewers: nadav, tyler.nowicki

Reviewed By: nadav

Differential Revision: https://reviews.llvm.org/D83851

4 years ago[lld][ELF] Add LOG2CEIL builtin ldscript function
Isaac Richter [Mon, 27 Jul 2020 08:49:24 +0000 (11:49 +0300)]
[lld][ELF] Add LOG2CEIL builtin ldscript function

This patch adds support for the LOG2CEIL builtin function in linker scripts: https://sourceware.org/binutils/docs/ld/Builtin-Functions.html#index-LOG2CEIL_0028exp_0029

As documented for LD, and to keep compatibility, LOG2CEIL(0) returns 0 (not -inf).

The test vectors are somewhat arbitrary. We check minimum values (0-4); middle values (2^32, and 2^32+1); and the maximum value (2^64-1).

The checks for LOG2CEIL explicitly use full 64-bit values (16 hex digits). This is needed to properly verify that -inf and other interesting results aren't returned. (For some reason, all other tests in operators.test use only 14 digits.)
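
The semantics described above can be sketched in a few lines of C++. This is only an illustration, not lld's implementation: the name `log2Ceil` is hypothetical, and the function simply returns the smallest n with 2^n >= value, mapping 0 to 0 as the message states. The checks reuse the test vectors listed above.

```cpp
#include <cstdint>

// Hypothetical helper (not lld's code): smallest n such that 2^n >= value.
// By convention, and as the commit message describes for LD compatibility,
// an input of 0 yields 0 rather than -inf.
constexpr uint64_t log2Ceil(uint64_t value) {
  if (value <= 1)
    return 0;
  uint64_t n = 0;
  // Count the bits needed to represent value - 1.
  for (uint64_t v = value - 1; v != 0; v >>= 1)
    ++n;
  return n;
}

// Minimum, middle, and maximum values, evaluated at compile time.
static_assert(log2Ceil(0) == 0, "LOG2CEIL(0) is defined as 0");
static_assert(log2Ceil(3) == 2, "3 rounds up to 2^2");
static_assert(log2Ceil(1ULL << 32) == 32, "exact power of two");
static_assert(log2Ceil((1ULL << 32) + 1) == 33, "just above a power of two");
static_assert(log2Ceil(UINT64_MAX) == 64, "largest 64-bit value");
```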

Differential revision: https://reviews.llvm.org/D84054

4 years ago[libcxx][lit] Fix running testsuite with python2.7 after 9020d28688492c437abb648b6ab6...
Alex Richardson [Mon, 27 Jul 2020 09:15:17 +0000 (10:15 +0100)]
[libcxx][lit] Fix running testsuite with python2.7 after 9020d28688492c437abb648b6ab69baeba523219

Python 2.7 fails with TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
if you pass None as the prefix argument to NamedTemporaryFile.

Reviewed By: ldionne, bjope, #libc

Differential Revision: https://reviews.llvm.org/D84595

4 years ago[clangd] Switch from EXPECT_TRUE to ASSERT_TRUE in remote marshalling tests
Kirill Bobyrev [Mon, 27 Jul 2020 08:43:38 +0000 (10:43 +0200)]
[clangd] Switch from EXPECT_TRUE to ASSERT_TRUE in remote marshalling tests

Summary:
When dereferencing Optionals it makes sense to use ASSERT_TRUE so that
test failures are easier to read. Switch from EXPECT_TRUE to ASSERT_TRUE
where appropriate.

Reviewers: kadircet

Reviewed By: kadircet

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D84535

Signed-off-by: Kirill Bobyrev <kbobyrev@google.com>
4 years ago[Alignment][NFC] Update Bitcodewriter to use Align
Guillaume Chatelet [Mon, 27 Jul 2020 08:16:28 +0000 (08:16 +0000)]
[Alignment][NFC] Update Bitcodewriter to use Align

Differential Revision: https://reviews.llvm.org/D83533

4 years ago[InstCombine] Fold freeze into phi if one operand is not undef
Juneyoung Lee [Mon, 27 Jul 2020 08:07:27 +0000 (17:07 +0900)]
[InstCombine] Fold freeze into phi if one operand is not undef

 This patch adds folding freeze into phi if it has only one operand to target.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84601

4 years ago[lldb/Utility] Clean up Scalar constructors
Pavel Labath [Mon, 20 Jul 2020 14:42:01 +0000 (16:42 +0200)]
[lldb/Utility] Clean up Scalar constructors

- move initialization to initializer lists
- make destructor non-virtual (nothing else is)
- fix long double constructor so that it actually works

4 years ago[lldb/Utility] Fix a bug in RangeMap::CombineConsecutiveRanges
Pavel Labath [Fri, 24 Jul 2020 12:49:17 +0000 (14:49 +0200)]
[lldb/Utility] Fix a bug in RangeMap::CombineConsecutiveRanges

The function didn't combine a large entry which overlapped several other
entries, if those other entries were not overlapping among each other.

E.g., (0,20),(5,6),(10,11) produced (0,20),(10,11)

Now it just produces (0,20).
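
A minimal sketch of the expected behaviour follows. It is not lldb's RangeMap code: `Interval` and `combineRanges` are hypothetical names, and the pairs in the example are assumed to be half-open [start, end) intervals. It uses the usual sort-and-fold merge, where extending the last merged interval to the maximum end seen so far lets one large entry absorb several smaller ones that do not overlap each other.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative half-open interval [start, end); not lldb's Range type.
struct Interval {
  uint64_t start;
  uint64_t end;
};

// Sort by start, then fold each interval into the last merged one whenever
// the two overlap or touch.
inline std::vector<Interval> combineRanges(std::vector<Interval> ranges) {
  std::sort(ranges.begin(), ranges.end(),
            [](const Interval &a, const Interval &b) {
              return a.start < b.start;
            });
  std::vector<Interval> merged;
  for (const Interval &r : ranges) {
    assert(r.start <= r.end && "malformed interval");
    if (!merged.empty() && r.start <= merged.back().end)
      // Keep the furthest end so a large entry is not cut short.
      merged.back().end = std::max(merged.back().end, r.end);
    else
      merged.push_back(r);
  }
  return merged;
}
```

With the input from the example, `combineRanges({{0, 20}, {5, 6}, {10, 11}})` yields the single interval (0,20).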

4 years ago[MLIR][LLVMDialect] Added volatile and nontemporal attributes to load/store
George Mitenkov [Mon, 27 Jul 2020 07:19:48 +0000 (10:19 +0300)]
[MLIR][LLVMDialect] Added volatile and nontemporal attributes to load/store

This patch introduces 2 new optional attributes to `llvm.load`
and `llvm.store` ops: `volatile` and `nontemporal`. These attributes
are translated into proper LLVM as a `volatile` marker and a metadata node
respectively. They are also helpful with SPIR-V to LLVM dialect conversion
since they are the mappings for `Volatile` and `NonTemporal` Memory Operands.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D84396

4 years ago[AMDGPU] Make generating cache invalidating instructions optional
Piotr Sobczak [Thu, 23 Jul 2020 17:26:49 +0000 (19:26 +0200)]
[AMDGPU] Make generating cache invalidating instructions optional

Summary:
D78800 skipped generating cache invalidating instructions altogether
on AMDPAL. However, this is sometimes too restrictive - we want a
more flexible option to be able to toggle this behaviour on and off
while we work towards developing a correct implementation of the
alternative memory model.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, dexonsmith, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84448

4 years ago[SVE] Don't use LocalStackAllocation for SVE objects
David Sherwood [Tue, 14 Jul 2020 15:20:00 +0000 (16:20 +0100)]
[SVE] Don't use LocalStackAllocation for SVE objects

I have introduced a new TargetFrameLowering query function:

  isStackIdSafeForLocalArea

that queries whether or not it is safe for objects of a given stack
id to be bundled into the local area. The default behaviour is to
always bundle regardless of the stack id, however for AArch64 this is
overridden so that it's only safe for fixed-size stack objects.
There is future work here to extend this algorithm for multiple local
areas so that SVE stack objects can be bundled together and accessed
from their own virtual base-pointer.

Differential Revision: https://reviews.llvm.org/D83859

4 years ago[XRay] Account: recursion detection
Roman Lebedev [Mon, 27 Jul 2020 07:15:44 +0000 (10:15 +0300)]
[XRay] Account: recursion detection

Summary:
Recursion detection can be non-trivial. Currently, the state-of-the-art for LLVM,
as far as I'm aware, is D72362 `[clang-tidy] misc-no-recursion: a new check`.
However, it is quite limited:
* It does very basic call-graph based analysis, in the sense it will report even dynamically-unreachable recursion.
* It is inherently limited to a single TU
* It is hard to gauge how problematic each recursion is in practice.

Some of that can be addressed by adding clang analyzer-based check,
then it would at least support multiple TU's.

However, we can approach this problem from another angle - dynamic run-time analysis.
We already have means to capture a run-time callgraph (XRay, duh),
and there are already means to reconstruct it within `llvm-xray` tool.

This proposes to add a `-recursive-calls-only` switch to the `account` tool.
When the switch is on, when re-constructing callgraph for latency reconstruction,
each time we enter/leave some function, we increment/decrement an entry for the function
in a "recursion depth" map. If, when we leave the function, said entry was at `1`,
then that means the function didn't call itself, however if it is at `2` or more,
then that means the function (possibly indirectly) called itself.
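
The depth-map idea in this paragraph can be sketched as follows. This is an illustration under simplifying assumptions, not the `llvm-xray account` code: `TraceEvent` and `findRecursiveFunctions` are hypothetical names, the trace is assumed to be well nested, and the sketch only reports which functions recursed rather than accounting latencies.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <unordered_map>
#include <vector>

// Hypothetical trace record; real XRay records carry more fields.
struct TraceEvent {
  int32_t funcId;
  bool isEnter; // true on function entry, false on function exit
};

// Returns the ids of functions that were on the call stack more than once
// at the same time, i.e. that called themselves directly or indirectly.
inline std::set<int32_t>
findRecursiveFunctions(const std::vector<TraceEvent> &events) {
  std::unordered_map<int32_t, int> depth; // "recursion depth" per function
  std::set<int32_t> recursive;
  for (const TraceEvent &e : events) {
    if (e.isEnter) {
      ++depth[e.funcId];
      continue;
    }
    int &d = depth[e.funcId];
    assert(d > 0 && "exit without a matching entry");
    // A depth of 2 or more on exit means an outer activation is still live.
    if (d >= 2)
      recursive.insert(e.funcId);
    --d;
  }
  return recursive;
}
```

For a trace such as enter 1, enter 2, enter 1, exit 1, exit 2, exit 1, only function 1 is reported: it re-entered itself through function 2, while function 2 was active only once.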

If the depth is 1, we don't account the time spent there,
unless within this call stack the function already recursed into itself.
Note that we don't pay for recursion depth tracking when `recursive-calls-only` is not on,
and the perf impact is insignificant (+0.3% regression)

The overhead of the option is actually negative, around -5.26% user time on a medium-sized (3.5G) XRay log.
As a practical example, that 3.5G log is a capture of the entire middle-end opt pipeline
at `-O3` for a RawSpeed unity build. There are a total of `5500` functions in the log,
however `-recursive-calls-only` says that `269`, or 5%, are recursive.

Having this functionality could be helpful for recursion eradication.

Reviewers: dberris, mboerger

Reviewed By: dberris

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D84582

4 years ago[NewPM] NFC. remove obsolete TODO comment
Yuanfang Chen [Mon, 27 Jul 2020 05:32:24 +0000 (22:32 -0700)]
[NewPM] NFC. remove obsolete TODO comment

The deleted TODO was implemented in D82344.

4 years ago[PowerPC] Add Vector Extract Double Instruction Definitions and MC tests.
biplmish [Mon, 27 Jul 2020 04:56:19 +0000 (23:56 -0500)]
[PowerPC] Add Vector Extract Double Instruction Definitions and MC tests.

This patch adds the td definitions and asm/disasm tests for the following instructions:

Vector Extract Double Left Index - vextdubvlx, vextduhvlx, vextduwvlx, vextddvlx
Vector Extract Double Right Index - vextdubvrx, vextduhvrx, vextduwvrx, vextddvrx

Differential Revision: https://reviews.llvm.org/D84384

4 years agoRemove declaration of constexpr member kDynamicSize in MemRefType
Mehdi Amini [Mon, 27 Jul 2020 04:50:08 +0000 (04:50 +0000)]
Remove declaration of constexpr member kDynamicSize in MemRefType

This member is already publicly declared on the base class. The
redundant declaration is mangled differently though and in some
unoptimized build it requires a definition to also exist. However we
have a definition for the base ShapedType class, removing the
declaration here will redirect every use to the base class member
instead.

Differential Revision: https://reviews.llvm.org/D84615

4 years ago[gcov] Simplify/speed up CFG hash calculation
Fangrui Song [Mon, 27 Jul 2020 04:14:20 +0000 (21:14 -0700)]
[gcov] Simplify/speed up CFG hash calculation

4 years agoAMDGPU/GlobalISel: Don't assert in LegalizerInfo constructor
Matt Arsenault [Mon, 27 Jul 2020 03:01:28 +0000 (23:01 -0400)]
AMDGPU/GlobalISel: Don't assert in LegalizerInfo constructor

We don't really need these asserts. The LegalizerInfo is also
overly aggressively constructed, even when not in use. It needs to not
assert on dummy targets that have manually specified, unrelated
features.

4 years ago[PowerPC] Cleanup p10vector clang test
biplmish [Mon, 27 Jul 2020 02:23:00 +0000 (21:23 -0500)]
[PowerPC] Cleanup p10vector clang test

Remove the duplicate LE test, correct the labels and remove common tests for vec_splat builtin.

Differential Revision: https://reviews.llvm.org/D84382

4 years ago[Scheduling] Improve group algorithm for store cluster
QingShan Zhang [Mon, 27 Jul 2020 02:02:40 +0000 (02:02 +0000)]
[Scheduling] Improve group algorithm for store cluster

Store Addr and Store Addr+8 are a clusterable pair. They have memory (ctrl) dependencies on different loads.
The current implementation puts these two stores into different groups and fails to cluster them.

Reviewed By: evandro

Differential Revision: https://reviews.llvm.org/D84139

4 years ago[InstCombine] Add more tests to freeze-phi.ll; NFC
Juneyoung Lee [Mon, 27 Jul 2020 00:43:00 +0000 (09:43 +0900)]
[InstCombine] Add more tests to freeze-phi.ll; NFC

4 years ago[ORC] Remove a redundant call to getTargetMemory.
Lang Hames [Mon, 27 Jul 2020 00:33:07 +0000 (17:33 -0700)]
[ORC] Remove a redundant call to getTargetMemory.

4 years ago[flang][openacc] Basic name resolution infrastructure for OpenACC construct
Valentin Clement [Mon, 27 Jul 2020 00:00:49 +0000 (20:00 -0400)]
[flang][openacc] Basic name resolution infrastructure for OpenACC construct

Reviewed By: tskeith, klausler, ichoyjx

Differential Revision: https://reviews.llvm.org/D83998

4 years ago[LLD] [COFF] Fix test to properly test all aspects of c3b1d730d6. NFC.
Martin Storsjö [Sat, 25 Jul 2020 12:01:48 +0000 (15:01 +0300)]
[LLD] [COFF] Fix test to properly test all aspects of c3b1d730d6. NFC.

Previously, the test could pass with one part of c3b1d730d6 removed.

4 years ago[lld-macho] Support lookup of dylibs in frameworks
Jez Ng [Sun, 26 Jul 2020 19:46:46 +0000 (12:46 -0700)]
[lld-macho] Support lookup of dylibs in frameworks

Needed for testing Objective-C programs (since e.g. Core
Foundation is a framework)

Reviewed By: #lld-macho, compnerd

Differential Revision: https://reviews.llvm.org/D83925

4 years ago[X86] Turn X86DAGToDAGISel::tryVPTERNLOG into a fully custom instruction selector...
Craig Topper [Sun, 26 Jul 2020 17:57:59 +0000 (10:57 -0700)]
[X86] Turn X86DAGToDAGISel::tryVPTERNLOG into a fully custom instruction selector that can handle bitcasts between logic ops

Previously we just matched the logic ops and replaced with an
X86ISD::VPTERNLOG node that we would send through the normal
pattern match. But that approach couldn't handle a bitcast
between the logic ops. Extending that approach would require us
to peek through the bitcasts and emit new bitcasts to match
the types. Those new bitcasts would then have to be properly
topologically sorted.

This patch instead switches to directly emitting the
MachineSDNode and skips the normal tablegen pattern matching.
We do have to handle load folding and broadcast load folding
ourselves now. Which also means commuting the immediate control.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D83630

4 years ago[flang] Fix implicit declarations in statement functions
Tim Keith [Sun, 26 Jul 2020 19:13:36 +0000 (12:13 -0700)]
[flang] Fix implicit declarations in statement functions

If a symbol (that is not a dummy argument) is implicitly declared inside
a statement function, don't create it in the statement function's scope.
Instead, treat statement functions like blocks when finding the inclusive
scope and create the symbol there.

Add a new flag, StmtFunction, to symbols that represent statement functions.

Differential Revision: https://reviews.llvm.org/D84588

4 years agoReplace comment by private method; NFC.
Hannes Käufler [Sun, 26 Jul 2020 17:59:45 +0000 (13:59 -0400)]
Replace comment by private method; NFC.

4 years ago[X86] Move getGatherOverhead/getScatterOverhead into X86TargetTransformInfo.
Craig Topper [Sun, 26 Jul 2020 17:38:34 +0000 (10:38 -0700)]
[X86] Move getGatherOverhead/getScatterOverhead into X86TargetTransformInfo.

These cost methods don't make much sense in X86Subtarget. Make
them methods in X86's TTI and move the feature checks from the
X86Subtarget constructor into these methods.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D84594

4 years ago[InstCombine] Add a test for folding freeze into phi; NFC
Juneyoung Lee [Sun, 26 Jul 2020 17:23:51 +0000 (02:23 +0900)]
[InstCombine] Add a test for folding freeze into phi; NFC

4 years ago[clang][NFC] Add a test for __attribute__((flag_enum)) with an unnamed enumeration.
Bruno Ricci [Sun, 26 Jul 2020 16:24:43 +0000 (17:24 +0100)]
[clang][NFC] Add a test for __attribute__((flag_enum)) with an unnamed enumeration.

4 years ago[clang][NFC] Add tests for the use of NamedDecl::getDeclName in the unused/unneeded...
Bruno Ricci [Sun, 26 Jul 2020 16:20:56 +0000 (17:20 +0100)]
[clang][NFC] Add tests for the use of NamedDecl::getDeclName in the unused/unneeded diagnostics.

4 years ago[clang][NFC] Remove spurious +x flag on SemaConcept.cpp
Bruno Ricci [Sun, 26 Jul 2020 16:10:59 +0000 (17:10 +0100)]
[clang][NFC] Remove spurious +x flag on SemaConcept.cpp

4 years ago[X86][SSE] lowerV2I64Shuffle - use undef elements in PSHUFD mask widening
Simon Pilgrim [Sun, 26 Jul 2020 15:03:53 +0000 (16:03 +0100)]
[X86][SSE] lowerV2I64Shuffle - use undef elements in PSHUFD mask widening

If we lower a v2i64 shuffle to PSHUFD, we currently clamp undef elements to 0, (elements 0,1 of the v4i32) which can result in the shuffle referencing more elements of the source vector than expected, affecting later shuffle combines and KnownBits/SimplifyDemanded calls.

By ensuring we widen the undef mask element we allow getV4X86ShuffleImm8 to use inline elements as the default, which are more likely to fold.
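
The effect on the PSHUFD immediate can be sketched as below. The helper names are hypothetical and this is not the code in X86ISelLowering.cpp; the sketch assumes that -1 marks an undef mask element and that an undef lane i of the v4i32 mask defaults to i, which is how the message describes the "inline" default.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: widen a single-input v2i64 shuffle mask to a v4i32
// mask, keeping undef (-1) elements undef instead of clamping them to 0.
inline std::array<int, 4> widenV2MaskToV4(const std::array<int, 2> &mask) {
  std::array<int, 4> wide{};
  for (int i = 0; i < 2; ++i) {
    assert(mask[i] >= -1 && mask[i] < 2 && "expected a single-input mask");
    wide[2 * i] = mask[i] < 0 ? -1 : 2 * mask[i];
    wide[2 * i + 1] = mask[i] < 0 ? -1 : 2 * mask[i] + 1;
  }
  return wide;
}

// Hypothetical helper: build the 8-bit shuffle immediate, two bits per lane.
// An undef lane i selects element i, so it stays in place.
inline uint8_t shuffleImm8(const std::array<int, 4> &mask) {
  unsigned imm = 0;
  for (int i = 0; i < 4; ++i) {
    int lane = mask[i] < 0 ? i : mask[i];
    imm |= static_cast<unsigned>(lane) << (2 * i);
  }
  return static_cast<uint8_t>(imm);
}
```

For the v2i64 mask {1, undef}, the widened mask is {2, 3, undef, undef} and the immediate is 0xEE, so the upper lanes keep referring to elements 2 and 3. Clamping the undef element to 0 would instead give {2, 3, 0, 1} and 0x4E, which also references elements 0 and 1 of the source.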

4 years ago[MLIR][Affine] Add test for non-hyperrectangular loop tiling
Vincent Zhao [Sun, 26 Jul 2020 14:40:07 +0000 (20:10 +0530)]
[MLIR][Affine] Add test for non-hyperrectangular loop tiling

This diff provides a concrete test case for the error that will be raised when the iteration space is non hyper-rectangular.

The corresponding emission method for this error message has been changed as well.

Differential Revision: https://reviews.llvm.org/D84531

4 years agoAMDGPU/GlobalISel: Fix not constraining ds_append/consume operands
Matt Arsenault [Sat, 25 Jul 2020 15:56:33 +0000 (11:56 -0400)]
AMDGPU/GlobalISel: Fix not constraining ds_append/consume operands

4 years agoGlobalISel: Handle G_PTR_ADD in narrowScalar
Matt Arsenault [Sat, 25 Jul 2020 15:00:35 +0000 (11:00 -0400)]
GlobalISel: Handle G_PTR_ADD in narrowScalar

4 years agoGlobalISel: Handle fewerElementsVector for G_PTR_ADD
Matt Arsenault [Sat, 25 Jul 2020 14:47:33 +0000 (10:47 -0400)]
GlobalISel: Handle fewerElementsVector for G_PTR_ADD

4 years agoAMDGPU/GlobalISel: Reorder G_CONSTANT legality rules
Matt Arsenault [Sat, 25 Jul 2020 15:14:27 +0000 (11:14 -0400)]
AMDGPU/GlobalISel: Reorder G_CONSTANT legality rules

The legal cases should be the first rules.

4 years agoAMDGPU/GlobalISel: Make sure <2 x s1> phis are scalarized
Matt Arsenault [Sat, 25 Jul 2020 21:22:22 +0000 (17:22 -0400)]
AMDGPU/GlobalISel: Make sure <2 x s1> phis are scalarized

4 years agoAMDGPU/GlobalISel: Legalize GDS atomics
Matt Arsenault [Sat, 25 Jul 2020 19:41:58 +0000 (15:41 -0400)]
AMDGPU/GlobalISel: Legalize GDS atomics

I noticed these don't use the _gfx9, non-m0 reading variants, but I'm not
sure if that's a bug or not. It's the same in the DAG.

4 years agoAMDGPU/GlobalISel: Pack constant G_BUILD_VECTOR_TRUNCs when selecting
Matt Arsenault [Sat, 18 Jul 2020 19:30:59 +0000 (15:30 -0400)]
AMDGPU/GlobalISel: Pack constant G_BUILD_VECTOR_TRUNCs when selecting

4 years ago[InstSimplify] fold integer min/max intrinsics with limit constant
Sanjay Patel [Sun, 26 Jul 2020 13:33:13 +0000 (09:33 -0400)]
[InstSimplify] fold integer min/max intrinsics with limit constant

4 years agoGlobalISel: Handle 'n' inline asm constraint
Matt Arsenault [Sun, 26 Jul 2020 13:26:48 +0000 (09:26 -0400)]
GlobalISel: Handle 'n' inline asm constraint

4 years agoAMDGPU/GlobalISel: Sign extend integer constants
Matt Arsenault [Sat, 25 Jul 2020 18:37:29 +0000 (14:37 -0400)]
AMDGPU/GlobalISel: Sign extend integer constants

This matches the DAG behavior and fixes immediate folding

4 years agoAMDGPU/GlobalISel: Replace selection tests for G_CONSTANT/G_FCONSTANT
Matt Arsenault [Sat, 25 Jul 2020 17:21:31 +0000 (13:21 -0400)]
AMDGPU/GlobalISel: Replace selection tests for G_CONSTANT/G_FCONSTANT

Split into separate tests and make more consistent with the others.

4 years ago[DWARFYAML] Rename getUsedSectionNames() to getNonEmptySectionNames().
Xing GUO [Sun, 26 Jul 2020 08:01:22 +0000 (16:01 +0800)]
[DWARFYAML] Rename getUsedSectionNames() to getNonEmptySectionNames().

This patch renames getUsedSectionNames() to getNonEmptySectionNames.
NFC.

4 years ago[InstSimplify] add tests for min/max intrinsics; NFC
Sanjay Patel [Sat, 25 Jul 2020 20:40:14 +0000 (16:40 -0400)]
[InstSimplify] add tests for min/max intrinsics; NFC

4 years ago[InstSimplify] fold fcmp using isKnownNeverInfinity + isKnownNeverNaN
Sanjay Patel [Fri, 24 Jul 2020 19:11:02 +0000 (15:11 -0400)]
[InstSimplify] fold fcmp using isKnownNeverInfinity + isKnownNeverNaN

Follow-up to D84035 / rG7393d7574c09.
This sidesteps a question of FMF/poison on fcmp raised in PR46077:
http://bugs.llvm.org/PR46077

https://alive2.llvm.org/ce/z/TCsyzD
  define i1 @src(float %x) {
  %0:
    %x42 = fadd nnan ninf float %x, 42.000000
    %r = fcmp ueq float %x42, inf
    ret i1 %r
  }
  =>
  define i1 @tgt(float %x) {
  %0:
    ret i1 0
  }
  Transformation seems to be correct!

https://alive2.llvm.org/ce/z/FQaH7a
  define i1 @src(i8 %x) {
  %0:
    %cast = uitofp i8 %x to float
    %r = fcmp one float inf, %cast
    ret i1 %r
  }
  =>
  define i1 @tgt(i8 %x) {
  %0:
    ret i1 1
  }
  Transformation seems to be correct!

4 years ago[InstSimplify] add tests for fcmp with infinity constant; NFC
Sanjay Patel [Fri, 24 Jul 2020 18:45:50 +0000 (14:45 -0400)]
[InstSimplify] add tests for fcmp with infinity constant; NFC

4 years ago[JumpThreading] Add a test for D84598; NFC
Juneyoung Lee [Sun, 26 Jul 2020 13:00:01 +0000 (22:00 +0900)]
[JumpThreading] Add a test for D84598; NFC

4 years ago[ConstantFolding] Fold freeze if it is never undef or poison
Juneyoung Lee [Sun, 26 Jul 2020 12:54:44 +0000 (21:54 +0900)]
[ConstantFolding] Fold freeze if it is never undef or poison

This is a simple patch that adds constant folding for freeze
instruction.

IIUC, it isn't needed to update ConstantFold.cpp because there is no freeze
constexpr.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84597

4 years ago[ValueTracking] Instruction::isBinaryOp should be used for constexprs
Juneyoung Lee [Sun, 26 Jul 2020 12:48:51 +0000 (21:48 +0900)]
[ValueTracking] Instruction::isBinaryOp should be used for constexprs

This is a simple patch that makes canCreateUndefOrPoison use
Instruction::isBinaryOp because BinaryOperator inherits Instruction.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D84596

4 years agoNFC; add a test for freeze's constprop
Juneyoung Lee [Sun, 26 Jul 2020 12:02:31 +0000 (21:02 +0900)]
NFC; add a test for freeze's constprop

4 years agoNFC; add an example that subtracts pointers to two global vars
Juneyoung Lee [Sun, 26 Jul 2020 11:47:19 +0000 (20:47 +0900)]
NFC; add an example that subtracts pointers to two global vars

4 years ago[NFC][XRay] Account: migrate to DenseMap + SmallVector, -16% faster on large (3.8G...
Roman Lebedev [Sun, 26 Jul 2020 11:05:00 +0000 (14:05 +0300)]
[NFC][XRay] Account: migrate to DenseMap + SmallVector, -16% faster on large (3.8G) input

DenseMap is a single allocation underneath, so this has a fairly predictable
performance impact on large-ish (3.8G) xray log processing time.

4 years ago[NFC][XRay] Account: decouple getStats() interface from underlying data structure
Roman Lebedev [Sun, 26 Jul 2020 11:00:15 +0000 (14:00 +0300)]
[NFC][XRay] Account: decouple getStats() interface from underlying data structure

It doesn't really need to know where Timings are stored, it just needs
to be able to sort them, so MutableArrayRef is enough.

That uncovers an interesting quirk that it relied on
implicit double->int conversion for calculating percentiles.
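
As a rough illustration of the quirk, a percentile lookup over a mutable sequence can make the double-to-integer step explicit. This is a sketch with a hypothetical `percentile` helper operating on a `std::vector<double>`, not the actual getStats() code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: value at the given fraction of the sorted order.
// The offset is computed in floating point and converted explicitly,
// rather than relying on an implicit double->int conversion.
inline double percentile(std::vector<double> &timings, double fraction) {
  assert(!timings.empty() && fraction >= 0.0 && fraction < 1.0);
  const auto offset =
      static_cast<std::size_t>(std::floor(timings.size() * fraction));
  // Partially sorts in place: the element at `offset` ends up where it
  // would be if the whole sequence were sorted.
  std::nth_element(timings.begin(), timings.begin() + offset, timings.end());
  return timings[offset];
}
```

The function only needs to reorder the elements it is given, which is why a mutable view of the timings is sufficient.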

4 years ago[lit] Don't include tests skipped due to sharding in reports
Alex Richardson [Sun, 26 Jul 2020 10:39:22 +0000 (11:39 +0100)]
[lit] Don't include tests skipped due to sharding in reports

When running multiple shards, don't include skipped tests in the xunit
output since merging the files will result in duplicates.
In our CHERI Jenkins CI, I configured the libc++ tests to run using sharding
(since we are testing using a single-CPU QEMU). We then merge the generated
XUnit xml files to produce a final result, but if the individual XMLs
report tests excluded due to sharding each test is included N times in the
final result. This also makes it difficult to find the tests that were
skipped due to missing REQUIRES: etc.

Reviewed By: yln

Differential Revision: https://reviews.llvm.org/D84235

4 years ago[asan] Mark the strstr test as UNSUPPORTED on FreeBSD
Alex Richardson [Sun, 26 Jul 2020 10:37:47 +0000 (11:37 +0100)]
[asan] Mark the strstr test as UNSUPPORTED on FreeBSD

Like Android, FreeBSD's libc calls memchr which causes this test to fail.

Reviewed By: emaste

Differential Revision: https://reviews.llvm.org/D84541

4 years ago[AArch64][GlobalISel] Make <8 x s16> and <16 x s8> legal types for G_SHUFFLE_VECTOR...
Amara Emerson [Sun, 26 Jul 2020 07:46:29 +0000 (00:46 -0700)]
[AArch64][GlobalISel] Make <8 x s16> and <16 x s8> legal types for G_SHUFFLE_VECTOR and G_IMPLICIT_DEF.

Trivial change, we're still missing support for rev matching for these types
in the combiner.

4 years ago[X86] Merge X86MCInstLowering's maxLongNopLength into emitNop and remove check for...
Craig Topper [Sun, 26 Jul 2020 05:05:46 +0000 (22:05 -0700)]
[X86] Merge X86MCInstLowering's maxLongNopLength into emitNop and remove check for FeatureNOPL.

The switch in emitNop uses 64-bit registers for nops exceeding
2 bytes. This isn't valid outside 64-bit mode. We could fix this
easily enough, but there are no users that ask for more than 2
bytes outside 64-bit mode.

Inlining the method to make the coupling between the two methods
more explicit.

4 years ago[X86] Remove getProcFamily() method from X86Subtarget. NFC
Craig Topper [Sun, 26 Jul 2020 03:48:46 +0000 (20:48 -0700)]
[X86] Remove getProcFamily() method from X86Subtarget. NFC

This isn't used and we've decided in the past that a CPU enum
for tuning is not a good idea.

4 years ago[mlir][shape] Further operand and result type generalization
Jacques Pienaar [Sun, 26 Jul 2020 04:37:15 +0000 (21:37 -0700)]
[mlir][shape] Further operand and result type generalization

Previous changes generalized some of the operands and results. Complete
a larger group of those to simplify progressive lowering. Also update
some of the declarative asm form due to generalization. Tried to keep it
mostly mechanical.

4 years agoDAGCombiner: Don't simplify the token factor if the node's number of operands already...
Changpeng Fang [Sun, 26 Jul 2020 04:20:59 +0000 (21:20 -0700)]
DAGCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit

Summary:
  In parallelizeChainedStores, a TokenFactor was created with the size greater than 3000.
We found that DAGCombiner::visitTokenFactor will consume a huge amount of time on
such nodes. Since the number of operands already exceeds TokenFactorInlineLimit, we propose
to give up simplification with the consideration of compile time.

Reviewers:
  @spatel, @arsenm

Differential Revision:
  https://reviews.llvm.org/D84204

4 years ago[X86] Replace a use of ProcIntelSLM with FeatureFast7ByteNOP.
Craig Topper [Sun, 26 Jul 2020 03:46:42 +0000 (20:46 -0700)]
[X86] Replace a use of ProcIntelSLM with FeatureFast7ByteNOP.

4 years agoTemporarily Revert "Unify the return value of GetByteSize to an llvm::Optional<uint64...
Eric Christopher [Sun, 26 Jul 2020 01:42:04 +0000 (18:42 -0700)]
Temporarily Revert "Unify the return value of GetByteSize to an llvm::Optional<uint64_t> (NFC-ish)"
as it's causing numerous (176) test failures on linux.

This reverts commit 1d9b860fb6a85df33fd52fcacc6a5efb421621bd.

4 years agoFold StatepointBB into checks as it's only used from an NDEBUG or ASSERT
Eric Christopher [Sun, 26 Jul 2020 01:34:02 +0000 (18:34 -0700)]
Fold StatepointBB into checks as it's only used from an NDEBUG or ASSERT
context fixing an unused variable warning.

4 years ago[PowerPC][NFC] Fix an assert that cannot trip from 7d076e19e31a
Nemanja Ivanovic [Sun, 26 Jul 2020 00:28:52 +0000 (20:28 -0400)]
[PowerPC][NFC] Fix an assert that cannot trip from 7d076e19e31a

I mixed up the precedence of operators in the assert and thought I
had it right since there was no compiler warning. This just
adds the parentheses in the expression as needed.

4 years ago[Statepoints] Style cleanup after 3da1a963 [NFC]
Philip Reames [Sat, 25 Jul 2020 23:40:06 +0000 (16:40 -0700)]
[Statepoints] Style cleanup after 3da1a963 [NFC]

Just fixing a few minor stylistic issues.

4 years ago[X86] Add masked versions of the VPTERNLOG test cases added for D83630. NFC
Craig Topper [Sat, 25 Jul 2020 23:36:33 +0000 (16:36 -0700)]
[X86] Add masked versions of the VPTERNLOG test cases added for D83630. NFC

We don't handle these yet and D83630 won't improve that, but
at least we'll have the tests.

4 years ago[Reduce] Argument reduction: do deal with function declarations
Roman Lebedev [Sat, 25 Jul 2020 21:56:36 +0000 (00:56 +0300)]
[Reduce] Argument reduction: do deal with function declarations

We can happily turn function definitions into declarations,
thus obscuring their argument from being elided by this pass.

I don't believe there is a good reason to just ignore declarations,
likely even proper llvm intrinsic ones;
at worst the input becomes uninteresting.

The other question here is that all these transforms are all-or-nothing.
In some cases, should we be treating each use separately?

The main blocker here seemed to be that llvm::CloneFunctionInto()
does `&OldFunc->front()`, which inserts a nullptr into a densemap,
which is not happy about it and asserts.

4 years ago[Reduce] Argument reduction: do properly handle invoke insts (PR46819)
Roman Lebedev [Sat, 25 Jul 2020 20:24:13 +0000 (23:24 +0300)]
[Reduce] Argument reduction: do properly handle invoke insts (PR46819)

replaceFunctionCalls() is very non-exhaustive, it only handles
CallInst's. Which means, by the time we drop old function,
there may still be uses of it lurking around.
Let's instead whack-a-mole them all by replacing with undef.

I'm not sure this is the best handling, especially for calls, but IMO
poorly reduced input is much better than crashing reduction tool.
A (previously-crashing!) test added.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46819

4 years ago[Reduce] Basic block reduction: do properly handle invoke insts (PR46818)
Roman Lebedev [Sat, 25 Jul 2020 19:31:05 +0000 (22:31 +0300)]
[Reduce] Basic block reduction: do properly handle invoke insts (PR46818)

Terminator may have returned value, so we need to replace uses,
and in general handle invoke as a branch inst.

I'm not sure this is the best handling, but IMO poorly reduced
input is much better than crashing reduction tool.
A (previously-crashing!) test added.

Fixes https://bugs.llvm.org/show_bug.cgi?id=46818

4 years ago[ORC] Rename TargetProcessControl DynamicLibraryHandle and loadLibrary.
Lang Hames [Sat, 25 Jul 2020 21:18:52 +0000 (14:18 -0700)]
[ORC] Rename TargetProcessControl DynamicLibraryHandle and loadLibrary.

The new names, DylibHandle and loadDylib, are more concise and make
clear that these utilities are for loading dynamic libraries, not static
ones.

4 years ago[ORC] Don't require PageSize or Triple during TargetProcessControl construction
Lang Hames [Sat, 25 Jul 2020 04:17:37 +0000 (21:17 -0700)]
[ORC] Don't require PageSize or Triple during TargetProcessControl construction

Subclasses will commonly gather that information from a remote during
construction, in which case they won't have meaningful values to pass to
TargetProcessControl's constructor.

4 years ago[MLIR][Shape] Allow `num_elements` to operate on extent tensors
Frederik Gossen [Sat, 25 Jul 2020 22:01:21 +0000 (15:01 -0700)]
[MLIR][Shape] Allow `num_elements` to operate on extent tensors

Re-landing with dependent change landed and error condition relaxed.
Apart from the change to the error condition, this is exactly https://reviews.llvm.org/D84445.

4 years ago[MLIR][Shape] Refactor verification
Jacques Pienaar [Sat, 25 Jul 2020 21:55:19 +0000 (14:55 -0700)]
[MLIR][Shape] Refactor verification

Based on https://reviews.llvm.org/D84439 but less restrictive; otherwise we
would not allow shape_of to produce a ranked output, and would not
allow for iterative refinement here. We can consider making it more
restrictive later.

4 years agoRevert "[MLIR][Shape] Allow `num_elements` to operate on extent tensors"
Jacques Pienaar [Sat, 25 Jul 2020 21:47:57 +0000 (14:47 -0700)]
Revert "[MLIR][Shape] Allow `num_elements` to operate on extent tensors"

This reverts commit 55ced04d6bc13fd0f9396a0cfc393b44378d8784.

Forgot to submit the dependent change first.

4 years ago[MLIR][Shape] Allow `num_elements` to operate on extent tensors
Frederik Gossen [Sat, 25 Jul 2020 21:39:18 +0000 (14:39 -0700)]
[MLIR][Shape] Allow `num_elements` to operate on extent tensors

Differential Revision: https://reviews.llvm.org/D84445

4 years ago[Statepoints] Support lowering gc relocations to virtual registers
Philip Reames [Sat, 11 Jul 2020 17:50:34 +0000 (10:50 -0700)]
[Statepoints] Support lowering gc relocations to virtual registers

(Disabled under flag for the moment)

This is part of a larger project wherein we are finally integrating lowering of gc live operands with the register allocator.  Today, we force spill all operands in SelectionDAG.  The code to do so is distinctly non-optimal.  The approach this patch is working towards is to instead lower the relocations directly into the MI form, and let the register allocator pick which ones get spilled and which stack slots they get spilled to.  In terms of performance, the latter part is actually more important as it avoids redundant shuffling of values between stack slots.

This particular change adds ISEL support to produce the variadic def STATEPOINT form required by the above. In particular, the first N are lowered to variadic tied def/use pairs. So the new statepoint looks like this:
reloc1,reloc2,... = STATEPOINT ..., base1, derived1<tied-def0>, base2, derived2<tied-def1>, ...

N is limited by the maximal number of tied registers a machine instruction can have (15 at the moment).

The current patch is restricted to handling relocations within a single basic block.  Cross block relocations (e.g. invokes) are handled via the legacy mechanism.  This restriction will be relaxed in future patches.

Patch By: dantrushin
Differential Revision: https://reviews.llvm.org/D81648

4 years ago[X86] Add llvm.roundeven test cases. Add f80 test cases for constrained intrinsics...
Craig Topper [Sat, 25 Jul 2020 20:24:58 +0000 (13:24 -0700)]
[X86] Add llvm.roundeven test cases. Add f80 test cases for constrained intrinsics that lower to libcalls. NFC

4 years ago[X86] Fix intrinsic names in strict fp80 tests to use f80 in their names instead...
Craig Topper [Sat, 25 Jul 2020 19:12:16 +0000 (12:12 -0700)]
[X86] Fix intrinsic names in strict fp80 tests to use f80 in their names instead of x86_fp80.

The type is called x86_fp80, but when it is printed in the intrinsic
name it should be f80. The parser doesn't seem to care that the
name was wrong.

4 years ago[Driver] Define LinkOption and fix forwarded options to GCC for linking
Fangrui Song [Sat, 25 Jul 2020 19:33:18 +0000 (12:33 -0700)]
[Driver] Define LinkOption and fix forwarded options to GCC for linking

Many driver options are neither 'DriverOption' nor 'LinkerInput'. When gcc is
used for linking, these options get forwarded even if they don't have anything
to do with linking. Among these options, clang-specific ones can cause gcc to
error.

Just use 'OPT_Link_Group' and a new flag 'LinkOption' for options which already
have a group.

gfortran support has apparently bit-rotted (which does not seem to make much sense). XFAIL the test.

4 years ago[gn build] Port 136c8f50e96
LLVM GN Syncbot [Sat, 25 Jul 2020 18:51:58 +0000 (18:51 +0000)]
[gn build] Port 136c8f50e96

4 years ago[Reduce] Try turning function definitions into declarations first, NFCI-ish
Roman Lebedev [Sat, 25 Jul 2020 18:43:36 +0000 (21:43 +0300)]
[Reduce] Try turning function definitions into declarations first, NFCI-ish

ReduceFunctions could do it, but it also replaces *all* calls with undef,
so if any of the undef replacements makes the reduction uninteresting,
it won't work.

ReduceBasicBlocks also could do it, but well, it may take many guesses
for all the blocks of a function to happen to be out-of-chunk,
which is not a very efficient way to go about it.

So let's just do this first.

4 years agoUnify the return value of GetByteSize to an llvm::Optional<uint64_t> (NFC-ish)
Adrian Prantl [Sat, 25 Jul 2020 15:27:21 +0000 (08:27 -0700)]
Unify the return value of GetByteSize to an llvm::Optional<uint64_t> (NFC-ish)

This cleanup patch unifies all methods called GetByteSize() in the
ValueObject hierarchy to return an optional, like the methods in
CompilerType do. This means fewer magic 0 values, which could fix bugs
down the road in languages where types can have a size of zero, such
as Swift and C (but not C++).

Differential Revision: https://reviews.llvm.org/D84285
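
The difference between "size is zero" and "size is unknown" can be sketched as follows. This is a minimal illustration, not the LLDB code: ExampleValue and describe_size are hypothetical names, and std::optional (C++17) stands in for llvm::Optional.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for a value object; not an LLDB class.
struct ExampleValue {
  bool has_known_size;
  uint64_t size_in_bytes;

  // Returns an empty optional when the size is unknown, so that a
  // legitimate size of 0 is no longer confused with "no answer".
  std::optional<uint64_t> GetByteSize() const {
    if (!has_known_size)
      return std::nullopt;
    return size_in_bytes;
  }
};

// Callers must check for a value before using it.
std::string describe_size(const ExampleValue &value) {
  if (std::optional<uint64_t> size = value.GetByteSize())
    return std::to_string(*size) + " bytes";
  return "unknown size";
}
```

With a magic 0 return value, the zero-sized and unknown-sized cases above would be indistinguishable to the caller.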

4 years ago[X86] Remove stress-scheduledagrrlist.ll.
Florian Hahn [Sat, 25 Jul 2020 14:45:24 +0000 (15:45 +0100)]
[X86] Remove stress-scheduledagrrlist.ll.

This test seems to take quite a long time with EXPENSIVE_CHECKS.

Remove it.

4 years ago[LVI] Don't require operand number for range (NFC)
Nikita Popov [Sat, 25 Jul 2020 14:32:22 +0000 (16:32 +0200)]
[LVI] Don't require operand number for range (NFC)

Pass the Value* instead of the operand number, rename I to CxtI.
This makes the function a bit more generally useful.

4 years agoAMDGPU/GlobalISel: Don't assert on G_INSERT > 128-bits
Matt Arsenault [Tue, 16 Jun 2020 00:13:24 +0000 (20:13 -0400)]
AMDGPU/GlobalISel: Don't assert on G_INSERT > 128-bits

Just fall back for now. Really tablegen needs to generate all of the
subregister index handling we need.

4 years ago[SCCP] Add assume non null test (NFC)
Nikita Popov [Sat, 25 Jul 2020 14:02:15 +0000 (16:02 +0200)]
[SCCP] Add assume non null test (NFC)

4 years ago[SCCP] Restore the change reporting as well
Nikita Popov [Sat, 25 Jul 2020 13:10:48 +0000 (15:10 +0200)]
[SCCP] Restore the change reporting as well

Reapply 5db5b4bc4394ca247c9eb665e03b851848aa2fbf.

4 years agoReapply [SCCP] Directly remove non-feasible edges
Nikita Popov [Tue, 21 Jul 2020 19:26:30 +0000 (21:26 +0200)]
Reapply [SCCP] Directly remove non-feasible edges

Reapply with DTU update moved after CFG update, which is a
requirement of the API.

-----

Non-feasible control-flow edges are currently removed by replacing
the branch condition with a constant and then calling
ConstantFoldTerminator. This happens in a rather roundabout manner,
by inspecting the users (effectively: predecessors) of unreachable
blocks, and further complicated by the need to explicitly materialize
the condition for "forced" edges. I would like to extend SCCP to
discard switch conditions that are non-feasible based on range
information, but this is incompatible with the current approach
(as there is no single constant we could use.)

Instead, this patch explicitly removes non-feasible edges. It
currently only needs to handle the case where there is a single
feasible edge. The llvm_unreachable() branch will need to be
implemented for the aforementioned switch improvement.

Differential Revision: https://reviews.llvm.org/D84264

4 years agoSimplifyLibCalls - remove unnecessary header and forward declaration. NFC.
Simon Pilgrim [Sat, 25 Jul 2020 11:58:39 +0000 (12:58 +0100)]
SimplifyLibCalls - remove unnecessary header and forward declaration. NFC.

We include TargetLibraryInfo.h, so we don't need to forward declare it, and we don't need to include TargetLibraryInfo.h again in SimplifyLibCalls.cpp.

4 years ago[X86][SSE] combineX86ShufflesRecursively - move all Root node asserts to the same...
Simon Pilgrim [Sat, 25 Jul 2020 11:08:06 +0000 (12:08 +0100)]
[X86][SSE] combineX86ShufflesRecursively - move all Root node asserts to the same location. NFCI.

Minor tidyup for some upcoming shuffle combine improvements.

4 years agoSymbolRemappingReader.h - pass Twine by reference not value. NFCI.
Simon Pilgrim [Sat, 25 Jul 2020 10:35:47 +0000 (11:35 +0100)]
SymbolRemappingReader.h - pass Twine by reference not value. NFCI.

4 years ago[IPSCCP] Drop argmemonly after replacing pointer argument.
Florian Hahn [Sat, 25 Jul 2020 10:52:14 +0000 (11:52 +0100)]
[IPSCCP] Drop argmemonly after replacing pointer argument.

This patch updates IPSCCP to drop argmemonly and
inaccessiblemem_or_argmemonly if it replaces a pointer argument.

Fixes PR46717.

Reviewers: efriedma, davide, nikic, jdoerfert

Reviewed By: efriedma, jdoerfert

Differential Revision: https://reviews.llvm.org/D84432

4 years agoFix C2975 error under MSVC
Nathan James [Sat, 25 Jul 2020 10:03:59 +0000 (11:03 +0100)]
Fix C2975 error under MSVC

Apparently a constexpr value isn't a compile time constant under certain versions of MSVC.

4 years ago[X86][SSE] getFauxShuffle - ignore undemanded sources for PACKSS/PACKUS faux shuffles
Simon Pilgrim [Sat, 25 Jul 2020 09:50:56 +0000 (10:50 +0100)]
[X86][SSE] getFauxShuffle - ignore undemanded sources for PACKSS/PACKUS faux shuffles

If we don't care about an entire LHS/RHS of the PACK op, then we can just treat it the same as undef (we don't care if it saturates), and it is safe to treat the op as a shuffle.

This can happen if we attempt to decode as a faux shuffle before SimplifyDemandedVectorElts has been called on the PACK which should replace the source with UNDEF entirely.

4 years ago[ADT] Add a range-based version of std::move
Nathan James [Sat, 25 Jul 2020 09:37:33 +0000 (10:37 +0100)]
[ADT] Add a range-based version of std::move

Adds a range-based version of `std::move`, the version that moves a range, not the one that creates r-value references.

Reviewed By: dblaikie, gamesh411

Differential Revision: https://reviews.llvm.org/D83902
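
A range-based wrapper of this kind might look like the sketch below. The names move_range and take_all are hypothetical and chosen for illustration; this is not the code added to ADT, only a minimal wrapper around the three-argument std::move algorithm from <algorithm>.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not LLVM's API): forwards a whole range to the
// std::move algorithm, which moves each element to the output iterator.
template <typename Range, typename OutputIt>
OutputIt move_range(Range &&range, OutputIt out) {
  return std::move(std::begin(range), std::end(range), out);
}

// Usage: move every string out of `source` into a new vector.
// `source` keeps its size, but its elements are left in a moved-from state.
std::vector<std::string> take_all(std::vector<std::string> &source) {
  std::vector<std::string> dest;
  dest.reserve(source.size());
  move_range(source, std::back_inserter(dest));
  return dest;
}
```

The wrapper saves the caller from spelling out the begin/end pair, which is the usual motivation for range-based versions of standard algorithms.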

4 years ago[AArch64][GlobalISel] Look through constants when selecting stores of 0
Jessica Paquette [Sat, 25 Jul 2020 01:14:41 +0000 (18:14 -0700)]
[AArch64][GlobalISel] Look through constants when selecting stores of 0

Very minor code size improvements (hits 8 times in Bullet at -O3), but still
something.

Also very minor NFC change to make sure we only search for a 0 constant when
selecting a store. Before, we'd do this for loads as well.

Differential Revision: https://reviews.llvm.org/D84573

4 years ago[tsan] Allow TSan in the Clang driver for Apple Silicon Macs
Kuba Mracek [Sat, 25 Jul 2020 03:14:00 +0000 (20:14 -0700)]
[tsan] Allow TSan in the Clang driver for Apple Silicon Macs

Differential Revision: https://reviews.llvm.org/D84082