Hannes Käufler [Sun, 26 Jul 2020 17:59:45 +0000 (13:59 -0400)]
Replace comment by private method; NFC.
Craig Topper [Sun, 26 Jul 2020 17:38:34 +0000 (10:38 -0700)]
[X86] Move getGatherOverhead/getScatterOverhead into X86TargetTransformInfo.
These cost methods don't make much sense in X86Subtarget. Make
them methods in X86's TTI and move the feature checks from the
X86Subtarget constructor into these methods.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D84594
Juneyoung Lee [Sun, 26 Jul 2020 17:23:51 +0000 (02:23 +0900)]
[InstCombine] Add a test for folding freeze into phi; NFC
Bruno Ricci [Sun, 26 Jul 2020 16:24:43 +0000 (17:24 +0100)]
[clang][NFC] Add a test for __attribute__((flag_enum)) with an unnamed enumeration.
Bruno Ricci [Sun, 26 Jul 2020 16:20:56 +0000 (17:20 +0100)]
[clang][NFC] Add tests for the use of NamedDecl::getDeclName in the unused/unneeded diagnostics.
Bruno Ricci [Sun, 26 Jul 2020 16:10:59 +0000 (17:10 +0100)]
[clang][NFC] Remove spurious +x flag on SemaConcept.cpp
Simon Pilgrim [Sun, 26 Jul 2020 15:03:53 +0000 (16:03 +0100)]
[X86][SSE] lowerV2I64Shuffle - use undef elements in PSHUFD mask widening
If we lower a v2i64 shuffle to PSHUFD, we currently clamp undef elements to 0, (elements 0,1 of the v4i32) which can result in the shuffle referencing more elements of the source vector than expected, affecting later shuffle combines and KnownBits/SimplifyDemanded calls.
By ensuring we widen the undef mask element we allow getV4X86ShuffleImm8 to use inline elements as the default, which are more likely to fold.
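A minimal sketch of the mask widening described above, written as a standalone Python model rather than the actual lowering code. The helper names `widen_v2_mask` and `shuffle_imm8` are hypothetical, and `-1` is assumed as the undef sentinel. It compares clamping undef to elements 0,1 against keeping the undef and letting the immediate default to the in-place lane.

```python
UNDEF = -1  # assumed sentinel for an undefined shuffle mask element


def widen_v2_mask(mask, keep_undef=True):
    """Expand a 2-element (64-bit lane) mask into a 4-element (32-bit lane) mask."""
    wide = []
    for m in mask:
        if m == UNDEF:
            # Old behaviour: clamp to source element 0, i.e. v4i32 elements 0,1.
            # New behaviour: keep both halves undef.
            wide += [UNDEF, UNDEF] if keep_undef else [0, 1]
        else:
            wide += [2 * m, 2 * m + 1]
    return wide


def shuffle_imm8(mask):
    """Pack a 4-element mask into an 8-bit immediate, 2 bits per lane.

    Undef lanes default to their own index (the "inline" element).
    """
    imm = 0
    for lane, m in enumerate(mask):
        imm |= (lane if m == UNDEF else m) << (2 * lane)
    return imm


clamped = widen_v2_mask([1, UNDEF], keep_undef=False)  # [2, 3, 0, 1]
widened = widen_v2_mask([1, UNDEF], keep_undef=True)   # [2, 3, -1, -1]
```

With the undef kept, lanes 2 and 3 of the immediate select themselves, so the shuffle no longer references the low half of the source.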
Vincent Zhao [Sun, 26 Jul 2020 14:40:07 +0000 (20:10 +0530)]
[MLIR][Affine] Add test for non-hyperrectangular loop tiling
This diff provides a concrete test case for the error that will be raised when the iteration space is non hyper-rectangular.
The corresponding emission method for this error message has been changed as well.
Differential Revision: https://reviews.llvm.org/D84531
Matt Arsenault [Sat, 25 Jul 2020 15:56:33 +0000 (11:56 -0400)]
AMDGPU/GlobalISel: Fix not constraining ds_append/consume operands
Matt Arsenault [Sat, 25 Jul 2020 15:00:35 +0000 (11:00 -0400)]
GlobalISel: Handle G_PTR_ADD in narrowScalar
Matt Arsenault [Sat, 25 Jul 2020 14:47:33 +0000 (10:47 -0400)]
GlobalISel: Handle fewerElementsVector for G_PTR_ADD
Matt Arsenault [Sat, 25 Jul 2020 15:14:27 +0000 (11:14 -0400)]
AMDGPU/GlobalISel: Reorder G_CONSTANT legality rules
The legal cases should be the first rules.
Matt Arsenault [Sat, 25 Jul 2020 21:22:22 +0000 (17:22 -0400)]
AMDGPU/GlobalISel: Make sure <2 x s1> phis are scalarized
Matt Arsenault [Sat, 25 Jul 2020 19:41:58 +0000 (15:41 -0400)]
AMDGPU/GlobalISel: Legalize GDS atomics
I noticed these don't use the _gfx9, non-m0 reading variants but not
sure if that's a bug or not. It's the same in the DAG.
Matt Arsenault [Sat, 18 Jul 2020 19:30:59 +0000 (15:30 -0400)]
AMDGPU/GlobalISel: Pack constant G_BUILD_VECTOR_TRUNCs when selecting
Sanjay Patel [Sun, 26 Jul 2020 13:33:13 +0000 (09:33 -0400)]
[InstSimplify] fold integer min/max intrinsics with limit constant
Matt Arsenault [Sun, 26 Jul 2020 13:26:48 +0000 (09:26 -0400)]
GlobalISel: Handle 'n' inline asm constraint
Matt Arsenault [Sat, 25 Jul 2020 18:37:29 +0000 (14:37 -0400)]
AMDGPU/GlobalISel: Sign extend integer constants
This matches the DAG behavior and fixes immediate folding
Matt Arsenault [Sat, 25 Jul 2020 17:21:31 +0000 (13:21 -0400)]
AMDGPU/GlobalISel: Replace selection tests for G_CONSTANT/G_FCONSTANT
Split into separate tests and make more consistent with the others.
Xing GUO [Sun, 26 Jul 2020 08:01:22 +0000 (16:01 +0800)]
[DWARFYAML] Rename getUsedSectionNames() to getNonEmptySectionNames().
This patch renames getUsedSectionNames() to getNonEmptySectionNames.
NFC.
Sanjay Patel [Sat, 25 Jul 2020 20:40:14 +0000 (16:40 -0400)]
[InstSimplify] add tests for min/max intrinsics; NFC
Sanjay Patel [Fri, 24 Jul 2020 19:11:02 +0000 (15:11 -0400)]
[InstSimplify] fold fcmp using isKnownNeverInfinity + isKnownNeverNaN
Follow-up to D84035 / rG7393d7574c09.
This sidesteps a question of FMF/poison on fcmp raised in PR46077:
http://bugs.llvm.org/PR46077
https://alive2.llvm.org/ce/z/TCsyzD
define i1 @src(float %x) {
%0:
%x42 = fadd nnan ninf float %x, 42.000000
%r = fcmp ueq float %x42, inf
ret i1 %r
}
=>
define i1 @tgt(float %x) {
%0:
ret i1 0
}
Transformation seems to be correct!
https://alive2.llvm.org/ce/z/FQaH7a
define i1 @src(i8 %x) {
%0:
%cast = uitofp i8 %x to float
%r = fcmp one float inf, %cast
ret i1 %r
}
=>
define i1 @tgt(i8 %x) {
%0:
ret i1 1
}
Transformation seems to be correct!
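The second Alive2 example can be cross-checked by brute force outside the compiler. The following is only an illustrative Python model; `fcmp_one` is a hypothetical helper that mimics the "ordered and not equal" predicate. Every i8 value converts to a finite float, so the comparison against infinity is always true.

```python
import math


def fcmp_one(a, b):
    """Model of 'fcmp one': true only if neither operand is NaN and a != b."""
    if math.isnan(a) or math.isnan(b):
        return False
    return a != b


# Enumerate every unsigned 8-bit input, mirroring 'uitofp i8 %x to float'.
results = {fcmp_one(math.inf, float(x)) for x in range(256)}
```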
Sanjay Patel [Fri, 24 Jul 2020 18:45:50 +0000 (14:45 -0400)]
[InstSimplify] add tests for fcmp with infinity constant; NFC
Juneyoung Lee [Sun, 26 Jul 2020 13:00:01 +0000 (22:00 +0900)]
[JumpThreading] Add a test for D84598; NFC
Juneyoung Lee [Sun, 26 Jul 2020 12:54:44 +0000 (21:54 +0900)]
[ConstantFolding] Fold freeze if it is never undef or poison
This is a simple patch that adds constant folding for freeze
instruction.
IIUC, it isn't needed to update ConstantFold.cpp because there is no freeze
constexpr.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D84597
Juneyoung Lee [Sun, 26 Jul 2020 12:48:51 +0000 (21:48 +0900)]
[ValueTracking] Instruction::isBinaryOp should be used for constexprs
This is a simple patch that makes canCreateUndefOrPoison use
Instruction::isBinaryOp because BinaryOperator inherits Instruction.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D84596
Juneyoung Lee [Sun, 26 Jul 2020 12:02:31 +0000 (21:02 +0900)]
NFC; add a test for freeze's constprop
Juneyoung Lee [Sun, 26 Jul 2020 11:47:19 +0000 (20:47 +0900)]
NFC; add an example that subtracts pointers to two global vars
Roman Lebedev [Sun, 26 Jul 2020 11:05:00 +0000 (14:05 +0300)]
[NFC][XRay] Account: migrate to DenseMap + SmallVector, -16% faster on large (3.8G) input
DenseMap is a single allocation underneath, so this has a pretty expected
performance impact on large-ish (3.8G) xray log processing time.
Roman Lebedev [Sun, 26 Jul 2020 11:00:15 +0000 (14:00 +0300)]
[NFC][XRay] Account: decouple getStats() interface from underlying data structure
It doesn't really need to know where Timings are stored, it just needs
to be able to sort them, so MutableArrayRef is enough.
That uncovers an interesting quirk that it relied on
implicit double->int conversion for calculating percentiles.
Alex Richardson [Sun, 26 Jul 2020 10:39:22 +0000 (11:39 +0100)]
[lit] Don't include tests skipped due to sharding in reports
When running multiple shards, don't include skipped tests in the xunit
output since merging the files will result in duplicates.
In our CHERI Jenkins CI, I configured the libc++ tests to run using sharding
(since we are testing using a single-CPU QEMU). We then merge the generated
XUnit xml files to produce a final result, but if the individual XMLs
report tests excluded due to sharding, each test is included N times in the
final result. This also makes it difficult to find the tests that were
skipped due to missing REQUIRES: etc.
Reviewed By: yln
Differential Revision: https://reviews.llvm.org/D84235
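The duplication problem in the lit sharding commit above can be sketched as follows. This is a simplified Python model, not lit's actual code: `select_shard`, `report_entries` and `merge` are hypothetical names, and round-robin shard selection is an assumption made for illustration.

```python
def select_shard(tests, run, num_shards):
    """Pick every num_shards-th test, starting at index run-1 (run is 1-based)."""
    return tests[run - 1::num_shards]


def report_entries(tests, run, num_shards, include_shard_skips):
    """Build the per-shard report as (test name, status) pairs."""
    selected = set(select_shard(tests, run, num_shards))
    entries = []
    for t in tests:
        if t in selected:
            entries.append((t, "PASS"))
        elif include_shard_skips:
            # Test belongs to another shard; reporting it here causes duplicates.
            entries.append((t, "SKIPPED"))
    return entries


def merge(reports):
    """Concatenate per-shard reports, as a CI job merging XML files would."""
    merged = []
    for report in reports:
        merged.extend(report)
    return merged


tests = ["a", "b", "c", "d"]
old = merge(report_entries(tests, r, 2, True) for r in (1, 2))
new = merge(report_entries(tests, r, 2, False) for r in (1, 2))
```

In the `old` merge every test appears once per shard; in the `new` merge each test appears exactly once.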
Alex Richardson [Sun, 26 Jul 2020 10:37:47 +0000 (11:37 +0100)]
[asan] Mark the strstr test as UNSUPPORTED on FreeBSD
Like Android, FreeBSD's libc calls memchr, which causes this test to fail.
Reviewed By: emaste
Differential Revision: https://reviews.llvm.org/D84541
Amara Emerson [Sun, 26 Jul 2020 07:46:29 +0000 (00:46 -0700)]
[AArch64][GlobalISel] Make <8 x s16> and <16 x s8> legal types for G_SHUFFLE_VECTOR and G_IMPLICIT_DEF.
Trivial change, we're still missing support for rev matching for these types
in the combiner.
Craig Topper [Sun, 26 Jul 2020 05:05:46 +0000 (22:05 -0700)]
[X86] Merge X86MCInstLowering's maxLongNopLength into emitNop and remove check for FeatureNOPL.
The switch in emitNop uses 64-bit registers for nops exceeding
2 bytes. This isn't valid outside 64-bit mode. We could fix this
easily enough, but there are no users that ask for more than 2
bytes outside 64-bit mode.
Inlining the method to make the coupling between the two methods
more explicit.
Craig Topper [Sun, 26 Jul 2020 03:48:46 +0000 (20:48 -0700)]
[X86] Remove getProcFamily() method from X86Subtarget. NFC
This isn't used and we've decided in the past that a CPU enum
for tuning is not a good idea.
Jacques Pienaar [Sun, 26 Jul 2020 04:37:15 +0000 (21:37 -0700)]
[mlir][shape] Further operand and result type generalization
Previous changes generalized some of the operands and results. Complete
a larger group of those to simplify progressive lowering. Also update
some of the declarative asm form due to generalization. Tried to keep it
mostly mechanical.
Changpeng Fang [Sun, 26 Jul 2020 04:20:59 +0000 (21:20 -0700)]
DAGCombiner: Don't simplify the token factor if the node's number of operands already exceeds TokenFactorInlineLimit
Summary:
In parallelizeChainedStores, a TokenFactor was created with the size greater than 3000.
We found that DAGCombiner::visitTokenFactor will consume a huge amount of time on
such nodes. Since the number of operands already exceeds TokenFactorInlineLimit, we propose
to give up simplification with the consideration of compile time.
Reviewers:
@spatel, @arsenm
Differential Revision:
https://reviews.llvm.org/D84204
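The early exit proposed in the TokenFactor commit above might look roughly like this. It is a toy Python model in which nested lists stand in for nested TokenFactor nodes; the function name and the small limit value are hypothetical and not taken from the patch.

```python
TOKEN_FACTOR_INLINE_LIMIT = 4  # hypothetical small value, for illustration only


def simplify_token_factor(operands, limit=TOKEN_FACTOR_INLINE_LIMIT):
    """Inline nested token factors unless the node is already over the limit."""
    if len(operands) > limit:
        # Give up: simplifying a huge node costs too much compile time.
        return list(operands)
    flat = []
    for op in operands:
        if isinstance(op, list):
            flat.extend(op)  # inline the nested token factor's operands
        else:
            flat.append(op)
    return flat


small = simplify_token_factor(["a", ["b", "c"]])
huge_node = [["x", "y"], "p", "q", "r", "s", "t"]
huge = simplify_token_factor(huge_node)
```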
Craig Topper [Sun, 26 Jul 2020 03:46:42 +0000 (20:46 -0700)]
[X86] Replace a use of ProcIntelSLM with FeatureFast7ByteNOP.
Eric Christopher [Sun, 26 Jul 2020 01:42:04 +0000 (18:42 -0700)]
Temporarily Revert "Unify the return value of GetByteSize to an llvm::Optional<uint64_t> (NFC-ish)"
as it's causing numerous (176) test failures on linux.
This reverts commit
1d9b860fb6a85df33fd52fcacc6a5efb421621bd.
Eric Christopher [Sun, 26 Jul 2020 01:34:02 +0000 (18:34 -0700)]
Fold StatepointBB into checks as it's only used from an NDEBUG or ASSERT
context fixing an unused variable warning.
Nemanja Ivanovic [Sun, 26 Jul 2020 00:28:52 +0000 (20:28 -0400)]
[PowerPC][NFC] Fix an assert that cannot trip from
7d076e19e31a
I mixed up the precedence of operators in the assert and thought I
had it right since there was no compiler warning. This just
adds the parentheses in the expression as needed.
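As a general illustration of the precedence pitfall mentioned in the PowerPC assert fix above (not the actual assert from that commit): in Python, as in C++, logical AND binds tighter than logical OR, so an unparenthesized mix of the two may accept inputs the author meant to reject.

```python
def check_unparenthesized(a, b, c):
    # Parsed as: a or (b and c)
    return a or b and c


def check_parenthesized(a, b, c):
    # What the author intended: (a or b) and c
    return (a or b) and c


loose = check_unparenthesized(True, False, False)
strict = check_parenthesized(True, False, False)
```

The unparenthesized form passes for this input while the intended form fails, which is how such an assert can silently never trip.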
Philip Reames [Sat, 25 Jul 2020 23:40:06 +0000 (16:40 -0700)]
[Statepoints] Style cleanup after
3da1a963 [NFC]
Just fixing a few minor stylistic issues.
Craig Topper [Sat, 25 Jul 2020 23:36:33 +0000 (16:36 -0700)]
[X86] Add masked versions of the VPTERNLOG test cases added for D83630. NFC
We don't handle these yet and D83630 won't improve that, but
at least we'll have the tests.
Roman Lebedev [Sat, 25 Jul 2020 21:56:36 +0000 (00:56 +0300)]
[Reduce] Argument reduction: do deal with function declarations
We can happily turn function definitions into declarations,
thus obscuring their argument from being elided by this pass.
I don't believe there is a good reason to just ignore declarations,
likely even proper llvm intrinsic ones;
at worst the input becomes uninteresting.
The other question here is that all these transforms are all-or-nothing.
In some cases, should we be treating each use separately?
The main blocker here seemed to be that llvm::CloneFunctionInto()
does `&OldFunc->front()`, which inserts a nullptr into a densemap,
which is not happy about it and asserts.
Roman Lebedev [Sat, 25 Jul 2020 20:24:13 +0000 (23:24 +0300)]
[Reduce] Argument reduction: do properly handle invoke insts (PR46819)
replaceFunctionCalls() is very non-exhaustive, it only handles
CallInst's. Which means, by the time we drop old function,
there may still be uses of it lurking around.
Let's instead whack-a-mole them all by replacing with undef.
I'm not sure this is the best handling, especially for calls, but IMO
poorly reduced input is much better than crashing reduction tool.
A (previously-crashing!) test added.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46819
Roman Lebedev [Sat, 25 Jul 2020 19:31:05 +0000 (22:31 +0300)]
[Reduce] Basic block reduction: do properly handle invoke insts (PR46818)
Terminator may have returned value, so we need to replace uses,
and in general handle invoke as a branch inst.
I'm not sure this is the best handling, but IMO poorly reduced
input is much better than crashing reduction tool.
A (previously-crashing!) test added.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46818
Lang Hames [Sat, 25 Jul 2020 21:18:52 +0000 (14:18 -0700)]
[ORC] Rename TargetProcessControl DynamicLibraryHandle and loadLibrary.
The new names, DylibHandle and loadDylib, are more concise and make
clear that these utilities are for loading dynamic libraries, not static
ones.
Lang Hames [Sat, 25 Jul 2020 04:17:37 +0000 (21:17 -0700)]
[ORC] Don't require PageSize or Triple during TargetProcessControl construction
Subclasses will commonly gather that information from a remote during
construction, in which case they won't have meaningful values to pass to
TargetProcessControl's constructor.
Frederik Gossen [Sat, 25 Jul 2020 22:01:21 +0000 (15:01 -0700)]
[MLIR][Shape] Allow `num_elements` to operate on extent tensors
Re-landing with dependent change landed and error condition relaxed.
Apart from the change to the error condition, this is exactly https://reviews.llvm.org/D84445.
Jacques Pienaar [Sat, 25 Jul 2020 21:55:19 +0000 (14:55 -0700)]
[MLIR][Shape] Refactor verification
Based on https://reviews.llvm.org/D84439 but less restrictive, else we
don't allow shape_of to be able to produce a ranked output and doesn't
allow for iterative refinement here. We can consider making it more
restrictive later.
Jacques Pienaar [Sat, 25 Jul 2020 21:47:57 +0000 (14:47 -0700)]
Revert "[MLIR][Shape] Allow `num_elements` to operate on extent tensors"
This reverts commit
55ced04d6bc13fd0f9396a0cfc393b44378d8784.
Forgot to submit the dependent change first.
Frederik Gossen [Sat, 25 Jul 2020 21:39:18 +0000 (14:39 -0700)]
[MLIR][Shape] Allow `num_elements` to operate on extent tensors
Differential Revision: https://reviews.llvm.org/D84445
Philip Reames [Sat, 11 Jul 2020 17:50:34 +0000 (10:50 -0700)]
[Statepoints] Support lowering gc relocations to virtual registers
(Disabled under flag for the moment)
This is part of a larger project wherein we are finally integrating lowering of gc live operands with the register allocator. Today, we force spill all operands in SelectionDAG. The code to do so is distinctly non-optimal. The approach this patch is working towards is to instead lower the relocations directly into the MI form, and let the register allocator pick which ones get spilled and which stack slots they get spilled to. In terms of performance, the latter part is actually more important as it avoids redundant shuffling of values between stack slots.
This particular change adds ISEL support to produce the variadic def STATEPOINT form required by the above. In particular, the first N are lowered to variadic tied def/use pairs. So new statepoint looks like this:
reloc1,reloc2,... = STATEPOINT ..., base1, derived1<tied-def0>, base2, derived2<tied-def1>, ...
N is limited by the maximal number of tied registers machine instruction can have (15 at the moment).
The current patch is restricted to handling relocations within a single basic block. Cross block relocations (e.g. invokes) are handled via the legacy mechanism. This restriction will be relaxed in future patches.
Patch By: dantrushin
Differential Revision: https://reviews.llvm.org/D81648
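The split implied by the statepoint commit above can be sketched like this. It is a toy Python model; `split_relocations` is a hypothetical name, and sending the overflow and cross-block relocations down the legacy spill path is an assumption drawn from the description.

```python
MAX_TIED_DEFS = 15  # limit on tied registers quoted in the commit message


def split_relocations(relocs, same_block=True):
    """Return (lowered_to_vregs, handled_by_legacy_spilling)."""
    if not same_block:
        # Cross-block relocations (e.g. invokes) stay on the legacy mechanism.
        return [], list(relocs)
    return list(relocs[:MAX_TIED_DEFS]), list(relocs[MAX_TIED_DEFS:])


relocs = ["reloc%d" % i for i in range(20)]
in_regs, spilled = split_relocations(relocs)
none_in_regs, all_spilled = split_relocations(relocs, same_block=False)
```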
Craig Topper [Sat, 25 Jul 2020 20:24:58 +0000 (13:24 -0700)]
[X86] Add llvm.roundeven test cases. Add f80 test cases for constrained intrinsics that lower to libcalls. NFC
Craig Topper [Sat, 25 Jul 2020 19:12:16 +0000 (12:12 -0700)]
[X86] Fix intrinsic names in strict fp80 tests to use f80 in their names instead of x86_fp80.
The type is called x86_fp80, but when it is printed in the intrinsic
name it should be f80. The parser doesn't seem to care that the
name was wrong.
Fangrui Song [Sat, 25 Jul 2020 19:33:18 +0000 (12:33 -0700)]
[Driver] Define LinkOption and fix forwarded options to GCC for linking
Many driver options are neither 'DriverOption' nor 'LinkerInput'. When gcc is
used for linking, these options get forwarded even if they don't have anything
to do with linking. Among these options, clang-specific ones can cause gcc to
error.
Just use 'OPT_Link_Group' and a new flag 'LinkOption' for options which already
have a group.
gfortran support apparently bit rots (which does not seem to make much sense). XFAIL the test.
LLVM GN Syncbot [Sat, 25 Jul 2020 18:51:58 +0000 (18:51 +0000)]
[gn build] Port
136c8f50e96
Roman Lebedev [Sat, 25 Jul 2020 18:43:36 +0000 (21:43 +0300)]
[Reduce] Try turning function definitions into declarations first, NFCI-ish
ReduceFunctions could do it, but it also replaces *all* calls with undef,
so if any of undef replacements makes reduction uninteresting,
it won't work.
ReduceBasicBlocks also could do it, but well, it may take many guesses
for all the blocks of a function to happen to be out-of-chunk,
which is not a very efficient way to go about it.
So let's just do this first.
Adrian Prantl [Sat, 25 Jul 2020 15:27:21 +0000 (08:27 -0700)]
Unify the return value of GetByteSize to an llvm::Optional<uint64_t> (NFC-ish)
This cleanup patch unifies all methods called GetByteSize() in the
ValueObject hierarchy to return an optional, like the methods in
CompilerType do. This means fewer magic 0 values, which could fix bugs
down the road in languages where types can have a size of zero, such
as Swift and C (but not C++).
Differential Revision: https://reviews.llvm.org/D84285
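The "magic 0" problem that the GetByteSize change above addresses can be illustrated with a small Python analogue. The function names and the size table are hypothetical; `Optional[int]` stands in for `llvm::Optional<uint64_t>`.

```python
from typing import Dict, Optional


def byte_size_legacy(sizes: Dict[str, int], name: str) -> int:
    """Old style: 0 means both 'unknown' and 'really zero-sized'."""
    return sizes.get(name, 0)


def byte_size_optional(sizes: Dict[str, int], name: str) -> Optional[int]:
    """New style: None means 'unknown', 0 means a genuinely empty type."""
    return sizes.get(name)


sizes = {"EmptyStruct": 0, "Int32": 4}
```

With the legacy form a caller cannot tell an empty type from a failed lookup; with the optional form the two cases are distinct.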
Florian Hahn [Sat, 25 Jul 2020 14:45:24 +0000 (15:45 +0100)]
[X86] Remove stress-scheduledagrrlist.ll.
This test seems to take quite a long time with EXPENSIVE_CHECKS.
Remove it.
Nikita Popov [Sat, 25 Jul 2020 14:32:22 +0000 (16:32 +0200)]
[LVI] Don't require operand number for range (NFC)
Pass the Value* instead of the operand number, rename I to CxtI.
This makes the function a bit more generally useful.
Matt Arsenault [Tue, 16 Jun 2020 00:13:24 +0000 (20:13 -0400)]
AMDGPU/GlobalISel: Don't assert on G_INSERT > 128-bits
Just fallback for now. Really tablegen needs to generate all of the
subregister index handling we need.
Nikita Popov [Sat, 25 Jul 2020 14:02:15 +0000 (16:02 +0200)]
[SCCP] Add assume non null test (NFC)
Nikita Popov [Sat, 25 Jul 2020 13:10:48 +0000 (15:10 +0200)]
[SCCP] Restore the change reporting as well
Reapply
5db5b4bc4394ca247c9eb665e03b851848aa2fbf.
Nikita Popov [Tue, 21 Jul 2020 19:26:30 +0000 (21:26 +0200)]
Reapply [SCCP] Directly remove non-feasible edges
Reapply with DTU update moved after CFG update, which is a
requirement of the API.
-----
Non-feasible control-flow edges are currently removed by replacing
the branch condition with a constant and then calling
ConstantFoldTerminator. This happens in a rather roundabout manner,
by inspecting the users (effectively: predecessors) of unreachable
blocks, and further complicated by the need to explicitly materialize
the condition for "forced" edges. I would like to extend SCCP to
discard switch conditions that are non-feasible based on range
information, but this is incompatible with the current approach
(as there is no single constant we could use.)
Instead, this patch explicitly removes non-feasible edges. It
currently only needs to handle the case where there is a single
feasible edge. The llvm_unreachable() branch will need to be
implemented for the aforementioned switch improvement.
Differential Revision: https://reviews.llvm.org/D84264
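A rough model of the explicit edge removal described in the SCCP commit above, using a plain dictionary as the CFG. The function name and data layout are hypothetical. As in the commit message, only the single-feasible-edge case is handled, and the multi-edge case is left unimplemented.

```python
def remove_non_feasible_edges(successors, feasible):
    """successors: block -> list of successor blocks.
    feasible: set of (block, successor) edges proven executable.
    """
    result = {}
    for block, succs in successors.items():
        kept = [s for s in succs if (block, s) in feasible]
        if not kept or len(kept) == len(succs):
            # Nothing to prune here (or the block itself is unreachable and
            # is assumed to be cleaned up elsewhere).
            result[block] = list(succs)
        elif len(kept) == 1:
            # Single feasible edge: becomes an unconditional branch.
            result[block] = kept
        else:
            # Stand-in for the llvm_unreachable() branch in the patch.
            raise NotImplementedError("multiple feasible edges")
    return result


cfg = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
feasible = {("entry", "then"), ("then", "exit")}
pruned = remove_non_feasible_edges(cfg, feasible)
```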
Simon Pilgrim [Sat, 25 Jul 2020 11:58:39 +0000 (12:58 +0100)]
SimplifyLibCalls - remove unnecessary header and forward declaration. NFC.
We include TargetLibraryInfo.h so don't need to forward declare it, and we don't need to include TargetLibraryInfo.h in SimplifyLibCalls.cpp as well.
Simon Pilgrim [Sat, 25 Jul 2020 11:08:06 +0000 (12:08 +0100)]
[X86][SSE] combineX86ShufflesRecursively - move all Root node asserts to the same location. NFCI.
Minor tidyup for some upcoming shuffle combine improvements.
Simon Pilgrim [Sat, 25 Jul 2020 10:35:47 +0000 (11:35 +0100)]
SymbolRemappingReader.h - pass Twine by reference not value. NFCI.
Florian Hahn [Sat, 25 Jul 2020 10:52:14 +0000 (11:52 +0100)]
[IPSCCP] Drop argmemonly after replacing pointer argument.
This patch updates IPSCCP to drop argmemonly and
inaccessiblemem_or_argmemonly if it replaces a pointer argument.
Fixes PR46717.
Reviewers: efriedma, davide, nikic, jdoerfert
Reviewed By: efriedma, jdoerfert
Differential Revision: https://reviews.llvm.org/D84432
Nathan James [Sat, 25 Jul 2020 10:03:59 +0000 (11:03 +0100)]
Fix C2975 error under MSVC
Apparently a constexpr value isn't a compile time constant under certain versions of MSVC.
Simon Pilgrim [Sat, 25 Jul 2020 09:50:56 +0000 (10:50 +0100)]
[X86][SSE] getFauxShuffle - ignore undemanded sources for PACKSS/PACKUS faux shuffles
If we don't care about an entire LHS/RHS of the PACK op, then we can just treat it the same as undef (we don't care if it saturates), and it is safe to treat as a shuffle.
This can happen if we attempt to decode as a faux shuffle before SimplifyDemandedVectorElts has been called on the PACK which should replace the source with UNDEF entirely.
Nathan James [Sat, 25 Jul 2020 09:37:33 +0000 (10:37 +0100)]
[ADT] Add a range-based version of std::move
Adds a range-based version of `std::move`, the version that moves a range, not the one that creates r-value references.
Reviewed By: dblaikie, gamesh411
Differential Revision: https://reviews.llvm.org/D83902
Jessica Paquette [Sat, 25 Jul 2020 01:14:41 +0000 (18:14 -0700)]
[AArch64][GlobalISel] Look through constants when selecting stores of 0
Very minor code size improvements (hits 8 times in Bullet at -O3), but still
something.
Also very minor NFC change to make sure we only search for a 0 constant when
selecting a store. Before, we'd do this for loads as well.
Differential Revision: https://reviews.llvm.org/D84573
Kuba Mracek [Sat, 25 Jul 2020 03:14:00 +0000 (20:14 -0700)]
[tsan] Allow TSan in the Clang driver for Apple Silicon Macs
Differential Revision: https://reviews.llvm.org/D84082
Amy Kwan [Sat, 25 Jul 2020 01:57:57 +0000 (20:57 -0500)]
[PowerPC] Exploit the High Order Vector Multiply Instructions on Power10
This patch aims to exploit the following vector multiply high instructions on Power10.
vmulhsw VRT, VRA, VRB
vmulhsd VRT, VRA, VRB
vmulhuw VRT, VRA, VRB
vmulhud VRT, VRA, VRB
Differential Revision: https://reviews.llvm.org/D82584
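For readers unfamiliar with "multiply high", the per-element semantics of the four instructions listed above can be modelled as below: each pair of elements is multiplied at double width and the upper half of the product is kept. This is a Python sketch with hypothetical helper names, not code from the patch.

```python
def _to_signed(value, bits):
    """Reinterpret the low 'bits' bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mul_high(a, b, bits, signed):
    """Return the high 'bits' bits of the 2*bits-wide product, as unsigned."""
    mask = (1 << bits) - 1
    if signed:
        a, b = _to_signed(a, bits), _to_signed(b, bits)
    else:
        a, b = a & mask, b & mask
    return ((a * b) >> bits) & mask


def vmulh(va, vb, bits, signed):
    """Element-wise multiply-high over two equally sized vectors."""
    return [mul_high(a, b, bits, signed) for a, b in zip(va, vb)]


signed_words = vmulh([0xFFFFFFFF, 0x40000000], [2, 4], 32, True)     # vmulhsw-like
unsigned_words = vmulh([0xFFFFFFFF, 0x40000000], [2, 4], 32, False)  # vmulhuw-like
```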
Adrian Prantl [Sat, 25 Jul 2020 00:59:28 +0000 (17:59 -0700)]
Upstream macCatalyst support in ArchSpec and associated unit tests.
Rong Xu [Sat, 25 Jul 2020 00:39:55 +0000 (17:39 -0700)]
[PGO] Fix incorrect function entry count
Function entry count might be zero after the profile counts reset and
before reentry to the function.
Zero profile entry count is very bad as the profile count from BFI will
be wrong.
A simple fix is to set the profile entry count to 1 if there are
non-zero profile counts in this function.
Differential Revision: https://reviews.llvm.org/D84378
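The fix described in the entry-count commit above amounts to a small clamp. A minimal Python sketch, with a hypothetical function name and a plain list of block counts as input:

```python
def fix_entry_count(entry_count, block_counts):
    """Bump a zero entry count to 1 when any block in the function was executed."""
    if entry_count == 0 and any(count != 0 for count in block_counts):
        return 1
    return entry_count
```

For example, `fix_entry_count(0, [0, 5, 3])` yields 1, while a function whose counts are all zero keeps an entry count of 0.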
Rong Xu [Sat, 25 Jul 2020 00:38:31 +0000 (17:38 -0700)]
[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction
Skip profile count promotion if any of the ExitBlocks contains a ret
instruction. This is to prevent dumping of incomplete profile -- if the
loop is a long running loop and dump is called in the middle
of the loop, the result profile is incomplete.
ExitBlocks containing a ret instruction is an indication of a long running
loop -- early exit to error handling code.
Differential Revision: https://reviews.llvm.org/D84379
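The heuristic in the counter-promotion commit above can be expressed as a simple predicate. This is a toy Python model in which each exit block is a list of opcode names; the function name is hypothetical.

```python
def can_promote_counters(exit_blocks):
    """Allow promotion only if no loop exit block contains a 'ret'."""
    return not any("ret" in block for block in exit_blocks)


ordinary_loop = [["add", "br"], ["store", "br"]]
early_return_loop = [["call", "ret"], ["br"]]
```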
Rong Xu [Sat, 25 Jul 2020 00:35:44 +0000 (17:35 -0700)]
Revert "[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction"
This reverts commit
6fdc6f6c7d34af60c45c405f448370a684ef6f2a.
Rong Xu [Sat, 25 Jul 2020 00:33:49 +0000 (17:33 -0700)]
Revert "[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction"
This reverts commit
867ef4472d8e57384c929e4f06b74d1ac8883a99.
Rong Xu [Sat, 25 Jul 2020 00:16:25 +0000 (17:16 -0700)]
[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction
Forgot including the tests in the commit
6fdc6f6c7d34af60c4.
Amy Kwan [Fri, 24 Jul 2020 22:41:50 +0000 (17:41 -0500)]
[PowerPC] Implement Truncate and Store VSX Vector Builtins
This patch implements the `vec_xst_trunc` function in altivec.h in order to
utilize the Store VSX Vector Rightmost [byte | half | word | doubleword] Indexed
instructions introduced in Power10.
Differential Revision: https://reviews.llvm.org/D82467
Jessica Paquette [Fri, 24 Jul 2020 23:57:37 +0000 (16:57 -0700)]
[AArch64][GlobalISel] Use wzr/xzr for 16 and 32 bit stores of zero
We weren't performing this optimization on 16 and 32 bit stores. SDAG happily
does this though.
e.g. https://godbolt.org/z/cWocKr
This saves about 0.2% in code size on CTMark at -O3.
Differential Revision: https://reviews.llvm.org/D84568
Rong Xu [Sat, 25 Jul 2020 00:13:58 +0000 (17:13 -0700)]
[PGO][InstrProf] Do not promote count if the exit blocks contains ret instruction
Skip profile count promotion if any of the ExitBlocks contains a ret
instruction. This is to prevent dumping of incomplete profile -- if the
loop is a long running loop and dump is called in the middle
of the loop, the result profile is incomplete.
ExitBlocks containing a ret instruction is an indication of a long running
loop -- early exit to error handling code.
Differential Revision: https://reviews.llvm.org/D84379
Matt Arsenault [Sun, 19 Jul 2020 17:09:48 +0000 (13:09 -0400)]
GlobalISel: Define mulfix/divfix opcodes
The full expansion involves the funnel shifts, which depend on another
patch to expand those.
Amara Emerson [Fri, 24 Jul 2020 23:43:55 +0000 (16:43 -0700)]
[AArch64][GlobalISel] Promote G_UITOFP vector operands to same elt size as result.
Fixes legalization failures.
Jonas Devlieghere [Fri, 24 Jul 2020 23:20:55 +0000 (16:20 -0700)]
[lldb] Have LanguageRuntime and SystemRuntime share a base class (NFC)
LanguageRuntime and SystemRuntime now both inherit from Runtime.
Jonas Devlieghere [Fri, 24 Jul 2020 22:10:05 +0000 (15:10 -0700)]
[lldb] Don't wrap and release raw pointer in unique_ptr (NFC)
Jez Ng [Fri, 24 Jul 2020 22:55:14 +0000 (15:55 -0700)]
[lld-macho] Ignore -dependency_info and its argument
XCode passes in this flag, which we do not yet implement. Skip
over the argument for now so we can at least successfully parse the
linker invocation.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D84485
Jez Ng [Fri, 24 Jul 2020 22:55:25 +0000 (15:55 -0700)]
[lld-macho] Partial support for weak definitions
This diff adds support for weak definitions, though it doesn't handle weak
symbols in dylibs quite correctly -- we need to emit binding opcodes for them
in the weak binding section rather than the lazy binding section.
What *is* covered in this diff:
1. Reading the weak flag from symbol table / export trie, and writing it to the
export trie
2. Refining the symbol table's rules for choosing one symbol definition over
another. Wrote a few dozen test cases to make sure we were matching ld64's
behavior.
We can now link basic C++ programs.
Reviewed By: #lld-macho, compnerd
Differential Revision: https://reviews.llvm.org/D83532
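The kind of precedence rules mentioned in item 2 of the weak-definitions commit above can be sketched as follows. The rules shown (a strong definition beats a weak one, two strong definitions conflict, the first of two weak definitions wins) are the conventional linker behaviour and are an assumption here, not a transcription of the patch or of ld64. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Defined:
    name: str
    file: str
    weak: bool = False


class DuplicateSymbol(Exception):
    pass


def resolve(existing: Optional[Defined], incoming: Defined) -> Defined:
    """Choose which definition the symbol table keeps."""
    if existing is None:
        return incoming
    if existing.weak and not incoming.weak:
        return incoming  # strong overrides weak
    if not existing.weak and not incoming.weak:
        raise DuplicateSymbol(incoming.name)  # two strong definitions
    return existing  # incoming is weak: keep what we already have


weak_a = Defined("f", "a.o", weak=True)
weak_b = Defined("f", "b.o", weak=True)
strong = Defined("f", "c.o")
```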
Alina Sbirlea [Fri, 10 Apr 2020 01:29:40 +0000 (18:29 -0700)]
Reapply "[DomTree] Replace ChildrenGetter with GraphTraits over GraphDiff."
This is the part of the patch that's moving the Updates to a CFGDiff
object. Splitting off from the clean-up work merging the two branches when BUI is null.
Differential Revision: https://reviews.llvm.org/D77341
Jinsong Ji [Fri, 24 Jul 2020 20:55:52 +0000 (20:55 +0000)]
[compiler-rt][CMake] Remove unused -stdlib when passing -nostdinc++
We added -nostdinc++ to clang_rt.profile in https://reviews.llvm.org/D84205.
This will cause warnings when building with LLVM_ENABLE_LIBCXX,
and failure if with Werror on.
This patch is to fix it by removing unused -stdlib,
similar to what we have done in https://reviews.llvm.org/D42238.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D84543
Jon Roelofs [Fri, 24 Jul 2020 20:54:17 +0000 (14:54 -0600)]
[compiler-rt][fuzzer] Disable bcmp.test on darwin
It broke one of the buildbots:
http://lab.llvm.org:8080/green/job/clang-stage1-RA/13026/console
Matt Arsenault [Thu, 23 Jul 2020 01:24:21 +0000 (21:24 -0400)]
AMDGPU: Skip other terminators before inserting s_cbranch_exec[n]z
PHIElimination/createPHISourceCopy inserts non-branch terminators
after the control flow pseudo if a successor phi reads a register
defined by the control flow pseudo. If this happens, we need to split
the expansion of the control flow pseudo to ensure all the branches
are after all of the other mask management instructions.
GlobalISel hit this in testcases that happened to be tail
duplicated. The original testcase still does not work, since the same
problem appears to be present in a later pass.
Petr Hosek [Fri, 24 Jul 2020 20:36:13 +0000 (13:36 -0700)]
[CMake] Find zlib when building lldb as standalone
This addresses the issue introduced by 10b1b4a.
Yifan Shen [Fri, 24 Jul 2020 20:30:04 +0000 (13:30 -0700)]
Add Debug Info Size to Symbol Status
If a module has debug info, the size of debug symbol will be displayed after the Symbols Loaded Message for each module in the VScode modules view. {F12335461}
Reviewed By: wallace, clayborg
Differential Revision: https://reviews.llvm.org/D83731
Walter Erquinigo [Fri, 24 Jul 2020 20:28:29 +0000 (13:28 -0700)]
Revert "Add Debug Info Size to Symbol Status"
This reverts commit
986e3af53bfe591e88a1ae4f82ea1cc0a15819a3.
It incorrectly deleted clang/tools/clang-format/git-clang-format
Yifan Shen [Fri, 24 Jul 2020 19:45:41 +0000 (12:45 -0700)]
Add Debug Info Size to Symbol Status
Summary: If a module has debug info, the size of debug symbol will be displayed after the Symbols Loaded Message for each module in the VScode modules view. {F12335461}
Reviewers: wallace, clayborg
Reviewed By: wallace, clayborg
Subscribers: cfe-commits, aprantl, lldb-commits
Tags: #lldb, #clang
Differential Revision: https://reviews.llvm.org/D83731
Eli Friedman [Thu, 23 Jul 2020 19:52:46 +0000 (12:52 -0700)]
[AArch64][SVE] Add "fast" fcmp operations.
dacf8d3 added support for most fcmp operations, but there are some extra
variations I hadn't considered: SelectionDAG supports float comparisons
that are neither ordered nor unordered. Add support for the missing
operations.
Differential Revision: https://reviews.llvm.org/D84460
Johannes Doerfert [Fri, 24 Jul 2020 19:06:27 +0000 (14:06 -0500)]
[SROA] Teach promote to register about droppable instructions
This is the second of two patches to address PR46753. We basically allow
SROA to promote allocas that are used in droppable instructions, for
now that means `llvm.assume`. The (transitive) uses are replaced by
`undef` in the droppable instructions.
See also D83976.
Reviewed By: Tyker
Differential Revision: https://reviews.llvm.org/D83978