Xin Tong [Mon, 8 Oct 2018 15:12:48 +0000 (15:12 +0000)]
[ThinLTO] Keep non-prevailing (linkonce|weak)_odr symbols live
Summary:
If we have a symbol with (linkonce|weak)_odr linkage, we do not want
to dead strip it even if it is not prevailing.
IR level (linkonce|weak)_odr symbol can become non-prevailing when we mix
ELF objects and IR objects where the (linkonce|weak)_odr symbol in the ELF
object is prevailing and the ones in the IR objects are not. Stripping
them will prevent us from doing optimizations with them.
By not dead stripping them, we will convert these symbols to
available_externally linkage as a result of being non-prevailing and
eventually drop them after inlining.
I modified cache-prevailing.ll to use linkonce linkage as it is
testing whether the cache prevailing bit is effective or not, not
whether we should treat linkonce_odr as alive or not.
Reviewers: tejohnson, pcc
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D52893
llvm-svn: 343970
Oliver Stannard [Mon, 8 Oct 2018 14:12:08 +0000 (14:12 +0000)]
[AArch64][v8.5A] Don't create BR instructions in outliner when BTI enabled
When branch target identification is enabled, we can only do indirect
tail-calls through x16 or x17. This means that the outliner can't
transform a BLR instruction at the end of an outlined region into a BR.
Differential revision: https://reviews.llvm.org/D52869
llvm-svn: 343969
Oliver Stannard [Mon, 8 Oct 2018 14:09:15 +0000 (14:09 +0000)]
[AArch64][v8.5A] Restrict indirect tail calls to use x16/17 only when using BTI
When branch target identification is enabled, all indirectly-callable
functions start with a BTI C instruction. This instruction can only be
the target of certain indirect branches (direct branches and
fall-through are not affected):
- A BLR instruction, in either a protected or unprotected page.
- A BR instruction in a protected page, using x16 or x17.
- A BR instruction in an unprotected page, using any register.
Without BTI, we can use any non call-preserved register to hold the
address for an indirect tail call. However, when BTI is enabled, then
the code being compiled might be loaded into a BTI-protected page, where
only x16 and x17 can be used for indirect tail calls.
Legacy code without this restriction can still indirectly tail-call
BTI-protected functions, because they will be loaded into an unprotected
page, so any register is allowed.
Differential revision: https://reviews.llvm.org/D52868
llvm-svn: 343968
Oliver Stannard [Mon, 8 Oct 2018 14:04:24 +0000 (14:04 +0000)]
[AArch64][v8.5A] Branch Target Identification code-generation pass
The Branch Target Identification extension, introduced to AArch64 in
Armv8.5-A, adds the BTI instruction, which is used to mark valid targets
of indirect branches. When enabled, the processor will trap if an
instruction in a protected page tries to perform an indirect branch to
any instruction other than a BTI. The BTI instruction uses encodings
which were NOPs in earlier versions of the architecture, so BTI-enabled
code will still run on earlier hardware, just without the extra
protection.
There are 3 variants of the BTI instruction, which are valid targets for
different kinds of branches:
- BTI C can be targeted by call instructions, and is intended to be
used at function entry points. These are the BLR instruction, as well
as BR with x16 or x17. These BR instructions are allowed for use in
PLT entries, and we can also use them to allow indirect tail-calls.
- BTI J can be targeted by BR only, and is intended to be used by jump
tables.
- BTI JC acts as both a BTI C and a BTI J instruction, and can be
targeted by any BLR or BR instruction.
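The three variants can be read as a small acceptance table. The Python
sketch below is only a model of the rules listed above, not code from
the patch: the name `bti_accepts` and the string encodings are made up
for illustration, and it assumes the branch executes from a
BTI-protected page.

```python
# Illustrative model of the BTI variants described above; not LLVM code.
# Assumption: the branching instruction lives in a BTI-protected page.

CALL_REGS = {"x16", "x17"}  # BR through these registers is treated like a call


def bti_accepts(variant: str, opcode: str, reg: str) -> bool:
    """Return True if a `variant` landing pad ("c", "j" or "jc") may be
    targeted by the indirect branch `opcode` ("blr" or "br") through `reg`."""
    if opcode not in ("blr", "br"):
        raise ValueError(f"not an indirect branch: {opcode}")
    call_like = opcode == "blr" or reg in CALL_REGS  # BLR, or BR x16/x17
    jump_like = opcode == "br"                        # any BR
    if variant == "c":
        return call_like
    if variant == "j":
        return jump_like
    if variant == "jc":
        return call_like or jump_like
    raise ValueError(f"unknown BTI variant: {variant}")
```

For example, `bti_accepts("c", "br", "x3")` is false in this model, which
is why indirect tail calls in protected pages have to go through x16 or
x17.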
Note that RET instructions are not restricted by branch target
identification; the reason for this is that return addresses can be
protected more effectively using return address signing. Direct branches
and calls are also unaffected, as it is assumed that an attacker cannot
modify executable pages (if they could, they wouldn't need to do a
ROP/JOP attack).
This patch adds a MachineFunctionPass which:
- Adds a BTI C at the start of every function which could be indirectly
called (either because it is address-taken, or externally visible so
could be address-taken in another translation unit).
- Adds a BTI J at the start of every basic block which could be
indirectly branched to. This could be either done by a jump table, or
by taking the address of the block (e.g. using the GCC label values
extension).
We only need to use BTI JC when a function is indirectly-callable, and
takes the address of the entry block. I've not been able to trigger this
from C or IR, but I've included a MIR test just in case.
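The placement rules above boil down to a per-block decision. The
following Python sketch is a hypothetical restatement of those rules,
not the MachineFunctionPass itself; `choose_bti` and its parameters are
invented names.

```python
# Hypothetical restatement of the placement rules; not the actual pass.
from typing import Optional


def choose_bti(is_entry_block: bool,
               function_indirectly_callable: bool,
               block_indirectly_branched_to: bool) -> Optional[str]:
    """Pick the BTI variant for the start of a basic block, or None."""
    # BTI C: entry of a function that is address-taken or externally visible.
    needs_c = is_entry_block and function_indirectly_callable
    # BTI J: block reachable through a jump table or a taken block address.
    needs_j = block_indirectly_branched_to
    if needs_c and needs_j:
        return "jc"  # the rare case covered by the MIR test
    if needs_c:
        return "c"
    if needs_j:
        return "j"
    return None
```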
Using BTI C at function entries relies on the fact that no other code in
BTI-protected pages uses indirect tail-calls, unless they use x16 or x17
to hold the address. I'll add that code-generation restriction as a
separate patch.
Differential revision: https://reviews.llvm.org/D52867
llvm-svn: 343967
Alexander Ivchenko [Mon, 8 Oct 2018 13:40:34 +0000 (13:40 +0000)]
[GlobalIsel][X86] Support G_UDIV/G_UREM/G_SREM
Support G_UDIV/G_UREM/G_SREM. The instruction selection
code is taken from FastISel with only minor tweaks to adapt
for GlobalISel.
Differential Revision: https://reviews.llvm.org/D49781
llvm-svn: 343966
Sanjay Patel [Mon, 8 Oct 2018 12:54:33 +0000 (12:54 +0000)]
[x86] add 16 missed hadd patterns (PR39195); NFC
llvm-svn: 343965
David Carlier [Mon, 8 Oct 2018 12:18:19 +0000 (12:18 +0000)]
[Sanitizer] fix internal_sysctlbyname build for FreeBSD.
llvm-svn: 343964
Haojian Wu [Mon, 8 Oct 2018 10:44:54 +0000 (10:44 +0000)]
[clangd] Update the out-of-date yaml-symbol-file flag in clangd.
Summary:
The flag is stale due to recent changes to the clangd indexer; this
patch renames the flag to "index-file".
Reviewers: sammccall
Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits
Differential Revision: https://reviews.llvm.org/D52976
llvm-svn: 343963
Neil Henning [Mon, 8 Oct 2018 10:32:33 +0000 (10:32 +0000)]
[IRBuilder] Fixup CreateIntrinsic to allow specifying Types to Mangle.
The IRBuilder CreateIntrinsic method wouldn't allow you to specify the
types that you wanted the intrinsic to be mangled with. To fix this
I've:
- Added an ArrayRef<Type *> member to both CreateIntrinsic overloads.
- Used that array to pass into the Intrinsic::getDeclaration call.
- Added a CreateUnaryIntrinsic to replace the most common use of
CreateIntrinsic where the type was auto-deduced from operand 0.
- Added a bunch more unit tests to test Create*Intrinsic calls that
weren't being tested (including the FMF flag that wasn't checked).
This was suggested as part of the AMDGPU specific atomic optimizer
review (https://reviews.llvm.org/D51969).
Differential Revision: https://reviews.llvm.org/D52087
llvm-svn: 343962
Francis Visoiu Mistrih [Mon, 8 Oct 2018 10:28:11 +0000 (10:28 +0000)]
[AsmParser] Return an error in the case of empty symbol ref in an expression
The following instruction:
> str q28, [x0, #1*6*4*@]
contains a @ which is parsed as an empty symbol. The parser returns true
but has no error, so the assembler continues by ignoring the
instruction.
Differential Revision: https://reviews.llvm.org/D52645
llvm-svn: 343961
Peter Smith [Mon, 8 Oct 2018 09:38:28 +0000 (09:38 +0000)]
[ARM] Account for implicit IT when calculating inline asm size
When deciding if it is safe to optimize a conditional branch to a CBZ or
CBNZ the offsets of the BasicBlocks from the start of the function are
estimated. For inline assembly the generic getInlineAsmLength() function is
used to get a worst case estimate of the inline assembly by multiplying the
number of instructions by the max instruction size of 4 bytes. This
unfortunately doesn't take into account the generation of Thumb implicit IT
instructions. In edge cases such as when all the instructions in the block
are 4-bytes in size and there is an implicit IT then the size is
underestimated. This can cause an out of range CBZ or CBNZ to be generated.
The patch takes a conservative approach and assumes that every instruction
in the inline assembly block may have an implicit IT.
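A rough illustration of the arithmetic, in Python. This is not the code
from the patch: the function names are invented, and the 2-byte cost per
IT is an assumption based on IT being a 16-bit Thumb instruction (the
commit message does not state the constant used).

```python
# Illustrative arithmetic only; names and the IT_BYTES constant are assumptions.
MAX_INSTR_BYTES = 4  # worst-case instruction size used by the generic estimate
IT_BYTES = 2         # assumed: a Thumb IT instruction is a 16-bit encoding


def inline_asm_size_generic(num_instructions: int) -> int:
    """Generic estimate: every instruction takes the maximum size."""
    return num_instructions * MAX_INSTR_BYTES


def inline_asm_size_with_implicit_it(num_instructions: int) -> int:
    """Conservative estimate: every instruction may also get an implicit IT."""
    return num_instructions * (MAX_INSTR_BYTES + IT_BYTES)
```

With three 4-byte instructions the generic estimate gives 12 bytes, so a
single implicit IT already pushes the real size past the estimate; the
conservative variant gives 18 bytes.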
Fixes pr31805
Differential Revision: https://reviews.llvm.org/D52834
llvm-svn: 343960
Oliver Stannard [Mon, 8 Oct 2018 09:18:48 +0000 (09:18 +0000)]
[AArch64] Fix verifier error when outlining indirect calls
The MachineOutliner for AArch64 transforms indirect calls into indirect
tail calls, replacing the call with the TCRETURNri pseudo-instruction.
This pseudo lowers to a BR, but has the isCall and isReturn flags set.
The problem is that TCRETURNri takes a tcGPR64 as the register argument,
to prevent indirect tail-calls from using caller-saved registers. The
indirect calls transformed by the outliner could use caller-saved
registers. This is fine, because the outliner ensures that the register
is available at all call sites. However, this causes a verifier failure
when the register is not in tcGPR64. The fix is to add a new
pseudo-instruction like TCRETURNri, but which accepts any GPR.
Differential revision: https://reviews.llvm.org/D52829
llvm-svn: 343959
Alex Bradbury [Mon, 8 Oct 2018 09:08:51 +0000 (09:08 +0000)]
[RISCV] Update alu8.ll and alu16.ll test cases
The srli test in alu8.ll was a no-op, as it shifted by 8 bits. Fix this, and
also change the immediate in alu16.ll, as shifting by something other than a
power of 8 is more interesting.
llvm-svn: 343958
Kristina Brooks [Mon, 8 Oct 2018 09:03:17 +0000 (09:03 +0000)]
[DebugInfo][PDB] Fix a signed/unsigned conversion warning
Fix the following warning when compiling with clang (caused by commit
rL343951):
GlobalsStream.cpp:61:33: warning: comparison of integers of different
signs: 'int' and 'uint32_t'
This also avoids double evaluation of `GlobalsTable.HashBuckets.size()`.
llvm-svn: 343957
Ewan Crawford [Mon, 8 Oct 2018 08:40:45 +0000 (08:40 +0000)]
[InstCombine] Fix incongruous GEP type addrspace
Currently running the @insertelem_after_gep function below through the InstCombine pass with opt produces invalid IR.
Input:
```
define void @insertelem_after_gep(<16 x i32>* %t0) {
%t1 = bitcast <16 x i32>* %t0 to [16 x i32]*
%t2 = addrspacecast [16 x i32]* %t1 to [16 x i32] addrspace(3)*
%t3 = getelementptr inbounds [16 x i32], [16 x i32] addrspace(3)* %t2, i64 0, i64 0
%t4 = insertelement <16 x i32 addrspace(3)*> undef, i32 addrspace(3)* %t3, i32 0
call void @extern_vec_pointers_func(<16 x i32 addrspace(3)*> %t4)
ret void
}
```
Output:
```
define void @insertelem_after_gep(<16 x i32>* %t0) {
%t3 = getelementptr inbounds <16 x i32>, <16 x i32>* %t0, i64 0, i64 0
%t4 = insertelement <16 x i32 addrspace(3)*> undef, i32 addrspace(3)* %t3, i32 0
call void @extern_vec_pointers_func(<16 x i32 addrspace(3)*> %t4)
ret void
}
```
Although this causes no complaints when produced, it isn't valid IR, as the insertelement use of the %t3 GEP expects an address space.
```
opt: /tmp/bad.ll:52:73: error: '%t3' defined with type 'i32*' but expected 'i32 addrspace(3)*'
%t4 = insertelement <16 x i32 addrspace(3)*> undef, i32 addrspace(3)* %t3, i32 0
```
I've fixed this by adding an addrspacecast after the GEP in the InstCombine pass, and including a check for this type mismatch to the verifier.
Reviewers: spatel, lebedev.ri
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52294
llvm-svn: 343956
Alex Bradbury [Mon, 8 Oct 2018 06:24:59 +0000 (06:24 +0000)]
[SelectionDAGBuilder][NFC] Pass LHSTy to getShiftAmountTy rather than RHSTy
r126518 introduced a type parameter to the getShiftAmountTy target hook. It
produces the type of the shift (RHSTy), parameterised by the type of the value
being shifted (LHSTy). SelectionDAGBuilder::visitShift passed RHSTy rather
than LHSTy and this patch corrects this. The change is a no-op because in LLVM
IR the LHS and RHS types for a shift must be equal anyway.
llvm-svn: 343955
Max Kazantsev [Mon, 8 Oct 2018 05:46:29 +0000 (05:46 +0000)]
[LV] Do not create SCEVs on broken IR in emitTransformedIndex. PR39160
At the point when we perform `emitTransformedIndex`, we have a broken IR (in
particular, we have Phis for which not every incoming value is properly set). On
such IR, it is illegal to create SCEV expressions, because their internal
simplification process may try to prove some predicates and break when it
stumbles across some broken IR.
The only purpose of using SCEV in this particular place is to attempt to simplify
the generated code slightly. It seems that the result isn't worth it, because
some trivial cases (like addition of zero and multiplication by 1) can be
handled separately if needed, but more generally InstCombine is able to achieve
the goals we want to achieve by using SCEV.
This patch fixes a functional crash described in PR39160, and as side-effect it
also generates a bit smarter code in some simple cases. It also may cause some
optimality loss (i.e. we will now generate `mul` by power of `2` instead of
shift etc), but there is nothing that InstCombine could not handle later. In
case of dire need, we can support more trivial cases just in place.
Note that this patch only fixes one particular case of the general problem that
LV misuses SCEV, attempting to create SCEVs or prove predicates on invalid IR.
The general solution, however, seems complex enough.
Differential Revision: https://reviews.llvm.org/D52881
Reviewed By: fhahn, hsaito
llvm-svn: 343954
Zachary Turner [Mon, 8 Oct 2018 04:44:12 +0000 (04:44 +0000)]
Fix a -Wsign-compare warning.
llvm-svn: 343953
Zachary Turner [Mon, 8 Oct 2018 04:34:41 +0000 (04:34 +0000)]
Fix a compilation failure on non-MSVC compilers.
llvm-svn: 343952
Zachary Turner [Mon, 8 Oct 2018 04:19:16 +0000 (04:19 +0000)]
[PDB] Add the ability to lookup global symbols by name.
The Globals table is a hash table keyed on symbol name, so
it's possible to lookup symbols by name in O(1) time. Add
a function to the globals stream to do this, and add an option
to llvm-pdbutil to exercise this, then use it to write some
tests to verify correctness.
llvm-svn: 343951
Craig Topper [Mon, 8 Oct 2018 03:12:12 +0000 (03:12 +0000)]
Revert r343948 "[LegalizeDAG] Make one of the ReplaceNode signatures take an ArrayRef instead a pointer to an array. Add assert on size of array. NFC"
The assert is failing some asan tests on the bots.
llvm-svn: 343950
Brian Gesiak [Mon, 8 Oct 2018 03:08:39 +0000 (03:08 +0000)]
[coro]Pass rvalue reference for named local variable to return_value
Summary:
Addressing https://bugs.llvm.org/show_bug.cgi?id=37265.
Implements [class.copy]/33 of coroutines TS.
When the criteria for elision of a copy/move operation are met, but not
for an exception-declaration, and the object to be copied is designated by an
lvalue, or when the expression in a return or co_return statement is a
(possibly parenthesized) id-expression that names an object with automatic
storage duration declared in the body or parameter-declaration-clause of the
innermost enclosing function or lambda-expression, overload resolution to select
the constructor for the copy or the return_value overload to call is first
performed as if the object were designated by an rvalue.
Patch by Tanoy Sinha!
Reviewers: modocache, GorNishanov
Reviewed By: modocache, GorNishanov
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D51741
llvm-svn: 343949
Craig Topper [Mon, 8 Oct 2018 02:02:08 +0000 (02:02 +0000)]
[LegalizeDAG] Make one of the ReplaceNode signatures take an ArrayRef instead a pointer to an array. Add assert on size of array. NFC
llvm-svn: 343948
Craig Topper [Mon, 8 Oct 2018 00:04:55 +0000 (00:04 +0000)]
[LegalizeDAG] Move legalization of scatter and masked store from LegalizeVectorOps to LegalizeDAG.
This is where we legalize gather and masked load so this is consistent.
Since these ops are always on vectors I've chosen to go with LegalizeDAG since that's what we do for other vector only ops like BUILD_VECTOR, VECTOR_SHUFFLE, etc. The ScalarizeMaskedMemIntrinsic pass should take care of scalarizing these before SelectionDAG so hopefully we don't need to worry about illegally typed scalar ops being emitted in the legalizing. If we did we would need to do this in LegalizeVectorOps so we could get the second type legalization that runs between LegalizeVectorOps and LegalizeDAG.
llvm-svn: 343947
Fangrui Song [Sun, 7 Oct 2018 17:21:08 +0000 (17:21 +0000)]
[clangd] Migrate to LLVM STLExtras range API
llvm-svn: 343946
Sanjay Patel [Sun, 7 Oct 2018 16:30:42 +0000 (16:30 +0000)]
[DAGCombiner] allow undef elts in vector fadd matching
llvm-svn: 343945
Sanjay Patel [Sun, 7 Oct 2018 16:27:50 +0000 (16:27 +0000)]
[x86] add vector fadd with undef elts test; NFC
llvm-svn: 343944
Sanjay Patel [Sun, 7 Oct 2018 16:13:38 +0000 (16:13 +0000)]
[x86] remove redundant tests; NFC
The equivalent tests were added to the file with related folds in rL343941.
llvm-svn: 343943
Sanjay Patel [Sun, 7 Oct 2018 16:05:37 +0000 (16:05 +0000)]
[DAGCombiner] allow undefs when matching vector splats for fmul folds
llvm-svn: 343942
Sanjay Patel [Sun, 7 Oct 2018 16:00:55 +0000 (16:00 +0000)]
[x86] add vector fmul with undef elts tests; NFC
llvm-svn: 343941
Sanjay Patel [Sun, 7 Oct 2018 15:32:06 +0000 (15:32 +0000)]
[DAGCombiner] allow undef elts in vector fabs/fneg matching
This change is proposed as a part of D44548, but we
need this independently to avoid regressions from improved
undef propagation in SimplifyDemandedVectorElts().
llvm-svn: 343940
Sanjay Patel [Sun, 7 Oct 2018 15:18:30 +0000 (15:18 +0000)]
[DAGCombiner] shorten code for bitcast+fabs fold; NFC
llvm-svn: 343939
Sanjay Patel [Sun, 7 Oct 2018 15:05:39 +0000 (15:05 +0000)]
[x86] add tests for FP logic folding for vectors with undefs; NFC
llvm-svn: 343938
Kirill Bobyrev [Sun, 7 Oct 2018 14:49:41 +0000 (14:49 +0000)]
[clangd] NFC: Migrate to LLVM STLExtras API where possible
This patch improves readability by migrating `std::function(ForwardIt
start, ForwardIt end, ...)` to LLVM's STLExtras range-based equivalent
`llvm::function(RangeT &&Range, ...)`.
Similar change in Clang: D52576.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D52650
llvm-svn: 343937
Sanjay Patel [Sun, 7 Oct 2018 14:46:33 +0000 (14:46 +0000)]
[InstSimplify] add vector test for fneg+fdiv; NFC
This should be fixed with D52934.
llvm-svn: 343936
Simon Pilgrim [Sun, 7 Oct 2018 11:45:46 +0000 (11:45 +0000)]
[SelectionDAG] Respect multiple uses in SimplifyDemandedBits to SimplifyDemandedVectorElts simplification
rL343913 was using SimplifyDemandedBits's original demanded mask instead of the adjusted 'NewMask' that accounts for multiple uses of the op (those variable names really need improving....).
Annoyingly many of the test changes (back to pre-rL343913 state) are actually safe - but only because their multiple uses are all by PMULDQ/PMULUDQ.
Thanks to Jan Vesely (@jvesely) for bisecting the bug.
llvm-svn: 343935
Simon Pilgrim [Sun, 7 Oct 2018 11:24:04 +0000 (11:24 +0000)]
[AARCH64][X86] Remove _nonsplat from test names
As discussed on D50222
llvm-svn: 343934
Craig Topper [Sun, 7 Oct 2018 07:16:44 +0000 (07:16 +0000)]
[LegalizeVectorOps] Make ExpandStrictFPOp return the result corresponding to the result number of the SDValue passed in.
It was always returning the chain which seems to be the result number of the SDValue in the lit tests we have. But I don't know if that's guaranteed.
llvm-svn: 343933
Dorit Nuzman [Sun, 7 Oct 2018 06:57:25 +0000 (06:57 +0000)]
[IAI,LV] Avoid creating interleave-groups for predicated accesses
This patch fixes PR39099.
When strided loads are predicated, each of them will form an interleaved-group
(with gaps). However, subsequent stages of vectorization (planning and
transformation) assume that if a load is part of an Interleave-Group it is not
predicated, resulting in wrong code - unmasked wide loads are created.
The Interleaving Analysis does take care not to have conditional interleave
groups of size > 1, but until we extend the planning and transformation stages
to support masked-interleave-groups we should also avoid having them for
size == 1.
Reviewers: Ayal, hsaito, dcaballe, fhahn
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D52682
llvm-svn: 343931
Alex Bradbury [Sun, 7 Oct 2018 06:53:46 +0000 (06:53 +0000)]
[RISCV] Introduce alu8.ll and alu16.ll tests
These track the quality of generated code for simple arithmetic operations
that were legalised from non-native types.
llvm-svn: 343930
Lang Hames [Sun, 7 Oct 2018 01:08:02 +0000 (01:08 +0000)]
[ORC] Consume unhandled errors in unit test.
This should fix the failures on the debug buildbots.
llvm-svn: 343929
Lang Hames [Sat, 6 Oct 2018 23:03:59 +0000 (23:03 +0000)]
[ORC] Add a 'remove' method to JITDylib to remove symbols.
Symbols can be removed provided that all are present in the JITDylib and none
are currently in the materializing state. On success all requested symbols are
removed. On failure an error is returned and no symbols are removed.
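The all-or-nothing behaviour can be sketched with a toy model in Python.
This is not the ORC API: `ToyDylib`, its state strings, and the exception
types are hypothetical stand-ins for the real JITDylib and its error
handling.

```python
# Toy model of all-or-nothing removal; not the ORC JITDylib API.
class ToyDylib:
    def __init__(self):
        self.symbols = {}  # symbol name -> state string

    def define(self, name, state="ready"):
        self.symbols[name] = state

    def remove(self, names):
        """Remove every symbol in `names`, or raise and remove none."""
        names = set(names)
        # Validate the whole request before touching the table.
        missing = sorted(n for n in names if n not in self.symbols)
        if missing:
            raise KeyError(f"symbols not found: {missing}")
        busy = sorted(n for n in names if self.symbols[n] == "materializing")
        if busy:
            raise RuntimeError(f"symbols still materializing: {busy}")
        # Only now is it safe to mutate: nothing below can fail.
        for n in names:
            del self.symbols[n]
```

Validating the full set first is what keeps the failure path free of
partial removals.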
llvm-svn: 343928
Lang Hames [Sat, 6 Oct 2018 23:02:06 +0000 (23:02 +0000)]
[ORC] Pass symbol name to discard by const reference.
This saves some unnecessary atomic ref-counting operations.
llvm-svn: 343927
Simon Pilgrim [Sat, 6 Oct 2018 22:13:44 +0000 (22:13 +0000)]
[X86] getFauxShuffleMask - Handle undef + sentinel values in subvector insertion
llvm-svn: 343926
Simon Pilgrim [Sat, 6 Oct 2018 20:24:27 +0000 (20:24 +0000)]
[X86][SSE] Add SSE41 vector int2fp tests
llvm-svn: 343925
Simon Pilgrim [Sat, 6 Oct 2018 17:18:41 +0000 (17:18 +0000)]
[X86][AVX] Ensure resolveTargetShuffleInputs shuffle masks are the correct width
Don't handle ZERO_EXTEND style shuffles until we support bitcasts. Found by inspection.
llvm-svn: 343924
Marshall Clow [Sat, 6 Oct 2018 15:07:03 +0000 (15:07 +0000)]
Papers and Issues for San Diego
llvm-svn: 343923
Simon Pilgrim [Sat, 6 Oct 2018 14:51:14 +0000 (14:51 +0000)]
[X86] combinePMULDQ - add op back to worklist if SimplifyDemandedBits succeeds on either operand
Prevents missing other simplifications that may occur deep in the operand chain where CommitTargetLoweringOpt won't add the PMULDQ back to the worklist itself
llvm-svn: 343922
Simon Pilgrim [Sat, 6 Oct 2018 14:26:38 +0000 (14:26 +0000)]
[X86] Regenerate LSR loop iteration test
llvm-svn: 343921
Sanjay Patel [Sat, 6 Oct 2018 14:11:05 +0000 (14:11 +0000)]
[x86] add test for masked store with extra shift op; NFC
llvm-svn: 343920
Simon Pilgrim [Sat, 6 Oct 2018 13:49:31 +0000 (13:49 +0000)]
[X86][SSE] SimplifyDemandedVectorEltsForTargetNode - simplify PSHUFB masks
Attempt to simplify PSHUFB masks (even non-constant ones) - we should probably be able to simplify other variable shuffles as well as the need arises.
llvm-svn: 343919
Simon Pilgrim [Sat, 6 Oct 2018 13:29:08 +0000 (13:29 +0000)]
[X86] Use the SimplifyDemandedBits wrappers where possible. NFCI.
Leave the wrapper to handle TargetLowering::TargetLoweringOpt and CommitTargetLoweringOpt.
llvm-svn: 343918
Simon Pilgrim [Sat, 6 Oct 2018 11:59:31 +0000 (11:59 +0000)]
Revert rL343916: Fix -Wmissing-braces warning. NFCI.
llvm-svn: 343917
Simon Pilgrim [Sat, 6 Oct 2018 11:46:27 +0000 (11:46 +0000)]
Fix -Wmissing-braces warning. NFCI.
llvm-svn: 343916
Simon Pilgrim [Sat, 6 Oct 2018 11:12:59 +0000 (11:12 +0000)]
Wdocumentation fix
llvm-svn: 343915
Simon Pilgrim [Sat, 6 Oct 2018 11:09:15 +0000 (11:09 +0000)]
Wdocumentation fix
llvm-svn: 343914
Simon Pilgrim [Sat, 6 Oct 2018 10:20:04 +0000 (10:20 +0000)]
[SelectionDAG] Add SimplifyDemandedBits to SimplifyDemandedVectorElts simplification
This patch enables SimplifyDemandedBits to call SimplifyDemandedVectorElts in cases where the demanded bits mask covers entire elements of a bitcasted source vector.
There are a couple of cases here where simplification at a deeper level (such as through bitcasts) prevents further simplification - CommitTargetLoweringOpt only adds immediate uses/users back to the worklist when we might want to combine the original caller again to see what else it can simplify.
As well as that I had to disable handling of bool vector until SimplifyDemandedVectorElts better supports some of their opcodes (SETCC, shifts etc.).
Fixes PR39178
Differential Revision: https://reviews.llvm.org/D52935
llvm-svn: 343913
Fangrui Song [Sat, 6 Oct 2018 07:00:50 +0000 (07:00 +0000)]
[clangd] Remove unused headers from CodeComplete.cpp
queue is not used after index-provided completions' merge with those from Sema
USRGeneration.h is not used after introduction of getSymbolID
llvm-svn: 343912
Alex Bradbury [Sat, 6 Oct 2018 06:09:46 +0000 (06:09 +0000)]
[RISCV] Compress addiw rd, x0, simm6 to c.li rd, simm6
A pattern was present for addi rd, x0, simm6 but not addiw which is
semantically identical when the source register is x0. This patch addresses
that, and the benefit can be seen in rv64c-aliases-valid.s.
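The equivalence can be checked with a small model of the two
instructions on RV64. The Python below is an illustrative sketch, not
part of the patch; the helper names are invented and immediates are
passed as already sign-extended Python integers.

```python
# Small RV64 model of addi/addiw; helper names are illustrative only.
MASK64 = (1 << 64) - 1


def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of `value` to a 64-bit register value."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK64


def addi(rs1: int, imm: int) -> int:
    """64-bit add of a register value and a signed immediate."""
    return (rs1 + imm) & MASK64


def addiw(rs1: int, imm: int) -> int:
    """32-bit add whose result is sign-extended to 64 bits."""
    return sext(rs1 + imm, 32)


# With x0 (always zero) as the source, both yield the sign-extended immediate.
assert all(addi(0, imm) == addiw(0, imm) for imm in range(-32, 32))
```

With a non-zero source the two can differ, for example when the 32-bit
sum overflows, which is why the pattern is restricted to x0.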
llvm-svn: 343911
Tom Stellard [Sat, 6 Oct 2018 03:32:43 +0000 (03:32 +0000)]
AMDGPU: Consolidate SMRD TableGen patterns
Summary:
Merge the SMRD patterns for CI into the same multiclass as the
patterns for other sub-targets.
This removes some duplicate code and will make it easier for some
future GlobalISel changes I would like to do.
Reviewers: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D52557
llvm-svn: 343909
Aaron Puchert [Sat, 6 Oct 2018 01:09:28 +0000 (01:09 +0000)]
Thread safety analysis: Handle conditional expression in getTrylockCallExpr
Summary:
We unwrap conditional expressions containing try-lock functions.
Additionally we don't acquire on conditional expression branches, since
that is usually not helpful. When joining the branches we would almost
certainly get a warning then.
Hopefully fixes an issue that was raised in D52398.
Reviewers: aaron.ballman, delesley, hokein
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D52888
llvm-svn: 343902
Jordan Rupprecht [Fri, 5 Oct 2018 23:25:39 +0000 (23:25 +0000)]
[llvm-ar] Use POSIX-specified timestamps for 'tv'.
Summary:
The POSIX spec says:
```
If the −t option is used with the −v option, the standard output format shall be:
"%s %u/%u %u %s %d %d:%d %d %s\n", <member mode>, <user ID>,
<group ID>, <number of bytes in member>,
<abbreviated month>, <day-of-month>, <hour>,
<minute>, <year>, <file>
where:
...
<abbreviated month>
Equivalent to the format of the %b conversion specification format in date.
<day-of-month>
Equivalent to the format of the %e conversion specification format in date.
<hour> Equivalent to the format of the %H conversion specification format in date.
<minute> Equivalent to the format of the %M conversion specification format in date.
<year> Equivalent to the format of the %Y conversion specification format in date.
```
This actually used to be the format printed by llvm-ar. It was apparently accidentally changed (see r207385 followed by comments in r207387). This makes it conform to GNU ar for easier replacement.
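As an illustration of the date portion of that format (`%b %e %H:%M %Y`),
here is a Python sketch. It is not llvm-ar's implementation; the function
name is invented, and using UTC is an assumption made only to keep the
example deterministic.

```python
# Sketch of the POSIX 'ar -tv' date fields; not llvm-ar's implementation.
from datetime import datetime, timezone

_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def posix_tv_timestamp(mtime_seconds: int) -> str:
    """Format a member's mtime as '<%b> <%e> <%H>:<%M> <%Y>'."""
    t = datetime.fromtimestamp(mtime_seconds, tz=timezone.utc)  # UTC: assumption
    # %e is the day of month padded with a space to two characters.
    return f"{_MONTHS[t.month - 1]} {t.day:2d} {t:%H:%M} {t.year}"
```

The month table avoids depending on the process locale, and the manual
day padding avoids `%e`, which not every platform's `strftime` supports.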
Reviewers: MaskRay
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52940
llvm-svn: 343901
Vedant Kumar [Fri, 5 Oct 2018 23:23:15 +0000 (23:23 +0000)]
Add support for artificial tail call frames
This patch teaches lldb to detect when there are missing frames in a
backtrace due to a sequence of tail calls, and to fill in the backtrace
with artificial tail call frames when this happens. This is only done
when the execution history can be determined from the call graph and
from the return PC addresses of calls on the stack. Ambiguous sequences
of tail calls (e.g. anything involving tail calls and recursion) are
detected and ignored.
Depends on D49887.
Differential Revision: https://reviews.llvm.org/D50478
llvm-svn: 343900
Vedant Kumar [Fri, 5 Oct 2018 23:14:13 +0000 (23:14 +0000)]
Relax a data formatter test
Before inspecting the contents of a list, make sure that we've stepped
past the push_back() that inserts the element we're interested in.
llvm-svn: 343899
Fedor Sergeev [Fri, 5 Oct 2018 22:32:01 +0000 (22:32 +0000)]
[New PM][PassTiming] implement -time-passes for the new pass manager
Enable time-passes functionality through PassInstrumentation callbacks
for passes and analyses.
TimePassesHandler class keeps all the callbacks, the timing data as it
is being collected as well as the stack of currently active timers.
Parts of the fix that might be somewhat unobvious:
- mapping of passes into Timer (TimingData) can not be done per-instance.
PassID name provided into the callback is common for all the pass invocations.
Thus the only way to get a timing with reasonable granularity is to collect
timing data per pass invocation, getting a new timer for each BeforePass.
Hence the key for TimingData uses a pair of <StringRef/unsigned count> to
uniquely identify a pass invocation.
- consequently, this new-pass-manager implementation performs no aggregation
of timing data, reporting timings for each pass invocation separately.
In that it differs from legacy-pass-manager time-passes implementation that
reports timing data aggregated per pass instance.
- pass managers and adaptors are not tracked, similar to how pass managers are
not tracked in legacy time-passes.
- TimerStack tracks timers that are active, each BeforePass pushes the new timer
on stack, each AfterPass pops active timer from stack and stops it.
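The keying and stack discipline described above can be sketched in
Python. This is a toy model, not the TimePassesHandler class: the class
and method names are invented, and it simply records wall-clock time per
invocation without modelling how nested timers interact.

```python
# Toy model of per-invocation pass timing; not LLVM's TimePassesHandler.
import time
from collections import defaultdict


class ToyTimePasses:
    def __init__(self):
        self.invocations = defaultdict(int)  # pass ID -> runs seen so far
        self.timings = {}                    # (pass ID, count) -> seconds
        self.stack = []                      # active (key, start time) pairs

    def before_pass(self, pass_id: str) -> None:
        # The pass ID is shared by all invocations, so add a running count
        # to get a unique key for this particular invocation.
        self.invocations[pass_id] += 1
        key = (pass_id, self.invocations[pass_id])
        self.stack.append((key, time.perf_counter()))

    def after_pass(self, pass_id: str) -> None:
        # Pop the innermost active timer and record its elapsed time.
        key, start = self.stack.pop()
        assert key[0] == pass_id, "BeforePass/AfterPass calls must nest"
        self.timings[key] = time.perf_counter() - start
```

Running the same pass twice produces two separate entries, keyed
`("name", 1)` and `("name", 2)`, matching the lack of aggregation noted
above.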
Reviewers: chandlerc, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D51276
llvm-svn: 343898
Joel Jones [Fri, 5 Oct 2018 22:23:21 +0000 (22:23 +0000)]
[AArch64] -mcpu=native CPU detection for Cavium processors
This small patch updates the CPU detection for Cavium processors when
-mcpu=native is passed on compile-line.
Patch by Stefan Teleman
Differential Revision: https://reviews.llvm.org/D51939
llvm-svn: 343897
Petr Hosek [Fri, 5 Oct 2018 22:16:37 +0000 (22:16 +0000)]
[llvm-nm] Update all tests to redirect stderr to stdout
This addresses the breakage introduced in r343887.
llvm-svn: 343896
Matthias Braun [Fri, 5 Oct 2018 22:00:13 +0000 (22:00 +0000)]
X86, AArch64, ARM: Do not attach debug location to spill/reload instructions
This rebases and recommits r343520. hwasan should be fixed now and this
shouldn't break the tests anymore.
Spill/reload instructions are artificially generated by the compiler and
have no relation to the original source code. So the best thing to do is
not attach any debug location to them (instead of just taking the next
debug location we find on following instructions).
Differential Revision: https://reviews.llvm.org/D52125
llvm-svn: 343895
Mandeep Singh Grang [Fri, 5 Oct 2018 21:57:41 +0000 (21:57 +0000)]
[COFF, ARM64] Add _InterlockedAdd intrinsic
Reviewers: rnk, mstorsjo, compnerd, TomTan, haripul, javed.absar, efriedma
Reviewed By: efriedma
Subscribers: efriedma, kristof.beyls, chrib, jfb, cfe-commits
Differential Revision: https://reviews.llvm.org/D52811
llvm-svn: 343894
Vedant Kumar [Fri, 5 Oct 2018 21:54:58 +0000 (21:54 +0000)]
Specify -mtriple=x86_64 in an X86-specific dwarf test
On the PPC bot, the %llc_dwarf substitution does not contain an -mtriple
argument. This can cause the wrong backend to be exercised.
This causes issues because the backends differ in when they decide to
emit tail calls:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/12440
This is mostly a speculative fix as I don't have a PPC machine to test
with.
llvm-svn: 343893
James Y Knight [Fri, 5 Oct 2018 21:53:51 +0000 (21:53 +0000)]
Emit CK_NoOp casts in C mode, not just C++.
Previously, it had been using CK_BitCast even for casts that only
change const/restrict/volatile. Now it will use CK_NoOp where
appropriate.
This is an alternate solution to r336746.
Differential Revision: https://reviews.llvm.org/D52918
llvm-svn: 343892
Simon Pilgrim [Fri, 5 Oct 2018 21:44:19 +0000 (21:44 +0000)]
[X86][AVX] Limit getFauxShuffleMask INSERT_SUBVECTOR support to 2 inputs
rL343853 didn't limit the number of subinputs, but we don't currently support faux shuffles with more than 2 total inputs, so put a limiter in place until this is fixed.
Found by Artem Dergachev.
llvm-svn: 343891
Vedant Kumar [Fri, 5 Oct 2018 21:44:15 +0000 (21:44 +0000)]
[LiveDebugValues] Extend var ranges through artificial blocks
ASan often introduces basic blocks consisting exclusively of
instructions without debug locations, or with line 0 debug locations.
LiveDebugValues needs to extend variable ranges through these artificial
blocks. Otherwise, a lot of variables disappear -- even at -O0.
Typically, LiveDebugValues does not extend a variable's range into a
block unless the block is essentially "part of" the variable's scope
(for a precise definition, see LexicalScopes::dominates). This patch
relaxes the lexical dominance check for artificial blocks.
This makes the following Swift program debuggable at -O0:
```
var x = 100
print("x = \(x)")
```
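The relaxed check described above can be sketched roughly as follows. This is a Python illustration only, not the LiveDebugValues code: the function names are hypothetical, each instruction is modelled as just its debug line (None meaning "no debug location"), and the result of the lexical dominance test is passed in as a plain boolean.

```python
def is_artificial_block(instruction_lines):
    """True if every instruction lacks a debug location or sits on line 0.

    Hypothetical model: each entry is a line number, or None when the
    instruction has no debug location. An empty block counts as artificial.
    """
    return all(line is None or line == 0 for line in instruction_lines)


def may_extend_range_into(instruction_lines, scope_dominates_block):
    """Relaxed rule: extend a variable's range into the block if the
    variable's scope covers the block, or if the block is artificial."""
    return scope_dominates_block or is_artificial_block(instruction_lines)


# An ASan-style block with no real source lines is now extended through,
# even though the lexical dominance check fails for it.
assert may_extend_range_into([None, 0, None], scope_dominates_block=False)
# A block with real source lines still needs the dominance check to pass.
assert not may_extend_range_into([None, 12], scope_dominates_block=False)
```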
rdar://39127144
Differential Revision: https://reviews.llvm.org/D52921
llvm-svn: 343890
Vedant Kumar [Fri, 5 Oct 2018 21:44:00 +0000 (21:44 +0000)]
Clarify debug output in LiveDebugValues
MachineBasicBlocks often do not have names, so it helps to refer to them
by block number when printing debug messages.
llvm-svn: 343889
Vedant Kumar [Fri, 5 Oct 2018 21:28:14 +0000 (21:28 +0000)]
Disable the dwarf callsite attrs test on Windows
The Windows formats don't understand relocations inside of AT_return_pc.
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/270
llvm-svn: 343888
Petr Hosek [Fri, 5 Oct 2018 21:10:03 +0000 (21:10 +0000)]
[llvm-nm] Write "no symbol" output to stderr
This matches the output of binutils' nm and ensures that any scripts
or tools that use nm and expect empty output in case there are no symbols
don't break.
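As a rough illustration of why the stream matters, here is a small Python sketch. The function name and the exact message text are assumptions for the example, not llvm-nm's real implementation.

```python
import io
import sys


def print_symbols(symbol_names, out=sys.stdout, err=sys.stderr):
    """Write symbol names to `out`; report an empty table on `err` instead.

    Hypothetical helper: the message text is illustrative only.
    """
    if not symbol_names:
        err.write("no symbols\n")
        return
    for name in symbol_names:
        out.write(name + "\n")


# A script that captures only stdout sees empty output for an empty table.
captured_out, captured_err = io.StringIO(), io.StringIO()
print_symbols([], out=captured_out, err=captured_err)
assert captured_out.getvalue() == ""
assert captured_err.getvalue() == "no symbols\n"
```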
Differential Revision: https://reviews.llvm.org/D52943
llvm-svn: 343887
Vedant Kumar [Fri, 5 Oct 2018 21:05:31 +0000 (21:05 +0000)]
Avoid hardcoding PC addresses in a dwarf test
The PCs appear to vary from builder-to-builder:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/20053
llvm-svn: 343886
Jessica Paquette [Fri, 5 Oct 2018 21:02:46 +0000 (21:02 +0000)]
[GlobalISel] Add llvm.invariant.start and llvm.invariant.end
Port over the implementation in SelectionDAGBuilder.cpp into the IRTranslator
and update the arm64-irtranslator test.
These were causing fallbacks in CTMark/Bullet (-Rpass-missed=gisel-select),
and this patch fixes that.
https://reviews.llvm.org/D52945
llvm-svn: 343885
David Blaikie [Fri, 5 Oct 2018 20:55:20 +0000 (20:55 +0000)]
dwarfdump: Avoid parsing units unnecessarily
NFC-ish (the parsing of the units is not a functional change - no
errors/warnings are emitted during the shallow parsing - though without
parsing them here, the "max version" would be wrong (still zero) later
on, so in those cases the units do need to be parsed)
llvm-svn: 343884
Vedant Kumar [Fri, 5 Oct 2018 20:37:17 +0000 (20:37 +0000)]
[DebugInfo] Add support for DWARF5 call site-related attributes
DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which
indicates that all calls (both regular and tail) within the subprogram
have call site entries. The information within these call site entries
can be used by a debugger to populate backtraces with synthetic tail
call frames.
Tail calling frames go missing in backtraces because the frame of the
caller is reused by the callee. Call site entries allow a debugger to
reconstruct a sequence of (tail) calls which led from one function to
another. This improves backtrace quality. There are limitations: tail
recursion isn't handled, variables within synthetic frames may not
survive to be inspected, etc. This approach is not novel, see:
https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf
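To make the reconstruction idea concrete, here is a minimal Python sketch of how a debugger might use call site entries to fill in missing tail call frames. It is an assumption-laden illustration, not the algorithm of any particular debugger: the call site information is modelled as a hypothetical dictionary from a function name to the functions it tail-calls, and a chain is only reported when exactly one path exists.

```python
def find_tail_call_path(tail_calls, start, target):
    """Return the unique chain of tail calls from `start` to `target`.

    `tail_calls` is a hypothetical summary of call site entries: it maps a
    function name to the names of the functions it tail-calls. Returns None
    when no path exists or when the path is ambiguous.
    """
    paths = []

    def walk(func, path):
        if func == target:
            paths.append(path)
            return
        for callee in tail_calls.get(func, ()):
            if callee in path:
                # Skip cycles: tail recursion is not handled in this sketch.
                continue
            walk(callee, path + [callee])

    walk(start, [start])
    return paths[0] if len(paths) == 1 else None


# The backtrace shows only "a" calling into "c"; "b" was a tail-calling
# frame whose stack space was reused, so it is reconstructed from the table.
assert find_tail_call_path({"a": ["b"], "b": ["c"]}, "a", "c") == ["a", "b", "c"]
```

When two different chains lead to the same callee, the sketch gives up rather than guessing, which mirrors the kind of limitation the message lists.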
This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers
to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation
support needed to emit standards-compliant call site entries. For easier
deployment, when the debugger tuning is LLDB, the DWARF requirement is
adjusted to v4.
Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo
clang binary. Its dSYM passed verification and grew by 1.4% compared to
the baseline. 151,879 call site entries were added.
rdar://42001377
Differential Revision: https://reviews.llvm.org/D49887
llvm-svn: 343883
Sanjay Patel [Fri, 5 Oct 2018 20:26:54 +0000 (20:26 +0000)]
[x86] make blend tests resistant to demanded elements improvements; NFC
Similar to rL343858 - we don't want these tests to lose value with D52912.
llvm-svn: 343882
Mandeep Singh Grang [Fri, 5 Oct 2018 19:49:36 +0000 (19:49 +0000)]
[COFF, ARM64] Add _InterlockedCompareExchangePointer_nf intrinsic
Reviewers: rnk, mstorsjo, compnerd, TomTan, haripul, efriedma
Reviewed By: efriedma
Subscribers: efriedma, kristof.beyls, chrib, jfb, cfe-commits
Differential Revision: https://reviews.llvm.org/D52807
llvm-svn: 343881
Reid Kleckner [Fri, 5 Oct 2018 19:46:51 +0000 (19:46 +0000)]
Fix dwarf-no-source-loc.ll path separator on Windows
llvm-svn: 343880
Martin Storsjo [Fri, 5 Oct 2018 19:43:24 +0000 (19:43 +0000)]
[COFF] Do MinGW specific entry/subsystem inference
ld.bfd doesn't do any inference of subsystem; unless the windows
subsystem is specified, the console subsystem is used.
For the console subsystem, the entry point is called mainCRTStartup,
regardless of whether the user code entry point is main or wmain.
The same goes for the windows subsystem, where the entry point always
is WinMainCRTStartup, for both WinMain and wWinMain in user code.
One detail that we don't emulate is that if the inferred entry point
is undefined, ld.bfd silently just sets the entry point to the start
of the image. And if an explicit entry point is set, but it is
undefined, the link still succeeds but the linker warns about the
entry point not being found.
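The inference rules described above amount to a small lookup, sketched here in Python. The function and table names are hypothetical; only the subsystem names and entry point symbols come from the message, and the undefined-entry-point behaviour of ld.bfd is deliberately left out, as it is in the patch.

```python
# Entry point symbol per subsystem, as described in the message above.
ENTRY_POINTS = {
    "console": "mainCRTStartup",     # used for both main and wmain
    "windows": "WinMainCRTStartup",  # used for both WinMain and wWinMain
}


def infer_mingw_subsystem_and_entry(subsystem=None):
    """Return (subsystem, entry point) under the MinGW rules.

    No inference is done on the subsystem: unless the windows subsystem is
    requested explicitly, the console subsystem is used.
    """
    subsystem = subsystem or "console"
    return subsystem, ENTRY_POINTS[subsystem]


assert infer_mingw_subsystem_and_entry() == ("console", "mainCRTStartup")
assert infer_mingw_subsystem_and_entry("windows") == ("windows", "WinMainCRTStartup")
```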
Differential Revision: https://reviews.llvm.org/D52931
llvm-svn: 343879
Martin Storsjo [Fri, 5 Oct 2018 19:43:20 +0000 (19:43 +0000)]
[docs] Mention some notable features in the release notes
Differential Revision: https://reviews.llvm.org/D52908
llvm-svn: 343878
Martin Storsjo [Fri, 5 Oct 2018 19:43:16 +0000 (19:43 +0000)]
[COFF] Cope with GCC produced weak aliases referring to comdat functions
For certain cases of inline functions written to comdat sections,
GCC 5.x produces a weak symbol in addition, which would end up
undefined in some cases.
This no longer seems to happen with GCC 6.x or newer though.
Differential Revision: https://reviews.llvm.org/D52602
llvm-svn: 343877
Reid Kleckner [Fri, 5 Oct 2018 18:48:53 +0000 (18:48 +0000)]
Revert r343606/r342652 "[winasan] Unpoison the stack in NtTerminateThread"
This still seems to be causing pnacl + asan to crash.
llvm-svn: 343876
Artem Belevich [Fri, 5 Oct 2018 18:39:58 +0000 (18:39 +0000)]
[CUDA] Use all 64 bits of GUID in __nv_module_id
getGUID() returns a uint64_t and "%x" only prints 32 bits of it.
Use PRIx64 format string to print all 64 bits.
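The effect of the truncation can be illustrated with a short Python sketch. It only emulates the behaviour described above by masking to the low 32 bits; the function names and GUID values are made up for the example and do not come from the patch.

```python
def format_guid_32(guid):
    """Emulate the old behaviour: only the low 32 bits are printed."""
    return format(guid & 0xFFFFFFFF, "x")


def format_guid_64(guid):
    """Emulate the fixed behaviour: all 64 bits are printed."""
    return format(guid & 0xFFFFFFFFFFFFFFFF, "x")


# Two hypothetical GUIDs that differ only in their upper 32 bits.
guid_a = 0x1234567890ABCDEF
guid_b = 0xFEDCBA9890ABCDEF

# With 32 bits the two module IDs collide; with 64 bits they stay distinct.
assert format_guid_32(guid_a) == format_guid_32(guid_b) == "90abcdef"
assert format_guid_64(guid_a) == "1234567890abcdef"
assert format_guid_64(guid_b) == "fedcba9890abcdef"
```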
Differential Revision: https://reviews.llvm.org/D52938
llvm-svn: 343875
Matthias Braun [Fri, 5 Oct 2018 18:29:24 +0000 (18:29 +0000)]
DwarfDebug: Pick next location in case of missing location at block begin
Context: Compiler generated instructions do not have a debug location
assigned to them. However emitting 0-line records for all of them bloats
the line tables for very little benefit so we usually avoid doing that.
Not emitting anything will lead to the previous debug location getting
applied to the locationless instructions. This is not desirable for
block begin and after labels. Previously we would simply emit
line-0 records in this case, this patch changes the behavior to do a
forward search for a debug location in these cases before emitting a
line-0 record to further reduce line table bloat.
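The forward search can be pictured with the following Python sketch. It is a simplification under stated assumptions: a block is modelled as a list of debug lines (None for an instruction without a location), the function name is hypothetical, and any extra conditions the real patch applies to the search are not modelled.

```python
def line_for_block_begin(instruction_lines):
    """Pick the line to emit at the start of a block.

    Search forward for the first instruction that carries a debug location
    and reuse it; fall back to a line-0 record only if none is found.
    """
    for line in instruction_lines:
        if line is not None:
            return line
    return 0


# Two compiler-generated instructions precede the first located one, so the
# block begin borrows line 42 instead of getting its own line-0 record.
assert line_for_block_begin([None, None, 42, 43]) == 42
# With no location anywhere in the block, a line-0 record is still emitted.
assert line_for_block_begin([None, None]) == 0
```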
Inspired by the discussion in https://reviews.llvm.org/D52862
llvm-svn: 343874
Alex Bradbury [Fri, 5 Oct 2018 18:25:55 +0000 (18:25 +0000)]
[RISCV] Regenerate several tests now enableMultipleCopyHints is enabled by default
r343851 caused codegen changes in several tests. This patch regenerates them.
llvm-svn: 343873
Nico Weber [Fri, 5 Oct 2018 18:22:21 +0000 (18:22 +0000)]
clang-format: Don't insert spaces in front of :: for Java 8 Method References.
The existing code kept the space if it was there for identifiers, and it didn't
handle `this`. After this patch, for Java `this` is handled in addition to
identifiers, and existing space is always stripped between identifier and `::`.
Also accept `::` in addition to `.` in front of `<` in `foo::<T>bar` generic
calls.
Differential Revision: https://reviews.llvm.org/D52842
llvm-svn: 343872
Craig Topper [Fri, 5 Oct 2018 18:13:36 +0000 (18:13 +0000)]
[X86] Don't promote i16 compares to i32 if the immediate will fit in 8 bits.
The comments in this code say we were trying to avoid 16-bit immediates, but if the immediate fits in 8 bits this isn't an issue. This avoids creating a zero extend that probably won't go away.
The movmskb related changes are interesting. The movmskb instruction writes a 32-bit result, but fills the upper bits with 0. So the zero_extend we were previously emitting was free, but we turned a -1 immediate that would fit in 8 bits into a 32-bit immediate, so it was still bad.
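The -1 case can be checked with a little arithmetic, sketched here in Python. The helper names are hypothetical, and "fits in 8 bits" is taken to mean that the value is representable as a sign-extended 8-bit immediate, which is an assumption about the encoding being discussed.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a signed integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def fits_in_imm8(value, bits):
    """True if a `bits`-wide immediate is representable as a signed 8-bit one."""
    return -128 <= sign_extend(value, bits) <= 127


# As a 16-bit immediate, -1 (0xFFFF) fits in a sign-extended 8-bit field.
assert fits_in_imm8(0xFFFF, 16)
# After zero-extending the compare to 32 bits, the same constant becomes
# 0x0000FFFF (65535), which needs a full-width immediate.
assert not fits_in_imm8(0x0000FFFF, 32)
```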
llvm-svn: 343871
Kamil Rytarowski [Fri, 5 Oct 2018 18:07:34 +0000 (18:07 +0000)]
Unwind local macro DEFINE_INTERNAL()
No functional change intended.
This is a follow up of a suggestion from D52793.
llvm-svn: 343870
Jonathan Peyton [Fri, 5 Oct 2018 17:59:39 +0000 (17:59 +0000)]
[OpenMP] Convert KMP_DYNAMIC_LIB to a 0 or 1 guard everywhere
llvm-svn: 343869
Simon Pilgrim [Fri, 5 Oct 2018 17:57:29 +0000 (17:57 +0000)]
[X86] Move ReadAfterLd functionality into X86FoldableSchedWrite (PR36957)
Currently we hardcode instructions with ReadAfterLd if the register operands don't need to be available until the folded load has completed. This doesn't take into account the different load latencies of different memory operands (PR36957).
This patch adds a ReadAfterFold def into X86FoldableSchedWrite to replace ReadAfterLd, allowing us to specify the load latency at a scheduler class level.
I've added ReadAfterVec*Ld classes that match the XMM/Scl, XMM and YMM/ZMM WriteVecLoad classes that we currently use; we can tweak these values in future patches once this infrastructure is in place.
Differential Revision: https://reviews.llvm.org/D52886
llvm-svn: 343868
James Y Knight [Fri, 5 Oct 2018 17:49:48 +0000 (17:49 +0000)]
Emit diagnostic note when calling an invalid function declaration.
The comment said it was intentionally not emitting any diagnostic
because the declaration itself was already diagnosed. However,
everywhere else that wants to not emit a diagnostic without an extra
note emits note_invalid_subexpr_in_const_expr instead, which gets
suppressed later.
This was the only place which did not emit a diagnostic note.
Differential Revision: https://reviews.llvm.org/D52919
llvm-svn: 343867
Jonathan Peyton [Fri, 5 Oct 2018 17:47:58 +0000 (17:47 +0000)]
[OpenMP] Fix KMP_DYNAMIC_LIB to be dependent on LIBOMP_ENABLE_SHARED
The KMP_DYNAMIC_LIB guard was hard set to 1. This patch has the guard depend
on CMake variable LIBOMP_ENABLE_SHARED.
llvm-svn: 343866
Sanjay Patel [Fri, 5 Oct 2018 17:42:19 +0000 (17:42 +0000)]
[SelectionDAG] allow undefs when matching splat constants
And use that to transform fsub with zero constant operands.
The integer part isn't used yet, but it is proposed for use in
D44548, so adding both enhancements here makes that
patch simpler.
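The idea of matching a splat while tolerating undef lanes can be sketched in Python as follows. This is not the SelectionDAG API: the function name and the `allow_undefs` parameter are hypothetical, a vector is modelled as a plain list, and None stands in for an undef element.

```python
UNDEF = None  # stand-in for an undef vector element


def get_splat_value(elements, allow_undefs=False):
    """Return the value shared by all defined elements, or None if the
    vector is not a splat (or contains undefs that are not allowed)."""
    splat = None
    for elt in elements:
        if elt is UNDEF:
            if not allow_undefs:
                return None
            continue  # undef lanes are compatible with any splat value
        if splat is None:
            splat = elt
        elif elt != splat:
            return None
    return splat


# <0.0, undef, 0.0, 0.0> matches as a zero splat only when undefs are allowed.
assert get_splat_value([0.0, UNDEF, 0.0, 0.0], allow_undefs=True) == 0.0
assert get_splat_value([0.0, UNDEF, 0.0, 0.0]) is None
```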
llvm-svn: 343865
Adrian Prantl [Fri, 5 Oct 2018 17:41:30 +0000 (17:41 +0000)]
Format the dwarfdump --statistics version as an integer instead of a string.
llvm-svn: 343864
Sanjay Patel [Fri, 5 Oct 2018 17:36:51 +0000 (17:36 +0000)]
[x86] add test for (X - 0.0) vector with undef elts; NFC
llvm-svn: 343863