Artem Belevich [Mon, 28 Nov 2016 19:55:46 +0000 (19:55 +0000)]
Revert r287637 "[wasm] hack around test failure after r287553."
-cgp-freq-ratio-to-skip-merge option was removed by rollback in r288052.
llvm-svn: 288055
Andrey Churbanov [Mon, 28 Nov 2016 19:23:09 +0000 (19:23 +0000)]
Cleanup: fixed memory leaks in warning printing; cleaned up some memory freeing; fixed poor indentation and one typo.
Patch by Victor Campos.
Differential Revision: https://reviews.llvm.org/D26786
llvm-svn: 288054
Stanislav Mekhanoshin [Mon, 28 Nov 2016 18:58:49 +0000 (18:58 +0000)]
[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Codegen prepare sinks comparisons close to a user if we have only one register
for conditions. For AMDGPU we have many SGPRs capable of holding vector conditions.
Changed BE to report we have many condition registers. That way IR LICM pass
would hoist an invariant comparison out of a loop and codegen prepare will not
sink it.
With that done a condition is calculated in one block and used in another.
Current behavior is to store workitem's condition in a VGPR using v_cndmask_b32
and then restore it with yet another v_cmp instruction from that v_cndmask's
result. To mitigate the issue, propagation of the source SGPR pair in place of v_cmp
is implemented. An additional side effect of this is that we may consume fewer VGPRs
at the cost of more SGPRs if multiple conditions need to be held, and
that is a clear win in most cases.
Differential Revision: https://reviews.llvm.org/D26114
llvm-svn: 288053
Joerg Sonnenberger [Mon, 28 Nov 2016 18:56:54 +0000 (18:56 +0000)]
Revert r287553: [CodeGenPrep] Skip merging empty case blocks
It results in assertions in lib/Analysis/BlockFrequencyInfoImpl.cpp line
670 ("Expected irreducible CFG").
llvm-svn: 288052
Justin Lebar [Mon, 28 Nov 2016 18:50:03 +0000 (18:50 +0000)]
[StructurizeCFG] Use range-based for loops.
Reviewers: arsenm
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D27000
llvm-svn: 288051
Justin Lebar [Mon, 28 Nov 2016 18:49:59 +0000 (18:49 +0000)]
[StructurizeCFG] Refactor NearestCommonDominator.
Summary:
As far as I can tell, doing our own computations in
NearestCommonDominator is a false optimization -- DomTree will build up
what appears to be exactly this data when it decides it's worthwhile.
Moreover, by building the cache ourselves, we cannot take advantage of
the cache that the domtree might have available.
In addition, I am not convinced of the correctness of the original code.
In particular, setting ResultIndex = 1 on the first addBlock instead of
setting it to 0 is quite fishy. Similarly, it's not clear to me that
setting IndexMap[Node] = 0 for every node as we walk up the tree finding
a common parent is correct. But rather than ponder over these
questions, I'd rather just make the code do the obviously-correct thing.
This patch also changes the NearestCommonDominator API a bit, improving
the names and getting rid of the boolean parameter in addBlock -- see
http://jlebar.com/2011/12/16/Boolean_parameters_to_API_functions_considered_harmful..html
Reviewers: arsenm
Subscribers: aemerson, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D26998
llvm-svn: 288050
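As a rough illustration of the "obviously-correct thing" the r288050 message refers to, the nearest common dominator of two blocks can be found by walking up an immediate-dominator tree. The sketch below is not the StructurizeCFG or DomTree code; the `idom` dictionary, the function names, and the sample CFG are all hypothetical.

```python
def depth(idom, node):
    """Number of idom links between node and the root (the root maps to None)."""
    d = 0
    while idom[node] is not None:
        node = idom[node]
        d += 1
    return d


def nearest_common_dominator(idom, a, b):
    """Lift the deeper node, then walk both nodes up until they meet."""
    da, db = depth(idom, a), depth(idom, b)
    while da > db:
        a, da = idom[a], da - 1
    while db > da:
        b, db = idom[b], db - 1
    while a != b:
        a, b = idom[a], idom[b]
    return a


# Hypothetical dominator tree: entry dominates everything, left dominates leaf.
IDOM = {"entry": None, "left": "entry", "right": "entry",
        "leaf": "left", "merge": "entry"}
```

For example, `nearest_common_dominator(IDOM, "leaf", "right")` walks `leaf` up to `left` and then both nodes up to `entry`.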
Simon Pilgrim [Mon, 28 Nov 2016 17:58:19 +0000 (17:58 +0000)]
[X86][SSE] Add initial support for combining (V)PMOVZX with shuffles.
llvm-svn: 288049
Adam Nemet [Mon, 28 Nov 2016 17:45:34 +0000 (17:45 +0000)]
[GVN, OptDiag] Include the value that is forwarded in load elimination
This requires some changes to the opt-diag API. Hal and I have
discussed this at the Dev Meeting and came up with a streaming delimiter
(setExtraArgs) to solve this.
Arguments after this delimiter are only included in the optimization
records and not in the remarks printed in the compiler output. (Note
how in the test the content of the YAML file changes but the remarks on
the compiler output don't.)
This implements the green GVN message with a bug fix at line
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446
The fix is that now we properly include the constant value in the
message: "load of type i32 eliminated in favor of 7"
Differential Revision: https://reviews.llvm.org/D26489
llvm-svn: 288047
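The streaming-delimiter idea from r288047 can be sketched in a few lines: everything streamed before the delimiter forms the printed remark, and everything after it is kept only for the optimization record. This is a toy model, not the opt-diag API; `EXTRA_ARGS` and `split_remark` are hypothetical names, and where exactly the real remark places the delimiter is an assumption made for the example.

```python
EXTRA_ARGS = object()  # hypothetical marker standing in for setExtraArgs


def split_remark(parts):
    """Return (printed message, full record text) for a streamed list of parts."""
    if EXTRA_ARGS in parts:
        cut = parts.index(EXTRA_ARGS)
        printed, extra = parts[:cut], parts[cut + 1:]
    else:
        printed, extra = parts, []
    return "".join(printed), "".join(printed + extra)
```

With `["load of type i32 eliminated", EXTRA_ARGS, " in favor of 7"]` as input, the printed message stops before the delimiter while the record text carries the forwarded value as well.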
Adam Nemet [Mon, 28 Nov 2016 17:45:28 +0000 (17:45 +0000)]
[GVN] Basic optimization remark support
Follow-on patches will add more interesting cases.
The goal of this patch-set is to get the GVN messages printed in
opt-viewer from Dhrystone as was presented in my Dev Meeting talk. This
is the optimization view for the function (the last remark in the
function has a bug which is fixed in this series):
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430
Differential Revision: https://reviews.llvm.org/D26488
llvm-svn: 288046
Sanjay Patel [Mon, 28 Nov 2016 17:39:21 +0000 (17:39 +0000)]
[x86] fix formatting; NFC
llvm-svn: 288045
Todd Fiala [Mon, 28 Nov 2016 17:19:03 +0000 (17:19 +0000)]
fix up Xcode build for r287916
llvm-svn: 288044
Benjamin Kramer [Mon, 28 Nov 2016 17:16:18 +0000 (17:16 +0000)]
[include-fixer] Don't interfere with typo correction if we found nothing.
Just let the existing typo correction machinery handle that.
llvm-svn: 288043
Daniil Fukalov [Mon, 28 Nov 2016 17:12:09 +0000 (17:12 +0000)]
[CMAKE] fix LLVM_OPTIMIZED_TABLEGEN for Visual Studio
At the moment, optimized tablegen is generated by the LLVM_USE_HOST_TOOLS variable, which is not set for Visual Studio because LLVM_ENABLE_ASSERTIONS depends on the CMAKE_BUILD_TYPE value, which is not equal to "DEBUG" when generating a Visual Studio solution.
Modified to not depend on the LLVM_ENABLE_ASSERTIONS value in the VS and Xcode cases.
Reviewers: beanz
Subscribers: RKSimon, llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D27135
llvm-svn: 288042
Adam Nemet [Mon, 28 Nov 2016 16:51:49 +0000 (16:51 +0000)]
[LTO] Move finishOptimizationRemarks after codegen
This addresses the comment in D26832.
llvm-svn: 288041
Simon Pilgrim [Mon, 28 Nov 2016 16:25:01 +0000 (16:25 +0000)]
[X86][SSE] Added support for combining bit-shifts with shuffles.
Bit-shifts by a whole number of bytes can be represented as a shuffle mask suitable for combining.
Added a 'getFauxShuffleMask' function to allow us to create shuffle masks from other suitable operations.
llvm-svn: 288040
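To make the r288040 observation concrete, here is a small sketch of how a shift by a whole number of bytes can be written as a byte shuffle mask. It models a single little-endian element only and is not the getFauxShuffleMask implementation; `ZERO`, `byte_shift_mask`, and `apply_mask` are hypothetical names.

```python
ZERO = -1  # sentinel: this output byte is zero rather than taken from the input


def byte_shift_mask(num_bytes, shift_bytes, left):
    """Shuffle mask equivalent to shifting a little-endian integer by whole bytes.

    mask[i] is the input byte index that lands in output byte i, or ZERO.
    """
    mask = []
    for i in range(num_bytes):
        src = i - shift_bytes if left else i + shift_bytes
        mask.append(src if 0 <= src < num_bytes else ZERO)
    return mask


def apply_mask(data, mask):
    """Apply a byte shuffle mask to a bytes object."""
    return bytes(0 if m == ZERO else data[m] for m in mask)
```

A left shift of a 4-byte value by one byte gives the mask `[ZERO, 0, 1, 2]`: the lowest byte becomes zero and every other byte moves up by one position.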
Alexey Bataev [Mon, 28 Nov 2016 15:55:15 +0000 (15:55 +0000)]
[OPENMP] Fix for PR31137: Wrong DSA for members in struct.
If member expression is used in the task region and the base expression
is a DeclRefExpr and the variable used in this ref expression is private,
it should be marked as implicitly firstprivate inside this region. Patch
fixes this issue.
llvm-svn: 288039
Pavel Labath [Mon, 28 Nov 2016 15:51:47 +0000 (15:51 +0000)]
Fix floating point register reads on x86_64 linux targets with no AVX support
Summary:
For 64-bit targets, the correct register set to read the fxsave area is
NT_PRFPREG (only 32-bit targets need NT_PRXFPREG, presumably for historic
reasons). Reference:
<https://github.com/torvalds/linux/blob/v4.8/arch/x86/kernel/ptrace.c#L1261>.
Reviewers: tberghammer, valentinagiusti
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D27161
llvm-svn: 288038
Simon Pilgrim [Mon, 28 Nov 2016 15:50:39 +0000 (15:50 +0000)]
[X86][SSE] Added tests showing missed combines of shifts with shuffles.
llvm-svn: 288037
Daniel Cederman [Mon, 28 Nov 2016 15:33:03 +0000 (15:33 +0000)]
Test commit
llvm-svn: 288036
Nirav Dave [Mon, 28 Nov 2016 14:30:29 +0000 (14:30 +0000)]
Revert "[DAG] Improve loads-from-store forwarding to handle TokenFactor"
This reverts commit r287773 which caused issues with ppc64le builds.
llvm-svn: 288035
NAKAMURA Takumi [Mon, 28 Nov 2016 14:27:37 +0000 (14:27 +0000)]
ClangMoveTests.cpp: Fix a bogus comparison of iterator.
msc Debug build detected it.
llvm-svn: 288034
Ulrich Weigand [Mon, 28 Nov 2016 14:24:14 +0000 (14:24 +0000)]
[SystemZ] Fix build bot fallout from r288030
Remove unused variable that came in due to a copy-and-paste bug
and caused build bot failures.
llvm-svn: 288033
Pavel Labath [Mon, 28 Nov 2016 14:06:56 +0000 (14:06 +0000)]
XFAIL: TestNoreturnUnwind on android x86_64
llvm-svn: 288032
Ulrich Weigand [Mon, 28 Nov 2016 14:01:51 +0000 (14:01 +0000)]
[SystemZ] Support execution hint instructions
This adds assembler support for the instructions provided by the
execution-hint facility (NIAI and BP(R)P). This required adding
support for the new relocation types for 12-bit and 24-bit PC-
relative offsets used by the BP(R)P instructions.
llvm-svn: 288031
Ulrich Weigand [Mon, 28 Nov 2016 13:59:22 +0000 (13:59 +0000)]
[SystemZ] Support load-and-trap instructions
This adds support for the instructions provided with the
load-and-trap facility.
llvm-svn: 288030
Ulrich Weigand [Mon, 28 Nov 2016 13:40:08 +0000 (13:40 +0000)]
[SystemZ] Add remaining branch instructions
This patch adds assembler support for the remaining branch instructions:
the non-relative branch on count variants, and all variants of branch
on index.
The only one of those that can be readily exploited for code generation
is BRCTH (branch on count using a high 32-bit register as count). To
use it, however, it is necessary to also introduce a new CHIMux pseudo
to allow comparisons of a 32-bit value against a short immediate to go
into a high register as well (implemented via CHI/CIH).
This causes a bit of codegen changes overall, but those have proven to
be neutral (or even beneficial) in performance measurements.
llvm-svn: 288029
Ulrich Weigand [Mon, 28 Nov 2016 13:34:08 +0000 (13:34 +0000)]
[SystemZ] Improve use of conditional instructions
This patch moves formation of LOC-type instructions from (late)
IfConversion to the early if-conversion pass, and in some cases
additionally creates them directly from select instructions
during DAG instruction selection.
To make early if-conversion work, the patch implements the
canInsertSelect / insertSelect callbacks. It also implements
the commuteInstructionImpl and FoldImmediate callbacks to
enable generation of the full range of LOC instructions.
Finally, the patch adds support for all instructions of the
load-store-on-condition-2 facility, which allows using LOC
instructions also for high registers.
Due to the use of the GRX32 register class to enable high registers,
we now also have to handle the cases where there are still no single
hardware instructions (conditional move from a low register to a high
register or vice versa). These are converted back to a branch sequence
after register allocation. Since the expandRAPseudos callback is not
allowed to create new basic blocks, this requires a simple new pass,
modelled after the ARM/AArch64 ExpandPseudos pass.
Overall, this patch causes significantly more LOC-type instructions
to be used, and results in a measurable performance improvement.
llvm-svn: 288028
Pavel Labath [Mon, 28 Nov 2016 12:15:19 +0000 (12:15 +0000)]
skip android in @skipIfHostIncompatibleWithRemote
The current implementation of the decorator does not skip if the android target
arch is the same as host arch (as in both cases the platform comes out as linux).
Nonetheless android x86_64 binaries are not compatible with linux ones.
Technically this should be "skip if target is android and host is *not* android",
but currently nobody runs lldb test suite on an android host, so we don't even
have a way of specifying that the host is android.
llvm-svn: 288027
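The skip condition described in r288027 can be summarized as a small predicate. This is only a sketch of the reasoning in the message, not the lldb decorator itself; `should_skip` and its four string parameters are hypothetical.

```python
def should_skip(host_platform, host_arch, target_platform, target_arch):
    """Hypothetical predicate: True if host-built binaries cannot run on the target."""
    if target_platform == "android":
        # Android binaries are not compatible with linux ones, even when the
        # arch matches. The host is assumed never to be android, as the
        # message notes that nobody runs the suite on an android host.
        return True
    return host_platform != target_platform or host_arch != target_arch
```

Under these assumptions a linux x86_64 host skips for an android x86_64 target but not for a linux x86_64 target.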
Pavel Labath [Mon, 28 Nov 2016 11:47:14 +0000 (11:47 +0000)]
Fix a crash in ProcessPOSIXLog
We are getting a null pointer for the list of categories here (presumably due to
the args refactor).
llvm-svn: 288026
Malcolm Parsons [Mon, 28 Nov 2016 11:11:34 +0000 (11:11 +0000)]
[Sema] Set range end of constructors and destructors in template instantiations
Summary:
clang-tidy checks frequently use source ranges of functions.
The source range of constructors and destructors in template instantiations
is currently a single token.
The factory method for constructors and destructors does not allow the
end source location to be specified.
Set end location manually after creating instantiation.
Reviewers: aaron.ballman, rsmith, arphaman
Subscribers: arphaman, cfe-commits
Differential Revision: https://reviews.llvm.org/D26849
llvm-svn: 288025
James Molloy [Mon, 28 Nov 2016 11:07:37 +0000 (11:07 +0000)]
[InlineCost] Reduce inline thresholds to compensate for cost changes
In r286814, the algorithm for calculating inline costs changed. This
caused more inlining to take place which is especially apparent
in optsize and minsize modes.
As the cost calculation removed a skewed behaviour (we were inconsistent
about the cost of calls) it isn't possible to update the thresholds to
get exactly the same behaviour as before. However, this threshold change
accounts for the very common case where an inline candidate has no
calls within it. In this case, r286814 would inline around 5-6 more (IR)
instructions.
The changes to -Oz have been heavily benchmarked. The "obvious" value
for the inline threshold at -Oz is zero, but due to inaccuracies in the
inline heuristics this can actually cause code size increases due to
not inlining key thunk functions (that then disappear). Experimentally,
5 was the sweet spot for code size over the test-suite.
For -Os, this change removes the outlier results shown up by green dragon
(http://104.154.54.203/db_default/v4/nts/13248).
Fixes D26848.
llvm-svn: 288024
Chandler Carruth [Mon, 28 Nov 2016 10:42:21 +0000 (10:42 +0000)]
[PM] Remove weird marking of invalidated analyses as "preserved".
This never made a lot of sense. They've been invalidated for one IR unit
but they aren't really preserved in any normal sense. It seemed like it
would be an elegant way of communicating to outer IR units that pass
managers and adaptors had already handled invalidation, but we've since
ended up adding sets that model this more clearly: we're now using
the 'AllAnalysesOn<IRUnitT>' set to handle cases where the trick of
"preserving" invalidated analyses didn't work.
This patch moves to rely on that technique exclusively and removes the
cumbersome API aspect of updating the preserved set when doing
invalidation. This in turn will simplify a *number* of upcoming patches.
This has a side benefit of exposing a number of places where we were
failing to mark the 'AllAnalysesOn<IRUnitT>' set as preserved. This
patch fixes those, and with those fixes shouldn't change any observable
behavior.
llvm-svn: 288023
George Rimar [Mon, 28 Nov 2016 10:26:21 +0000 (10:26 +0000)]
[ELF] - Do not put non exec sections first when -no-rosegment
That unifies handling cases when we have SECTIONS and when
-no-rosegment is given in compareSectionsNonScript()
Now Config->SingleRoRx is used for check, testcase is provided.
llvm-svn: 288022
George Rimar [Mon, 28 Nov 2016 10:11:10 +0000 (10:11 +0000)]
[ELF] - Set Config->SingleRoRx differently. NFC.
Previously Config->SingleRoRx was set in
createFiles() and used HasSections.
This change moves it to readConfigs at place of
common flags handling, and adds logic that sets
this flag separately from ScriptParser if SECTIONS is present.
llvm-svn: 288021
George Rimar [Mon, 28 Nov 2016 10:05:20 +0000 (10:05 +0000)]
[ELF] - Implemented -no-rosegment.
--no-rosegment: Do not put read-only non-executable sections in their own segment
Differential revision: https://reviews.llvm.org/D26889
llvm-svn: 288020
Eugene Leviant [Mon, 28 Nov 2016 09:58:04 +0000 (09:58 +0000)]
[ELF] Print file:line for 'undefined section' errors
Differential revision: https://reviews.llvm.org/D27108
llvm-svn: 288019
Davide Italiano [Mon, 28 Nov 2016 09:17:12 +0000 (09:17 +0000)]
[ThreadPool] Rollback recent changes until I figure out the breakage.
llvm-svn: 288018
Davide Italiano [Mon, 28 Nov 2016 08:57:05 +0000 (08:57 +0000)]
[ThreadPool] Remove outdated comment after r288016.
llvm-svn: 288017
Davide Italiano [Mon, 28 Nov 2016 08:53:41 +0000 (08:53 +0000)]
[ThreadPool] Simplify the interface. NFCI.
The callers don't use the return value. Found by Michael
Spencer.
llvm-svn: 288016
Mehdi Amini [Mon, 28 Nov 2016 04:57:04 +0000 (04:57 +0000)]
Revert "Improve error handling in YAML parsing"
This reverts commit r288014, the unittest isn't passing
llvm-svn: 288015
Mehdi Amini [Mon, 28 Nov 2016 04:44:13 +0000 (04:44 +0000)]
Improve error handling in YAML parsing
Some scanner errors were not checked and reported by the parser.
Fix PR30934
Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>
Differential Revision: https://reviews.llvm.org/D26419
llvm-svn: 288014
Chandler Carruth [Mon, 28 Nov 2016 03:40:33 +0000 (03:40 +0000)]
[PM] Add an ASCII-art diagram for the call graph in the CGSCC unit test.
No functionality changed.
llvm-svn: 288013
Rafael Espindola [Mon, 28 Nov 2016 00:40:21 +0000 (00:40 +0000)]
Always create a PT_ARM_EXIDX if needed.
Unfortunately PT_ARM_EXIDX is special. There is no way to create it
from linker scripts, so we have to create it even if PHDRS is used.
This matches bfd and is required for the lld output to survive bfd's strip.
llvm-svn: 288012
Craig Topper [Sun, 27 Nov 2016 21:37:04 +0000 (21:37 +0000)]
[X86][FMA4] Remove isCommutable from FMA4 scalar intrinsics. They aren't commutable as operand 0 should pass its upper bits through to the output.
llvm-svn: 288011
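A toy model shows why the r288011 operations cannot be treated as commutable. It assumes, as the message states, that operand 0 passes its upper elements through to the output; vectors are modeled as plain Python lists and `scalar_fma` is a hypothetical name, not an LLVM or intrinsic API.

```python
def scalar_fma(a, b, c):
    """Model of a scalar FMA on vectors: lane 0 is a*b+c, upper lanes come from a."""
    return [a[0] * b[0] + c[0]] + list(a[1:])


A = [2.0, 10.0, 11.0, 12.0]
B = [3.0, 20.0, 21.0, 22.0]
C = [1.0, 0.0, 0.0, 0.0]
```

Swapping `A` and `B` leaves lane 0 unchanged but changes the upper lanes of the result, so the two operand orders are not interchangeable.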
Craig Topper [Sun, 27 Nov 2016 21:37:02 +0000 (21:37 +0000)]
[X86][FMA] Add missing Predicates qualifier around scalar FMA intrinsic patterns.
llvm-svn: 288010
Craig Topper [Sun, 27 Nov 2016 21:37:00 +0000 (21:37 +0000)]
[X86][FMA4] Add load folding support for FMA4 scalar intrinsic instructions.
llvm-svn: 288009
Craig Topper [Sun, 27 Nov 2016 21:36:58 +0000 (21:36 +0000)]
[X86][FMA4] Add test cases to demonstrate missed folding opportunities for FMA4 scalar intrinsics.
llvm-svn: 288008
Craig Topper [Sun, 27 Nov 2016 21:36:54 +0000 (21:36 +0000)]
[X86] Add SHL by 1 to the load folding tables.
I don't think isel selects these today, favoring adding the register to itself instead. But the load folding tables shouldn't be so concerned with what isel will use and just represent the relationships.
llvm-svn: 288007
Simon Pilgrim [Sun, 27 Nov 2016 21:08:19 +0000 (21:08 +0000)]
[X86][SSE] Add support for combining target shuffles to 128/256-bit PSLL/PSRL bit shifts
llvm-svn: 288006
Sanjay Patel [Sun, 27 Nov 2016 21:07:28 +0000 (21:07 +0000)]
[InstSimplify] allow integer vector types to use computeKnownBits
Note that the non-splat lshr+lshr test folded, but that does not
work in general. Something is missing or wrong in computeKnownBits
as the non-splat shl+shl test still shows.
llvm-svn: 288005
Craig Topper [Sun, 27 Nov 2016 19:51:41 +0000 (19:51 +0000)]
[AVX-512] Add integer and fp unpck instructions to load folding tables.
llvm-svn: 288004
Simon Pilgrim [Sun, 27 Nov 2016 19:28:39 +0000 (19:28 +0000)]
[X86][SSE] Split lowerVectorShuffleAsShift ready for combines. NFCI.
Moved most of matching code into matchVectorShuffleAsShift to share with target shuffle combines (in a future commit).
llvm-svn: 288003
Rui Ueyama [Sun, 27 Nov 2016 19:28:32 +0000 (19:28 +0000)]
Add parallel_for and use it where appropriate.
When we iterate over numbers as opposed to iterable elements,
parallel_for fits better than parallel_for_each.
llvm-svn: 288002
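The distinction drawn in r288002 between iterating over numbers and iterating over elements can be sketched with Python's standard thread pool. These are hypothetical Python analogues written for illustration, not the lld helpers; only the two names are borrowed from the message.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for_each(items, fn):
    """Apply fn to every element of an iterable, using a thread pool."""
    with ThreadPoolExecutor() as pool:
        # list() drains the result iterator so any exception from fn is re-raised.
        list(pool.map(fn, items))


def parallel_for(begin, end, fn):
    """Apply fn to every index in [begin, end), using a thread pool."""
    parallel_for_each(range(begin, end), fn)
```

A caller that needs an index, for instance to write into slot `i` of a preallocated list, reads more naturally with `parallel_for(0, n, fn)` than with an element-based loop over a synthetic range.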
Craig Topper [Sun, 27 Nov 2016 18:51:13 +0000 (18:51 +0000)]
[X86] Add TB_NO_REVERSE to entries in the load folding table where the instruction's load size is smaller than the register size.
If we were to unfold these, the load size would be increased to the register size. This is not safe to do since the enlarged load can do things like cross a page boundary into a page that doesn't exist.
I probably missed some instructions, but this should be a large portion of them.
llvm-svn: 288001
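The page-boundary hazard mentioned in r288001 is simple address arithmetic. The sketch below assumes a 4096-byte page and uses hypothetical names; it only checks whether a byte range spans two pages and does not model the load folding tables.

```python
PAGE_SIZE = 4096  # assumed page size for the example


def crosses_page(addr, size, page_size=PAGE_SIZE):
    """True if the byte range [addr, addr + size) spans more than one page."""
    return addr // page_size != (addr + size - 1) // page_size
```

A 4-byte load from the last four bytes of a page stays within that page, while the same address widened to a 16-byte load reaches into the next page, which may not be mapped.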
Simon Pilgrim [Sun, 27 Nov 2016 18:25:02 +0000 (18:25 +0000)]
[X86][SSE] Added tests showing missed combines for shuffle to shifts.
llvm-svn: 288000
Hal Finkel [Sun, 27 Nov 2016 16:26:14 +0000 (16:26 +0000)]
Adjust type-trait evaluation to properly handle Using(Shadow)Decls
Since r274049, for an inheriting constructor declaration, the name of the using
declaration (and using shadow declaration comes from the using declaration) is
the name of a derived class, not the base class (line 8225-8232 of
lib/Sema/SemaDeclCXX.cpp in https://reviews.llvm.org/rL274049). Because of
this, name-based lookup performed inside Sema::LookupConstructors returns not
only CXXConstructorDecls but also Using(Shadow)Decls, which results in the assertion
failure reported in PR29087.
Patch by Taewook Oh, thanks!
Differential Revision: https://reviews.llvm.org/D23765
llvm-svn: 287999
Sanjay Patel [Sun, 27 Nov 2016 15:54:45 +0000 (15:54 +0000)]
add tests to show missing analysis; NFC
llvm-svn: 287998
Sanjay Patel [Sun, 27 Nov 2016 15:53:48 +0000 (15:53 +0000)]
fix formatting; NFC
llvm-svn: 287997
Rafael Espindola [Sun, 27 Nov 2016 09:44:45 +0000 (09:44 +0000)]
Also skip regular symbol assignment at the start of a script.
Unfortunately some scripts look like
kernphys = ...
. = ....
and the expectation is that every orphan section is after the
assignment.
llvm-svn: 287996
Craig Topper [Sun, 27 Nov 2016 08:55:31 +0000 (08:55 +0000)]
[AVX-512] Add masked EVEX vpmovzx/sx instructions to load folding tables.
llvm-svn: 287995
Rafael Espindola [Sun, 27 Nov 2016 07:39:45 +0000 (07:39 +0000)]
Don't put an orphan before the first . assignment.
This is a horrible special case, but seems to match bfd's behaviour
and is important for avoiding placing an orphan section before the
expected start of the file.
llvm-svn: 287994
Mohammad Shahid [Sun, 27 Nov 2016 03:35:31 +0000 (03:35 +0000)]
[SLP] Add new and update existing lit tests to provide more context for the incoming patch for vectorization of jumbled loads
Change-Id: Ifb9091bb0f84c1937c2c8bd2fc345734f250d2f9
llvm-svn: 287992
Craig Topper [Sun, 27 Nov 2016 01:52:51 +0000 (01:52 +0000)]
[X86] Remove alignment restrictions from load folding table for some instructions that don't have a restriction.
Most of these are the SSE4.1 PMOVZX/PMOVSX instructions, which all read less than 128 bits. The only other was MOVUPD, which by definition is an unaligned load.
llvm-svn: 287991
Ekaterina Romanova [Sat, 26 Nov 2016 19:38:19 +0000 (19:38 +0000)]
[DOXYGEN] Updated instruction names corresponding to avxintrin.h intrinsics.
Documentation for some of the avxintrin.h's intrinsics erroneously said that
non VEX-prefixed instructions could be generated. This was fixed.
I tried several different solutions to achieve pretty printing of unordered lists (nested and non-nested) in param sections in doxygen.
llvm-svn: 287990
Kuba Mracek [Sat, 26 Nov 2016 19:09:32 +0000 (19:09 +0000)]
[tsan] Fix the lit expansion of %deflake not to eat a space
The lit expansion of "%deflake " (notice the space after) expands in a way that the space is removed, this fixes that.
Differential Revision: https://reviews.llvm.org/D27139
llvm-svn: 287989
Marshall Clow [Sat, 26 Nov 2016 18:45:03 +0000 (18:45 +0000)]
Implement conjunction/disjunction/negation for LFTS v2. Same code and tests as for the ones in std::
llvm-svn: 287988
Craig Topper [Sat, 26 Nov 2016 18:43:26 +0000 (18:43 +0000)]
[X86] Remove hasOneUse check that is redundant with the one in IsProfitableToFold.
llvm-svn: 287987
Craig Topper [Sat, 26 Nov 2016 18:43:24 +0000 (18:43 +0000)]
[X86] Fix the zero extending load detection in X86DAGToDAGISel::selectScalarSSELoad to pass the load node to IsProfitableToFold and IsLegalToFold.
Previously we were passing the SCALAR_TO_VECTOR node.
llvm-svn: 287986
Craig Topper [Sat, 26 Nov 2016 18:43:21 +0000 (18:43 +0000)]
[X86] Simplify control flow. NFCI
llvm-svn: 287985
Tobias Grosser [Sat, 26 Nov 2016 17:58:40 +0000 (17:58 +0000)]
[ScopInfo] Use SCEVRewriteVisitor to simplify SCEVSensitiveParameterRewriter [NFC]
llvm-svn: 287984
Craig Topper [Sat, 26 Nov 2016 17:29:25 +0000 (17:29 +0000)]
[X86] Add a hasOneUse check to selectScalarSSELoad to keep the same load from being folded multiple times.
Summary: When selectScalarSSELoad is looking for a scalar_to_vector of a scalar load, it makes sure the load is only used by the scalar_to_vector. But it doesn't make sure the scalar_to_vector is only used once. This can cause the same load to be folded multiple times. This can be bad for performance. This also causes the chain output to be duplicated, but not connected to anything so chain dependencies will not be satisfied.
Reviewers: RKSimon, zvi, delena, spatel
Subscribers: andreadb, llvm-commits
Differential Revision: https://reviews.llvm.org/D26790
llvm-svn: 287983
Sanjay Patel [Sat, 26 Nov 2016 16:13:23 +0000 (16:13 +0000)]
[InstCombine] add test to show missing vector optimization; NFC
llvm-svn: 287982
Marshall Clow [Sat, 26 Nov 2016 15:49:40 +0000 (15:49 +0000)]
Implement the 'detection idiom' from LFTS v2
llvm-svn: 287981
Sanjay Patel [Sat, 26 Nov 2016 15:23:20 +0000 (15:23 +0000)]
[InstCombine] don't drop metadata in FoldOpIntoSelect()
llvm-svn: 287980
Rui Ueyama [Sat, 26 Nov 2016 15:15:11 +0000 (15:15 +0000)]
Change return types of split{Non,}Strings.
They return new vectors, but at the same time they mutate other vectors,
so returning values doesn't make much sense. We should just mutate two
vectors.
llvm-svn: 287979
Rui Ueyama [Sat, 26 Nov 2016 15:10:01 +0000 (15:10 +0000)]
Make getColorDiagnostics return a boolean value instead of an enum.
Config->ColorDiagnostics was of type enum before. Now it is just a
boolean flag. Thanks to Rafael for the suggestion.
llvm-svn: 287978
Rui Ueyama [Sat, 26 Nov 2016 15:09:58 +0000 (15:09 +0000)]
Split MergeOutputSection::finalize.
llvm-svn: 287977
Sanjay Patel [Sat, 26 Nov 2016 15:01:59 +0000 (15:01 +0000)]
add optional param to copy metadata when creating selects; NFC
There are other spots where we can use this; we're currently dropping
metadata in some places, and there are proposed changes where we will
want to propagate metadata.
IRBuilder's CreateSelect() already has a parameter like this, so this
change makes the regular 'Create' API line up with that.
llvm-svn: 287976
Craig Topper [Sat, 26 Nov 2016 08:21:52 +0000 (08:21 +0000)]
[AVX-512] Add unmasked EVEX vpmovzx/sx instructions to load folding tables.
llvm-svn: 287975
Craig Topper [Sat, 26 Nov 2016 08:21:48 +0000 (08:21 +0000)]
[AVX-512] Add masked 128/256-bit integer add/sub instructions to load folding tables.
llvm-svn: 287974
Tobias Grosser [Sat, 26 Nov 2016 07:37:46 +0000 (07:37 +0000)]
[ScopDetect] Expand statistics of the detected scops
We now collect:
Number of total loops
Number of loops in scops
Number of scops
Number of scops with maximal loop depth 1
Number of scops with maximal loop depth 2
Number of scops with maximal loop depth 3
Number of scops with maximal loop depth 4
Number of scops with maximal loop depth 5
Number of scops with maximal loop depth 6 and larger
Number of loops in scops (profitable scops only)
Number of scops (profitable scops only)
Number of scops with maximal loop depth 1 (profitable scops only)
Number of scops with maximal loop depth 2 (profitable scops only)
Number of scops with maximal loop depth 3 (profitable scops only)
Number of scops with maximal loop depth 4 (profitable scops only)
Number of scops with maximal loop depth 5 (profitable scops only)
Number of scops with maximal loop depth 6 and larger (profitable scops only)
These statistics are certainly not completely accurate as we might drop scops
when building up their polyhedral representation, but they should give a good
indication of the number of scops we detect.
llvm-svn: 287973
Craig Topper [Sat, 26 Nov 2016 07:21:00 +0000 (07:21 +0000)]
[AVX-512] Add masked 512-bit integer add/sub instructions to load folding tables.
llvm-svn: 287972
Craig Topper [Sat, 26 Nov 2016 07:20:57 +0000 (07:20 +0000)]
[AVX-512] Teach LowerFormalArguments to use the extended register class when available. Fix the avx512vl stack folding tests to clobber more registers or otherwise they use xmm16 after this change.
llvm-svn: 287971
Craig Topper [Sat, 26 Nov 2016 07:20:53 +0000 (07:20 +0000)]
[AVX-512] Add VLX versions of VDIVPD/PS and VMULPD/PS to load folding tables.
llvm-svn: 287970
Rafael Espindola [Sat, 26 Nov 2016 06:55:35 +0000 (06:55 +0000)]
Create sections with just assignments as SHT_NOBITS.
This matches the behaviour of bfd ld. Using 0 was causing problems
with strip, which would remove these sections.
llvm-svn: 287969
Tobias Grosser [Sat, 26 Nov 2016 05:53:09 +0000 (05:53 +0000)]
[ScopDetectionDiagnostic] Collect statistics for each diagnostic type
Our original statistics were added before we introduced a more fine-grained
diagnostic system, but the granularity of our statistics has never been
increased accordingly. This change introduces now one statistic counter per
diagnostic to enable us to collect fine-grained statistics about why certain
scops are not detected. In case coarser grained statistics are needed, the
user is expected to combine counters manually.
llvm-svn: 287968
Davide Italiano [Sat, 26 Nov 2016 05:37:04 +0000 (05:37 +0000)]
[ELF] Be compliant with LLVM and rename Lto into LTO. NFCI.
llvm-svn: 287967
Alexander Shaposhnikov [Sat, 26 Nov 2016 05:23:44 +0000 (05:23 +0000)]
[lldb] Fix typos in file headers
This diff fixes typos in file headers (incorrect file names).
Test plan:
Under llvm/tools/lldb/source:
find ./* -type f | grep -e '\(cpp\|h\)$' | while read F; do B=$(basename "$F"); echo "$F"; head -n 1 "$F" | grep -v "$B" | wc -l ; done
Differential revision: https://reviews.llvm.org/D27115
llvm-svn: 287966
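The check behind the r287966 test plan boils down to one question per file: does the first line mention the file's own base name? A minimal Python sketch of that predicate follows; `header_names_file` is a hypothetical helper, and the sample header lines are made up for illustration.

```python
import os


def header_names_file(path, first_line):
    """True if the file's first line mentions the file's own base name."""
    return os.path.basename(path) in first_line
```

A header that misspells the name, such as `Adress.cpp` for a file called `Address.cpp`, fails the check and would be flagged as a typo.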
Tobias Grosser [Sat, 26 Nov 2016 05:08:27 +0000 (05:08 +0000)]
[ScopDetectionDiagnostic] IrreducibleRegion is a subclass of CFG
Reflect this correctly in the RejectReasonKind enum. The definition of
RejectReasonKind::IrreducibleRegion was introduced in r258497, when we started
to refuse regions containing irreducible loops.
llvm-svn: 287965
Tobias Grosser [Sat, 26 Nov 2016 05:08:24 +0000 (05:08 +0000)]
[ScopDetectionDiagnostic] Remove leftover RejectReasonKind for Conditions [NFC]
In r248118 some diagnostics for unstructured control flow have been removed,
but the corresponding RejectReasonKind was accidentally not removed. This
change removes it, as it is not needed any more.
llvm-svn: 287964
Tobias Grosser [Sat, 26 Nov 2016 03:44:31 +0000 (03:44 +0000)]
[ScopDetectionDiagnostic] Use scoped enums instead of three-letter prefix [NFC]
This improves readability of the code.
llvm-svn: 287963
Tom Stellard [Sat, 26 Nov 2016 02:26:04 +0000 (02:26 +0000)]
AMDGPU/SI: Use float as the operand type for amdgcn.interp intrinsics
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D26724
llvm-svn: 287962
Craig Topper [Sat, 26 Nov 2016 02:14:00 +0000 (02:14 +0000)]
[X86][XOP] Add a reversed reg/reg form for VPROT instructions.
The W bit distinguishes which operand is the memory operand. But if the mod bits are 3 then the memory operand is a register and there are two possible encodings. We already did this correctly for several other XOP instructions.
llvm-svn: 287961
Craig Topper [Sat, 26 Nov 2016 02:13:58 +0000 (02:13 +0000)]
[X86] Add SSE, AVX, and AVX2 version of MOVDQU to the load/store folding tables for consistency.
Not sure this is truly needed but we had the floating point equivalents, the aligned equivalents, and the EVEX equivalents. So this just makes it complete.
llvm-svn: 287960
Kuba Mracek [Sat, 26 Nov 2016 01:30:31 +0000 (01:30 +0000)]
[asan] Support handle_sigill on Darwin
Handling SIGILL on Darwin works fine, so let's just make this feature work and re-enable the ill.cc testcase.
Differential Revision: https://reviews.llvm.org/D27141
llvm-svn: 287959
Dylan McKay [Sat, 26 Nov 2016 01:07:32 +0000 (01:07 +0000)]
Un-XFAIL an AVR CodeGen test
llvm-svn: 287958
Kuba Mracek [Sat, 26 Nov 2016 00:50:08 +0000 (00:50 +0000)]
[asan] Add a "dump_registers" flag to print out CPU registers after a SIGSEGV
This patch prints out all CPU registers after a SIGSEGV. These are available in the signal handler context. Only implemented for Darwin. Can be turned off with the dump_registers flag.
Differential Revision: https://reviews.llvm.org/D11365
llvm-svn: 287957
Craig Topper [Fri, 25 Nov 2016 23:21:34 +0000 (23:21 +0000)]
[AVX-512] Put the AVX-512 sections of the load folding tables into mostly alphabetical order. This is consistent with the older sections of the table. NFC
llvm-svn: 287956
David Majnemer [Fri, 25 Nov 2016 22:35:09 +0000 (22:35 +0000)]
Replace some callers of setTailCall with setTailCallKind
We were a little sloppy with adding tailcall markers. Be more
consistent by using setTailCallKind instead of setTailCall.
llvm-svn: 287955
Sanjay Patel [Fri, 25 Nov 2016 21:12:39 +0000 (21:12 +0000)]
[SimplifyCFG] auto-generate better checks; NFC
llvm-svn: 287954