Malcolm Parsons [Thu, 19 Jan 2017 17:19:22 +0000 (17:19 +0000)]
[Sema] Reword unused lambda capture warning
Summary:
The warning doesn't know why the variable was looked up but not
odr-used, so reword it to not claim that it was used in an unevaluated
context.
Reviewers: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28902
llvm-svn: 292498
Alex Lorenz [Thu, 19 Jan 2017 17:17:57 +0000 (17:17 +0000)]
[Sema] Fix PR28181 by avoiding calling BuildOverloadedBinOp in C mode
rdar://
28532840
Differential Revision: https://reviews.llvm.org/D25213
llvm-svn: 292497
Sumanth Gundapaneni [Thu, 19 Jan 2017 16:54:04 +0000 (16:54 +0000)]
[Hexagon] Linux linker does not support .gnu-hash
Hexagon Linux dynamic loader does not use (in fact does not support)
.gnu-hash
Differential Revision: https://reviews.llvm.org/D28865
llvm-svn: 292496
Simon Pilgrim [Thu, 19 Jan 2017 16:25:02 +0000 (16:25 +0000)]
[X86][SSE] Attempt to pre-truncate arithmetic operations that have already been extended
As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further
llvm-svn: 292493
Sanjay Patel [Thu, 19 Jan 2017 16:12:10 +0000 (16:12 +0000)]
[InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1
Try harder to fold icmp with shl nsw as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html
This is similar to the 'shl nuw' transforms that were added with D25913.
This may eventually help solve:
https://llvm.org/bugs/show_bug.cgi?id=30773
Differential Revision: https://reviews.llvm.org/D28406
llvm-svn: 292492
Felix Berger [Thu, 19 Jan 2017 15:51:10 +0000 (15:51 +0000)]
[clang-tidy] Do not trigger move fix for non-copy assignment operators in performance-unnecessary-value-param check
Reviewers: alexfh, sbenza, malcolm.parsons
Subscribers: JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D28899
llvm-svn: 292491
Marshall Clow [Thu, 19 Jan 2017 15:30:36 +0000 (15:30 +0000)]
Mark two of the TS implementations as 'in progress'
llvm-svn: 292490
Pavel Labath [Thu, 19 Jan 2017 15:26:04 +0000 (15:26 +0000)]
Refactor logging in NativeProcessLinux
Use the LLDB_LOG macro instead of the more verbose if(log) ... syntax.
I have also consolidated the log channels (everything now goes to the posix
channel, instead of a mixture of posix and lldb), and cleaned up some of the
more convoluted log statements.
llvm-svn: 292489
Hafiz Abid Qadeer [Thu, 19 Jan 2017 15:11:01 +0000 (15:11 +0000)]
Avoid unused variable warning when assert is disabled.
llvm-svn: 292488
Simon Pilgrim [Thu, 19 Jan 2017 15:03:00 +0000 (15:03 +0000)]
[X86][SSE] Added tests for pre-truncating arithmetic operations that have already been extended
As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further
llvm-svn: 292487
Tobias Grosser [Thu, 19 Jan 2017 14:12:45 +0000 (14:12 +0000)]
BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts
Before this change we created an additional reload in the copy of the incoming
block of a PHI node to reload the incoming value, even though the necessary
value has already been made available by the normally generated scalar loads.
In this change, we drop the code that generates this redundant reload and
instead just reuse the scalar value already available.
Besides making the generated code slightly cleaner, this change also makes sure
that scalar loads go through the normal logic, which means they can be remapped
(e.g. to array slots) and corresponding code is generated to load from the
remapped location. Without this change, the original scalar load at the
beginning of the non-affine region would have been remapped, but the redundant
scalar load would continue to load from the old PHI slot location.
It might be possible to further simplify the code in addOperandToPHI,
but this would not only mean to pull out getNewValue, but to also change the
insertion point update logic. As this did not work when trying it the first
time, this change is likely not trivial. To not introduce bugs last minute, we
postpone further simplications to a subsequent commit.
We also document the current behavior a little bit better.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D28892
llvm-svn: 292486
Mikael Holmen [Thu, 19 Jan 2017 13:55:55 +0000 (13:55 +0000)]
[DAG] Don't increase SDNodeOrder for dbg.value/declare.
Summary:
The SDNodeOrder is saved in the IROrder field in the SDNode, and this
field may affects scheduling. Thus, letting dbg.value/declare increase
the order numbers may in turn affect scheduling.
Because of this change we also need to update the code deciding when
dbg values should be output, in ScheduleDAGSDNodes.cpp/ProcessSDDbgValues.
Dbg values now have the same order as the SDNode they are connected to,
not the following orders.
Test cases provided by Florian Hahn.
Reviewers: bogner, aprantl, sunfish, atrick
Reviewed By: atrick
Subscribers: fhahn, probinson, andreadb, llvm-commits, MatzeB
Differential Revision: https://reviews.llvm.org/D25318
llvm-svn: 292485
Malcolm Parsons [Thu, 19 Jan 2017 13:38:19 +0000 (13:38 +0000)]
[docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing
Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.
Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html
Reviewers: aaron.ballman, klimek, alexfh
Subscribers: ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D28850
llvm-svn: 292484
Malcolm Parsons [Thu, 19 Jan 2017 13:37:42 +0000 (13:37 +0000)]
[docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing
Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.
Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html
Reviewers: aaron.ballman, klimek, alexfh
Subscribers: ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D28850
llvm-svn: 292483
Mikael Holmen [Thu, 19 Jan 2017 13:35:13 +0000 (13:35 +0000)]
Test commit access, remove trailing whitespace
llvm-svn: 292482
Kristof Beyls [Thu, 19 Jan 2017 13:32:14 +0000 (13:32 +0000)]
[GlobalISel] Pointers are legal operands for G_SELECT on AArch64
Differential Revision: https://reviews.llvm.org/D28805
llvm-svn: 292481
Tobias Grosser [Thu, 19 Jan 2017 13:25:52 +0000 (13:25 +0000)]
BlockGenerator: remove obfuscating const and const casts
Making certain values 'const' to just cast it away a little later mainly
obfuscates the code. Hence, we just drop the 'const' parts.
Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 292480
Elena Demikhovsky [Thu, 19 Jan 2017 12:08:21 +0000 (12:08 +0000)]
Recommiting unsigned saturation with a bugfix.
A test case that crached is added to avx512-trunc.ll.
(PR31589)
llvm-svn: 292479
Daniel Sanders [Thu, 19 Jan 2017 11:15:55 +0000 (11:15 +0000)]
Re-commit: [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.
Changes since first commit attempt:
* Added missing guards
* Added more missing guards
* Found and fixed a use-after-free bug involving Twine locals
Reviewers: t.p.northover, ab, rovka, qcolombet
Reviewed By: qcolombet
Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka
Differential Revision: https://reviews.llvm.org/D27338
llvm-svn: 292478
Malcolm Parsons [Thu, 19 Jan 2017 09:27:45 +0000 (09:27 +0000)]
[docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing
Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.
Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html
Reviewers: aaron.ballman, klimek, alexfh
Subscribers: ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D28850
llvm-svn: 292477
Justin Bogner [Thu, 19 Jan 2017 07:51:17 +0000 (07:51 +0000)]
GlobalISel: Implement widening for shifts
llvm-svn: 292476
Craig Topper [Thu, 19 Jan 2017 07:37:45 +0000 (07:37 +0000)]
[AVX-512] Add test cases that show where we are using two subvector inserts to broadcast a 128-bit subvector into a 512-bit vector. We'd be better off using something like SHUFF32X4.
If the subvector comes from a load, we convert to SUBV_BROADCAST and use a broadcast instruction. But if there is no load we keep the inserts. I think we should create the SUBV_BROADCAST even without the load and let isel use the fallback patterns that are used if the load can't be folded. This will use the SHUFF32X4 or similar instruction for the 128-bit into 512-bit case and a single insert for 128 into 256 or 256 into 512.
This should be fixed so subvector broadcast intrinsics can be replaced with native IR since some of those currently lower directly to SHUFF32X4.
llvm-svn: 292475
Craig Topper [Thu, 19 Jan 2017 07:12:35 +0000 (07:12 +0000)]
[AVX-512] Support ADD/SUB/MUL of mask vectors
Summary:
Currently we expand and scalarize these operations, but I think we should be able to implement ADD/SUB with KXOR and MUL with KAND.
We already do this for scalar i1 operations so I just extended it to vectors of i1.
Reviewers: zvi, delena
Reviewed By: delena
Subscribers: guyblank, llvm-commits
Differential Revision: https://reviews.llvm.org/D28888
llvm-svn: 292474
Matt Arsenault [Thu, 19 Jan 2017 06:35:27 +0000 (06:35 +0000)]
AMDGPU: Disable some fneg combines unless nsz
For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.
fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.
llvm-svn: 292473
Matt Arsenault [Thu, 19 Jan 2017 06:04:12 +0000 (06:04 +0000)]
AMDGPU: Remove modifiers from v_div_scale_*
They seem to produce nonsense results when used.
This should be applied to the release branch.
llvm-svn: 292472
Tobias Grosser [Thu, 19 Jan 2017 05:09:23 +0000 (05:09 +0000)]
Use range-based for loop [NFC]
llvm-svn: 292471
Tobias Grosser [Thu, 19 Jan 2017 04:54:45 +0000 (04:54 +0000)]
Improve test coverage in test/Isl/CodeGen/loop_partially_in_scop.ll [NFC]
We rename the test case with -metarenamer to make the variable names easier to
read and add additional check lines that verify the code we currently generate
for PHI nodes. This code is interesting as it contains a PHI node in a
non-affine sub-region, where some incoming blocks are within the non-affine
sub-region and others are outside of the non-affine subregion.
As can be seen in the check lines we currently load the PHI-node value twice.
This commit documents this behavior. In a subsequent patch we will try to
improve this.
llvm-svn: 292470
Craig Topper [Thu, 19 Jan 2017 03:49:29 +0000 (03:49 +0000)]
[X86] Merge LowerADD and LowerSUB into a single LowerADD_SUB since they are identical.
llvm-svn: 292469
Mike Aizatsky [Thu, 19 Jan 2017 03:49:18 +0000 (03:49 +0000)]
[sancov] applying blacklist to covered points too
Differential Revision: https://reviews.llvm.org/D28872
llvm-svn: 292468
Saleem Abdulrasool [Thu, 19 Jan 2017 02:58:46 +0000 (02:58 +0000)]
llvm-cxxfilt: filter out invalid manglings
c++filt does not attempt to demangle symbols which do not match its
expected format. This means that the symbol must start with _Z or ___Z
(block invocation function extension). Any other symbols are returned
as is. Note that this is different from the behaviour of __cxa_demangle
which will demangle fragments.
llvm-svn: 292467
Craig Topper [Thu, 19 Jan 2017 02:34:29 +0000 (02:34 +0000)]
[AVX-512] Use VSHUF instructions instead of two inserts as fallback for subvector broadcasts that can't fold the load.
llvm-svn: 292466
Craig Topper [Thu, 19 Jan 2017 02:34:25 +0000 (02:34 +0000)]
[AVX-512] Add additional test cases for broadcast intrinsics that demonstates that we don't fold the loads to use a broadcast instruction.
llvm-svn: 292465
Michael Kuperstein [Thu, 19 Jan 2017 02:21:54 +0000 (02:21 +0000)]
[PM] Add LoopVectorize to the default module pipeline
LV no longer "requires" LCSSA and LoopSimplify, and instead forms
them internally as required. So, there's nothing preventing it from
being enabled.
llvm-svn: 292464
Peter Collingbourne [Thu, 19 Jan 2017 01:20:11 +0000 (01:20 +0000)]
LowerTypeTests: Implement exporting of type identifiers.
Type identifiers are exported by:
- Adding coarse-grained information about how to test the type
identifier to the summary.
- Creating symbols in the object file (aliases and absolute symbols)
containing fine-grained information about the type identifier.
Differential Revision: https://reviews.llvm.org/D28424
llvm-svn: 292462
Justin Bogner [Thu, 19 Jan 2017 01:05:48 +0000 (01:05 +0000)]
GlobalISel: Implement narrowing for G_LOAD
llvm-svn: 292461
Justin Bogner [Thu, 19 Jan 2017 01:04:46 +0000 (01:04 +0000)]
GlobalISel: Fix text wrapping in a comment. NFC
llvm-svn: 292460
Matthias Braun [Thu, 19 Jan 2017 01:04:08 +0000 (01:04 +0000)]
Use an actual valid register in test
llvm-svn: 292459
Dehao Chen [Thu, 19 Jan 2017 00:44:21 +0000 (00:44 +0000)]
Add -fdebug-info-for-profiling to emit more debug info for sample pgo profile collection
Summary:
SamplePGO uses profile with debug info to collect profile. Unlike the traditional debugging purpose, sample pgo needs more accurate debug info to represent the profile. We add -femit-accurate-debug-info for this purpose. It can be combined with all debugging modes (-g, -gmlt, etc). It makes sure that the following pieces of info is always emitted:
* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)
The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):
-gmlt(orig) -gmlt(patched) -g
433.milc 4.68% 5.40% 19.73%
444.namd 8.45% 8.93% 45.99%
447.dealII 97.43% 115.21% 374.89%
450.soplex 27.75% 31.88% 126.04%
453.povray 21.81% 26.16% 92.03%
470.lbm 0.60% 0.67% 1.96%
482.sphinx3 5.77% 6.47% 26.17%
400.perlbench 17.81% 19.43% 73.08%
401.bzip2 3.73% 3.92% 12.18%
403.gcc 31.75% 34.48% 122.75%
429.mcf 0.78% 0.88% 3.89%
445.gobmk 6.08% 7.92% 42.27%
456.hmmer 10.36% 11.25% 35.23%
458.sjeng 5.08% 5.42% 14.36%
462.libquantum 1.71% 1.96% 6.36%
464.h264ref 15.61% 16.56% 43.92%
471.omnetpp 11.93% 15.84% 60.09%
473.astar 3.11% 3.69% 14.18%
483.xalancbmk 56.29% 81.63% 353.22%
geomean 15.60% 18.30% 57.81%
Debug info size change for -gmlt binary with this patch:
433.milc 13.46%
444.namd 5.35%
447.dealII 18.21%
450.soplex 14.68%
453.povray 19.65%
470.lbm 6.03%
482.sphinx3 11.21%
400.perlbench 8.91%
401.bzip2 4.41%
403.gcc 8.56%
429.mcf 8.24%
445.gobmk 29.47%
456.hmmer 8.19%
458.sjeng 6.05%
462.libquantum 11.23%
464.h264ref 5.93%
471.omnetpp 31.89%
473.astar 16.20%
483.xalancbmk 44.62%
geomean 16.83%
Reviewers: davidxl, andreadb, rob.lougher, dblaikie, echristo
Reviewed By: dblaikie, echristo
Subscribers: hfinkel, rob.lougher, andreadb, gbedwell, cfe-commits, probinson, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25435
llvm-svn: 292458
Dehao Chen [Thu, 19 Jan 2017 00:44:11 +0000 (00:44 +0000)]
Add -debug-info-for-profiling to emit more debug info for sample pgo profile collection
Summary:
SamplePGO binaries built with -gmlt to collect profile. The current -gmlt debug info is limited, and we need some additional info:
* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)
This patch adds these information to the -gmlt binary. The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):
-gmlt(orig) -gmlt(patched) -g
433.milc 4.68% 5.40% 19.73%
444.namd 8.45% 8.93% 45.99%
447.dealII 97.43% 115.21% 374.89%
450.soplex 27.75% 31.88% 126.04%
453.povray 21.81% 26.16% 92.03%
470.lbm 0.60% 0.67% 1.96%
482.sphinx3 5.77% 6.47% 26.17%
400.perlbench 17.81% 19.43% 73.08%
401.bzip2 3.73% 3.92% 12.18%
403.gcc 31.75% 34.48% 122.75%
429.mcf 0.78% 0.88% 3.89%
445.gobmk 6.08% 7.92% 42.27%
456.hmmer 10.36% 11.25% 35.23%
458.sjeng 5.08% 5.42% 14.36%
462.libquantum 1.71% 1.96% 6.36%
464.h264ref 15.61% 16.56% 43.92%
471.omnetpp 11.93% 15.84% 60.09%
473.astar 3.11% 3.69% 14.18%
483.xalancbmk 56.29% 81.63% 353.22%
geomean 15.60% 18.30% 57.81%
Debug info size change for -gmlt binary with this patch:
433.milc 13.46%
444.namd 5.35%
447.dealII 18.21%
450.soplex 14.68%
453.povray 19.65%
470.lbm 6.03%
482.sphinx3 11.21%
400.perlbench 8.91%
401.bzip2 4.41%
403.gcc 8.56%
429.mcf 8.24%
445.gobmk 29.47%
456.hmmer 8.19%
458.sjeng 6.05%
462.libquantum 11.23%
464.h264ref 5.93%
471.omnetpp 31.89%
473.astar 16.20%
483.xalancbmk 44.62%
geomean 16.83%
Reviewers: davidxl, echristo, dblaikie
Reviewed By: echristo, dblaikie
Subscribers: aprantl, probinson, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D25434
llvm-svn: 292457
Michael Kuperstein [Thu, 19 Jan 2017 00:42:28 +0000 (00:42 +0000)]
[LV] Run loop-simplify and LCSSA explicitly instead of "requiring" them
This changes the vectorizer to explicitly use the loopsimplify and lcssa utils,
instead of "requiring" the transformations as if they were analyses.
This is not NFC, since it changes the LCSSA behavior - we no longer run LCSSA
for all loops, but rather only for the loops we expect to modify.
Differential Revision: https://reviews.llvm.org/D28868
llvm-svn: 292456
Matthias Braun [Thu, 19 Jan 2017 00:32:13 +0000 (00:32 +0000)]
LiveIntervalAnalysis: Cleanup; NFC
- Fix doxygen comments: Do not repeat name, remove duplicated doxygen
comment (on declaration + implementation), etc.
- Use more range based for
llvm-svn: 292455
Jason Molenda [Thu, 19 Jan 2017 00:20:29 +0000 (00:20 +0000)]
Fix a problem with the new dyld interface code -- when a new process
starts up, we need to clear the target's image list and only add
the binaries into the target that are actually present in this
process run.
<rdar://problem/
29857613>
llvm-svn: 292454
Artem Belevich [Thu, 19 Jan 2017 00:14:45 +0000 (00:14 +0000)]
[NVPTX] Fix lowering of fp16 ISD::FNEG.
There's no neg.f16 instruction, so negation has to
be done via subtraction from zero.
Differential Revision: https://reviews.llvm.org/D28876
llvm-svn: 292452
Peter Collingbourne [Thu, 19 Jan 2017 00:04:44 +0000 (00:04 +0000)]
Add llvm-dis dependency to check-clang.
llvm-svn: 292450
Eli Friedman [Wed, 18 Jan 2017 23:56:42 +0000 (23:56 +0000)]
[SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly.
To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.
Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.
No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.
Differential Revision: https://reviews.llvm.org/D28587
llvm-svn: 292449
Peter Collingbourne [Wed, 18 Jan 2017 23:55:27 +0000 (23:55 +0000)]
Move vtable type metadata emission behind a cc1-level flag.
In ThinLTO mode, type metadata will require the module to be written as a
multi-module bitcode file, which is currently incompatible with the Darwin
linker. It is also useful to be able to enable or disable multi-module bitcode
for testing purposes. This introduces a cc1-level flag, -f{,no-}lto-unit,
which is used by the driver to enable multi-module bitcode on all but
Darwin+ThinLTO, and can also be used to enable/disable the feature manually.
Differential Revision: https://reviews.llvm.org/D28877
llvm-svn: 292448
Eli Friedman [Wed, 18 Jan 2017 23:26:37 +0000 (23:26 +0000)]
Preserve domtree and loop-simplify for runtime unrolling.
Mostly straightforward changes; we just didn't do the computation before.
One sort of interesting change in LoopUnroll.cpp: we weren't handling
dominance for children of the loop latch correctly, but
foldBlockIntoPredecessor hid the problem for complete unrolling.
Currently punting on loop peeling; made some minor changes to isolate
that problem to LoopUnrollPeel.cpp.
Adds a flag -unroll-verify-domtree; it verifies the domtree immediately
after we finish updating it. This is on by default for +Asserts builds.
Differential Revision: https://reviews.llvm.org/D28073
llvm-svn: 292447
Krzysztof Parzyszek [Wed, 18 Jan 2017 23:12:19 +0000 (23:12 +0000)]
Treat segment [B, E) as not overlapping block with boundaries [A, B)
llvm-svn: 292446
Krzysztof Parzyszek [Wed, 18 Jan 2017 23:11:40 +0000 (23:11 +0000)]
[Hexagon] Remove dead defs from the live set when expanding wstores
llvm-svn: 292445
Michael Kuperstein [Wed, 18 Jan 2017 23:05:58 +0000 (23:05 +0000)]
Revert r291670 because it introduces a crash.
r291670 doesn't crash on the original testcase from PR31589,
but it crashes on a slightly more complex one.
PR31589 has the new reproducer.
llvm-svn: 292444
Stephan T. Lavavej [Wed, 18 Jan 2017 22:19:14 +0000 (22:19 +0000)]
[libcxx] [test] Add msvc_stdlib_force_include.hpp.
No functional change; nothing includes this, instead our test harness
injects it via the /FI compiler option.
No code review; blessed in advance by EricWF.
llvm-svn: 292443
Mehdi Amini [Wed, 18 Jan 2017 21:37:11 +0000 (21:37 +0000)]
Improve the `-filter-print-funcs` option to skip the banner for CGSCC pass when nothing is to be printed
Before, it would print a sequence of:
*** IR Dump After Function Integration/Inlining ******
*** IR Dump After Function Integration/Inlining ******
*** IR Dump After Function Integration/Inlining ******
...
for every single function in the module.
llvm-svn: 292442
Sanjay Patel [Wed, 18 Jan 2017 21:31:21 +0000 (21:31 +0000)]
[InstCombine] add tests for shl nsw with icmp eq/ne; NFCI
These should be fixed with D28406.
llvm-svn: 292441
Sanjay Patel [Wed, 18 Jan 2017 21:16:12 +0000 (21:16 +0000)]
[InstCombine] add an assert to make a shl+icmp transform assumption explicit; NFCI
llvm-svn: 292440
David Blaikie [Wed, 18 Jan 2017 21:15:18 +0000 (21:15 +0000)]
Remove now redundant code that ensured debug info for class definitions was emitted under certain circumstances
Introduced in r181561 - it may've been subsumed by work done to allow
emission of declarations for vtable types while still emitting some of
their member functions correctly for those declarations. Whatever the
reason, the tests pass without this code now.
llvm-svn: 292439
Haicheng Wu [Wed, 18 Jan 2017 21:12:10 +0000 (21:12 +0000)]
[CodeGenPrepare] Fix a typo in the comment. NFC.
encode => endcode.
Differential Revision: https://reviews.llvm.org/D28866
llvm-svn: 292438
Arpith Chacko Jacob [Wed, 18 Jan 2017 20:40:48 +0000 (20:40 +0000)]
[OpenMP] Support for the if-clause on the combined directive 'target parallel'.
The if-clause on the combined directive potentially applies to both the
'target' and the 'parallel' regions. Codegen'ing the if-clause on the
combined directive requires additional support because the expression in
the clause must be captured by the 'target' capture statement but not
the 'parallel' capture statement. Note that this situation arises for
other clauses such as num_threads.
The OMPIfClause class inherits OMPClauseWithPreInit to support capturing
of expressions in the clause. A member CaptureRegion is added to
OMPClauseWithPreInit to indicate which captured statement (in this case
'target' but not 'parallel') captures these expressions.
To ensure correct codegen of captured expressions in the presence of
combined 'target' directives, OMPParallelScope was added to 'parallel'
codegen.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28781
llvm-svn: 292437
Graydon Hoare [Wed, 18 Jan 2017 20:36:59 +0000 (20:36 +0000)]
[ASTReader] Add a DeserializationListener callback for IMPORTED_MODULES
Summary:
Add a callback from ASTReader to DeserializationListener when the former
reads an IMPORTED_MODULES block. This supports Swift in using PCH for
bridging headers.
Reviewers: doug.gregor, manmanren, bruno
Reviewed By: manmanren
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28779
llvm-svn: 292436
Graydon Hoare [Wed, 18 Jan 2017 20:34:44 +0000 (20:34 +0000)]
[Modules] Correct test comment from obsolete earlier version of code. NFC
Summary:
Code committed in rL290219 went through a few iterations; test wound up with
stale comment.
Reviewers: doug.gregor, manmanren
Reviewed By: manmanren
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D28790
llvm-svn: 292435
Stephan T. Lavavej [Wed, 18 Jan 2017 20:10:25 +0000 (20:10 +0000)]
[libcxx] [test] Fix comment typos, strip trailing whitespace.
No functional change, no code review.
llvm-svn: 292434
Sanjay Patel [Wed, 18 Jan 2017 20:09:59 +0000 (20:09 +0000)]
[InstCombine] remove a redundant check; NFCI
I missed deleting this check when I refactored this chunk in:
https://reviews.llvm.org/rL292260
llvm-svn: 292433
Stephan T. Lavavej [Wed, 18 Jan 2017 20:09:56 +0000 (20:09 +0000)]
[libcxx] [test] Fix MSVC warnings C4127 and C6326 about constants.
MSVC has compiler warnings C4127 "conditional expression is constant" (enabled
by /W4) and C6326 "Potential comparison of a constant with another constant"
(enabled by /analyze). They're potentially useful, although they're slightly
annoying to library devs who know what they're doing. In the latest version of
the compiler, C4127 is suppressed when the compiler sees simple tests like
"if (name_of_thing)", so extracting comparison expressions into named
constants is a workaround. At the same time, using std::integral_constant
avoids C6326, which doesn't look at template arguments.
test/std/containers/sequences/vector.bool/emplace.pass.cpp
Replace 1 == 1 with true, which is the same as far as the library is concerned.
Fixes D28837.
llvm-svn: 292432
Peter Collingbourne [Wed, 18 Jan 2017 20:03:02 +0000 (20:03 +0000)]
ThinLTOBitcodeWriter: Clear comdats on filtered globals.
Differential Revision: https://reviews.llvm.org/D28839
llvm-svn: 292431
Peter Collingbourne [Wed, 18 Jan 2017 20:02:31 +0000 (20:02 +0000)]
Cloning: Copy comdats when cloning globals.
Differential Revision: https://reviews.llvm.org/D28838
llvm-svn: 292430
Arpith Chacko Jacob [Wed, 18 Jan 2017 19:35:00 +0000 (19:35 +0000)]
[OpenMP] Codegen for the 'target parallel' directive on the NVPTX device.
This patch adds codegen for the 'target parallel' directive on the NVPTX
device. We term offload OpenMP directives such as 'target parallel' and
'target teams distribute parallel for' as SPMD constructs. SPMD constructs,
in contrast to Generic ones like the plain 'target', can never contain
a serial region.
SPMD constructs can be handled more efficiently on the GPU and do not
require the Warp Loop of the Generic codegen scheme. This patch adds
SPMD codegen support for 'target parallel' on the NVPTX device and can
be reused for other SPMD constructs.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28755
llvm-svn: 292428
Richard Smith [Wed, 18 Jan 2017 19:19:22 +0000 (19:19 +0000)]
PR9551: Implement DR1004 (wg21.link/cwg1004).
This rule permits the injected-class-name of a class template to be used as
both a template type argument and a template template argument, with no extra
syntax required to disambiguate.
llvm-svn: 292426
Michael Kuperstein [Wed, 18 Jan 2017 19:05:48 +0000 (19:05 +0000)]
Fix up a comment. NFC.
llvm-svn: 292425
Michael Kuperstein [Wed, 18 Jan 2017 19:02:52 +0000 (19:02 +0000)]
[LV] Allow reductions that have several uses outside the loop
We currently check whether a reduction has a single outside user. We don't
really need to require that - we just need to make sure a single value is
used externally. The number of external users of that value shouldn't actually
matter.
Differential Revision: https://reviews.llvm.org/D28830
llvm-svn: 292424
Justin Bogner [Wed, 18 Jan 2017 19:01:58 +0000 (19:01 +0000)]
cmake: Only sanitize use-after-scope if the host compiler supports it
In r292256, we started adding -fsanitize-use-after-scope when using
the address sanitizer, but that flag wasn't always available. This
fixes the config to only add the flag if the host compiler supports
it.
llvm-svn: 292423
Evandro Menezes [Wed, 18 Jan 2017 18:57:08 +0000 (18:57 +0000)]
[AArch64] Generate literals by the little end
ARM seems to prefer that long literals be formed from their little end in
order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on
Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section
4.14).
Differential revision: https://reviews.llvm.org/D28697
llvm-svn: 292422
Davide Italiano [Wed, 18 Jan 2017 18:42:28 +0000 (18:42 +0000)]
[NewGVN] We don't use postdom info anymore. Update.
Differential Revision: https://reviews.llvm.org/D28842
llvm-svn: 292421
Mehdi Amini [Wed, 18 Jan 2017 18:36:21 +0000 (18:36 +0000)]
[ThinLTO] Add a recursive step in Metadata lazy-loading
Summary:
Without this, we're stressing the RAUW of unique nodes,
which is a costly operation. This is intended to limit
the number of RAUW, and is very effective on the total
link-time of opt with ThinLTO, before:
real 4m4.587s user 15m3.401s sys 0m23.616s
after:
real 3m25.261s user 12m22.132s sys 0m24.152s
Reviewers: tejohnson, pcc
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28751
llvm-svn: 292420
Arpith Chacko Jacob [Wed, 18 Jan 2017 18:18:53 +0000 (18:18 +0000)]
[OpenMP] Codegen support for 'target parallel' on the host.
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753
llvm-svn: 292419
Jonathan Roelofs [Wed, 18 Jan 2017 18:12:39 +0000 (18:12 +0000)]
Revert r286788
The Itanium ABI [1] specifies that __cxa_demangle accept either:
1) symbol names, which start with "_Z"
2) type manglings, which do not start with "_Z"
r286788 erroneously assumes that it should only handle symbols, so this patch
reverts it and adds a counterexample to the testcase.
1: https://mentorembedded.github.io/cxx-abi/abi.html#demangler
Reviewers: zygoloid, EricWF
llvm-svn: 292418
Graydon Hoare [Wed, 18 Jan 2017 18:12:20 +0000 (18:12 +0000)]
[lit] Support sharding testsuites, for parallel execution.
Summary:
This change equips lit.py with two new options, --num-shards=M and
--run-shard=N (set by default from env vars LIT_NUM_SHARDS and LIT_RUN_SHARD).
The options must be used together, and N must be in 1..M.
Together these options effect only test selection: they partition the testsuite
into M equal-sized "shards", then select only the Nth shard. They can be used
in a cluster of test machines to achieve a very crude (static) form of
parallelism, with minimal configuration work.
Reviewers: modocache, ddunbar
Reviewed By: ddunbar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28789
llvm-svn: 292417
Alexey Bataev [Wed, 18 Jan 2017 18:07:46 +0000 (18:07 +0000)]
[SLP] Add a tests for a fix for PR30787.
Add a test for PR30787: Failure to beneficially vectorize 'copyable'
elements in integer binary ops.
llvm-svn: 292416
Ehsan Akhgari [Wed, 18 Jan 2017 17:49:35 +0000 (17:49 +0000)]
[clang-tidy] Add -extra-arg and -extra-arg-before to run-clang-tidy.py
Summary:
These flags allow specifying extra arguments to the tool's command
line which don't appear in the compilation database.
Reviewers: alexfh
Differential Revision: https://reviews.llvm.org/D28334
llvm-svn: 292415
Pavel Labath [Wed, 18 Jan 2017 17:31:55 +0000 (17:31 +0000)]
Fix new Log unit test
the test was flaky because I specified the format string for the process id
incorrectly. This should fix it.
llvm-svn: 292414
Stanislav Mekhanoshin [Wed, 18 Jan 2017 17:30:05 +0000 (17:30 +0000)]
[AMDGPU] Do not allow register coalescer to create big superregs
Limit register coalescer by not allowing it to artificially increase
size of registers beyond dword. Such super-registers are in fact
register sequences and not distinct HW registers.
With more super-regs we would need to allocate adjacent registers
and constraint regalloc more than needed. Moreover, our super
registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2,
VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers
allocation even more, resulting in excessive spilling.
Differential Revision: https://reviews.llvm.org/D28782
llvm-svn: 292413
Justin Bogner [Wed, 18 Jan 2017 17:29:54 +0000 (17:29 +0000)]
GlobalISel: Implement narrowing for G_STORE
Legalize stores of types that are too wide by breaking them up into
sequences of smaller stores.
llvm-svn: 292412
Justin Bogner [Wed, 18 Jan 2017 17:28:41 +0000 (17:28 +0000)]
GlobalISel: Correct copy-pasted comment. NFC
llvm-svn: 292411
Kostya Kortchinsky [Wed, 18 Jan 2017 17:11:17 +0000 (17:11 +0000)]
[scudo] Refactor of CRC32 and ARM runtime CRC32 detection
Summary:
ARM & AArch64 runtime detection for hardware support of CRC32 has been added
via check of the AT_HWVAL auxiliary vector.
Following Michal's suggestions in D28417, the CRC32 code has been further
changed and looks better now. When compiled with full relro (which is strongly
suggested to benefit from additional hardening), the weak symbol for
computeHardwareCRC32 is read-only and the assembly generated is fairly clean
and straight forward. As suggested, an additional optimization is to skip
the runtime check if SSE 4.2 has been enabled globally, as opposed to only
for scudo_crc32.cpp.
scudo_crc32.h has no purpose anymore and was removed.
Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek
Reviewed By: rengolin, mgorny
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D28574
llvm-svn: 292409
Teresa Johnson [Wed, 18 Jan 2017 16:58:43 +0000 (16:58 +0000)]
Don't create a comdat group for a dropped def with initializer
Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to
available_externally when possible. If they had an initializer in the
global_ctors list, a comdat group was being created. This code
already had logic to skip available_externally defs, but now the
EliminateAvailableExternally pass will drop these symbols to
declarations earlier. Change the check to skip all declarations for
linker (which includes available_externally along with declarations).
Reviewers: mehdi_amini
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28737
llvm-svn: 292408
Kirill Bobyrev [Wed, 18 Jan 2017 16:34:25 +0000 (16:34 +0000)]
Revert 292404 due to buildbot failures.
llvm-svn: 292407
Benjamin Kramer [Wed, 18 Jan 2017 16:25:48 +0000 (16:25 +0000)]
[ASTUnit] Reset diag state when creating the ASTUnit.
A client could call this with a dirty diagnostic engine, don't crash.
llvm-svn: 292406
Benjamin Kramer [Wed, 18 Jan 2017 16:22:58 +0000 (16:22 +0000)]
[include-fixer] Don't return a correction if the header insertion failed.
This is could happen in cases involving macros and we don't want to
return an invalid fixit for it or a confusing error message with no
fixit.
llvm-svn: 292405
Kirill Bobyrev [Wed, 18 Jan 2017 16:15:47 +0000 (16:15 +0000)]
[X86] Minor code cleanup to fix several clang-tidy warnings. NFC
llvm-svn: 292404
Sam Parker [Wed, 18 Jan 2017 15:52:11 +0000 (15:52 +0000)]
[ARM] Create SubtargetFeatures from build attrs
An ELFObjectFile can now create SubtargetFeatures from the available
ARM build attributes, in a similar manner to MIPS. I've moved the
MIPS code into its own function and the ARM handler also has a
separate function.
Differential Revision: https://reviews.llvm.org/D28291
llvm-svn: 292403
Benjamin Kramer [Wed, 18 Jan 2017 15:50:26 +0000 (15:50 +0000)]
[Basic] Remove source manager references from diag state points.
This is just wasted space, we don't support state points from multiple
source managers. Validate that there's no state when resetting the
source manager and use the 'global' reference to the sourcemanager
instead of the ones in the diag state.
llvm-svn: 292402
Pavel Labath [Wed, 18 Jan 2017 15:46:50 +0000 (15:46 +0000)]
raw_fd_ostream: Make file handles non-inheritable by default
Summary:
This makes the file descriptors on unix platform non-inheritable (O_CLOEXEC).
There is no change in behavior on windows, as the handles were already
non-inheritable there.
Reviewers: rnk, rafael
Subscribers: llvm-commits, mgorny
Differential Revision: https://reviews.llvm.org/D28854
llvm-svn: 292401
Arpith Chacko Jacob [Wed, 18 Jan 2017 15:36:05 +0000 (15:36 +0000)]
Revert r292374 to debug Windows buildbot failure.
llvm-svn: 292400
Jonathan Roelofs [Wed, 18 Jan 2017 15:31:11 +0000 (15:31 +0000)]
Warn when calling a non interrupt function from an interrupt on ARM
The idea for this originated from a really tricky bug: ISRs on ARM don't
automatically save off the VFP regs, so if say, memcpy gets interrupted and the
ISR itself calls memcpy, the regs are left clobbered when the ISR is done.
https://reviews.llvm.org/D28820
llvm-svn: 292375
Arpith Chacko Jacob [Wed, 18 Jan 2017 15:14:52 +0000 (15:14 +0000)]
[OpenMP] Codegen support for 'target parallel' on the host.
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753
llvm-svn: 292374
Chad Rosier [Wed, 18 Jan 2017 15:02:54 +0000 (15:02 +0000)]
[Assembler] Fix crash when assembling .quad for AArch32.
A 64-bit relocation does not exist in 32-bit ARMELF. Report an error
instead of crashing.
PR23870
Patch by Sanne Wouda (sanwou01).
Differential Revision: https://reviews.llvm.org/D28851
llvm-svn: 292373
Florian Hahn [Wed, 18 Jan 2017 15:01:22 +0000 (15:01 +0000)]
[thumb,framelowering] Reset NoVRegs in Thumb1FrameLowering::emitPrologue.
Summary:
In this function, virtual registers can be introduced (for example
through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs
will replace those virtual registers with concrete registers later on
in PrologEpilogInserter, which sets NoVRegs again.
This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which
failed with expensive checks.
https://llvm.org/bugs/show_bug.cgi?id=27484
Reviewers: rnk, bkramer, olista01
Reviewed By: olista01
Subscribers: llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D28829
llvm-svn: 292372
Simon Pilgrim [Wed, 18 Jan 2017 14:47:49 +0000 (14:47 +0000)]
[InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles
Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded.
llvm-svn: 292371
Daniel Sanders [Wed, 18 Jan 2017 14:26:12 +0000 (14:26 +0000)]
Re-revert: [globalisel] Tablegen-erate current Register Bank Information
More missing guards. My build didn't notice it due to a stale file left over
from a Global ISel build.
llvm-svn: 292369
Simon Pilgrim [Wed, 18 Jan 2017 14:23:06 +0000 (14:23 +0000)]
[InstCombine][AVX2] Tests showing missed opportunities to pass demanded elts through a vpermd/vpermps shuffle
llvm-svn: 292368
Daniel Sanders [Wed, 18 Jan 2017 14:17:50 +0000 (14:17 +0000)]
Re-commit: [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.
Changes since last commit:
The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and
this should fix the buildbots however it may not be the whole fix. The previous
buildbot failures suggest there may be a memory bug lurking that I'm unable to
reproduce (including when using asan) or spot in the source. If they re-occur
on this commit then I'll need assistance from the bot owners to track it down.
Reviewers: t.p.northover, ab, rovka, qcolombet
Reviewed By: qcolombet
Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka
Differential Revision: https://reviews.llvm.org/D27338
llvm-svn: 292367
Sam Parker [Wed, 18 Jan 2017 13:52:12 +0000 (13:52 +0000)]
[ARM] Create objdump subtarget from build attrs
Enable an ELFObjectFile to read the its arm build attributes to
produce a target triple with a specific ARM architecture.
llvm-objdump now uses this functionality to automatically produce
a more accurate target.
Differential Revision: https://reviews.llvm.org/D28769
llvm-svn: 292366