platform/upstream/llvm.git
7 years ago[Sema] Reword unused lambda capture warning
Malcolm Parsons [Thu, 19 Jan 2017 17:19:22 +0000 (17:19 +0000)]
[Sema] Reword unused lambda capture warning

Summary:
The warning doesn't know why the variable was looked up but not
odr-used, so reword it to not claim that it was used in an unevaluated
context.

Reviewers: aaron.ballman

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D28902

llvm-svn: 292498

7 years ago[Sema] Fix PR28181 by avoiding calling BuildOverloadedBinOp in C mode
Alex Lorenz [Thu, 19 Jan 2017 17:17:57 +0000 (17:17 +0000)]
[Sema] Fix PR28181 by avoiding calling BuildOverloadedBinOp in C mode

rdar://28532840

Differential Revision: https://reviews.llvm.org/D25213

llvm-svn: 292497

7 years ago[Hexagon] Linux linker does not support .gnu-hash
Sumanth Gundapaneni [Thu, 19 Jan 2017 16:54:04 +0000 (16:54 +0000)]
[Hexagon] Linux linker does not support .gnu-hash

Hexagon Linux dynamic loader does not use (in fact does not support)
.gnu-hash

Differential Revision: https://reviews.llvm.org/D28865

llvm-svn: 292496

7 years ago[X86][SSE] Attempt to pre-truncate arithmetic operations that have already been extended
Simon Pilgrim [Thu, 19 Jan 2017 16:25:02 +0000 (16:25 +0000)]
[X86][SSE] Attempt to pre-truncate arithmetic operations that have already been extended

As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further

llvm-svn: 292493

7 years ago[InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1
Sanjay Patel [Thu, 19 Jan 2017 16:12:10 +0000 (16:12 +0000)]
[InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1

Try harder to fold icmp with shl nsw as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html

This is similar to the 'shl nuw' transforms that were added with D25913.

This may eventually help solve:
https://llvm.org/bugs/show_bug.cgi?id=30773

Differential Revision: https://reviews.llvm.org/D28406

llvm-svn: 292492

7 years ago[clang-tidy] Do not trigger move fix for non-copy assignment operators in performance...
Felix Berger [Thu, 19 Jan 2017 15:51:10 +0000 (15:51 +0000)]
[clang-tidy] Do not trigger move fix for non-copy assignment operators in performance-unnecessary-value-param check

Reviewers: alexfh, sbenza, malcolm.parsons

Subscribers: JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D28899

llvm-svn: 292491

7 years agoMark two of the TS implementations as 'in progress'
Marshall Clow [Thu, 19 Jan 2017 15:30:36 +0000 (15:30 +0000)]
Mark two of the TS implementations as 'in progress'

llvm-svn: 292490

7 years agoRefactor logging in NativeProcessLinux
Pavel Labath [Thu, 19 Jan 2017 15:26:04 +0000 (15:26 +0000)]
Refactor logging in NativeProcessLinux

Use the LLDB_LOG macro instead of the more verbose if(log) ... syntax.

I have also consolidated the log channels (everything now goes to the posix
channel, instead of a mixture of posix and lldb), and cleaned up some of the
more convoluted log statements.

llvm-svn: 292489

7 years agoAvoid unused variable warning when assert is disabled.
Hafiz Abid Qadeer [Thu, 19 Jan 2017 15:11:01 +0000 (15:11 +0000)]
Avoid unused variable warning when assert is disabled.

llvm-svn: 292488

7 years ago[X86][SSE] Added tests for pre-truncating arithmetic operations that have already...
Simon Pilgrim [Thu, 19 Jan 2017 15:03:00 +0000 (15:03 +0000)]
[X86][SSE] Added tests for pre-truncating arithmetic operations that have already been extended

As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further

llvm-svn: 292487

7 years agoBlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts
Tobias Grosser [Thu, 19 Jan 2017 14:12:45 +0000 (14:12 +0000)]
BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts

Before this change we created an additional reload in the copy of the incoming
block of a PHI node to reload the incoming value, even though the necessary
value has already been made available by the normally generated scalar loads.
In this change, we drop the code that generates this redundant reload and
instead just reuse the scalar value already available.

Besides making the generated code slightly cleaner, this change also makes sure
that scalar loads go through the normal logic, which means they can be remapped
(e.g. to array slots) and corresponding code is generated to load from the
remapped location. Without this change, the original scalar load at the
beginning of the non-affine region would have been remapped, but the redundant
scalar load would continue to load from the old PHI slot location.

It might be possible to further simplify the code in addOperandToPHI,
but this would not only mean to pull out getNewValue, but to also change the
insertion point update logic. As this did not work when trying it the first
time, this change is likely not trivial. To not introduce bugs last minute, we
postpone further simplications to a subsequent commit.

We also document the current behavior a little bit better.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D28892

llvm-svn: 292486

7 years ago[DAG] Don't increase SDNodeOrder for dbg.value/declare.
Mikael Holmen [Thu, 19 Jan 2017 13:55:55 +0000 (13:55 +0000)]
[DAG] Don't increase SDNodeOrder for dbg.value/declare.

Summary:
The SDNodeOrder is saved in the IROrder field in the SDNode, and this
field may affects scheduling. Thus, letting dbg.value/declare increase
the order numbers may in turn affect scheduling.

Because of this change we also need to update the code deciding when
dbg values should be output, in ScheduleDAGSDNodes.cpp/ProcessSDDbgValues.

Dbg values now have the same order as the SDNode they are connected to,
not the following orders.

Test cases provided by Florian Hahn.

Reviewers: bogner, aprantl, sunfish, atrick

Reviewed By: atrick

Subscribers: fhahn, probinson, andreadb, llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D25318

llvm-svn: 292485

7 years ago[docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing
Malcolm Parsons [Thu, 19 Jan 2017 13:38:19 +0000 (13:38 +0000)]
[docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing

Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.

Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html

Reviewers: aaron.ballman, klimek, alexfh

Subscribers: ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D28850

llvm-svn: 292484

7 years ago[docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing
Malcolm Parsons [Thu, 19 Jan 2017 13:37:42 +0000 (13:37 +0000)]
[docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing

Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.

Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html

Reviewers: aaron.ballman, klimek, alexfh

Subscribers: ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D28850

llvm-svn: 292483

7 years agoTest commit access, remove trailing whitespace
Mikael Holmen [Thu, 19 Jan 2017 13:35:13 +0000 (13:35 +0000)]
Test commit access, remove trailing whitespace

llvm-svn: 292482

7 years ago[GlobalISel] Pointers are legal operands for G_SELECT on AArch64
Kristof Beyls [Thu, 19 Jan 2017 13:32:14 +0000 (13:32 +0000)]
[GlobalISel] Pointers are legal operands for G_SELECT on AArch64

Differential Revision: https://reviews.llvm.org/D28805

llvm-svn: 292481

7 years agoBlockGenerator: remove obfuscating const and const casts
Tobias Grosser [Thu, 19 Jan 2017 13:25:52 +0000 (13:25 +0000)]
BlockGenerator: remove obfuscating const and const casts

Making certain values 'const' to just cast it away a little later mainly
obfuscates the code. Hence, we just drop the 'const' parts.

Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 292480

7 years agoRecommiting unsigned saturation with a bugfix.
Elena Demikhovsky [Thu, 19 Jan 2017 12:08:21 +0000 (12:08 +0000)]
Recommiting unsigned saturation with a bugfix.
A test case that crached is added to avx512-trunc.ll.
(PR31589)

llvm-svn: 292479

7 years agoRe-commit: [globalisel] Tablegen-erate current Register Bank Information
Daniel Sanders [Thu, 19 Jan 2017 11:15:55 +0000 (11:15 +0000)]
Re-commit: [globalisel] Tablegen-erate current Register Bank Information

Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Changes since first commit attempt:
* Added missing guards
* Added more missing guards
* Found and fixed a use-after-free bug involving Twine locals

Reviewers: t.p.northover, ab, rovka, qcolombet

Reviewed By: qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292478

7 years ago[docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing
Malcolm Parsons [Thu, 19 Jan 2017 09:27:45 +0000 (09:27 +0000)]
[docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing

Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.

Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html

Reviewers: aaron.ballman, klimek, alexfh

Subscribers: ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D28850

llvm-svn: 292477

7 years agoGlobalISel: Implement widening for shifts
Justin Bogner [Thu, 19 Jan 2017 07:51:17 +0000 (07:51 +0000)]
GlobalISel: Implement widening for shifts

llvm-svn: 292476

7 years ago[AVX-512] Add test cases that show where we are using two subvector inserts to broadc...
Craig Topper [Thu, 19 Jan 2017 07:37:45 +0000 (07:37 +0000)]
[AVX-512] Add test cases that show where we are using two subvector inserts to broadcast a 128-bit subvector into a 512-bit vector. We'd be better off using something like SHUFF32X4.

If the subvector comes from a load, we convert to SUBV_BROADCAST and use a broadcast instruction. But if there is no load we keep the inserts. I think we should create the SUBV_BROADCAST even without the load and let isel use the fallback patterns that are used if the load can't be folded. This will use the SHUFF32X4 or similar instruction for the 128-bit into 512-bit case and a single insert for 128 into 256 or 256 into 512.

This should be fixed so subvector broadcast intrinsics can be replaced with native IR since some of those currently lower directly to SHUFF32X4.

llvm-svn: 292475

7 years ago[AVX-512] Support ADD/SUB/MUL of mask vectors
Craig Topper [Thu, 19 Jan 2017 07:12:35 +0000 (07:12 +0000)]
[AVX-512] Support ADD/SUB/MUL of mask vectors

Summary:
Currently we expand and scalarize these operations, but I think we should be able to implement ADD/SUB with KXOR and MUL with KAND.

We already do this for scalar i1 operations so I just extended it to vectors of i1.

Reviewers: zvi, delena

Reviewed By: delena

Subscribers: guyblank, llvm-commits

Differential Revision: https://reviews.llvm.org/D28888

llvm-svn: 292474

7 years agoAMDGPU: Disable some fneg combines unless nsz
Matt Arsenault [Thu, 19 Jan 2017 06:35:27 +0000 (06:35 +0000)]
AMDGPU: Disable some fneg combines unless nsz

For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.

fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.

llvm-svn: 292473

7 years agoAMDGPU: Remove modifiers from v_div_scale_*
Matt Arsenault [Thu, 19 Jan 2017 06:04:12 +0000 (06:04 +0000)]
AMDGPU: Remove modifiers from v_div_scale_*

They seem to produce nonsense results when used.

This should be applied to the release branch.

llvm-svn: 292472

7 years agoUse range-based for loop [NFC]
Tobias Grosser [Thu, 19 Jan 2017 05:09:23 +0000 (05:09 +0000)]
Use range-based for loop [NFC]

llvm-svn: 292471

7 years agoImprove test coverage in test/Isl/CodeGen/loop_partially_in_scop.ll [NFC]
Tobias Grosser [Thu, 19 Jan 2017 04:54:45 +0000 (04:54 +0000)]
Improve test coverage in test/Isl/CodeGen/loop_partially_in_scop.ll [NFC]

We rename the test case with -metarenamer to make the variable names easier to
read and add additional check lines that verify the code we currently generate
for PHI nodes. This code is interesting as it contains a PHI node in a
non-affine sub-region, where some incoming blocks are within the non-affine
sub-region and others are outside of the non-affine subregion.

As can be seen in the check lines we currently load the PHI-node value twice.
This commit documents this behavior. In a subsequent patch we will try to
improve this.

llvm-svn: 292470

7 years ago[X86] Merge LowerADD and LowerSUB into a single LowerADD_SUB since they are identical.
Craig Topper [Thu, 19 Jan 2017 03:49:29 +0000 (03:49 +0000)]
[X86] Merge LowerADD and LowerSUB into a single LowerADD_SUB since they are identical.

llvm-svn: 292469

7 years ago[sancov] applying blacklist to covered points too
Mike Aizatsky [Thu, 19 Jan 2017 03:49:18 +0000 (03:49 +0000)]
[sancov] applying blacklist to covered points too

Differential Revision: https://reviews.llvm.org/D28872

llvm-svn: 292468

7 years agollvm-cxxfilt: filter out invalid manglings
Saleem Abdulrasool [Thu, 19 Jan 2017 02:58:46 +0000 (02:58 +0000)]
llvm-cxxfilt: filter out invalid manglings

c++filt does not attempt to demangle symbols which do not match its
expected format.  This means that the symbol must start with _Z or ___Z
(block invocation function extension).  Any other symbols are returned
as is.  Note that this is different from the behaviour of __cxa_demangle
which will demangle fragments.

llvm-svn: 292467

7 years ago[AVX-512] Use VSHUF instructions instead of two inserts as fallback for subvector...
Craig Topper [Thu, 19 Jan 2017 02:34:29 +0000 (02:34 +0000)]
[AVX-512] Use VSHUF instructions instead of two inserts as fallback for subvector broadcasts that can't fold the load.

llvm-svn: 292466

7 years ago[AVX-512] Add additional test cases for broadcast intrinsics that demonstates that...
Craig Topper [Thu, 19 Jan 2017 02:34:25 +0000 (02:34 +0000)]
[AVX-512] Add additional test cases for broadcast intrinsics that demonstates that we don't fold the loads to use a broadcast instruction.

llvm-svn: 292465

7 years ago[PM] Add LoopVectorize to the default module pipeline
Michael Kuperstein [Thu, 19 Jan 2017 02:21:54 +0000 (02:21 +0000)]
[PM] Add LoopVectorize to the default module pipeline

LV no longer "requires" LCSSA and LoopSimplify, and instead forms
them internally as required. So, there's nothing preventing it from
being enabled.

llvm-svn: 292464

7 years agoLowerTypeTests: Implement exporting of type identifiers.
Peter Collingbourne [Thu, 19 Jan 2017 01:20:11 +0000 (01:20 +0000)]
LowerTypeTests: Implement exporting of type identifiers.

Type identifiers are exported by:
- Adding coarse-grained information about how to test the type
  identifier to the summary.
- Creating symbols in the object file (aliases and absolute symbols)
  containing fine-grained information about the type identifier.

Differential Revision: https://reviews.llvm.org/D28424

llvm-svn: 292462

7 years agoGlobalISel: Implement narrowing for G_LOAD
Justin Bogner [Thu, 19 Jan 2017 01:05:48 +0000 (01:05 +0000)]
GlobalISel: Implement narrowing for G_LOAD

llvm-svn: 292461

7 years agoGlobalISel: Fix text wrapping in a comment. NFC
Justin Bogner [Thu, 19 Jan 2017 01:04:46 +0000 (01:04 +0000)]
GlobalISel: Fix text wrapping in a comment. NFC

llvm-svn: 292460

7 years agoUse an actual valid register in test
Matthias Braun [Thu, 19 Jan 2017 01:04:08 +0000 (01:04 +0000)]
Use an actual valid register in test

llvm-svn: 292459

7 years agoAdd -fdebug-info-for-profiling to emit more debug info for sample pgo profile collection
Dehao Chen [Thu, 19 Jan 2017 00:44:21 +0000 (00:44 +0000)]
Add -fdebug-info-for-profiling to emit more debug info for sample pgo profile collection

Summary:
SamplePGO uses profile with debug info to collect profile. Unlike the traditional debugging purpose, sample pgo needs more accurate debug info to represent the profile. We add -femit-accurate-debug-info for this purpose. It can be combined with all debugging modes (-g, -gmlt, etc). It makes sure that the following pieces of info is always emitted:

* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)

The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):

               -gmlt(orig) -gmlt(patched) -g
433.milc       4.68%       5.40%          19.73%
444.namd       8.45%       8.93%          45.99%
447.dealII     97.43%      115.21%        374.89%
450.soplex     27.75%      31.88%         126.04%
453.povray     21.81%      26.16%         92.03%
470.lbm        0.60%       0.67%          1.96%
482.sphinx3    5.77%       6.47%          26.17%
400.perlbench  17.81%      19.43%         73.08%
401.bzip2      3.73%       3.92%          12.18%
403.gcc        31.75%      34.48%         122.75%
429.mcf        0.78%       0.88%          3.89%
445.gobmk      6.08%       7.92%          42.27%
456.hmmer      10.36%      11.25%         35.23%
458.sjeng      5.08%       5.42%          14.36%
462.libquantum 1.71%       1.96%          6.36%
464.h264ref    15.61%      16.56%         43.92%
471.omnetpp    11.93%      15.84%         60.09%
473.astar      3.11%       3.69%          14.18%
483.xalancbmk  56.29%      81.63%         353.22%
geomean        15.60%      18.30%         57.81%

Debug info size change for -gmlt binary with this patch:

433.milc       13.46%
444.namd       5.35%
447.dealII     18.21%
450.soplex     14.68%
453.povray     19.65%
470.lbm        6.03%
482.sphinx3    11.21%
400.perlbench  8.91%
401.bzip2      4.41%
403.gcc        8.56%
429.mcf        8.24%
445.gobmk      29.47%
456.hmmer      8.19%
458.sjeng      6.05%
462.libquantum 11.23%
464.h264ref    5.93%
471.omnetpp    31.89%
473.astar      16.20%
483.xalancbmk  44.62%
geomean        16.83%

Reviewers: davidxl, andreadb, rob.lougher, dblaikie, echristo

Reviewed By: dblaikie, echristo

Subscribers: hfinkel, rob.lougher, andreadb, gbedwell, cfe-commits, probinson, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25435

llvm-svn: 292458

7 years agoAdd -debug-info-for-profiling to emit more debug info for sample pgo profile collection
Dehao Chen [Thu, 19 Jan 2017 00:44:11 +0000 (00:44 +0000)]
Add -debug-info-for-profiling to emit more debug info for sample pgo profile collection

Summary:
SamplePGO binaries built with -gmlt to collect profile. The current -gmlt debug info is limited, and we need some additional info:

* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)

This patch adds these information to the -gmlt binary. The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):

               -gmlt(orig) -gmlt(patched) -g
433.milc       4.68%       5.40%          19.73%
444.namd       8.45%       8.93%          45.99%
447.dealII     97.43%      115.21%        374.89%
450.soplex     27.75%      31.88%         126.04%
453.povray     21.81%      26.16%         92.03%
470.lbm        0.60%       0.67%          1.96%
482.sphinx3    5.77%       6.47%          26.17%
400.perlbench  17.81%      19.43%         73.08%
401.bzip2      3.73%       3.92%          12.18%
403.gcc        31.75%      34.48%         122.75%
429.mcf        0.78%       0.88%          3.89%
445.gobmk      6.08%       7.92%          42.27%
456.hmmer      10.36%      11.25%         35.23%
458.sjeng      5.08%       5.42%          14.36%
462.libquantum 1.71%       1.96%          6.36%
464.h264ref    15.61%      16.56%         43.92%
471.omnetpp    11.93%      15.84%         60.09%
473.astar      3.11%       3.69%          14.18%
483.xalancbmk  56.29%      81.63%         353.22%
geomean        15.60%      18.30%         57.81%

Debug info size change for -gmlt binary with this patch:

433.milc       13.46%
444.namd       5.35%
447.dealII     18.21%
450.soplex     14.68%
453.povray     19.65%
470.lbm        6.03%
482.sphinx3    11.21%
400.perlbench  8.91%
401.bzip2      4.41%
403.gcc        8.56%
429.mcf        8.24%
445.gobmk      29.47%
456.hmmer      8.19%
458.sjeng      6.05%
462.libquantum 11.23%
464.h264ref    5.93%
471.omnetpp    31.89%
473.astar      16.20%
483.xalancbmk  44.62%
geomean        16.83%

Reviewers: davidxl, echristo, dblaikie

Reviewed By: echristo, dblaikie

Subscribers: aprantl, probinson, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25434

llvm-svn: 292457

7 years ago[LV] Run loop-simplify and LCSSA explicitly instead of "requiring" them
Michael Kuperstein [Thu, 19 Jan 2017 00:42:28 +0000 (00:42 +0000)]
[LV] Run loop-simplify and LCSSA explicitly instead of "requiring" them

This changes the vectorizer to explicitly use the loopsimplify and lcssa utils,
instead of "requiring" the transformations as if they were analyses.

This is not NFC, since it changes the LCSSA behavior - we no longer run LCSSA
for all loops, but rather only for the loops we expect to modify.

Differential Revision: https://reviews.llvm.org/D28868

llvm-svn: 292456

7 years agoLiveIntervalAnalysis: Cleanup; NFC
Matthias Braun [Thu, 19 Jan 2017 00:32:13 +0000 (00:32 +0000)]
LiveIntervalAnalysis: Cleanup; NFC

- Fix doxygen comments: Do not repeat name, remove duplicated doxygen
  comment (on declaration + implementation), etc.
- Use more range based for

llvm-svn: 292455

7 years agoFix a problem with the new dyld interface code -- when a new process
Jason Molenda [Thu, 19 Jan 2017 00:20:29 +0000 (00:20 +0000)]
Fix a problem with the new dyld interface code -- when a new process
starts up, we need to clear the target's image list and only add
the binaries into the target that are actually present in this
process run.

<rdar://problem/29857613>

llvm-svn: 292454

7 years ago[NVPTX] Fix lowering of fp16 ISD::FNEG.
Artem Belevich [Thu, 19 Jan 2017 00:14:45 +0000 (00:14 +0000)]
[NVPTX] Fix lowering of fp16 ISD::FNEG.

There's no neg.f16 instruction, so negation has to
be done via subtraction from zero.

Differential Revision: https://reviews.llvm.org/D28876

llvm-svn: 292452

7 years agoAdd llvm-dis dependency to check-clang.
Peter Collingbourne [Thu, 19 Jan 2017 00:04:44 +0000 (00:04 +0000)]
Add llvm-dis dependency to check-clang.

llvm-svn: 292450

7 years ago[SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly.
Eli Friedman [Wed, 18 Jan 2017 23:56:42 +0000 (23:56 +0000)]
[SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly.

To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.

Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.

No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.

Differential Revision: https://reviews.llvm.org/D28587

llvm-svn: 292449

7 years agoMove vtable type metadata emission behind a cc1-level flag.
Peter Collingbourne [Wed, 18 Jan 2017 23:55:27 +0000 (23:55 +0000)]
Move vtable type metadata emission behind a cc1-level flag.

In ThinLTO mode, type metadata will require the module to be written as a
multi-module bitcode file, which is currently incompatible with the Darwin
linker. It is also useful to be able to enable or disable multi-module bitcode
for testing purposes. This introduces a cc1-level flag, -f{,no-}lto-unit,
which is used by the driver to enable multi-module bitcode on all but
Darwin+ThinLTO, and can also be used to enable/disable the feature manually.

Differential Revision: https://reviews.llvm.org/D28877

llvm-svn: 292448

7 years agoPreserve domtree and loop-simplify for runtime unrolling.
Eli Friedman [Wed, 18 Jan 2017 23:26:37 +0000 (23:26 +0000)]
Preserve domtree and loop-simplify for runtime unrolling.

Mostly straightforward changes; we just didn't do the computation before.
One sort of interesting change in LoopUnroll.cpp: we weren't handling
dominance for children of the loop latch correctly, but
foldBlockIntoPredecessor hid the problem for complete unrolling.

Currently punting on loop peeling; made some minor changes to isolate
that problem to LoopUnrollPeel.cpp.

Adds a flag -unroll-verify-domtree; it verifies the domtree immediately
after we finish updating it. This is on by default for +Asserts builds.

Differential Revision: https://reviews.llvm.org/D28073

llvm-svn: 292447

7 years agoTreat segment [B, E) as not overlapping block with boundaries [A, B)
Krzysztof Parzyszek [Wed, 18 Jan 2017 23:12:19 +0000 (23:12 +0000)]
Treat segment [B, E) as not overlapping block with boundaries [A, B)

llvm-svn: 292446

7 years ago[Hexagon] Remove dead defs from the live set when expanding wstores
Krzysztof Parzyszek [Wed, 18 Jan 2017 23:11:40 +0000 (23:11 +0000)]
[Hexagon] Remove dead defs from the live set when expanding wstores

llvm-svn: 292445

7 years agoRevert r291670 because it introduces a crash.
Michael Kuperstein [Wed, 18 Jan 2017 23:05:58 +0000 (23:05 +0000)]
Revert r291670 because it introduces a crash.

r291670 doesn't crash on the original testcase from PR31589,
but it crashes on a slightly more complex one.

PR31589 has the new reproducer.

llvm-svn: 292444

7 years ago[libcxx] [test] Add msvc_stdlib_force_include.hpp.
Stephan T. Lavavej [Wed, 18 Jan 2017 22:19:14 +0000 (22:19 +0000)]
[libcxx] [test] Add msvc_stdlib_force_include.hpp.

No functional change; nothing includes this, instead our test harness
injects it via the /FI compiler option.

No code review; blessed in advance by EricWF.

llvm-svn: 292443

7 years agoImprove the `-filter-print-funcs` option to skip the banner for CGSCC pass when nothi...
Mehdi Amini [Wed, 18 Jan 2017 21:37:11 +0000 (21:37 +0000)]
Improve the `-filter-print-funcs` option to skip the banner for CGSCC pass when nothing is to be printed

Before, it would print a sequence of:

  *** IR Dump After Function Integration/Inlining ******
  *** IR Dump After Function Integration/Inlining ******
  *** IR Dump After Function Integration/Inlining ******
  ...

for every single function in the module.

llvm-svn: 292442

7 years ago[InstCombine] add tests for shl nsw with icmp eq/ne; NFCI
Sanjay Patel [Wed, 18 Jan 2017 21:31:21 +0000 (21:31 +0000)]
[InstCombine] add tests for shl nsw with icmp eq/ne; NFCI

These should be fixed with D28406.

llvm-svn: 292441

7 years ago[InstCombine] add an assert to make a shl+icmp transform assumption explicit; NFCI
Sanjay Patel [Wed, 18 Jan 2017 21:16:12 +0000 (21:16 +0000)]
[InstCombine] add an assert to make a shl+icmp transform assumption explicit; NFCI

llvm-svn: 292440

7 years agoRemove now redundant code that ensured debug info for class definitions was emitted...
David Blaikie [Wed, 18 Jan 2017 21:15:18 +0000 (21:15 +0000)]
Remove now redundant code that ensured debug info for class definitions was emitted under certain circumstances

Introduced in r181561 - it may've been subsumed by work done to allow
emission of declarations for vtable types while still emitting some of
their member functions correctly for those declarations. Whatever the
reason, the tests pass without this code now.

llvm-svn: 292439

7 years ago[CodeGenPrepare] Fix a typo in the comment. NFC.
Haicheng Wu [Wed, 18 Jan 2017 21:12:10 +0000 (21:12 +0000)]
[CodeGenPrepare] Fix a typo in the comment. NFC.

encode => endcode.

Differential Revision: https://reviews.llvm.org/D28866

llvm-svn: 292438

7 years ago[OpenMP] Support for the if-clause on the combined directive 'target parallel'.
Arpith Chacko Jacob [Wed, 18 Jan 2017 20:40:48 +0000 (20:40 +0000)]
[OpenMP] Support for the if-clause on the combined directive 'target parallel'.

The if-clause on the combined directive potentially applies to both the
'target' and the 'parallel' regions.  Codegen'ing the if-clause on the
combined directive requires additional support because the expression in
the clause must be captured by the 'target' capture statement but not
the 'parallel' capture statement.  Note that this situation arises for
other clauses such as num_threads.

The OMPIfClause class inherits OMPClauseWithPreInit to support capturing
of expressions in the clause.  A member CaptureRegion is added to
OMPClauseWithPreInit to indicate which captured statement (in this case
'target' but not 'parallel') captures these expressions.

To ensure correct codegen of captured expressions in the presence of
combined 'target' directives, OMPParallelScope was added to 'parallel'
codegen.

Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28781

llvm-svn: 292437

7 years ago[ASTReader] Add a DeserializationListener callback for IMPORTED_MODULES
Graydon Hoare [Wed, 18 Jan 2017 20:36:59 +0000 (20:36 +0000)]
[ASTReader] Add a DeserializationListener callback for IMPORTED_MODULES

Summary:
Add a callback from ASTReader to DeserializationListener when the former
reads an IMPORTED_MODULES block. This supports Swift in using PCH for
bridging headers.

Reviewers: doug.gregor, manmanren, bruno

Reviewed By: manmanren

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D28779

llvm-svn: 292436

7 years ago[Modules] Correct test comment from obsolete earlier version of code. NFC
Graydon Hoare [Wed, 18 Jan 2017 20:34:44 +0000 (20:34 +0000)]
[Modules] Correct test comment from obsolete earlier version of code. NFC

Summary:
Code committed in rL290219 went through a few iterations; test wound up with
stale comment.

Reviewers: doug.gregor, manmanren

Reviewed By: manmanren

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D28790

llvm-svn: 292435

7 years ago[libcxx] [test] Fix comment typos, strip trailing whitespace.
Stephan T. Lavavej [Wed, 18 Jan 2017 20:10:25 +0000 (20:10 +0000)]
[libcxx] [test] Fix comment typos, strip trailing whitespace.

No functional change, no code review.

llvm-svn: 292434

7 years ago[InstCombine] remove a redundant check; NFCI
Sanjay Patel [Wed, 18 Jan 2017 20:09:59 +0000 (20:09 +0000)]
[InstCombine] remove a redundant check; NFCI

I missed deleting this check when I refactored this chunk in:
https://reviews.llvm.org/rL292260

llvm-svn: 292433

7 years ago[libcxx] [test] Fix MSVC warnings C4127 and C6326 about constants.
Stephan T. Lavavej [Wed, 18 Jan 2017 20:09:56 +0000 (20:09 +0000)]
[libcxx] [test] Fix MSVC warnings C4127 and C6326 about constants.

MSVC has compiler warnings C4127 "conditional expression is constant" (enabled
by /W4) and C6326 "Potential comparison of a constant with another constant"
(enabled by /analyze). They're potentially useful, although they're slightly
annoying to library devs who know what they're doing. In the latest version of
the compiler, C4127 is suppressed when the compiler sees simple tests like
"if (name_of_thing)", so extracting comparison expressions into named
constants is a workaround. At the same time, using std::integral_constant
avoids C6326, which doesn't look at template arguments.

test/std/containers/sequences/vector.bool/emplace.pass.cpp
Replace 1 == 1 with true, which is the same as far as the library is concerned.

Fixes D28837.

llvm-svn: 292432

7 years agoThinLTOBitcodeWriter: Clear comdats on filtered globals.
Peter Collingbourne [Wed, 18 Jan 2017 20:03:02 +0000 (20:03 +0000)]
ThinLTOBitcodeWriter: Clear comdats on filtered globals.

Differential Revision: https://reviews.llvm.org/D28839

llvm-svn: 292431

7 years agoCloning: Copy comdats when cloning globals.
Peter Collingbourne [Wed, 18 Jan 2017 20:02:31 +0000 (20:02 +0000)]
Cloning: Copy comdats when cloning globals.

Differential Revision: https://reviews.llvm.org/D28838

llvm-svn: 292430

7 years ago[OpenMP] Codegen for the 'target parallel' directive on the NVPTX device.
Arpith Chacko Jacob [Wed, 18 Jan 2017 19:35:00 +0000 (19:35 +0000)]
[OpenMP] Codegen for the 'target parallel' directive on the NVPTX device.

This patch adds codegen for the 'target parallel' directive on the NVPTX
device.  We term offload OpenMP directives such as 'target parallel' and
'target teams distribute parallel for' as SPMD constructs.  SPMD constructs,
in contrast to Generic ones like the plain 'target', can never contain
a serial region.

SPMD constructs can be handled more efficiently on the GPU and do not
require the Warp Loop of the Generic codegen scheme. This patch adds
SPMD codegen support for 'target parallel' on the NVPTX device and can
be reused for other SPMD constructs.

Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28755

llvm-svn: 292428

7 years agoPR9551: Implement DR1004 (http://wg21.link/cwg1004).
Richard Smith [Wed, 18 Jan 2017 19:19:22 +0000 (19:19 +0000)]
PR9551: Implement DR1004 (wg21.link/cwg1004).

This rule permits the injected-class-name of a class template to be used as
both a template type argument and a template template argument, with no extra
syntax required to disambiguate.

llvm-svn: 292426

7 years agoFix up a comment. NFC.
Michael Kuperstein [Wed, 18 Jan 2017 19:05:48 +0000 (19:05 +0000)]
Fix up a comment. NFC.

llvm-svn: 292425

7 years ago[LV] Allow reductions that have several uses outside the loop
Michael Kuperstein [Wed, 18 Jan 2017 19:02:52 +0000 (19:02 +0000)]
[LV] Allow reductions that have several uses outside the loop

We currently check whether a reduction has a single outside user. We don't
really need to require that - we just need to make sure a single value is
used externally. The number of external users of that value shouldn't actually
matter.

Differential Revision: https://reviews.llvm.org/D28830

llvm-svn: 292424

7 years agocmake: Only sanitize use-after-scope if the host compiler supports it
Justin Bogner [Wed, 18 Jan 2017 19:01:58 +0000 (19:01 +0000)]
cmake: Only sanitize use-after-scope if the host compiler supports it

In r292256, we started adding -fsanitize-use-after-scope when using
the address sanitizer, but that flag wasn't always available. This
fixes the config to only add the flag if the host compiler supports
it.

llvm-svn: 292423

7 years ago[AArch64] Generate literals by the little end
Evandro Menezes [Wed, 18 Jan 2017 18:57:08 +0000 (18:57 +0000)]
[AArch64] Generate literals by the little end

ARM seems to prefer that long literals be formed from their little end in
order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on
Cortex A57 and others (v.  "Cortex A57 Software Optimisation Guide", section
4.14).

Differential revision: https://reviews.llvm.org/D28697

llvm-svn: 292422

7 years ago[NewGVN] We don't use postdom info anymore. Update.
Davide Italiano [Wed, 18 Jan 2017 18:42:28 +0000 (18:42 +0000)]
[NewGVN] We don't use postdom info anymore. Update.

Differential Revision:  https://reviews.llvm.org/D28842

llvm-svn: 292421

7 years ago[ThinLTO] Add a recursive step in Metadata lazy-loading
Mehdi Amini [Wed, 18 Jan 2017 18:36:21 +0000 (18:36 +0000)]
[ThinLTO] Add a recursive step in Metadata lazy-loading

Summary:
Without this, we're stressing the RAUW of unique nodes,
which is a costly operation. This is intended to limit
the number of RAUW, and is very effective on the total
link-time of opt with ThinLTO, before:

  real 4m4.587s  user 15m3.401s  sys 0m23.616s

after:

  real 3m25.261s user 12m22.132s sys 0m24.152s

Reviewers: tejohnson, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28751

llvm-svn: 292420

7 years ago[OpenMP] Codegen support for 'target parallel' on the host.
Arpith Chacko Jacob [Wed, 18 Jan 2017 18:18:53 +0000 (18:18 +0000)]
[OpenMP] Codegen support for 'target parallel' on the host.

This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements.  Support for this functionality is included in
the patch.

A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region.  Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp).  For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not.  The patch adds support for handling multiple captured
statements based on the combined directive.

When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.

Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753

llvm-svn: 292419

7 years agoRevert r286788
Jonathan Roelofs [Wed, 18 Jan 2017 18:12:39 +0000 (18:12 +0000)]
Revert r286788

The Itanium ABI [1] specifies that __cxa_demangle accept either:

   1) symbol names, which start with "_Z"
   2) type manglings, which do not start with "_Z"

r286788 erroneously assumes that it should only handle symbols, so this patch
reverts it and adds a counterexample to the testcase.

1: https://mentorembedded.github.io/cxx-abi/abi.html#demangler

Reviewers: zygoloid, EricWF
llvm-svn: 292418

7 years ago[lit] Support sharding testsuites, for parallel execution.
Graydon Hoare [Wed, 18 Jan 2017 18:12:20 +0000 (18:12 +0000)]
[lit] Support sharding testsuites, for parallel execution.

Summary:
This change equips lit.py with two new options, --num-shards=M and
--run-shard=N (set by default from env vars LIT_NUM_SHARDS and LIT_RUN_SHARD).

The options must be used together, and N must be in 1..M.

Together these options effect only test selection: they partition the testsuite
into M equal-sized "shards", then select only the Nth shard. They can be used
in a cluster of test machines to achieve a very crude (static) form of
parallelism, with minimal configuration work.

Reviewers: modocache, ddunbar

Reviewed By: ddunbar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28789

llvm-svn: 292417

7 years ago[SLP] Add a tests for a fix for PR30787.
Alexey Bataev [Wed, 18 Jan 2017 18:07:46 +0000 (18:07 +0000)]
[SLP] Add a tests for a fix for PR30787.

Add a test for PR30787: Failure to beneficially vectorize 'copyable'
elements in integer binary ops.

llvm-svn: 292416

7 years ago[clang-tidy] Add -extra-arg and -extra-arg-before to run-clang-tidy.py
Ehsan Akhgari [Wed, 18 Jan 2017 17:49:35 +0000 (17:49 +0000)]
[clang-tidy] Add -extra-arg and -extra-arg-before to run-clang-tidy.py

Summary:
These flags allow specifying extra arguments to the tool's command
line which don't appear in the compilation database.

Reviewers: alexfh

Differential Revision: https://reviews.llvm.org/D28334

llvm-svn: 292415

7 years agoFix new Log unit test
Pavel Labath [Wed, 18 Jan 2017 17:31:55 +0000 (17:31 +0000)]
Fix new Log unit test

the test was flaky because I specified the format string for the process id
incorrectly. This should fix it.

llvm-svn: 292414

7 years ago[AMDGPU] Do not allow register coalescer to create big superregs
Stanislav Mekhanoshin [Wed, 18 Jan 2017 17:30:05 +0000 (17:30 +0000)]
[AMDGPU] Do not allow register coalescer to create big superregs

Limit register coalescer by not allowing it to artificially increase
size of registers beyond dword. Such super-registers are in fact
register sequences and not distinct HW registers.

With more super-regs we would need to allocate adjacent registers
and constraint regalloc more than needed. Moreover, our super
registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2,
VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers
allocation even more, resulting in excessive spilling.

Differential Revision: https://reviews.llvm.org/D28782

llvm-svn: 292413

7 years agoGlobalISel: Implement narrowing for G_STORE
Justin Bogner [Wed, 18 Jan 2017 17:29:54 +0000 (17:29 +0000)]
GlobalISel: Implement narrowing for G_STORE

Legalize stores of types that are too wide by breaking them up into
sequences of smaller stores.

llvm-svn: 292412

7 years agoGlobalISel: Correct copy-pasted comment. NFC
Justin Bogner [Wed, 18 Jan 2017 17:28:41 +0000 (17:28 +0000)]
GlobalISel: Correct copy-pasted comment. NFC

llvm-svn: 292411

7 years ago[scudo] Refactor of CRC32 and ARM runtime CRC32 detection
Kostya Kortchinsky [Wed, 18 Jan 2017 17:11:17 +0000 (17:11 +0000)]
[scudo] Refactor of CRC32 and ARM runtime CRC32 detection

Summary:
ARM & AArch64 runtime detection for hardware support of CRC32 has been added
via check of the AT_HWVAL auxiliary vector.

Following Michal's suggestions in D28417, the CRC32 code has been further
changed and looks better now. When compiled with full relro (which is strongly
suggested to benefit from additional hardening), the weak symbol for
computeHardwareCRC32 is read-only and the assembly generated is fairly clean
and straight forward. As suggested, an additional optimization is to skip
the runtime check if SSE 4.2 has been enabled globally, as opposed to only
for scudo_crc32.cpp.

scudo_crc32.h has no purpose anymore and was removed.

Reviewers: alekseyshl, kcc, rengolin, mgorny, phosek

Reviewed By: rengolin, mgorny

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D28574

llvm-svn: 292409

7 years agoDon't create a comdat group for a dropped def with initializer
Teresa Johnson [Wed, 18 Jan 2017 16:58:43 +0000 (16:58 +0000)]
Don't create a comdat group for a dropped def with initializer

Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to
available_externally when possible. If they had an initializer in the
global_ctors list, a comdat group was being created. This code
already had logic to skip available_externally defs, but now the
EliminateAvailableExternally pass will drop these symbols to
declarations earlier. Change the check to skip all declarations for
linker (which includes available_externally along with declarations).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28737

llvm-svn: 292408

7 years agoRevert 292404 due to buildbot failures.
Kirill Bobyrev [Wed, 18 Jan 2017 16:34:25 +0000 (16:34 +0000)]
Revert 292404 due to buildbot failures.

llvm-svn: 292407

7 years ago[ASTUnit] Reset diag state when creating the ASTUnit.
Benjamin Kramer [Wed, 18 Jan 2017 16:25:48 +0000 (16:25 +0000)]
[ASTUnit] Reset diag state when creating the ASTUnit.

A client could call this with a dirty diagnostic engine, don't crash.

llvm-svn: 292406

7 years ago[include-fixer] Don't return a correction if the header insertion failed.
Benjamin Kramer [Wed, 18 Jan 2017 16:22:58 +0000 (16:22 +0000)]
[include-fixer] Don't return a correction if the header insertion failed.

This is could happen in cases involving macros and we don't want to
return an invalid fixit for it or a confusing error message with no
fixit.

llvm-svn: 292405

7 years ago[X86] Minor code cleanup to fix several clang-tidy warnings. NFC
Kirill Bobyrev [Wed, 18 Jan 2017 16:15:47 +0000 (16:15 +0000)]
[X86] Minor code cleanup to fix several clang-tidy warnings. NFC

llvm-svn: 292404

7 years ago[ARM] Create SubtargetFeatures from build attrs
Sam Parker [Wed, 18 Jan 2017 15:52:11 +0000 (15:52 +0000)]
[ARM] Create SubtargetFeatures from build attrs

An ELFObjectFile can now create SubtargetFeatures from the available
ARM build attributes, in a similar manner to MIPS. I've moved the
MIPS code into its own function and the ARM handler also has a
separate function.

Differential Revision: https://reviews.llvm.org/D28291

llvm-svn: 292403

7 years ago[Basic] Remove source manager references from diag state points.
Benjamin Kramer [Wed, 18 Jan 2017 15:50:26 +0000 (15:50 +0000)]
[Basic] Remove source manager references from diag state points.

This is just wasted space, we don't support state points from multiple
source managers. Validate that there's no state when resetting the
source manager and use the 'global' reference to the sourcemanager
instead of the ones in the diag state.

llvm-svn: 292402

7 years agoraw_fd_ostream: Make file handles non-inheritable by default
Pavel Labath [Wed, 18 Jan 2017 15:46:50 +0000 (15:46 +0000)]
raw_fd_ostream: Make file handles non-inheritable by default

Summary:
This makes the file descriptors on unix platform non-inheritable (O_CLOEXEC).

There is no change in behavior on windows, as the handles were already
non-inheritable there.

Reviewers: rnk, rafael

Subscribers: llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D28854

llvm-svn: 292401

7 years agoRevert r292374 to debug Windows buildbot failure.
Arpith Chacko Jacob [Wed, 18 Jan 2017 15:36:05 +0000 (15:36 +0000)]
Revert r292374 to debug Windows buildbot failure.

llvm-svn: 292400

7 years agoWarn when calling a non interrupt function from an interrupt on ARM
Jonathan Roelofs [Wed, 18 Jan 2017 15:31:11 +0000 (15:31 +0000)]
Warn when calling a non interrupt function from an interrupt on ARM

The idea for this originated from a really tricky bug: ISRs on ARM don't
automatically save off the VFP regs, so if say, memcpy gets interrupted and the
ISR itself calls memcpy, the regs are left clobbered when the ISR is done.

https://reviews.llvm.org/D28820

llvm-svn: 292375

7 years ago[OpenMP] Codegen support for 'target parallel' on the host.
Arpith Chacko Jacob [Wed, 18 Jan 2017 15:14:52 +0000 (15:14 +0000)]
[OpenMP] Codegen support for 'target parallel' on the host.

This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements.  Support for this functionality is included in
the patch.

A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region.  Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp).  For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not.  The patch adds support for handling multiple captured
statements based on the combined directive.

When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.

Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753

llvm-svn: 292374

7 years ago[Assembler] Fix crash when assembling .quad for AArch32.
Chad Rosier [Wed, 18 Jan 2017 15:02:54 +0000 (15:02 +0000)]
[Assembler] Fix crash when assembling .quad for AArch32.

A 64-bit relocation does not exist in 32-bit ARMELF. Report an error
instead of crashing.

PR23870
Patch by Sanne Wouda (sanwou01).
Differential Revision: https://reviews.llvm.org/D28851

llvm-svn: 292373

7 years ago[thumb,framelowering] Reset NoVRegs in Thumb1FrameLowering::emitPrologue.
Florian Hahn [Wed, 18 Jan 2017 15:01:22 +0000 (15:01 +0000)]
[thumb,framelowering] Reset NoVRegs in Thumb1FrameLowering::emitPrologue.

Summary:
In this function, virtual registers can be introduced (for example
through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs
will replace those virtual registers with concrete registers later on
in PrologEpilogInserter, which sets NoVRegs again.

This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which
failed with expensive checks.
https://llvm.org/bugs/show_bug.cgi?id=27484

Reviewers: rnk, bkramer, olista01

Reviewed By: olista01

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D28829

llvm-svn: 292372

7 years ago[InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles
Simon Pilgrim [Wed, 18 Jan 2017 14:47:49 +0000 (14:47 +0000)]
[InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shuffles

Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded.

llvm-svn: 292371

7 years agoRe-revert: [globalisel] Tablegen-erate current Register Bank Information
Daniel Sanders [Wed, 18 Jan 2017 14:26:12 +0000 (14:26 +0000)]
Re-revert: [globalisel] Tablegen-erate current Register Bank Information

More missing guards. My build didn't notice it due to a stale file left over
from a Global ISel build.

llvm-svn: 292369

7 years ago[InstCombine][AVX2] Tests showing missed opportunities to pass demanded elts through...
Simon Pilgrim [Wed, 18 Jan 2017 14:23:06 +0000 (14:23 +0000)]
[InstCombine][AVX2] Tests showing missed opportunities to pass demanded elts through a vpermd/vpermps shuffle

llvm-svn: 292368

7 years agoRe-commit: [globalisel] Tablegen-erate current Register Bank Information
Daniel Sanders [Wed, 18 Jan 2017 14:17:50 +0000 (14:17 +0000)]
Re-commit: [globalisel] Tablegen-erate current Register Bank Information

Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Changes since last commit:
The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and
this should fix the buildbots however it may not be the whole fix. The previous
buildbot failures suggest there may be a memory bug lurking that I'm unable to
reproduce (including when using asan) or spot in the source. If they re-occur
on this commit then I'll need assistance from the bot owners to track it down.

Reviewers: t.p.northover, ab, rovka, qcolombet

Reviewed By: qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292367

7 years ago[ARM] Create objdump subtarget from build attrs
Sam Parker [Wed, 18 Jan 2017 13:52:12 +0000 (13:52 +0000)]
[ARM] Create objdump subtarget from build attrs

Enable an ELFObjectFile to read the its arm build attributes to
produce a target triple with a specific ARM architecture.
llvm-objdump now uses this functionality to automatically produce
a more accurate target.

Differential Revision: https://reviews.llvm.org/D28769

llvm-svn: 292366