Craig Topper [Tue, 13 Mar 2018 22:36:07 +0000 (22:36 +0000)]
[X86] Simplify the LowerAVXCONCAT_VECTORS code a little by creating a single path for insert_subvector handling.
We now only create recursive concats if we have more than two non-zero values. This keeps our subvector broadcast DAG combine functioning.
llvm-svn: 327457
Joel E. Denny [Tue, 13 Mar 2018 22:18:29 +0000 (22:18 +0000)]
[Attr] Merge two dependent tests from different directories
Suggested at: https://reviews.llvm.org/D43248
llvm-svn: 327456
Sjoerd Meijer [Tue, 13 Mar 2018 22:11:06 +0000 (22:11 +0000)]
[ARM] ACLE FP16 feature test macros
This is a partial recommit of r327189 that was reverted
due to test issues. I.e., this recommits minimal functional
change, the FP16 feature test macros, and adds tests that
were missing in the original commit.
llvm-svn: 327455
Craig Topper [Tue, 13 Mar 2018 22:05:25 +0000 (22:05 +0000)]
[X86] Rewrite LowerAVXCONCAT_VECTORS similar to how we handle vXi1 concats.
This better able to detect undef and zeros pieces in the concat. Or cases when only one subvector is non-zero. This allows us to avoid silly things like double inserts into progressively larger undefs.
This still builds 512 bit concats of 128 bits by building up through 256 bits first. But I don't know if that's best.
We probably want to merge this with the vXi1 concat code since they are very similar.
llvm-svn: 327454
Eugene Zelenko [Tue, 13 Mar 2018 21:32:01 +0000 (21:32 +0000)]
[Analysis] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 327453
Julie Hockett [Tue, 13 Mar 2018 21:24:08 +0000 (21:24 +0000)]
[clang-tidy] Fixing incorrect comment
llvm-svn: 327452
Zachary Turner [Tue, 13 Mar 2018 21:18:00 +0000 (21:18 +0000)]
Disable PDB injected sources test temporarily.
llvm-svn: 327451
Hiroshi Yamauchi [Tue, 13 Mar 2018 21:13:18 +0000 (21:13 +0000)]
Simplify more cases of logical ops of masked icmps.
Summary:
For example,
((X & 255) != 0) && ((X & 15) == 8) -> ((X & 15) == 8).
((X & 7) != 0) && ((X & 15) == 8) -> false.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43835
llvm-svn: 327450
Eugene Zemtsov [Tue, 13 Mar 2018 21:10:15 +0000 (21:10 +0000)]
Remove explicit triple and data layout from the test
llvm-svn: 327449
Jim Ingham [Tue, 13 Mar 2018 21:06:05 +0000 (21:06 +0000)]
Add a missing return in SBPlatform::IsConnected and test
for the behavior - using the fact that the Host platform
is always present & connected.
llvm-svn: 327448
Gheorghe-Teodor Bercea [Tue, 13 Mar 2018 20:50:12 +0000 (20:50 +0000)]
Revert revision 327438.
llvm-svn: 327447
Craig Topper [Tue, 13 Mar 2018 20:36:28 +0000 (20:36 +0000)]
[DAGCombiner] Allow visitEXTRACT_SUBVECTOR to combine with BUILD_VECTORS between LegalizeVectorOps and LegalizeDAG.
BUILD_VECTORs aren't themselves legalized until LegalizeDAG so we should still be able to create an "illegal" one before that. This helps combine with BUILD_VECTORS that are introduced during LegalizeVectorOps due to unrolling.
llvm-svn: 327446
Davide Italiano [Tue, 13 Mar 2018 20:26:38 +0000 (20:26 +0000)]
[DataFormatter] Remove dead code for the NSDictionary formatter.
I'm going to make changes in this area soon, so I figured I
could clean things a bit while I was around.
llvm-svn: 327445
Zachary Turner [Tue, 13 Mar 2018 20:16:37 +0000 (20:16 +0000)]
Update modulemap to exclude new DIA headers.
llvm-svn: 327444
Eugene Zemtsov [Tue, 13 Mar 2018 20:06:33 +0000 (20:06 +0000)]
Fix debuglineinfo-path.ll
This fix is based on an assumption that some build bots are missing 'echo
-n'
llvm-svn: 327443
Francis Visoiu Mistrih [Tue, 13 Mar 2018 19:53:16 +0000 (19:53 +0000)]
[MIR] Allow frame-setup and frame-destroy on the same instruction
Nothing prevents us from having both frame-setup and frame-destroy on
the same instruction.
When merging:
* frame-setup OPCODE1
* frame-destroy OPCODE2
into
* frame-setup frame-destroy OPCODE3
we want to be able to print and parse both flags.
llvm-svn: 327442
Eugene Zemtsov [Tue, 13 Mar 2018 19:48:31 +0000 (19:48 +0000)]
Temporary disable debuglineinfo-path.ll to fix build
llvm-svn: 327441
Gheorghe-Teodor Bercea [Tue, 13 Mar 2018 19:44:53 +0000 (19:44 +0000)]
[OpenMP][libomptarget] Add global memory data sharing support for master-worker sharing.
Summary:
This patch adds support for the sharing of variables from the master thread of a team to the worker threads of the team.
The runtime uses a stack structure implemented as a doubly-linked list of slots with each slot having the exact same size as the size requested. This implementation leverages existing data structures. The runtime functions are added as separate functions to avoid interfering with the current interface.
Limitations to be addressed in future patches:
- This current patch only employs global memory. In a future patch we will enable to usage for shared memory as an optimization.
- Allow the allocation of several requested sizes in the same slot.
Reviewers: ABataev, grokos, caomhin, carlo.bertolli
Reviewed By: grokos
Subscribers: Hahnfeld, guansong, openmp-commits
Differential Revision: https://reviews.llvm.org/D44260
llvm-svn: 327440
Gheorghe-Teodor Bercea [Tue, 13 Mar 2018 19:39:19 +0000 (19:39 +0000)]
[OpenMP] Add flag for linking runtime bitcode library
Summary: This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode.
Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel
Reviewed By: ABataev, grokos
Subscribers: jholewinski, guansong, cfe-commits
Differential Revision: https://reviews.llvm.org/D43197
llvm-svn: 327438
Sjoerd Meijer [Tue, 13 Mar 2018 19:38:56 +0000 (19:38 +0000)]
This reverts "r327189 - [ARM] Add ARMv8.2-A FP16 vector intrinsic"
This is causing problems in testing, and PR36683 was raised.
Reverting it until we have sorted out how to pass f16 vectors.
llvm-svn: 327437
Anna Thomas [Tue, 13 Mar 2018 19:38:45 +0000 (19:38 +0000)]
Test Commit NFC. Updated comment
llvm-svn: 327436
Sanjay Patel [Tue, 13 Mar 2018 19:20:01 +0000 (19:20 +0000)]
[x86] add test for WriteZero sched class instructions; NFC
Nops should have zero latency because there is no result.
Idioms like 'xorps xmm0, xmm0' may have zero latency because
they are handled without using an execution unit.
llvm-svn: 327435
Akira Hatanaka [Tue, 13 Mar 2018 18:58:25 +0000 (18:58 +0000)]
Serialize the NonTrivialToPrimitive* flags I added in r326307.
rdar://problem/
38421774
llvm-svn: 327434
Haicheng Wu [Tue, 13 Mar 2018 18:44:19 +0000 (18:44 +0000)]
[SLP] clean some formats
llvm-svn: 327433
Brian M. Rzycki [Tue, 13 Mar 2018 18:14:10 +0000 (18:14 +0000)]
[LazyValueInfo] PR33357 prevent infinite recursion on BinaryOperator
Summary:
It is possible for LVI to encounter instructions that are not in valid
SSA form and reference themselves. One example is the following:
%tmp4 = and i1 %tmp4, undef
Before this patch LVI would recurse until running out of stack memory
and crashed. This patch marks these self-referential instructions as
Overdefined and aborts analysis on the instruction.
Fixes https://bugs.llvm.org/show_bug.cgi?id=33357
Reviewers: craig.topper, anna, efriedma, dberlin, sebpop, kuhar
Reviewed by: dberlin
Subscribers: uabelho, spatel, a.elovikov, fhahn, eli.friedman, mzolotukhin, spop, evandro, davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D34135
llvm-svn: 327432
Zachary Turner [Tue, 13 Mar 2018 17:58:28 +0000 (17:58 +0000)]
Implement pure virtual method to fix build.
llvm-svn: 327431
Eugene Zemtsov [Tue, 13 Mar 2018 17:54:29 +0000 (17:54 +0000)]
Handle mixed-OS paths in DWARF reader
Make sure that DWARF line information generated by Windows can be properly read by Posix OS and vice versa.
Differential Revision: https://reviews.llvm.org/D44290
llvm-svn: 327430
Sanjay Patel [Tue, 13 Mar 2018 17:50:27 +0000 (17:50 +0000)]
[MC] fix documentation comments; NFC
llvm-svn: 327429
Zachary Turner [Tue, 13 Mar 2018 17:46:06 +0000 (17:46 +0000)]
[PDB] Support dumping injected sources via the DIA reader.
Injected sources are basically a way to add actual source file content
to your PDB. Presumably you could use this for shipping your source code
with your debug information, but in practice I can only find this being
used for embedding natvis files inside of PDBs.
In order to effectively test LLVM's natvis file injection, we need a way
to dump the injected sources of a PDB in a way that is authoritative
(i.e. based on Microsoft's understanding of the PDB format, and not
LLVM's). To this end, I've added support for dumping injected sources
via DIA. I made a PDB file that used the /natvis option to generate a
test case.
Differential Revision: https://reviews.llvm.org/D44405
llvm-svn: 327428
Simon Dardis [Tue, 13 Mar 2018 17:31:11 +0000 (17:31 +0000)]
Revert "[mips] Guard traps for microMIPS correctly"
This appears to have broken the expensive checks bot in
a strange fashion. Reverting until I can investigate.
This reverts r327409.
llvm-svn: 327427
George Karpenkov [Tue, 13 Mar 2018 17:27:01 +0000 (17:27 +0000)]
[analyzer] Fix the matcher for GCDAntipattern to look for "signal" call in all parameters
rdar://
38405904
llvm-svn: 327426
Andrea Di Biagio [Tue, 13 Mar 2018 17:24:32 +0000 (17:24 +0000)]
[llvm-mca] Remove the logic that computes the reciprocal throughput, and make the SummaryView independent from the Backend. NFCI
Since r327420, the tool can query the MCSchedModel interface to obtain the
reciprocal throughput information.
As a consequence, method `ResourceManager::getRThroughput`, and
method `Backend::getRThroughput` are no longer needed.
This patch simplifies the code by removing the custom RThroughput computation.
This patch also refactors class SummaryView by removing the dependency with
the Backend object.
No functional change intended.
llvm-svn: 327425
Simon Pilgrim [Tue, 13 Mar 2018 17:17:15 +0000 (17:17 +0000)]
[DAGCombine] visitREM - Don't assume that one divrem isn't driving another
Under some circumstances the divrems won't have been combined together before getting to this code.
So replace the assertion with a if() guard to not expand to X-((X/C)*C) to give the other combine chance to happen.
Reduced from OSS-Fuzz #6883
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6883
llvm-svn: 327424
Azharuddin Mohammed [Tue, 13 Mar 2018 17:04:33 +0000 (17:04 +0000)]
Build system changes for RISCV
Summary: Build system changes for RISCV. Makes it possible to build just the RISCV target alone.
Reviewers: asb, apazos, mgrang, beanz
Reviewed By: asb
Subscribers: mgorny, kito-cheng, shiva0217, llvm-commits
Differential Revision: https://reviews.llvm.org/D44153
llvm-svn: 327423
Brian Homerding [Tue, 13 Mar 2018 16:37:59 +0000 (16:37 +0000)]
[lit] - Allow 1 test to report multiple micro-test results to provide support for microbenchmarks.
Summary:
These changes are to allow to a Result object to have nested Result objects in
order to support microbenchmarks. Currently lit is restricted to reporting one
result object for one test, this change provides support tests that want to
report individual timings for individual kernels.
This revision is the result of the discussions in
https://reviews.llvm.org/D32272#794759,
https://reviews.llvm.org/D37421#
f8003b27 and https://reviews.llvm.org/D38496.
It is a separation of the changes purposed in https://reviews.llvm.org/D40077.
This change will enable adding LCALS (Livermore Compiler Analysis Loop Suite)
collection of loop kernels to the llvm test suite using the google benchmark
library (https://reviews.llvm.org/D43319) with tracking of individual kernel
timings.
Previously microbenchmarks had been handled by using macros to section groups
of microbenchmarks together and build many executables while still getting a
grouped timing (MultiSource/TSVC). Recently the google benchmark library was
added to the test suite and utilized with a litsupport plugin. However the
limitation of 1 test 1 result limited its use to passing a runtime option to
run only 1 microbenchmark with several hand written tests
(MicroBenchmarks/XRay). This runs the same executable many times with different
hand-written tests. I will update the litsupport plugin to utilize the new
functionality (https://reviews.llvm.org/D43316).
These changes allow lit to report micro test results if desired in order to get
many precise timing results from 1 run of 1 test executable.
Reviewers: MatzeB, hfinkel, rengolin, delcypher
Differential Revision: https://reviews.llvm.org/D43314
llvm-svn: 327422
Daniel Neilson [Tue, 13 Mar 2018 16:31:19 +0000 (16:31 +0000)]
[SelectionDAGBuilder] Replace deprecated calls to MemoryIntrinsic::getAlignment() (NFCI)
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
SelectionDAGBuilder to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting
source & dest specific alignments through the new API.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960, rL325816, rL327398 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-
20151109/312083.html
llvm-svn: 327421
Andrea Di Biagio [Tue, 13 Mar 2018 16:28:55 +0000 (16:28 +0000)]
[MC] Move the reciprocal throughput computation from TargetSchedModel to MCSchedModel.
The goal is to make the reciprocal throughput computation accessible through the
MCSchedModel interface. This is particularly important for llvm-mca because it
can only query the MCSchedModel interface.
No functional change intended.
Differential Revision: https://reviews.llvm.org/D44392
llvm-svn: 327420
George Rimar [Tue, 13 Mar 2018 16:23:48 +0000 (16:23 +0000)]
[ELF} - Fix build bots.
llvm-svn: 327419
Craig Topper [Tue, 13 Mar 2018 16:23:27 +0000 (16:23 +0000)]
[X86] Remove SplitBinaryOpsAndApply and use SplitOpsAndApply by adding curly braces around the ops.
Summary: Unless you were intentionally avoiding this syntax? I saw you mentioned makeArrayRef in your commit that added SplitOpsAndApply.
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44403
llvm-svn: 327418
George Rimar [Tue, 13 Mar 2018 16:11:02 +0000 (16:11 +0000)]
[ELF] - Fix mistype in comment. NFC.
llvm-svn: 327417
George Rimar [Tue, 13 Mar 2018 16:02:45 +0000 (16:02 +0000)]
[ELF] - Rename test cases to *.test.
This is a follow-up for r327410.
llvm-svn: 327416
Andrea Di Biagio [Tue, 13 Mar 2018 15:59:59 +0000 (15:59 +0000)]
[llvm-mca] Simplify code that computes the latency of an instruction in
InstrBuilder. NFCI
This was possible because of r327406, which added function`computeInstrLatency`
to MCSchedModel.
llvm-svn: 327415
Brock Wyma [Tue, 13 Mar 2018 15:56:20 +0000 (15:56 +0000)]
Revert r327397 [CodeView] Omit forward references for unnamed structs and ...
This reverts commit r327397 to investigate a buildbot failure.
llvm-svn: 327414
Pavel Labath [Tue, 13 Mar 2018 15:55:00 +0000 (15:55 +0000)]
include locale.h in IOHandler.cpp
This is needed for the setlocale() call, and it seems that it is not
transitively pulled in for some build configurations.
llvm-svn: 327413
Zaara Syeda [Tue, 13 Mar 2018 15:49:05 +0000 (15:49 +0000)]
test commit: fix formatting of a comment
This is a simple change to do the test commit.
llvm-svn: 327412
Jonas Devlieghere [Tue, 13 Mar 2018 15:47:38 +0000 (15:47 +0000)]
[dsymutil] Unify error handling outside DwarfLinker.
This is a follow-up to r327137 where we unified error handling for the
DwarfLinker. This replaces calls to errs() and outs() with the
appropriate ostream wrapper everywhere in dsymutil.
llvm-svn: 327411
George Rimar [Tue, 13 Mar 2018 15:47:14 +0000 (15:47 +0000)]
[ELF] - Represent tests as linker scripts instead of asm.
This follows recently started direction and sometimes
allows to fully get rid from `echo` calls.
I'll rename changed files to *.test in a follow-up.
llvm-svn: 327410
Simon Dardis [Tue, 13 Mar 2018 15:46:58 +0000 (15:46 +0000)]
[mips] Guard traps for microMIPS correctly
This is part of fixing the instruction predicates for MIPS.
Reviewers: atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D44212
llvm-svn: 327409
Rafael Espindola [Tue, 13 Mar 2018 15:24:51 +0000 (15:24 +0000)]
[ThinLTO] Clear dllimport when setting dso_local.
This is PR36686.
If a user of a library is LTOed with that library we take the
opportunity to set dso_local, but we don't clear dllimport, which
creates an invalid IR.
llvm-svn: 327408
Simon Pilgrim [Tue, 13 Mar 2018 15:22:24 +0000 (15:22 +0000)]
[X86][Btver2] Split i8/i16/i32/i64 div/idiv costs
We were assuming a mixture of 32/64 division costs.
llvm-svn: 327407
Andrea Di Biagio [Tue, 13 Mar 2018 15:22:13 +0000 (15:22 +0000)]
[MC] Move the instruction latency computation from TargetSchedModel to MCSchedModel.
The goal is to make the latency information accessible through the MCSchedModel
interface. This is particularly important for tools like llvm-mca that only have
access to the MCSchedModel API.
This partially fixes PR36676.
No functional change intended.
Differential Revision: https://reviews.llvm.org/D44383
llvm-svn: 327406
Joel E. Denny [Tue, 13 Mar 2018 14:51:22 +0000 (14:51 +0000)]
Reland "[Attr] Fix parameter indexing for several attributes"
Relands r326602 (reverted in r326862) with new test and fix for
PR36620.
Differential Revision: https://reviews.llvm.org/D43248
llvm-svn: 327405
Sanjay Patel [Tue, 13 Mar 2018 14:46:32 +0000 (14:46 +0000)]
[InstCombine] fix fmul reassociation to avoid creating an extra fdiv
This was supposed to be an NFC refactoring that will eventually allow
eliminating the isFast() predicate, but there's a rare possibility
that we would pessimize the code as shown in the test case because
we failed to check 'hasOneUse()' properly. This version also removes
an inefficiency of the old code; we would look for:
(X * C) * C1 --> X * (C * C1)
...but that pattern is always handled by
SimplifyAssociativeOrCommutative().
llvm-svn: 327404
Simon Dardis [Tue, 13 Mar 2018 14:39:44 +0000 (14:39 +0000)]
[mips] Fix the definitions of the EVA instructions
Correct their availability to their respective ISAs.
Reviewers: atanasyan
Differential Revision: https://reviews.llvm.org/D44209
llvm-svn: 327403
Sylvestre Ledru [Tue, 13 Mar 2018 14:35:10 +0000 (14:35 +0000)]
fix some user facing typos / in the comments
llvm-svn: 327402
Haojian Wu [Tue, 13 Mar 2018 14:31:31 +0000 (14:31 +0000)]
[clangd] Use the macro name range as the definition range.
Summary: This also aligns with the behavior of declarations.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D44423
llvm-svn: 327401
Jonas Devlieghere [Tue, 13 Mar 2018 14:28:07 +0000 (14:28 +0000)]
[dsymutil] Remove old error/warn functions. NFC.
This removes the old error and warn functions that were still present in
the dwarf linker.
llvm-svn: 327400
Jonas Devlieghere [Tue, 13 Mar 2018 14:27:15 +0000 (14:27 +0000)]
[dsymutil] Perform analyzeContextInfo and CloneDIEs in parallel
This patch makes dsymutil perform analyzeContextInfo and CloneDIEs in
parallel. For the same object file, there is a dependency between the
two. However, we can do analyzeContextInfo for the next object file
while cloning DIEs for the current. This is exactly the approach taken
in this patch.
For WebCore, this leads to a performance improvement of 29% and for
clang we see similar results with at 32% improvement.
A big thanks to Pete Cooper who came up with the original idea and
the PoC.
Differential revision: https://reviews.llvm.org/D43945
llvm-svn: 327399
Daniel Neilson [Tue, 13 Mar 2018 14:25:33 +0000 (14:25 +0000)]
[SROA] Take advantage of separate alignments for memcpy source and destination
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
SROA pass to cease using the old getAlignment() & setAlignment() APIs of MemoryIntrinsic in
favour of getting source & dest specific alignments through the new API. This allows us
to enhance visitMemTransferInst to be more aggressive setting the alignment in memcpy
calls that it creates, as well as to only change the alignment of a memcpy/memmove
argument that it replaces.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960, rL325816 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-
20151109/312083.html
Reviewers: chandlerc, bollu, efriedma
Reviewed By: efriedma
Subscribers: efriedma, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D42974
llvm-svn: 327398
Brock Wyma [Tue, 13 Mar 2018 14:14:16 +0000 (14:14 +0000)]
[CodeView] Omit forward references for unnamed structs and unions
Codeview references to unnamed structs and unions are expected to refer to the
complete type definition instead of a forward reference so Visual Studio can
resolve the type properly.
Differential Revision: https://reviews.llvm.org/D32498
llvm-svn: 327397
Andrea Di Biagio [Tue, 13 Mar 2018 13:58:02 +0000 (13:58 +0000)]
[llvm-mca] Use a const ArrayRef in a few places. NFC
llvm-svn: 327396
Haicheng Wu [Tue, 13 Mar 2018 13:52:47 +0000 (13:52 +0000)]
[TTI] Fix a typo in the comment
llvm-svn: 327395
Clement Courbet [Tue, 13 Mar 2018 13:44:18 +0000 (13:44 +0000)]
[llvm-mca] Fix unused variable warning in opt mode.
llvm-svn: 327394
Krzysztof Parzyszek [Tue, 13 Mar 2018 13:30:43 +0000 (13:30 +0000)]
[Hexagon] Clang side of r327302 in LLVM
Add option -m[no-]packets to control generation of instruction packets
(enabled by default).
llvm-svn: 327393
Nicholas Wilson [Tue, 13 Mar 2018 13:30:04 +0000 (13:30 +0000)]
[WebAssembly] Demangle symbol names for use by the browser debugger
Differential Revision: https://reviews.llvm.org/D44316
llvm-svn: 327392
Nicholas Wilson [Tue, 13 Mar 2018 13:16:15 +0000 (13:16 +0000)]
[WebAssembly] Use helper macro from ELF/Options.td to tidy. NFC
Differential Revision: https://reviews.llvm.org/D44394
llvm-svn: 327391
Nicholas Wilson [Tue, 13 Mar 2018 13:12:03 +0000 (13:12 +0000)]
[WebAssembly] Add missing --demangle arg
Previously, Config->Demangle was uninitialised (not hooked up to
commandline handling)
Differential Revision: https://reviews.llvm.org/D44301
llvm-svn: 327390
Clement Courbet [Tue, 13 Mar 2018 13:11:01 +0000 (13:11 +0000)]
[llvm-mca] Refactor event listeners to make the backend agnostic to event types.
Summary: This is a first step towards making the pipeline configurable.
Subscribers: llvm-commits, andreadb
Differential Revision: https://reviews.llvm.org/D44309
llvm-svn: 327389
Simon Dardis [Tue, 13 Mar 2018 12:50:03 +0000 (12:50 +0000)]
[mips] Don't create nested CALLSEQ_START..CALLSEQ_END nodes.
For the MIPS O32 ABI, the current call lowering logic naively lowers each
call, creating the reserved argument area to hold the argument spill areas for
$a0..$a3 and the outgoing parameter area if one is required at each call site.
In the case of a sufficently large byval argument, a call to memcpy is used
to write the start+16..end of the argument into the outgoing parameter area.
This is done within the CALLSEQ_START..CALLSEQ_END of the callee. The CALLSEQ
nodes are responsible for performing the necessary stack adjustments.
Since the O32/N32/N64 MIPS ABIs do not have a red-zone and writing below the
stack pointer and reading the values back is unpredictable, the call to memcpy
cannot be hoisted out of the callee's CALLSEQ nodes.
However, for the O32 ABI requires the reserved argument area for functions
which have parameters. The naive lowering of calls will then create nested
CALLSEQ sequences. For N32 and N64 these nodes are also created, but with
zero stack adjustments as those ABIs do not have a reserved argument area.
This patch addresses the correctness issue by recognizing the special case
of lowering a byval argument that uses memcpy. By recognizing that the
incoming chain already has a CALLSEQ_START node on it when calling memcpy,
the CALLSEQ nodes are not created. For the N32 and N64 ABIs, this is not an
issue, as no stack adjustment has to be performed.
For the O32 ABI, the correctness reasoning is different. In the case of a
sufficently large byval argument, registers a0..a3 are going to be used for
the callee's arguments, mandating the creation of the reserved argument area.
The call to memcpy in the naive case will also create its own reserved
argument area. However, since the reserved argument area consists of undefined
values, both calls can use the same reserved argument area.
Reviewers: abeserminji, atanasyan
Differential Revision: https://reviews.llvm.org/D44296
llvm-svn: 327388
Haojian Wu [Tue, 13 Mar 2018 12:30:59 +0000 (12:30 +0000)]
[clangd] Fix irrelevant declaratations in goto definition (on macros).
Summary:
DeclrationAndMacrosFinder will find some declarations (not macro!) that are
referened inside the macro somehow, isSearchedLocation() is not sufficient, we
don't know whether the searched source location is macro or not.
Reviewers: ilya-biryukov
Subscribers: klimek, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D44293
llvm-svn: 327387
Haojian Wu [Tue, 13 Mar 2018 12:26:28 +0000 (12:26 +0000)]
[clangd] Remove extra ";", NFC.
llvm-svn: 327386
Simon Pilgrim [Tue, 13 Mar 2018 12:22:58 +0000 (12:22 +0000)]
[X86][SSE41] createVariablePermute v2X64 - PCMPEQQ can test for index 0/1 and select between them.
llvm-svn: 327385
Jonas Devlieghere [Tue, 13 Mar 2018 11:32:19 +0000 (11:32 +0000)]
[dsymutil] Unbreak non-Darwin bots.
BinaryHolder -> BinHolder
llvm-svn: 327384
Pavel Labath [Tue, 13 Mar 2018 11:28:27 +0000 (11:28 +0000)]
clang-import-test: fix build with clang-3.8
clang-3.8 complains that constructor for '...' must explicitly
initialize the const object. Newer clangs and gcc seem to be fine with
this, but explicitly initializing the variable does not hurt.
llvm-svn: 327383
Jonas Devlieghere [Tue, 13 Mar 2018 10:52:49 +0000 (10:52 +0000)]
[dsymutil] Introduce LinkContext. NFC.
This patch introduces the LinkContext which is necessary to have
dsymutil perform analysis and cloning of DIEs in parallel. As requested
in D43945, I'm landing this as two separate commits.
llvm-svn: 327382
Eugene Leviant [Tue, 13 Mar 2018 10:19:50 +0000 (10:19 +0000)]
[Evaluator] Evaluate load/store with bitcast
Differential revision: https://reviews.llvm.org/D43457
llvm-svn: 327381
Pavel Labath [Tue, 13 Mar 2018 09:46:10 +0000 (09:46 +0000)]
Fix clang-3.8 build
clang-3.8 complains that constructor for '...' must explicitly
initialize the const member. Newer clangs and gcc seem to be fine with
this, but explicitly initializing the member does not hurt.
llvm-svn: 327380
Pavel Labath [Tue, 13 Mar 2018 09:46:00 +0000 (09:46 +0000)]
Fix linux s390x build (pr36694)
llvm-svn: 327379
George Rimar [Tue, 13 Mar 2018 09:18:11 +0000 (09:18 +0000)]
[ELF] - Implement INSERT BEFORE.
This finishes PR35877.
INSERT BEFORE used similar to INSERT AFTER,
it inserts sections before the given target section.
Differential revision: https://reviews.llvm.org/D44380
llvm-svn: 327378
George Rimar [Tue, 13 Mar 2018 08:50:36 +0000 (08:50 +0000)]
[ELF] - Fix wrong "REQUIRES" in test.
Its a follow up for r327374 to fix BB.
llvm-svn: 327377
George Rimar [Tue, 13 Mar 2018 08:47:17 +0000 (08:47 +0000)]
[ELF] - Restrict section offsets that exceeds file size.
This is part of PR36515.
With some linkerscripts it is possible to get file offset overlaps
and overflows. Currently LLD checks overlaps in checkNoOverlappingSections().
And also we allow broken output with --no-inhibit-exec.
Problem is that sometimes final offset of sections is completely broken
and we calculate output file size wrong and might crash.
Patch implements check to verify that there is no output section
which offset exceeds file size.
Differential revision: https://reviews.llvm.org/D43819
llvm-svn: 327376
Jonas Paulsson [Tue, 13 Mar 2018 08:36:20 +0000 (08:36 +0000)]
[CodeGenPrepare] Respect endianness in splitMergedValStore.
splitMergedValStore will split a store into two if target prefers this, or if
-force-split-store is passed.
This patch adds the missing handling for endianness in this function along
with a test case.
Review: Eli Friedman
https://reviews.llvm.org/D44396
llvm-svn: 327375
George Rimar [Tue, 13 Mar 2018 08:32:56 +0000 (08:32 +0000)]
[ELF] - Drop special flags for empty output sections.
This fixes PR36598.
LLD currently crashes when we have empty output section
with SHF_LINK_ORDER flag. This might happen if we place an
empty synthetic section in the linker script, but keep output
section alive with the use of additional symbol, for example.
The patch fixes the issue by dropping all special flags
for empty sections.
Differential revision: https://reviews.llvm.org/D44376
llvm-svn: 327374
Max Kazantsev [Tue, 13 Mar 2018 07:46:06 +0000 (07:46 +0000)]
[SCEV][NFC] Smarter implementation of isAvailableAtLoopEntry
isAvailableAtLoopEntry duplicates logic of `properlyDominates` after checking invariance.
This patch replaces this logic with invocation of this method which is more profitable
because it supports caching.
Differential Revision: https://reviews.llvm.org/D43997
llvm-svn: 327373
Clement Courbet [Tue, 13 Mar 2018 07:05:55 +0000 (07:05 +0000)]
[MergeICmps] Make sure that the comparison only has one use.
Summary: Fixes PR36557.
Reviewers: trentxintong, spatel
Subscribers: mstorsjo, llvm-commits
Differential Revision: https://reviews.llvm.org/D44083
llvm-svn: 327372
Yonghong Song [Tue, 13 Mar 2018 06:47:07 +0000 (06:47 +0000)]
bpf: Enhance debug information for peephole optimization passes
Add more debug information for peephole optimization passes.
These would only be enabled for debug version binary and could help
analyzing why some optimization opportunities were missed.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 327371
Yonghong Song [Tue, 13 Mar 2018 06:47:06 +0000 (06:47 +0000)]
bpf: New post-RA peephole optimization pass to eliminate bad RA codegen
This new pass eliminate identical move:
MOV rA, rA
This is particularly possible to happen when sub-register support
enabled. The special type cast insn MOV_32_64 involves different
register class on src (i32) and dst (i64), RA could generate useless
instruction due to this.
This pass also could serve as the bast for further post-RA optimization.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 327370
Yonghong Song [Tue, 13 Mar 2018 06:47:05 +0000 (06:47 +0000)]
bpf: Don't expand BSWAP on i32, promote it
Currently, there is no ALU32 bswap support in eBPF ISA.
BSWAP on i32 was set to EXPAND which would need about eight instructions
for single BSWAP.
It would be more efficient to promote it to i64, then doing BSWAP on i64.
For eBPF programs, most of the promotion are zero extensions which are
likely be elimiated later by peephole optimizations.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 327369
Yonghong Song [Tue, 13 Mar 2018 06:47:04 +0000 (06:47 +0000)]
bpf: Support subregister definition check on PHI node
This patch relax the subregister definition check on Phi node.
Previously, we just cancel the optimizatoin when the definition is Phi
node while actually we could further check the definitions of incoming
parameters of PHI node.
This helps catch more elimination opportunities.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 327368
Yonghong Song [Tue, 13 Mar 2018 06:47:03 +0000 (06:47 +0000)]
bpf: Extends zero extension elimination beyond comparison instructions
The current zero extension elimination was restricted to operands of
comparison. It actually could be extended to more cases.
For example:
int *inc_p (int *p, unsigned a)
{
return p + a;
}
'a' will be promoted to i64 during addition, and the zero extension could
be eliminated as well.
For the elimination optimization, it should be much better to start
recognizing the candidate sequence from the SRL instruction instead of J*
instructions.
This patch makes it an generic zero extension elimination pass instead of
one restricted with comparison.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 327367
Yonghong Song [Tue, 13 Mar 2018 06:47:02 +0000 (06:47 +0000)]
bpf: J*_RR should check both operands
There is a mistake in current code that we "break" out the optimization
when the first operand of J*_RR doesn't qualify the elimination. This
caused some elimination opportunities missed, for example the one in the
testcase.
The code should just fall through to handle the second operand.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 327366
Yonghong Song [Tue, 13 Mar 2018 06:47:00 +0000 (06:47 +0000)]
bpf: Tighten subregister definition check
The current subregister definition check stops after the MOV_32_64
instruction.
This means we are thinking all the following instruction sequences
are safe to be eliminated:
MOV_32_64 rB, wA
SLL_ri rB, rB, 32
SRL_ri rB, rB, 32
However, this is *not* true. The source subregister wA of MOV_32_64 could
come from a implicit truncation of 64-bit register in which case the high
bits of the 64-bit register is not zeroed, therefore we can't eliminate
above sequence.
For example, for i32_val, we shouldn't do the elimination:
long long bar ();
int foo (int b, int c)
{
unsigned int i32_val = (unsigned int) bar();
if (i32_val < 10)
return b;
else
return c;
}
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 327365
Yonghong Song [Tue, 13 Mar 2018 06:46:59 +0000 (06:46 +0000)]
bpf: Add more check directives in peephole testcase
Improve the test accuracy by adding more check directives.
Shifts are expected to be eliminated for zero extension but not for signed
extension.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 327364
Serguei Katkov [Tue, 13 Mar 2018 06:36:00 +0000 (06:36 +0000)]
Revert [SCEV] Fix isKnownPredicate
It is a revert of rL327362 which causes build bot failures with assert like
Assertion `isAvailableAtLoopEntry(RHS, L) && "RHS is not available at Loop Entry"' failed.
llvm-svn: 327363
Serguei Katkov [Tue, 13 Mar 2018 06:10:27 +0000 (06:10 +0000)]
[SCEV] Fix isKnownPredicate
IsKnownPredicate is updated to implement the following algorithm
proposed by @sanjoy and @mkazantsev :
isKnownPredicate(Pred, LHS, RHS) {
Collect set S all loops on which either LHS or RHS depend.
If S is non-empty
a. Let PD be the element of S which is dominated by all other elements of S
b. Let E(LHS) be value of LHS on entry of PD.
To get E(LHS), we should just take LHS and replace all AddRecs that
are attached to PD on with their entry values.
Define E(RHS) in the same way.
c. Let B(LHS) be value of L on backedge of PD.
To get B(LHS), we should just take LHS and replace all AddRecs that
are attached to PD on with their backedge values.
Define B(RHS) in the same way.
d. Note that E(LHS) and E(RHS) are automatically available on entry of PD,
so we can assert on that.
e. Return true if isLoopEntryGuardedByCond(Pred, E(LHS), E(RHS)) &&
isLoopBackedgeGuardedByCond(Pred, B(LHS), B(RHS))
Return true if Pred, L, R is known from ranges, splitting etc.
}
This is follow-up for https://reviews.llvm.org/D42417.
Reviewers: sanjoy, mkazantsev, reames
Reviewed By: sanjoy, mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43507
llvm-svn: 327362
Mandeep Singh Grang [Tue, 13 Mar 2018 05:25:23 +0000 (05:25 +0000)]
[polly] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Reviewers: grosser, efriedma, jdoerfert, bollu, sebpop
Reviewed By: sebpop
Subscribers: sebpop, mehdi_amini, llvm-commits, pollydev
Tags: #polly
Differential Revision: https://reviews.llvm.org/D44361
llvm-svn: 327361
Vlad Tsyrklevich [Tue, 13 Mar 2018 05:08:48 +0000 (05:08 +0000)]
Reland r327041: [ThinLTO] Keep available_externally symbols live
Summary:
This change fixes PR36483. The bug was originally introduced by a change
that marked non-prevailing symbols dead. This broke LowerTypeTests
handling of available_externally functions, which are non-prevailing.
LowerTypeTests uses liveness information to avoid emitting thunks for
unused functions.
Marking available_externally functions dead is incorrect, the functions
are used though the function definitions are not. This change keeps them
live, and lets the EliminateAvailableExternally/GlobalDCE passes remove
them later instead.
(Reland with a suspected fix for a unit test failure I haven't been able
to reproduce locally)
Reviewers: pcc, tejohnson
Reviewed By: tejohnson
Subscribers: grimar, mehdi_amini, inglorion, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D43690
llvm-svn: 327360
Adam Nemet [Tue, 13 Mar 2018 04:37:01 +0000 (04:37 +0000)]
[LTO] Return proper error object rather than null LTOModule
This caused a crash in LTOModule::createInLocalContext.
rdar://
37926841
llvm-svn: 327359
Taewook Oh [Tue, 13 Mar 2018 04:26:58 +0000 (04:26 +0000)]
[ThinLTO] Add funtions in callees metadata to CallGraphEdges
Summary:
If there's a callees metadata attached to the indirect call instruction, add CallGraphEdges to the callees mentioned in the metadata when computing FunctionSummary.
* Why this is necessary:
Consider following code example:
```
(foo.c)
static int f1(int x) {...}
static int f2(int x);
static int (*fptr)(int) = f2;
static int f2(int x) {
if (x) fptr=f1; return f1(x);
}
int foo(int x) {
(*fptr)(x); // !callees metadata of !{i32 (i32)* @f1, i32 (i32)* @f2} would be attached to this call.
}
(bar.c)
int bar(int x) {
return foo(x);
}
```
At LTO time when `foo.o` is imported into `bar.o`, function `foo` might be inlined into `bar` and PGO-guided indirect call promotion will run after that. If the profile data tells that the promotion of `@f1` or `@f2` is beneficial, the optimizer will check if the "promoted" `@f1` or `@f2` (such as `@f1.llvm.0` or `@f2.llvm.0`) is available. Without this patch, importing `!callees` metadata would only add promoted declarations of `@f1` and `@f2` to the `bar.o`, but still the optimizer will assume that the function is available and perform the promotion. The result of that is link failure with `undefined reference to @f1.llvm.0`.
This patch fixes this problem by adding callees in the `!callees` metadata to CallGraphEdges so that their definition would be properly imported into.
One may ask that there already is a logic to add indirect call promotion targets to be added to CallGraphEdges. However, if profile data says "indirect call promotion is only beneficial under a certain inline context", the logic wouldn't work. In the code example above, if profile data is like
```
bar:1000000:100000
1:100000
1: foo:100000
1: 100000 f1:100000
```
, Computing FunctionSummary for `foo.o` wouldn't add `foo->f1` to CallGraphEdges. (Also, it is at least "possible" that one can provide profile data to only link step but not to compilation step).
Reviewers: tejohnson, mehdi_amini, pcc
Reviewed By: tejohnson
Subscribers: inglorion, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D44399
llvm-svn: 327358
Rafael Espindola [Tue, 13 Mar 2018 01:41:49 +0000 (01:41 +0000)]
Use PLT relocations in test.
Currently lld creates plain plt entries when a R_386_PC32 resolves to
a symbol in a shared library. That is a bug (PR36678). Don't depend on
that behavior on this test.
llvm-svn: 327357