Haicheng Wu [Mon, 26 Mar 2018 18:59:28 +0000 (18:59 +0000)]
[SLP] Add more checks to a test case. NFC.
llvm-svn: 328572
Rafael Espindola [Mon, 26 Mar 2018 18:55:33 +0000 (18:55 +0000)]
Reduce code duplication a bit.
Thanks to George Rimar for pointing it out.
llvm-svn: 328571
Reid Kleckner [Mon, 26 Mar 2018 18:49:48 +0000 (18:49 +0000)]
[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32
Summary:
Re-lands r328386 and r328443, reverting r328482.
Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small
parameters in i8 and i16 do not end up in the SysV register parameters
(EDI, ESI, etc).
I added tests for how we receive small parameters, since that is the
important part. It's always safe to store more bytes than will be read,
but the assumptions you make when loading them are what really matter.
I also tested this by self-hosting clang and it passed tests on win64.
Reviewers: mstorsjo, hans
Subscribers: hiraditya, mstorsjo, llvm-commits
Differential Revision: https://reviews.llvm.org/D44900
llvm-svn: 328570
Rafael Espindola [Mon, 26 Mar 2018 18:49:31 +0000 (18:49 +0000)]
Reduce code duplication a bit. NFC
llvm-svn: 328569
Jonathan Peyton [Mon, 26 Mar 2018 18:44:48 +0000 (18:44 +0000)]
Add summarizeStats.py to tools directory
The summarizeStats.py script processes raw data provided by the
instrumented (stats-gathering) OpenMP* runtime library. It provides:
1) A radar chart which plots counters as frequency (per GigaTick) of use within
the program. The frequencies are plotted as log10; however, values less than
one are kept as they are and shown in red. This was done to help
visualize the differences better.
2) Pie charts separating total time as compute and non-compute. The compute and
non-compute times have their own pie charts showing the constructs that
contributed to them. The percentages listed are with respect to the total
time.
3) '.csv' file with percentage of time spent within the different constructs.
The script can be used as:
$ python $PATH_TO_SCRIPT/summarizeStats.py instrumented1.csv instrumented2.csv
Patch by Taru Doodi
Differential Revision: https://reviews.llvm.org/D41838
llvm-svn: 328568
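A minimal sketch of the radar-chart scaling described in item 1 above. The function name and the "blue" default color are assumptions for illustration and are not taken from summarizeStats.py itself.

```python
import math


def radar_value(freq):
    """Return (plotted_value, color) for one counter frequency.

    Frequencies of one or more are plotted as log10. Frequencies below
    one are kept unchanged and flagged red, as the commit message says.
    The "blue" color for the normal case is an assumption.
    """
    if freq < 1.0:
        return freq, "red"
    return math.log10(freq), "blue"
```

Keeping sub-one values unscaled avoids negative log values on the radar axis, which is presumably why they are colored differently.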
Reid Kleckner [Mon, 26 Mar 2018 18:22:47 +0000 (18:22 +0000)]
[MS] Fix late-parsed template infinite loop in eager instantiation
Summary:
This fixes PR33561 and PR34185.
Don't store pending template instantiations for late-parsed templates in
the normal PendingInstantiations queue. Instead, use a separate list
that will only be parsed and instantiated at end of TU when late
template parsing actually works and doesn't infinite loop.
Reviewers: rsmith, thakis, hans
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D44846
llvm-svn: 328567
Simon Pilgrim [Mon, 26 Mar 2018 18:19:28 +0000 (18:19 +0000)]
[X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881)
Give the bit count instructions their own scheduler classes instead of forcing them into existing classes.
These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar).
Differential Revision: https://reviews.llvm.org/D44879
llvm-svn: 328566
David Blaikie [Mon, 26 Mar 2018 18:10:31 +0000 (18:10 +0000)]
Remove unused file, ExecutionEngine/MCJIT/ObjectBuffer.h
This header also wasn't self contained/modular - but with no users, it
didn't seem worth fixing because it'd break so easily again.
llvm-svn: 328565
Mandeep Singh Grang [Mon, 26 Mar 2018 18:08:26 +0000 (18:08 +0000)]
[XCore] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused by the undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Note: This patch is one of a series of patches to replace *all* std::sort with llvm::sort.
Refer to the comments section in D44363 for a list of all the required patches.
Reviewers: dblaikie, RKSimon, robertlytton
Reviewed By: robertlytton
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44875
llvm-svn: 328564
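The shuffle-before-sort idea from the commit above can be sketched in Python. This is an illustration of the technique only, not the llvm::sort implementation; checked_sort and order_depends_on_ties are hypothetical names.

```python
import random


def checked_sort(items, key, seed=None):
    """Shuffle a copy of items, then sort it by key.

    After the shuffle, elements with equal keys end up in an arbitrary
    relative order, so code that silently relied on the input order of
    ties will see its output change between runs.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return sorted(shuffled, key=key)


def order_depends_on_ties(items, key, trials=20):
    """Return True if different shuffles produce different sorted results."""
    results = {tuple(checked_sort(items, key, seed=s)) for s in range(trials)}
    return len(results) > 1
```

A comparison key that orders every element uniquely makes the result independent of the shuffle, which is the property the series of patches is meant to check.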
Reid Kleckner [Mon, 26 Mar 2018 18:05:12 +0000 (18:05 +0000)]
[lit] Implement 'cat' command for internal shell
Fixes PR36449
Patch by Chamal de Silva
Differential Revision: https://reviews.llvm.org/D43501
llvm-svn: 328563
Zachary Turner [Mon, 26 Mar 2018 18:01:07 +0000 (18:01 +0000)]
Delete pdbutil diff mode.
This has been made obsolete by the fact that almost all of the
things it previously checked for are no longer relevant since
we can just compare bytes in a lot of places.
llvm-svn: 328562
Krzysztof Parzyszek [Mon, 26 Mar 2018 17:53:48 +0000 (17:53 +0000)]
[Hexagon] Add more lit tests
llvm-svn: 328561
Sanjay Patel [Mon, 26 Mar 2018 17:52:02 +0000 (17:52 +0000)]
[InstCombine] improve code comment; NFC
llvm-svn: 328560
Zaara Syeda [Mon, 26 Mar 2018 17:50:52 +0000 (17:50 +0000)]
[ELF] GotSection increment NumEntries when Target saves GlobalOffsetTable in the .got
When the target saves ElfSym::GlobalOffsetTable in the .got rather than
.got.plt, Target->GotHeaderEntriesNum states the number of extra entries
required in the .got. Rather than having to add Target->GotHeaderEntriesNum to
NumEntries in every function which refers to NumEntries, this patch changes the
initial value of NumEntries in the constructor.
Differential Revision: https://reviews.llvm.org/D44744
llvm-svn: 328559
Lei Huang [Mon, 26 Mar 2018 17:46:25 +0000 (17:46 +0000)]
[Power9]Legalize and emit code for quad-precision convert from double-precision
Legalize and emit code for quad-precision floating point operation xscvdpqp
and add option to guard the quad precision operation support.
Differential Revision: https://reviews.llvm.org/D44746
llvm-svn: 328558
Adrian Prantl [Mon, 26 Mar 2018 17:40:44 +0000 (17:40 +0000)]
Fix check for verbose logging.
Thanks to Pavel for pointing this out!
llvm-svn: 328557
Stefan Pintilie [Mon, 26 Mar 2018 17:39:18 +0000 (17:39 +0000)]
[PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place.
A new function getOpcodeForSpill should now be the only place to get
the opcode for a given spilled register.
Differential Revision: https://reviews.llvm.org/D43086
llvm-svn: 328556
Zaara Syeda [Mon, 26 Mar 2018 17:22:33 +0000 (17:22 +0000)]
Disable [MachineLICM] Add functions to MachineLICM to hoist invariant stores
Disable https://reviews.llvm.org/D40196 by setting the option
hoist-const-stores to false, since it is failing on the s390 buildbot.
llvm-svn: 328555
Krzysztof Parzyszek [Mon, 26 Mar 2018 17:07:41 +0000 (17:07 +0000)]
[Pipeliner] Several node-ordering fixes
First, we change the heuristic that is used to ignore the recurrent
node-sets in the node ordering. In certain cases it's not important
to focus on the recurrent node-sets. Instead, the algorithm begins
by considering all the instructions in the node ordering step.
Second, a minor change to the bottom up traversal, which needs to
consider loop carried dependences (modeled as anti dependences).
Previously, these instructions were skipped, which caused problems
because the instruction ends up having both predecessors and
successors in the schedule.
Third, consider anti-dependences as a tie breaker when choosing
between instructions in the node ordering. We want to make sure
that the source of the anti-dependence does not end up with both
predecessors and successors in the final node ordering.
Patch by Brendon Cahoon.
llvm-svn: 328554
Tim Corringham [Mon, 26 Mar 2018 17:06:33 +0000 (17:06 +0000)]
[AMDGPU] Improve disassembler error handling
Summary:
llvm-objdump now disassembles unrecognised opcodes as data, using
the .long directive. We treat unrecognised opcodes as being 32 bit
values, so move along 4 bytes rather than the single byte which
previously resulted in a cascade of bogus disassembly following an
unrecognised opcode.
While no solution can always disassemble code that contains
embedded data correctly this provides a significant improvement.
The disassembler will now cope with an arbitrary length section
as it no longer truncates it to a multiple of 4 bytes, and will
use the .byte directive for trailing bytes.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D44685
llvm-svn: 328553
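The recovery strategy in the commit above can be sketched as a toy loop. The opcode table is a placeholder for illustration, every instruction is assumed to be 32 bits wide, and none of this mirrors the actual AMDGPU disassembler code.

```python
import struct

# Placeholder opcode table for illustration only; real decoding is far richer.
KNOWN_OPCODES = {0xBF800000: "s_nop 0", 0xBF810000: "s_endpgm"}


def disassemble(data):
    """Disassemble little-endian 32-bit words, falling back to data directives."""
    lines = []
    offset = 0
    while offset + 4 <= len(data):
        (word,) = struct.unpack_from("<I", data, offset)
        text = KNOWN_OPCODES.get(word)
        if text is None:
            # Unrecognised opcode: emit it as data and still advance 4 bytes,
            # so the following words stay aligned.
            text = ".long 0x%08x" % word
        lines.append(text)
        offset += 4
    # Trailing bytes that do not fill a whole word are emitted one by one.
    for byte in data[offset:]:
        lines.append(".byte 0x%02x" % byte)
    return lines
```

Advancing by a single byte after an unknown word would misalign every later word, which is the cascade of bogus output the commit describes.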
Pirama Arumuga Nainar [Mon, 26 Mar 2018 17:03:34 +0000 (17:03 +0000)]
[CodeGen] Mark fma as const for Android
Summary:
r318093 sets fma, fmaf, fmal as const for Gnu and MSVC. Android also
does not set errno for these functions. So mark these const for
Android.
Reviewers: spatel, efriedma, srhines, chh, enh
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D44852
llvm-svn: 328552
Simon Pilgrim [Mon, 26 Mar 2018 17:02:02 +0000 (17:02 +0000)]
[X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs
We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR.....
llvm-svn: 328551
Krzysztof Parzyszek [Mon, 26 Mar 2018 16:58:40 +0000 (16:58 +0000)]
[Pipeliner] Check for affine expression in isLoopCarriedOrder
The pipeliner must add a loop carried dependence between two memory
operations if the base register is not an affine (linear) expression.
The current implementation doesn't check how the base register is
defined, which allows non-affine expressions, and then the pipeliner
does not add a loop carried dependence when one is needed.
This patch adds code to isLoopCarriedOrder that checks if the base
register of the memory operations is defined by a phi, and the loop
definition for the phi is a constant increment value. This is a very
simple check for a linear expression.
Patch by Brendon Cahoon.
llvm-svn: 328550
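A toy model of the check described above, assuming register definitions are given as small tuples. The tuple shapes and both function names are hypothetical stand-ins for machine instructions and are not the isLoopCarriedOrder code.

```python
def is_simple_affine_base(base_reg, defs):
    """Return True if base_reg is a phi whose loop value adds a constant to it.

    defs maps a register name to a tuple describing its definition:
      ("phi", init_reg, loop_reg) or ("add_imm", src_reg, constant).
    Any other definition is treated as non-affine.
    """
    definition = defs.get(base_reg)
    if definition is None or definition[0] != "phi":
        return False
    loop_def = defs.get(definition[2])
    if loop_def is None or loop_def[0] != "add_imm":
        return False
    return loop_def[1] == base_reg and isinstance(loop_def[2], int)


def needs_loop_carried_dependence(base_reg, defs):
    """Be conservative: add the dependence unless the base is known affine."""
    return not is_simple_affine_base(base_reg, defs)
```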
David Blaikie [Mon, 26 Mar 2018 16:57:31 +0000 (16:57 +0000)]
Remove an unneeded (& mislayered) include from Target/TargetLoweringObjectFile on a CodeGen header
llvm-svn: 328549
David Blaikie [Mon, 26 Mar 2018 16:52:10 +0000 (16:52 +0000)]
Remove unneeded (& mislayered) include from TargetMachine.cpp on a CodeGen header
llvm-svn: 328548
Krzysztof Parzyszek [Mon, 26 Mar 2018 16:50:11 +0000 (16:50 +0000)]
[Pipeliner] Add missing loop carried dependences
The pipeliner is not adding a dependence edge for a loop carried
dependence, and ends up scheduling a load from iteration n prior
to an aliased store in iteration n-1.
The code that adds the loop carried dependences in the pipeliner
doesn't check if the memory objects for loads and stores are
"identified" (i.e., distinct) objects. If they are not, then the
code that adds the dependences needs to be conservative. The
objects can be used to check dependences only when they are
distinct objects.
The code that checks for loop carried dependences has been updated
to classify loads and stores that are not identified as "unknown"
values. A store with an "unknown" value can potentially create
a loop carried dependence with any pending load.
Patch by Brendon Cahoon.
llvm-svn: 328547
Haicheng Wu [Mon, 26 Mar 2018 16:47:37 +0000 (16:47 +0000)]
[SLP] Add a test case. NFC.
llvm-svn: 328546
Krzysztof Parzyszek [Mon, 26 Mar 2018 16:41:36 +0000 (16:41 +0000)]
[Pipeliner] Fix renaming in pipeliner when eliminating phis
The phi renaming code in the pipeliner uses the wrong value when
rewriting phi uses, which results in an undefined value. In this
case, the original phi is no longer needed due to the order of
instructions in the pipelined loop. The pipeliner was assuming, in
this case, that the phi loop definition should be used to
rewrite the uses. However, the pipeliner needs to check to make
sure that the loop definition has already been scheduled. If not,
then the phi initial value needs to be used instead.
Patch by Brendon Cahoon.
llvm-svn: 328545
Alexey Bataev [Mon, 26 Mar 2018 16:40:55 +0000 (16:40 +0000)]
[OPENMP] Codegen for declare target with link clause.
If the link clause is used on the declare target directive, the object
should be linked on target or target data directives, not during the
codegen. Patch adds support for this clause.
llvm-svn: 328544
Krzysztof Parzyszek [Mon, 26 Mar 2018 16:37:55 +0000 (16:37 +0000)]
[Pipeliner] Fix number of phis to generate in the epilog
The pipeliner was generating too many phis in the epilog blocks, which
caused incorrect code generation when rewriting an instruction that uses
the phi.
In this case, there are 3 prolog and epilog stages. An existing phi was
scheduled at stage 1. When generating the code for the 2nd epilog an
extra new phi was generated.
To fix this, we need to update the code that calculates the maximum
number of phis that can be generated, which is based upon the current
prolog stage and the stage of the original phi. In this case, when the
prolog stage is 1 and the original phi stage is 1, the maximum number
of phis to generate is 2.
Patch by Brendon Cahoon.
llvm-svn: 328543
Krzysztof Parzyszek [Mon, 26 Mar 2018 16:33:16 +0000 (16:33 +0000)]
[Pipeliner] Use latency to compute RecMII
The patch contains several changes needed to pipeline an example
that was transformed so that a Phi with a subreg is converted to
copies.
The pipeliner wasn't working for a couple of reasons.
- The RecMII was 3 instead of 2 due to the extra copies.
- Copy instructions contained a latency of 1.
- The node order algorithm was not choosing the best "bottom"
node, which caused an instruction to be scheduled that had a
predecessor and successor already scheduled.
- Updated the Hexagon Machine Scheduler to check if the node is
latency bound when adding the cost for a 0-latency dependence.
The RecMII was 3 because the computation looks at the number of
nodes in the recurrence. The extra copy is an extra node but
it shouldn't increase the latency. The new RecMII computation
looks at the latency of the instructions in the recurrence. We
changed the latency of the dependence of a copy to 0. The latency
computation for the copy also checks the use of the copy (similar
to a reg_sequence).
The node order algorithm was not choosing the last instruction
in the recurrence for a bottom up traversal. This was when the
last instruction is a copy. A check was added when choosing the
instruction to check for NodeNum if the maxASAP is the same. This
means that the scheduler will not end up with another node in
the recurrence that has both a predecessor and successor already
scheduled.
The cost computation in Hexagon Machine Scheduler adds cost when
an instruction can be packetized with a zero-latency instruction.
We should only do this if the schedule is latency bound.
Patch by Brendon Cahoon.
llvm-svn: 328542
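A small sketch of why counting nodes overestimates RecMII once copies are present. It uses the textbook bound ceil(latency / distance) with a loop-carried distance of 1 as an assumption; the function names and the example recurrence are illustrative and do not reproduce the pipeliner's code.

```python
def rec_mii_by_node_count(recurrence, distance=1):
    """Old-style estimate: one cycle per node in the recurrence."""
    return -(-len(recurrence) // distance)  # ceiling division


def rec_mii_by_latency(recurrence, distance=1):
    """Latency-based estimate: sum the latencies along the recurrence."""
    total = sum(latency for _name, latency in recurrence)
    return max(1, -(-total // distance))


# Two real instructions plus a copy whose dependence latency is 0.
example = [("add", 1), ("copy", 0), ("mul", 1)]
```

With this example the node count gives 3 while the latency sum gives 2, matching the numbers quoted in the commit message.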
Simon Pilgrim [Mon, 26 Mar 2018 16:24:13 +0000 (16:24 +0000)]
[X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs
llvm-svn: 328541
Krzysztof Parzyszek [Mon, 26 Mar 2018 16:23:29 +0000 (16:23 +0000)]
[Pipeliner] Fix assert caused by pipeliner serialization
The pipeliner is asserting because the serialization step that
occurs at the end is deleting an instruction. The assert
occurs later on because there is a use without a definition.
The problem occurs when an instruction defines a value used
by a REG_SEQUENCE and that value is used by a COPY instruction.
The latencies between these instructions are zero, so they are
put into the same packet. The serialization code is unable to
handle this correctly, and ends up putting the REG_SEQUENCE
before its definition.
There is special code in the serialization step that attempts
to handle zero-cost instructions (phis, copy, reg_sequence)
differently than regular instructions. Unfortunately, this means
the order does not come out correct.
This patch simplifies the code by changing the separate steps for
handling zero-cost and regular instructions. Only phis are
handled separately now, since they should occur first. Then, this
patch adds checks to make sure the MoveUse is set to the smallest
value if there are multiple uses in a cycle.
Patch by Brendon Cahoon.
llvm-svn: 328540
Sebastian Pop [Mon, 26 Mar 2018 16:19:31 +0000 (16:19 +0000)]
[InstCombine] reassociate loop invariant GEP chains to enable LICM
This change brings performance of zlib up by 10%. The example below is from a
hot loop in longest_match() from zlib.
do.body:
%cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
%idx.ext = zext i32 %cur_match.addr.0 to i64
%add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
%add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
%add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1
In this example %idx.ext1 is a loop invariant. It will be moved above the use of
loop induction variable %idx.ext such that it can be hoisted out of the loop by
LICM. The operands that have dependences carried by the loop will be sunk down
in the GEP chain. This patch will produce the following output:
do.body:
%cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
%idx.ext = zext i32 %cur_match.addr.0 to i64
%add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
%add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
%add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext
llvm-svn: 328539
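The reassociation above relies on address arithmetic being associative. The sketch below models pointers as plain integers and ignores inbounds semantics and overflow; the function names are illustrative, and the parameter names follow the IR values in the example.

```python
def original_chain(win, idx_ext, idx_ext1):
    """Address computed by the original GEP chain."""
    add_ptr = win + idx_ext          # depends on the induction variable
    add_ptr2 = add_ptr + idx_ext1    # loop invariant operand, stuck behind it
    return add_ptr2 - 1


def reassociated_chain(win, idx_ext, idx_ext1):
    """Address computed after moving the invariant operands first."""
    add_ptr = win + idx_ext1         # loop invariant
    add_ptr2 = add_ptr - 1           # loop invariant
    return add_ptr2 + idx_ext        # only this step varies per iteration


def hoisted_loop(win, idx_ext1, indices):
    """What LICM can do afterwards: compute the invariant part once."""
    base = win + idx_ext1 - 1
    return [base + idx_ext for idx_ext in indices]
```

Both chains yield the same address, but only the second one exposes an invariant prefix that can be hoisted out of the loop.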
Krzysztof Parzyszek [Mon, 26 Mar 2018 16:17:06 +0000 (16:17 +0000)]
[Pipeliner] Enable more base+offset dependence changes in pipeliner
The pipeliner changes dependences between base+offset instructions
(loads and stores) so that the instructions have more flexibility
to be scheduled with respect to each other. This occurs when the
pipeliner is able to compute that the instructions will not alias
if their order is changed. The previous code enforced the alias
property by checking if the base register is the same, and that the
offset values are either both positive or negative.
This patch improves the alias check by using the API
areMemAccessesTriviallyDisjoint instead. This enables more cases,
especially if the offset is a negative value. The pipeliner uses
the function by creating a new instruction with the offset used
in the next iteration.
Patch by Brendon Cahoon.
llvm-svn: 328538
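One common shape for a "trivially disjoint" test on two accesses that share a base register is an interval check on offset and size. This is a sketch under that assumption; the function name is hypothetical and it is not the target hook the commit calls.

```python
def trivially_disjoint(offset_a, size_a, offset_b, size_b):
    """Return True if [offset, offset + size) ranges off one base cannot overlap."""
    if offset_a <= offset_b:
        low_offset, low_size, high_offset = offset_a, size_a, offset_b
    else:
        low_offset, low_size, high_offset = offset_b, size_b, offset_a
    # The lower access must end at or before the start of the higher one.
    return low_offset + low_size <= high_offset
```

An interval check like this makes no assumption about the sign of the offsets, so it also covers the negative-offset cases that a same-sign test would reject.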
Krzysztof Parzyszek [Mon, 26 Mar 2018 16:10:48 +0000 (16:10 +0000)]
[Pipeliner] Fix calculation when reusing phis
A schedule may require that a phi from the original loop is used in
multiple iterations in the scheduled loop. When this occurs, we generate
multiple phis in the pipelined loop to save the value across iterations.
When we generate the new phis and update the register names in the
pipelined loop, the pipeliner attempts to reuse a previously generated
phi, when possible. The calculation for the name of the new phi needs
to account for the version/iteration of the original phi. Also, in the
epilog, the code only needs to check backwards for a previous iteration
until reaching the first prolog block.
Patch by Brendon Cahoon.
llvm-svn: 328537
Simon Pilgrim [Mon, 26 Mar 2018 16:10:08 +0000 (16:10 +0000)]
[X86][Btver2] Account for the "+i" integer pipe transfer costs (1cy use of JALU0 for GPR PRF write)
llvm-svn: 328536
Krzysztof Parzyszek [Mon, 26 Mar 2018 16:05:55 +0000 (16:05 +0000)]
[Pipeliner] Fix check for order dependences when finalizing instructions
The code in orderDependence that looks at the order dependences between
instructions was processing all the successor and predecessor order
dependences. However, we really only want to check for an order dependence
for instructions scheduled in the same cycle.
Also, fixed how the pipeliner handles output dependences. An output
dependence is also a potential loop carried dependence. The pipeliner
didn't handle this case properly so an invalid schedule could be created
that allowed an output dependence to be scheduled in the next iteration
at the same cycle.
Patch by Brendon Cahoon.
llvm-svn: 328516
Krzysztof Parzyszek [Mon, 26 Mar 2018 15:58:16 +0000 (15:58 +0000)]
[Pipeliner] Fix in the pipeliner phi reuse code
When the definition of a phi is used by a phi in the next iteration,
the pipeliner was assuming that the definition is processed first.
Because of the assumption, an incorrect phi name was used. This patch
has a check to see if the phi definition has been processed already.
Patch by Brendon Cahoon.
llvm-svn: 328510
Krzysztof Parzyszek [Mon, 26 Mar 2018 15:53:23 +0000 (15:53 +0000)]
[Pipeliner] Pipeliner should mark physical registers as used
The software pipeliner attempts to delete dead instructions after
generating the pipelined loop. The code looks for uses of each
instruction. Physical registers should be treated differently because
the use chains do not exist. The code that checks for dead
instructions should assume that definitions of physical registers
are used if the operand doesn't contain the dead flag.
Patch by Brendon Cahoon.
llvm-svn: 328509
Krzysztof Parzyszek [Mon, 26 Mar 2018 15:45:55 +0000 (15:45 +0000)]
[Pipeliner] Correctly update memoperands in the epilog
The pipeliner needs to be conservative when updating the memoperands
of instructions in the epilog. Previously, the pipeliner was changing
the offset of the memoperand based upon the scheduling stage. However,
that is incorrect when control flow branches around the kernel code.
The bug enabled a load and store to the same stack offset to be swapped.
This patch fixes the bug by updating the size of the memoperands to be
UINT_MAX. This conservative value means that dependences will be created
between other loads and stores.
Patch by Brendon Cahoon.
llvm-svn: 328508
Erik Pilkington [Mon, 26 Mar 2018 15:34:36 +0000 (15:34 +0000)]
[demangler] Fix a bug in r328464 found by oss-fuzz.
llvm-svn: 328507
Krzysztof Parzyszek [Mon, 26 Mar 2018 15:32:03 +0000 (15:32 +0000)]
[Hexagon] Give priority to post-incrementing memory accesses in LSR
llvm-svn: 328506
Simon Pilgrim [Mon, 26 Mar 2018 15:30:47 +0000 (15:30 +0000)]
[X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costs
Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)
This also adds missing vcvttss2si tests
llvm-svn: 328505
Pavel Labath [Mon, 26 Mar 2018 15:17:58 +0000 (15:17 +0000)]
Fix TestDisassembleBreakpoint broken by r328488
The first issue was that the test was capturing the "before" disassembly
before launching, and the "after" after. This is a problem because some
of the disassembly will change after we know the load address (e.g. PCs
in call instructions). I fix this by capturing both disassemblies with
the process running.
The second issue was that the refactor in r328488 accidentally changed
the meaning of the test, as it was no longer disassembling the function
which contained the breakpoint.
While inside, I also modernize the test to use
lldbutil.run_to_source_breakpoint and prevent debug-info replication.
llvm-svn: 328504
Ilya Biryukov [Mon, 26 Mar 2018 15:12:30 +0000 (15:12 +0000)]
Migrate dockerfiles to use multi-stage builds.
Summary:
We previously emulated multi-stage builds using two dockerfiles.
Native support from Docker allows us to merge them into one,
simplifying our scripts.
For more details about multi-stage builds, see:
https://docs.docker.com/develop/develop-images/multistage-build/
Reviewers: mehdi_amini, klimek, sammccall
Reviewed By: sammccall
Subscribers: llvm-commits, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D44787
llvm-svn: 328503
Sanjay Patel [Mon, 26 Mar 2018 15:03:57 +0000 (15:03 +0000)]
[InstCombine] distribute fmul over fadd/fsub
This replaces a large chunk of code that was looking for compound
patterns that include these sub-patterns. Existing tests ensure that
all of the previous examples are still folded as expected.
We still need to loosen the FMF check.
llvm-svn: 328502
Simon Pilgrim [Mon, 26 Mar 2018 14:44:24 +0000 (14:44 +0000)]
[X86][Btver2] Fix YMM BLENDPD/BLENDPS + UNPCKPD/UNPCKP instructions costs
These should match the YMM MOVDUP/ PERMILPD/PERMILPS + SHUFPD/SHUFPS shuffles instead of using the WriteFShuffle defaults.
llvm-svn: 328501
Simon Marchi [Mon, 26 Mar 2018 14:41:40 +0000 (14:41 +0000)]
[clangd] Support incremental document syncing
Summary:
This patch adds support for incremental document syncing, as described
in the LSP spec. The protocol specifies ranges in terms of Position (a
line and a character), and our drafts are stored as plain strings. So I
see two things that may not be super efficient for very large files:
- Converting a Position to an offset (the positionToOffset function)
requires searching for end of lines until we reach the desired line.
- When we update a range, we construct a new string, which implies
copying the whole document.
However, for the typical size of a C++ document and the frequency of
update (at which a user types), it may not be an issue. This patch aims
at getting the basic feature in, and we can always improve it later if
we find it's too slow.
Signed-off-by: Simon Marchi <simon.marchi@ericsson.com>
Reviewers: malaperle, ilya-biryukov
Reviewed By: ilya-biryukov
Subscribers: MaskRay, klimek, mgorny, ilya-biryukov, jkorous-apple, ioeric, cfe-commits
Differential Revision: https://reviews.llvm.org/D44272
llvm-svn: 328500
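The two costs mentioned above (a linear line scan and a full copy of the document) show up directly in a minimal sketch. It assumes "\n" line endings and treats the character field as an index into a Python string, which sidesteps the encoding rules of the LSP spec; the function names are illustrative and this is not clangd's positionToOffset.

```python
def position_to_offset(text, line, character):
    """Convert a zero-based (line, character) position to a string offset.

    The loop searches for one newline per preceding line, so the cost
    grows with the line number. The character value is not validated
    against the length of the line.
    """
    offset = 0
    for _ in range(line):
        newline = text.find("\n", offset)
        if newline == -1:
            raise ValueError("line out of range")
        offset = newline + 1
    return offset + character


def apply_change(text, start, end, new_text):
    """Replace the range start..end, given as (line, character) pairs.

    A new string is built, which copies the whole document.
    """
    begin = position_to_offset(text, *start)
    finish = position_to_offset(text, *end)
    return text[:begin] + new_text + text[finish:]
```

For example, apply_change("int a;\nint b;\n", (1, 4), (1, 5), "c") replaces the single character "b" on the second line.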
Andrea Di Biagio [Mon, 26 Mar 2018 14:25:52 +0000 (14:25 +0000)]
[llvm-mca] Fix how views are added to the InstructionTables.
This should fix the stack-use-after-scope reported by the asan buildbots after
revision 328493.
llvm-svn: 328499
Sanjay Patel [Mon, 26 Mar 2018 14:25:43 +0000 (14:25 +0000)]
[InstCombine] check uses before creating instructions for fmul distribution
As the tests show, we could create extra instructions without any obvious benefit.
llvm-svn: 328498
Simon Pilgrim [Mon, 26 Mar 2018 14:03:40 +0000 (14:03 +0000)]
[X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costs
The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps
llvm-svn: 328497
Nicolai Haehnle [Mon, 26 Mar 2018 13:56:53 +0000 (13:56 +0000)]
AMDGPU: Introduce common SOP_Pseudo and VOP_Pseudo TableGen base classes
Differential revision: https://reviews.llvm.org/D44820
Change-Id: I732979e2964006aa15d78a333d8886e6855f319a
llvm-svn: 328496
Alexander Kornienko [Mon, 26 Mar 2018 13:54:17 +0000 (13:54 +0000)]
[clang-format] Wildcard expansion on Windows.
Summary:
Add support for wildcard expansion in command line arguments on Windows.
See https://docs.microsoft.com/en-us/cpp/c-language/expanding-wildcard-arguments
Fixes https://bugs.llvm.org/show_bug.cgi?id=17217
Reviewers: klimek, djasper, rnk
Reviewed By: rnk
Subscribers: rnk, smeenai, zturner, alexfh, mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D44778
llvm-svn: 328495
Carlos Alberto Enciso [Mon, 26 Mar 2018 13:48:03 +0000 (13:48 +0000)]
[SemaCXX] _Pragma("clang optimize off") not affecting lambda.
Declaring "_Pragma("clang optimize off")" before the body of a
function with a lambda leads to the lambda functions in the body
not being affected.
Differential Revision: https://reviews.llvm.org/D43821
llvm-svn: 328494
Andrea Di Biagio [Mon, 26 Mar 2018 13:44:54 +0000 (13:44 +0000)]
[llvm-mca] Add a flag -instruction-info to enable/disable the instruction info view.
llvm-svn: 328493
Andrea Di Biagio [Mon, 26 Mar 2018 13:21:48 +0000 (13:21 +0000)]
[llvm-mca] Update the commandline docs after r328305.
Document that flag -resource-pressure can be used to enable/disable the resource
pressure view. This change should have been part of r328305.
llvm-svn: 328492
Simon Pilgrim [Mon, 26 Mar 2018 13:15:20 +0000 (13:15 +0000)]
[X86][Btver2] Double the AGU and schedule pipe resources for YMM
Both the AGUs and schedule pipes are double pumped for 256-bit instructions as well as the functional units which we already model.
llvm-svn: 328491
Krzysztof Parzyszek [Mon, 26 Mar 2018 13:10:09 +0000 (13:10 +0000)]
[LSR] Allow giving priority to post-incrementing addressing modes
Implement TTI interface for targets to indicate that the LSR should give
priority to post-incrementing addressing modes.
Combination of patches by Sebastian Pop and Brendon Cahoon.
Differential Revision: https://reviews.llvm.org/D44758
llvm-svn: 328490
Pavel Labath [Mon, 26 Mar 2018 12:47:40 +0000 (12:47 +0000)]
Make @skipUnlessSupportedTypeAttribute windows-compatible
- close_fds is not compatible with stdin/out redirection on windows. I
just remove it, as this is not required for correct operation.
- the command string was assuming a posix shell. I rewrite the Popen
invocation to avoid the need for passing the arguments through a shell.
llvm-svn: 328489
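A minimal sketch of the second point above: passing an argument list to Popen so that no shell is involved, and leaving close_fds at its default. The helper name is hypothetical and the decorator's real command line is not shown here.

```python
import subprocess
import sys


def run_without_shell(args):
    """Run args (a list of strings) directly and capture its output.

    No shell is used, so no POSIX quoting rules apply, and close_fds is
    left at its default so that redirection keeps working on Windows.
    """
    process = subprocess.Popen(
        args,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    output, _ = process.communicate()
    return process.returncode, output.decode(errors="replace")
```

A call such as run_without_shell([sys.executable, "-c", "print('ok')"]) behaves the same on POSIX and Windows hosts because each argument reaches the child process unmodified.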
Pavel Labath [Mon, 26 Mar 2018 12:42:07 +0000 (12:42 +0000)]
Add and fix some tests for PPC64
Summary:
TestExprsChar.py
Char is unsigned char by default in PowerPC.
TestDisassembleBreakpoint.py
Modify disassemble testcase to consider multiple architectures.
TestThreadJump.py
Jumping directly to the return line on the PowerPC architecture does not
mean returning the value that is seen in the code. The last test fails
because it needs some assembly at the beginning of the function to be
executed. Skip this test for this architecture.
TestEhFrameUnwind.py
Implement func for ppc64le test case.
TestWatchLocation.py
TestStepOverWatchpoint.py
PowerPC currently supports only one H/W watchpoint.
TestDisassembleRawData.py
Add PowerPC opcode and instruction for disassemble testcase.
Reviewers: labath
Reviewed By: labath
Subscribers: davide, labath, alexandreyy, lldb-commits, luporl, lbianc
Differential Revision: https://reviews.llvm.org/D44472
Patch by Alexandre Yukio Yamashita <alexandre.yamashita@eldorado.org.br>.
llvm-svn: 328488
Andrea Di Biagio [Mon, 26 Mar 2018 12:04:53 +0000 (12:04 +0000)]
[llvm-mca] Add flag -instruction-tables to print the theoretical resource pressure distribution for instructions (PR36874)
The goal of this patch is to address most of PR36874. To fully fix PR36874 we
need to split the "InstructionInfo" view from the "SummaryView". That would make it
easy to check the latency and rthroughput as well.
The patch reuses all the logic from ResourcePressureView to print out the
"instruction tables".
We have an entry for every instruction in the input sequence. Each entry reports
the theoretical resource pressure distribution. Resource pressure is uniformly
distributed across all the processor resource units of a group.
At the moment, the backend pipeline is not configurable, so the only way to fix
this is by creating a different driver that simply sends instruction events to
the resource pressure view. That means, we don't use the Backend interface.
Instead, it is simpler to just have a different code-path for when flag
-instruction-tables is specified.
Once Clement addresses bug 36663, then we can port the "instruction tables"
logic into a stage of our configurable pipeline.
Updated the BtVer2 test cases (thanks Simon for the help). Now we pass flag
-instruction-tables to each modified test.
Differential Revision: https://reviews.llvm.org/D44839
llvm-svn: 328487
Pavel Labath [Mon, 26 Mar 2018 12:00:52 +0000 (12:00 +0000)]
[LLDB][PPC64] Fix TestGdbRemoteAuxvSupport
Summary: PPC64's auxvec has a special key that must be ignored.
Reviewers: clayborg, labath
Reviewed By: clayborg, labath
Subscribers: alexandreyy, lbianc
Differential Revision: https://reviews.llvm.org/D43771
Patch by Leandro Lupori <leandro.lupori@gmail.com>.
llvm-svn: 328486
Pavel Labath [Mon, 26 Mar 2018 11:45:32 +0000 (11:45 +0000)]
Add a test for setting the load address of a module with differing physical/virtual addresses
Summary:
First attempt at landing D42145 was reverted because it caused test
failures on some android devices. It turned out this was because these
devices had vdso modules with differing physical and virtual addresses.
This was not caught earlier because all of the modules in our tests
either lack physical addresses or have them identical to virtual ones.
In the discussion on the patch, we came to the conclusion that in the
scenario where we are merely setting a load address of a module (for
example from a dynamic loader plugin), we should always use virtual
addresses (i.e., preserve status quo). This patch adds a test to make
sure we don't regress in that direction.
Reviewers: owenpshaw
Subscribers: lldb-commits
Differential Revision: https://reviews.llvm.org/D44738
llvm-svn: 328485
Carlos Alberto Enciso [Mon, 26 Mar 2018 11:38:01 +0000 (11:38 +0000)]
Test commit - adding a new line.
llvm-svn: 328484
Max Kazantsev [Mon, 26 Mar 2018 11:31:46 +0000 (11:31 +0000)]
[LoopUnroll] Fix dangling pointers in SCEV
The current logic of loop SCEV invalidation in the Loop Unroller implicitly relies on
the fact that the exit count of outer loops cannot depend on exiting blocks of
inner loops, which is true in the current implementation of backedge taken count
calculation but is wrong in general. As a result, when we only forget the loop that
we have just unrolled, we may still have cached data for its outer loops (in particular,
exit counts) which keeps references to blocks of the inner loop that could have been
changed or even deleted.
The attached test demonstrates a situation where, after unrolling the innermost loop,
the outermost loop contains a dangling pointer to a non-existent block. The problem
shows up when we apply patch https://reviews.llvm.org/D44677 that makes SCEV
smarter about exit count calculation. I am not sure if the bug exists without this patch;
it appears that the current code is accidentally correct just because in practice the exact backedge
taken count for outer loops with complex control flow inside is never calculated.
But when SCEV learns to do so, this problem shows up.
This patch replaces the existing logic of SCEV loop invalidation with a correct one, which
happens to be invalidation of the outermost loop (which also leads to invalidation of all
loops inside of it). It is the only way to ensure that no outer loop keeps dangling pointers
to removed blocks, or just outdated information that has changed after unrolling.
Differential Revision: https://reviews.llvm.org/D44818
Reviewed By: samparker
llvm-svn: 328483
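The invalidation strategy described in r328483 can be sketched with a toy model: cached data is keyed by loop, and forgetting any loop walks up to the outermost enclosing loop and drops the cache entries for the whole nest. The `Loop` class, `outermost`, and `forget_nest` are hypothetical names for this illustration and do not correspond to LLVM's classes or to the patch itself.

```python
class Loop:
    """Toy loop-nest node (not LLVM's Loop class)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def outermost(loop):
    """Walk parent links until the top-level loop of the nest is reached."""
    while loop.parent is not None:
        loop = loop.parent
    return loop


def forget_nest(cache, loop):
    """Drop cached data for the outermost loop around `loop` and all loops inside it."""
    worklist = [outermost(loop)]
    while worklist:
        current = worklist.pop()
        cache.pop(current.name, None)
        worklist.extend(current.children)


outer = Loop("outer")
middle = Loop("middle", outer)
inner = Loop("inner", middle)
other = Loop("other")  # a separate top-level loop, must stay cached

cache = {
    "outer": "exit count referring to a block of inner",
    "middle": "exit count",
    "inner": "exit count",
    "other": "unrelated",
}
forget_nest(cache, inner)  # unrolling `inner` invalidates the whole nest
```

Forgetting only `inner` would have left the `outer` entry in place, which is the stale-reference situation the commit message describes.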
Hans Wennborg [Mon, 26 Mar 2018 10:07:51 +0000 (10:07 +0000)]
Revert r328386 "[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32"
This broke Chromium (see crbug.com/825748). It looks like mstorsjo's follow-up
patch at D44876 fixes this, but let's revert back to green for now until that's
ready to land.
(Also reverts r328443.)
> Both GCC and MSVC only look at the low byte of a boolean when it is
> passed.
llvm-svn: 328482
Benjamin Kramer [Mon, 26 Mar 2018 09:44:24 +0000 (09:44 +0000)]
[DeadArgElim] Strip allocsize attributes when deleting an argument.
Since allocsize refers to the argument number it gets invalidated when
an argument is removed and the numbers shift.
llvm-svn: 328481
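A small sketch of why the attribute in r328481 has to go: if a function records "the allocation size is parameter number N" and an earlier parameter is deleted, N no longer names the same parameter. The dictionary-based attribute model and the name `remove_argument` are assumptions made for this example, not the pass's actual code.

```python
def remove_argument(params, attrs, dead_index):
    """Return (params, attrs) with one parameter removed.

    `attrs` may hold an "allocsize" entry, the index of the parameter that
    carries the allocation size. Removing a parameter shifts later indices,
    so the entry is dropped rather than left pointing at the wrong slot.
    """
    new_params = params[:dead_index] + params[dead_index + 1:]
    new_attrs = {key: value for key, value in attrs.items() if key != "allocsize"}
    return new_params, new_attrs


params = ["unused", "size"]
attrs = {"allocsize": 1, "nounwind": True}
new_params, new_attrs = remove_argument(params, attrs, 0)
```

After the removal, index 1 is out of range for the single remaining parameter, so keeping `allocsize` unchanged would describe a parameter that no longer exists.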
Sam Parker [Mon, 26 Mar 2018 09:29:42 +0000 (09:29 +0000)]
[IRCE] Enable increasing loops of variable bounds
CanBeMin is currently used which will report true for any unknown
values, but often a check is performed outside the loop which covers
this situation:
  for (int i = 0; i < N; ++i)
    ...
  if (N > 0)
    for (int i = 0; i < N; ++i)
      ...
So I've added 'LoopGuardedAgainstMin', which reports whether N is
greater than the minimum value, which then allows loops with a variable
loop count to be optimised. I've also moved the increasing bound
checking into its own function and replaced SumCanReachMax with another
isLoopEntryGuardedByCond function.
llvm-svn: 328480
George Rimar [Mon, 26 Mar 2018 08:58:16 +0000 (08:58 +0000)]
This is PR36799.
Currently, we might have a bug with scripts like below:
  .foo : ALIGN(8)
  {
    *(.foo)
  } > ram
because we do not expand the memory region when doing ALIGN.
This might result in file range overlaps. The patch fixes the issue.
Differential revision: https://reviews.llvm.org/D44730
llvm-svn: 328479
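One plausible way such an overlap can arise is shown below as a toy model: a section is placed at an aligned address, but the memory region's cursor advances only by the section size and ignores the alignment padding. The helper names and the `count_padding` switch are hypothetical; this is a sketch under those assumptions, not the linker's code.

```python
def align_to(addr, alignment):
    """Round `addr` up to the next multiple of `alignment`."""
    return (addr + alignment - 1) // alignment * alignment


def place_section(region_cursor, size, alignment, count_padding):
    """Place a section in a memory region and return (start, new_cursor)."""
    start = align_to(region_cursor, alignment)
    if count_padding:
        # The region grows by padding plus section size.
        new_cursor = start + size
    else:
        # The region grows by the section size only; the padding is lost.
        new_cursor = region_cursor + size
    return start, new_cursor


start, stale_cursor = place_section(0x1001, 4, 8, count_padding=False)
_, expanded_cursor = place_section(0x1001, 4, 8, count_padding=True)
```

With the padding ignored, the cursor ends up before the end of the section just placed, so the next section assigned to the same region could be given an overlapping range.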
Martin Storsjo [Mon, 26 Mar 2018 08:41:10 +0000 (08:41 +0000)]
[ARM] Simplify constructing the ARMArchFeature string. NFC.
Differential Revision: https://reviews.llvm.org/D44819
llvm-svn: 328478
Eric Fiselier [Mon, 26 Mar 2018 07:06:25 +0000 (07:06 +0000)]
Fix test case initialization issues in permissions test
llvm-svn: 328477
Eric Fiselier [Mon, 26 Mar 2018 06:23:55 +0000 (06:23 +0000)]
Implement filesystem::perm_options specified in NB comments.
The NB comments for filesystem changed permissions and added
a new enum `perm_options` which control how the permissions
are applied.
This implements that NB resolution
llvm-svn: 328476
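To illustrate how an option enum can control the way permissions are applied, the sketch below models the replace, add, and remove modes on plain permission bit masks. It is a Python model with hypothetical names (`PermOptions`, `apply_perms`), it leaves out symlink handling, and it is not the libc++ implementation.

```python
from enum import Enum


class PermOptions(Enum):
    REPLACE = "replace"  # overwrite the current permission bits
    ADD = "add"          # OR the given bits into the current bits
    REMOVE = "remove"    # clear the given bits from the current bits


def apply_perms(current, perms, option):
    """Return the new permission bits after applying `perms` under `option`."""
    if option is PermOptions.REPLACE:
        return perms
    if option is PermOptions.ADD:
        return current | perms
    if option is PermOptions.REMOVE:
        return current & ~perms
    raise ValueError(f"unknown option: {option!r}")


executable = apply_perms(0o644, 0o111, PermOptions.ADD)
owner_only = apply_perms(0o755, 0o055, PermOptions.REMOVE)
```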
Eric Fiselier [Mon, 26 Mar 2018 05:46:57 +0000 (05:46 +0000)]
Make filesystem tests generic between experimental and std versions.
As I move towards implementing std::filesystem, there is a need to
make the existing tests run against both the std and experimental versions.
Additionally, it's helpful to allow running the tests against other
implementations of filesystem.
This patch converts the test to easily target either. First, it
adds a filesystem_include.hpp header which is solely responsible
for selecting and including the correct implementation. Second,
it converts existing tests to use this header instead of including
filesystem directly.
llvm-svn: 328475
Craig Topper [Mon, 26 Mar 2018 05:05:12 +0000 (05:05 +0000)]
[X86] Fix the SchedRW for intrinsic register form of SQRT/RCP/RSQRT.
llvm-svn: 328474
Craig Topper [Mon, 26 Mar 2018 05:05:10 +0000 (05:05 +0000)]
[X86] Merge the SSE and AVX versions of fp divs and sqrts in the SandyBridge/Haswell/Broadwell/Skylake scheduler models.
I've used Agner's data as best I could to get the values to converge on.
llvm-svn: 328473
Craig Topper [Mon, 26 Mar 2018 04:20:36 +0000 (04:20 +0000)]
[X86] Add itinerary to intrinsic version of sqrtss, rcpss, and rsqrtss instructions.
llvm-svn: 328472
Craig Topper [Mon, 26 Mar 2018 02:17:15 +0000 (02:17 +0000)]
[X86] Correct the itineraries for the dot product instructions.
llvm-svn: 328471
Craig Topper [Mon, 26 Mar 2018 02:17:14 +0000 (02:17 +0000)]
[X86] Use the same itinerary for VCVTDQ2PD as the SSE version so that the generated scheduler classes will merge.
llvm-svn: 328470
Craig Topper [Mon, 26 Mar 2018 02:17:13 +0000 (02:17 +0000)]
[X86] Swap the itineraries on the memory and register forms of CVTDQ2PD.
They were backwards.
llvm-svn: 328469
Craig Topper [Mon, 26 Mar 2018 02:17:12 +0000 (02:17 +0000)]
[X86] Give VMOVSX/ZX the same itinerary as the SSE version so they'll reuse the same generated scheduler class.
llvm-svn: 328468
Vitaly Buka [Mon, 26 Mar 2018 01:29:48 +0000 (01:29 +0000)]
[sanitizer] Make test compatible with Darwin
llvm-svn: 328467
Craig Topper [Sun, 25 Mar 2018 23:52:06 +0000 (23:52 +0000)]
[X86] Give vpmsadbw the same itinerary as the SSE version so they'll be able to share the same generated scheduler class.
llvm-svn: 328466
Craig Topper [Sun, 25 Mar 2018 23:40:56 +0000 (23:40 +0000)]
[X86] Move (v)movss to port 5 only for Skylake. Move (v)movups/d to port 015 for Skylake.
This matches Agner's data and is consistent with what the EVEX instructions were doing on SKX.
llvm-svn: 328465
Erik Pilkington [Sun, 25 Mar 2018 22:50:33 +0000 (22:50 +0000)]
[demangler] Use a back-patching scheme to resolve forward references.
Strictly in a conversion operator's type, a <template-param> refers to a
<template-arg> that is further ahead in the mangled name. Instead of
doing a second parse to resolve these, introduce a
ForwardTemplateReference Node and back-patch the referenced
<template-arg> when we're in the right context.
This is also a correctness fix, previously we would only do a second
parse if the <template-param> was out of bounds in the current set of
<template-args>. This led to misdemangles (gasp!) when the conversion
operator was a member of a templated struct, for instance.
llvm-svn: 328464
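The back-patching idea in r328464 can be sketched in a single pass over a made-up token stream: a reference that appears before its target is emitted as a placeholder and filled in once the arguments are seen. The token format (`T<n>` for a parameter reference, `I:a,b` for an argument list) is invented for this example and is not the Itanium mangling; `ForwardRef` and `demangle_toy` are hypothetical names.

```python
class ForwardRef:
    """Placeholder for a reference whose target has not been parsed yet."""

    def __init__(self, index):
        self.index = index
        self.target = None

    def render(self):
        if self.target is None:
            raise ValueError(f"unresolved forward reference T{self.index}")
        return self.target


def demangle_toy(tokens):
    """Render `tokens` in one pass, back-patching forward references."""
    output = []   # strings and ForwardRef placeholders, in output order
    pending = []  # placeholders still waiting for their argument list
    for token in tokens:
        if token.startswith("T"):
            ref = ForwardRef(int(token[1:]))
            pending.append(ref)
            output.append(ref)
        elif token.startswith("I:"):
            args = token[2:].split(",")
            for ref in pending:
                ref.target = args[ref.index]  # the back-patch step
            pending.clear()
        else:
            output.append(token)
    return " ".join(p.render() if isinstance(p, ForwardRef) else p for p in output)


result = demangle_toy(["operator", "T1", "I:int,char"])
```

No second parse is needed: the placeholder already sits at the right position in the output, and only its target is filled in later.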
Erik Pilkington [Sun, 25 Mar 2018 22:49:57 +0000 (22:49 +0000)]
[demangler] Tweak how parameter pack sizes are determined.
Rather than eagerly propagating up parameter pack sizes in Node ctors,
find the parameter pack size during printing. This is being done to
support back-patching forward referencing <template-param>s.
llvm-svn: 328463
Erik Pilkington [Sun, 25 Mar 2018 22:49:16 +0000 (22:49 +0000)]
[demangler] Support for clang's enable_if attribute.
Fixes PR33569.
llvm-svn: 328462
Sanjay Patel [Sun, 25 Mar 2018 21:16:33 +0000 (21:16 +0000)]
[PatternMatch] allow undef elements when matching vector FP +0.0
This continues the FP constant pattern matching improvements from:
https://reviews.llvm.org/rL327627
https://reviews.llvm.org/rL327339
https://reviews.llvm.org/rL327307
Several integer constant matchers also have this ability. I'm
separating matching of integer/pointer null from FP positive zero
and renaming/commenting to make the functionality clearer.
llvm-svn: 328461
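A minimal sketch of matching a vector of FP +0.0 while tolerating undef lanes, as described in r328461: undef lanes are skipped and every defined lane must be positive zero. Representing undef as `None`, the function names, and the choice to reject an all-undef vector are assumptions of this sketch, not a statement about the matcher's exact behavior.

```python
import math

UNDEF = None  # stand-in for an undef vector element


def is_pos_zero(value):
    """True only for +0.0; -0.0 compares equal to 0.0, so check the sign too."""
    return value == 0.0 and math.copysign(1.0, value) > 0


def matches_pos_zero_vector(elements):
    """Match when every defined element is +0.0; undef elements are ignored."""
    defined = [e for e in elements if e is not UNDEF]
    return bool(defined) and all(is_pos_zero(e) for e in defined)
```

For example, `matches_pos_zero_vector([0.0, UNDEF])` is accepted, while a vector containing `-0.0` is rejected because only positive zero qualifies.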
Simon Pilgrim [Sun, 25 Mar 2018 20:16:53 +0000 (20:16 +0000)]
[X86] Use WriteResPair for WriteIDiv to cleanup sched defs. NFCI.
llvm-svn: 328460
Simon Pilgrim [Sun, 25 Mar 2018 19:20:08 +0000 (19:20 +0000)]
[SchedModel] Remove instregex entries that don't match any instructions
This patch throws a fatal error if an instregex entry doesn't actually match any instructions. This is part of the work to reduce the compile time impact of increased instregex usage (PR35955), although the x86 models seem to be relatively clean.
All the cases I encountered have now been fixed in trunk and this will ensure they don't get reintroduced.
Differential Revision: https://reviews.llvm.org/D44687
llvm-svn: 328459
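The check introduced in r328459 can be illustrated with a toy expander that raises an error as soon as a pattern matches no instruction name. The function name `expand_instregex`, the use of a full-string match, and the sample instruction list are assumptions for this sketch; the real check lives in TableGen and may anchor patterns differently.

```python
import re


def expand_instregex(patterns, instruction_names):
    """Expand each pattern to the names it matches; fail on a dead pattern."""
    matched = []
    for pattern in patterns:
        regex = re.compile(pattern)
        hits = [name for name in instruction_names if regex.fullmatch(name)]
        if not hits:
            raise ValueError(f"instregex '{pattern}' has no matches")
        matched.extend(hits)
    return matched


names = ["ADD32rr", "ADD32rm", "SUB32rr"]
adds = expand_instregex(["ADD32r(r|m)"], names)
```

Failing loudly keeps stale patterns from accumulating, which matters because every pattern is tested against every instruction name.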
Simon Pilgrim [Sun, 25 Mar 2018 19:17:17 +0000 (19:17 +0000)]
[X86][SkylakeClient] Fix missing comma
llvm-svn: 328458
Simon Pilgrim [Sun, 25 Mar 2018 19:07:17 +0000 (19:07 +0000)]
[ARM] Remove sched model instregex entries that don't match any instructions (D44687)
Reviewed by @javed.absar
llvm-svn: 328457
Simon Pilgrim [Sun, 25 Mar 2018 18:49:48 +0000 (18:49 +0000)]
[X86] Add missing full stop to comment. NFCI.
llvm-svn: 328456
Sanjay Patel [Sun, 25 Mar 2018 17:48:20 +0000 (17:48 +0000)]
[InstSimplify, InstCombine] add/update tests with FP +0.0 vector with undef; NFC
llvm-svn: 328455
Craig Topper [Sun, 25 Mar 2018 17:33:14 +0000 (17:33 +0000)]
[X86][SkylakeClient] Fix a set of regular expressions that were checking for optionally starting with 'Y' instead of 'V'
These bad regexes were introduced by r328435.
llvm-svn: 328454
Simon Pilgrim [Sun, 25 Mar 2018 17:28:06 +0000 (17:28 +0000)]
[X86][MMX] MOVQ2DQ/MOVDQ2Q are better described as WriteVecMove than WriteMove
Not that it makes a difference to current cost values, but it will when we try to better model GPR-SIMD transfer costs.
llvm-svn: 328453
Simon Pilgrim [Sun, 25 Mar 2018 17:25:37 +0000 (17:25 +0000)]
[X86][SkylakeServer] Merge multiple instregex. NFCI
llvm-svn: 328452
Craig Topper [Sun, 25 Mar 2018 15:58:12 +0000 (15:58 +0000)]
[X86] Update cost model for Goldmont. Add fsqrt costs for Silvermont
Add fdiv costs for Goldmont using table 16-17 of the Intel Optimization Manual. Also add overrides for FSQRT for Goldmont and Silvermont.
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D44644
llvm-svn: 328451
Sanjay Patel [Sun, 25 Mar 2018 14:24:32 +0000 (14:24 +0000)]
[InstCombine] adjust test comments; NFC
llvm-svn: 328450
Sanjay Patel [Sun, 25 Mar 2018 14:19:25 +0000 (14:19 +0000)]
[InstCombine] consolidate casted icmp vector tests
We have thorough coverage of predicates and scalar types,
so we just need a sampling of vector tests to show that
things are working or not with vector types.
llvm-svn: 328449