review.tizen.org Git - platform/upstream/llvm.git/log

Revert "[clang-format] Remove empty lines before }[;] // comment"

This reverts commit r327861.

The empty line before namespaces is desired in some places. We need a
better approach to handle this.

llvm-svn: 328621

[X86][Btver2] Add MMX_PMOVMSKBrr to MOVMSK scheduler class

llvm-svn: 328620

[analyzer] LoopUnrolling: update the matched assignment operators

Extended the matched assignment operators when checking for bound changes in a body of the loop by using the freshly added isAssignmentOperator matcher.
This covers all the (current) possible assignments, tests added as well.

Differential Revision: https://reviews.llvm.org/D38921

llvm-svn: 328619

[ASTMatchers] Add isAssignmentOperator matcher

Adding a matcher for BinaryOperator and cxxOperatorCallExpr to be able to
decide whether it is any kind of assignment operator or not. This would be
useful since allows us to easily detect assignments via matchers for static
analysis (Tidy, SA) purposes.

Differential Revision: https://reviews.llvm.org/D44893

llvm-svn: 328618

[PowerPC] Secure PLT support

This patch supports secure PLT mode for PowerPC 32 architecture.

Differential Revision: https://reviews.llvm.org/D42112

llvm-svn: 328617

[MIPS] Add static_assert that all Fixups are handled in getFixupKind

Summary:
I recently added a new Fixup kind to our fork of LLVM but forgot to add
it to the table in MipsAsmBackend.cpp. With this static_assert the error
would have been caught instead of zero-initializing the array entries for
the new fixups.

Reviewers: sdardis, atanasyan

Reviewed By: atanasyan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44895

llvm-svn: 328616

[LoopUnroll][NFC] Remove redundant canPeel check

We check `canPeel` twice: when evaluating the number of iterations to be peeled
and within the method `peelLoop` that performs peeling. This method is only
executed if the calculated peel count is positive. Thus, the check in `peelLoop` can
never fail. This patch replaces this check with an assert.

Differential Revision: https://reviews.llvm.org/D44919
Reviewed By: fhahn

llvm-svn: 328615

[IRCE] Enable decreasing loops of non-const bound

As a follow-up to r328480, this updates the logic for the decreasing
safety checks in a similar manner:
- CanBeMax is replaced by CannotBeMaxInLoop which queries
  isLoopEntryGuardedByCond on the maximum value.
- SumCanReachMin is replaced by isSafeDecreasingBound which includes
  some logic from parseLoopStructure and, again, has been updated to
  use isLoopEntryGuardedByCond on the given bounds.

Differential Revision: https://reviews.llvm.org/D44776

llvm-svn: 328613

[NFC] Fix comments in getExact()

llvm-svn: 328612

[SCEV] Make exact taken count calculation more optimistic

Currently, `getExact` fails if it sees two exit counts in different blocks. There is
no solid reason to do so, given that we only calculate exact non-taken count
for exiting blocks that dominate latch. Using this fact, we can simply take min
out of all exits of all blocks to get the exact taken count.

This patch makes the calculation more optimistic with enforcing our assumption
with asserts. It allows us to calculate exact backedge taken count in trivial loops
like

  for (int i = 0; i < 100; i++) {
    if (i > 50) break;
    . . .
  }

Differential Revision: https://reviews.llvm.org/D44676
Reviewed By: fhahn

llvm-svn: 328611

[lld] fix data race in ICF.cpp

Summary: Fixes PR36823.

Reviewers: ruiu, pcc, rnk

Reviewed By: ruiu

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44716

llvm-svn: 328610

[SCEV] Add one more case in computeConstantDifference

This patch teaches `computeConstantDifference` handle calculation of constant
difference between `(X + C1)` and `(X + C2)` which is `(C2 - C1)`.

Differential Revision: https://reviews.llvm.org/D43759
Reviewed By: anna

llvm-svn: 328609

[MachineScheduler] Add itinerary to schedcover.py. Make default work in the command line filter

Summary:
This patch adds itinerary support to the schedcover.py script. I've been trying to use this script to figure out why SSE and AVX instructions are ending up in separate tablegen scheduler classes and sometimes its because we are using different itineraries.

Rather than using None to indicate the default scheduler model, I now use the string "default". I had to hack around the sorting a little to keep "default" at the beginning. But this also makes it so you can specify "default" on the command line to just get the defaults

I also fixed the regular expression code so that the no_default wasn't evaluated twice.

Reviewers: RKSimon, atrick, jmolloy, javed.absar

Reviewed By: javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44834

llvm-svn: 328608

[coroutines] Fix unused warning on result of co_yield.

This patch follows up on r328602, which fixed the spurious unused
result warning for `co_await`.

llvm-svn: 328607

[coroutines] Fix invalid source range in co_await call expressions.

Summary:
Currently an invalid source range is generated for the member call expressions of `co_await`. The end location of the call expression is the `co_await` token loc, while the start is the location of the operand. This causes crashes when the source range is used to produce diagnostics.

This patch fixes the issues by using the expression location instead of the token location when building the member calls.

Reviewers: GorNishanov, rsmith, vsk, aaron.ballman

Reviewed By: vsk

Subscribers: cfe-commits, modocache

Differential Revision: https://reviews.llvm.org/D44915

llvm-svn: 328606

Remove extraneous local variable. NFC.

llvm-svn: 328605

Update comments.

llvm-svn: 328604

Revert "Revert "[lit] Generalized /dev/null support on Windows.""

Summary:
This reverts commit r328596.

Checking if the arguments are strings before testing if they contain "/dev/null".

Reviewers: rnk

Reviewed By: rnk

Subscribers: delcypher, llvm-commits

Differential Revision: https://reviews.llvm.org/D44914

llvm-svn: 328603

Fix unused expression warning in co_await.

Previously, anytime the result of the resume expression in
operator co_await was unused, a warning was generated. This
patch fixes the issue by only generating the unused result warning
if calling `await_resume()` would also generate a warning.

llvm-svn: 328602

[x86] add RUN for target before roundss; NFC

llvm-svn: 328601

Revert "[asan] Replace vfork with fork."

Replacing vfork with fork results in significant slowdown of certain
apps (in particular, memcached).

This reverts r327752.

llvm-svn: 328600

Remove dead method

llvm-svn: 328599

[lit] Temporarily disable shtest-timeout.py on darwin

Disabled until fixed in order to avoid random failures on green dragon.

rdar://problem/38774530

llvm-svn: 328598

[Edit, Rewrite] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

llvm-svn: 328597

Revert "[lit] Generalized /dev/null support on Windows."

This reverts commit ca7fdbb974384ce5a05528b22a41d46b1cc13e92.

llvm-svn: 328596

Add a build dependency from libMC to libDebugInfoCodeView to match the reality of header dependencies here

llvm-svn: 328595

Fix for header rename in LLVM

llvm-svn: 328594

Move CVDebugRecord from CodeView to Object to fix layering

llvm-svn: 328593

[x86] add tests for ftrunc; NFC

llvm-svn: 328592

Add the same new entitlement from r326399 to
the macos entitlement list.
<rdar://problem/38887712>

llvm-svn: 328591

[DebugInfoPDB] Print the method name along with the variant value

Before this change, using dumpProperties() with PDBSymbolData
would look like this:

  get_locationType: 3
  1

After this change:

  get_locationType: 3
  get_value: 1

llvm-svn: 328590

[lit] Generalized /dev/null support on Windows.

Generalized /dev/null remapping on Windows, and added test.

Reviewers: rnk

Reviewed By: rnk

Subscribers: amccarth, zturner, delcypher, llvm-commits

Differential Revision: https://reviews.llvm.org/D44771

llvm-svn: 328589

[clang-doc] Removing -Wunused-variable warning

Warning was appearing in release with debug info build, this removes it.

Differential Revision: https://reviews.llvm.org/D44912

llvm-svn: 328588

[DebugInfoPDB] Add methods to get the compiland and line numbers with PDBSymbolData

llvm-svn: 328587

[DebugInfoPDB] Add DIA implementation of findLineNumbersByRVA

This method is used to find line numbers for PDBSymbolData
that have an invalid virtual address.

llvm-svn: 328586

[DebugInfoPDB] Add DIA implementation of addressForVA and addressForRVA

These are used in finding line numbers for PDBSymbolData

llvm-svn: 328585

[Frontend] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

llvm-svn: 328584

Fix newlines. NFCI.

llvm-svn: 328583

[X86] Add WriteCRC32 scheduler class

Currently CRC32 instructions use the WriteFAdd class, this patch splits them off into their own, at the moment it is still mostly just a duplicate of WriteFAdd but it can now be tweaked on a target by target basis.

Differential Revision: https://reviews.llvm.org/D44647

llvm-svn: 328582

Use local symbols for creating .stack-size.

llvm-svn: 328581

Fix go bindings test when using goma distributed build tool

Goma[1] is a distributed build system similar to distcc and icecc
primarily used to compile Chromium. The client is open source, and
hopefully soon the server will be as well. The intended usage model is
similar to most distributed build systems: prefix gomacc onto your
compiler command line, and it transparently distributes compilation.

The go lit config wants to determine the host compiler binary, so it
needs some extra logic to avoid looking at these prefixes.

[1] https://chromium.googlesource.com/infra/goma/client/

llvm-svn: 328580

Refactor SharedFile::parseRest. NFC.

SharedFile::parseRest function grew organically and got a bit hard to
understand. This patch refactor it. This patch also adds comments.

Differential Revision: https://reviews.llvm.org/D44860

llvm-svn: 328579

Use correct format specifier.
Review comment on r328235 by James Henderson.

llvm-svn: 328578

[MemorySSA] Fix exponential compile-time updating MemorySSA.

MemorySSAUpdater::getPreviousDefRecursive is a recursive algorithm, for
each block, it computes the previous definition for each predecessor,
then takes those definitions and combines them. But currently it doesn't
remember results which it already computed; this means it can visit the
same block multiple times, which adds up to exponential time overall.

To fix this, this patch adds a cache. If we computed the result for a
block already, we don't need to visit it again because we'll come up
with the same result. Well, unless we RAUW a MemoryPHI; in that case,
the TrackingVH will be updated automatically.

This matches the original source paper for this algorithm.

The testcase isn't really a test for the bug, but it adds coverage for
the case where tryRemoveTrivialPhi erases an existing PHI node. (It's
hard to write a good regression test for a performance issue.)

Differential Revision: https://reviews.llvm.org/D44715

llvm-svn: 328577

[libFuzzer] Do not optimize minimize_two_crashes.test.

Speculative fix for build bot breakage on Mac.

llvm-svn: 328576

Move blocktime_str variable right before its first use

llvm-svn: 328575

[Hexagon] Assertion failure in HexagonSubtarget.cpp

In restoreLatency, replace range-for loop with std::find.

Patch by Jyotsna Verma.

llvm-svn: 328574

[X86][Btver2] Add (U)COMISD/(U)COMISD scheduler costs

Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)

llvm-svn: 328573

[SLP] Add more checks to a test case. NFC.

llvm-svn: 328572

Reduce code duplication a bit.

Thanks to George Rimar for pointing it out.

llvm-svn: 328571

[X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32

Summary:
Re-lands r328386 and r328443, reverting r328482.

Incorporates fixes from @mstorsjo in D44876 (thanks!) so that small
parameters in i8 and i16 do not end up in the SysV register parameters
(EDI, ESI, etc).

I added tests for how we receive small parameters, since that is the
important part. It's always safe to store more bytes than will be read,
but the assumptions you make when loading them are what really matter.

I also tested this by self-hosting clang and it passed tests on win64.

Reviewers: mstorsjo, hans

Subscribers: hiraditya, mstorsjo, llvm-commits

Differential Revision: https://reviews.llvm.org/D44900

llvm-svn: 328570

Reduce code duplication a bit. NFC

llvm-svn: 328569

Add summarizeStats.py to tools directory

The summarizeStats.py script processes raw data provided by the
instrumented (stats-gathering) OpenMP* runtime library. It provides:

1) A radar chart which plots counters as frequency (per GigaTick) of use within
   the program. The frequencies are plotted as log10, however values less than
   one are kept as it is and represented in red color. This was done to help
   visualize the differences better.
2) Pie charts separating total time as compute and non-compute. The compute and
   non-compute times have their own pie charts showing the constructs that
   contributed to them. The percentages listed are with respect to the total
   time.
3) '.csv' file with percentage of time spent within the different constructs.

The script can be used as:
$ python $PATH_TO_SCRIPT/summarizeStats.py instrumented1.csv instrumented2.csv

Patch by Taru Doodi

Differential Revision: https://reviews.llvm.org/D41838

llvm-svn: 328568

[MS] Fix late-parsed template infinite loop in eager instantiation

Summary:
This fixes PR33561 and PR34185.

Don't store pending template instantiations for late-parsed templates in
the normal PendingInstantiations queue. Instead, use a separate list
that will only be parsed and instantiated at end of TU when late
template parsing actually works and doesn't infinite loop.

Reviewers: rsmith, thakis, hans

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D44846

llvm-svn: 328567

[X86] Add WriteBitScan/WriteLZCNT/WriteTZCNT/WritePOPCNT scheduler classes (PR36881)

Give the bit count instructions their own scheduler classes instead of forcing them into existing classes.

These were mostly overridden anyway, but I had to add in costs from Agner for silvermont and znver1 and the Fam16h SoG for btver2 (Jaguar).

Differential Revision: https://reviews.llvm.org/D44879

llvm-svn: 328566

Remove unused file, ExecutionEngine/MCJIT/ObjectBuffer.h

This header also wasn't self contained/modular - but with no users, it
didn't seem worth fixing because it'd break so easily again.

llvm-svn: 328565

[XCore] Change std::sort to llvm::sort in response to r327219

Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: dblaikie, RKSimon, robertlytton

Reviewed By: robertlytton

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44875

llvm-svn: 328564

[lit] Implement 'cat' command for internal shell

Fixes PR36449

Patch by Chamal de Silva

Differential Revision: https://reviews.llvm.org/D43501

llvm-svn: 328563

Delete pdbutil diff mode.

This has been made obsolete by the fact that almost all of the
things it previously checked for are no longer relevant since
we can just compare bytes in a lot of places.

llvm-svn: 328562

[Hexagon] Add more lit tests

llvm-svn: 328561

[InstCombine] improve code comment; NFC

llvm-svn: 328560

[ELF] GotSection increment NumEntries when Target saves GlobalOffsetTable in the .got

When the target saves ElfSym::GlobalOffsetTable in the .got rather than
.got.plt, Target->GotHeaderEntriesNum states the number of extra entries
required in the .got. Rather than having to add Target->GotHeaderEntriesNum to
NumEntries in every function which refers to NumEntries, this patch changes the
initial value of NumEntries in the constructor.

Differential Revision: https://reviews.llvm.org/D44744

llvm-svn: 328559

[Power9]Legalize and emit code for quad-precision convert from double-precision

Legalize and emit code for quad-precision floating point operation xscvdpqp
and add option to guard the quad precision operation support.

Differential Revision: https://reviews.llvm.org/D44746

llvm-svn: 328558

Fix check for verbose logging.

Thanks to Pavel for pointing this out!

llvm-svn: 328557

[PowerPC] Infrastructure work. Implement getting the opcode for a spill in one place.

A new function getOpcodeForSpill should now be the only place to get
the opcode for a given spilled register.

Differential Revision: https://reviews.llvm.org/D43086

llvm-svn: 328556

Disable [MachineLICM] Add functions to MachineLICM to hoist invariant stores

Disable https://reviews.llvm.org/D40196 with setting option
hoist-const-stores to false since failing s390 buildbot.

llvm-svn: 328555

[Pipeliner] Several node-ordering fixes

First, we change the heuristic that is used to ignore the recurrent
node-sets in the node ordering. In certain cases it's not important
to focus on the recurrent node-sets. Instead, the algorithm begins
by considering all the instructions in the node ordering step.

Second, a minor change to the bottom up traversal, which needs to
consider loop carried dependences (modeled as anti dependences).
Previously, these instructions were skipped, which caused problems
because the instruction ends up having both predecessors and
sucessors in the schedule.

Third, consider anti-dependences as a tie breaker when choosing
between instructions in the node ordering. We want to make sure
that the source of the anti-dependence does not end up with both
predecesssors and sucessors in the final node ordering.

Patch by Brendon Cahoon.

llvm-svn: 328554

[AMDGPU] Improve disassembler error handling

Summary:
llvm-objdump now disassembles unrecognised opcodes as data, using
the .long directive. We treat unrecognised opcodes as being 32 bit
values, so move along 4 bytes rather than the single byte which
previously resulted in a cascade of bogus disassembly following an
unrecognised opcode.

While no solution can always disassemble code that contains
embedded data correctly this provides a significant improvement.

The disassembler will now cope with an arbitrary length section
as it no longer truncates it to a multiple of 4 bytes, and will
use the .byte directive for trailing bytes.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D44685

llvm-svn: 328553

[CodeGen] Mark fma as const for Android

Summary:
r318093 sets fma, fmaf, fmal as const for Gnu and MSVC. Android also
does not set errno for these functions. So mark these const for
Android.

Reviewers: spatel, efriedma, srhines, chh, enh

Subscribers: cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D44852

llvm-svn: 328552

[X86][Btver2] Add CVTSI2SD/CVTSI2SS scheduler costs

We still need to account for how Jaguar passes data from GPR -> XMM, which isn't as clean as XMM -> GPR.....

llvm-svn: 328551

[Pipeliner] Check for affine expression in isLoopCarriedOrder

The pipeliner must add a loop carried dependence between two memory
operations if the base register is not an affine (linear) exression.
The current implementation doesn't check how the base register is
defined, which allows non-affine expressions, and then the pipeliner
does not add a loop carried dependence when one is needed.

This patch adds code to isLoopCarriedOrder that checks if the base
register of the memory operations is defined by a phi, and the loop
definition for the phi is a constant increment value. This is a very
simple check for a linear expression.

Patch by Brendon Cahoon.

llvm-svn: 328550

Remove an unneeded (& mislayered) include from Target/TargetLoweringObjectFile on a CodeGen header

llvm-svn: 328549

Remove unneeded (& mislayered) include from TargetMachine.cpp on a CodeGen header

llvm-svn: 328548

[Pipeliner] Add missing loop carried dependences

The pipeliner is not adding a dependence edge for a loop carried
dependence, and ends up scheduling a load from iteration n prior
to an aliased store in iteration n-1.

The code that adds the loop carried dependences in the pipeliner
doesn't check if the memory objects for loads and stores are
"identified" (i.e., distinct) objects. If they are not, then the
code that adds the dependences needs to be conservative. The
objects can be used to check dependences only when they are
distinct objects.

The code that checks for loop carried dependences has been updated
to classify loads and stores that are not identified as "unknown"
values. A store with an "unknown" value can potentially create
a loop carried dependence with any pending load.

Patch by Brendon Cahoon.

llvm-svn: 328547

[SLP] Add a test case. NFC.

llvm-svn: 328546

[Pipeliner] Fix renaming in pipeliner when eliminating phis

The phi renaming code in the pipeliner uses the wrong value when
rewriting phi uses, which results in an undefined value. In this
case, the original phi is no longer needed due to the order of
instruction in the pipelined loop. The pipeliner was assuming, in
this case, the the phi loop definition should be used to
rewrite the uses. However, the pipeliner needs to check to make
sure that the loop definition has already been scheduled. If not,
then the phi initial value needs to be used instead.

Patch by Brendon Cahoon.

llvm-svn: 328545

[OPENMP] Codegen for declare target with link clause.

If the link clause is used on the declare target directive, the object
should be linked on target or target data directives, not during the
codegen. Patch adds support for this clause.

llvm-svn: 328544

[Pipeliner] Fix number of phis to generate in the epilog

The pipeliner was generating too many phis in the epilog blocks, which
caused incorrect code generation when rewriting an instruction that uses
the phi.

In this case, there 3 prolog and epilog stages. An existing phi was
scheduled at stage 1. When generating the code for the 2nd epilog an
extra new phi was generated.

To fix this, we need to update the code that calculates the maximum
number of phis that can be generated, which is based upon the current
prolog stage and the stage of the original phi. In this case, when the
prolog stage is 1 and the original phi stage is 1, the maximum number
of phis to generate is 2.

Patch by Brendon Cahoon.

llvm-svn: 328543

[Pipeliner] Use latency to compute RecMII

The patch contains severals changes needed to pipeline an example
that was transformed so that a Phi with a subreg is converted to
copies.

The pipeliner wasn't working for a couple of reasons.
- The RecMII was 3 instead of 2 due to the extra copies.
- Copy instructions contained a latency of 1.
- The node order algorithm was not choosing the best "bottom"
node, which caused an instruction to be scheduled that had a
predecessor and successor already scheduled.
- Updated the Hexagon Machine Scheduler to check if the node is
latency bound when adding the cost for a 0-latency dependence.

The RecMII was 3 because the computation looks at the number of
nodes in the recurrence. The extra copy is an extra node but
it shouldn't increase the latency. The new RecMII computation
looks at the latency of the instructions in the recurrence. We
changed the latency of the dependence of a copy to 0. The latency
computation for the copy also checks the use of the copy (similar
to a reg_sequence).

The node order algorithm was not choosing the last instruction
in the recurrence for a bottom up traversal. This was when the
last instruction is a copy. A check was added when choosing the
instruction to check for NodeNum if the maxASAP is the same. This
means that the scheduler will not end up with another node in
the recurrence that has both a predecessor and successor already
scheduled.

The cost computation in Hexagon Machine Scheduler adds cost when
an instruction can be packetized with a zero-latency instruction.
We should only do this if the schedule is latency bound.

Patch by Brendon Cahoon.

llvm-svn: 328542

[X86][Btver2] Add CVTSD2SS/CVTSS2SD scheduler costs

llvm-svn: 328541

[Pipeliner] Fix assert caused by pipeliner serialization

The pipeliner is asserting because the serialization step that
occurs at the end is deleting an instruction. The assert
occurs later on because there is a use without a definition.

The problem occurs when an instruction defines a value used
by a REQ_SEQUENCE and that value is used by a COPY instruction.
The latencies between these instructions are zero, so they are
put in to the same packet. The serialization code is unable to
handle this correctly, and ends up putting the REG_SEQUENCE
before its definition.

There is special code in the serialization step that attempts
to handle zero-cost instructions (phis, copy, reg_sequence)
differently than regular instructions. Unfortunately, this means
the order does not come out correct.

This patch simplifies the code by changing the seperate steps for
handling zero-cost and regular instructions. Only phis are
handled separate now, since they should occurs first. Then, this
patch adds checks to make use the MoveUse is set to the smallest
value if there are multiple uses in a cycle.

Patch by Brendon Cahoon.

llvm-svn: 328540

[InstCombine] reassociate loop invariant GEP chains to enable LICM

This change brings performance of zlib up by 10%. The example below is from a
hot loop in longest_match() from zlib.

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1

In this example %idx.ext1 is a loop invariant. It will be moved above the use of
loop induction variable %idx.ext such that it can be hoisted out of the loop by
LICM. The operands that have dependences carried by the loop will be sinked down
in the GEP chain. This patch will produce the following output:

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext

llvm-svn: 328539

[Pipeliner] Enable more base+offset dependence changes in pipeliner

The pipeliner changes dependences between base+offset instructions
(loads and stores) so that the instructions have more flexibility
to be scheduled with respect to each other. This occurs when the
pipeliner is able to compute that the instructions will not alias
if their order is changed. The prevous code enforced the alias
property by checking if the base register is the same, and that the
offset values are either both positive or negative.

This patch improves the alias check by using the API
areMemAccessesTriviallyDisjoint instead. This enables more cases,
especially if the offset is a negative value. The pipeliner uses
the function by creating a new instruction with the offset used
in the next iteration.

Patch by Brendon Cahoon.

llvm-svn: 328538

[Pipeliner] Fix calculation when reusing phis

A schedule may require that a phi from the original loop is used in
multiple iterations in the scheduled loop. When this occurs, we generate
multiple phis in the pipelined loop to save the value across iterations.

When we generate the new phis and update the register names in the
pipelined loop, the pipeliner attempts to reuse a previously generated
phi, when possible. The calculation for the name of the new phi needs
to account for the version/iteration of the original phi. Also, in the
epilog, the code only needs to check backwards for a previous iteration
until reaching the first prolog block.

Patch by Brendon Cahoon.

llvm-svn: 328537

[X86][Btver2] Account for the "+i" integer pipe transfer costs (1cy use of JALU0 for GPR PRF write)

llvm-svn: 328536

[Pipeliner] Fix check for order dependences when finalizing instructions

The code in orderDepdences that looks at the order dependences between
instructions was processing all the successor and predecessor order
dependences. However, we really only want to check for an order dependence
for instructions scheduled in the same cycle.

Also, fixed how the pipeliner handles output dependences. An output
dependence is also a potential loop carried dependence. The pipeliner
didn't handle this case properly so an invalid schedule could be created
that allowed an output dependence to be scheduled in the next iteration
at the same cycle.

Patch by Brendon Cahoon.

llvm-svn: 328516

[Pipeliner] Fix in the pipeliner phi reuse code

When the definition of a phi is used by a phi in the next iteration,
the pipeliner was assuming that the definition is processed first.
Because of the assumption, an incorrect phi name was used. This patch
has a check to see if the phi definition has been processed already.

Patch by Brendon Cahoon.

llvm-svn: 328510

[Pipeliner] Pipeliner should mark physical registers as used

The software pipeliner attempts to delete dead instructions after
generating the pipelined loop. The code looks for uses of each
instruction. Physical registers should be treated differently because
the use chains do not exist. The code that checks for dead
instructions should assume that definitions of physical registers
are used if the operand doesn't contain the dead flag.

Patch by Brendon Cahoon.

llvm-svn: 328509

[Pipeliner] Correctly update memoperands in the epilog

The pipeliner needs to be conservative when updating the memoperands
of instructions in the epilog. Previously, the pipeliner was changing
the offset of the memoperand based upon the scheduling stage. However,
that is incorrect when control flow branches around the kernel code.
The bug enabled a load and store to the same stack offset to be swapped.

This patch fixes the bug by updating the size of the memoperands to be
UINT_MAX. This conservative value means that dependences will be created
between other loads and stores.

Patch by Brendon Cahoon.

llvm-svn: 328508

[demangler] Fix a bug in r328464 found by oss-fuzz.

llvm-svn: 328507

[Hexagon] Give priority to post-incremementing memory accesses in LSR

llvm-svn: 328506

[X86][Btver2] Add CVTSD2SI/CVTSS2SI scheduler costs

Account for the "+i" integer pipe transfer cost (1cy use of JALU0 for GPR PRF write)

This also adds missing vcvttss2si tests

llvm-svn: 328505

Fix TestDisassembleBreakpoint broken by r328488

The first issue was that the test was capturing the "before" disassembly
before launching, and the "after" after. This is a problem because some
of the disassembly will change after we know the load address (e.g. PCs
in call instructions). I fix this by capturing both disassemblies with
the process running.

The second issue was that the refactor in r328488 accidentaly changed
the meaning of the test, as it was no longer disassembling the function
which contained the breakpoint.

While inside, I also modernize the test to use
lldbutil.run_to_source_breakpoint and prevent debug-info replication.

llvm-svn: 328504

Migrate dockerfiles to use multi-stage builds.

Summary:
We previously emulated multi-staged builds using two dockerfiles,
native support from Docker allows us to merge them into one,
simplifying our scripts.

For more details about multi-stage builds, see:
https://docs.docker.com/develop/develop-images/multistage-build/

Reviewers: mehdi_amini, klimek, sammccall

Reviewed By: sammccall

Subscribers: llvm-commits, ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D44787

llvm-svn: 328503

[InstCombine] distribute fmul over fadd/fsub

This replaces a large chunk of code that was looking for compound
patterns that include these sub-patterns. Existing tests ensure that
all of the previous examples are still folded as expected.

We still need to loosen the FMF check.

llvm-svn: 328502

[X86][Btver2] Fix YMM BLENDPD/BLENDPS + UNPCKPD/UNPCKP instructions costs

These should match the YMM MOVDUP/ PERMILPD/PERMILPS + SHUFPD/SHUFPS shuffles instead of using the WriteFShuffle defaults.

llvm-svn: 328501

[clangd] Support incremental document syncing

Summary:
This patch adds support for incremental document syncing, as described
in the LSP spec.  The protocol specifies ranges in terms of Position (a
line and a character), and our drafts are stored as plain strings.  So I
see two things that may not be super efficient for very large files:

- Converting a Position to an offset (the positionToOffset function)
  requires searching for end of lines until we reach the desired line.
- When we update a range, we construct a new string, which implies
  copying the whole document.

However, for the typical size of a C++ document and the frequency of
update (at which a user types), it may not be an issue.  This patch aims
at getting the basic feature in, and we can always improve it later if
we find it's too slow.

Signed-off-by: Simon Marchi <simon.marchi@ericsson.com>
Reviewers: malaperle, ilya-biryukov

Reviewed By: ilya-biryukov

Subscribers: MaskRay, klimek, mgorny, ilya-biryukov, jkorous-apple, ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D44272

llvm-svn: 328500

[llvm-mca] Fix how views are added to the InstructionTables.

This should fix the stack-use-after-scope reported by the asan buildbots after
revision 328493.

llvm-svn: 328499

[InstCombine] check uses before creating instructions for fmul distribution

As the tests show, we could create extra instructions without any obvious benefit.

llvm-svn: 328498

[X86][Btver2] Add (V)SQRTPD/(V)SQRTSD costs

The xmm sd/pd versions were using the WriteFSQRT default which is modelled on sqrtss/sqrtps

llvm-svn: 328497