platform/upstream/llvm.git
5 years ago[clangd] Add thread priority lowering for MacOS as well
Kadir Cetinkaya [Mon, 25 Feb 2019 09:19:26 +0000 (09:19 +0000)]
[clangd] Add thread priority lowering for MacOS as well

Reviewers: ilya-biryukov

Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D58492

llvm-svn: 354765

5 years ago[XRay][tools] Revert "Use Support/JSON.h in llvm-xray convert"
Roman Lebedev [Mon, 25 Feb 2019 07:39:07 +0000 (07:39 +0000)]
[XRay][tools] Revert "Use Support/JSON.h in llvm-xray convert"

Summary:
This reverts D50129 / rL338834: [XRay][tools] Use Support/JSON.h in llvm-xray convert

Abstractions are great.
Readable code is great.
JSON support library is a *good* idea.

However unfortunately, there is an internal detail that one needs
to be aware of in `llvm::json::Object` - it uses `llvm::DenseMap`.
So for **every** `llvm::json::Object`, even if you only store a single `int`
entry there, you pay the whole price of `llvm::DenseMap`.

Unfortunately, it matters for `llvm-xray`.

I was trying to analyse the `llvm-exegesis` analysis mode performance,
and for that I wanted to view the LLVM X-Ray log visualization in Chrome
trace viewer. But `llvm-xray convert` is sluggish, and sometimes
even ends up being killed by OOM.

`xray-log.llvm-exegesis.lwZ0sT` was acquired from `llvm-exegesis`
(compiled with ` -fxray-instruction-threshold=128`)
analysis mode over `-benchmarks-file` with 10099 points (one full
latency measurement set), with normal runtime of 0.387s.

Timings:
Old: (copied from D58580)
```
$ perf stat -r 5 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT

 Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (5 runs):

          21346.24 msec task-clock                #    1.000 CPUs utilized            ( +-  0.28% )
               314      context-switches          #   14.701 M/sec                    ( +- 59.13% )
                 1      cpu-migrations            #    0.037 M/sec                    ( +-100.00% )
           2181354      page-faults               # 102191.251 M/sec                  ( +-  0.02% )
       85477442102      cycles                    # 4004415.019 GHz                   ( +-  0.28% )  (83.33%)
       14526427066      stalled-cycles-frontend   #   16.99% frontend cycles idle     ( +-  0.70% )  (83.33%)
       32371533721      stalled-cycles-backend    #   37.87% backend cycles idle      ( +-  0.27% )  (33.34%)
       67896890228      instructions              #    0.79  insn per cycle
                                                  #    0.48  stalled cycles per insn  ( +-  0.03% )  (50.00%)
       14592654840      branches                  # 683631198.653 M/sec               ( +-  0.02% )  (66.67%)
         212207534      branch-misses             #    1.45% of all branches          ( +-  0.94% )  (83.34%)

           21.3502 +- 0.0585 seconds time elapsed  ( +-  0.27% )
```
New:
```
$ perf stat -r 9 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT

 Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (9 runs):

           7178.38 msec task-clock                #    1.000 CPUs utilized            ( +-  0.26% )
               182      context-switches          #   25.402 M/sec                    ( +- 28.84% )
                 0      cpu-migrations            #    0.046 M/sec                    ( +- 70.71% )
             33701      page-faults               # 4694.994 M/sec                    ( +-  0.88% )
       28761053971      cycles                    # 4006833.933 GHz                   ( +-  0.26% )  (83.32%)
        2028297997      stalled-cycles-frontend   #    7.05% frontend cycles idle     ( +-  1.61% )  (83.32%)
       10773154901      stalled-cycles-backend    #   37.46% backend cycles idle      ( +-  0.38% )  (33.36%)
       36199132874      instructions              #    1.26  insn per cycle
                                                  #    0.30  stalled cycles per insn  ( +-  0.03% )  (50.02%)
        6434504227      branches                  # 896420204.421 M/sec               ( +-  0.03% )  (66.68%)
          73355176      branch-misses             #    1.14% of all branches          ( +-  1.46% )  (83.33%)

            7.1807 +- 0.0190 seconds time elapsed  ( +-  0.26% )
```

So using `llvm::json` nearly triples run-time on that test case (21.35s vs. 7.18s).
(+3x is times, not percent.)

Memory:
Old:
```
total runtime: 39.88s.
bytes allocated in total (ignoring deallocations): 79.07GB (1.98GB/s)
calls to allocation functions: 33267816 (834135/s)
temporary memory allocations: 5832298 (146235/s)
peak heap memory consumption: 9.21GB
peak RSS (including heaptrack overhead): 147.98GB
total memory leaked: 1.09MB
```
New:
```
total runtime: 17.42s.
bytes allocated in total (ignoring deallocations): 5.12GB (293.86MB/s)
calls to allocation functions: 21382982 (1227284/s)
temporary memory allocations: 232858 (13364/s)
peak heap memory consumption: 350.69MB
peak RSS (including heaptrack overhead): 2.55GB
total memory leaked: 79.95KB
```
Diff:
```
total runtime: -22.46s.
bytes allocated in total (ignoring deallocations): -73.95GB (3.29GB/s)
calls to allocation functions: -11884834 (529155/s)
temporary memory allocations: -5599440 (249307/s)
peak heap memory consumption: -8.86GB
peak RSS (including heaptrack overhead): 0B
total memory leaked: -1.01MB
```
So using `llvm::json` increases *peak* heap memory consumption on *this* testcase ~27x
(9.21GB vs. 350.69MB), and total bytes allocated ~15x (79.07GB vs. 5.12GB).
Both of these numbers are times, *not* percent.

And note that memory usage is clearly unbounded with `llvm::json`: it directly depends
on the length of the log, so peak memory consumption keeps growing with the input.
This isn't so with the dumb code: there is no accumulating memory consumption, and
peak memory consumption is fixed. Naturally, that means it will handle *much*
larger logs without OOM'ing.
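
The difference in memory behaviour can be illustrated with a minimal sketch.
This is not the actual `llvm-xray` code: `TraceEvent`, `writeEvent`,
`writeTrace` and `traceToString` are hypothetical names, and the field names
follow the Chrome trace event format only loosely. The point is that each
record is formatted and written as soon as it is seen, so no per-record
object outlives its own write.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the real converter works on XRay records.
struct TraceEvent {
  std::string Name;        // assumed to need no JSON escaping in this sketch
  char Phase;              // 'B' (begin) or 'E' (end)
  std::uint64_t Timestamp; // timestamp in whatever unit the log uses
};

// Write one event straight to the stream; no intermediate JSON object is built.
void writeEvent(std::ostream &OS, const TraceEvent &E) {
  OS << "{\"name\":\"" << E.Name << "\",\"ph\":\"" << E.Phase
     << "\",\"ts\":" << E.Timestamp << "}";
}

// Stream a whole trace. The only state kept between events is one bool.
void writeTrace(std::ostream &OS, const std::vector<TraceEvent> &Events) {
  OS << "{\"traceEvents\":[";
  bool First = true;
  for (const TraceEvent &E : Events) {
    if (!First)
      OS << ",";
    First = false;
    writeEvent(OS, E);
  }
  OS << "]}";
}

// Convenience wrapper that collects the output in a string.
std::string traceToString(const std::vector<TraceEvent> &Events) {
  std::ostringstream OS;
  writeTrace(OS, Events);
  return OS.str();
}
```

A DOM-style writer would instead build one object per event and keep all of
them alive until the final serialization, which is where the growth with log
length comes from.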

Readability is good, but the price is simply unacceptable here.
Too bad none of this analysis was done as part of the development/review of D50129 itself.

Reviewers: dberris, kpw, sammccall

Reviewed By: dberris

Subscribers: riccibruno, hans, courbet, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58584

llvm-svn: 354764

5 years ago[SelectionDAG] Add a OPC_CheckChild2CondCode to SelectionDAGISel to remove a MoveChil...
Craig Topper [Mon, 25 Feb 2019 03:11:44 +0000 (03:11 +0000)]
[SelectionDAG] Add a OPC_CheckChild2CondCode to SelectionDAGISel to remove a MoveChild and MoveParent pair.

OPC_CheckCondCode is always used as operand 2 of a setcc. And it's always surrounded by a MoveChild2 and a MoveParent. By having a dedicated opcode for this case we can reduce the number of bytes needed for this pattern from 4 bytes to 2.

This saves ~3000 bytes in the X86 table.
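
A rough sketch of why the fused opcode halves the encoding. The opcode
numbering, the `Node` type and the `matches` interpreter below are all
hypothetical and far simpler than the generated matcher table; only the
opcode names are taken from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcode numbering; real values come from the generated table.
enum Opcode : std::uint8_t {
  MoveChild2,
  MoveParent,
  CheckCondCode,
  CheckChild2CondCode,
};

// Hypothetical DAG node: a condition code plus operand pointers.
struct Node {
  std::uint8_t CondCode = 0;
  std::vector<const Node *> Children;
};

// Old form: move to operand 2, check the cond code, move back (4 bytes).
std::vector<std::uint8_t> oldEncoding(std::uint8_t CC) {
  return {MoveChild2, CheckCondCode, CC, MoveParent};
}

// New form: one opcode that checks operand 2 in place (2 bytes).
std::vector<std::uint8_t> newEncoding(std::uint8_t CC) {
  return {CheckChild2CondCode, CC};
}

// Tiny interpreter: returns true if every check in the table passes.
bool matches(const std::vector<std::uint8_t> &Table, const Node &Root) {
  std::vector<const Node *> Stack{&Root};
  for (std::size_t I = 0; I < Table.size(); ++I) {
    const Node *Cur = Stack.back();
    switch (Table[I]) {
    case MoveChild2:
      if (Cur->Children.size() < 3)
        return false;
      Stack.push_back(Cur->Children[2]);
      break;
    case MoveParent:
      if (Stack.size() < 2)
        return false;
      Stack.pop_back();
      break;
    case CheckCondCode:
      if (++I >= Table.size() || Cur->CondCode != Table[I])
        return false;
      break;
    case CheckChild2CondCode:
      if (Cur->Children.size() < 3 || ++I >= Table.size() ||
          Cur->Children[2]->CondCode != Table[I])
        return false;
      break;
    default:
      return false;
    }
  }
  return true;
}
```

Both encodings accept and reject the same nodes in this model; the new one
just does so in half the bytes per occurrence.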

llvm-svn: 354763

5 years ago[PowerPC] [PowerPC] Enhance the fast selection of fptoi & fptrunc instruction and...
Kang Zhang [Mon, 25 Feb 2019 02:46:16 +0000 (02:46 +0000)]
[PowerPC] [PowerPC] Enhance the fast selection of fptoi & fptrunc instruction and clean up related asserts

Summary:
Fast selection of the LLVM fptoi & fptrunc instructions does not handle VSX
instruction support well.
We should use the VSX float-to-integer conversion instructions instead of the
non-VSX ones if the operand register class is VSSRC or VSFRC, because i32
and i64 are mapped to VSSRC and VSFRC respectively if the VSX feature is
enabled.
For the float truncation instruction, we do similar work to try to use the
VSX instruction.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D58430

llvm-svn: 354762

5 years ago[clangd] Enhance macro hover to see full definition
Marc-Andre Laperle [Sun, 24 Feb 2019 23:47:03 +0000 (23:47 +0000)]
[clangd] Enhance macro hover to see full definition

Summary: Signed-off-by: Marc-Andre Laperle <malaperle@gmail.com>

Reviewers: simark, ilya-biryukov, sammccall, ioeric, hokein

Reviewed By: ilya-biryukov

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D55250

llvm-svn: 354761

5 years ago[InstCombine] Add tests for PR40846; NFC
Nikita Popov [Sun, 24 Feb 2019 21:55:37 +0000 (21:55 +0000)]
[InstCombine] Add tests for PR40846; NFC

The icmps are the same as the overflow result of the intrinsic.

llvm-svn: 354760

5 years ago[InstCombine] Move with.overflow tests to separate file; NFC
Nikita Popov [Sun, 24 Feb 2019 21:55:31 +0000 (21:55 +0000)]
[InstCombine] Move with.overflow tests to separate file; NFC

And regenerate checks. I had to rename some variables, because
update_test_checks can't deal with the same variable names used
in lower and upper case. I've also dropped the result type aliases,
as just using the type directly gives a cleaner result.

llvm-svn: 354759

5 years ago[X86] Add PR40483 test cases
Simon Pilgrim [Sun, 24 Feb 2019 21:13:29 +0000 (21:13 +0000)]
[X86] Add PR40483 test cases

Demonstrate failure to merge ISD::ADD(x,y)/X86ISD::ADD(x,y) + ISD::SUB(x,y)/X86ISD::SUB(x,y) equivalent ops

llvm-svn: 354758

5 years ago[X86] Combine zext(packus(x),packus(y)) -> concat(x,y) (PR39637)
Simon Pilgrim [Sun, 24 Feb 2019 19:57:52 +0000 (19:57 +0000)]
[X86] Combine zext(packus(x),packus(y)) -> concat(x,y) (PR39637)

It's proving tricky to combine shuffles across multiple vector sizes, so for now I'm adding this more specific combine - the pattern is common enough to be worth it as a first step.

llvm-svn: 354757

5 years ago[X86] Fix tls variable lowering issue with large code model
Craig Topper [Sun, 24 Feb 2019 19:33:37 +0000 (19:33 +0000)]
[X86] Fix tls variable lowering issue with large code model

Summary:
The problem here is the lowering of TLS variables. Below is the DAG for the code.
SelectionDAG has 11 nodes:

t0: ch = EntryToken
      t8: i64,ch = load<(load 8 from `i8 addrspace(257)* null`, addrspace 257)> t0, Constant:i64<0>, undef:i64
        t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10]
      t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64
    t12: i64 = add t8, t11
  t4: i32,ch = load<(dereferenceable load 4 from @x)> t0, t12, undef:i64
t6: ch = CopyToReg t0, Register:i32 %0, t4
And when mcmodel is large, the instructions below can NOT be folded.

  t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10]
t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64
So "t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64" is lowered to " Morphed node: t11: i64,ch = MOV64rm<Mem:(load 8 from got)> t10, TargetConstant:i8<1>, Register:i64 $noreg, TargetConstant:i32<0>, Register:i32 $noreg, t0"

When LLVM starts to lower "t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10]", it fails.

The patch is to fold the load and X86ISD::WrapperRIP.

Fixes PR26906

Patch by LuoYuanke

Reviewers: craig.topper, rnk, annita.zhang, wxiao3

Reviewed By: rnk

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58336

llvm-svn: 354756

5 years ago[X86][SSE] Use pblendw for v4i32/v2i64 during isel.
Craig Topper [Sun, 24 Feb 2019 19:23:41 +0000 (19:23 +0000)]
[X86][SSE] Use pblendw for v4i32/v2i64 during isel.

Summary:

Previously we used BLENDPS/BLENDPD but that puts the blend in the FP domain. Under optsize, the two address instruction pass can cause blendps/blendpd to commute to movss/movsd. But we probably shouldn't do that if the original type was an integer. So use pblendw instead.

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58574

llvm-svn: 354755

5 years ago[X86] Correct some ADC/SBB with immediate scheduler data for Broadwell and Skylake.
Craig Topper [Sun, 24 Feb 2019 19:23:39 +0000 (19:23 +0000)]
[X86] Correct some ADC/SBB with immediate scheduler data for Broadwell and Skylake.

Summary:
The AX/EAX/RAX with immediate forms are 2 uops just like the AL with immediate.

The modrm form with r8 and immediate is a single uop just like r16/r32/r64 with immediate.

Reviewers: RKSimon, andreadb

Reviewed By: RKSimon

Subscribers: gbedwell, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58581

llvm-svn: 354754

5 years ago[LegalizeTypes][AArch64][X86] Make type legalization of vector (S/U)ADD/SUB/MULO...
Craig Topper [Sun, 24 Feb 2019 19:23:36 +0000 (19:23 +0000)]
[LegalizeTypes][AArch64][X86] Make type legalization of vector (S/U)ADD/SUB/MULO follow getSetCCResultType for the overflow bits. Make UnrollVectorOverflowOp properly convert from scalar boolean contents to vector boolean contents

Summary:
When promoting the overflow vector for these ops we should use the target's desired setcc result type. This way a v8i32 result type will use a v8i32 overflow vector instead of a v8i16 overflow vector. A v8i16 overflow vector will cause LegalizeDAG/LegalizeVectorOps to have to use v8i32 and truncate to v8i16 in its expansion. By doing this in type legalization instead, we get the truncate into the DAG earlier and give DAG combine more of a chance to optimize it.

We also have to fix unrolling to use the scalar setcc result type for the scalarized operation, and convert it to the required vector element type after the scalar operation. We have to observe the vector boolean contents when doing this conversion. The previous code was just taking the scalar result and putting it in the vector. But for X86 and AArch64 that would have only put the boolean value in bit 0 of the element and left all other bits in the element 0. We need to ensure all bits in the element are the same. I'm using a select with constants here because that's what setcc unrolling in LegalizeVectorOps used.
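
The scalar-to-vector boolean conversion can be modelled in plain C++. This
is only a sketch, assuming 32-bit lanes and an all-zeros / all-ones vector
boolean convention; `scalarUAddOverflow`, `toVectorBool` and
`unrolledOverflowMask` are hypothetical names, not functions from the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar overflow check: returns 0 or 1, so only bit 0 carries the boolean.
inline std::uint32_t scalarUAddOverflow(std::uint32_t A, std::uint32_t B) {
  return static_cast<std::uint32_t>(A + B < A);
}

// Widen a 0/1 boolean to an all-zeros / all-ones element, the way a select
// between two constants would.
inline std::uint32_t toVectorBool(std::uint32_t ScalarBool) {
  return ScalarBool ? 0xFFFFFFFFu : 0u;
}

// Unrolled 4 x i32 unsigned add-with-overflow, producing the per-lane mask.
inline std::array<std::uint32_t, 4>
unrolledOverflowMask(const std::array<std::uint32_t, 4> &A,
                     const std::array<std::uint32_t, 4> &B) {
  std::array<std::uint32_t, 4> Mask{};
  for (std::size_t I = 0; I < 4; ++I)
    Mask[I] = toVectorBool(scalarUAddOverflow(A[I], B[I]));
  return Mask;
}
```

Storing the raw scalar result instead would leave an overflowing lane as
`1` rather than `0xFFFFFFFF`, which is the mismatch described above.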

Reviewers: spatel, RKSimon, nikic

Reviewed By: nikic

Subscribers: javed.absar, kristof.beyls, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58567

llvm-svn: 354753

5 years agoFix accidentally used hard tabs. NFC
Kristina Brooks [Sun, 24 Feb 2019 18:06:10 +0000 (18:06 +0000)]
Fix accidentally used hard tabs. NFC

Big sorry. This undoes the indentation mess I made
in r354751.

llvm-svn: 354752

5 years agoWrap code for builtin_assume_aligned at 80 col.NFC
Kristina Brooks [Sun, 24 Feb 2019 17:57:33 +0000 (17:57 +0000)]
Wrap code for builtin_assume_aligned at 80 col.NFC

Minor style fix to avoid going over 80 cols in handling
of case for Builtin::BI__builtin_assume_aligned. NFC.

llvm-svn: 354751

5 years ago[InstCombine] add test for icmp+add fold; NFC
Sanjay Patel [Sun, 24 Feb 2019 17:31:15 +0000 (17:31 +0000)]
[InstCombine] add test for icmp+add fold; NFC

llvm-svn: 354750

5 years ago[X86][AVX] Rename lowerShuffleByMerging128BitLanes to lowerShuffleAsLanePermuteAndRep...
Simon Pilgrim [Sun, 24 Feb 2019 17:30:06 +0000 (17:30 +0000)]
[X86][AVX] Rename lowerShuffleByMerging128BitLanes to lowerShuffleAsLanePermuteAndRepeatedMask. NFC.

Name better matches the other similar 'lane permute' and 'repeated mask' functions we have.

llvm-svn: 354749

5 years ago[InstCombine] canonicalize add/sub with bool
Sanjay Patel [Sun, 24 Feb 2019 16:57:45 +0000 (16:57 +0000)]
[InstCombine] canonicalize add/sub with bool

add A, sext(B) --> sub A, zext(B)

We have to choose 1 of these forms, so I'm opting for the
zext because that's easier for value tracking.
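
The two forms are equivalent in wrapping arithmetic, which a small C++ model
can confirm. This sketch assumes 32-bit values; the helper names are made up
for illustration.

```cpp
#include <cstdint>

// sext of an i1 to i32: false -> 0, true -> 0xFFFFFFFF (that is, -1).
inline std::uint32_t sextBool(bool B) { return B ? 0xFFFFFFFFu : 0u; }

// zext of an i1 to i32: false -> 0, true -> 1.
inline std::uint32_t zextBool(bool B) { return B ? 1u : 0u; }

// add A, sext(B)
inline std::uint32_t addSext(std::uint32_t A, bool B) {
  return A + sextBool(B);
}

// sub A, zext(B)
inline std::uint32_t subZext(std::uint32_t A, bool B) {
  return A - zextBool(B);
}
```

Adding -1 and subtracting 1 give the same result modulo 2^32, so the choice
between the two is purely about which form later analyses prefer.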

The backend should be prepared for this change after:
D57401
rL353433

This is also a preliminary step towards reducing the amount
of bit hackery that we do in IR to optimize icmp/select.
That should be waiting to happen at a later optimization stage.

The seeming regression in the fuzzer test was discussed in:
D58359

We were only managing that fold in instcombine by luck, and
other passes should be able to deal with that better anyway.

llvm-svn: 354748

5 years ago[InstCombine] regenerate checks; NFC
Sanjay Patel [Sun, 24 Feb 2019 16:11:58 +0000 (16:11 +0000)]
[InstCombine] regenerate checks; NFC

llvm-svn: 354747

5 years ago[CGP] add special-cases to form unsigned add with overflow (PR40486)
Sanjay Patel [Sun, 24 Feb 2019 15:31:27 +0000 (15:31 +0000)]
[CGP] add special-cases to form unsigned add with overflow (PR40486)

There's likely a missed IR canonicalization for at least 1 of these
patterns. Otherwise, we wouldn't have needed the pattern-matching
enhancement in D57516.

Note that -- unlike usubo added with D57789 -- the TLI hook for
this transform defaults to 'on'. So if there's any perf fallout
from this, targets should look at how they're lowering the uaddo
node in SDAG and/or override that hook.
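
For orientation, here is a sketch of source-level idioms that express an
unsigned add with overflow. The commit message does not spell out the exact
IR patterns being matched, so the increment/decrement forms below are an
assumption based on the inc/dec remark that follows; all function names are
hypothetical.

```cpp
#include <cstdint>
#include <utility>

// Generic form: the sum wrapped iff it is smaller than an operand.
inline std::pair<std::uint32_t, bool> uaddo(std::uint32_t A, std::uint32_t B) {
  std::uint32_t Sum = A + B;
  return {Sum, Sum < A};
}

// Increment: x + 1 overflows iff the result is 0.
inline bool incOverflows(std::uint32_t X) { return X + 1u == 0u; }

// Decrement written as an add of all-ones: viewed as an unsigned add,
// x + 0xFFFFFFFF carries out iff x != 0.
inline bool decAsAddOverflows(std::uint32_t X) { return X != 0u; }
```

The special-case comparisons agree with the overflow bit of the generic
form when the second operand is 1 or all-ones respectively.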

The x86 diffs suggest that there's some missing pattern-matching
for forming inc/dec.

This should fix the remaining known problems in:
https://bugs.llvm.org/show_bug.cgi?id=40486
https://bugs.llvm.org/show_bug.cgi?id=31754

llvm-svn: 354746

5 years agoFix "enumeral and non-enumeral type in conditional expression" gcc7 warning. NFCI.
Simon Pilgrim [Sun, 24 Feb 2019 13:31:52 +0000 (13:31 +0000)]
Fix "enumeral and non-enumeral type in conditional expression" gcc7 warning. NFCI.

llvm-svn: 354745

5 years ago[WebAssembly] Rename a variable in CFGStackify (NFC)
Heejin Ahn [Sun, 24 Feb 2019 08:30:06 +0000 (08:30 +0000)]
[WebAssembly] Rename a variable in CFGStackify (NFC)

llvm-svn: 354744

5 years ago[WebAssembly] Merge two identical switch case routines into one (NFC)
Heejin Ahn [Sun, 24 Feb 2019 08:19:55 +0000 (08:19 +0000)]
[WebAssembly] Merge two identical switch case routines into one (NFC)

llvm-svn: 354743

5 years agoTypo: s/CHCCK/CHECK
Michael Liao [Sun, 24 Feb 2019 03:10:14 +0000 (03:10 +0000)]
Typo: s/CHCCK/CHECK

llvm-svn: 354742

5 years ago[NFC] Minor coding style (indent) fix.
Michael Liao [Sun, 24 Feb 2019 03:07:32 +0000 (03:07 +0000)]
[NFC] Minor coding style (indent) fix.

llvm-svn: 354741

5 years ago[Hexagon, SystemZ] Be super conservative about atomics
Philip Reames [Sun, 24 Feb 2019 00:45:09 +0000 (00:45 +0000)]
[Hexagon, SystemZ] Be super conservative about atomics

As requested during review of D57601, be equally conservative for atomic MMOs as for volatile MMOs in all in-tree backends. At the moment, all atomic MMOs are also volatile, but I'm about to change that.

Reviewed as part of https://reviews.llvm.org/D58490, with other backends still pending review.

llvm-svn: 354740

5 years agoVFS: Avoid some unnecessary std::string copies
Duncan P. N. Exon Smith [Sat, 23 Feb 2019 23:48:47 +0000 (23:48 +0000)]
VFS: Avoid some unnecessary std::string copies

Thread Twine a little deeper through the VFS to avoid unnecessarily
constructing the same std::string twice in a parameter sequence:

    Twine -> std::string -> StringRef -> std::string

Changing a few parameters from StringRef to Twine avoids the early call
to `Twine::str()`.

llvm-svn: 354739

5 years ago[TwoAddressInstructionPass] After commuting an instruction and before trying to look...
Craig Topper [Sat, 23 Feb 2019 21:41:44 +0000 (21:41 +0000)]
[TwoAddressInstructionPass] After commuting an instruction and before trying to look for more commutable operands, resample the number of operands.

The new instruction might have fewer operands than the original instruction. If we don't resample, the next loop iteration might read an operand that doesn't exist.
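
The hazard is the usual one of caching a container's size across a mutation.
A minimal sketch, with a hypothetical `Instr` type and helpers that only
model the shape of the loop, not the pass itself:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an instruction: just a list of operand ids.
struct Instr {
  std::vector<int> Operands;
};

// Hypothetical commute that swaps two operands and drops the last one,
// modelling a commute to an opcode with fewer operands.
void commuteAndShrink(Instr &MI, std::size_t I, std::size_t J) {
  std::swap(MI.Operands[I], MI.Operands[J]);
  MI.Operands.pop_back();
}

// Walk operands 1..N-1, commuting once at index CommuteAt if possible.
// Returns how many operands were visited.
std::size_t visitOperands(Instr &MI, std::size_t CommuteAt) {
  std::size_t Visited = 0;
  std::size_t NumOps = MI.Operands.size();
  for (std::size_t I = 1; I < NumOps; ++I) {
    ++Visited;
    if (I == CommuteAt && I + 1 < NumOps) {
      commuteAndShrink(MI, I, I + 1);
      NumOps = MI.Operands.size(); // resample: the instruction may be shorter
    }
  }
  return Visited;
}
```

Without the resampling line, the loop bound would still be the old operand
count and a later iteration would index past the end.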

X86 can commute blends to movss/movsd which reduces from 4 operands to 3. This happened in the test case that caused r354363 & company to be reverted. A reduced version of that has been committed here.

Really this whole checking for more commutable operands is a little fragile. It assumes that the new instruction's operands are in the same order and positions as the original except for the pair that was swapped. I don't know of anything that breaks this assumption today, but I've left a fixme. Fixing this will likely require an interface change.

llvm-svn: 354738

5 years agoRecommit r354363 "[X86][SSE] Generalize X86ISD::BLENDI support to more value types"
Craig Topper [Sat, 23 Feb 2019 21:41:42 +0000 (21:41 +0000)]
Recommit r354363 "[X86][SSE] Generalize X86ISD::BLENDI support to more value types"

And its follow ups r354511, r354640.

A follow-up patch will fix the issue that caused it to be reverted.

llvm-svn: 354737

5 years agoEnable coroutines under -std=c++2a.
Richard Smith [Sat, 23 Feb 2019 21:06:26 +0000 (21:06 +0000)]
Enable coroutines under -std=c++2a.

llvm-svn: 354736

5 years ago[cxx_status] Update to match Kona motions.
Richard Smith [Sat, 23 Feb 2019 21:06:25 +0000 (21:06 +0000)]
[cxx_status] Update to match Kona motions.

llvm-svn: 354735

5 years agoRecommit r354647 and r354648 "[LegalizeTypes] When promoting the result of EXTRACT_SU...
Craig Topper [Sat, 23 Feb 2019 19:51:32 +0000 (19:51 +0000)]
Recommit r354647 and r354648 "[LegalizeTypes] When promoting the result of EXTRACT_SUBVECTOR, also check if the input needs to be promoted. Use that to determine the element type to extract"

r354648 was a follow up to fix a regression "[X86] Add a DAG combine for (aext_vector_inreg (aext_vector_inreg X)) -> (aext_vector_inreg X) to fix a regression from my previous commit."

These were reverted in r354713 as their context depended on other patches that were reverted for a bug.

llvm-svn: 354734

5 years ago[WebAssembly] Fix select of and (PR40805)
Nikita Popov [Sat, 23 Feb 2019 18:59:01 +0000 (18:59 +0000)]
[WebAssembly] Fix select of and (PR40805)

Fixes https://bugs.llvm.org/show_bug.cgi?id=40805 introduced by
patterns added in D53676.

I'm removing the patterns entirely here, as they are not correct
in the general case. If necessary something more specific can be
added in the future.

Differential Revision: https://reviews.llvm.org/D58575

llvm-svn: 354733

5 years ago[X86][AVX] combineInsertSubvector - remove concat_vectors(load(x),load(x)) --> sub_vb...
Simon Pilgrim [Sat, 23 Feb 2019 18:53:03 +0000 (18:53 +0000)]
[X86][AVX] combineInsertSubvector - remove concat_vectors(load(x),load(x)) --> sub_vbroadcast(x)

D58053/rL354340 added this to EltsFromConsecutiveLoads directly

llvm-svn: 354732

5 years agoFix MSVC constant truncation warnings. NFCI.
Simon Pilgrim [Sat, 23 Feb 2019 18:49:02 +0000 (18:49 +0000)]
Fix MSVC constant truncation warnings. NFCI.

llvm-svn: 354731

5 years ago[X86][AVX] concat_vectors(scalar_to_vector(x),scalar_to_vector(x)) --> broadcast(x)
Simon Pilgrim [Sat, 23 Feb 2019 18:34:05 +0000 (18:34 +0000)]
[X86][AVX] concat_vectors(scalar_to_vector(x),scalar_to_vector(x)) --> broadcast(x)

For AVX1, limit this to i32/f32/i64/f64 loading cases only.

llvm-svn: 354730

5 years ago[X86][AVX] Shuffle->Permute+Blend if we have one v4f64/v4i64 shuffle input in place
Simon Pilgrim [Sat, 23 Feb 2019 17:10:47 +0000 (17:10 +0000)]
[X86][AVX] Shuffle->Permute+Blend if we have one v4f64/v4i64 shuffle input in place

Even on AVX1 we can pretty cheaply (VPERM2F128+VSHUFPD) permute a single v4f64/v4i64 input (on AVX2 it's just a single VPERMPD), followed by a BLENDPD.

llvm-svn: 354729

5 years ago[NFC] Fix Wdocumentation warning in OMPToClause
Bruno Ricci [Sat, 23 Feb 2019 16:40:30 +0000 (16:40 +0000)]
[NFC] Fix Wdocumentation warning in OMPToClause

llvm-svn: 354728

5 years ago[Sema][NFC] SequenceChecker: More tests in preparation for D57660
Bruno Ricci [Sat, 23 Feb 2019 16:25:00 +0000 (16:25 +0000)]
[Sema][NFC] SequenceChecker: More tests in preparation for D57660

llvm-svn: 354727

5 years ago[MIPS] Fix a incorrect test. (NFC)
Simon Dardis [Sat, 23 Feb 2019 15:56:32 +0000 (15:56 +0000)]
[MIPS] Fix a incorrect test. (NFC)

This test is incorrect as it should be using the microMIPSR6 instruction to
return, not the microMIPS version.

llvm-svn: 354726

5 years ago[libcxx] Make sure all experimental tests are disabled when enable_experimental=False
Louis Dionne [Sat, 23 Feb 2019 11:24:03 +0000 (11:24 +0000)]
[libcxx] Make sure all experimental tests are disabled when enable_experimental=False

Summary:
Previously, we'd run some experimental tests even when enable_experimental=False
was used with lit.

Reviewers: EricWF

Subscribers: christof, jkorous, dexonsmith, libcxx-commits, mclow.lists

Differential Revision: https://reviews.llvm.org/D55834

llvm-svn: 354725

5 years ago[X86] Sign extend the 8-bit immediate when commuting blend instructions to match...
Craig Topper [Sat, 23 Feb 2019 08:34:10 +0000 (08:34 +0000)]
[X86] Sign extend the 8-bit immediate when commuting blend instructions to match isel.

Conversion from ConstantSDNode to MachineInstr sign extends immediates from their APInt representation to int64_t.

This commit makes sure we do the same for commuting. The test changes show how this improves CSE. This issue was made worse by the MachineCSE using commuteInstruction to undo a commute. So we virtually guarantee the sign extend from isel would be lost.
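
Why the extension matters for CSE can be seen in a small model. This is a
sketch with hypothetical helper names, assuming an 8-lane blend whose commute
inverts the mask bits and a two's-complement `int8_t` conversion: the same
8-bit mask yields two different 64-bit immediates depending on how it is
widened, so two otherwise identical instructions stop comparing equal.

```cpp
#include <cstdint>

// Immediate as isel would produce it: the 8-bit value sign-extended.
inline std::int64_t immFromIsel(std::uint8_t Imm8) {
  return static_cast<std::int64_t>(static_cast<std::int8_t>(Imm8));
}

// Immediate as a commute that forgets to sign-extend would produce it.
inline std::int64_t immZeroExtended(std::uint8_t Imm8) {
  return static_cast<std::int64_t>(Imm8);
}

// Commute a blend mask over NumElts lanes (NumElts <= 8): swapping the two
// sources inverts which source each lane selects.
inline std::uint8_t commuteBlendMask(std::uint8_t Mask, unsigned NumElts) {
  std::uint8_t LaneBits = static_cast<std::uint8_t>((1u << NumElts) - 1u);
  return static_cast<std::uint8_t>(~Mask & LaneBits);
}
```

For a mask such as `0x0F`, the commuted mask is `0xF0`, which is -16 when
sign-extended but 240 when zero-extended; masks with bit 7 clear are
unaffected.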

The improved CSE also occurred with r354363, but that was reverted. I'm working to undo the revert, but wanted to get this fix in while it was easy to see the results.

llvm-svn: 354724

5 years agoRemove OpenBSD case for old system libstdc++ header path as OpenBSD
Brad Smith [Sat, 23 Feb 2019 07:21:19 +0000 (07:21 +0000)]
Remove OpenBSD case for old system libstdc++ header path as OpenBSD
has switched to libc++.

llvm-svn: 354723

5 years agoobjdump fails to parse Mach-O binaries with n_desc bearing stabs
Michael Trent [Sat, 23 Feb 2019 06:19:56 +0000 (06:19 +0000)]
objdump fails to parse Mach-O binaries with n_desc bearing stabs

Summary:
The objdump Mach-O parser uses MachOObjectFile::checkSymbolTable() to
verify the symbol table is in a legal state before dereferencing the
offsets in the table. This routine missed a test for N_STAB symbols
when validating the two-level name space library ordinal for undefined
symbols. If the binary in question contained a value in the n_desc high
byte that is larger than the list of loaded dylibs, checkSymbolTable()
will flag the library ordinal as being out of range. Most of the time
the n_desc field is set to 0 or to small values, but old final linked
binaries exist with N_STAB symbols bearing non-trivial n_desc fields.

The change here is simply to verify a symbol is not an N_STAB symbol
before consulting the values of n_other or n_desc.
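
A simplified model of the check order described above. The constants mirror
the values in `<mach-o/nlist.h>` but are redefined locally so the sketch is
self-contained; `Symbol`, `libraryOrdinal` and `ordinalInRange` are
hypothetical and much smaller than `checkSymbolTable()`.

```cpp
#include <cstdint>

// Values as defined in <mach-o/nlist.h> and <mach-o/loader.h>.
constexpr std::uint8_t kNStab = 0xe0; // any bit set: symbolic debugging entry
constexpr std::uint8_t kNType = 0x0e; // mask for the type bits
constexpr std::uint8_t kNUndf = 0x00; // undefined symbol
constexpr unsigned kSelfOrdinal = 0x00;
constexpr unsigned kDynamicLookupOrdinal = 0xfe;
constexpr unsigned kExecutableOrdinal = 0xff;

// Hypothetical reduced symbol table entry.
struct Symbol {
  std::uint8_t NType;
  std::uint16_t NDesc;
};

// The high byte of n_desc holds the two-level namespace library ordinal.
inline unsigned libraryOrdinal(std::uint16_t NDesc) {
  return (NDesc >> 8) & 0xff;
}

// Returns false only when an undefined, non-stab symbol names a library
// ordinal beyond the list of loaded dylibs.
inline bool ordinalInRange(const Symbol &S, unsigned NumLoadedDylibs) {
  if (S.NType & kNStab)
    return true; // stab: n_desc means something else, do not inspect it
  if ((S.NType & kNType) != kNUndf)
    return true; // defined symbol: no library ordinal to validate
  unsigned Ordinal = libraryOrdinal(S.NDesc);
  if (Ordinal == kSelfOrdinal || Ordinal == kDynamicLookupOrdinal ||
      Ordinal == kExecutableOrdinal)
    return true; // special ordinals are always acceptable in this sketch
  return Ordinal <= NumLoadedDylibs;
}
```

With the stab test first, a debugging entry whose `n_desc` high byte happens
to exceed the dylib count is no longer reported as an out-of-range ordinal.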

rdar://44977336

Reviewers: lhames, pete, ab

Reviewed By: pete

Subscribers: llvm-commits, rupprecht

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58568

llvm-svn: 354722

5 years agoRemove sanitizer context workaround no longer necessary
Brad Smith [Sat, 23 Feb 2019 06:19:28 +0000 (06:19 +0000)]
Remove sanitizer context workaround no longer necessary

The base linker is now lld.

llvm-svn: 354721

5 years agoRemove overly broad assert from r354717.
Richard Trieu [Sat, 23 Feb 2019 05:48:50 +0000 (05:48 +0000)]
Remove overly broad assert from r354717.

llvm-svn: 354720

5 years agoTry again to fix memory leak in r354692
Daniel Sanders [Sat, 23 Feb 2019 03:25:37 +0000 (03:25 +0000)]
Try again to fix memory leak in r354692

The previous one didn't fix everything.

llvm-svn: 354719

5 years ago[NFC][Sanitizer] Comment out argument checks
Julian Lettner [Sat, 23 Feb 2019 03:24:10 +0000 (03:24 +0000)]
[NFC][Sanitizer] Comment out argument checks

These break clang-ppc64 bots.

llvm-svn: 354718

5 years ago[NFC][Sanitizer] Add argument checks to BufferedStackTrace::Unwind* functions
Julian Lettner [Sat, 23 Feb 2019 02:36:23 +0000 (02:36 +0000)]
[NFC][Sanitizer] Add argument checks to BufferedStackTrace::Unwind* functions

Reviewers: vitalybuka

Differential Revision: https://reviews.llvm.org/D58555

llvm-svn: 354717

5 years ago[LLD][COFF] Add support for /FUNCTIONPADMIN command-line option
Alexandre Ganea [Sat, 23 Feb 2019 01:46:18 +0000 (01:46 +0000)]
[LLD][COFF] Add support for /FUNCTIONPADMIN command-line option

Initial patch by Stefan Reinalter.

Fixes PR36775

Differential Revision: https://reviews.llvm.org/D49366

llvm-svn: 354716

5 years ago[NFC] Fix typos: preceeding -> preceding
Jordan Rupprecht [Sat, 23 Feb 2019 01:28:32 +0000 (01:28 +0000)]
[NFC] Fix typos: preceeding -> preceding

llvm-svn: 354715

5 years agoRevert r354363 & co "[X86][SSE] Generalize X86ISD::BLENDI support to more value types"
Reid Kleckner [Sat, 23 Feb 2019 01:19:42 +0000 (01:19 +0000)]
Revert r354363 & co "[X86][SSE] Generalize X86ISD::BLENDI support to more value types"

r354363 caused https://crbug.com/934963#c1, which has a plain C reduced
test case.

I also had to revert some dependent changes:
- r354648
- r354647
- r354640
- r354511

llvm-svn: 354713

5 years agoFix memory leak in r354692
Daniel Sanders [Sat, 23 Feb 2019 01:13:35 +0000 (01:13 +0000)]
Fix memory leak in r354692

llvm-svn: 354712

5 years agoRevert r354706 - lit touched my thigh
Jim Ingham [Sat, 23 Feb 2019 01:08:17 +0000 (01:08 +0000)]
Revert r354706 - lit touched my thigh

llvm-svn: 354711

5 years ago[LegalizeTypes] Use PromoteTargetBoolean in PromoteIntOp_ADDSUBCARRY instead of reimp...
Craig Topper [Sat, 23 Feb 2019 00:38:19 +0000 (00:38 +0000)]
[LegalizeTypes] Use PromoteTargetBoolean in PromoteIntOp_ADDSUBCARRY instead of reimplementing it. NFCI

llvm-svn: 354710

5 years ago[X86] Enable custom splitting of v8i64/v16i32 sext/zext for avx/avx2 when input type...
Craig Topper [Sat, 23 Feb 2019 00:35:02 +0000 (00:35 +0000)]
[X86] Enable custom splitting of v8i64/v16i32 sext/zext for avx/avx2 when input type will be promoted by the type legalize to 128-bits.

If the input type will be promoted to 128 bits it's better to put a sign_extend_inreg/and in the 128 bit register before the split occurs. Otherwise we end up doing it on each half in the wider register.

Some of the overflow arithmetic tests are regressions, but I think we can make some improvement using getSetccResultType in DAG combine and/or type legalization.

llvm-svn: 354709

5 years ago[X86] Add a few test cases for a v8i64 sext/zext from an illegal type that needs...
Craig Topper [Sat, 23 Feb 2019 00:34:58 +0000 (00:34 +0000)]
[X86] Add a few test cases for a v8i64 sext/zext from an illegal type that needs to be promoted to 128 bits.

If v8i64 isn't a legal type but v4i64 is, these will be split and then each half will get their input promoted and become an any_extend_vector_inreg/punpckhwd + any_extend + and/sign_extend_inreg.

If we instead recognize the input will be promoted we can emit the and/sign_extend_inreg first in a 128 bit register. Then we can sign_extend/zero_extend one half and pshufd+sign_extend/zero_extend the other half.

llvm-svn: 354708

5 years agoSplit a long line to avoid annoying horizontal scrolling on a browser.
Rui Ueyama [Sat, 23 Feb 2019 00:24:18 +0000 (00:24 +0000)]
Split a long line to avoid annoying horizontal scrolling on a browser.

llvm-svn: 354707

5 years agoMake sure that stop-hooks run asynchronously.
Jim Ingham [Sat, 23 Feb 2019 00:13:25 +0000 (00:13 +0000)]
Make sure that stop-hooks run asynchronously.

They aren't designed to nest recursively, so this will prevent that.
Also add a --auto-continue flag, since putting "continue" in the stop hook makes
the stop hooks fight one another in multi-threaded programs.
Also allow more than one -o option so you can make more complex stop hooks w/o
having to go into the editor.

<rdar://problem/48115661>

Differential Revision: https://reviews.llvm.org/D58394

llvm-svn: 354706

5 years ago[WebAssembly] Update CodeGen test expectations after rL354697. NFC
Sam Clegg [Sat, 23 Feb 2019 00:07:39 +0000 (00:07 +0000)]
[WebAssembly] Update CodeGen test expectations after rL354697. NFC

llvm-svn: 354705

5 years agos/method/function/g since function is the correct name in C++.
Rui Ueyama [Fri, 22 Feb 2019 23:59:51 +0000 (23:59 +0000)]
s/method/function/g since function is the correct name in C++.

llvm-svn: 354704

5 years agoRemove a function from header and move the implementation to a .cpp file. NFC.
Rui Ueyama [Fri, 22 Feb 2019 23:59:43 +0000 (23:59 +0000)]
Remove a function from header and move the implementation to a .cpp file. NFC.

llvm-svn: 354703

5 years agoWhen deserializing breakpoints some options may not be present.
Jim Ingham [Fri, 22 Feb 2019 23:54:11 +0000 (23:54 +0000)]
When deserializing breakpoints some options may not be present.
The deserializer was not handling this case.  For now we just
accept the absent option, and set it to the breakpoint default.
This will be more important if/when I figure out how to serialize
the options set on breakpoint locations.

<rdar://problem/48322664>

llvm-svn: 354702

5 years ago[NFC][Sanitizer] Re-enable test on Darwin
Julian Lettner [Fri, 22 Feb 2019 23:37:46 +0000 (23:37 +0000)]
[NFC][Sanitizer] Re-enable test on Darwin

This unexpectedly passes on our CI, although it still fails on my
machine.

llvm-svn: 354701

5 years agoRevert "AMDGPU/NFC: Cleanup subtarget predicates"
Konstantin Zhuravlyov [Fri, 22 Feb 2019 23:21:06 +0000 (23:21 +0000)]
Revert "AMDGPU/NFC: Cleanup subtarget predicates"

It breaks one of our downstream merges, so revert it
temporarily while investigating failures downstream

llvm-svn: 354700

5 years ago[CGP] add tests for uaddo increment/decrement; NFC
Sanjay Patel [Fri, 22 Feb 2019 23:19:34 +0000 (23:19 +0000)]
[CGP] add tests for uaddo increment/decrement; NFC

llvm-svn: 354699

5 years ago[OpenMP 5.0] Parsing/sema support for to clause with mapper modifier.
Michael Kruse [Fri, 22 Feb 2019 22:29:42 +0000 (22:29 +0000)]
[OpenMP 5.0] Parsing/sema support for to clause with mapper modifier.

This patch implements the parsing and sema support for OpenMP to clause
with potential user-defined mappers attached. User defined mapper is a
new feature in OpenMP 5.0. A to/from clause can have an explicit or
implicit associated mapper, which instructs the compiler to generate and
use customized mapping functions. An example is shown below:

    struct S { int len; int *d; };
    #pragma omp declare mapper(id: struct S s) map(s, s.d[0:s.len])
    struct S ss;
    #pragma omp target update to(mapper(id): ss) // use the mapper with name 'id' to map ss to device

Contributed-by: <lildmh@gmail.com>
Differential Revision: https://reviews.llvm.org/D58523

llvm-svn: 354698

5 years ago[WebAssembly] Remove unneeded MCSymbolRefExpr variants
Sam Clegg [Fri, 22 Feb 2019 22:29:34 +0000 (22:29 +0000)]
[WebAssembly] Remove unneeded MCSymbolRefExpr variants

We record the type of the symbol (event/function/data/global) in the
MCWasmSymbol and so it should always be clear how to handle a relocation
based on the symbol itself.

The exception is a function that still needs the special @TYPEINDEX
when the relocation contains the signature rather than the address
of the function.

Differential Revision: https://reviews.llvm.org/D58472

llvm-svn: 354697

5 years ago[NFC][Sanitizer] Rename BufferedStackTrace::FastUnwindStack
Julian Lettner [Fri, 22 Feb 2019 22:03:09 +0000 (22:03 +0000)]
[NFC][Sanitizer] Rename BufferedStackTrace::FastUnwindStack

FastUnwindStack -> UnwindFast
SlowUnwindStack -> UnwindSlow
Stack is redundant, verb should come first.

SlowUnwindStackWithContext(uptr pc, void *context, u32 max_depth) ->
SlowUnwindStack
WithContext is redundant, since it is a required parameter.

Reviewers: vitalybuka

Differential Revision: https://reviews.llvm.org/D58551

llvm-svn: 354696

5 years ago[Sanitizer] Fix uses of stack->Unwind(..., fast)
Julian Lettner [Fri, 22 Feb 2019 22:00:13 +0000 (22:00 +0000)]
[Sanitizer] Fix uses of stack->Unwind(..., fast)

Apply StackTrace::WillUseFastUnwind(fast) in a few more places missed by
my previous patch (https://reviews.llvm.org/D58156).

Reviewers: vitalybuka

Differential Revision: https://reviews.llvm.org/D58550

llvm-svn: 354695

5 years ago[WebAssembly] MC: Handle aliases of aliases
Sam Clegg [Fri, 22 Feb 2019 21:41:42 +0000 (21:41 +0000)]
[WebAssembly] MC: Handle aliases of aliases

Differential Revision: https://reviews.llvm.org/D58417

llvm-svn: 354694

5 years ago[CMake] Honor LLVM_EXTERNAL_<proj>_SOURCE_DIR
David Greene [Fri, 22 Feb 2019 21:19:48 +0000 (21:19 +0000)]
[CMake] Honor LLVM_EXTERNAL_<proj>_SOURCE_DIR

When LLVM_ENABLE_PROJECTS is set, CMake assumes the project
directories are all side-by-side. This is not always the case and
there's no reason to expect it if LLVM_EXTERNAL_<proj>_SOURCE_DIR is
set. Honor that setting if it exists and allow the build configuration
to continue.

Differential Revision: https://reviews.llvm.org/D49672

llvm-svn: 354693

5 years agoRestore ability for C++ API users to Enable IPRA.
Daniel Sanders [Fri, 22 Feb 2019 20:59:07 +0000 (20:59 +0000)]
Restore ability for C++ API users to Enable IPRA.

Summary:
Prior to r310876 one of our out-of-tree targets was enabling IPRA by modifying
the TargetOptions::EnableIPRA. This no longer works on current trunk since the
useIPRA() hook overrides any values that are set in advance. This patch adjusts
the behaviour of the hook so that API users and useIPRA() can both enable it
but useIPRA() cannot disable it if the API user already enabled it.
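
The enable-only behaviour described above can be sketched as follows. This is a minimal illustration, not the patch itself: TargetOptionsSketch, TargetMachineSketch, TargetWantsIPRA and applyIPRAHook are hypothetical stand-ins, and only the EnableIPRA flag and the useIPRA() hook name come from the summary.

```cpp
// Hypothetical stand-ins for the real classes; only the flag logic matters.
struct TargetOptionsSketch {
  bool EnableIPRA = false; // may already be set by an API user
};

struct TargetMachineSketch {
  TargetOptionsSketch Options;
  bool TargetWantsIPRA = false; // assumed: what the target's hook reports
  bool useIPRA() const { return TargetWantsIPRA; }
};

// OR the hook's answer into the existing value instead of overwriting it,
// so the hook can enable IPRA but can never clear an earlier request.
inline void applyIPRAHook(TargetMachineSketch &TM) {
  TM.Options.EnableIPRA |= TM.useIPRA();
}
```

With this shape, a user who sets EnableIPRA before the hook runs keeps it enabled even when useIPRA() returns false.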

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D38043

llvm-svn: 354692

5 years ago[clang] Only provide C11 features in <float.h> starting with C++17
Louis Dionne [Fri, 22 Feb 2019 20:48:54 +0000 (20:48 +0000)]
[clang] Only provide C11 features in <float.h> starting with C++17

Summary:
In r353970, I enabled those features in C++11 and above. To be strictly
conforming, those features should only be enabled in C++17 and above.

Reviewers: jfb, eli.friedman

Subscribers: jkorous, dexonsmith, libcxx-commits

Differential Revision: https://reviews.llvm.org/D58289

llvm-svn: 354691

5 years ago[OPENMP] Delayed diagnostics for VLA support.
Alexey Bataev [Fri, 22 Feb 2019 20:36:10 +0000 (20:36 +0000)]
[OPENMP] Delayed diagnostics for VLA support.

Generalized processing of the deferred diagnostics for OpenMP/CUDA code.

llvm-svn: 354690

5 years ago[CGP] move overflow intrinsic insertion to common location; NFCI
Sanjay Patel [Fri, 22 Feb 2019 20:20:24 +0000 (20:20 +0000)]
[CGP] move overflow intrinsic insertion to common location; NFCI

We need to enhance the uaddo matching to handle special-cases
as seen in PR40486 and PR31754. That means we won't necessarily
have a def-use pattern, so we'll need to check dominance to
determine where to place the intrinsic (as we already do for
usubo). This preliminary patch is just rearranging the code,
so the planned follow-up to improve uaddo will be more clear.

llvm-svn: 354689

5 years agoMIR: Preserve incoming frame index numbers
Matt Arsenault [Fri, 22 Feb 2019 19:30:38 +0000 (19:30 +0000)]
MIR: Preserve incoming frame index numbers

Don't skip incrementing the frame index number
if the object is dead. Instructions can still be
referencing the old frame index number, and this
doesn't attempt to remap those. The resulting
MIR then fails to load because the use instructions
use a higher frame index number than the recorded
list of stack objects.
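
A rough sketch of the numbering issue, assuming dead objects are simply omitted from the printed list. The function printedIDs and its CountDeadObjects switch are hypothetical and exist only to contrast the two behaviours.

```cpp
#include <cstddef>
#include <vector>

// Returns the ID printed for each stack object, indexed by frame index.
// Dead objects are not printed and get -1.
// CountDeadObjects == false: dead objects do not consume an ID, so the
//   printed IDs are compacted and can drift from the frame indexes.
// CountDeadObjects == true: dead objects still consume an ID, so a live
//   object's printed ID equals its frame index.
std::vector<int> printedIDs(const std::vector<bool> &IsDead,
                            bool CountDeadObjects) {
  std::vector<int> IDs(IsDead.size(), -1);
  int NextID = 0;
  for (std::size_t FI = 0; FI < IsDead.size(); ++FI) {
    if (IsDead[FI]) {
      if (CountDeadObjects)
        ++NextID;
      continue;
    }
    IDs[FI] = NextID++;
  }
  return IDs;
}
```

For three objects where the middle one is dead, the compacting variant prints the last object as ID 1 while instructions still refer to frame index 2. Counting dead objects keeps the two in step.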

I'm not sure it's possible to craft a testcase
with the existing set of passes. It requires
selectively marking some stack objects
dead in an essentially random order.
StackSlotColoring condenses towards
the low indexes. This avoids a regression in a
future AMDGPU commit when some frame indexes
are lowered separately from PEI.

llvm-svn: 354688

5 years agoCodeGen: Make RegAllocRegistry a template class
Matt Arsenault [Fri, 22 Feb 2019 19:16:52 +0000 (19:16 +0000)]
CodeGen: Make RegAllocRegistry a template class

Will allow re-using the machinery for independent
sets of register allocators.

This will allow AMDGPU to use separate command line
options for the allocator to use for SGPRs separate
from VGPRs.
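
As a sketch of the idea only, a registry templated on a tag type gives each instantiation its own independent list. AllocatorRegistry, SGPRTag and VGPRTag are hypothetical names and do not reflect the interface of the real RegAllocRegistry.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Each distinct Tag type instantiates its own static map, so the sets of
// registered allocators do not interfere with one another.
template <class Tag> class AllocatorRegistry {
public:
  using Factory = std::function<std::string()>;

  static void add(std::string Name, Factory F) {
    entries()[std::move(Name)] = std::move(F);
  }

  static bool has(const std::string &Name) {
    return entries().count(Name) != 0;
  }

private:
  static std::map<std::string, Factory> &entries() {
    static std::map<std::string, Factory> Entries; // one map per Tag
    return Entries;
  }
};

struct SGPRTag {}; // hypothetical tag for scalar-register allocators
struct VGPRTag {}; // hypothetical tag for vector-register allocators
```

An allocator registered through AllocatorRegistry<SGPRTag> is not visible through AllocatorRegistry<VGPRTag>, which is the kind of separation that per-class command line options would need.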

llvm-svn: 354687

5 years agoAMDGPU: Use removeAllRegUnitsForPhysReg
Matt Arsenault [Fri, 22 Feb 2019 19:03:36 +0000 (19:03 +0000)]
AMDGPU: Use removeAllRegUnitsForPhysReg

llvm-svn: 354686

5 years agoLiveIntervals: Add removeAllRegUnitsForPhysReg
Matt Arsenault [Fri, 22 Feb 2019 19:03:31 +0000 (19:03 +0000)]
LiveIntervals: Add removeAllRegUnitsForPhysReg

Convenience wrapper for removing the reg units of
a physical register.

llvm-svn: 354685

5 years ago[WebAssembly] Remove debug statement submitted in rL354657
Sam Clegg [Fri, 22 Feb 2019 19:00:03 +0000 (19:00 +0000)]
[WebAssembly] Remove debug statement submitted in rL354657

Subscribers: dschuff, jgravelle-google, hiraditya, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58549

llvm-svn: 354684

5 years ago[GN] Updated build file to allow GN builds to succeed at ToT.
Mitch Phillips [Fri, 22 Feb 2019 18:45:41 +0000 (18:45 +0000)]
[GN] Updated build file to allow GN builds to succeed at ToT.

llvm-svn: 354683

5 years ago[MBP] Factor out function hasViableTopFallthrough and enhancement
Guozhi Wei [Fri, 22 Feb 2019 18:04:37 +0000 (18:04 +0000)]
[MBP] Factor out function hasViableTopFallthrough and enhancement

This patch factors out the function hasViableTopFallthrough from rotateLoop and enhances it. The original code checks only whether there is a block that can be placed before the current loop top. This patch also checks whether the loop top is the most probable successor of its predecessor. The attached test case shows its effect.

Differential Revision: https://reviews.llvm.org/D58393

llvm-svn: 354682

5 years agoFix "not all control paths return" warning. NFCI.
Simon Pilgrim [Fri, 22 Feb 2019 17:37:59 +0000 (17:37 +0000)]
Fix "not all control paths return" warning. NFCI.

llvm-svn: 354681

5 years agoRevert "[OPENMP] Delayed diagnostics for VLA support."
Alexey Bataev [Fri, 22 Feb 2019 17:16:50 +0000 (17:16 +0000)]
Revert "[OPENMP] Delayed diagnostics for VLA support."

This reverts commit r354679 to fix the problem with the Windows
buildbots

llvm-svn: 354680

5 years ago[OPENMP] Delayed diagnostics for VLA support.
Alexey Bataev [Fri, 22 Feb 2019 16:49:13 +0000 (16:49 +0000)]
[OPENMP] Delayed diagnostics for VLA support.

Generalized processing of the deferred diagnostics for OpenMP/CUDA code.

llvm-svn: 354679

5 years agoCodeGen: use COMDAT for block copy/destroy helpers
Saleem Abdulrasool [Fri, 22 Feb 2019 16:29:50 +0000 (16:29 +0000)]
CodeGen: use COMDAT for block copy/destroy helpers

SVN r339438 added support to deduplicate the helpers by using a consistent
naming scheme and using LinkOnceODR semantics.  This works on ELF by means of
weak linking semantics, and entirely does not work on PE/COFF where you end up
with multiply defined strong symbols, which is a strong error on PE/COFF.
Assign the functions a COMDAT group so that they can be uniqued by the linker.
This fixes the use of blocks in CoreFoundation on Windows.

llvm-svn: 354678

5 years agoDisable big-endian constant store merges from rL354676.
Nirav Dave [Fri, 22 Feb 2019 16:20:34 +0000 (16:20 +0000)]
Disable big-endian constant store merges from rL354676.

llvm-svn: 354677

5 years ago[DAGCombine] Fold overlapping constant stores
Nirav Dave [Fri, 22 Feb 2019 16:00:19 +0000 (16:00 +0000)]
[DAGCombine] Fold overlapping constant stores

Fold a smaller constant store into larger constant stores immediately
preceding it.
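
The constant arithmetic behind such a fold can be sketched as below. This is an illustration under stated assumptions, not the DAGCombine code: foldOverlappingConstant is a hypothetical helper, the layout is assumed little-endian, and the narrow store is assumed to lie entirely inside the wide one.

```cpp
#include <cassert>
#include <cstdint>

// Merge a narrow constant store into the wider constant stored just before
// it, returning the single wide constant that has the same effect.
// ByteOffset is the narrow store's offset from the wide store's address.
std::uint64_t foldOverlappingConstant(std::uint64_t WideValue,
                                      unsigned WideBits,
                                      std::uint64_t NarrowValue,
                                      unsigned NarrowBits,
                                      unsigned ByteOffset) {
  unsigned Shift = ByteOffset * 8;
  assert(WideBits <= 64 && NarrowBits > 0 && NarrowBits < WideBits &&
         Shift + NarrowBits <= WideBits &&
         "narrow store must lie inside the wide store");
  // NarrowBits is at most 63 here, so the shift below is well defined.
  std::uint64_t NarrowMask = (std::uint64_t{1} << NarrowBits) - 1;
  // Clear the bytes the narrow store overwrites, then drop in its value.
  std::uint64_t Cleared = WideValue & ~(NarrowMask << Shift);
  return Cleared | ((NarrowValue & NarrowMask) << Shift);
}
```

For example, a 32-bit store of 0x11223344 followed by an 8-bit store of 0xAB at byte offset 1 collapses to a single 32-bit store of 0x1122AB44.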

Reviewers: rnk, courbet

Subscribers: javed.absar, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58468

llvm-svn: 354676

5 years ago[x86] allow narrowing of vector UINT_TO_FP
Sanjay Patel [Fri, 22 Feb 2019 15:47:45 +0000 (15:47 +0000)]
[x86] allow narrowing of vector UINT_TO_FP

As discussed in:
D56864
D58197

Always use the narrow (128-bit) instruction when possible.
We already had the signed int version of this transform.

llvm-svn: 354675

5 years ago[x86] simplify code in combineExtractSubvector; NFC
Sanjay Patel [Fri, 22 Feb 2019 15:28:22 +0000 (15:28 +0000)]
[x86] simplify code in combineExtractSubvector; NFC

Only the 1st fold is attempted pre-legalization, but it requires
legal (simple) types too, so we don't need an EVT in any of the code.

llvm-svn: 354674

5 years agoBreakCriticalEdges: Update PostDominatorTree
Matt Arsenault [Fri, 22 Feb 2019 15:01:41 +0000 (15:01 +0000)]
BreakCriticalEdges: Update PostDominatorTree

llvm-svn: 354673

5 years ago[mips][micromips] fix filling delay slots for PseudoIndirectBranch_MM
Petar Jovanovic [Fri, 22 Feb 2019 14:53:58 +0000 (14:53 +0000)]
[mips][micromips] fix filling delay slots for PseudoIndirectBranch_MM

Filling a delay slot in 32bit jump instructions with a 16bit instruction
can cause issues. According to the documentation such an operation is
unpredictable.
This patch adds opcode Mips::PseudoIndirectBranch_MM alongside
Mips::PseudoIndirectBranch and other instructions that are expanded to jr
instruction and do not allow a 16bit instruction in their delay slots.

Patch by Mirko Brkusanin.

Differential Revision: https://reviews.llvm.org/D58507

llvm-svn: 354672

5 years ago[CUDA]Delayed diagnostics for the asm instructions.
Alexey Bataev [Fri, 22 Feb 2019 14:42:48 +0000 (14:42 +0000)]
[CUDA]Delayed diagnostics for the asm instructions.

Adapted targetDiag for CUDA and used it for the delayed diagnostics in
asm constructs. Works for both host and device compilation sides.

Differential Revision: https://reviews.llvm.org/D58463

llvm-svn: 354671

5 years ago[LowerSwitch][AMDGPU] Do not handle impossible values
Roman Tereshin [Fri, 22 Feb 2019 14:33:46 +0000 (14:33 +0000)]
[LowerSwitch][AMDGPU] Do not handle impossible values

This patch adds LazyValueInfo to LowerSwitch to compute the range of the
value being switched over and reduce the size of the tree LowerSwitch
builds to lower a switch.
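
A simplified sketch of the pruning step, assuming the range analysis has already produced an inclusive [Min, Max] interval for the switched value. CaseRange, ValueRange and pruneImpossibleCases are hypothetical names, and the real pass works on LLVM IR rather than plain integers.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct CaseRange {
  std::int64_t Low;  // inclusive
  std::int64_t High; // inclusive
  int Dest;          // identifier of the destination block
};

struct ValueRange {
  std::int64_t Min; // inclusive
  std::int64_t Max; // inclusive
};

// Drop case ranges the switched value can never hit and clamp the rest to
// the known range, so the lowered comparison tree has fewer nodes.
std::vector<CaseRange> pruneImpossibleCases(const std::vector<CaseRange> &Cases,
                                            ValueRange Known) {
  std::vector<CaseRange> Kept;
  for (const CaseRange &C : Cases) {
    if (C.High < Known.Min || C.Low > Known.Max)
      continue; // entirely outside the known range
    Kept.push_back({std::max(C.Low, Known.Min), std::min(C.High, Known.Max),
                    C.Dest});
  }
  return Kept;
}
```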

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D58096

llvm-svn: 354670

5 years ago[DTU] Refine the interface and logic of applyUpdates
Chijun Sima [Fri, 22 Feb 2019 13:48:38 +0000 (13:48 +0000)]
[DTU] Refine the interface and logic of applyUpdates

Summary:
This patch separates two semantics of `applyUpdates`:
1. The user provides an accurate CFG diff. The status of each edge before and after the update is inferred from the difference between `the number of edge insertions` and `the number of edge deletions`, and the dominator tree is updated accordingly.
2. The user provides a sequence of hints. Updates mentioned in this sequence might never have happened and might even be duplicated.

Logic changes:

Previously, removing invalid updates was considered a side-effect of deduplication and was not guaranteed to be reliable. To handle the second semantic, `applyUpdates` does validity checking before deduplication, which can cause updates that have already been applied to be submitted again. Different calls to `applyUpdates` might then cause unintended consequences, for example,
```
DTU(Lazy) and Edge A->B exists.
1. DTU.applyUpdates({{Delete, A, B}, {Insert, A, B}}) // User expects these 2 updates result in a no-op, but {Insert, A, B} is queued
2. Remove A->B
3. DTU.applyUpdates({{Delete, A, B}}) // DTU cancels this update with {Insert, A, B} mentioned above together (Unintended)
```
By tightening the precondition so that updates to an edge must be strictly ordered in the same way the CFG changes were made, we can infer the initial status of the edge and resolve this issue.

Interface changes:
The second semantic of `applyUpdates` is split out into `applyUpdatesPermissive`.
These changes enable DTU(Lazy) to use the first semantic if needed, which is quite useful in `transforms/utils`.
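
The first semantic can be sketched as a per-edge balance of insertions against deletions. This is a minimal illustration that assumes the caller passes an accurate, strictly ordered diff. Update, UpdateKind and netUpdates are hypothetical names and not the DTU interface.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum class UpdateKind { Insert, Delete };

struct Update {
  UpdateKind Kind;
  std::string From;
  std::string To;
};

// Collapse an update sequence to its net effect per edge:
// insertions count +1, deletions count -1.
std::vector<Update> netUpdates(const std::vector<Update> &Updates) {
  std::map<std::pair<std::string, std::string>, int> Balance;
  for (const Update &U : Updates)
    Balance[{U.From, U.To}] += (U.Kind == UpdateKind::Insert) ? 1 : -1;

  std::vector<Update> Result;
  for (const auto &Entry : Balance) {
    int Net = Entry.second;
    // With an accurate diff, an edge changes state at most once overall.
    assert(std::abs(Net) <= 1 && "updates are not an accurate CFG diff");
    if (Net == 0)
      continue; // e.g. Delete followed by Insert of an existing edge
    Result.push_back({Net > 0 ? UpdateKind::Insert : UpdateKind::Delete,
                      Entry.first.first, Entry.first.second});
  }
  return Result;
}
```

Under this scheme the sequence {Delete, A, B}, {Insert, A, B} for an existing edge A->B nets out to nothing, so no stale {Insert, A, B} is left queued to cancel a later deletion.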

Reviewers: kuhar, brzycki, dmgreen, grosser

Reviewed By: brzycki

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58170

llvm-svn: 354669

5 years agoAvoid two-stage initialization of MinidumpParser
Pavel Labath [Fri, 22 Feb 2019 13:36:01 +0000 (13:36 +0000)]
Avoid two-stage initialization of MinidumpParser

Remove the Initialize function and move the things that can fail into the
static factory function. The factory function now returns
Expected<Parser> instead of Optional<Parser> so that it can give a
reason why creation failed.
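
The single-stage pattern can be sketched in standard C++17, using std::variant as a stand-in for Expected. ParserSketch and its checks are hypothetical: the sketch validates only a 4-byte "MDMP" signature, which is far less than a real minidump parser has to check.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <variant>
#include <vector>

class ParserSketch {
public:
  // Either a fully initialized parser or a reason why creation failed.
  using Result = std::variant<ParserSketch, std::string>;

  // Everything that can fail happens here, so no half-initialized object
  // ever escapes and no separate Initialize step is needed.
  static Result create(std::vector<std::uint8_t> Data) {
    if (Data.size() < 4)
      return std::string("data too short for a header");
    if (Data[0] != 'M' || Data[1] != 'D' || Data[2] != 'M' || Data[3] != 'P')
      return std::string("bad signature");
    return ParserSketch(std::move(Data));
  }

  std::size_t size() const { return Data.size(); }

private:
  // Private constructor: the factory is the only way to get an instance.
  explicit ParserSketch(std::vector<std::uint8_t> D) : Data(std::move(D)) {}
  std::vector<std::uint8_t> Data;
};
```

Compared with an optional return value, the error alternative lets the caller report why parsing failed rather than only that it failed.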

llvm-svn: 354668

5 years ago[ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs
David Green [Fri, 22 Feb 2019 12:23:31 +0000 (12:23 +0000)]
[ARM] Add some missing thumb1 opcodes to enable peephole optimisation of CMPs

This adds a number of missing Thumb1 opcodes so that the peephole optimiser can
remove redundant CMP instructions.

Reapplying this after the first attempt broke non-thumb1 code as the t2ADDri
instruction can be used with frame indices. In thumb1 we use tADDframe.

Differential Revision: https://reviews.llvm.org/D57833

llvm-svn: 354667

5 years ago[ELF][test]Remove unnecessary empty symbol references in yaml/add missing symbols...
James Henderson [Fri, 22 Feb 2019 11:22:39 +0000 (11:22 +0000)]
[ELF][test]Remove unnecessary empty symbol references in yaml/add missing symbols for relocs

yaml2obj used to require the Symbol field in relocations, but it hasn't
done so for a couple of years. Another change to yaml2obj will soon land
that will look up the symbol by name or index, if present, and emit an
error if not found. This will mean that an explicit symbol reference
(even to an empty-named symbol) that does not reference a symbol
declared in the yaml will result in an error.

This patch updates tests that would otherwise start emitting errors.

Reviewed by: ruiu, grimar

Differential Revision: https://reviews.llvm.org/D58508

llvm-svn: 354666

5 years ago[ARM GlobalISel] Support floating point for Thumb2
Diana Picus [Fri, 22 Feb 2019 09:54:54 +0000 (09:54 +0000)]
[ARM GlobalISel] Support floating point for Thumb2

This is exactly the same as arm mode, so for the instruction selector
tests we just extract them to a new file and run with the same checks
for both arm and thumb mode.

For the legalizer we need to update the tests for soft float a bit, but
only because BL and tBL are slightly different. We could be pedantic and
check that we get a well-formed BL for arm mode and a tBL for thumb, but
for the purposes of the legalizer test it's sufficient to just skip over
the predicate operands in the checks. Also note that we have the
pedantic checks in the divmod test, so we're covered.

llvm-svn: 354665