platform/upstream/llvm.git
5 years agoRevert "[DebugInfo] Introduce DW_OP_LLVM_convert"
Markus Lavin [Tue, 19 Mar 2019 09:17:28 +0000 (09:17 +0000)]
Revert "[DebugInfo] Introduce DW_OP_LLVM_convert"

This reverts commit 1cf4b593a7ebd666fc6775f3bd38196e8e65fafe.

Build bots found failing tests not detected locally.

Failing Tests (3):
  LLVM :: DebugInfo/Generic/convert-debugloc.ll
  LLVM :: DebugInfo/Generic/convert-inlined.ll
  LLVM :: DebugInfo/Generic/convert-linked.ll

llvm-svn: 356444

5 years agoUse response file when generating LLVM-C.dll
Serge Guelton [Tue, 19 Mar 2019 09:14:09 +0000 (09:14 +0000)]
Use response file when generating LLVM-C.dll

As discovered in D56774 the command line gets too long, so use a response file
to give the script the libs. This change has been tested and is confirmed
working for me.

Committed on behalf of Jakob Bornecrantz.
Differential Revision: https://reviews.llvm.org/D56781

llvm-svn: 356443

5 years ago[DebugInfo] Introduce DW_OP_LLVM_convert
Markus Lavin [Tue, 19 Mar 2019 08:48:19 +0000 (08:48 +0000)]
[DebugInfo] Introduce DW_OP_LLVM_convert

Introduce a DW_OP_LLVM_convert Dwarf expression pseudo op that allows
for a convenient way to perform type conversions on the Dwarf expression
stack. As an additional bonus it paves the way for using other Dwarf
v5 ops that need to reference a base_type.

The new DW_OP_LLVM_convert is used from lib/Transforms/Utils/Local.cpp
to perform sext/zext on debug values but mainly the patch is about
preparing terrain for adding other Dwarf v5 ops that need to reference a
base_type.

For Dwarf v5 the op maps to DW_OP_convert and for earlier versions a
complex shift & mask pattern is generated to emulate sext/zext.
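
As a rough illustration of what a shift-and-mask emulation of sext/zext computes, here is a minimal Python sketch. The function names and the exact sequence of operations are assumptions made for illustration; this is not the DWARF expression that the patch emits.

```python
def zext(value, from_bits):
    """Zero-extend: keep only the low from_bits bits of value."""
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits, to_bits):
    """Sign-extend the low from_bits bits of value to a to_bits-wide result."""
    low_mask = (1 << from_bits) - 1
    low = value & low_mask
    # Isolate the sign bit of the narrow value with a shift and a mask.
    sign = (low >> (from_bits - 1)) & 1
    # If the sign bit is set, fill every bit above from_bits with ones.
    high = (((1 << to_bits) - 1) ^ low_mask) if sign else 0
    return low | high
```

For example, sign-extending the 8-bit pattern 0x80 to 32 bits sets all upper bits, while zero-extending it leaves them clear.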

Differential Revision: https://reviews.llvm.org/D56587

llvm-svn: 356442

5 years agoRefactor cast<>'s in if conditionals, which can only assert on failure.
Don Hinton [Tue, 19 Mar 2019 06:14:14 +0000 (06:14 +0000)]
Refactor cast<>'s in if conditionals, which can only assert on failure.

Summary:
This patch refactors several instances of cast<> used in if
conditionals.  Since cast<> asserts on failure, the else branch can
never be taken.

In some cases, the fix is to replace cast<> with dyn_cast<>, while
others required the removal of the conditional and some minor
refactoring.
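
The difference between the two casts can be sketched with Python analogues. The classes and helpers below are hypothetical stand-ins, not LLVM's C++ templates: `cast` fails loudly on a type mismatch, so testing its result in a conditional is pointless, while `dyn_cast` returns `None` and makes the else branch reachable.

```python
class Value:
    pass


class Instruction(Value):
    pass


class Constant(Value):
    pass


def cast(cls, v):
    """Analogue of cast<>: a mismatch is a hard failure, never a falsy result."""
    if not isinstance(v, cls):
        raise TypeError("cast<> on incompatible type")  # stands in for the assertion
    return v


def dyn_cast(cls, v):
    """Analogue of dyn_cast<>: returns None on a mismatch."""
    return v if isinstance(v, cls) else None


def describe(v):
    # With dyn_cast the fall-through path is reachable; with cast it never would be.
    inst = dyn_cast(Instruction, v)
    if inst is not None:
        return "instruction"
    return "not an instruction"
```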

A discussion can be seen here: http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20190318/265044.html

Differential Revision: https://reviews.llvm.org/D59529

llvm-svn: 356441

5 years ago[WebAssembly] Small improvements in FixIrreducibleControlFlow (NFC)
Heejin Ahn [Tue, 19 Mar 2019 05:26:33 +0000 (05:26 +0000)]
[WebAssembly] Small improvements in FixIrreducibleControlFlow (NFC)

Summary:
- Make some class member methods const
- Delete unnecessary includes
- Use a simpler form of `BuildMI`

Reviewers: kripken

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59454

llvm-svn: 356440

5 years ago[WebAssembly] Improve readability of irreducibility tests
Heejin Ahn [Tue, 19 Mar 2019 05:10:39 +0000 (05:10 +0000)]
[WebAssembly] Improve readability of irreducibility tests

Summary:
This adds `preds` comment lines to BB names for readability, and also
fixes some existing incorrect comment lines. It also deletes a few
unnecessary attributes. Autogenerated by `opt`.

Reviewers: kripken

Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59456

llvm-svn: 356439

5 years ago[WebAssembly] Rename methods according to instruction name changes (NFC)
Heejin Ahn [Tue, 19 Mar 2019 05:07:33 +0000 (05:07 +0000)]
[WebAssembly] Rename methods according to instruction name changes (NFC)

Reviewers: tlively, sbc100

Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59469

llvm-svn: 356438

5 years ago[WebAssembly] Add immarg attribute to intrinsics
Heejin Ahn [Tue, 19 Mar 2019 05:02:30 +0000 (05:02 +0000)]
[WebAssembly] Add immarg attribute to intrinsics

Summary:
After r355981, intrinsic arguments that are immediate values should be
marked as `ImmArg`.

Reviewers: dschuff, tlively

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59447

llvm-svn: 356437

5 years ago[WebAssembly] Change wasm.throw's first argument to an immediate
Heejin Ahn [Tue, 19 Mar 2019 04:58:59 +0000 (04:58 +0000)]
[WebAssembly] Change wasm.throw's first argument to an immediate

Summary:
`wasm.throw` builtin's first 'tag' argument should be an immediate index
into the event section.

Reviewers: dschuff, craig.topper

Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59448

llvm-svn: 356436

5 years agoMark 'front()' and 'back()' as noexcept for array/deque/string/string_view. These...
Marshall Clow [Tue, 19 Mar 2019 03:30:07 +0000 (03:30 +0000)]
Mark 'front()' and 'back()' as noexcept for array/deque/string/string_view. These are just rebranded 'operator[]', and should be noexcept like it is.

llvm-svn: 356435

5 years ago[CodeGen] LLVM OpenMP Backend.
Michael Kruse [Tue, 19 Mar 2019 03:18:21 +0000 (03:18 +0000)]
[CodeGen] LLVM OpenMP Backend.

The ParallelLoopGenerator class is changed so that GNU OpenMP specific
code is removed, allowing it to be used as a superclass in a
template pattern. The code has therefore been reorganized, and one may
no longer use the ParallelLoopGenerator directly; instead, specific
implementations have to be provided. These implementations contain the
library-specific code. As such, the "GOMP" (code taken entirely from
the existing backend) and "KMP" variants were created.

For "check-polly", equivalents were added for all tests that involved
"GOMP"; they test the new functionality, such as static scheduling and
different chunk sizes. "docs/UsingPollyWithClang.rst" shows how the
alternative backend may be used.

Patch by Michael Halkenhäuser <michaelhalk@web.de>

Differential Revision: https://reviews.llvm.org/D59100

llvm-svn: 356434

5 years agoFactor out repeated code parsing and concatenating header-names from
Richard Smith [Tue, 19 Mar 2019 01:51:19 +0000 (01:51 +0000)]
Factor out repeated code parsing and concatenating header-names from
tokens.

We now actually form an angled_string_literal token for a header name by
concatenation rather than just working out what its contents would be.
This substantially simplifies downstream processing and is necessary for
C++20 header unit imports.

llvm-svn: 356433

5 years agoDon't apply the include depth limit until we actually decide to enter
Richard Smith [Tue, 19 Mar 2019 01:51:17 +0000 (01:51 +0000)]
Don't apply the include depth limit until we actually decide to enter
the file.

NFC unless a skipped #include is found at the final permitted #include
level.

llvm-svn: 356432

5 years ago[WebAssembly] Lower SIMD nnan setcc nodes
Thomas Lively [Tue, 19 Mar 2019 00:55:34 +0000 (00:55 +0000)]
[WebAssembly] Lower SIMD nnan setcc nodes

Summary:
Adds patterns to lower all the remaining setcc modes: lt, gt,
le, and ge. Fixes PR40912.

Reviewers: aheejin, sbc100, dschuff

Reviewed By: dschuff

Subscribers: jgravelle-google, hiraditya, sunfish, jdoerfert, llvm-commits, srj

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59519

llvm-svn: 356431

5 years agoMinor renaming as suggested in review [NFC]
Aaron Puchert [Tue, 19 Mar 2019 00:14:46 +0000 (00:14 +0000)]
Minor renaming as suggested in review [NFC]

See D59455.

llvm-svn: 356430

5 years agoRemove unused try catch blocks from old debug tests
Eric Fiselier [Tue, 19 Mar 2019 00:00:30 +0000 (00:00 +0000)]
Remove unused try catch blocks from old debug tests

llvm-svn: 356429

5 years ago[ELF] Allow sh_entsize to be unrelated to sh_addralign and not a power of 2
Fangrui Song [Mon, 18 Mar 2019 23:49:18 +0000 (23:49 +0000)]
[ELF] Allow sh_entsize to be unrelated to sh_addralign and not a power of 2

Summary:
This implements Rui Ueyama's idea in PR39044.
I've checked that ld.bfd and gold do not have the power-of-2 requirement
and do not require sh_entsize to be a multiple of sh_addralign.

Now on the updated test merge-entsize.s, all 3 linkers happily
create .rodata that is not 3-byte aligned.

This has a use case in Linux arch/x86/crypto/sha512-avx2-asm.S
It uses sh_entsize of 640, which is not a power of 2.
See https://github.com/ClangBuiltLinux/linux/issues/417
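
To make the arithmetic concrete, the following Python sketch checks that 640 is not a power of 2 and still divides a section into whole entries. The helper names and the entry-counting function are illustrative assumptions, not code from lld.

```python
def is_power_of_two(n):
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def entry_count(section_size, entsize):
    """Number of fixed-size entries in a section.

    The entry size only has to divide the section size; it does not need to
    be a power of two or related to the section alignment.
    """
    if entsize <= 0 or section_size % entsize != 0:
        raise ValueError("section size is not a multiple of the entry size")
    return section_size // entsize
```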

Reviewers: ruiu, espindola

Reviewed By: ruiu

Subscribers: nickdesaulniers, E5ten, emaste, arichardson, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59478

llvm-svn: 356428

5 years agoThread safety analysis: Add note for unlock kind mismatch
Aaron Puchert [Mon, 18 Mar 2019 23:26:54 +0000 (23:26 +0000)]
Thread safety analysis: Add note for unlock kind mismatch

Summary:
Similar to D56967, we add the existing diag::note_locked_here to tell
the user where we saw the locking that isn't matched correctly.

Reviewers: aaron.ballman, delesley

Reviewed By: aaron.ballman

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59455

llvm-svn: 356427

5 years ago[asan] Disable -Wfortify-source in intentional OOB tests
Reid Kleckner [Mon, 18 Mar 2019 23:03:46 +0000 (23:03 +0000)]
[asan] Disable -Wfortify-source in intentional OOB tests

Needed after r356397

llvm-svn: 356426

5 years ago[MS] Skip vbase construction in abstract class ctors
Reid Kleckner [Mon, 18 Mar 2019 22:41:50 +0000 (22:41 +0000)]
[MS] Skip vbase construction in abstract class ctors

As background, when constructing a complete object, virtual bases are
constructed first. If an exception is thrown later in the ctor, those
virtual bases are destroyed, so sema marks the relevant constructors and
destructors of virtual bases as referenced. If necessary, they are
emitted.

However, an abstract class can never be used to construct a complete
object. In the Itanium C++ ABI, this works out nicely, because we never
end up emitting the "complete" constructor variant, only the "base"
constructor variant, which can be called by constructors of derived
classes. Clang's Sema::MarkBaseAndMemberDestructorsReferenced is aware
of this optimization, and it does not mark ctors and dtors of virtual
bases referenced when the constructor of an abstract class is emitted.

In the Microsoft ABI, there are no complete/base variants, so before
this change, the constructor of an abstract class could reference ctors
and dtors of a virtual base without marking them referenced. This could
lead to unresolved symbol errors at link time, as reported in PR41065.

The fix is to implement the same optimization as Sema: If the class is
abstract, don't bother initializing its virtual bases. The "is this
class the most derived class" check in the constructor will never pass,
and the virtual base constructor calls are always dead. Skip them.

I think Richard noticed this missed optimization back in 2016 when he
was implementing inheriting constructors. I wasn't able to find any bugs
or email about it, though.

Fixes PR41065

llvm-svn: 356425

5 years agoRevert "[ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()"
Nikita Popov [Mon, 18 Mar 2019 22:26:27 +0000 (22:26 +0000)]
Revert "[ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()"

This reverts commit 106f0cdefb02afc3064268dc7a71419b409ed2f3.

This change impacts the AMDGPU smed3.ll and umed3.ll codegen tests.

llvm-svn: 356424

5 years ago[X86] Add gcc rotate intrinsics to ia32intrin.h
Craig Topper [Mon, 18 Mar 2019 22:25:57 +0000 (22:25 +0000)]
[X86] Add gcc rotate intrinsics to ia32intrin.h

This is another attempt at what Erich Keane tried to do in r355322.

This adds rolb, rolw, rold, rolq and their ror equivalents as always_inline wrappers around __builtin_rotate* which will lower to funnel shift intrinsics in IR.

Additionally, when _MSC_VER is not defined we will define _rotl, _lrotl, _rotr, _lrotr as macros to one of the always_inline intrinsics mentioned above, making sure that _lrotl/_lrotr use either the 32-bit or 64-bit version based on the size of long. These need to be macros because we have builtins with the same name for MS compatibility, but _MSC_VER isn't always defined when those builtins are enabled.

We also define _rotwl and _rotwr as macros aliasing to rolw/rorw just like gcc to complete the set. These don't need to be gated with _MSC_VER because these aren't MS builtins.
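
For readers unfamiliar with rotates, this Python sketch shows the bit-level behavior at an arbitrary width. The names `rotl` and `rotr` are illustrative and are not the header's intrinsics; a rotate left is equivalent to a funnel shift left whose two inputs are the same value.

```python
def rotl(value, shift, bits):
    """Rotate the low `bits` bits of value left by shift positions."""
    mask = (1 << bits) - 1
    shift %= bits
    value &= mask
    # Bits shifted out at the top re-enter at the bottom.
    return ((value << shift) | (value >> (bits - shift))) & mask


def rotr(value, shift, bits):
    """Rotate right, expressed as a rotate left by the complementary amount."""
    return rotl(value, bits - (shift % bits), bits)
```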

I've added tests both for non-MS and -ms-extensions with and without _MSC_VER being defined.

Differential Revision: https://reviews.llvm.org/D59346

llvm-svn: 356423

5 years ago[libFuzzer] document -len_control
Kostya Serebryany [Mon, 18 Mar 2019 22:20:47 +0000 (22:20 +0000)]
[libFuzzer] document -len_control

llvm-svn: 356422

5 years agoFix test failures after debug mode changes
Eric Fiselier [Mon, 18 Mar 2019 22:12:09 +0000 (22:12 +0000)]
Fix test failures after debug mode changes

llvm-svn: 356421

5 years ago[X86] Add coverage for 16-bit and 64-bit versions of bsf/bsr/bt/btc/btr/bts in the...
Craig Topper [Mon, 18 Mar 2019 22:06:19 +0000 (22:06 +0000)]
[X86] Add coverage for 16-bit and 64-bit versions of bsf/bsr/bt/btc/btr/bts in the assembly tests that are supposed to provide full coverage. Add coverage for cwtl/cltq/cwtd/cqto as well.

llvm-svn: 356420

5 years ago[X86] Disable CQTO and CLTQ instructions in the assembly parser outside 64-bit mode.
Craig Topper [Mon, 18 Mar 2019 22:06:14 +0000 (22:06 +0000)]
[X86] Disable CQTO and CLTQ instructions in the assembly parser outside 64-bit mode.

llvm-svn: 356419

5 years ago[NFC][TSan][libdispatch] Fix test for dispatch_apply[_f]
Julian Lettner [Mon, 18 Mar 2019 21:55:41 +0000 (21:55 +0000)]
[NFC][TSan][libdispatch] Fix test for dispatch_apply[_f]

* Array index out of bounds: 100 iterations, but size of array is 2.
* Unmatched barrier_init (2) with barrier_wait (200)
* Number of iterations must be smaller than the available parallelism
  for the queue, otherwise we deadlock (since every barrier_wait call
  blocks the thread).
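
The third point can be reproduced in miniature with Python threads. This is only an analogy under stated assumptions: a thread pool stands in for the dispatch queue, `threading.Barrier` stands in for the test's barrier, and a timeout turns the deadlock into a detectable failure.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def all_reach_barrier(workers, iterations, timeout=0.5):
    """Run `iterations` tasks on `workers` threads; each task waits on a barrier.

    Returns True if every task passes the barrier, False if the barrier
    breaks because too few tasks can run at the same time.
    """
    barrier = threading.Barrier(iterations)

    def body(_):
        # Blocks the worker thread until `iterations` tasks have arrived.
        barrier.wait(timeout=timeout)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(body, i) for i in range(iterations)]
        try:
            for f in futures:
                f.result()
        except threading.BrokenBarrierError:
            return False
    return True
```

With as many workers as iterations every task arrives and the barrier opens; with fewer workers the blocked tasks occupy all threads, and the remaining tasks never get to run until the timeout breaks the barrier.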

Scary: All of this worked reliably in gcd-apply.mm (for Darwin)

Reviewed By: kubamracek

Differential Revision: https://reviews.llvm.org/D59510

llvm-svn: 356418

5 years agoRemove exception throwing debug mode handler support.
Eric Fiselier [Mon, 18 Mar 2019 21:50:12 +0000 (21:50 +0000)]
Remove exception throwing debug mode handler support.

Summary:
The reason libc++ implemented a throwing debug mode handler was for ease of testing. Specifically,
I thought that if a debug violation aborted, we could only test one violation per file. This made
it impossible to test debug mode. With throwing behavior we could test more!

However, the throwing approach didn't work either, since there are debug violations underneath noexcept
functions. This led to the introduction of `_NOEXCEPT_DEBUG`, which was only noexcept when debug
mode was off.

Having thought more and grown wiser, I now see that `_NOEXCEPT_DEBUG` was a horrible decision. It was
viral, it didn't cover all the cases it needed to, and it was observable to the user -- at worst
changing the behavior of their program.

This patch removes the throwing debug handler, and rewrites the debug tests using 'fork-ing' style
death tests.

Reviewers: mclow.lists, ldionne, thomasanderson

Reviewed By: ldionne

Subscribers: christof, arphaman, libcxx-commits, #libc

Differential Revision: https://reviews.llvm.org/D59166

llvm-svn: 356417

5 years agoA target definition file that may work for
Jason Molenda [Mon, 18 Mar 2019 21:39:54 +0000 (21:39 +0000)]
A target definition file that may work for
Aarch32 Cortex-M target processor debugging.

<rdar://problem/48448564>

llvm-svn: 356416

5 years ago[ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()
Nikita Popov [Mon, 18 Mar 2019 21:35:19 +0000 (21:35 +0000)]
[ValueTracking][InstSimplify] Support min/max selects in computeConstantRange()

Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This was suggested by spatel as an alternative
approach to D59378. I've also added the infinite looping test from
that revision here.
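
The idea of folding a comparison from a constant range can be sketched as follows. This is a simplified Python model with inclusive ranges and hypothetical helper names, not the ConstantRange logic in LLVM; the caller is assumed to use a comparison whose signedness matches the min/max flavor.

```python
def min_max_range(flavor, c, bits):
    """Inclusive range of flavor(x, c) for an unknown x of the given width."""
    signed_min, signed_max = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    unsigned_max = (1 << bits) - 1
    if flavor == "smax":
        return (c, signed_max)
    if flavor == "smin":
        return (signed_min, c)
    if flavor == "umax":
        return (c, unsigned_max)
    if flavor == "umin":
        return (0, c)
    raise ValueError("unknown min/max flavor")


def fold_less_than(value_range, k):
    """Fold `v < k` to True or False when the range decides it, else None."""
    lo, hi = value_range
    if hi < k:
        return True
    if lo >= k:
        return False
    return None
```

For instance, `smax(x, 5) < 3` is always false on i8 because the result can never drop below 5.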

Differential Revision: https://reviews.llvm.org/D59506

llvm-svn: 356415

5 years ago[InstCombine] Add tests for add nuw + uaddo; NFC
Nikita Popov [Mon, 18 Mar 2019 21:35:09 +0000 (21:35 +0000)]
[InstCombine] Add tests for add nuw + uaddo; NFC

Baseline tests for D59471 (InstCombine of `add nuw` and `uaddo` with
constants).

Patch by Dan Robertson.

Differential Revision: https://reviews.llvm.org/D59472

llvm-svn: 356414

5 years ago[X86] Allow any 8-bit immediate to be used with BT/BTC/BTR/BTS not just sign extended...
Craig Topper [Mon, 18 Mar 2019 21:33:59 +0000 (21:33 +0000)]
[X86] Allow any 8-bit immediate to be used with BT/BTC/BTR/BTS not just sign extended 8-bit immediates.

We need to allow [128,255] in addition to [-128, 127] to match gas.
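
A short Python sketch of why both ranges are reasonable to accept: every value in [-128, 255] fits in one byte, and a negative value and its unsigned counterpart produce the same encoding. The function name is an assumption for illustration, not the assembler's code.

```python
def encode_imm8(imm):
    """Encode an immediate as one byte, accepting signed or unsigned spellings."""
    if not -128 <= imm <= 255:
        raise ValueError("immediate does not fit in 8 bits")
    # Two's-complement truncation maps -128..-1 onto 0x80..0xFF.
    return imm & 0xFF
```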

llvm-svn: 356413

5 years ago[CMake] Set LLVM_DEFAULT_EXTERNAL_LIT in standalone build correctly on windows
Alex Langford [Mon, 18 Mar 2019 21:32:31 +0000 (21:32 +0000)]
[CMake] Set LLVM_DEFAULT_EXTERNAL_LIT in standalone build correctly on windows

LLVM installed llvm-lit with a .py suffix on windows. Let's match that
behavior here.

llvm-svn: 356412

5 years ago[GlobalISel] Include missing change from r356396
Amara Emerson [Mon, 18 Mar 2019 21:29:21 +0000 (21:29 +0000)]
[GlobalISel] Include missing change from r356396

Forgot to add a change to relax some asserts in r356396.

llvm-svn: 356411

5 years ago[WebAssembly] Don't override default implementation of isOffsetFoldingLegal. NFC.
Sam Clegg [Mon, 18 Mar 2019 21:21:12 +0000 (21:21 +0000)]
[WebAssembly] Don't override default implementation of isOffsetFoldingLegal. NFC.

The default implementation does what we want and is going to be more compatible
with dynamic linking (-fPIC) support that is planned.

This is NFC because currently we only build wasm with
`-relocation-model=static` which in turn means that the default
`isOffsetFoldingLegal` always returns true today.

Differential Revision: https://reviews.llvm.org/D54661

llvm-svn: 356410

5 years ago[ValueTracking][InstSimplify] Move abs handling into computeConstantRange(); NFC
Nikita Popov [Mon, 18 Mar 2019 21:20:03 +0000 (21:20 +0000)]
[ValueTracking][InstSimplify] Move abs handling into computeConstantRange(); NFC

This is preparation for D59506. The InstructionSimplify abs handling
is moved into computeConstantRange(), which is the general place for
such calculations. This is NFC and doesn't affect the existing tests
in test/Transforms/InstSimplify/icmp-abs-nabs.ll.

Differential Revision: https://reviews.llvm.org/D59511

llvm-svn: 356409

5 years ago[InstSimplify] Add additional icmp of min/max tests; NFC
Nikita Popov [Mon, 18 Mar 2019 21:19:56 +0000 (21:19 +0000)]
[InstSimplify] Add additional icmp of min/max tests; NFC

These are baseline tests for D59506.

llvm-svn: 356408

5 years ago[X86] Use relocImm in the ROL8ri/ROL16ri/ROL32ri/ROL64ri patterns to be consistent...
Craig Topper [Mon, 18 Mar 2019 20:43:15 +0000 (20:43 +0000)]
[X86] Use relocImm in the ROL8ri/ROL16ri/ROL32ri/ROL64ri patterns to be consistent with the ROR patterns.

llvm-svn: 356407

5 years ago[X86] Replace uses of i64immSExt32_su with i64relocImmSExt32_su.
Craig Topper [Mon, 18 Mar 2019 20:43:09 +0000 (20:43 +0000)]
[X86] Replace uses of i64immSExt32_su with i64relocImmSExt32_su.

For the i8, i16, and i32 instructions we were using a relocImm. Presumably we should for i64 as well.

llvm-svn: 356406

5 years ago[AMDGPU] Enable code selection using `s_mul_hi_u32`/`s_mul_hi_i32`.
Michael Liao [Mon, 18 Mar 2019 20:40:09 +0000 (20:40 +0000)]
[AMDGPU] Enable code selection using `s_mul_hi_u32`/`s_mul_hi_i32`.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59501

llvm-svn: 356405

5 years ago[llvm-objcopy] Make .build-id linking atomic
Jake Ehrlich [Mon, 18 Mar 2019 20:35:18 +0000 (20:35 +0000)]
[llvm-objcopy] Make .build-id linking atomic

This change makes linking into .build-id atomic and safe to use.
Some users under particular workflows are reporting that this races
more than half the time under particular conditions.
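
The commit message does not say how atomicity is achieved. One common technique, sketched here in Python purely as an assumption, is to create the hard link under a unique temporary name and then rename it over the destination, so concurrent writers never observe a missing or half-created link. The function name and temporary-name scheme are hypothetical.

```python
import os
import uuid


def atomic_link(src, dst):
    """Hard-link src to dst so that dst appears (or is replaced) atomically."""
    directory = os.path.dirname(dst) or "."
    # Unique temporary name in the same directory, so the rename stays on one filesystem.
    tmp = os.path.join(directory, ".tmp-" + uuid.uuid4().hex)
    os.link(src, tmp)
    try:
        # Rename atomically replaces dst if it already exists.
        os.replace(tmp, dst)
    finally:
        # If dst was already a link to the same file, rename is a no-op and
        # leaves tmp behind; remove it in that case and on errors.
        if os.path.lexists(tmp):
            os.unlink(tmp)
```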

llvm-svn: 356404

5 years ago[InstCombine] Improve with.overflow intrinsic tests; NFC
Nikita Popov [Mon, 18 Mar 2019 20:08:35 +0000 (20:08 +0000)]
[InstCombine] Improve with.overflow intrinsic tests; NFC

- Do not use unnamed values in saddo tests
- Add tests for canonicalization of a constant arg0

Patch by Dan Robertson.

Differential Revision: https://reviews.llvm.org/D59476

llvm-svn: 356403

5 years agoRestore comment regarding why Reloc::PIC_ can't be PIC
Sam Clegg [Mon, 18 Mar 2019 20:04:34 +0000 (20:04 +0000)]
Restore comment regarding why Reloc::PIC_ can't be PIC

The original change back in rL29307 explained this but it was
lost somewhere along the way.

Differential Revision: https://reviews.llvm.org/D59445

llvm-svn: 356402

5 years ago[API] Remove unneeded LLDB_DISABLE_PYTHON markers.
Davide Italiano [Mon, 18 Mar 2019 20:02:27 +0000 (20:02 +0000)]
[API] Remove unneeded LLDB_DISABLE_PYTHON markers.

llvm-svn: 356401

5 years agoFix flat-error-unsupported-gpu-hsa test
Alexandre Ganea [Mon, 18 Mar 2019 19:38:04 +0000 (19:38 +0000)]
Fix flat-error-unsupported-gpu-hsa test

Differential Revision: https://reviews.llvm.org/D59505

llvm-svn: 356400

5 years ago[AMDGPU] Asm/disasm clamp modifier on vop3 int arithmetic
Tim Renouf [Mon, 18 Mar 2019 19:35:44 +0000 (19:35 +0000)]
[AMDGPU] Asm/disasm clamp modifier on vop3 int arithmetic

Allow the clamp modifier on vop3 int arithmetic instructions in assembly
and disassembly.

This involved adding a clamp operand to the affected instructions in MIR
and MC, and thus having to fix up several places in codegen and MIR
tests.

Differential Revision: https://reviews.llvm.org/D59267

Change-Id: Ic7775105f02a985b668fa658a0cd7837846a534e
llvm-svn: 356399

5 years ago[AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiers
Tim Renouf [Mon, 18 Mar 2019 19:25:39 +0000 (19:25 +0000)]
[AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiers

This commit allows v_cndmask_b32_e64 with abs, neg source
modifiers on src0, src1 to be assembled and disassembled.

This does appear to be allowed, even though they are floating point
modifiers and the operand type is b32.

To do this, I added src0_modifiers and src1_modifiers to the
MachineInstr, which involved fixing up several places in codegen and mir
tests.

Differential Revision: https://reviews.llvm.org/D59191

Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea
llvm-svn: 356398

5 years ago[Sema] Add some compile time _FORTIFY_SOURCE diagnostics
Erik Pilkington [Mon, 18 Mar 2019 19:23:45 +0000 (19:23 +0000)]
[Sema] Add some compile time _FORTIFY_SOURCE diagnostics

These diagnose overflowing calls to a subset of fortifiable functions. Some
functions, like sprintf or strcpy aren't supported right now, but we should
probably support these in the future. We previously supported this kind of
functionality with -Wbuiltin-memcpy-chk-size, but that diagnostic doesn't work
with _FORTIFY implementations that use wrapper functions. Also unlike that
diagnostic, we emit these warnings regardless of whether _FORTIFY_SOURCE is
actually enabled, which is nice for programs that don't enable the runtime
checks.

Why not just use diagnose_if, like Bionic does? We can get better diagnostics in
the compiler (i.e. mention the sizes), and we have the potential to diagnose
sprintf and strcpy which is impossible with diagnose_if (at least, in languages
that don't support C++14 constexpr). This approach also saves standard libraries
from having to add diagnose_if.

rdar://48006655

Differential revision: https://reviews.llvm.org/D58797

llvm-svn: 356397

5 years agoRevert r356304: remove subreg parameter from MachineIRBuilder::buildCopy()
Amara Emerson [Mon, 18 Mar 2019 19:20:10 +0000 (19:20 +0000)]
Revert r356304: remove subreg parameter from MachineIRBuilder::buildCopy()

After review comments, it was preferred to not teach MachineIRBuilder about
non-generic instructions beyond using buildInstr().

For AArch64 I've changed the buildCopy() calls to buildInstr() + a
separate addReg() call.

This also relaxes the MachineIRBuilder's COPY checking more because it may
not always have a SrcOp given to it.

llvm-svn: 356396

5 years ago[DebugInfo][PDB] Don't write empty debug streams
Alexandre Ganea [Mon, 18 Mar 2019 19:13:23 +0000 (19:13 +0000)]
[DebugInfo][PDB] Don't write empty debug streams

Before, empty debug streams were written as 8 bytes (4 bytes signature + 4 bytes for the GlobalRefs count).

With this patch, unused empty streams aren't emitted anymore. Modules now encode 65535 as an 'unused stream' value, by convention.
Also fix the * Linker * contrib section which wasn't correctly emitted previously.

Differential Revision: https://reviews.llvm.org/D59502

llvm-svn: 356395

5 years ago[MsgPack][AMDGPU] Fix unflushed raw_string_ostream bugs on windows expensive checks bot
Tim Renouf [Mon, 18 Mar 2019 19:00:46 +0000 (19:00 +0000)]
[MsgPack][AMDGPU] Fix unflushed raw_string_ostream bugs on windows expensive checks bot

This fixes a couple of unflushed raw_string_ostream bugs in recent
commits that only show up on a bot building on windows with expensive
checks.

Differential Revision: https://reviews.llvm.org/D59396

Change-Id: I9c6208325503b3ee0786b4b688e13fc24a15babf
llvm-svn: 356394

5 years ago[X86] Rename imm8_su/imm16_su/imm32_su to relocImm8_su/relocImm16_su/relocImm32_su...
Craig Topper [Mon, 18 Mar 2019 18:54:06 +0000 (18:54 +0000)]
[X86] Rename imm8_su/imm16_su/imm32_su to relocImm8_su/relocImm16_su/relocImm32_su/ to accurately reflect what they are.

llvm-svn: 356393

5 years ago[SCEV] Guard movement of insertion point for loop-invariants
Warren Ristow [Mon, 18 Mar 2019 18:52:35 +0000 (18:52 +0000)]
[SCEV] Guard movement of insertion point for loop-invariants

This reinstates r347934, along with a tweak to address a problem with
PHI node ordering that that commit created (or exposed). (That commit
was reverted at r348426, due to the PHI node issue.)

Original commit message:

r320789 suppressed moving the insertion point of SCEV expressions with
div/rem operations to the loop header in non-loop-invariant situations.
This, and similar, hoisting is also unsafe in the loop-invariant case,
since there may be a guard against a zero denominator. This is an
adjustment to the fix of r320789 to suppress the movement even in the
loop-invariant case.
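
The hazard is easy to see in a source-level analogy. In the Python sketch below (hypothetical functions, not SCEV expander output), the division is loop-invariant but sits behind a guard against a zero denominator; hoisting it above the guard makes the zero case trap.

```python
def guarded(n, d):
    """The division is loop-invariant but only executed when d is non-zero."""
    total = 0
    for _ in range(n):
        if d != 0:
            total += 100 // d
    return total


def hoisted(n, d):
    """Unsafe variant: the division is moved above the guard."""
    q = 100 // d  # raises ZeroDivisionError when d == 0
    total = 0
    for _ in range(n):
        if d != 0:
            total += q
    return total
```

Both functions agree whenever `d` is non-zero; only the hoisted form fails for `d == 0`.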

This fixes PR30806.

Differential Revision: https://reviews.llvm.org/D57428

llvm-svn: 356392

5 years ago[AArch64] Small fix for getIntImmCost
Adhemerval Zanella [Mon, 18 Mar 2019 18:50:58 +0000 (18:50 +0000)]
[AArch64] Small fix for getIntImmCost

It uses the generic AArch64_IMM::expandMOVImm to get the correct
number of instructions used in immediate materialization.

Reviewers: efriedma

Differential Revision: https://reviews.llvm.org/D58461

llvm-svn: 356391

5 years ago[AArch64] Optimize floating point materialization
Adhemerval Zanella [Mon, 18 Mar 2019 18:45:57 +0000 (18:45 +0000)]
[AArch64] Optimize floating point materialization

This patch follows some ideas from r352866 to optimize the floating
point materialization even further. It changes isFPImmLegal to
consider up to 2 mov instructions, or up to 5 in case the subtarget has
fused literals.

The rationale is that the cost is the same for mov+fmov vs. adrp+ldr; but
the mov+fmov sequence is always better because of the reduced d-cache
pressure. The timings are still the same if you consider movw+movk+fmov
vs. adrp+ldr will be fused (although one instruction longer).

Reviewers: efriedma

Differential Revision: https://reviews.llvm.org/D58460

llvm-svn: 356390

5 years ago[TargetLowering] Add code size information on isFPImmLegal. NFC
Adhemerval Zanella [Mon, 18 Mar 2019 18:40:07 +0000 (18:40 +0000)]
[TargetLowering] Add code size information on isFPImmLegal. NFC

This allows better code size for aarch64 floating point materialization
in a future patch.

Reviewers: evandro

Differential Revision: https://reviews.llvm.org/D58690

llvm-svn: 356389

5 years ago[OPENMP] Set scheduling for doacross loops as schedule, 1.
Alexey Bataev [Mon, 18 Mar 2019 18:40:00 +0000 (18:40 +0000)]
[OPENMP] Set scheduling for doacross loops as schedule, 1.

The default scheduling for doacross loops is changed from static to
static, 1.

llvm-svn: 356388

5 years ago[AArch64] Refactor floating point materialization. NFC
Adhemerval Zanella [Mon, 18 Mar 2019 18:23:23 +0000 (18:23 +0000)]
[AArch64] Refactor floating point materialization. NFC

It splits the logic of actual instruction emission away from the logic
that figures out the appropriate sequence in AArch64ExpandPseudo::expandMOVImm.
The new function AArch64_IMM::expandMOVImm, which returns the list of
instructions to materialize the immediate constant, is implemented in a
separate unit because it will be used in a subsequent patch to optimize
floating point materialization.

Reviewers: efriedma

Differential Revision: https://reviews.llvm.org/D58915

llvm-svn: 356387

5 years ago[libc++][NFC] Promote CMake comment to an actual option description
Louis Dionne [Mon, 18 Mar 2019 18:18:01 +0000 (18:18 +0000)]
[libc++][NFC] Promote CMake comment to an actual option description

llvm-svn: 356386

5 years ago[AMDGPU] Add the missing clang change of the experimental buffer fat pointer
Michael Liao [Mon, 18 Mar 2019 18:11:37 +0000 (18:11 +0000)]
[AMDGPU] Add the missing clang change of the experimental buffer fat pointer

llvm-svn: 356385

5 years ago[X86] Remove the _alt forms of (V)CMP instructions. Use a combination of custom print...
Craig Topper [Mon, 18 Mar 2019 17:59:59 +0000 (17:59 +0000)]
[X86] Remove the _alt forms of (V)CMP instructions. Use a combination of custom printing and custom parsing to achieve the same result and more

Similar to previous change done for VPCOM and VPCMP

Differential Revision: https://reviews.llvm.org/D59468

llvm-svn: 356384

5 years ago[InstCombine] add/adjust test for NaN checks; NFC
Sanjay Patel [Mon, 18 Mar 2019 17:37:05 +0000 (17:37 +0000)]
[InstCombine] add/adjust test for NaN checks; NFC

llvm-svn: 356383

5 years ago[DAG] Cleanup unused node in SimplifySelectCC.
Nirav Dave [Mon, 18 Mar 2019 17:02:38 +0000 (17:02 +0000)]
[DAG] Cleanup unused node in SimplifySelectCC.

Delete the temporarily constructed node used for analysis after its use,
holding onto the original input nodes. Ideally this would be rewritten
without making nodes, but this appears relatively complex.

Reviewers: spatel, RKSimon, craig.topper

Subscribers: jdoerfert, hiraditya, deadalnix, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57921

llvm-svn: 356382

5 years ago[MVT] Fix typos in comment. NFC.
Michael Liao [Mon, 18 Mar 2019 16:57:40 +0000 (16:57 +0000)]
[MVT] Fix typos in comment. NFC.

llvm-svn: 356381

5 years agolld-link: Run conflict-mangled.test on all systems
Nico Weber [Mon, 18 Mar 2019 16:51:23 +0000 (16:51 +0000)]
lld-link: Run conflict-mangled.test on all systems

It seems to pass fine on my Mac, and running it only on Windows made
me miss it in r355959 and required r355959.

When the test was added in r288992 we still used Win-only
UnDecorateSymbolName() for demangling. Now we use LLVM's
microsoftDemangle() which is cross-platform.

Differential Revision: https://reviews.llvm.org/D59497

llvm-svn: 356380

5 years agoSkip TestVSCode_setFunctionBreakpoints on linux
Pavel Labath [Mon, 18 Mar 2019 16:04:53 +0000 (16:04 +0000)]
Skip TestVSCode_setFunctionBreakpoints on linux

Test hangs under heavy load.

llvm-svn: 356379

5 years agoFix some "variable 'foo' set but not used" warnings
Pavel Labath [Mon, 18 Mar 2019 16:04:46 +0000 (16:04 +0000)]
Fix some "variable 'foo' set but not used" warnings

gcc-8 diagnoses these.

llvm-svn: 356378

5 years agoFix libstdc++ data formatters for python3
Pavel Labath [Mon, 18 Mar 2019 15:42:08 +0000 (15:42 +0000)]
Fix libstdc++ data formatters for python3

Use floor-division for consistency across python versions. This fixes a
couple of libstdc++ data formatter tests.
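
A minimal sketch of why floor division matters here. The helper name and
the numbers are hypothetical and are not taken from the formatter sources;
the point is only that `//` yields an integer on both Python 2 and Python 3,
while `/` yields a float on Python 3.

```python
# Hypothetical helper, not part of the libstdc++ data formatters.
def element_count(byte_size, element_size):
    """Number of whole elements that fit in byte_size bytes."""
    # '//' floors on both Python 2 and Python 3. On Python 3, '/' would
    # return a float (24 / 8 == 3.0), which breaks code that needs an
    # integer count or index.
    return byte_size // element_size


count = element_count(24, 8)
```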

llvm-svn: 356377

5 years ago[libc++] Add a test for PR40977
Louis Dionne [Mon, 18 Mar 2019 15:40:49 +0000 (15:40 +0000)]
[libc++] Add a test for PR40977

Even though the header makes the exact same check since https://llvm.org/D59063,
the headers could conceivably change in the future and introduce a bug.

llvm-svn: 356376

5 years ago[ELF] Emit weak-undef symbols in .dynsym of a PIE binary only if linked against share...
Siva Chandra [Mon, 18 Mar 2019 15:32:57 +0000 (15:32 +0000)]
[ELF] Emit weak-undef symbols in .dynsym of a PIE binary only if linked against shared libs.

Reviewers: espindola

Subscribers: emaste, arichardson, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59275

llvm-svn: 356374

5 years ago[AMDGPU] Add an experimental buffer fat pointer address space.
Neil Henning [Mon, 18 Mar 2019 14:44:28 +0000 (14:44 +0000)]
[AMDGPU] Add an experimental buffer fat pointer address space.

Add an experimental buffer fat pointer address space that is currently
unhandled in the backend. This commit reserves address space 7 as a
non-integral pointer representing the 160-bit fat pointer (128-bit buffer
descriptor + 32-bit offset) that is heavily used in graphics workloads
using the AMDGPU backend.

Differential Revision: https://reviews.llvm.org/D58957

llvm-svn: 356373

5 years ago[InstCombine] allow general vector constants for funnel shift to shift transforms
Sanjay Patel [Mon, 18 Mar 2019 14:27:51 +0000 (14:27 +0000)]
[InstCombine] allow general vector constants for funnel shift to shift transforms

Follow-up to:
rL356338
rL356369

We can calculate an arbitrary vector constant minus the bitwidth, so there's
no need to limit this transform to scalars and splats.

llvm-svn: 356372

5 years ago[llvm-objcopy] - Calculate the string table section sizes correctly.
George Rimar [Mon, 18 Mar 2019 14:27:41 +0000 (14:27 +0000)]
[llvm-objcopy] - Calculate the string table section sizes correctly.

This fixes https://bugs.llvm.org/show_bug.cgi?id=40980.

Previously if string optimization occurred as a result of
StringTableBuilder's finalize() method, the size wasn't updated.
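
To illustrate why the size can change during finalization, here is a
simplified Python model of tail merging. It is an assumption-laden sketch,
not StringTableBuilder's algorithm: it assumes a leading NUL byte, one NUL
terminator per stored string, and that a name which is a suffix of another
name reuses that name's bytes. The function names are hypothetical.

```python
def naive_size(names):
    """Table size when every name is stored separately (plus leading NUL)."""
    return 1 + sum(len(name) + 1 for name in names)


def tail_merged_size(names):
    """Table size when a name that is a suffix of another reuses its bytes."""
    unique = set(names)
    kept = [
        name
        for name in unique
        if not any(other != name and other.endswith(name) for other in unique)
    ]
    return 1 + sum(len(name) + 1 for name in kept)


section_names = [".text", ".rela.text", ".data"]
before = naive_size(section_names)
after = tail_merged_size(section_names)
```

In this model ".text" is stored as the tail of ".rela.text", so a size
computed before the optimization would be too large afterwards.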

This hopefully also makes the interaction between sections during finalization
processes a bit more clear.

Differential revision: https://reviews.llvm.org/D59488

llvm-svn: 356371

5 years agoFix TestCommandScriptImmediateOutput for python3
Pavel Labath [Mon, 18 Mar 2019 14:13:12 +0000 (14:13 +0000)]
Fix TestCommandScriptImmediateOutput for python3

s/iteritems/items
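
A short sketch of the substitution, using made-up data: `dict.iteritems()`
exists only on Python 2, whereas `dict.items()` is available on both
versions.

```python
# Hypothetical data; only the iteration idiom matters.
options = {"alpha": 1, "beta": 2}

# options.iteritems() raises AttributeError on Python 3;
# options.items() works on Python 2 and Python 3 alike.
pairs = sorted((key, value) for key, value in options.items())
```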

llvm-svn: 356370

5 years ago[InstCombine] extend rotate-left-by-constant canonicalization to funnel shift
Sanjay Patel [Mon, 18 Mar 2019 14:10:11 +0000 (14:10 +0000)]
[InstCombine] extend rotate-left-by-constant canonicalization to funnel shift

Follow-up to:
rL356338

Rotates are a special case of funnel shift where the 2 input operands
are the same value, but that does not need to be a restriction for the
canonicalization when the shift amount is a constant.
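
The relationship can be checked with a small Python model of 8-bit funnel
shifts. This is a sketch for illustration only, not InstCombine code. The
function names mirror the intrinsics, but the helpers are hypothetical, and
it assumes the shift amount is taken modulo the bit width.

```python
def fshl(x, y, amount, width=8):
    """Funnel shift left: high `width` bits of (x:y) << (amount % width)."""
    mask = (1 << width) - 1
    concat = ((x & mask) << width) | (y & mask)
    return ((concat << (amount % width)) >> width) & mask


def fshr(x, y, amount, width=8):
    """Funnel shift right: low `width` bits of (x:y) >> (amount % width)."""
    mask = (1 << width) - 1
    concat = ((x & mask) << width) | (y & mask)
    return (concat >> (amount % width)) & mask


def rotl(x, amount, width=8):
    """A rotate is a funnel shift whose two inputs are the same value."""
    return fshl(x, x, amount, width)


# For a non-zero constant amount c, shifting right by c selects the same
# bits as shifting left by (width - c), whether or not x equals y.
equivalent = all(
    fshr(x, y, c) == fshl(x, y, 8 - c)
    for x in (0x00, 0x81, 0xA5, 0xFF)
    for y in (0x00, 0x3C, 0x7E, 0xFF)
    for c in range(1, 8)
)
```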

llvm-svn: 356369

5 years ago[SystemZ] Remove icmp undef from reduced tests
Simon Pilgrim [Mon, 18 Mar 2019 13:55:28 +0000 (13:55 +0000)]
[SystemZ] Remove icmp undef from reduced tests

Pre-commit for D59363 (Add icmp UNDEF handling to SelectionDAG::FoldSetCC)

Approved by @uweigand (Ulrich Weigand)

llvm-svn: 356368

5 years ago[InstCombine] add funnel shift tests with arbitrary constants; NFC
Sanjay Patel [Mon, 18 Mar 2019 13:35:51 +0000 (13:35 +0000)]
[InstCombine] add funnel shift tests with arbitrary constants; NFC

llvm-svn: 356367

5 years ago[pp-trace] Delete -ignore and add a new option -callbacks
Fangrui Song [Mon, 18 Mar 2019 13:30:17 +0000 (13:30 +0000)]
[pp-trace] Delete -ignore and add a new option -callbacks

Summary:
-ignore specifies a list of PP callbacks to ignore. It cannot express a
whitelist, which may be more useful than a blacklist.
Add a new option -callbacks to replace it.

-ignore= (default) => -callbacks='*' (default)
-ignore=FileChanged,FileSkipped => -callbacks='*,-FileChanged,-FileSkipped'

-callbacks='Macro*' : print only MacroDefined,MacroExpands,MacroUndefined,...
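
The mapping above can be modelled with a few lines of Python. This is a
sketch of the filter semantics only; pp-trace itself is written in C++.
The function name is hypothetical, and it assumes that when several
patterns match a callback name, the last matching pattern decides.

```python
from fnmatch import fnmatchcase


def enabled_callbacks(spec, names):
    """Return the names selected by a comma-separated glob list.

    A leading '-' marks a negative pattern. Assumption: if several
    patterns match a name, the last one wins.
    """
    patterns = [p.strip() for p in spec.split(",") if p.strip()]
    selected = []
    for name in names:
        enabled = False
        for pattern in patterns:
            negative = pattern.startswith("-")
            glob = pattern[1:] if negative else pattern
            if fnmatchcase(name, glob):
                enabled = not negative
        if enabled:
            selected.append(name)
    return selected


NAMES = [
    "FileChanged",
    "FileSkipped",
    "MacroDefined",
    "MacroExpands",
    "MacroUndefined",
]
everything = enabled_callbacks("*", NAMES)
without_files = enabled_callbacks("*,-FileChanged,-FileSkipped", NAMES)
macros_only = enabled_callbacks("Macro*", NAMES)
```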

Reviewers: juliehockett, aaron.ballman, alexfh, ioeric

Reviewed By: aaron.ballman

Subscribers: nemanjai, kbarton, jsji, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D59296

llvm-svn: 356366

5 years ago[llvm-exegesis] Separate tool options into three categories.
Roman Lebedev [Mon, 18 Mar 2019 11:32:37 +0000 (11:32 +0000)]
[llvm-exegesis] Separate tool options into three categories.

Results in much nicer -help output:
```
$ ./bin/llvm-exegesis -help
USAGE: llvm-exegesis [options]

OPTIONS:

Color Options:

  -color                                         - Use colors in output (default=autodetect)

General options:

  -enable-cse-in-irtranslator                    - Should enable CSE in irtranslator
  -enable-cse-in-legalizer                       - Should enable CSE in Legalizer

Generic Options:

  -help                                          - Display available options (-help-hidden for more)
  -help-list                                     - Display list of available options (-help-list-hidden for more)
  -version                                       - Display the version of this program

llvm-exegesis analysis options:

  -analysis-clustering-epsilon=<number>          - dbscan epsilon for benchmark point clustering
  -analysis-clusters-output-file=<string>        -
  -analysis-display-unstable-clusters            - if there is more than one benchmark for an opcode, said benchmarks may end up not being clustered into the same cluster if the measured performance characteristics are different. by default all such opcodes are filtered out. this flag will instead show only such unstable opcodes
  -analysis-inconsistencies-output-file=<string> -
  -analysis-inconsistency-epsilon=<number>       - epsilon for detection of when the cluster is different from the LLVM schedule profile values
  -analysis-numpoints=<uint>                     - minimum number of points in an analysis cluster

llvm-exegesis benchmark options:

  -ignore-invalid-sched-class                    - ignore instructions that do not define a sched class
  -mode=<value>                                  - the mode to run
    =latency                                     -   Instruction Latency
    =inverse_throughput                          -   Instruction Inverse Throughput
    =uops                                        -   Uop Decomposition
    =analysis                                    -   Analysis
  -num-repetitions=<uint>                        - number of time to repeat the asm snippet
  -opcode-index=<int>                            - opcode to measure, by index
  -opcode-name=<string>                          - comma-separated list of opcodes to measure, by name
  -snippets-file=<string>                        - code snippets to measure

llvm-exegesis options:

  -benchmarks-file=<string>                      - File to read (analysis mode) or write (latency/uops/inverse_throughput modes) benchmark results. “-” uses stdin/stdout.
  -mcpu=<string>                                 - cpu name to use for pfm counters, leave empty to autodetect
```

llvm-svn: 356364

5 years ago[DebugInfo] Ignore bitcasts when lowering stack arg dbg.values
David Stenberg [Mon, 18 Mar 2019 11:27:32 +0000 (11:27 +0000)]
[DebugInfo] Ignore bitcasts when lowering stack arg dbg.values

Summary:
Look past bitcasts when looking for parameter debug values that are
described by frame-index loads in `EmitFuncArgumentDbgValue()`.

In the attached test case we would be left with an undef `DBG_VALUE`
for the parameter without this patch.

A similar fix was done for parameters passed in registers in D13005.

This fixes PR40777.

Reviewers: aprantl, vsk, jmorse

Reviewed By: aprantl

Subscribers: bjope, javed.absar, jdoerfert, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D58831

llvm-svn: 356363

5 years agoFix "type qualifiers ignored on cast result type" warnings
Pavel Labath [Mon, 18 Mar 2019 10:50:46 +0000 (10:50 +0000)]
Fix "type qualifiers ignored on cast result type" warnings

These warnings start to get emitted with gcc-8.

llvm-svn: 356362

5 years agoReinitialize UnwindTable when the SymbolFile changes
Pavel Labath [Mon, 18 Mar 2019 10:45:02 +0000 (10:45 +0000)]
Reinitialize UnwindTable when the SymbolFile changes

Summary:
This is a preparatory step to enable adding of unwind plans by symbol
file plugins.

Although on the surface it seems that currently symbol files have
nothing to do with unwinding, this isn't entirely correct even now. The
mere act of adding a symbol file can have the effect of making more
sections (typically .debug_frame) available to the unwinding machinery,
so that it can have more unwind strategies to choose from.

Up until now, we've had a bug, which went largely unnoticed, where
unwind info in the manually added symbol files (target symbols add) was
being ignored during unwinding. Reinitializing the UnwindTable fixes
that bug too.

Reviewers: clayborg, jasonmolenda, alexshap

Subscribers: jdoerfert, lldb-commits

Differential Revision: https://reviews.llvm.org/D58347

llvm-svn: 356361

5 years ago[AArch64] Fix bug 35094 atomicrmw on Armv8.1-A+lse
Christof Douma [Mon, 18 Mar 2019 09:21:06 +0000 (09:21 +0000)]
[AArch64] Fix bug 35094 atomicrmw on Armv8.1-A+lse

Fixes https://bugs.llvm.org/show_bug.cgi?id=35094

The Dead register definition pass should leave alone the atomicrmw
instructions on AArch64 (LSE extension). The reason is the following
statement in the Arm ARM:

"The ST<OP> instructions, and LD<OP> instructions where the destination
register is WZR or XZR, are not regarded as doing a read for the purpose
of a DMB LD barrier."

A good example was given in the gcc thread by Will Deacon (linked in the
bugzilla ticket 35094):

    P0 (atomic_int* y,atomic_int* x) {
      atomic_store_explicit(x,1,memory_order_relaxed);
      atomic_thread_fence(memory_order_release);
      atomic_store_explicit(y,1,memory_order_relaxed);
    }

    P1 (atomic_int* y,atomic_int* x) {
      atomic_fetch_add_explicit(y,1,memory_order_relaxed);  // STADD
      atomic_thread_fence(memory_order_acquire);
      int r0 = atomic_load_explicit(x,memory_order_relaxed);
    }

    P2 (atomic_int* y) {
      int r1 = atomic_load_explicit(y,memory_order_relaxed);
    }

    My understanding is that it is forbidden for r0 == 0 and r1 == 2 after
    this test has executed. However, if the relaxed add in P1 compiles to
    STADD and the subsequent acquire fence is compiled as DMB LD, then we
    don't have any ordering guarantees in P1 and the forbidden result could
    be observed.

Change-Id: I419f9f9df947716932038e1100c18d10a96408d0
llvm-svn: 356360

5 years ago[X86] Hopefully fix a tautological compare warning in printVecCompareInstr.
Craig Topper [Mon, 18 Mar 2019 07:05:01 +0000 (07:05 +0000)]
[X86] Hopefully fix a tautological compare warning in printVecCompareInstr.

llvm-svn: 356359

5 years ago[RISCV] Add ImmArg to intrinsics
Alex Bradbury [Mon, 18 Mar 2019 06:01:27 +0000 (06:01 +0000)]
[RISCV] Add ImmArg to intrinsics

llvm-svn: 356358

5 years ago[X86] Add ADD8ri_DB and ADD8rr_DB to the autogenerated load folding table.
Craig Topper [Mon, 18 Mar 2019 05:48:19 +0000 (05:48 +0000)]
[X86] Add ADD8ri_DB and ADD8rr_DB to the autogenerated load folding table.

These were added in r355423.

We only use the autogenerated table to assist with the maintenance of the
manual table. These entries are already in the manual table.

llvm-svn: 356357

5 years ago[X86] Make ADD*_DB post-RA pseudos and expand them in expandPostRAPseudo.
Craig Topper [Mon, 18 Mar 2019 05:48:18 +0000 (05:48 +0000)]
[X86] Make ADD*_DB post-RA pseudos and expand them in expandPostRAPseudo.

These are used to help convert OR->LEA when needed to avoid a copy. They
aren't needed after register allocation.

Happens to remove an ugly goto from X86MCCodeEmitter.cpp

llvm-svn: 356356

5 years ago[X86] Add tab character to the custom printing of VPCMP and VPCOM instructions.
Craig Topper [Mon, 18 Mar 2019 02:53:11 +0000 (02:53 +0000)]
[X86] Add tab character to the custom printing of VPCMP and VPCOM instructions.

All the other instructions are printed with a preceding tab.

llvm-svn: 356355

5 years agoAdd testcase from bug 41079
Matt Arsenault [Sun, 17 Mar 2019 23:16:31 +0000 (23:16 +0000)]
Add testcase from bug 41079

llvm-svn: 356354

5 years agoRemove immarg from llvm.expect
Matt Arsenault [Sun, 17 Mar 2019 23:16:18 +0000 (23:16 +0000)]
Remove immarg from llvm.expect

The LangRef claimed this was required to be a constant, but this
appears to be wrong.

Fixes bug 41079.

llvm-svn: 356353

5 years ago[X86] Merge printf32mem/printi32mem into a single printdwordmem. Do the same for...
Craig Topper [Sun, 17 Mar 2019 22:57:21 +0000 (22:57 +0000)]
[X86] Merge printf32mem/printi32mem into a single printdwordmem. Do the same for all other printing functions.

The only thing the print methods currently need to know is the string to print for the memory size in intel syntax.

This patch merges the functions based on this string. If we ever need something else in the future, it's easy to split them back out.

This reduces the number of cases in the assembly printers. It shrinks the intel printer to only use 7 bytes per instruction instead of 8.

llvm-svn: 356352

5 years ago[CodeGen] Defined MVTs v3i32, v3f32, v5i32, v5f32
Tim Renouf [Sun, 17 Mar 2019 22:56:38 +0000 (22:56 +0000)]
[CodeGen] Defined MVTs v3i32, v3f32, v5i32, v5f32

AMDGPU would like to use these MVTs.

Differential Revision: https://reviews.llvm.org/D58901

Change-Id: I6125fea810d7cc62a4b4de3d9904255a1233ae4e
llvm-svn: 356351

5 years ago[CodeGen] Prepare for introduction of v3 and v5 MVTs
Tim Renouf [Sun, 17 Mar 2019 21:43:12 +0000 (21:43 +0000)]
[CodeGen] Prepare for introduction of v3 and v5 MVTs

AMDGPU would like to have MVTs for v3i32, v3f32, v5i32, v5f32. This
commit does not add them, but makes preparatory changes:

* Exclude non-legal non-power-of-2 vector types from ComputeRegisterProp
  mechanism in TargetLoweringBase::getTypeConversion.

* Cope with SETCC and VSELECT for odd-width i1 vector when the other
  vectors are legal type.

Some of this patch is from Matt Arsenault, also of AMD.

Differential Revision: https://reviews.llvm.org/D58899

Change-Id: Ib5f23377dbef511be3a936211a0b9f94e46331f8
llvm-svn: 356350

5 years ago[ARM] Check that CPSR does not have other uses
David Green [Sun, 17 Mar 2019 21:36:15 +0000 (21:36 +0000)]
[ARM] Check that CPSR does not have other uses

Fix up rL356335 by checking that CPSR is not read between
the compare and the branch.

llvm-svn: 356349

5 years agoRegAllocFast: Add hint to debug printing
Matt Arsenault [Sun, 17 Mar 2019 21:31:40 +0000 (21:31 +0000)]
RegAllocFast: Add hint to debug printing

llvm-svn: 356348

5 years agoAMDGPU: Partially fix default device for HSA
Matt Arsenault [Sun, 17 Mar 2019 21:31:35 +0000 (21:31 +0000)]
AMDGPU: Partially fix default device for HSA

There are a few different issues, mostly stemming from using
generation based checks for anything instead of subtarget
features. Stop adding flat-address-space as a feature for HSA, as it
should only be a device property. This was incorrectly allowing flat
instructions to select for SI.

Increase the default generation for HSA to avoid the encoding error
when emitting objects. This has some other side effects from various
checks which probably should be separate subtarget features (in the
cost model and for dealing with the DS offset folding issue).

Partial fix for bug 41070. It should probably be an error to try using
amdhsa without flat support.

llvm-svn: 356347

5 years ago[ConstantRange] Add assertion for KnownBits validity; NFC
Nikita Popov [Sun, 17 Mar 2019 21:25:32 +0000 (21:25 +0000)]
[ConstantRange] Add assertion for KnownBits validity; NFC

Following the suggestion in D59475.

llvm-svn: 356346

5 years ago[ValueTracking] Use ConstantRange overflow check for signed add; NFC
Nikita Popov [Sun, 17 Mar 2019 21:25:26 +0000 (21:25 +0000)]
[ValueTracking] Use ConstantRange overflow check for signed add; NFC

This is the same change as rL356290, but for signed add. It replaces
the existing ripple logic with the overflow logic in ConstantRange.

This is NFC in that it should return NeverOverflow in exactly the
same cases as the previous implementation. However, it does make
computeOverflowForSignedAdd() more powerful by now also determining
AlwaysOverflows conditions. As none of its consumers handle this yet,
this has no impact on optimization. Making use of AlwaysOverflows
in with.overflow folding will be handled as a followup.
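
A rough Python model of the range-based check, to show how a single
comparison of range bounds yields all three answers. It is a sketch under
simplifying assumptions: operands are described by plain inclusive
(low, high) signed ranges that do not wrap, and the function name is
hypothetical. It is not the ConstantRange implementation.

```python
def signed_add_overflow(lhs, rhs, bits=32):
    """Classify lhs + rhs for inclusive (low, high) signed ranges."""
    smin = -(1 << (bits - 1))
    smax = (1 << (bits - 1)) - 1
    low = lhs[0] + rhs[0]    # smallest possible exact sum
    high = lhs[1] + rhs[1]   # largest possible exact sum
    if low > smax or high < smin:
        return "AlwaysOverflows"
    if low >= smin and high <= smax:
        return "NeverOverflows"
    return "MayOverflow"


never = signed_add_overflow((0, 10), (0, 10), bits=8)
always = signed_add_overflow((100, 127), (100, 127), bits=8)
maybe = signed_add_overflow((0, 127), (0, 127), bits=8)
```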

Differential Revision: https://reviews.llvm.org/D59450

llvm-svn: 356345

5 years ago[X86] Remove the _alt forms of AVX512 VPCMP instructions. Use a combination of custom...
Craig Topper [Sun, 17 Mar 2019 21:21:40 +0000 (21:21 +0000)]
[X86] Remove the _alt forms of AVX512 VPCMP instructions. Use a combination of custom printing and custom parsing to achieve the same result and more

Similar to the previous patch for VPCOM.

Differential Revision: https://reviews.llvm.org/D59398

llvm-svn: 356344

5 years ago[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom...
Craig Topper [Sun, 17 Mar 2019 21:21:37 +0000 (21:21 +0000)]
[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom printing and custom parsing to achieve the same result and more

Previously we had a regular form of the instruction used when the immediate was 0-7, and an _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible.

The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b".

The _alt form on the other hand parsed and printed like any other instruction with no specialness.

With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax.
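
The mnemonic rewrite described above can be sketched in Python. This is an
illustration, not the X86 assembly parser. The function name is
hypothetical, and the predicate-to-immediate table and the list of element
suffixes are assumptions based on the XOP VPCOM encoding.

```python
# Assumed XOP VPCOM predicate encodings (3-bit immediate).
CONDITIONS = {
    "lt": 0, "le": 1, "gt": 2, "ge": 3,
    "eq": 4, "neq": 5, "false": 6, "true": 7,
}
# Assumed element-type suffixes (signed and unsigned).
SUFFIXES = {"b", "w", "d", "q", "ub", "uw", "ud", "uq"}


def split_vpcom(mnemonic):
    """Return (base_mnemonic, immediate), or None if no predicate is embedded."""
    prefix = "vpcom"
    if not mnemonic.startswith(prefix):
        return None
    rest = mnemonic[len(prefix):]
    for name, imm in CONDITIONS.items():
        suffix = rest[len(name):]
        if rest.startswith(name) and suffix in SUFFIXES:
            # "vpcomtrueb" becomes "vpcomb" plus the immediate for "true".
            return prefix + suffix, imm
    return None


parsed = split_vpcom("vpcomtrueb")
```

A mnemonic without an embedded predicate, such as "vpcomb", yields None in
this model, which corresponds to the case where the immediate is written
explicitly as an operand.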

I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least, this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd, which have 32 immediate values, 3 encoding flavors, 3 register sizes, etc., this didn't seem to scale well for clang binary size. So custom printing seemed a better trade-off.

I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table, especially given that the 32 immediates for vcmpps have 46 strings associated with them.

Differential Revision: https://reviews.llvm.org/D59398

llvm-svn: 356343