platform/upstream/llvm.git
4 years ago[llvm-dwarfdump][Stats] Fix the License header
Djordje Todorovic [Fri, 7 Feb 2020 17:15:04 +0000 (18:15 +0100)]
[llvm-dwarfdump][Stats] Fix the License header

Fix the added License.

Differential Revision: https://reviews.llvm.org/D74207

4 years ago[GlobalISel][CallLowering] Tighten constantexpr check for callee.
Amara Emerson [Sun, 9 Feb 2020 18:09:51 +0000 (10:09 -0800)]
[GlobalISel][CallLowering] Tighten constantexpr check for callee.

I'm not sure there's a test case for this, but it's better to be safe.

4 years ago[Attributor] Allow PHI nodes in AAValueConstantRangeFloating
Johannes Doerfert [Mon, 10 Feb 2020 02:14:35 +0000 (20:14 -0600)]
[Attributor] Allow PHI nodes in AAValueConstantRangeFloating

Traversing PHI nodes is natural with the genericValueTraversal but also
a bit tricky. The problem is similar to the ones we have seen in AAAlign
and AADereferenceable, namely that we continue to increase the range in
each iteration. We use a pessimistic approach here to stop the
iterations. Nevertheless, optimistic information can now be propagated
through a PHI node.
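
The widening problem described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the Attributor's code: a range is an inclusive `(lo, hi)` pair over i8 values, the loop-carried value is `i = phi(init, i + 1)`, and the names `join`, `add_one`, `phi_range`, and the `max_steps` budget are all hypothetical.

```python
# Hypothetical model: a range is an inclusive (lo, hi) pair over i8 values.
FULL = (-128, 127)

def join(a, b):
    """Smallest range covering both inputs (what a PHI has to assume)."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def add_one(r):
    """Range of x + 1, giving up to FULL if the add could wrap."""
    if r[1] >= FULL[1]:
        return FULL
    return (r[0] + 1, r[1] + 1)

def phi_range(init, max_steps):
    """Iterate i = phi(init, i + 1) until stable or the budget runs out."""
    current = init
    for _ in range(max_steps):
        widened = join(init, add_one(current))
        if widened == current:
            return current      # fixpoint reached
        current = widened       # the range grew by one again
    return FULL                 # pessimistic fallback: stop iterating

print(phi_range((0, 0), 4))
```

Each step only widens the range by one, so without a cut-off the iteration would crawl towards the full range. A PHI that is not part of a cycle still gets a useful result from `join` alone.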

4 years ago[Attributor][FIX] Remove FIXME that seems outdated
Johannes Doerfert [Mon, 10 Feb 2020 02:21:56 +0000 (20:21 -0600)]
[Attributor][FIX] Remove FIXME that seems outdated

The change is performed as stated by the FIXME and the tests are
adjusted. All changes look fine to me and values can be inferred as
undef without it being an error.

4 years ago[Attributor] Allow SelectInst in AAValueConstantRangeFloating
Johannes Doerfert [Mon, 10 Feb 2020 01:08:04 +0000 (19:08 -0600)]
[Attributor] Allow SelectInst in AAValueConstantRangeFloating

The genericValueTraversal will already handle SelectInst properly and we
just needed to allow them in the initialize method.

4 years ago[Attributor] Look through (some) casts in AAValueConstantRangeFloating
Johannes Doerfert [Mon, 10 Feb 2020 01:07:30 +0000 (19:07 -0600)]
[Attributor] Look through (some) casts in AAValueConstantRangeFloating

Casts can be handled natively by the ConstantRange class. We do limit it
to extends for now as we assume an integer type in different locations.
A TODO and a test case with a FIXME were added to remove that restriction
in the future.

4 years ago[Attributor][FIX] Call right base method in AAValueConstantRangeFloating
Johannes Doerfert [Mon, 10 Feb 2020 01:05:15 +0000 (19:05 -0600)]
[Attributor][FIX] Call right base method in AAValueConstantRangeFloating

We now call the base class method as we should.

4 years ago[X86] Autogenerate complete checks. NFC
Craig Topper [Mon, 10 Feb 2020 06:31:30 +0000 (22:31 -0800)]
[X86] Autogenerate complete checks. NFC

4 years ago[Attributor][Tests][NFC] Add more range tests
Johannes Doerfert [Mon, 10 Feb 2020 01:06:09 +0000 (19:06 -0600)]
[Attributor][Tests][NFC] Add more range tests

Inspired by https://llvm.discourse.group/t/impossible-condition-optimization/461

4 years ago[Attributor][NFC] Use existing constant instead of magic one
Johannes Doerfert [Sun, 26 Jan 2020 02:16:31 +0000 (20:16 -0600)]
[Attributor][NFC] Use existing constant instead of magic one

4 years ago[X86] Make (insert_vector_elt (v8i16 zerovec), i16 %x, 0) generate the same code...
Craig Topper [Mon, 10 Feb 2020 05:48:00 +0000 (21:48 -0800)]
[X86] Make (insert_vector_elt (v8i16 zerovec), i16 %x, 0) generate the same code as (v8i16 (build_vector %x, 0, 0, 0, 0, 0, 0, 0)).

Instead of using a pinsrw to element 0, use movzx and movd.

Same for v16i8.

4 years agoFix `-Wparentheses` warning. NFC.
Michael Liao [Mon, 10 Feb 2020 05:41:46 +0000 (00:41 -0500)]
Fix `-Wparentheses` warning. NFC.

4 years ago[clang][codegen] Fix another lifetime emission on alloca on non-default address space.
Michael Liao [Sun, 9 Feb 2020 18:09:19 +0000 (13:09 -0500)]
[clang][codegen] Fix another lifetime emission on alloca on non-default address space.

- Lifetime intrinsics expect the pointer directly from alloca. Need
  extra handling for targets with alloca on non-default (or non-zero)
  address space.

4 years ago[X86] Autogenerate complete checks. NFC
Craig Topper [Mon, 10 Feb 2020 04:31:56 +0000 (20:31 -0800)]
[X86] Autogenerate complete checks. NFC

4 years ago[X86] Use MOVZX instead of MOVSX in f16_to_fp isel patterns.
Craig Topper [Mon, 10 Feb 2020 02:35:57 +0000 (18:35 -0800)]
[X86] Use MOVZX instead of MOVSX in f16_to_fp isel patterns.

Using sign extend forces the adjacent element to either all zeros
or all ones. But all ones is a NAN. So that doesn't seem like a
great idea.

Trying to work on supporting this with strict FP where NAN would
definitely be bad.
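
The NaN concern can be checked with a few lines of Python. This is an illustrative sketch, not the isel pattern itself: the helper names are hypothetical, and it assumes the widened 32-bit value is viewed as two adjacent f16 lanes, with the extension bits landing in the upper lane.

```python
import math
import struct

def half_from_bits(bits):
    """Interpret a 16-bit pattern as an IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

def extend_16_to_32(bits, signed):
    """Widen a 16-bit pattern to 32 bits by sign or zero extension."""
    if signed and bits & 0x8000:
        return bits | 0xFFFF0000
    return bits

def adjacent_element(bits, signed):
    """Upper 16 bits of the widened value, read as the neighbouring f16 lane."""
    return half_from_bits(extend_16_to_32(bits, signed) >> 16)

# 0xC000 encodes -2.0 as a half, so its sign bit is set.
print(math.isnan(adjacent_element(0xC000, signed=True)))
print(adjacent_element(0xC000, signed=False))
```

With sign extension a negative input fills the neighbouring lane with `0xFFFF`, which has an all-ones exponent and a non-zero mantissa and therefore decodes as NaN. Zero extension always leaves that lane as `0.0`.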

4 years ago[RISCV] Fix incorrect FP base CFI offset for variable argument functions
Shiva Chen [Mon, 3 Feb 2020 05:52:13 +0000 (13:52 +0800)]
[RISCV] Fix incorrect FP base CFI offset for variable argument functions

When the FP exists, the FP base CFI directive offset should take the size of variable arguments into account.

Differential Revision: https://reviews.llvm.org/D73862

4 years ago[DebugInfo] Add a DWARFDataExtractor constructor that takes ArrayRef<uint8_t>
Fangrui Song [Mon, 10 Feb 2020 01:28:20 +0000 (17:28 -0800)]
[DebugInfo] Add a DWARFDataExtractor constructor that takes ArrayRef<uint8_t>

Similar to D67797 (DataExtractor).

4 years agoGlobalISel: Fix narrowScalar for G_{CTLZ|CTTZ}_ZERO_UNDEF
Matt Arsenault [Fri, 7 Feb 2020 17:24:15 +0000 (12:24 -0500)]
GlobalISel: Fix narrowScalar for G_{CTLZ|CTTZ}_ZERO_UNDEF

Narrow these for 64-bit VALU for AMDGPU.

4 years agoAMDGPU/GlobalISel: Split 64-bit G_CTPOP in RegBankSelect
Matt Arsenault [Sun, 26 Jan 2020 02:10:17 +0000 (21:10 -0500)]
AMDGPU/GlobalISel: Split 64-bit G_CTPOP in RegBankSelect

4 years agoGlobalISel: Fix narrowing of G_CTLZ/G_CTTZ
Matt Arsenault [Fri, 7 Feb 2020 16:55:39 +0000 (11:55 -0500)]
GlobalISel: Fix narrowing of G_CTLZ/G_CTTZ

The result type is separate from the source type.
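
For readers unfamiliar with narrowing, the usual way to express a 64-bit count in terms of 32-bit halves looks roughly like the Python sketch below. It is an assumption about the general technique, not a transcription of the GlobalISel code, and all function names are hypothetical. Note that the count never exceeds 64, which is one reason the result can have a different (narrower) type than the source.

```python
def ctlz(x, width):
    """Leading zeros of x viewed as a width-bit value (width when x == 0)."""
    return width - x.bit_length()

def cttz(x, width):
    """Trailing zeros of x viewed as a width-bit value (width when x == 0)."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1

def ctlz64_via_32(x):
    """64-bit ctlz built from two 32-bit counts."""
    hi, lo = x >> 32, x & 0xFFFFFFFF
    return ctlz(hi, 32) if hi != 0 else 32 + ctlz(lo, 32)

def cttz64_via_32(x):
    """64-bit cttz built from two 32-bit counts."""
    hi, lo = x >> 32, x & 0xFFFFFFFF
    return cttz(lo, 32) if lo != 0 else 32 + cttz(hi, 32)

print(ctlz64_via_32(1 << 40), cttz64_via_32(1 << 40))
```

The inputs are assumed to be unsigned values below 2**64.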

4 years agoAMDGPU/GlobalISel: Don't mis-select vector index on a constant
Matt Arsenault [Thu, 6 Feb 2020 22:18:17 +0000 (17:18 -0500)]
AMDGPU/GlobalISel: Don't mis-select vector index on a constant

Vector indexing with a constant index should be folded out in the
legalizer, but this was accidentally falling through. This would
produce the indexing operation with $noreg. Handle this case as a
dynamic index just in case a bug like this happens again in the
future.

4 years agoAMDGPU/GlobalISel: Look through casts when legalizing vector indexing
Matt Arsenault [Thu, 6 Feb 2020 21:52:04 +0000 (16:52 -0500)]
AMDGPU/GlobalISel: Look through casts when legalizing vector indexing

We were failing to find constants that were cast. I feel like the
artifact combiner should have folded the constant in the trunc before
the custom lowering, but that doesn't happen.

4 years agoAMDGPU: Remove dead kill handling
Matt Arsenault [Mon, 6 Jan 2020 20:57:51 +0000 (15:57 -0500)]
AMDGPU: Remove dead kill handling

At one point a custom node was used for kill handling, but now the
intrinsic is directly selected. Remove leftover pattern machinery.

4 years agoAMDGPU: Fix SI_IF lowering when the save exec reg has terminator uses
Matt Arsenault [Fri, 27 Dec 2019 18:11:06 +0000 (13:11 -0500)]
AMDGPU: Fix SI_IF lowering when the save exec reg has terminator uses

Reverts part of 6524a7a2b9ca072bd7f7b4355d1230e70c679d2f. Since that
commit, the expansion was ignoring the actual save exec register
produced by the instruction, and looking at other instructions. I do
not understand why it was looking at other instructions, but relying
on this scan was wrong.

Fixes verifier errors after SI_IF is tail duplicated, which should be
correct to do. The results were fed into a phi, which was lowered to
the S_MOV_B64_term instructions.

4 years ago[X86] combineConcatVectorOps - combine VROTLI/VROTRI ops
Simon Pilgrim [Sun, 9 Feb 2020 21:49:37 +0000 (21:49 +0000)]
[X86] combineConcatVectorOps - combine VROTLI/VROTRI ops

Fix issue mentioned on rGe82e17d4d4ca - non-AVX512BW targets failed to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).

4 years ago[X86] Use custom isel for (X86sbb_flag 0, 0) so we can use 32-bit SBB for i8/i16.
Craig Topper [Sun, 9 Feb 2020 21:08:25 +0000 (13:08 -0800)]
[X86] Use custom isel for (X86sbb_flag 0, 0) so we can use 32-bit SBB for i8/i16.

We were using MOV32r0 and an extract_subreg as an input. By using
custom isel we can move the extract_subreg to after the SBB instead
of on the input.
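
Why the subregister extract can move to after the SBB is easy to see numerically. The following Python sketch models subtract-with-borrow as plain modular arithmetic; the helper names are hypothetical and the model ignores the flags the real instruction also writes.

```python
def sbb(a, b, carry, width):
    """Subtract with borrow: (a - b - carry) truncated to `width` bits."""
    return (a - b - carry) & ((1 << width) - 1)

def truncate(value, width):
    """Keep the low `width` bits, like taking a subregister."""
    return value & ((1 << width) - 1)

for carry in (0, 1):
    wide = sbb(0, 0, carry, 32)
    print(carry, hex(wide), hex(truncate(wide, 8)), hex(truncate(wide, 16)))
```

With both operands zero the result is all zeros or all ones depending on the carry, so truncating the 32-bit result gives the same value as doing the subtraction at 8 or 16 bits.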

4 years ago[X86] Add flag result VT to a MOV32r0 created in X86DAGToDAGISel::Select
Craig Topper [Sun, 9 Feb 2020 20:31:21 +0000 (12:31 -0800)]
[X86] Add flag result VT to a MOV32r0 created in X86DAGToDAGISel::Select

The flag isn't used, but I believe this matches the MOV32r0 that
would be created by the table emitter. This should allow this node
to be CSEed with any others created by the table.

4 years ago[X86] Add lowerShuffleAsBitRotate (PR44379)
Simon Pilgrim [Sun, 9 Feb 2020 21:15:03 +0000 (21:15 +0000)]
[X86] Add lowerShuffleAsBitRotate (PR44379)

As noted on PR44379, we didn't attempt to lower vector shuffles using bit rotations on XOP/AVX512F targets.

This patch lowers to uniform ISD::ROTL nodes - ROTR isn't supported by XOP and they are interchangeable for constant values anyway.

There might be cases where targets without ISD::ROTL support would benefit from this (expanding to SRL+SHL+OR), which I'll investigate in a future patch.

Also, non-AVX512BW targets fail to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).
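
As a rough illustration of the idea (not the X86 lowering code), the Python sketch below shows a byte shuffle that swaps the two bytes of every 16-bit element and checks that it matches a rotate-left by 8 of each element. It also shows why ROTL and ROTR are interchangeable for constant amounts. The helper names and the little-endian byte layout are assumptions made for the example.

```python
def rotl(x, k, width):
    """Rotate a width-bit value left by k bits."""
    k %= width
    mask = (1 << width) - 1
    return ((x << k) | (x >> (width - k))) & mask

def rotr(x, k, width):
    """Rotate a width-bit value right by k bits."""
    k %= width
    mask = (1 << width) - 1
    return ((x >> k) | (x << (width - k))) & mask

def shuffle(elements, mask):
    """Pick elements by index, like a single-input vector shuffle."""
    return [elements[i] for i in mask]

def bytes_to_i16(bs):
    """Group little-endian byte pairs into 16-bit elements."""
    return [bs[i] | (bs[i + 1] << 8) for i in range(0, len(bs), 2)]

data = [0x11, 0x22, 0x33, 0x44]
swapped = shuffle(data, [1, 0, 3, 2])          # swap bytes within each i16
rotated = [rotl(v, 8, 16) for v in bytes_to_i16(data)]
print(bytes_to_i16(swapped) == rotated)
```

A left rotate by `k` equals a right rotate by `width - k`, so a target that only offers one direction can still cover every constant amount.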

4 years ago[X86] Use MVT::i32 for the type of a MOV32r0 created in X86DAGToDAGISel::Select.
Craig Topper [Sun, 9 Feb 2020 19:57:22 +0000 (11:57 -0800)]
[X86] Use MVT::i32 for the type of a MOV32r0 created in X86DAGToDAGISel::Select.

Not sure if this really matters. The VT isn't really used after
this point. At best it might affect CSE.

4 years ago[X86] Remove isel patterns that include a vselect/X86selects and a strict FP node.
Craig Topper [Sun, 9 Feb 2020 05:35:21 +0000 (21:35 -0800)]
[X86] Remove isel patterns that include a vselect/X86selects and a strict FP node.

A vselect+strictfp node is not equivalent to a masked operation.
The exceptions of the strictfp node are not masked by a vselect
after it so we can't match it to a masked operation.

We already had a hack in IsLegalToFold to prevent these patterns from
matching. This patch removes that hack and removes the patterns.
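
The difference between the two forms can be modelled in Python. This is a simplified sketch, not X86 semantics: it tracks a single "divide by zero" exception flag, treats every zero divisor the same way, and uses hypothetical function names.

```python
def strict_fdiv_then_select(mask, a, b, passthru):
    """Unmasked strict divide on every lane, then a select picks lanes."""
    raised = False
    out = []
    for m, x, y, p in zip(mask, a, b, passthru):
        if y == 0.0:
            raised = True               # the divide runs on this lane regardless
            q = float("inf")            # simplification: ignore sign and 0/0
        else:
            q = x / y
        out.append(q if m else p)       # the select only chooses the value
    return out, raised

def masked_fdiv(mask, a, b, passthru):
    """Masked divide: inactive lanes are never computed."""
    raised = False
    out = []
    for m, x, y, p in zip(mask, a, b, passthru):
        if not m:
            out.append(p)
            continue
        if y == 0.0:
            raised = True
            q = float("inf")
        else:
            q = x / y
        out.append(q)
    return out, raised

args = ([True, False], [1.0, 1.0], [2.0, 0.0], [9.0, 9.0])
print(strict_fdiv_then_select(*args))
print(masked_fdiv(*args))
```

Both versions produce the same lane values, but only the unmasked one reports an exception for the lane that the select later discards.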

4 years agolibclc/r600: Use target specific builtins to implement rsqrt and native_rsqrt
Jan Vesely [Wed, 5 Feb 2020 01:14:04 +0000 (20:14 -0500)]
libclc/r600: Use target specific builtins to implement rsqrt and native_rsqrt

Fixes OCL CTS rsqrt and half_rsqrt (1 thread, scalar) tests on AMD Turks.

Reviewer: awatry
Differential Revision: https://reviews.llvm.org/D74016

4 years agolibclc: Move rsqrt implementation to a .cl file
Jan Vesely [Wed, 5 Feb 2020 01:09:12 +0000 (20:09 -0500)]
libclc: Move rsqrt implementation to a .cl file

Reviewer: awatry
Differential Revision: https://reviews.llvm.org/D74013

4 years ago[X86][XOP] Add XOP target to vXi16/vXi8 shuffle tests
Simon Pilgrim [Sun, 9 Feb 2020 18:35:02 +0000 (18:35 +0000)]
[X86][XOP] Add XOP target to vXi16/vXi8 shuffle tests

Helps with bit rotation test coverage for PR44379

4 years ago[X86][SSE] Add more tests showing failure to lower shuffles as bit rotations
Simon Pilgrim [Sun, 9 Feb 2020 17:51:53 +0000 (17:51 +0000)]
[X86][SSE] Add more tests showing failure to lower shuffles as bit rotations

4 years ago[X86] Rename matchShuffleAsRotate - matchShuffleAsByteRotate. NFCI.
Simon Pilgrim [Sun, 9 Feb 2020 14:23:19 +0000 (14:23 +0000)]
[X86] Rename matchShuffleAsRotate - matchShuffleAsByteRotate. NFCI.

A matchShuffleAsBitRotate variant will be added soon and we need to make the difference more obvious.

4 years ago[lldb] [doc] Status: Linux: Update the paragraph
Jan Kratochvil [Sun, 9 Feb 2020 17:13:04 +0000 (18:13 +0100)]
[lldb] [doc] Status: Linux: Update the paragraph

4 years ago[LLDB] [doc] Document NetBSD status and sort OSs alphabetically
Kamil Rytarowski [Sun, 9 Feb 2020 17:02:07 +0000 (18:02 +0100)]
[LLDB] [doc] Document NetBSD status and sort OSs alphabetically

4 years ago[gn build] Port a17f03bd939
LLVM GN Syncbot [Sun, 9 Feb 2020 15:41:05 +0000 (15:41 +0000)]
[gn build] Port a17f03bd939

4 years ago[VectorCombine] new IR transform pass for partial vector ops
Sanjay Patel [Sun, 9 Feb 2020 15:04:41 +0000 (10:04 -0500)]
[VectorCombine] new IR transform pass for partial vector ops

We have several bug reports that could be characterized as "reducing scalarization",
and this topic was also raised on llvm-dev recently:
http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html
...so I'm proposing that we deal with these patterns in a new, lightweight IR vector
pass that runs before/after other vectorization passes.

There are 4 alternative options that I can think of to deal with this kind of problem
(and we've seen various attempts at all of these), but they all have flaws:

    InstCombine - can't happen without TTI, but we don't want target-specific
                  folds there.
    SDAG - too late to assist other vectorization passes; TLI is not equipped
           for these kinds of cost queries; limited to a single basic block.
    CGP - too late to assist other vectorization passes; would need to re-implement
          basic cleanups like CSE/instcombine.
    SLP - doesn't fit with existing transforms; limited to a single basic block.

This initial patch/transform is based on existing code in AggressiveInstCombine:
we walk backwards through the function looking for a pattern match. But we diverge
from that cost-independent IR canonicalization pass by using TTI to decide if the
vector alternative is profitable.

We probably have at least 10 similar bug reports/patterns (binops, constants,
inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements.
It's possible that we could iterate on a worklist to fix-point like InstCombine does,
but it's safer to start with a most basic case and evolve from there, so I didn't
try to do anything fancy with this initial implementation.

Differential Revision: https://reviews.llvm.org/D73480

4 years ago[lldb] [doc] Status: Debugserver (remote debugging) is OK now
Jan Kratochvil [Sun, 9 Feb 2020 14:22:36 +0000 (15:22 +0100)]
[lldb] [doc] Status: Debugserver (remote debugging) is OK now

4 years ago[lldb] [doc] Testing: Fix typos
Jan Kratochvil [Sun, 9 Feb 2020 14:11:38 +0000 (15:11 +0100)]
[lldb] [doc] Testing: Fix typos

4 years ago[LLDB] [doc] Remove note about libpanel(3) and NetBSD
Kamil Rytarowski [Sun, 9 Feb 2020 13:59:04 +0000 (14:59 +0100)]
[LLDB] [doc] Remove note about libpanel(3) and NetBSD

libpanel(3) is now supported in all supported versions of NetBSD.

4 years ago[LLDB] [doc] Update the current status of pkgsrc (NetBSD) building
Kamil Rytarowski [Sun, 9 Feb 2020 13:57:09 +0000 (14:57 +0100)]
[LLDB] [doc] Update the current status of pkgsrc (NetBSD) building

4 years ago[lldb] [testsuite] TestGdbRemoteLibrariesSvr4Support: Fix symlinked builddir
Jan Kratochvil [Sun, 9 Feb 2020 13:49:38 +0000 (14:49 +0100)]
[lldb] [testsuite] TestGdbRemoteLibrariesSvr4Support: Fix symlinked builddir

When I have symlinked builddir on Fedora 31 x86_64 I get:

FAIL: test_libraries_svr4_libs_present (TestGdbRemoteLibrariesSvr4Support.TestGdbRemoteLibrariesSvr4Support)
----------------------------------------------------------------------
...
  File "lldb/packages/Python/lldbsuite/test/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.py", line 106, in libraries_svr4_libs_present
    self.assertIn(self.getBuildDir() + "/" + lib, libraries_svr4_names)
AssertionError:
'/home/jkratoch/redhat/llvm-monorepo-clangassertsymlink/lldb-test-build.noindex/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.test_libraries_svr4_libs_present/libsvr4lib_a.so' not found in ['/home/jkratoch/redhat/llvm-monorepo/lldb/packages/Python/lldbsuite/test/tools/lldb-server/libraries-svr4/linux-vdso.so.1', '/quad/home/jkratoch/redhat/llvm-monorepo-clangassertsymlink/lldb-test-build.noindex/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.test_libraries_svr4_libs_present/libsvr4lib_a.so', '/quad/home/jkratoch/redhat/llvm-monorepo-clangassertsymlink/lldb-test-build.noindex/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.test_libraries_svr4_libs_present/libsvr4lib_b".so', '/usr/lib64/libdl-2.30.so', '/usr/lib64/libstdc++.so.6.0.27', '/usr/lib64/libm-2.30.so', '/usr/lib64/libgcc_s-9-20190827.so.1', '/usr/lib64/libc-2.30.so', '/usr/lib64/ld-2.30.so']
Config=x86_64-/quad/home/jkratoch/redhat/llvm-monorepo-clangassertsymlink/bin/clang-11
----------------------------------------------------------------------

Differential Revision: https://reviews.llvm.org/D74295

4 years agoFix signed/unsigned warning.
Simon Pilgrim [Sun, 9 Feb 2020 13:35:03 +0000 (13:35 +0000)]
Fix signed/unsigned warning.

4 years ago[X86] Recognise ROTLI/ROTRI rotations as faux shuffles
Simon Pilgrim [Sun, 9 Feb 2020 12:25:19 +0000 (12:25 +0000)]
[X86] Recognise ROTLI/ROTRI rotations as faux shuffles

Allows us to combine rotations with shuffles.

One of many things necessary to fix PR44379 (lowering shuffles to rotations)

4 years ago[LoopExtractor] Convert LoopExtractor from LoopPass to ModulePass
Ehud Katz [Sun, 9 Feb 2020 10:25:21 +0000 (12:25 +0200)]
[LoopExtractor] Convert LoopExtractor from LoopPass to ModulePass

The LoopExtractor created new functions (by definition), which violates
the restrictions of a LoopPass.
The correct implementation of this pass should be as a ModulePass.
Includes reverting rL82990 implications on the LoopExtractor.

Fixes PR3082 and PR8929.

Differential Revision: https://reviews.llvm.org/D69069

4 years ago[AggressiveInstCombine] Add test with baseline CHECKs for aggressive inst combine...
Ayman Musa [Tue, 28 Jan 2020 14:31:44 +0000 (16:31 +0200)]
[AggressiveInstCombine] Add test with baseline CHECKs for aggressive inst combine for SELECT.

4 years agoSupport -fstack-clash-protection for x86
serge_sans_paille [Mon, 9 Sep 2019 14:59:34 +0000 (16:59 +0200)]
Support -fstack-clash-protection for x86

Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This is a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
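
The chunked allocation described in this message can be sketched in Python as follows. It is a simplified model, not the frame-lowering code: the 4096-byte page size, the function name, and the way the final partial chunk is reported are all assumptions for illustration.

```python
PAGE_SIZE = 4096  # assumed page size for the example

def probe_offsets(alloc_size, page_size=PAGE_SIZE):
    """Split an allocation into page-sized chunks, probing after each one.

    Returns the offsets (in bytes below the original stack pointer) that get
    touched, plus the size of the remaining tail smaller than one page.
    """
    offsets = []
    allocated = 0
    while alloc_size - allocated >= page_size:
        allocated += page_size      # allocate one chunk ...
        offsets.append(allocated)   # ... and touch it before going further
    return offsets, alloc_size - allocated

print(probe_offsets(10000))
```

Because consecutive probes are never more than one page apart, a single large allocation cannot step over a guard page without touching it.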

4 years agoRevert "Support -fstack-clash-protection for x86"
serge-sans-paille [Sun, 9 Feb 2020 09:06:31 +0000 (10:06 +0100)]
Revert "Support -fstack-clash-protection for x86"

This reverts commit 0fd51a4554f5f4f90342f40afd35b077f6d88213.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354

4 years agoSupport -fstack-clash-protection for x86
serge_sans_paille [Mon, 9 Sep 2019 14:59:34 +0000 (16:59 +0200)]
Support -fstack-clash-protection for x86

Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This is a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720

4 years ago[ELF][test] Use llvm-readelf -l instead of llvm-readobj -l for some memory region...
Fangrui Song [Sun, 9 Feb 2020 06:39:09 +0000 (22:39 -0800)]
[ELF][test] Use llvm-readelf -l instead of llvm-readobj -l for some memory region tests

4 years ago[mlir][GPUToSPIRV] Modify the lowering of gpu.block_dim to be consistent with Vulkan...
MaheshRavishankar [Sun, 9 Feb 2020 02:23:09 +0000 (18:23 -0800)]
[mlir][GPUToSPIRV] Modify the lowering of gpu.block_dim to be consistent with Vulkan SPEC

The existing lowering of gpu.block_dim added a global variable with
the WorkGroupSize decoration. This raises an error within
Vulkan/SPIR-V validation since Vulkan requires this to have a constant
initializer. This is not yet supported in SPIR-V dialect. Changing the
lowering to return the workgroup size as a constant value instead,
obtained from spv.entry_point_abi attribute gets around the issue for
now. The validation goes through since the workgroup size is specified
using spv.execution_mode operation.

4 years ago[X86] Add more scalar intrinsic instructions to isNonFoldablePartialRegisterLoad.
Craig Topper [Sun, 9 Feb 2020 02:56:17 +0000 (18:56 -0800)]
[X86] Add more scalar intrinsic instructions to isNonFoldablePartialRegisterLoad.

I think this covers most if not all of the scalar intrinsic
instructions.

4 years ago[Attributor] Add an Attributor CGSCC pass and run it
Johannes Doerfert [Wed, 27 Nov 2019 06:30:12 +0000 (00:30 -0600)]
[Attributor] Add an Attributor CGSCC pass and run it

In addition to the module pass, this patch introduces a CGSCC pass that
runs the Attributor on a strongly connected component of the call graph
(both old and new PM). The Attributor was always designed to be used on a
subset of functions which makes this patch mostly mechanical.

The one change is that we give up `norecurse` deduction in the module
pass in favor of doing it during the CGSCC pass. This makes the
interfaces simpler but can be revisited if needed.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D70767

4 years agoFix -Wunused-lambda-capture for -DLLVM_ENABLE_ASSERTIONS=off builds after 6556c615f3c...
Fangrui Song [Sun, 9 Feb 2020 03:01:17 +0000 (19:01 -0800)]
Fix -Wunused-lambda-capture for -DLLVM_ENABLE_ASSERTIONS=off builds after 6556c615f3c3aae8af876806777065961ae20024

4 years ago[FIX] Ordering problem accidentally introduced with D72304
Johannes Doerfert [Sun, 9 Feb 2020 02:14:01 +0000 (20:14 -0600)]
[FIX] Ordering problem accidentally introduced with D72304

4 years ago[X86] Add the recently added (V)CVTSS2SI/CVTSD2SI instructions used for LRINT/LLRINT...
Craig Topper [Sun, 9 Feb 2020 01:46:59 +0000 (17:46 -0800)]
[X86] Add the recently added (V)CVTSS2SI/CVTSD2SI instructions used for LRINT/LLRINT to the load folding tables.

4 years ago[FIX] Fix warning in LazyCallGraphTest caused by D70927
Johannes Doerfert [Sun, 9 Feb 2020 00:58:16 +0000 (18:58 -0600)]
[FIX] Fix warning in LazyCallGraphTest caused by D70927

4 years ago[OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder.
fady [Sun, 9 Feb 2020 00:54:08 +0000 (18:54 -0600)]
[OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder.

Add support for the Master and Critical directives in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion`, which is also added in this patch.

This patch also modifies clang to use the new directives when the `-fopenmp-enable-irbuilder` command-line option is passed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D72304

4 years ago[OpenMP][Opt] Delete terminating and read-only parallel regions
Johannes Doerfert [Sun, 9 Feb 2020 00:42:24 +0000 (18:42 -0600)]
[OpenMP][Opt] Delete terminating and read-only parallel regions

Parallel regions known to be read-only, e.g., after we removed all dead
write accesses, and terminating (`willreturn`) can be removed.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69954

4 years ago[OpenMP][Opt] Annotate known runtime functions and deduplicate more
Johannes Doerfert [Sun, 9 Feb 2020 00:03:40 +0000 (18:03 -0600)]
[OpenMP][Opt] Annotate known runtime functions and deduplicate more

This adds ~27 more runtime calls to the OpenMPKinds.def file, all with
attributes. We deduplicate 16 of those automatically in function =
thread scope. And we annotate all of them automatically during the
OpenMPOpt discovery step. A test with all omp_XXXX runtime calls to
track annotation coverage is included.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69984

4 years ago[gn build] (manually) port 72277ecd62e and the LLVMBuild bit of 9548b74a83
Nico Weber [Sat, 8 Feb 2020 23:50:51 +0000 (18:50 -0500)]
[gn build] (manually) port 72277ecd62e and the LLVMBuild bit of 9548b74a83

4 years ago[X86] Use any_fadd/sub/mul/div/sqrt with the AVX512 scalar_*_patterns.
Craig Topper [Sat, 8 Feb 2020 23:52:57 +0000 (15:52 -0800)]
[X86] Use any_fadd/sub/mul/div/sqrt with the AVX512 scalar_*_patterns.

Making sure not to use them with patterns for masked instructions.

Also fix FMA patterns that were matching strict_fma+x86selects to
masked instructions.

4 years ago[mlir][DeclarativeParser] Move several missed parsers over to the declarative form.
River Riddle [Sat, 8 Feb 2020 23:46:02 +0000 (15:46 -0800)]
[mlir][DeclarativeParser] Move several missed parsers over to the declarative form.

Differential Revision: https://reviews.llvm.org/D74283

4 years ago[mlir][DeclarativeParser] Add support for attributes with buildable types.
River Riddle [Sat, 8 Feb 2020 18:01:17 +0000 (10:01 -0800)]
[mlir][DeclarativeParser] Add support for attributes with buildable types.

This revision adds support in the declarative assembly form for printing attributes with buildable types without the type, and moves several more parsers over to the declarative form.

Differential Revision: https://reviews.llvm.org/D74276

4 years ago[mlir][quantizer] Add gathering of per-axis statistics in quantizer.
Dmitry Murygin [Sat, 8 Feb 2020 23:07:35 +0000 (15:07 -0800)]
[mlir][quantizer] Add gathering of per-axis statistics in quantizer.

Reviewers: stellaraccident, nicolasvasilache

Reviewed By: stellaraccident

Subscribers: Joonsoo, merge_guards_bot, denis13

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73556

4 years ago[mlir] Add support for generating debug locations from intermediate levels of the IR.
River Riddle [Sat, 8 Feb 2020 23:01:34 +0000 (15:01 -0800)]
[mlir] Add support for generating debug locations from intermediate levels of the IR.

Summary:
This revision adds a utility to generate debug locations from the IR during compilation, by snapshotting to an output stream and using the locations that operations were dumped in that stream. The new locations may either:
* Replace the original location of the operation.

old:
   loc("original_source.cpp":1:1)
new:
   loc("snapshot_source.mlir":10:10)

* Fuse with the original locations as NamedLocs with a specific tag.

old:
    loc("original_source.cpp":1:1)
new:
    loc(fused["original_source.cpp":1:1, "snapshot"("snapshot_source.mlir":10:10)])

This feature may be used by a debugger to display the code at various different levels of the IR. It would also be able to show the different levels of IR attached to a specific source line in the original source file.

This feature may also be used to generate locations for operations generated during compilation, that don't necessarily have a user source location to attach to.

This requires changes in the printer to track the locations of operations emitted in the stream. Moving forward, we need to properly (and efficiently) track the number of newlines emitted to the stream during printing.

Differential Revision: https://reviews.llvm.org/D74019
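
The two modes described in this message (replace and fuse) can be mimicked with a short Python sketch. It is not the MLIR utility: the function name, the `(text, original_loc)` input shape, and the fixed column of 1 are assumptions, and only the `loc(...)` spellings are taken from the examples in the message.

```python
def snapshot_locations(ops, filename, tag=None):
    """Print ops into a snapshot and derive a new location for each one.

    ops is a list of (text, original_loc) pairs. With tag=None the new
    location replaces the original; otherwise the two are fused under `tag`.
    """
    chunks = []
    new_locs = []
    next_line = 1
    for text, original in ops:
        snap = f'"{filename}":{next_line}:1'    # column 1 is an assumption
        if tag is None:
            new_locs.append(f"loc({snap})")
        else:
            new_locs.append(f'loc(fused[{original}, "{tag}"({snap})])')
        chunks.append(text)
        next_line += text.count("\n") + 1       # track emitted newlines
    return "\n".join(chunks), new_locs

ops = [("op_a {\n}", '"original_source.cpp":1:1'),
       ("op_b", '"original_source.cpp":2:1')]
print(snapshot_locations(ops, "snapshot_source.mlir", tag="snapshot")[1])
```

Counting newlines per printed operation is what keeps the recorded line numbers correct when an operation spans several lines.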

4 years ago[gn build] Add OpenMPOpt.cpp to LLVMipo after D69930/9548b74a831e
Fangrui Song [Sat, 8 Feb 2020 22:18:43 +0000 (14:18 -0800)]
[gn build] Add OpenMPOpt.cpp to LLVMipo after D69930/9548b74a831e

4 years ago[ELF] Simplify parsing of version dependency. NFC
Fangrui Song [Sat, 8 Feb 2020 22:10:29 +0000 (14:10 -0800)]
[ELF] Simplify parsing of version dependency. NFC

4 years agoFix test name typo
Simon Pilgrim [Sat, 8 Feb 2020 21:24:01 +0000 (21:24 +0000)]
Fix test name typo

4 years ago[InstCombine] Refactor foldICmpAndShift(); NFCI
Nikita Popov [Sat, 8 Feb 2020 19:41:10 +0000 (20:41 +0100)]
[InstCombine] Refactor foldICmpAndShift(); NFCI

Separate out handling for shl, lshr and ashr. The combined handling
obscured some overly pessimistic requirements for the transform.

4 years ago[FIX] Update PM tests after D69930 landed
Johannes Doerfert [Sat, 8 Feb 2020 21:22:03 +0000 (15:22 -0600)]
[FIX] Update PM tests after D69930 landed

4 years ago[X86][SSE] Add test cases from PR44379
Simon Pilgrim [Sat, 8 Feb 2020 21:02:04 +0000 (21:02 +0000)]
[X86][SSE] Add test cases from PR44379

4 years ago[X86] Test showing inability to combine ROTLI/ROTRI rotations into shuffles
Simon Pilgrim [Sat, 8 Feb 2020 20:44:41 +0000 (20:44 +0000)]
[X86] Test showing inability to combine ROTLI/ROTRI rotations into shuffles

One of many things necessary to fix PR44379 (lowering shuffles to rotations)

4 years ago[OpenMP] Introduce the OpenMPOpt transformation pass
Johannes Doerfert [Thu, 7 Nov 2019 05:20:06 +0000 (23:20 -0600)]
[OpenMP] Introduce the OpenMPOpt transformation pass

The OpenMPOpt pass is a CGSCC pass in which OpenMP specific
optimizations can reside.

The OpenMPOpt pass uses the OpenMPKinds.def file to identify runtime
calls and their uses. This allows targeted transformations and eases
their implementation.

This initial patch deduplicates `__kmpc_global_thread_num` and
`omp_get_thread_num` calls. We can also identify arguments that are
equivalent to such a call result and use it instead. Later we can
determine "gtid" arguments based on the use in kernel functions etc.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D69930

4 years agoIntroduce a CallGraph updater helper class
Johannes Doerfert [Fri, 29 Nov 2019 19:11:24 +0000 (13:11 -0600)]
Introduce a CallGraph updater helper class

The CallGraphUpdater is a helper that simplifies the process of updating
the call graph, both old and new style, while running a CGSCC pass.

The uses are contained in different commits, e.g. D70767.

More functionality is added as we need it.

Reviewed By: modocache, hfinkel

Differential Revision: https://reviews.llvm.org/D70927

4 years ago[SimplifyLibCalls] Add __strlen_chk.
George Burgess IV [Wed, 5 Feb 2020 06:10:39 +0000 (22:10 -0800)]
[SimplifyLibCalls] Add __strlen_chk.

Bionic has had `__strlen_chk` for a while. Optimizing that into a
constant is quite profitable, when possible.

Differential Revision: https://reviews.llvm.org/D74079

4 years ago[InstCombine] Fix infinite min/max canonicalization loop (PR44541)
Nikita Popov [Sun, 2 Feb 2020 16:40:15 +0000 (17:40 +0100)]
[InstCombine] Fix infinite min/max canonicalization loop (PR44541)

While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541,
it does so in a more roundabout manner and there might be other
loopholes to trigger the same issue. This is a more direct fix,
that prevents the transform if the min/max is based on a
non-canonical sub X, 0 instruction.

Differential Revision: https://reviews.llvm.org/D73849

4 years ago[mlir] Add a utility method on CallOpInterface for resolving the callable.
River Riddle [Sat, 8 Feb 2020 18:44:15 +0000 (10:44 -0800)]
[mlir] Add a utility method on CallOpInterface for resolving the callable.

Summary: This is the most common operation performed on a CallOpInterface. This just moves the existing functionality from the CallGraph so that other users can access it.

Differential Revision: https://reviews.llvm.org/D74250

4 years ago[mlir][EDSC] NFC - Move StructuredIndexed and IteratorType out of Linalg
Nicolas Vasilache [Sat, 8 Feb 2020 16:16:22 +0000 (11:16 -0500)]
[mlir][EDSC] NFC - Move StructuredIndexed and IteratorType out of Linalg

Summary:
This NFC revision allows those classes to be reused for building
structured vector operations.

Reviewers: aartbik, ftynse

Subscribers: arphaman, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74279

4 years ago[mlir] Add a document detailing the design of the SymbolTable.
River Riddle [Sat, 8 Feb 2020 18:40:00 +0000 (10:40 -0800)]
[mlir] Add a document detailing the design of the SymbolTable.

Summary: This document provides insight on the rationale and the design of Symbols in MLIR, and why they are necessary.

Differential Revision: https://reviews.llvm.org/D73590

4 years ago[LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded...
Craig Topper [Sat, 8 Feb 2020 07:23:42 +0000 (23:23 -0800)]
[LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded integer results from ReplaceNodeResults instead of just returning two results.

Remove code from LegalizeTypes that allowed this to work.

We were already using BUILD_PAIR for this in some places so this
standardizes on a single way to do this.
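
For readers unfamiliar with the node, the arithmetic meaning of pairing two
expanded halves can be sketched in plain C++. This only models the value
semantics with 32-bit halves of a 64-bit integer; `buildPair` and
`splitLoHi` are hypothetical names and no SelectionDAG API is used.

```cpp
#include <cstdint>
#include <utility>

// Model of pairing: combine a low and a high 32-bit half into one 64-bit
// value, with the low half in the least significant bits.
constexpr std::uint64_t buildPair(std::uint32_t lo, std::uint32_t hi) {
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Model of expansion: split a 64-bit value into its {low, high} halves.
constexpr std::pair<std::uint32_t, std::uint32_t> splitLoHi(std::uint64_t v) {
    return {static_cast<std::uint32_t>(v),
            static_cast<std::uint32_t>(v >> 32)};
}

// Splitting and pairing again gives back the original value.
static_assert(buildPair(splitLoHi(0x0123456789ABCDEFULL).first,
                        splitLoHi(0x0123456789ABCDEFULL).second) ==
              0x0123456789ABCDEFULL);
```

Returning one paired value instead of two separate results keeps a single
convention for the caller, which is the point the message makes.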

4 years ago[X86] X86InstComments - add FMA4 comments
Simon Pilgrim [Sat, 8 Feb 2020 17:01:04 +0000 (17:01 +0000)]
[X86] X86InstComments - add FMA4 comments

These typically match the FMA3 equivalents, although the multiply operands sometimes get flipped due to the FMA3 permute variants.

4 years ago[X86] Standardize BROADCAST enum names (PR31079)
Simon Pilgrim [Sat, 8 Feb 2020 16:54:46 +0000 (16:54 +0000)]
[X86] Standardize BROADCAST enum names (PR31079)

Tweak EVEX implementation names so they match the other variants by adding the 'r' prefix. Oddly, some of the subvec broadcast ops already matched.

4 years ago[InstCombine] Remove unnecessary worklist push; NFCI
Nikita Popov [Sat, 8 Feb 2020 16:08:42 +0000 (17:08 +0100)]
[InstCombine] Remove unnecessary worklist push; NFCI

This is no longer needed after d4627b90a0462c90a834c2f7b9c9228b3ec7a45b,
should have dropped it there...

4 years ago[InstCombine] Avoid modifying instructions in-place
Nikita Popov [Sat, 8 Feb 2020 16:02:10 +0000 (17:02 +0100)]
[InstCombine] Avoid modifying instructions in-place

As discussed on D73919, this replaces a few cases where we were
modifying multiple operands of instructions in-place with the
creation of a new instruction, which we generally prefer nowadays.

This tends to be more readable and less prone to worklist management
bugs.

Test changes are only superficial (instruction naming and order).

4 years ago[InstCombine] Use swapValues(); NFC
Nikita Popov [Sat, 8 Feb 2020 15:57:28 +0000 (16:57 +0100)]
[InstCombine] Use swapValues(); NFC

Less code, and makes it more obvious that these operands do not
need to be added back to the worklist.

4 years ago[InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835)
Nikita Popov [Sat, 8 Feb 2020 11:04:58 +0000 (12:04 +0100)]
[InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835)

Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform
if it wouldn't actually do anything (apart from removing and reinserting
the same instructions).

Note that the test case doesn't loop on current master anymore, only
on the LLVM 10 release branch. The issue is already mitigated on master
due to worklist order fixes, but we should fix the root cause there as well.

As a side note, we should probably assert in combineLoadToNewType()
that it does not combine to the same type. Not doing this here, because
this assertion would also be triggered in another place right now.

Differential Revision: https://reviews.llvm.org/D74278

4 years agoRegenerate FMA tests
Simon Pilgrim [Sat, 8 Feb 2020 14:31:06 +0000 (14:31 +0000)]
Regenerate FMA tests

4 years agoAdd missing encoding comments from fma scalar folded intrinsics tests
Simon Pilgrim [Sat, 8 Feb 2020 14:01:36 +0000 (14:01 +0000)]
Add missing encoding comments from fma scalar folded intrinsics tests

4 years agoPut back makeArrayRef to make GCC 5 happy
Benjamin Kramer [Sat, 8 Feb 2020 15:15:09 +0000 (16:15 +0100)]
Put back makeArrayRef to make GCC 5 happy

4 years ago[X86] Standardize VPSLLDQ/VPSRLDQ enum names (PR31079)
Simon Pilgrim [Sat, 8 Feb 2020 14:51:10 +0000 (14:51 +0000)]
[X86] Standardize VPSLLDQ/VPSRLDQ enum names (PR31079)

Tweak EVEX implementation names so they match the other variants.

4 years agoDrop some uses of StringLiteral in favor of StringRef
Benjamin Kramer [Sat, 8 Feb 2020 14:45:09 +0000 (15:45 +0100)]
Drop some uses of StringLiteral in favor of StringRef

StringRef can be used in constexpr contexts, so StringLiteral isn't
necessary anymore.
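
What a constexpr-capable string reference allows can be sketched with
`std::string_view`, used here purely as a standard-library stand-in for
StringRef. The table and the names `Entry`, `kTable` and `lookup` are
hypothetical and not taken from the patch.

```cpp
#include <string_view>

// Hypothetical constant table keyed by string; std::string_view stands in
// for a string reference type that is usable in constant expressions.
struct Entry {
    std::string_view name;
    int id;
};

constexpr Entry kTable[] = {{"add", 1}, {"sub", 2}, {"mul", 3}};

// Linear lookup that can run at compile time; returns -1 if not found.
constexpr int lookup(std::string_view name) {
    for (const Entry& entry : kTable)
        if (entry.name == name)
            return entry.id;
    return -1;
}

// Evaluated entirely by the compiler.
static_assert(lookup("sub") == 2);
static_assert(lookup("div") == -1);
```

Once the plain reference type works in such contexts, a separate
literal-only wrapper is no longer needed for tables like this one.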

4 years agoRevert "Support -fstack-clash-protection for x86"
serge-sans-paille [Sat, 8 Feb 2020 13:26:22 +0000 (14:26 +0100)]
Revert "Support -fstack-clash-protection for x86"

This reverts commit e229017732bcf1911210903ee9811033d5588e0d.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308

4 years agoRevert "[ARM] Improve codegen of volatile load/store of i64"
Victor Campos [Sat, 8 Feb 2020 11:59:46 +0000 (11:59 +0000)]
Revert "[ARM] Improve codegen of volatile load/store of i64"

This reverts commit 60e0120c913dd1d4bfe33769e1f000a076249a42.

4 years ago[DebugInfo] Allow reading an address table with a mismatched address.
Igor Kudrin [Thu, 6 Feb 2020 10:08:20 +0000 (17:08 +0700)]
[DebugInfo] Allow reading an address table with a mismatched address.

This case does not look like an unrecoverable error.

Differential Revision: https://reviews.llvm.org/D74194

4 years agoSupport -fstack-clash-protection for x86
serge_sans_paille [Mon, 9 Sep 2019 14:59:34 +0000 (16:59 +0200)]
Support -fstack-clash-protection for x86

Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically, the former uses a function call before stack allocation, while this
patch provides inlined stack probes and chunk allocation.
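
The idea of probing every PAGE_SIZE can be pictured with a simplified model.
The sketch below is not the frame lowering code: `probeOffsets` is a
hypothetical helper that only lists which offsets below the original stack
pointer would be touched if an allocation were split into page-sized chunks,
and it ignores how the remaining tail and small allocations are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: allocate `allocSize` bytes in chunks of `pageSize` and touch
// the stack after each full chunk, so no guard page can be skipped over.
// Returns the touched offsets, measured from the original stack pointer.
std::vector<std::size_t> probeOffsets(std::size_t allocSize,
                                      std::size_t pageSize) {
    assert(pageSize > 0 && "page size must be non-zero");
    std::vector<std::size_t> offsets;
    const std::size_t fullPages = allocSize / pageSize;
    offsets.reserve(fullPages);
    for (std::size_t i = 1; i <= fullPages; ++i)
        offsets.push_back(i * pageSize);  // one probe per full page
    return offsets;
}
```

With a 4096-byte page, a 10000-byte allocation is modelled as probes at
offsets 4096 and 8192, so consecutive probes are never more than one page
apart.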

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This is a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with better option
handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720

4 years agoUse heterogenous lookup for std::map<std::string with a StringRef. NFCI.
Benjamin Kramer [Sat, 8 Feb 2020 12:27:52 +0000 (13:27 +0100)]
Use heterogenous lookup for std::map<std::string with a StringRef. NFCI.
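
The technique named in the title can be sketched with the standard library
alone. `std::string_view` stands in for StringRef here, and `Table` and
`lookupOr` are hypothetical names; the essential part is the transparent
comparator `std::less<>`, which lets `find` compare the key without first
building a temporary `std::string`.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// A map with a transparent comparator accepts any key type that can be
// compared against std::string.
using Table = std::map<std::string, int, std::less<>>;

// Look up `key` without allocating a std::string for it; return
// `fallback` when the key is absent.
int lookupOr(const Table& table, std::string_view key, int fallback) {
    auto it = table.find(key);  // heterogeneous overload of find
    return it == table.end() ? fallback : it->second;
}
```

With the default `std::less<std::string>` comparator the same call would
need an explicit `std::string(key)` conversion at every lookup.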

4 years agoAdd missing encoding comments from fma4 folded intrinsics tests
Simon Pilgrim [Sat, 8 Feb 2020 11:23:05 +0000 (11:23 +0000)]
Add missing encoding comments from fma4 folded intrinsics tests