Johannes Doerfert [Mon, 10 Feb 2020 01:06:09 +0000 (19:06 -0600)]
[Attributor][Tests][NFC] Add more range tests
Inspired by https://llvm.discourse.group/t/impossible-condition-optimization/461
Johannes Doerfert [Sun, 26 Jan 2020 02:16:31 +0000 (20:16 -0600)]
[Attributor][NFC] Use existing constant instead of magic one
Craig Topper [Mon, 10 Feb 2020 05:48:00 +0000 (21:48 -0800)]
[X86] Make (insert_vector_elt (v8i16 zerovec), i16 %x, 0) generate the same code as (v8i16 (build_vector %x, 0, 0, 0, 0, 0, 0, 0)).
Instead of using a pinsrw to element 0, use movzx and movd.
Same for v16i8.
Michael Liao [Mon, 10 Feb 2020 05:41:46 +0000 (00:41 -0500)]
Fix `-Wparentheses` warning. NFC.
Michael Liao [Sun, 9 Feb 2020 18:09:19 +0000 (13:09 -0500)]
[clang][codegen] Fix another lifetime emission on alloca on non-default address space.
- Lifetime intrinsics expect the pointer directly from alloca. Need
extra handling for targets with alloca on non-default (or non-zero)
address space.
Craig Topper [Mon, 10 Feb 2020 04:31:56 +0000 (20:31 -0800)]
[X86] Autogenerate complete checks. NFC
Craig Topper [Mon, 10 Feb 2020 02:35:57 +0000 (18:35 -0800)]
[X86] Use MOVZX instead of MOVSX in f16_to_fp isel patterns.
Using sign extend forces the adjacent element to either all zeros
or all ones. But all ones is a NAN. So that doesn't seem like a
great idea.
Trying to work on supporting this with strict FP where NAN would
definitely be bad.
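A minimal sketch of the concern above, not taken from the patch: it models widening a 16-bit half value to 32 bits with sign or zero extension and reinterprets the upper 16 bits as an IEEE half. The helper names `widen` and `upper_half_as_float` are made up for illustration.

```python
import math
import struct

def widen(bits16, signed):
    """Model MOVSX (signed=True) or MOVZX (signed=False) from 16 to 32 bits."""
    bits16 &= 0xFFFF
    if signed and bits16 & 0x8000:
        return bits16 | 0xFFFF0000
    return bits16

def upper_half_as_float(bits32):
    """Reinterpret the upper 16 bits of a 32-bit value as an IEEE half."""
    return struct.unpack("<e", struct.pack("<H", (bits32 >> 16) & 0xFFFF))[0]

neg_one = 0xBC00  # -1.0 in IEEE half precision (sign bit set)
sext_upper = upper_half_as_float(widen(neg_one, signed=True))
zext_upper = upper_half_as_float(widen(neg_one, signed=False))
```

With the sign bit set, sign extension fills the adjacent element with 0xFFFF, which decodes as a NaN; zero extension leaves it as 0.0.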
Shiva Chen [Mon, 3 Feb 2020 05:52:13 +0000 (13:52 +0800)]
[RISCV] Fix incorrect FP base CFI offset for variable argument functions
When the FP exists, the FP base CFI directive offset should take the size of variable arguments into account.
Differential Revision: https://reviews.llvm.org/D73862
Fangrui Song [Mon, 10 Feb 2020 01:28:20 +0000 (17:28 -0800)]
[DebugInfo] Add a DWARFDataExtractor constructor that takes ArrayRef<uint8_t>
Similar to D67797 (DataExtractor).
Matt Arsenault [Fri, 7 Feb 2020 17:24:15 +0000 (12:24 -0500)]
GlobalISel: Fix narrowScalar for G_{CTLZ|CTTZ}_ZERO_UNDEF
Narrow these for 64-bit VALU for AMDGPU.
Matt Arsenault [Sun, 26 Jan 2020 02:10:17 +0000 (21:10 -0500)]
AMDGPU/GlobalISel: Split 64-bit G_CTPOP in RegBankSelect
Matt Arsenault [Fri, 7 Feb 2020 16:55:39 +0000 (11:55 -0500)]
GlobalISel: Fix narrowing of G_CTLZ/G_CTTZ
The result type is separate from the source type.
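As an illustration only, the usual way to narrow a 64-bit count-leading-zeros into 32-bit pieces can be sketched as below. This is the textbook decomposition, assumed here rather than copied from the legalizer; `ctlz32` and `ctlz64_narrowed` are hypothetical names. Note that the returned count is a small integer whose width is unrelated to the 64-bit source.

```python
def ctlz32(x):
    """Count leading zeros of a 32-bit value; returns 32 for zero."""
    return 32 - (x & 0xFFFFFFFF).bit_length()

def ctlz64_narrowed(x):
    """Count leading zeros of a 64-bit value using only 32-bit operations."""
    hi = (x >> 32) & 0xFFFFFFFF
    lo = x & 0xFFFFFFFF
    if hi != 0:
        return ctlz32(hi)
    # High half is all zeros: its 32 zeros plus the zeros of the low half.
    return 32 + ctlz32(lo)

def ctlz64_reference(x):
    """Direct 64-bit reference for comparison."""
    return 64 - (x & 0xFFFFFFFFFFFFFFFF).bit_length()
```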
Matt Arsenault [Thu, 6 Feb 2020 22:18:17 +0000 (17:18 -0500)]
AMDGPU/GlobalISel: Don't mis-select vector index on a constant
Vector indexing with a constant index should be folded out in the
legalizer, but this was accidentally falling through. This would
produce the indexing operation with $noreg. Handle this case as a
dynamic index just in case a bug like this happens again in the
future.
Matt Arsenault [Thu, 6 Feb 2020 21:52:04 +0000 (16:52 -0500)]
AMDGPU/GlobalISel: Look through casts when legalizing vector indexing
We were failing to find constants that were cast. I feel like the
artifact combiner should have folded the constant in the trunc before
the custom lowering, but that doesn't happen.
Matt Arsenault [Mon, 6 Jan 2020 20:57:51 +0000 (15:57 -0500)]
AMDGPU: Remove dead kill handling
At one point a custom node was used for kill handling, but now the
intrinsic is directly selected. Remove leftover pattern machinery.
Matt Arsenault [Fri, 27 Dec 2019 18:11:06 +0000 (13:11 -0500)]
AMDGPU: Fix SI_IF lowering when the save exec reg has terminator uses
Reverts part of
6524a7a2b9ca072bd7f7b4355d1230e70c679d2f. Since that
commit, the expansion was ignoring the actual save exec register
produced by the instruction, and looking at other instructions. I do
not understand why it was looking at other instructions, but relying
on this scan was wrong.
Fixes verifier errors after SI_IF is tail duplicated, which should be
correct to do. The results were fed into a phi, which was lowered to
the S_MOV_B64_term instructions.
Simon Pilgrim [Sun, 9 Feb 2020 21:49:37 +0000 (21:49 +0000)]
[X86] combineConcatVectorOps - combine VROTLI/VROTRI ops
Fix issue mentioned on rGe82e17d4d4ca - non-AVX512BW targets failed to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).
Craig Topper [Sun, 9 Feb 2020 21:08:25 +0000 (13:08 -0800)]
[X86] Use custom isel for (X86sbb_flag 0, 0) so we can use 32-bit SBB for i8/i16.
We were using MOV32r0 and an extract_subreg as an input. By using
custom isel we can move the extract_subreg to after the SBB instead
of on the input.
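A small model of why the extract can move after the SBB, under the assumption that both inputs are zero: the 32-bit result truncated to 8 or 16 bits equals what a narrow SBB would have produced. The function `sbb` is a hypothetical stand-in for the instruction, not LLVM code.

```python
def sbb(a, b, carry, bits):
    """Model subtract-with-borrow: a - b - carry, wrapped to `bits` bits."""
    return (a - b - carry) & ((1 << bits) - 1)

def truncate(value, bits):
    """Model extracting the low `bits` bits (an extract_subreg-like step)."""
    return value & ((1 << bits) - 1)
```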
Craig Topper [Sun, 9 Feb 2020 20:31:21 +0000 (12:31 -0800)]
[X86] Add flag result VT to a MOV32r0 created in X86DAGToDAGISel::Select
The flag isn't used, but I believe this matches the MOV32r0 that
would be created by the table emitter. This should allow this node
to be CSEed with any others created by the table.
Simon Pilgrim [Sun, 9 Feb 2020 21:15:03 +0000 (21:15 +0000)]
[X86] Add lowerShuffleAsBitRotate (PR44379)
As noted on PR44379, we didn't attempt to lower vector shuffles using bit rotations on XOP/AVX512F targets.
This patch lowers to uniform ISD::ROTL nodes - ROTR isn't supported by XOP and they are interchangeable for constant values anyway.
There might be cases where targets without ISD::ROTL support would benefit from this (expanding to SRL+SHL+OR), which I'll investigate in a future patch.
Also, non-AVX512BW targets fail to concatenate 256-bit rotations back to 512-bits (split during shuffle lowering as they don't have v32i16/v64i8 types).
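To make the idea concrete, here is a minimal sketch (plain Python, not the lowering code) showing that a byte shuffle which swaps the two bytes of every 16-bit element is the same as rotating each 16-bit element left by 8 bits. The mask and helper names are chosen for illustration.

```python
def shuffle_bytes(data, mask):
    """Apply a byte shuffle: output byte i is input byte mask[i]."""
    return bytes(data[i] for i in mask)

def rotl16(value, amount):
    """Rotate a 16-bit value left by `amount` bits."""
    amount %= 16
    return ((value << amount) | (value >> (16 - amount))) & 0xFFFF

def rotate_lanes(data, amount):
    """Rotate every little-endian 16-bit lane of `data` left by `amount`."""
    out = bytearray()
    for i in range(0, len(data), 2):
        lane = int.from_bytes(data[i:i + 2], "little")
        out += rotl16(lane, amount).to_bytes(2, "little")
    return bytes(out)

data = bytes(range(16))
swap_mask = [1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 13, 12, 15, 14]
```

Because the rotation amount is a constant, a left rotate by 8 and a right rotate by 8 give the same result on 16-bit lanes, which matches the note that ROTL and ROTR are interchangeable here.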
Craig Topper [Sun, 9 Feb 2020 19:57:22 +0000 (11:57 -0800)]
[X86] Use MVT::i32 for the type of a MOV32r0 created in X86DAGToDAGISel::Select.
Not sure if this really matters. The VT isn't really used after
this point. At best it might affect CSE.
Craig Topper [Sun, 9 Feb 2020 05:35:21 +0000 (21:35 -0800)]
[X86] Remove isel patterns that include a vselect/X86selects and a strict FP node.
A vselect+strictfp node is not equivalent to a masked operation.
The exceptions of the strictfp node are not masked by a vselect
after it so we can't match it to a masked operation.
We already had a hack in IsLegalToFold to prevent these patterns from
matching. This patch removes that hack and removes the patterns.
Jan Vesely [Wed, 5 Feb 2020 01:14:04 +0000 (20:14 -0500)]
libclc/r600: Use target specific builtins to implement rsqrt and native_rsqrt
Fixes OCL CTS rsqrt and half_rsqrt (1 thread, scalar) tests on AMD Turks.
Reviewer: awatry
Differential Revision: https://reviews.llvm.org/D74016
Jan Vesely [Wed, 5 Feb 2020 01:09:12 +0000 (20:09 -0500)]
libclc: Move rsqrt implementation to a .cl file
Reviewer: awatry
Differential Revision: https://reviews.llvm.org/D74013
Simon Pilgrim [Sun, 9 Feb 2020 18:35:02 +0000 (18:35 +0000)]
[X86][XOP] Add XOP target to vXi16/vXi8 shuffle tests
Helps with bit rotation test coverage for PR44379
Simon Pilgrim [Sun, 9 Feb 2020 17:51:53 +0000 (17:51 +0000)]
[X86][SSE] Add more tests showing failure to lower shuffles as bit rotations
Simon Pilgrim [Sun, 9 Feb 2020 14:23:19 +0000 (14:23 +0000)]
[X86] Rename matchShuffleAsRotate - matchShuffleAsByteRotate. NFCI.
A matchShuffleAsBitRotate variant will be added soon and we need to make the difference more obvious.
Jan Kratochvil [Sun, 9 Feb 2020 17:13:04 +0000 (18:13 +0100)]
[lldb] [doc] Status: Linux: Update the paragraph
Kamil Rytarowski [Sun, 9 Feb 2020 17:02:07 +0000 (18:02 +0100)]
[LLDB] [doc] Document NetBSD status and sort OSs alphabetically
LLVM GN Syncbot [Sun, 9 Feb 2020 15:41:05 +0000 (15:41 +0000)]
[gn build] Port
a17f03bd939
Sanjay Patel [Sun, 9 Feb 2020 15:04:41 +0000 (10:04 -0500)]
[VectorCombine] new IR transform pass for partial vector ops
We have several bug reports that could be characterized as "reducing scalarization",
and this topic was also raised on llvm-dev recently:
http://lists.llvm.org/pipermail/llvm-dev/2020-January/138157.html
...so I'm proposing that we deal with these patterns in a new, lightweight IR vector
pass that runs before/after other vectorization passes.
There are 4 alternate options that I can think of to deal with this kind of problem
(and we've seen various attempts at all of these), but they all have flaws:
InstCombine - can't happen without TTI, but we don't want target-specific
folds there.
SDAG - too late to assist other vectorization passes; TLI is not equipped
for these kinds of cost queries; limited to a single basic block.
CGP - too late to assist other vectorization passes; would need to re-implement
basic cleanups like CSE/instcombine.
SLP - doesn't fit with existing transforms; limited to a single basic block.
This initial patch/transform is based on existing code in AggressiveInstCombine:
we walk backwards through the function looking for a pattern match. But we diverge
from that cost-independent IR canonicalization pass by using TTI to decide if the
vector alternative is profitable.
We probably have at least 10 similar bug reports/patterns (binops, constants,
inserts, cheap shuffles, etc) that would fit in this pass as follow-up enhancements.
It's possible that we could iterate on a worklist to fix-point like InstCombine does,
but it's safer to start with a most basic case and evolve from there, so I didn't
try to do anything fancy with this initial implementation.
Differential Revision: https://reviews.llvm.org/D73480
Jan Kratochvil [Sun, 9 Feb 2020 14:22:36 +0000 (15:22 +0100)]
[lldb] [doc] Status: Debugserver (remote debugging) is OK now
Jan Kratochvil [Sun, 9 Feb 2020 14:11:38 +0000 (15:11 +0100)]
[lldb] [doc] Testing: Fix typos
Kamil Rytarowski [Sun, 9 Feb 2020 13:59:04 +0000 (14:59 +0100)]
[LLDB] [doc] Remove note about libpanel(3) and NetBSD
libpanel(3) is now supported in all supported versions of NetBSD.
Kamil Rytarowski [Sun, 9 Feb 2020 13:57:09 +0000 (14:57 +0100)]
[LLDB] [doc] Update the current status of pkgsrc (NetBSD) building
Jan Kratochvil [Sun, 9 Feb 2020 13:49:38 +0000 (14:49 +0100)]
[lldb] [testsuite] TestGdbRemoteLibrariesSvr4Support: Fix symlinked builddir
When I have symlinked builddir on Fedora 31 x86_64 I get:
FAIL: test_libraries_svr4_libs_present (TestGdbRemoteLibrariesSvr4Support.TestGdbRemoteLibrariesSvr4Support)
----------------------------------------------------------------------
...
File "lldb/packages/Python/lldbsuite/test/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.py", line 106, in libraries_svr4_libs_present
self.assertIn(self.getBuildDir() + "/" + lib, libraries_svr4_names)
AssertionError:
'/home/jkratoch/redhat/llvm-monorepo-clangassertsymlink/lldb-test-build.noindex/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.test_libraries_svr4_libs_present/libsvr4lib_a.so' not found in ['/home/jkratoch/redhat/llvm-monorepo/lldb/packages/Python/lldbsuite/test/tools/lldb-server/libraries-svr4/linux-vdso.so.1', '/quad/home/jkratoch/redhat/llvm-monorepo-clangassertsymlink/lldb-test-build.noindex/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.test_libraries_svr4_libs_present/libsvr4lib_a.so', '/quad/home/jkratoch/redhat/llvm-monorepo-clangassertsymlink/lldb-test-build.noindex/tools/lldb-server/libraries-svr4/TestGdbRemoteLibrariesSvr4Support.test_libraries_svr4_libs_present/libsvr4lib_b".so', '/usr/lib64/libdl-2.30.so', '/usr/lib64/libstdc++.so.6.0.27', '/usr/lib64/libm-2.30.so', '/usr/lib64/libgcc_s-9-20190827.so.1', '/usr/lib64/libc-2.30.so', '/usr/lib64/ld-2.30.so']
Config=x86_64-/quad/home/jkratoch/redhat/llvm-monorepo-clangassertsymlink/bin/clang-11
----------------------------------------------------------------------
Differential Revision: https://reviews.llvm.org/D74295
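The failure above comes from comparing a path that goes through a symlink with one that has already been resolved. The patch itself is not shown in this log; the following is only a sketch of one way to make such a comparison robust, using `os.path.realpath`. The directory layout and the helper `same_file_path` are invented for the example.

```python
import os
import tempfile

def same_file_path(expected, reported):
    """Compare two paths after resolving symlinks in both."""
    return os.path.realpath(expected) == os.path.realpath(reported)

with tempfile.TemporaryDirectory() as tmp:
    real_dir = os.path.join(tmp, "real")
    link_dir = os.path.join(tmp, "link")
    os.mkdir(real_dir)
    os.symlink(real_dir, link_dir)  # link -> real, like a symlinked builddir

    lib_via_real = os.path.join(real_dir, "libsvr4lib_a.so")
    lib_via_link = os.path.join(link_dir, "libsvr4lib_a.so")
    open(lib_via_real, "w").close()

    naive_equal = lib_via_link == lib_via_real
    resolved_equal = same_file_path(lib_via_link, lib_via_real)
```

The plain string comparison fails even though both names refer to the same file, while the resolved comparison succeeds.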
Simon Pilgrim [Sun, 9 Feb 2020 13:35:03 +0000 (13:35 +0000)]
Fix signed/unsigned warning.
Simon Pilgrim [Sun, 9 Feb 2020 12:25:19 +0000 (12:25 +0000)]
[X86] Recognise ROTLI/ROTRI rotations as faux shuffles
Allows us to combine rotations with shuffles.
One of many things necessary to fix PR44379 (lowering shuffles to rotations)
Ehud Katz [Sun, 9 Feb 2020 10:25:21 +0000 (12:25 +0200)]
[LoopExtractor] Convert LoopExtractor from LoopPass to ModulePass
The LoopExtractor created new functions (by definition), which violates
the restrictions of a LoopPass.
The correct implementation of this pass should be as a ModulePass.
Includes reverting rL82990 implications on the LoopExtractor.
Fixes PR3082 and PR8929.
Differential Revision: https://reviews.llvm.org/D69069
Ayman Musa [Tue, 28 Jan 2020 14:31:44 +0000 (16:31 +0200)]
[AggressiveInstCombine] Add test with baseline CHECKs for aggressive inst combine for SELECT.
serge_sans_paille [Mon, 9 Sep 2019 14:59:34 +0000 (16:59 +0200)]
Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This is a recommit of
39f50da2a357a8f685b3540246c5d762734e035f with proper LiveIn
declaration, better option handling and more portable testing.
Differential Revision: https://reviews.llvm.org/D68720
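A simplified model of the probing scheme described above, for illustration only: the allocation is carved into PAGE_SIZE chunks and each chunk is touched as it is allocated, so no step can jump over a guard page. The page size of 4096 and the treatment of the final partial chunk are assumptions of this sketch, not details taken from the patch.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch

def probe_offsets(alloc_size, page_size=PAGE_SIZE):
    """Return (probed offsets below the original stack pointer, unprobed tail size)."""
    offsets = []
    allocated = 0
    while alloc_size - allocated >= page_size:
        allocated += page_size
        offsets.append(allocated)  # touch the stack after each full chunk
    return offsets, alloc_size - allocated

def max_unprobed_gap(alloc_size, page_size=PAGE_SIZE):
    """Largest distance the stack pointer moves without an intervening probe."""
    offsets, tail = probe_offsets(alloc_size, page_size)
    steps = [b - a for a, b in zip([0] + offsets, offsets)] + [tail]
    return max(steps)
```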
serge-sans-paille [Sun, 9 Feb 2020 09:06:31 +0000 (10:06 +0100)]
Revert "Support -fstack-clash-protection for x86"
This reverts commit
0fd51a4554f5f4f90342f40afd35b077f6d88213.
Failures:
http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354
serge_sans_paille [Mon, 9 Sep 2019 14:59:34 +0000 (16:59 +0200)]
Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This is a recommit of
39f50da2a357a8f685b3540246c5d762734e035f with proper LiveIn
declaration, better option handling and more portable testing.
Differential Revision: https://reviews.llvm.org/D68720
Fangrui Song [Sun, 9 Feb 2020 06:39:09 +0000 (22:39 -0800)]
[ELF][test] Use llvm-readelf -l instead of llvm-readobj -l for some memory region tests
MaheshRavishankar [Sun, 9 Feb 2020 02:23:09 +0000 (18:23 -0800)]
[mlir][GPUToSPIRV] Modify the lowering of gpu.block_dim to be consistent with Vulkan SPEC
The existing lowering of gpu.block_dim added a global variable with
the WorkGroupSize decoration. This raises an error within
Vulkan/SPIR-V validation since Vulkan requires this to have a constant
initializer. This is not yet supported in SPIR-V dialect. Changing the
lowering to return the workgroup size as a constant value instead,
obtained from spv.entry_point_abi attribute gets around the issue for
now. The validation goes through since the workgroup size is specified
using spv.execution_mode operation.
Craig Topper [Sun, 9 Feb 2020 02:56:17 +0000 (18:56 -0800)]
[X86] Add more scalar intrinsic instructions to isNonFoldablePartialRegisterLoad.
I think this covers most if not all of the scalar intrinsic
instructions.
Johannes Doerfert [Wed, 27 Nov 2019 06:30:12 +0000 (00:30 -0600)]
[Attributor] Add an Attributor CGSCC pass and run it
In addition to the module pass, this patch introduces a CGSCC pass that
runs the Attributor on a strongly connected component of the call graph
(both old and new PM). The Attributor was always designed to be used on a
subset of functions which makes this patch mostly mechanical.
The one change is that we give up `norecurse` deduction in the module
pass in favor of doing it during the CGSCC pass. This makes the
interfaces simpler but can be revisited if needed.
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D70767
Fangrui Song [Sun, 9 Feb 2020 03:01:17 +0000 (19:01 -0800)]
Fix -Wunused-lambda-capture for -DLLVM_ENABLE_ASSERTIONS=off builds after
6556c615f3c3aae8af876806777065961ae20024
Johannes Doerfert [Sun, 9 Feb 2020 02:14:01 +0000 (20:14 -0600)]
[FIX] Ordering problem accidentally introduced with D72304
Craig Topper [Sun, 9 Feb 2020 01:46:59 +0000 (17:46 -0800)]
[X86] Add the recently added (V)CVTSS2SI/CVTSD2SI instructions used for LRINT/LLRINT to the load folding tables.
Johannes Doerfert [Sun, 9 Feb 2020 00:58:16 +0000 (18:58 -0600)]
[FIX] Fix warning in LazyCallGraphTest caused by D70927
fady [Sun, 9 Feb 2020 00:54:08 +0000 (18:54 -0600)]
[OpenMP][OMPIRBuilder] Add Directives (master and critical) to OMPBuilder.
Add support for Master and Critical directive in the OMPIRBuilder. Both make use of a new common interface for emitting inlined OMP regions called `emitInlinedRegion` which was added in this patch as well.
Also this patch modifies clang to use the new directives when `-fopenmp-enable-irbuilder` commandline option is passed.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D72304
Johannes Doerfert [Sun, 9 Feb 2020 00:42:24 +0000 (18:42 -0600)]
[OpenMP][Opt] Delete terminating and read-only parallel regions
Parallel regions known to be read-only, e.g., after we removed all dead
write accesses, and terminating (`willreturn`) can be removed.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D69954
Johannes Doerfert [Sun, 9 Feb 2020 00:03:40 +0000 (18:03 -0600)]
[OpenMP][Opt] Annotate known runtime functions and deduplicate more
This adds ~27 more runtime calls to the OpenMPKinds.def file, all with
attributes. We deduplicate 16 of those automatically in function =
thread scope. And we annotate all of them automatically during the
OpenMPOpt discovery step. A test with all omp_XXXX runtime calls to
track annotation coverage is included.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D69984
Nico Weber [Sat, 8 Feb 2020 23:50:51 +0000 (18:50 -0500)]
[gn build] (manually) port
72277ecd62e and the LLVMBuild bit of
9548b74a83
Craig Topper [Sat, 8 Feb 2020 23:52:57 +0000 (15:52 -0800)]
[X86] Use any_fadd/sub/mul/div/sqrt with the AVX512 scalar_*_patterns.
Making sure not to use them with patterns for masked instructions.
Also fix FMA patterns that were matching strict_fma+x86selects to
masked instructions.
River Riddle [Sat, 8 Feb 2020 23:46:02 +0000 (15:46 -0800)]
[mlir][DeclarativeParser] Move several missed parsers over to the declarative form.
Differential Revision: https://reviews.llvm.org/D74283
River Riddle [Sat, 8 Feb 2020 18:01:17 +0000 (10:01 -0800)]
[mlir][DeclarativeParser] Add support for attributes with buildable types.
This revision adds support in the declarative assembly form for printing attributes with buildable types without the type, and moves several more parsers over to the declarative form.
Differential Revision: https://reviews.llvm.org/D74276
Dmitry Murygin [Sat, 8 Feb 2020 23:07:35 +0000 (15:07 -0800)]
[mlir][quantizer] Add gathering of per-axis statistics in quantizer.
Reviewers: stellaraccident, nicolasvasilache
Reviewed By: stellaraccident
Subscribers: Joonsoo, merge_guards_bot, denis13
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73556
River Riddle [Sat, 8 Feb 2020 23:01:34 +0000 (15:01 -0800)]
[mlir] Add support for generating debug locations from intermediate levels of the IR.
Summary:
This revision adds a utility to generate debug locations from the IR during compilation, by snapshotting to an output stream and using the locations that operations were dumped in that stream. The new locations may either:
* Replace the original location of the operation.
old:
loc("original_source.cpp":1:1)
new:
loc("snapshot_source.mlir":10:10)
* Fuse with the original locations as NamedLocs with a specific tag.
old:
loc("original_source.cpp":1:1)
new:
loc(fused["original_source.cpp":1:1, "snapshot"("snapshot_source.mlir":10:10)])
This feature may be used by a debugger to display the code at various different levels of the IR. It would also be able to show the different levels of IR attached to a specific source line in the original source file.
This feature may also be used to generate locations for operations generated during compilation, that don't necessarily have a user source location to attach to.
This requires changes in the printer to track the locations of operations emitted in the stream. Moving forward we need to properly (and efficiently) track the number of newlines emitted to the stream during printing.
Differential Revision: https://reviews.llvm.org/D74019
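The two modes described above can be sketched as follows. This is a toy model in Python, not the MLIR utility: it assumes every operation prints on exactly one line at column 1, and the function name `snapshot_locations` and its parameters are hypothetical.

```python
def snapshot_locations(ops, snapshot_file, fuse_tag=None):
    """Compute new location strings for operations printed one per line.

    ops: list of (printed_text, original_location) pairs, where the original
    location is written like '"original_source.cpp":1:1'.
    """
    new_locs = []
    for line_no, (_text, orig_loc) in enumerate(ops, start=1):
        snap = f'"{snapshot_file}":{line_no}:1'
        if fuse_tag is None:
            # Replace the original location outright.
            new_locs.append(f"loc({snap})")
        else:
            # Fuse the snapshot location with the original under a tag.
            new_locs.append(f'loc(fused[{orig_loc}, "{fuse_tag}"({snap})])')
    return new_locs

ops = [("op.a", '"original_source.cpp":1:1'),
       ("op.b", '"original_source.cpp":2:5')]
replaced = snapshot_locations(ops, "snapshot_source.mlir")
fused = snapshot_locations(ops, "snapshot_source.mlir", fuse_tag="snapshot")
```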
Fangrui Song [Sat, 8 Feb 2020 22:18:43 +0000 (14:18 -0800)]
[gn build] Add OpenMPOpt.cpp to LLVMipo after D69930/
9548b74a831e
Fangrui Song [Sat, 8 Feb 2020 22:10:29 +0000 (14:10 -0800)]
[ELF] Simplify parsing of version dependency. NFC
Simon Pilgrim [Sat, 8 Feb 2020 21:24:01 +0000 (21:24 +0000)]
Fix test name typo
Nikita Popov [Sat, 8 Feb 2020 19:41:10 +0000 (20:41 +0100)]
[InstCombine] Refactor foldICmpAndShift(); NFCI
Separate out handling for shl, lshr and ashr. The combined handling
obscured some overly pessimistic requirements for the transform.
Johannes Doerfert [Sat, 8 Feb 2020 21:22:03 +0000 (15:22 -0600)]
[FIX] Update PM tests after D69930 landed
Simon Pilgrim [Sat, 8 Feb 2020 21:02:04 +0000 (21:02 +0000)]
[X86][SSE] Add test cases from PR44379
Simon Pilgrim [Sat, 8 Feb 2020 20:44:41 +0000 (20:44 +0000)]
[X86] Test showing inability to combine ROTLI/ROTRI rotations into shuffles
One of many things necessary to fix PR44379 (lowering shuffles to rotations)
Johannes Doerfert [Thu, 7 Nov 2019 05:20:06 +0000 (23:20 -0600)]
[OpenMP] Introduce the OpenMPOpt transformation pass
The OpenMPOpt pass is a CGSCC pass in which OpenMP specific
optimizations can reside.
The OpenMPOpt pass uses the OpenMPKinds.def file to identify runtime
calls and their uses. This allows targeted transformations and eases
their implementation.
This initial patch deduplicates `__kmpc_global_thread_num` and
`omp_get_thread_num` calls. We can also identify arguments that are
equivalent to such a call result and use it instead. Later we can
determine "gtid" arguments based on the use in kernel functions etc.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D69930
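A rough sketch of the deduplication idea, for illustration only. It treats a function as a flat list of calls and ignores everything the real pass must consider (dominance, thread scope, argument equivalence); the data layout and the name `deduplicate_calls` are assumptions of this example.

```python
DEDUP_CALLEES = {"__kmpc_global_thread_num", "omp_get_thread_num"}

def deduplicate_calls(calls):
    """Keep the first call to each deduplicable runtime function.

    calls: list of (result_name, callee) pairs in program order.
    Returns (kept_calls, replacements), where replacements maps the result of
    each removed call to the result of the call that replaces it.
    """
    first_result = {}
    kept, replacements = [], {}
    for result, callee in calls:
        if callee in DEDUP_CALLEES:
            if callee in first_result:
                replacements[result] = first_result[callee]
                continue
            first_result[callee] = result
        kept.append((result, callee))
    return kept, replacements

calls = [("%a", "omp_get_thread_num"),
         ("%b", "printf"),
         ("%c", "omp_get_thread_num"),
         ("%d", "printf")]
kept, replacements = deduplicate_calls(calls)
```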
Johannes Doerfert [Fri, 29 Nov 2019 19:11:24 +0000 (13:11 -0600)]
Introduce a CallGraph updater helper class
The CallGraphUpdater is a helper that simplifies the process of updating
the call graph, both old and new style, while running a CGSCC pass.
The uses are contained in different commits, e.g. D70767.
More functionality is added as we need it.
Reviewed By: modocache, hfinkel
Differential Revision: https://reviews.llvm.org/D70927
George Burgess IV [Wed, 5 Feb 2020 06:10:39 +0000 (22:10 -0800)]
[SimplifyLibCalls] Add __strlen_chk.
Bionic has had `__strlen_chk` for a while. Optimizing that into a
constant is quite profitable, when possible.
Differential Revision: https://reviews.llvm.org/D74079
Nikita Popov [Sun, 2 Feb 2020 16:40:15 +0000 (17:40 +0100)]
[InstCombine] Fix infinite min/max canonicalization loop (PR44541)
While D72944 also fixes https://bugs.llvm.org/show_bug.cgi?id=44541,
it does so in a more roundabout manner and there might be other
loopholes to trigger the same issue. This is a more direct fix,
that prevents the transform if the min/max is based on a
non-canonical sub X, 0 instruction.
Differential Revision: https://reviews.llvm.org/D73849
River Riddle [Sat, 8 Feb 2020 18:44:15 +0000 (10:44 -0800)]
[mlir] Add a utility method on CallOpInterface for resolving the callable.
Summary: This is the most common operation performed on a CallOpInterface. This just moves the existing functionality from the CallGraph so that other users can access it.
Differential Revision: https://reviews.llvm.org/D74250
Nicolas Vasilache [Sat, 8 Feb 2020 16:16:22 +0000 (11:16 -0500)]
[mlir][EDSC] NFC - Move StructuredIndexed and IteratorType out of Linalg
Summary:
This NFC revision will allow those classes to be reused to allow
building structured vector operations.
Reviewers: aartbik, ftynse
Subscribers: arphaman, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, arpith-jacob, mgester, lucyrfox, liufengdb, Joonsoo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74279
River Riddle [Sat, 8 Feb 2020 18:40:00 +0000 (10:40 -0800)]
[mlir] Add a document detailing the design of the SymbolTable.
Summary: This document provides insight on the rationale and the design of Symbols in MLIR, and why they are necessary.
Differential Revision: https://reviews.llvm.org/D73590
Craig Topper [Sat, 8 Feb 2020 07:23:42 +0000 (23:23 -0800)]
[LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded integer results from ReplaceNodeResults instead of just returning two results.
Remove code from LegalizeTypes that allowed this to work.
We were already using BUILD_PAIR for this in some places so this
standardizes on a single way to do this.
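As a rough picture of what returning one BUILD_PAIR means, the sketch below expands a 64-bit add into 32-bit pieces and then combines the low and high halves into a single value, instead of handing back the two halves separately. This is a plain-integer model with made-up helper names, not SelectionDAG code.

```python
MASK32 = 0xFFFFFFFF

def build_pair(lo, hi, half_bits):
    """Combine two half-width values into one double-width value."""
    mask = (1 << half_bits) - 1
    return ((hi & mask) << half_bits) | (lo & mask)

def expand_add64(a, b):
    """Add two 64-bit values using 32-bit pieces; return one paired result."""
    lo_sum = (a & MASK32) + (b & MASK32)
    carry = lo_sum >> 32
    hi_sum = ((a >> 32) & MASK32) + ((b >> 32) & MASK32) + carry
    return build_pair(lo_sum, hi_sum, 32)
```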
Simon Pilgrim [Sat, 8 Feb 2020 17:01:04 +0000 (17:01 +0000)]
[X86] X86InstComments - add FMA4 comments
These typically match the FMA3 equivalents, although the multiply operands sometimes get flipped due to the FMA3 permute variants.
Simon Pilgrim [Sat, 8 Feb 2020 16:54:46 +0000 (16:54 +0000)]
[X86] Standardize BROADCAST enum names (PR31079)
Tweak EVEX implementation names so they match the other variants by adding the 'r' prefix. Oddly some of the subvec broadcast ops already matched.
Nikita Popov [Sat, 8 Feb 2020 16:08:42 +0000 (17:08 +0100)]
[InstCombine] Remove unnecessary worklist push; NFCI
This is no longer needed after
d4627b90a0462c90a834c2f7b9c9228b3ec7a45b,
should have dropped it there...
Nikita Popov [Sat, 8 Feb 2020 16:02:10 +0000 (17:02 +0100)]
[InstCombine] Avoid modifying instructions in-place
As discussed on D73919, this replaces a few cases where we were
modifying multiple operands of instructions in-place with the
creation of a new instruction, which we generally prefer nowadays.
This tends to be more readable and less prone to worklist management
bugs.
Test changes are only superficial (instruction naming and order).
Nikita Popov [Sat, 8 Feb 2020 15:57:28 +0000 (16:57 +0100)]
[InstCombine] Use swapValues(); NFC
Less code, and makes it more obvious that these operands do not
need to be added back to the worklist.
Nikita Popov [Sat, 8 Feb 2020 11:04:58 +0000 (12:04 +0100)]
[InstCombine] Fix infinite loop in min/max load/store bitcast combine (PR44835)
Fixes https://bugs.llvm.org/show_bug.cgi?id=44835. Skip the transform
if it wouldn't actually do anything (apart from removing and reinserting
the same instructions).
Note that the test case doesn't loop on current master anymore, only
on the LLVM 10 release branch. The issue is already mitigated on master
due to worklist order fixes, but we should fix the root cause there as well.
As a side note, we should probably assert in combineLoadToNewType()
that it does not combine to the same type. Not doing this here, because
this assertion would also be triggered in another place right now.
Differential Revision: https://reviews.llvm.org/D74278
Simon Pilgrim [Sat, 8 Feb 2020 14:31:06 +0000 (14:31 +0000)]
Regenerate FMA tests
Simon Pilgrim [Sat, 8 Feb 2020 14:01:36 +0000 (14:01 +0000)]
Add missing encoding comments from fma scalar folded intrinsics tests
Benjamin Kramer [Sat, 8 Feb 2020 15:15:09 +0000 (16:15 +0100)]
Put back makeArrayRef to make GCC 5 happy
Simon Pilgrim [Sat, 8 Feb 2020 14:51:10 +0000 (14:51 +0000)]
[X86] Standardize VPSLLDQ/VPSRLDQ enum names (PR31079)
Tweak EVEX implementation names so they match the other variants
Benjamin Kramer [Sat, 8 Feb 2020 14:45:09 +0000 (15:45 +0100)]
Drop some uses of StringLiteral in favor of StringRef
StringRef can be used in constexpr contexts, so StringLiteral isn't
necessary anymore.
serge-sans-paille [Sat, 8 Feb 2020 13:26:22 +0000 (14:26 +0100)]
Revert "Support -fstack-clash-protection for x86"
This reverts commit
e229017732bcf1911210903ee9811033d5588e0d.
Failures:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308
Victor Campos [Sat, 8 Feb 2020 11:59:46 +0000 (11:59 +0000)]
Revert "[ARM] Improve codegen of volatile load/store of i64"
This reverts commit
60e0120c913dd1d4bfe33769e1f000a076249a42.
Igor Kudrin [Thu, 6 Feb 2020 10:08:20 +0000 (17:08 +0700)]
[DebugInfo] Allow reading an address table with a mismatched address.
This case does not look like an unrecoverable error.
Differential Revision: https://reviews.llvm.org/D74194
serge_sans_paille [Mon, 9 Sep 2019 14:59:34 +0000 (16:59 +0200)]
Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.
Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].
This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.
Only implemented for x86.
[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html
This is a recommit of
39f50da2a357a8f685b3540246c5d762734e035f with better option
handling and more portable testing
Differential Revision: https://reviews.llvm.org/D68720
Benjamin Kramer [Sat, 8 Feb 2020 12:27:52 +0000 (13:27 +0100)]
Use heterogeneous lookup for std::map<std::string with a StringRef. NFCI.
Simon Pilgrim [Sat, 8 Feb 2020 11:23:05 +0000 (11:23 +0000)]
Add missing encoding comments from fma4 folded intrinsics tests
Benjamin Kramer [Sat, 8 Feb 2020 11:14:37 +0000 (12:14 +0100)]
ArrayRef'ize spillCalleeSavedRegisters. NFCI.
Simon Pilgrim [Sat, 8 Feb 2020 10:56:27 +0000 (10:56 +0000)]
[X86][SSE] Add X86ISD::FRCP handling to isNegatibleForFree
Peek through X86ISD::FRCP nodes to see if there is a negatible input.
Simon Pilgrim [Sat, 8 Feb 2020 10:40:49 +0000 (10:40 +0000)]
[X86][SSE] Show isNegatibleForFree inability to peek through X86ISD::FRCP
We can safely negate the input of RCP but we can't peek through it.
Simon Pilgrim [Sat, 8 Feb 2020 08:55:51 +0000 (08:55 +0000)]
[TargetLowering] Remove isDesirableToCombineBuildVectorToShuffleTruncate target hook. NFC.
This hasn't been used for years, its original implementation, D35700, had bugs that caused the reversion of most of the code, and since then x86 shuffle lowering/combining has handled most cases and can deal with the rest as well.
Fangrui Song [Sat, 8 Feb 2020 07:30:26 +0000 (23:30 -0800)]
[Driver][test] Create empty file Inputs/basic_cross_linux_tree/usr/x86_64-unknown-linux-gnu/bin/ld.lld
To make lto.c and lto.cu pass for systems that do not have ld.lld
Fangrui Song [Fri, 7 Feb 2020 22:17:36 +0000 (14:17 -0800)]
[Driver] Don't pass -plugin LLVMgold.so when the linker is ld.lld
This does not cover the case when ld is lld (e.g. /usr/bin/ld on
modern FreeBSD systems).
Fangrui Song [Sat, 8 Feb 2020 05:42:14 +0000 (21:42 -0800)]
[Driver][test] Refactor LLVMgold tests
LLVMgold.so tests are duplicated in several places. Deduplicate them.
Move the tests to lto.c and lto.cu
Specify -fuse-ld=bfd or -fuse-ld=gold.
In a future change, if -fuse-ld=lld or CLANG_DEFAULT_LINKER=lld without -fuse-ld=, we will remove -plugin /path/to/LLVMgold.so
Fangrui Song [Sat, 8 Feb 2020 06:27:22 +0000 (22:27 -0800)]
[Driver][test] Fix Driver/hexagon-toolchain-elf.c for -DCLANG_DEFAULT_LINKER=lld builds after
305bf5b21dbdb2345ef86b5700285e42d992c954