Zequan Wu [Wed, 14 Oct 2020 01:40:45 +0000 (18:40 -0700)]
[llvm-cov] don't include all source files when provided source files are filtered out
When all provided source files are filtered out, either by `--ignore-filename-regex` or because they are not part of the binary, don't generate coverage results for all source files. Users who want coverage results for all source files don't need to provide selected source files or `--ignore-filename-regex` in the first place.
Differential Revision: https://reviews.llvm.org/D89359
Rob Suderman [Fri, 16 Oct 2020 00:01:06 +0000 (17:01 -0700)]
[MLIR] Fix gcc5 in D89161
Missing .str() makes gcc5 unable to infer the template to use.
Differential Revision: https://reviews.llvm.org/D89516
Richard Smith [Thu, 15 Oct 2020 23:50:13 +0000 (16:50 -0700)]
Switch the default of VerifyIntegerConstantExpression from constant
folding to not constant folding.
Constant folding of ICEs is done as a GCC compatibility measure, but new
code was picking it up, presumably by accident, due to the bad default.
While here, also switch the flag from a bool to an enum to make it more
obvious what it means at call sites. This highlighted a couple of places
where our behavior is different between C++11 and C++14 due to switching
from checking for an ICE to checking for a converted constant
expression (where there is no 'fold' codepath).
Rob Suderman [Fri, 9 Oct 2020 20:32:01 +0000 (13:32 -0700)]
[mlir] RewriterGen NativeCodeCall matcher with ConstantOp matcher
Added an underlying matcher for generic constant ops. This
included a rewriter of RewriterGen to make variable use more
clear.
Differential Revision: https://reviews.llvm.org/D89161
Vedant Kumar [Tue, 13 Oct 2020 21:35:29 +0000 (21:35 +0000)]
[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting
This patch adds -f[no-]split-cold-code CC1 options to clang. This allows
the splitting pass to be toggled on/off. The current method of passing
`-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose
correctly (say, with `-O0` or `-Oz`).
To implement the -fsplit-cold-code option, an attribute is applied to
functions to indicate that they may be considered for splitting. This
removes some complexity from the old/new PM pipeline builders, and
behaves as expected when LTO is enabled.
Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org>
Differential Revision: https://reviews.llvm.org/D57265
Reviewed By: Aditya Kumar, Vedant Kumar
Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar
Jessica Paquette [Wed, 14 Oct 2020 22:19:52 +0000 (15:19 -0700)]
[AArch64][GlobalISel] NFC: Refactor emitIntegerCompare
Simplify emitIntegerCompare and improve comments + asserts.
Mostly making the code a little easier to follow.
Also, this code is only used for G_ICMP. The legalizer ensures that the LHS/RHS
for every G_ICMP is either an s32 or an s64. So, there's no need to handle anything
else. This lets us remove a bunch of checks for whether or not we successfully
emitted the compare.
Differential Revision: https://reviews.llvm.org/D89433
Amara Emerson [Fri, 9 Oct 2020 17:41:35 +0000 (10:41 -0700)]
[GlobalISel] Remove scalar src from non-sequential fadd/fmul reductions.
It's probably better to split these into separate G_FADD/G_FMUL + G_VECREDUCE
operations in the translator rather than carrying the scalar around. The
majority of the time it'll get simplified away as the scalars are probably
identity values.
Differential Revision: https://reviews.llvm.org/D89150
Thomas Raoux [Thu, 15 Oct 2020 16:47:58 +0000 (09:47 -0700)]
[mlir][vector] Add unrolling patterns for Transfer read/write
Add unroll support for the transfer read and transfer write operations. This
allows picking the ideal memory access size for a given target.
Differential Revision: https://reviews.llvm.org/D89289
David Blaikie [Thu, 15 Oct 2020 22:15:53 +0000 (15:15 -0700)]
Add missing 'override'
Fangrui Song [Thu, 15 Oct 2020 22:11:45 +0000 (15:11 -0700)]
[CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting
rL131311 added `asm()` support for builtin functions, but `asm()` for builtins with
specialized emitting (e.g. memcpy, various math functions) still does not work.
This patch makes these functions work for `asm()` and `#pragma redefine_extname`.
glibc uses `asm()` to redirect internal libc function calls to hidden aliases.
Limitation: such a function is a builtin in clang, but will not be recognized as
a libcall in optimization passes because Clang does not annotate the renamed
function as a libcall. At -O1 or above, GCC can optimize out `abs`, but Clang cannot.
Additionally, we cannot redirect `__builtin_sin` to `real_sin` in the following example:
double sin(double x) asm("real_sin");
double f(double d) { return __builtin_sin(d); }
---
According to @rsmith, the following three statements cannot be simultaneously true:
(1) The frontend function foo has known, builtin semantics X.
(2) The symbol foo has known, builtin semantics X.
(3) It's not correct to lower a call to the frontend function foo to the symbol foo.
People do want (1) (if it is profitable to expand a memcpy, do it).
This also means that people do not want to add -fno-builtin-memcpy.
People do want (3): that is why they use asm("__GI_memcpy") in the first place.
So unfortunately we make a compromise by not refuting (2) (see the limitation above).
For most libcalls, there is a small loss because compilers don't synthesize them.
For the few glibc cares about, it uses `asm("memcpy = __GI_memcpy");` to make
the assembly level redirection.
(Changing function names (e.g. `__memcpy`) is a hit to ergonomics which is not acceptable).
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D88712
Reid Kleckner [Wed, 14 Oct 2020 02:58:39 +0000 (19:58 -0700)]
[MS] Apply `inreg` to AArch64 sret parms on instance methods
The documentation rules indicate that instance methods should return
large, trivially copyable aggregates via X1/X0 and not X8 as is normally
done when returning such structs from free functions:
https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values
Fixes PR47836, a bug in the initial implementation of these rules.
I tried to simplify the logic a bit as well while I'm here.
Differential Revision: https://reviews.llvm.org/D89362
Jim Ingham [Thu, 15 Oct 2020 21:28:06 +0000 (14:28 -0700)]
Add an SB API to get the SBTarget from an SBBreakpoint
Differential Revision: https://reviews.llvm.org/D89358
Kazushi (Jam) Marukawa [Thu, 15 Oct 2020 14:35:34 +0000 (23:35 +0900)]
[VE] Add VGT/VSC/PFCHV instructions
Add VGT/VSC/PFCHV vector instructions and regression tests.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89471
Kazushi (Jam) Marukawa [Sun, 11 Oct 2020 08:33:47 +0000 (17:33 +0900)]
[VE] Support fabs/fcos/fsin/fsqrt math functions
VE doesn't have instructions for fabs/fcos/fsin/fsqrt, so expand them.
Also add regression tests and update the fcopysign regression test.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D89457
Yaxun (Sam) Liu [Thu, 15 Oct 2020 20:07:38 +0000 (16:07 -0400)]
Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024"
This reverts commit 187658b8a6112446d9e7797d495bc7542ac83905 due to
AMDGPU backend issues.
Leonard Chan [Thu, 15 Oct 2020 21:21:56 +0000 (14:21 -0700)]
Revert "[clang] Add -fc++-abi= flag for specifying which C++ ABI to use"
This reverts commits 683b308c07bf827255fe1403056413f790e03729 and
8487bfd4e9ae186f9f588ef989d27a96cc2438c9.
We will go for a more restricted approach that does not let everyone
change the ABI on any platform.
See the discussion on https://reviews.llvm.org/D85802.
Thomas Lively [Thu, 15 Oct 2020 21:18:22 +0000 (21:18 +0000)]
[WebAssembly] Prototype i8x16.popcnt
As proposed at https://github.com/WebAssembly/simd/pull/379. Use a target
builtin and intrinsic rather than normal codegen patterns to make the
instruction opt-in until it is merged to the proposal and stabilized in engines.
Differential Revision: https://reviews.llvm.org/D89446
Jameson Nash [Thu, 15 Oct 2020 20:57:21 +0000 (16:57 -0400)]
fix symbol printing on windows
Similar to MCSymbol::print in 3d6c8ebb584375d01b1acead4c2056b3f0c501fc
(llvm-svn: 81682, PR4966), these symbols may need to be quoted to be handled by
the linker correctly.
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D87099
Florian Hahn [Thu, 15 Oct 2020 20:50:56 +0000 (21:50 +0100)]
[LoopVersion] Unify SCEVChecks and alias check handling (NFC).
This is an initial cleanup of the way LoopVersioning interacts with LAA.
Currently LoopVersioning has 2 ways of initializing things:
1. Passing LAI and passing UseLAIChecks = true
2. Passing UseLAIChecks = false, followed by calling setSCEVChecks and
setAliasChecks.
Both ways of initializing lead to the same result and the duplication
seems more complicated than necessary.
This patch removes the UseLAIChecks flag from the constructor and the
setSCEVChecks & setAliasChecks helpers and moves initialization
exclusively to the constructor.
This simplifies things, by providing a single way to initialize
LoopVersioning and reducing duplication.
Reviewed By: Meinersbur, lebedev.ri
Differential Revision: https://reviews.llvm.org/D84406
Yitzhak Mandelbaum [Thu, 15 Oct 2020 14:32:13 +0000 (14:32 +0000)]
[libTooling] Change `after` range-selector to operate only on source ranges
Currently, `after` fails when applied to locations in macro arguments. This
change projects the subrange into a file source range and then applies `after`.
Differential Revision: https://reviews.llvm.org/D89468
Richard Smith [Thu, 15 Oct 2020 20:32:00 +0000 (13:32 -0700)]
PR47864: Fix assertion in pointer-to-member emission if there are
multiple declarations of the same base class.
Michael Jones [Mon, 12 Oct 2020 17:03:19 +0000 (17:03 +0000)]
[libc] Use entrypoints.txt as the single source of list of functions for a platform.
The function listings in api.td are removed. The same lists are now deduced using the information
in entrypoints.txt.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D89267
alex-t [Wed, 14 Oct 2020 15:34:07 +0000 (18:34 +0300)]
[AMDGPU] SILowerControlFlow::removeMBBifRedundant should not try to change MBB layout if it can fallthrough
removeMBBifRedundant normally tries to preserve the predecessors' fallthrough when removing a redundant MBB.
To do so, it has to change the MBB layout so that the new successor immediately follows the predecessor of the removed MBB.
This is only allowed if the new successor itself has no successor to which it falls through.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D89397
Roman Lebedev [Thu, 15 Oct 2020 18:41:01 +0000 (21:41 +0300)]
[NFC][IndVars] Autogenerate check lines in tests being affected by upcoming patch
Roman Lebedev [Thu, 15 Oct 2020 10:11:14 +0000 (13:11 +0300)]
[NFC][LSR] Autogenerate check lines in tests being affected by upcoming patch
Roman Lebedev [Thu, 15 Oct 2020 10:11:02 +0000 (13:11 +0300)]
[NFC][SCEV] Autogenerate check lines in tests being affected by upcoming patch
Nico Weber [Thu, 15 Oct 2020 20:14:09 +0000 (16:14 -0400)]
[gn build] Remove phantom WebAssembly tablegen() calls
Apparently I added these in https://reviews.llvm.org/rL350628 but
I'm not sure why. They never existed in the CMake build, and now
they're causing trouble.
Evgenii Stepanov [Fri, 2 Oct 2020 20:09:13 +0000 (13:09 -0700)]
[AArch64] Stack frame reordering.
Implement stack frame reordering in the AArch64 backend.
Unlike the X86 implementation, AArch64 does not seem to benefit from
"access density" based frame reordering, mainly because it has a much
smaller variety of addressing modes, and the fact that all instructions
are 4 bytes so each frame object is either in range of an instruction
(and then the access is "free") or not (and that has a code size cost
of 4 bytes).
This change improves Memory Tagging codegen by
* Placing an object that has been chosen as the base tagged pointer of
the function at SP + 0. This saves one instruction to setup the pointer
(IRG does not have an offset immediate), and more because that object
can now be referenced without materializing its tagged address in a
scratch register.
* Placing objects that go out of scope simultaneously together. This
exposes opportunities for instruction merging in tryMergeAdjacentSTG.
Differential Revision: https://reviews.llvm.org/D72366
Evgenii Stepanov [Fri, 10 Apr 2020 22:34:11 +0000 (15:34 -0700)]
[MTE] Pin the tagged base pointer to one of the stack slots.
Summary:
Pin the tagged base pointer to one of the stack slots, and (if
necessary) rewrite tag offsets so that an object that occupies that
slot has both address and tag offsets of 0. This allows ADDG
instructions for that object to be eliminated and their uses replaced
with the tagged base pointer itself.
This optimization must be done in machine instructions and not in the IR
instrumentation pass, because referring to a stack slot through an IRG
pointer would confuse the stack coloring pass.
The optimization makes a (pretty naive) attempt to find the slot that
would benefit the most by counting the uses of stack slots in the
function.
Reviewers: ostannard, pcc
Subscribers: merge_guards_bot, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72365
Stanislav Mekhanoshin [Thu, 15 Oct 2020 17:48:46 +0000 (10:48 -0700)]
[AMDGPU] gfx1032 target
Differential Revision: https://reviews.llvm.org/D89487
Thomas Lively [Thu, 15 Oct 2020 19:32:34 +0000 (19:32 +0000)]
Reland "[WebAssembly] v128.load{8,16,32,64}_lane instructions"
This reverts commit 7c8385a352ba21cb388046290d93b53dc273cd9f with a typing fix
to an instruction selection pattern.
Erik Pilkington [Thu, 15 Oct 2020 18:05:01 +0000 (14:05 -0400)]
[SemaObjC] Fix composite pointer type calculation for `void*` and pointer to lifetime qualified ObjC pointer type
Fixes a regression introduced in 9a6f4d451ca7. rdar://70101809
Differential revision: https://reviews.llvm.org/D89475
Sean Silva [Wed, 14 Oct 2020 18:26:22 +0000 (11:26 -0700)]
[mlir] Add std.tensor_to_memref op and teach the infra about it
The opposite of tensor_to_memref is tensor_load.
- Add some basic tensor_load/tensor_to_memref folding.
- Add source/target materializations to BufferizeTypeConverter.
- Add an example std bufferization pattern/pass that shows how the
materializations work together (more std bufferization patterns to come
in subsequent commits).
- In coming commits, I'll document how to write composable
bufferization passes/patterns and update the other in-tree
bufferization passes to match this convention. The populate* functions
will of course continue to be exposed for power users.
The naming of tensor_load/tensor_to_memref and their pretty forms is
not very intuitive. I'm open to any suggestions here. One key
observation is that the memref type must always be the one specified in
the pretty form, since the tensor type can be inferred from the memref
type but not vice-versa.
With this, I've been able to replace all my custom bufferization type
converters in npcomp with BufferizeTypeConverter!
Part of the plan discussed in:
https://llvm.discourse.group/t/what-is-the-strategy-for-tensor-memref-conversion-bufferization/1938/17
Differential Revision: https://reviews.llvm.org/D89437
Sean Silva [Thu, 15 Oct 2020 00:07:58 +0000 (17:07 -0700)]
[mlir] Fix typo in LangRef
Martin Storsjö [Tue, 6 Oct 2020 10:54:49 +0000 (13:54 +0300)]
Reapply [LLD] [COFF] Implement a GNU/ELF like -wrap option
Add a simple forwarding option in the MinGW frontend, and implement
the private -wrap option in the COFF linker.
The feature in lld-link isn't gated by the -lldmingw option, but
the option is left as a private, undocumented option primarily
used by the MinGW driver.
The implementation is significantly based on the support for --wrap
in the ELF linker, but many small nuance details are different
between the ELF and COFF linkers, ending up with more than a few
implementation differences.
This fixes https://bugs.llvm.org/show_bug.cgi?id=47384.
Differential Revision: https://reviews.llvm.org/D89004
Reapplied with the bitfield member canInline fixed so it doesn't break
builds targeting windows.
Sanjay Patel [Thu, 15 Oct 2020 18:35:39 +0000 (14:35 -0400)]
[InstCombine] update tests for logic folds to exercise commuted patterns; NFC
This was the intent for D88551.
I also varied the types a bit for extra coverage
and tried to make better test/value names.
Anh Tuyen Tran [Thu, 15 Oct 2020 18:37:29 +0000 (18:37 +0000)]
[NFC][CaptureTracking] Move static function isNonEscapingLocalObject to llvm namespace
Function isNonEscapingLocalObject is a static one within BasicAliasAnalysis.cpp.
It wraps around PointerMayBeCaptured of CaptureTracking, checking whether a pointer
is to a function-local object, which never escapes from the function.
Although at the moment, isNonEscapingLocalObject is used only by BasicAliasAnalysis,
its functionality can be used by other pass(es), one of which I will put up for review
very soon. Instead of copying the contents of this static function, I move it to llvm
scope, and place it amongst other functions with similar functionality in CaptureTracking.
The rationale for the location are:
- Pointer escape and pointer being captured are actually two sides of the same coin
- isNonEscapingLocalObject is wrapping around another function in CaptureTracking
Reviewed By: jdoerfert (Johannes Doerfert)
Differential Revision: https://reviews.llvm.org/D89465
Konstantin Zhuravlyov [Thu, 15 Oct 2020 18:01:20 +0000 (14:01 -0400)]
Make sure both cc1 and cc1as process -m[no-]code-object-v3
Differential Revision: https://reviews.llvm.org/D89478
Louis Dionne [Thu, 15 Oct 2020 17:27:27 +0000 (13:27 -0400)]
[libc++] Reduce dependencies on <iostream> from <random>
We included <istream> and <ostream> from <random>, but really it is
sufficient to include <iosfwd> if we make sure we access ios_base
members through a dependent type. This allows us to break a hard
dependency of <random> on locales.
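The dependent-type technique mentioned above can be sketched roughly as follows. This is a minimal illustration, not the actual libc++ change: `write_decimal` and `format_decimal` are hypothetical names. Because the `ios_base` members are named through the dependent type `Ostream`, the first part needs only `<iosfwd>`; the full stream definition is required only where the template is instantiated.

```cpp
// "Header" part: only forward declarations of the stream templates are visible.
#include <iosfwd>

// Hypothetical helper (not a libc++ function). ios_base members are reached
// through the dependent type Ostream, so their lookup is deferred until the
// template is instantiated.
template <class CharT, class Traits>
std::basic_ostream<CharT, Traits>&
write_decimal(std::basic_ostream<CharT, Traits>& os, unsigned value) {
    typedef std::basic_ostream<CharT, Traits> Ostream;
    os.flags(Ostream::dec | Ostream::left);  // replaces the current format flags
    return os << value;
}

// "User" part: the full definitions are pulled in where the template is used.
#include <sstream>
#include <string>

// Hypothetical caller that formats a value through the helper above.
std::string format_decimal(unsigned value) {
    std::ostringstream os;
    os << std::hex;            // deliberately start in hex mode
    write_decimal(os, value);  // the helper switches the stream to decimal
    return os.str();
}
```

A translation unit that never instantiates `write_decimal` does not need the complete stream classes, and so does not need the locale machinery they depend on.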
peter klausler [Wed, 14 Oct 2020 22:57:49 +0000 (15:57 -0700)]
[flang][msvc] Avoid a reinterpret_cast
The call to the binary->decimal formatter in real.cpp was cheating
by using a reinterpret_cast<> to extract its binary value.
Use a more principled and portable approach by extending the
API of evaluate::Integer<> to include ToUInt<>()/ToSInt<>()
member function templates that do the "right" thing. Retain
ToUInt64()/ToSInt64() for compatibility.
Differential revision: https://reviews.llvm.org/D89435
Nicolas Vasilache [Thu, 15 Oct 2020 17:29:50 +0000 (17:29 +0000)]
[mlir][Linalg] NFC - Rename test files s/_/-/g
Arthur Eubanks [Thu, 15 Oct 2020 17:25:56 +0000 (10:25 -0700)]
Revert "[LLD] [COFF] Implement a GNU/ELF like -wrap option"
This reverts commit a012c704b5e5b60f9d2a7304d27cbc84a3619571.
Breaks Windows builds.
C:\src\llvm-mint\lld\COFF\Symbols.cpp(26,1): error: static_assert failed due to requirement 'sizeof(lld::coff::SymbolUnion) <= 48' "symbols should be optimized for memory usage"
static_assert(sizeof(SymbolUnion) <= 48,
David Green [Thu, 15 Oct 2020 17:21:41 +0000 (18:21 +0100)]
[LV] Add a getRecurrenceBinOp and make use of it. NFC
Louis Dionne [Thu, 15 Oct 2020 17:14:22 +0000 (13:14 -0400)]
[libc++][filesystem] Only include <fstream> when we actually need it in copy_file_impl
This allows building <filesystem> on systems that don't support <fstream>,
such as systems that don't support localization.
Sanjay Patel [Thu, 15 Oct 2020 17:12:38 +0000 (13:12 -0400)]
[CostModel] remove cost-kind predicate for ctlz/cttz intrinsics in basic TTI implementation
The cost modeling for intrinsics is a patchwork based on different
expectations from the callers, so it's a mess. I'm hoping to untangle
this to allow canonicalization to the new min/max intrinsics in IR.
The general goal is to remove the cost-kind restriction here in the
basic implementation class. Ie, if some intrinsic has throughput cost
of 104, assume that it has the same size, latency, and blended costs.
Effectively, an intrinsic with cost N is composed of N simple
instructions. If that's not correct, the target should provide a more
accurate override.
The x86-64 SSE2 subtarget cost diffs require explanation:
1. The scalar ctlz/cttz are assuming "BSR+XOR+CMOV" or
"TEST+BSF+CMOV/BRANCH", so not cheap.
2. The 128-bit SSE vector width versions assume cost of 18 or 26
(no explanation provided in the tables, but this corresponds to a
bunch of shift/logic/compare).
3. The 512-bit vectors in the test file are scaled up by a factor of
4 from the legal vector width costs.
4. The plain latency cost-kind is not affected in this patch because
that calc is diverted before we get to getIntrinsicInstrCost().
Differential Revision: https://reviews.llvm.org/D89461
Hiroshi Yamauchi [Fri, 2 Oct 2020 20:00:40 +0000 (13:00 -0700)]
[PGO] Remove the old memop value profiling buckets.
Following up D81682 and D83903, remove the code for the old value profiling
buckets, which have been replaced with the new, extended buckets and disabled by
default.
Also syncing InstrProfData.inc between compiler-rt and llvm.
Differential Revision: https://reviews.llvm.org/D88838
Louis Dionne [Thu, 15 Oct 2020 16:54:50 +0000 (12:54 -0400)]
[libc++] NFC: Remove unused include
Louis Dionne [Thu, 15 Oct 2020 14:32:09 +0000 (10:32 -0400)]
[libc++] Allow building libc++ on platforms without a random device
Some platforms, like several embedded platforms, do not provide a source
of randomness through a random device. This commit makes it possible to
build and test libc++ for such platforms, i.e. without std::random_device.
Surprisingly, the only functionality that doesn't work on such platforms
is std::random_device itself -- everything else in <random> still works,
one just has to find alternative ways to seed the PRNGs.
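One possible way to seed a PRNG without `std::random_device` is sketched below. This is only an illustration under stated assumptions: `make_engine` and `roll_die` are hypothetical helpers, and the two seed words stand for whatever entropy the platform can offer (a hardware counter, a serial number, or a fixed value in tests).

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper: build a PRNG from caller-supplied seed words instead
// of std::random_device. Where the words come from is platform-specific.
inline std::mt19937 make_engine(std::uint32_t a, std::uint32_t b) {
    std::seed_seq seq{a, b};   // mixes the words into the engine's full state
    return std::mt19937(seq);
}

// Draw one value in [1, 6] from the given engine.
inline int roll_die(std::mt19937& engine) {
    std::uniform_int_distribution<int> dist(1, 6);
    return dist(engine);
}
```

Seeding with the same words always yields the same sequence, which is convenient for tests but means real applications need a source that varies between runs.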
Sanjay Patel [Thu, 15 Oct 2020 15:49:58 +0000 (11:49 -0400)]
[x86] add no 'unwind' to reduce test noise; NFC
I suggested this in D89412, but the comment was missed.
Thomas Lively [Thu, 15 Oct 2020 15:49:36 +0000 (15:49 +0000)]
Revert "[WebAssembly] v128.load{8,16,32,64}_lane instructions"
This reverts commit 7c6bfd90ab2ddaa60de62878c8512db0645e8452.
Michał Górny [Wed, 14 Oct 2020 17:17:42 +0000 (19:17 +0200)]
[lldb] [Process/FreeBSDRemote] Initial multithreading support
Implement initial support for watching thread creation and termination.
Update ptrace() calls to correctly indicate requested thread.
Watchpoints are not supported yet.
This patch fixes at least multithreaded register tests.
Differential Revision: https://reviews.llvm.org/D89413
Martin Storsjö [Tue, 6 Oct 2020 10:54:49 +0000 (13:54 +0300)]
[LLD] [COFF] Implement a GNU/ELF like -wrap option
Add a simple forwarding option in the MinGW frontend, and implement
the private -wrap option in the COFF linker.
The feature in lld-link isn't gated by the -lldmingw option, but
the option is left as a private, undocumented option primarily
used by the MinGW driver.
The implementation is significantly based on the support for --wrap
in the ELF linker, but many small nuance details are different
between the ELF and COFF linkers, ending up with more than a few
implementation differences.
This fixes https://bugs.llvm.org/show_bug.cgi?id=47384.
Differential Revision: https://reviews.llvm.org/D89004
Martin Storsjö [Wed, 7 Oct 2020 07:46:29 +0000 (10:46 +0300)]
[LLD] [COFF] Fix a condition that was missed in 7f0e6c31c255. NFC.
This should fix cases where e.g. auto import is enabled without
mingw mode as a whole being enabled.
Differential Revision: https://reviews.llvm.org/D89006
Thomas Lively [Thu, 15 Oct 2020 15:33:10 +0000 (15:33 +0000)]
[WebAssembly] v128.load{8,16,32,64}_lane instructions
Prototype the newly proposed load_lane instructions, as specified in
https://github.com/WebAssembly/simd/pull/350. Since these instructions are not
available to origin trial users on Chrome stable, make them opt-in by only
selecting them from intrinsics rather than normal ISel patterns. Since we only
need rough prototypes to measure performance right now, this commit does not
implement all the load and store patterns that would be necessary to make full
use of the offset immediate. However, the full suite of offset tests is included
to make it easy to track improvements in the future.
Since these are the first instructions to have a memarg immediate as well as an
additional immediate, the disassembler needed some additional hacks to be able
to parse them correctly. Making that code more principled is left as future
work.
Differential Revision: https://reviews.llvm.org/D89366
sunshaoce [Thu, 15 Oct 2020 15:16:53 +0000 (23:16 +0800)]
[RISCV] fix a mistake in RISCVInstrInfoV.td
A commit of VALUVVNoVm was wrong, fixed it.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D88142
Simon Pilgrim [Thu, 15 Oct 2020 15:03:34 +0000 (16:03 +0100)]
[InstCombine] Use m_SpecificInt instead of m_APInt + comparison. NFCI.
Simon Pilgrim [Thu, 15 Oct 2020 14:23:34 +0000 (15:23 +0100)]
[InstCombine] SimplifyDemandedUseBits - xor - refactor cast<ConstantInt> usage to PatternMatch. NFCI.
First step towards replacing these to add full vector support.
Simon Pilgrim [Thu, 15 Oct 2020 14:09:57 +0000 (15:09 +0100)]
[InstCombine] InstCombineAndOrXor - refactor cast<ConstantInt> usages to PatternMatch. NFCI.
First step towards replacing these to add full vector support.
JonChesterfield [Thu, 15 Oct 2020 14:43:39 +0000 (15:43 +0100)]
[openmp][libomptarget] Include header from LLVM source tree
The change is to the amdgpu plugin so is unlikely to break anything.
The point of contention is whether libomptarget can depend on LLVM.
A community discussion was cautiously not opposed yesterday.
This introduces a compile time dependency on the LLVM source tree, in this case
expressed as skipping the building of the plugin if LLVM_MAIN_INCLUDE_DIR is not
set. One of the source files will #include llvm/Frontend/OpenMP/OMPGridValues.h,
instead of copy&pasting the numbers across.
For users that download the monorepo, the llvm tree is already on disk. This will
inconvenience users who download only the openmp source as a tar, as they would
now also have to download (at least a file or two) from the llvm source, if they want
to build the parts of the openmp project that (post this patch) depend on llvm.
There was interest expressed in going further - using llvm tools as part of
building libomp, or linking against llvm libraries. That seems less clear cut
an improvement and worthy of further discussion. This patch seeks only to change
policy to support openmp depending on the llvm source tree. Including in the
other direction, or using libraries / tools etc, are purposefully out of scope.
Reviewers are a best guess at interested parties, please feel free to add others
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D87841
Stephan Herhut [Thu, 15 Oct 2020 14:27:31 +0000 (16:27 +0200)]
[mlir][standard] Fix parsing of scalar subview and canonicalize
Parsing of a scalar subview did not create the required static_offsets attribute.
This also adds support for folding scalar subviews away.
Differential Revision: https://reviews.llvm.org/D89467
JonChesterfield [Thu, 15 Oct 2020 14:41:11 +0000 (15:41 +0100)]
[NFC] Fix license header from D87841
Paul C. Anagnostopoulos [Mon, 12 Oct 2020 17:35:23 +0000 (13:35 -0400)]
[TableGen] Add the !not and !xor operators.
Update the TableGen Programmer's Reference.
Paul C. Anagnostopoulos [Tue, 13 Oct 2020 16:40:45 +0000 (12:40 -0400)]
[RISCV] [TableGen] Modify RISCVCompressInstEmitter.cpp to use getAllDerivedDefinitions().
Jeremy Morse [Thu, 15 Oct 2020 14:05:59 +0000 (15:05 +0100)]
Add "not" to an llvm-symbolizer test that expects to fail
In a7b209a6d40d77b, llvm-symbolizer was adjusted to return a failure status
code when it produced an error, to flag up DWARF parsing problems. The
test for missing PDB file is analogous, and returns a failure status now
too.
This should fix the llvm-clang-win-x-armv7l buildbot croaking:
http://lab.llvm.org:8011/#/builders/60/builds/77
Matt Arsenault [Wed, 14 Oct 2020 22:10:54 +0000 (18:10 -0400)]
AMDGPU: Fix verifier error on killed spill of partially undef register
This does unfortunately end up with extra waitcnts getting inserted
that were avoided before. Ideally we would avoid the spills of these
undef components in the first place.
Simon Pilgrim [Thu, 15 Oct 2020 13:16:35 +0000 (14:16 +0100)]
[InstCombine] visitXor - refactor ((X^C1)>>C2)^C3 -> (X>>C2)^((C1>>C2)^C3) fold. NFCI.
This is still ConstantInt-only (scalar) but is refactored to use PatternMatch to make adding vector support in the future relatively trivial.
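The algebra behind this fold can be checked with a small standalone sketch. It assumes a logical (unsigned) right shift on 32-bit values; the function names are hypothetical and this is not the InstCombine code. The identity holds because a logical shift distributes over xor, so the two constants can be combined ahead of time.

```cpp
#include <cstdint>

// Original form: ((X ^ C1) >> C2) ^ C3, using a logical right shift.
inline std::uint32_t before_fold(std::uint32_t x, std::uint32_t c1,
                                 unsigned c2, std::uint32_t c3) {
    return ((x ^ c1) >> c2) ^ c3;
}

// Folded form: (X >> C2) ^ ((C1 >> C2) ^ C3). The inner expression depends
// only on constants, so it can be computed once.
inline std::uint32_t after_fold(std::uint32_t x, std::uint32_t c1,
                                unsigned c2, std::uint32_t c3) {
    const std::uint32_t folded = (c1 >> c2) ^ c3;
    return (x >> c2) ^ folded;
}

// Compares both forms on a few sample values for every shift amount below 32.
inline bool fold_holds_on_samples() {
    const std::uint32_t samples[] = {0u, 1u, 0x80000000u, 0xdeadbeefu,
                                     0xffffffffu};
    for (std::uint32_t x : samples)
        for (std::uint32_t c1 : samples)
            for (std::uint32_t c3 : samples)
                for (unsigned c2 = 0; c2 < 32; ++c2)
                    if (before_fold(x, c1, c2, c3) != after_fold(x, c1, c2, c3))
                        return false;
    return true;
}
```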
Carl Ritson [Thu, 15 Oct 2020 09:40:46 +0000 (18:40 +0900)]
[AMDGPU] Minimize number of s_mov generated by copyPhysReg
Generate the minimal set of s_mov instructions required when
expanding a SGPR copy operation in copyPhysReg.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D89187
Caroline Concatto [Tue, 13 Oct 2020 12:41:54 +0000 (13:41 +0100)]
[SVE]Fix implicit TypeSize casts in EmitCheckValue
Use TypeSize::getFixedSize() instead of relying upon the implicit
TypeSize->uint64_t cast, as the type is always fixed width.
Differential Revision: https://reviews.llvm.org/D89313
Denis Antrushin [Tue, 13 Oct 2020 18:27:14 +0000 (01:27 +0700)]
[Statepoints] Remove MI limit on number of tied operands.
After D87915 statepoint can have more than 15 tied operands.
Remove this restriction from statepoint lowering code.
Andrew Ng [Wed, 23 Sep 2020 14:07:35 +0000 (15:07 +0100)]
[LLD][ELF] Improve ICF for relocations to ineligible sections via "aliases"
ICF was not able to merge equivalent sections because of relocations to
sections ineligible for ICF that use alternative symbols, e.g. symbol
aliases or section relative relocations.
Merging in this scenario has been enabled by giving the sections that
are ineligible for ICF a unique ID, i.e. an equivalence class of their
own. This approach also provides another benefit as it improves the
hashing that is used to perform the initial equivalence grouping for
ICF. This is because the ICF ineligible sections can now contribute a
unique value towards the hashes instead of the same value of zero. This
has been seen to reduce link time with ICF by ~68% for objects compiled
with -fprofile-instr-generate.
In order to facilitate this use of a unique ID, the existing
inconsistent approach to the setting of the InputSection eqClass in ICF
has been changed so that there is a clear distinction between the
eqClass values of ICF eligible sections and those of the ineligible
sections that have a unique ID. This inconsistency could have caused
incorrect equivalence class equality in the past, although it appears
that no issues were encountered in actual use.
Differential Revision: https://reviews.llvm.org/D88830
Serge Guelton [Thu, 15 Oct 2020 08:46:02 +0000 (04:46 -0400)]
[flang] Fix build with BUILD_SHARED_LIBS=ON and FLANG_BUILD_NEW_DRIVER=ON
As usual, it's difficult to handle all the different configurations on the first try,
but this one has been extensively tested.
Differential Revision: https://reviews.llvm.org/D89452
Adrian Kuegel [Thu, 15 Oct 2020 10:48:35 +0000 (12:48 +0200)]
Fix unused variable warning when compiling with asserts disabled.
Differential Revision: https://reviews.llvm.org/D89454
Jeremy Morse [Thu, 15 Oct 2020 10:20:29 +0000 (11:20 +0100)]
[DebugInstrRef] Support recording of instruction reference substitutions
Add a table recording "substitutions" between pairs of <instruction,
operand> numbers, from old pairs to new pairs. Post-isel optimizations are
able to record the outcome of an optimization in this way. For example, if
there were a divide instruction that generated the quotient and remainder,
and it were replaced by one that only generated the quotient:
$rax, $rcx = DIV-AND-REMAINDER $rdx, $rsi, debug-instr-num 1
DBG_INSTR_REF 1, 0
DBG_INSTR_REF 1, 1
Became:
$rax = DIV $rdx, $rsi, debug-instr-num 2
DBG_INSTR_REF 1, 0
DBG_INSTR_REF 1, 1
We could enter a substitution from <1, 0> to <2, 0>, and no substitution
for <1, 1> as it's no longer generated.
This approach means that if an instruction or value is deleted once we've
left SSA form, all variables that used the value implicitly become
"optimized out", something that isn't true of the current DBG_VALUE
approach.
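A minimal sketch of such a table, assuming a plain map keyed on <instruction, operand> pairs. `InstrOperand` and `SubstitutionTable` are hypothetical names for illustration, not the data structure added by the patch.

```cpp
#include <cassert>
#include <map>
#include <utility>

// <instruction number, operand number>, as in the example above.
using InstrOperand = std::pair<unsigned, unsigned>;

// Hypothetical substitution table: maps old pairs to new pairs.
class SubstitutionTable {
  std::map<InstrOperand, InstrOperand> subs;

public:
  // Record that references to `from` should now refer to `to`.
  void add(InstrOperand from, InstrOperand to) {
    assert(from != to && "a self-substitution would never resolve");
    subs[from] = to;
  }

  // Follow recorded substitutions until none applies. Assumes the table
  // contains no cycles.
  InstrOperand resolve(InstrOperand ref) const {
    auto it = subs.find(ref);
    while (it != subs.end()) {
      ref = it->second;
      it = subs.find(ref);
    }
    return ref;
  }
};
```

With the substitution <1, 0> to <2, 0> recorded, resolving <1, 0> yields <2, 0>. Resolving <1, 1> yields <1, 1> unchanged, and since instruction 1 no longer exists, a consumer would treat that variable as optimized out.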
Differential Revision: https://reviews.llvm.org/D85749
Simon Pilgrim [Thu, 15 Oct 2020 10:02:35 +0000 (11:02 +0100)]
[AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector support
Replace m_ConstantInt with m_APInt to support uniform vectors (with no undef elements)
Adding non-undef support would involve some refactoring of the MaskOps struct but this might still be worth it.
Simon Pilgrim [Thu, 15 Oct 2020 09:48:24 +0000 (10:48 +0100)]
[AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector tests
Simon Pilgrim [Thu, 15 Oct 2020 09:22:23 +0000 (10:22 +0100)]
[CodeGen][X86] Emit fshl/fshr ir intrinsics for shiftleft128/shiftright128 ms intrinsics
Now that funnel shift handling is pretty good, we can use the intrinsics directly and avoid a lot of zext/trunc issues.
https://godbolt.org/z/YqhnnM
Differential Revision: https://reviews.llvm.org/D89405
Raphael Isemann [Thu, 15 Oct 2020 07:56:53 +0000 (09:56 +0200)]
[lldb] Explicitly test the template argument SB API
Sebastian Neubauer [Wed, 14 Oct 2020 09:14:20 +0000 (11:14 +0200)]
[AMDGPU] Add objdump invalid metadata testcase
Checks that metadata and invalid message are printed.
Differential Revision: https://reviews.llvm.org/D89375
Denis Antrushin [Fri, 18 Sep 2020 16:09:52 +0000 (23:09 +0700)]
[Statepoints] Unlimited tied operands.
The current limit on the number of tied operands (15) is sometimes too low
for statepoints. We may get a couple dozen gc pointer operands on a
statepoint.
Review D87154 changed the format of statepoints to list every gc pointer
only once, which makes it trivial to find the tiedness relation between
statepoint operands: defs are mapped 1-1 to gc pointer operands passed
in registers.
Reviewed By: skatkov
Differential Revision: https://reviews.llvm.org/D87915
Tyker [Thu, 15 Oct 2020 07:56:53 +0000 (09:56 +0200)]
[NFC] Correct name of profile function to Profile in APValue
Capitalize the profile function of APValue such that it can be used by FoldingSetNodeID
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D88643
Georgii Rymar [Wed, 14 Oct 2020 11:46:21 +0000 (14:46 +0300)]
[yaml2obj] - Allow specifying no tags to create empty sections in few cases.
Currently we have a few sections that
do not support specifying no keys for them. E.g. it is required that one
of the "Content", "Size" or "Entries" keys is present. There is no reason to
have this restriction. We can allow this and emit an empty section instead.
This opens the road for a simplification and generalization of the code in `validate()`
that is discussed in the D89039 thread.
Depends on D89039.
Differential revision: https://reviews.llvm.org/D89391
Guillaume Chatelet [Thu, 15 Oct 2020 08:01:26 +0000 (08:01 +0000)]
[libc][NFC] Add probability distributions for memory function sizes
This patch adds memory function size distributions sampled from different applications running in production.
This will be used to benchmark and compare memory functions implementations.
Differential Revision: https://reviews.llvm.org/D89401
Georgii Rymar [Tue, 6 Oct 2020 12:48:15 +0000 (15:48 +0300)]
[yaml2obj/obj2yaml] - Add support of 'Size' and 'Content' keys for all sections.
Many sections either do not support `Size`/`Content` or support just
one of them, e.g. only `Content`.
`Section` is the base class for sections. This patch adds `Content` and `Size` members
to it and removes similar members from derived classes. This allows us to clean up and
generalize the code and adds support for these keys for all sections (`SHT_MIPS_ABIFLAGS`
is the only exception; it requires unrelated specific changes).
I had to update/add many tests to test the new functionality properly.
Differential revision: https://reviews.llvm.org/D89039
Craig Topper [Wed, 14 Oct 2020 17:52:21 +0000 (10:52 -0700)]
[TargetLowering] Replace Log2_32_Ceil with Log2_32 in SimplifySetCC ctpop combine.
This combine can look through (trunc (ctpop X)). When doing this
it tries to make sure the trunc doesn't lose any information
from the ctpop. It does this by checking that the truncated type
has more bits than Log2_32_Ceil of the ctpop type. The Ceil is
unnecessary and pessimizes non-power of 2 types.
For example, ctpop of i256 requires 9 bits to represent the max
value of 256. But ctpop of i255 only requires 8 bits to represent
the max result of 255. Log2_32_Ceil of 256 and 255 both return 8
while Log2_32 returns 8 for 256 and 7 for 255.
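The arithmetic can be checked with a small standalone sketch. `log2Floor`, `log2Ceil` and `bitsForCtpop` are hypothetical helpers written for this illustration; they only mirror the values the message attributes to Log2_32 and Log2_32_Ceil and are not the LLVM implementations.

```cpp
#include <cassert>
#include <cstdint>

// Floor of log2(value), for value >= 1.
constexpr unsigned log2Floor(std::uint32_t value) {
  assert(value != 0 && "log2 of zero is not handled in this sketch");
  unsigned result = 0;
  while (value >>= 1)
    ++result;
  return result;
}

// Ceiling of log2(value), for value >= 1.
constexpr unsigned log2Ceil(std::uint32_t value) {
  return value <= 1 ? 0 : log2Floor(value - 1) + 1;
}

// Bits needed to represent every ctpop result of a `width`-bit input,
// i.e. every value in the range 0..width.
constexpr unsigned bitsForCtpop(std::uint32_t width) {
  return log2Floor(width) + 1;
}

static_assert(log2Ceil(256) == 8 && log2Ceil(255) == 8, "ceil variant");
static_assert(log2Floor(256) == 8 && log2Floor(255) == 7, "floor variant");
static_assert(bitsForCtpop(256) == 9 && bitsForCtpop(255) == 8, "ctpop bits");
```

A "more bits than" check against the floor variant therefore accepts i8 for a ctpop of i255, which the ceil variant rejected.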
The code with popcnt enabled is a regression for this test case,
but it does match what already happens with i256 truncated to i9.
Since power of 2 is more likely, I don't think it should block
this change.
Differential Revision: https://reviews.llvm.org/D89412
David Sherwood [Fri, 9 Oct 2020 08:02:47 +0000 (09:02 +0100)]
[SVE][NFC] Replace some TypeSize comparisons in non-AArch64 Targets
In most of lib/Target we know that we are not dealing with scalable
types so it's perfectly fine to replace TypeSize comparison operators
with their fixed width equivalents, making use of getFixedSize()
and so on.
Differential Revision: https://reviews.llvm.org/D89101
Jason Molenda [Thu, 15 Oct 2020 07:57:23 +0000 (00:57 -0700)]
Increase timeout to find a dSYM in macos DownloadObjectAndSymbolFile
With a large dSYM over a slow home connection, the two minute timeout
would sometimes be exceeded, and we haven't seen instances of a
long timeout causing people any problems, so we're bumping it up.
640 seconds ought to be enough for anyone.
<rdar://problem/67759526>
Luqman Aden [Thu, 15 Oct 2020 07:06:46 +0000 (00:06 -0700)]
[LLD] Set alignment as part of Characteristics in TLS table.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46473
Previously, LLD did not specify an alignment in the TLS table's Characteristics field, so the loader assumed the default value (16 bytes). This works most of the time, but fails for thread locals that need a higher alignment (e.g. 32 as in the bug), *even* if the alignment is specified on the thread local itself. This change updates LLD to take the maximum alignment from the TLS section.
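As a rough sketch of the idea, assuming the usual COFF encoding in which an alignment of 2^k bytes is stored as k + 1 in bits 20-23 of Characteristics (so IMAGE_SCN_ALIGN_16BYTES is 0x00500000). `encodeAlignment` and `tlsCharacteristics` are hypothetical helpers, not the functions changed by this commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kAlignShift = 20;
constexpr std::uint32_t kAlignMask = 0x00F00000; // IMAGE_SCN_ALIGN_MASK

// Encode a power-of-two alignment (1..8192 bytes) as an IMAGE_SCN_ALIGN_* value.
inline std::uint32_t encodeAlignment(std::uint32_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0 &&
         "alignment must be a power of two");
  assert(alignment <= 8192 && "largest alignment the field can encode");
  std::uint32_t log2 = 0;
  while ((1u << log2) < alignment)
    ++log2;
  return (log2 + 1) << kAlignShift;
}

// Take the largest alignment among the TLS chunks and store it in the
// alignment bits of `characteristics`, leaving the other bits untouched.
inline std::uint32_t
tlsCharacteristics(const std::vector<std::uint32_t> &chunkAlignments,
                   std::uint32_t characteristics = 0) {
  std::uint32_t maxAlign = 1;
  for (std::uint32_t a : chunkAlignments)
    maxAlign = std::max(maxAlign, a);
  return (characteristics & ~kAlignMask) | encodeAlignment(maxAlign);
}
```

For chunks aligned to 4, 32 and 16 bytes, the alignment bits come out as 0x00600000, the encoding for 32 bytes.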
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D88637
Luqman Aden [Thu, 15 Oct 2020 06:57:57 +0000 (23:57 -0700)]
Revert "[LLD] Set alignment as part of Characteristics in TLS table."
Revert individual wip commits and will instead follow up with a
single commit with all the changes. Makes cherry-picking easier
and will contain all the right tags.
This reverts commit 32a4ad3b6ce6028a371b028cf06fa5feff9534bf.
This reverts commit 7fe13af676678815989a6d0ece684687953245e7.
This reverts commit 51fbc1bef657bb0f5808986555ec3517a84768c4.
This reverts commit f80950a8bb985c082b26534b0e157447bf803935.
This reverts commit 0778cad9f325df4d7b32b22f3dba201a16a0b8fe.
This reverts commit 8b70d527d7ec1c8b9e921177119a0d906ffad4f0.
David Blaikie [Thu, 15 Oct 2020 07:01:43 +0000 (00:01 -0700)]
Fix llvm-symbolizer assembly-based test to require x86 and specify x86 when assembling
David Blaikie [Thu, 15 Oct 2020 06:23:33 +0000 (23:23 -0700)]
llvm-symbolizer: Exit non-zero when DWARF parsing errors have been rendered
Jason Molenda [Thu, 15 Oct 2020 06:30:42 +0000 (23:30 -0700)]
Fix typo in attach failed error message.
<rdar://problem/70296751>
Carl Ritson [Thu, 15 Oct 2020 04:27:50 +0000 (13:27 +0900)]
[AMDGPU] Pre-commit test for D89187
David Blaikie [Thu, 15 Oct 2020 05:47:36 +0000 (22:47 -0700)]
llvm-symbolizer: Ensure non-zero exit when an error is printed
(this doesn't cover all cases - libDebugInfoDWARF has a default error
handler that prints errors without any exit code handling - I'll be
following up with a patch for that after this)
MaheshRavishankar [Thu, 15 Oct 2020 05:32:52 +0000 (22:32 -0700)]
[mlir][SPIRV] Adding an attribute to capture configuration for cooperative matrix operations.
Each hardware that supports SPV_C_CooperativeMatrixNV has a list of
configurations that are supported natively. Add an attribute to
specify the configurations supported to the `spv.target_env`.
Reviewed By: antiagainst, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D89364
David Blaikie [Thu, 15 Oct 2020 05:10:18 +0000 (22:10 -0700)]
llvm-dwarfdump: Exit non-zero on an error path
Richard Smith [Thu, 15 Oct 2020 05:05:30 +0000 (22:05 -0700)]
Perform lvalue conversions on the left of a pseudo-destructor call 'p->~T()'.
Previously we failed to convert 'p' from array/function to pointer type,
and to represent the load of 'p' in the AST. The latter causes problems
for constant evaluation.
Duncan P. N. Exon Smith [Wed, 14 Oct 2020 18:36:00 +0000 (14:36 -0400)]
clang-{tools,unittests}: Stop using SourceManager::getBuffer, NFC
Update clang-tools-extra, clang/tools, clang/unittests to migrate from
`SourceManager::getBuffer`, which returns an always dereferenceable
`MemoryBuffer*`, to `getBufferOrNone` or `getBufferOrFake`, both of
which return a `MemoryBufferRef`, depending on whether the call site was
checking for validity of the buffer. No functionality change intended.
Differential Revision: https://reviews.llvm.org/D89416
Duncan P. N. Exon Smith [Wed, 14 Oct 2020 18:06:37 +0000 (14:06 -0400)]
clang/StaticAnalyzer: Stop using SourceManager::getBuffer
Update clang/lib/StaticAnalyzer to stop relying on a `MemoryBuffer*`,
using the `Optional<MemoryBufferRef>` from `getBufferOrNone` or the
`MemoryBufferRef` from `getBufferOrFake`, depending on whether
there's logic for checking validity of the buffer. The change to
clang/lib/StaticAnalyzer/Core/IssueHash.cpp is potentially a
functionality change, since the logic was wrong (it checked for
`nullptr`, which was never returned by the old API), but if that was
reachable the new behaviour should be better.
Differential Revision: https://reviews.llvm.org/D89414
Vinay Madhusudan [Thu, 15 Oct 2020 03:25:51 +0000 (08:55 +0530)]
[AArch64] Combine UADDVs to generate vector add
ADD(UADDV a, UADDV b) --> UADDV(ADD a, b)
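The rewrite relies on addition being associative and commutative: summing two vectors lane by lane and then reducing gives the same result as reducing each vector and adding the scalars. A small scalar model of that identity, assuming wrapping 32-bit lanes (`reduceAdd` and `laneAdd` are hypothetical stand-ins for UADDV and the vector ADD, not target code):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Lanes = std::vector<std::uint32_t>;

// Sum across all lanes, standing in for UADDV. Wraps modulo 2^32.
inline std::uint32_t reduceAdd(const Lanes &v) {
  std::uint32_t sum = 0;
  for (std::uint32_t lane : v)
    sum += lane;
  return sum;
}

// Lane-wise addition, standing in for the vector ADD.
inline Lanes laneAdd(const Lanes &a, const Lanes &b) {
  assert(a.size() == b.size() && "vectors must have the same lane count");
  Lanes out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = a[i] + b[i];
  return out;
}
```

For any two inputs of equal length, `reduceAdd(a) + reduceAdd(b)` equals `reduceAdd(laneAdd(a, b))` in 32-bit unsigned arithmetic, so the combined form needs one reduction instead of two.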
This partially solves the bug: https://bugs.llvm.org/show_bug.cgi?id=46888
Meta ticket: https://bugs.llvm.org/show_bug.cgi?id=46929
Differential Revision: https://reviews.llvm.org/D88731
Duncan P. N. Exon Smith [Wed, 14 Oct 2020 17:48:52 +0000 (13:48 -0400)]
clang/CodeGen: Stop using SourceManager::getBuffer, NFC
Update `clang/lib/CodeGen` to use a `MemoryBufferRef` from
`getBufferOrNone` instead of `MemoryBuffer*` from `getBuffer`. No
functionality change here.
Differential Revision: https://reviews.llvm.org/D89411