Roman Lebedev [Tue, 11 May 2021 16:35:41 +0000 (19:35 +0300)]
[X86][Codegen] Shift amount mod: sh? i64 x, (32-y) --> sh? i64 x, -(y+32)
I've seen this in the RawSpeed's BitPumpMSB*::push() hotpath,
after fixing the buffer abstraction to a more sane one,
when looking into a +5% runtime regression.
I was hoping that this would fix it, but it does not look like it does.
This seems to be at least not worse than the original pattern.
But I'm actually mainly interested in the case where we already
compute `(y+32)` (see last test),
https://alive2.llvm.org/ce/z/ZCzJio
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D101944
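To illustrate why the two shift amounts in the commit above are interchangeable, here is a minimal C++ sketch. It assumes the shift only consumes the low 6 bits of its amount, as 64-bit x86 shifts do. The helper names (`shl64Masked`, `shiftBySub`, `shiftByNegAdd`, `checkEquivalence`) are made up for this illustration and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models a 64-bit shift that only looks at the low 6 bits of the amount
// (assumption: this mirrors how an x86-64 SHL treats its count operand).
constexpr std::uint64_t shl64Masked(std::uint64_t x, std::uint64_t amount) {
  return x << (amount & 63);
}

// Original pattern: shift by (32 - y).
constexpr std::uint64_t shiftBySub(std::uint64_t x, std::uint64_t y) {
  return shl64Masked(x, 32 - y);
}

// Rewritten pattern: shift by -(y + 32). Unsigned arithmetic wraps, so the
// negation is well defined in C++.
constexpr std::uint64_t shiftByNegAdd(std::uint64_t x, std::uint64_t y) {
  return shl64Masked(x, 0 - (y + 32));
}

// Hypothetical helper: checks that both forms agree for y in [0, 255].
inline void checkEquivalence(std::uint64_t x) {
  for (std::uint64_t y = 0; y < 256; ++y)
    assert(shiftBySub(x, y) == shiftByNegAdd(x, y));
}
```

The two amounts differ by exactly 64, which the 6-bit mask discards. The second form is attractive when `y + 32` is already being computed for another use.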
Craig Topper [Mon, 10 May 2021 19:12:16 +0000 (12:12 -0700)]
[RISCV] Match trunc_vector_vl+sra_vl/srl_vl with splat shift amount to vnsra/vnsrl.
Limited to splats because we would need to truncate the shift
amount vector otherwise.
I tried to do this with new ISD nodes and a DAG combine to
avoid such a large pattern, but we don't form the splat until
LegalizeDAG and need DAG combine to remove a scalable->fixed->scalable
cast before it becomes visible to the shift node. By the time that
happens we've already visited the truncate node and won't revisit it.
I think I have an idea for how to improve i64 on RV32, which I'll save for a
follow-up.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D102019
Alan Phipps [Tue, 11 May 2021 16:26:19 +0000 (11:26 -0500)]
Revert "Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation"
This reverts commit 6400905a615282c83a2fc6e49e57ff716aa8b4de.
Arthur O'Dwyer [Mon, 10 May 2021 17:13:04 +0000 (13:13 -0400)]
[libc++] Remove more unnecessary _VSTD:: from type names. NFCI.
Differential Revision: https://reviews.llvm.org/D102181
Arthur O'Dwyer [Mon, 10 May 2021 19:32:38 +0000 (15:32 -0400)]
[libc++] s/_VSTD::is_unsigned/is_unsigned/ in <random>. NFCI.
Arthur O'Dwyer [Mon, 10 May 2021 17:19:07 +0000 (13:19 -0400)]
[libc++] s/_VSTD::chrono/chrono/g. NFCI.
Arthur O'Dwyer [Mon, 10 May 2021 17:07:00 +0000 (13:07 -0400)]
[libc++] s/std::size_t/size_t/g. NFCI.
Arthur O'Dwyer [Mon, 10 May 2021 17:04:16 +0000 (13:04 -0400)]
[libc++] s/_VSTD::declval/declval/g. NFCI.
Jon Chesterfield [Tue, 11 May 2021 16:23:08 +0000 (17:23 +0100)]
[libomptarget][nfc] Add hook to easily disable building amdgcn bclib
This is useful when building LLVM with a toolchain that can't emit code
for amdgcn, e.g. because it overrides the include search path with headers
from another architecture, or the clang compiler is missing builtins.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D102229
Benjamin Kramer [Tue, 11 May 2021 12:14:54 +0000 (14:14 +0200)]
[mlir] Use static shape knowledge when lowering memref.reshape
This is actually necessary for correctness, as memref.reinterpret_cast
does not verify that the output shape matches the static sizes.
Differential Revision: https://reviews.llvm.org/D102232
Augusto Noronha [Tue, 11 May 2021 16:15:03 +0000 (13:15 -0300)]
Add null-pointer checks when accessing a TypeSystem's SymbolFile
A type system is not guaranteed to have a symbol file. This patch adds null-pointer checks so we don't crash when trying to access a type system's symbol file.
Reviewed By: aprantl, teemperor
Differential Revision: https://reviews.llvm.org/D101539
Augusto Noronha [Tue, 11 May 2021 16:07:02 +0000 (13:07 -0300)]
Change Target::ReadMemory to ensure the amount of memory read from the file-cache is the amount requested.
This change ensures that if for whatever reason we read fewer bytes than expected (for example, when trying to read memory that spans multiple sections), we try reading from the live process as well.
Reviewed By: jasonmolenda
Differential Revision: https://reviews.llvm.org/D101390
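The fallback described in the commit above can be sketched roughly as follows. This is not LLDB's code: the `Reader` callback type and the `readMemory` function are hypothetical, and the choice to re-read the whole range from the live process (rather than only the missing tail) is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical reader signature: copies up to `len` bytes starting at `addr`
// into `dst` and returns how many bytes were actually read.
using Reader = std::function<std::size_t(std::uint64_t addr, std::uint8_t *dst,
                                         std::size_t len)>;

// Try the file cache first. Accept its answer only if it delivered every
// requested byte; otherwise fall back to the live process.
inline std::size_t readMemory(std::uint64_t addr, std::uint8_t *dst,
                              std::size_t len, const Reader &fileCache,
                              const Reader &liveProcess) {
  assert(dst != nullptr || len == 0);
  const std::size_t fromCache = fileCache(addr, dst, len);
  if (fromCache == len)
    return fromCache;
  return liveProcess(addr, dst, len);
}
```

A short read from the cache is treated the same as a failed read, which covers the case of a request that spans multiple sections.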
Alan Phipps [Mon, 10 May 2021 20:15:26 +0000 (15:15 -0500)]
Fix branch coverage merging in FunctionCoverageSummary::get() for instantiation
groups.
This change corrects the implementation for the branch coverage
summary to do the same thing for branches that is done for lines and regions.
That is, across function instantiations in an instantiation group, the maximum
branch coverage found in any of those instantiations is returned, with the
total number of branches being the same across instantiations.
Differential Revision: https://reviews.llvm.org/D102193
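As a rough illustration of the merging rule described above (maximum covered count across an instantiation group, identical totals), consider this minimal sketch. `BranchSummary` and `mergeGroup` are hypothetical stand-ins invented here; they are not the types used by llvm-cov.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a branch coverage summary: how many branches
// exist and how many of them were taken at least once.
struct BranchSummary {
  unsigned total;
  unsigned covered;
};

// Merge the summaries of all instantiations in one group: keep the maximum
// covered count, and require the total to be identical across instantiations.
inline BranchSummary mergeGroup(const std::vector<BranchSummary> &group) {
  assert(!group.empty());
  BranchSummary merged = group.front();
  for (const BranchSummary &s : group) {
    assert(s.total == merged.total);
    if (s.covered > merged.covered)
      merged.covered = s.covered;
  }
  return merged;
}
```

For a group with covered counts 1, 3 and 2 out of 4 branches each, the merged summary reports 3 of 4.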
Roman Lebedev [Tue, 11 May 2021 15:34:14 +0000 (18:34 +0300)]
[NFC][X86] Precommit another testcase for D101944
Jamie Schmeiser [Tue, 11 May 2021 15:29:50 +0000 (11:29 -0400)]
Produce warning for performing pointer arithmetic on a null pointer.
Summary:
Test and produce warning for subtracting a pointer from null or subtracting
null from a pointer. Reuse existing warning that this is undefined
behaviour. Also add unit test for both warnings.
Reformat to satisfy clang-format.
Respond to review comments: add additional test.
Respond to review comments: Do not issue warning for nullptr - nullptr
in C++.
Fix indenting to satisfy clang-format.
Respond to review comments: Add C++ tests.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: efriedma (Eli Friedman), nickdesaulniers (Nick Desaulniers)
Differential Revision: https://reviews.llvm.org/D98798
Steven Wu [Tue, 11 May 2021 15:23:08 +0000 (08:23 -0700)]
[IR][AutoUpgrade] Drop align attribute from void return types
Since D87304, `align` has been an invalid attribute on non-pointer types, and the
verifier rejects bitcode that has an invalid `align` attribute.
The problem is that before that change, DeadArgumentElimination could easily
turn a pointer return type into a void return type without removing the
`align` attribute. Teach AutoUpgrade to remove the invalid `align` attribute
from return types to maintain bitcode compatibility.
rdar://77022993
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D102201
Tony Tye [Tue, 11 May 2021 02:24:30 +0000 (02:24 +0000)]
[NFC][AMDGPU] Correct product name for gfx908
The product name for gfx908 is "AMD Instinct MI100 Accelerator".
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D102209
Pushpinder Singh [Tue, 11 May 2021 11:45:09 +0000 (06:45 -0500)]
Revert "[AMDGPU][OpenMP] Emit textual IR for -emit-llvm -S"
This reverts commit 7f78e409d0280c62209e1a7dc8c6d1409acc9184.
Congzhe Cao [Tue, 11 May 2021 14:59:50 +0000 (10:59 -0400)]
[LoopInterchange] Fix legality for triangular loops
This is a bug fix in legality check.
When we encounter triangular loops such as the following form:
for (int i = 0; i < m; i++)
for (int j = 0; j < i; j++), or
for (int i = 0; i < m; i++)
for (int j = 0; j*i < n; j++),
we should not perform interchange since the number of executions of the loop body
will be different before and after interchange, resulting in incorrect results.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D101305
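A small counting sketch shows why the first triangular form above cannot simply be interchanged. The function names are invented for this illustration, and `swappedCount` assumes that once `j` becomes the outer loop its bound has to be replaced by some fixed value `jBound`, because it can no longer refer to `i`.

```cpp
#include <cassert>

// Body executions of the triangular nest from the commit message:
//   for (i = 0; i < m; i++) for (j = 0; j < i; j++)
inline unsigned long triangularCount(int m) {
  unsigned long count = 0;
  for (int i = 0; i < m; i++)
    for (int j = 0; j < i; j++)
      ++count;
  return count;
}

// Body executions after swapping the two loops, with the j bound replaced
// by a fixed value (assumption of this sketch, see the lead-in).
inline unsigned long swappedCount(int m, int jBound) {
  unsigned long count = 0;
  for (int j = 0; j < jBound; j++)
    for (int i = 0; i < m; i++)
      ++count;
  return count;
}

// Hypothetical helper: with jBound == m the counts differ for every m >= 1.
inline void checkCountsDiffer(int m) {
  assert(m >= 1);
  assert(triangularCount(m) != swappedCount(m, m));
}
```

With `m = 4` the triangular nest runs its body 6 times, while the swapped nest runs it 16 times for `jBound = 4` and 12 times for `jBound = 3`.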
Aakanksha Patil [Tue, 4 May 2021 22:36:32 +0000 (18:36 -0400)]
Fix typo "Execpt" in comments
Differential Revision: https://reviews.llvm.org/D101858
Paul C. Anagnostopoulos [Tue, 11 May 2021 14:41:17 +0000 (10:41 -0400)]
Revert "[TableGen] Make the NUL character invalid in .td files"
At least one build uses a 'sed' that does not understand \x00.
This reverts commit cf9647011c4f05e1eb4423c6637d84e2f26b2042.
Peyton, Jonathan L [Mon, 10 May 2021 15:03:23 +0000 (10:03 -0500)]
[OpenMP] Fix hidden helper + affinity
When KMP_AFFINITY is set, each worker thread's gtid value is used as an
index into the place list to determine the thread's placement. With hidden
helpers enabled, this gtid value is shifted down leading to unexpected
shifted thread placement. This patch restores the previous behavior by
adjusting the mask index to take the number of hidden helper threads
into account.
Hidden helper threads are given the full initial mask and do not
participate in any of the other affinity mechanisms (place partitioning,
balanced affinity). Their affinity is only printed for debug builds.
Differential Revision: https://reviews.llvm.org/D101882
Florian Hahn [Tue, 11 May 2021 13:29:21 +0000 (14:29 +0100)]
[VPlan] Register recipe for instr if the simplified value is recipe.
If the simplified VPValue is a recipe, we need to register it for Instr,
in case it needs to be recorded. The way this is handled in general may
change soon, following some post-commit comments.
This fixes PR50298.
Roman Lebedev [Tue, 11 May 2021 13:09:10 +0000 (16:09 +0300)]
[X86] X86TTIImpl::getInterleavedMemoryOpCostAVX2(): use getMemoryOpCost()
Now that getMemoryOpCost() correctly handles all the vector variants,
we should no longer hand-roll our own version of it, but use it directly.
The AVX512 variant probably needs a similar change,
but there it is less obvious.
Paul C. Anagnostopoulos [Wed, 5 May 2021 16:03:50 +0000 (12:03 -0400)]
[TableGen] Make the NUL character invalid in .td files
Differential Revision: https://reviews.llvm.org/D101923
Simon Pilgrim [Tue, 11 May 2021 13:18:29 +0000 (14:18 +0100)]
[X86] Replace repeated isa/cast<ConstantSDNode> calls with a single dyn_cast<>. NFCI.
Noticed while looking at D101944
Simon Pilgrim [Tue, 11 May 2021 11:26:14 +0000 (12:26 +0100)]
[X86][SSE] Replace foldShuffleOfHorizOp with generalized version in canonicalizeShuffleMaskWithHorizOp
foldShuffleOfHorizOp only handled basic shufps(hop(x,y),hop(z,w)) folds - by moving this to canonicalizeShuffleMaskWithHorizOp we can work with more general/combined v4x32 shuffle masks, float/integer domains and support shuffle-of-packs as well.
The next step will be to support 256/512-bit vector cases.
Matt Arsenault [Tue, 11 May 2021 00:56:24 +0000 (20:56 -0400)]
CodeGen: Fix null dereference before null check
Roman Lebedev [Tue, 11 May 2021 13:02:11 +0000 (16:02 +0300)]
[X86][CostModel] X86TTIImpl::getMemoryOpCost(): rewrite vector handling again
Instead of handling power-of-two sized vector chunks,
try handling the large vector in a stream mode,
decreasing the operational vector size
once it no longer works for the elements left to process.
Notably, this improves costs for overaligned loads - loading padding is fine.
This more directly tracks when we need to insert/extract the YMM/XMM subvector,
some costs fluctuate because of that.
Reviewed By: RKSimon, ABataev
Differential Revision: https://reviews.llvm.org/D100684
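The "stream mode" idea above can be pictured with a simplified splitter. This is only a sketch of the chunking strategy as described in the message, not the cost model itself: `splitIntoOps` is a hypothetical name, sizes are in bytes, and no costs, alignment, or subvector insert/extract effects are modelled.

```cpp
#include <cassert>
#include <vector>

// Walk over `totalBytes` of vector data using operations of at most
// `maxOpBytes` (a power of two), halving the operation size once it no
// longer fits the bytes that are left.
inline std::vector<unsigned> splitIntoOps(unsigned totalBytes,
                                          unsigned maxOpBytes) {
  assert(maxOpBytes != 0 && (maxOpBytes & (maxOpBytes - 1)) == 0);
  std::vector<unsigned> ops;
  unsigned op = maxOpBytes;
  unsigned left = totalBytes;
  while (left > 0) {
    while (op > left)
      op /= 2; // shrink the operational size for the remaining elements
    ops.push_back(op);
    left -= op;
  }
  return ops;
}
```

For example, 48 bytes with a 32-byte maximum (a YMM-sized operation) would be processed as one 32-byte step followed by one 16-byte (XMM-sized) step.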
Sanjay Patel [Tue, 11 May 2021 12:45:20 +0000 (08:45 -0400)]
[SLP] restrict matching of load combine candidates
The test example from https://llvm.org/PR50256 (and reduced here)
shows that we can match a load combine candidate even when there
are no "or" instructions. We can avoid that by confirming that we
do see an "or". This doesn't apply when matching an or-reduction
because that match begins from the operands of the reduction.
Differential Revision: https://reviews.llvm.org/D102074
Piotr Sobczak [Tue, 11 May 2021 10:45:04 +0000 (12:45 +0200)]
[AMDGPU] Move code sinking before structurizer
Moving code sinking pass before structurizer creates more sinking
opportunities.
The extra flow edges introduced by the structurizer can have adverse
effects on sinking, because the sinking pass prefers moving instructions
to blocks with unique predecessors and the structurizer destroys that
property in some cases.
A notable example is moving high-latency image instructions across kills.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D101115
Anastasia Stulova [Tue, 11 May 2021 09:36:19 +0000 (10:36 +0100)]
[OpenCL] Allow use of double type without extension pragma.
Simplify the use of extensions by allowing supported
double types to be used without the pragma. Since earlier standards
required the pragma to be used explicitly, a new warning
is introduced in pedantic mode to indicate that using the
type without enabling the extension pragma can be non-portable.
This patch does not break backward compatibility since the
extension pragma is still supported and it makes the behavior
of the compiler less strict by accepting code without extra
pragma statements.
Differential Revision: https://reviews.llvm.org/D100980
Jon Chesterfield [Tue, 11 May 2021 11:19:55 +0000 (12:19 +0100)]
[libomptarget][nfc] Drop stringify in macro
A step towards deleting the macros entirely.
Differential Revision: https://reviews.llvm.org/D102228
Martin Storsjö [Wed, 5 May 2021 10:26:56 +0000 (13:26 +0300)]
[LLDB] Don't use the local python to set a default for LLDB_PYTHON_RELATIVE_PATH when cross compiling.
Differential Revision: https://reviews.llvm.org/D101903
Stefan Pintilie [Tue, 11 May 2021 10:32:32 +0000 (05:32 -0500)]
[PowerPC][Bug] Fix Bug in Stack Frame Update Code
The stack frame update code does not take into consideration spilling
to registers for callee saved registers. The option -ppc-enable-pe-vector-spills
turns on spilling to registers for callee saved registers and may expose a bug
in the code that moves a stack frame pointer update instruction.
Reviewed By: nemanjai, #powerpc
Differential Revision: https://reviews.llvm.org/D101366
Denis Antrushin [Wed, 24 Mar 2021 15:48:11 +0000 (22:48 +0700)]
[RegAllocFast] properly handle STATEPOINT instruction.
STATEPOINT is a fancy and complex pseudo instruction which
has both tied defs and regmask operand.
Basic FastRA algorithm is as follows:
1. Mark registers used by defs as free
2. If instruction has regmask operand displace clobbered registers
according to regmask.
3. Assign registers for use operands.
In case of tied defs step 1 is replaced with allocation of registers
for them. But regmask is still processed, which may displace already
allocated registers. As a result, tied use and def will get assigned
to different registers.
This patch makes FastRA process the instruction's RegMask (if any) when
checking for physical registers interference.
That way tied operands won't get registers clobbered by regmask.
Reviewed By: arsenm, skatkov
Differential Revision: https://reviews.llvm.org/D99284
Jay Foad [Tue, 11 May 2021 10:21:03 +0000 (11:21 +0100)]
[AMDGPU] Add some GFX10.3 testing. NFC.
Uday Bondhugula [Tue, 11 May 2021 09:48:42 +0000 (15:18 +0530)]
[MLIR] Switch llvm.noalias to a unit attribute
Switch llvm.noalias attribute from a boolean attribute to a unit
attribute.
Differential Revision: https://reviews.llvm.org/D102225
Martin Storsjö [Fri, 7 May 2021 21:21:49 +0000 (00:21 +0300)]
[LLD] [COFF] Add an assert regarding the RVA of exported symbols. NFC.
As this isn't handled as a regular relocation, the normal handling of
maybeReportRelocationToDiscarded in Chunks.cpp doesn't apply here.
This would have caught the issue fixed by
82de4e075339f5ad8d68cfe31eb45b771d4750ae.
Differential Revision: https://reviews.llvm.org/D102115
Andy Wingo [Wed, 5 May 2021 12:59:30 +0000 (14:59 +0200)]
[CodeGen][WebAssembly] Better lowering for WASM_SYMBOL_TYPE_GLOBAL symbols
As we have been missing support for WebAssembly globals on the IR level,
the lowering of WASM_SYMBOL_TYPE_GLOBAL to IR was incomplete. This
commit fleshes out the lowering support, lowering references to and
definitions of addrspace(1) values to correctly typed
WASM_SYMBOL_TYPE_GLOBAL symbols.
Depends on D101608.
Differential Revision: https://reviews.llvm.org/D101913
Simon Moll [Tue, 11 May 2021 07:09:48 +0000 (09:09 +0200)]
[VP] Improve the VP intrinsic unittests
Test that all VP intrinsics are tested.
Test intrinsic id -> opcode -> intrinsic id round tripping.
Test property scopes in the include/llvm/IR/VPIntrinsics.def file.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D93534
Paulo Matos [Tue, 4 May 2021 12:13:08 +0000 (14:13 +0200)]
[WebAssembly] Support for WebAssembly globals in LLVM IR
This patch adds support for WebAssembly globals in LLVM IR, representing
them as pointers to global values, in a non-default, non-integral
address space. Instruction selection legalizes loads and stores to
these pointers to new WebAssemblyISD nodes GLOBAL_GET and GLOBAL_SET.
Once the lowering creates the new nodes, tablegen pattern matches those
and converts them to Wasm global.get/set of the appropriate type.
Based on work by Paulo Matos in https://reviews.llvm.org/D95425.
Reviewed By: pmatos
Differential Revision: https://reviews.llvm.org/D101608
Andrzej Warzynski [Tue, 4 May 2021 15:52:15 +0000 (15:52 +0000)]
[flang][cmake] Enable the new driver by default
With this patch, `FLANG_BUILD_NEW_DRIVER` is set to `On` by default
(i.e. the new driver is enabled). Note that the new driver depends on
Clang and hence with this change you will need to add `clang` to
`LLVM_ENABLE_PROJECTS`.
If you don't want to build the new driver, set `FLANG_BUILD_NEW_DRIVER`
to `Off`. This way you won't be required to include `clang` in
`LLVM_ENABLE_PROJECTS`.
Differential Revision: https://reviews.llvm.org/D101842
Alex Orlov [Tue, 11 May 2021 09:10:54 +0000 (13:10 +0400)]
* Add support for JSON output style to llvm-symbolizer
This patch adds JSON output style to llvm-symbolizer to better support CLI automation by providing a machine readable output.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D96883
Simon Pilgrim [Tue, 11 May 2021 08:20:55 +0000 (09:20 +0100)]
Fix -Wdocumentation warnings. NFCI.
Ole Strohm [Tue, 11 May 2021 08:45:28 +0000 (09:45 +0100)]
[OpenCL] [NFC] Fixed underline being too short in rst
Tres Popp [Fri, 7 May 2021 14:19:22 +0000 (16:19 +0200)]
Support VectorTransfer splitting on writes also.
VectorTransfer split previously only split read xfer ops. This adds
the same logic to write ops. The resulting code involves 2
conditionals for write ops while read ops only needed 1, but the created
ops are built upon the same patterns, so pattern matching/expectations
are all consistent other than in regards to the if/else ops.
Differential Revision: https://reviews.llvm.org/D102157
Kristina Bessonova [Mon, 10 May 2021 18:25:46 +0000 (20:25 +0200)]
[libcxx][test] Make string.modifiers/clear_and_shrink_db1.pass.cpp a regular mode test
Turn this into a normal-mode test, as it contains well-formed code and
checks for defined behavior. It can still be run in debug mode as of D100866.
Differential Revision: https://reviews.llvm.org/D102192
Djordje Todorovic [Mon, 10 May 2021 13:06:58 +0000 (06:06 -0700)]
[llvm-dwarfdump] Fix abstract origin vars location stats calculation
There are cases where a concrete DIE with DW_TAG_subprogram can have
abstract_origin attribute, so we handle that situation as well.
Differential Revision: https://reviews.llvm.org/D101025
Tobias Gysi [Tue, 11 May 2021 06:52:44 +0000 (06:52 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from LinalgToLoops...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102187
Tobias Gysi [Tue, 11 May 2021 06:27:38 +0000 (06:27 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from Fusion...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102174
Christopher Di Bella [Sun, 2 May 2021 20:39:35 +0000 (20:39 +0000)]
[libcxx] deprecates/removes `std::raw_storage_iterator`
C++17 deprecates `std::raw_storage_iterator` and C++20 removes it.
Implements part of:
* P0174R2 'Deprecating Vestigial Library Parts in C++17'
* P0619R4 'Reviewing Deprecated Facilities of C++17 for C++20'
Differential Revision: https://reviews.llvm.org/D101730
Christopher Di Bella [Sat, 1 May 2021 19:00:36 +0000 (19:00 +0000)]
[libcxx] makes comparison operators for `std::*_ordering` types hidden friends
The standard leaves it up to the implementation to decide whether or not
these operators are hidden friends. There are several (well-documented)
reasons to prefer hidden friends, as well as an argument for improved
readability.
Depends on D100342.
Differential Revision: https://reviews.llvm.org/D101707
Christopher Di Bella [Mon, 12 Apr 2021 20:55:05 +0000 (20:55 +0000)]
[libcxx] removes operator!= and globally guards against no spaceship operator
* `operator!=` isn't in the spec
* `<compare>` is designed to work with `operator<=>` so it doesn't
really make sense to have `operator<=>`-less friendly sections.
Depends on D100283.
Differential Revision: https://reviews.llvm.org/D100342
Kadir Cetinkaya [Wed, 5 May 2021 15:46:46 +0000 (17:46 +0200)]
[clangd][remote-client] Set HasMore to true for failure
Previously the client set HasMore to true iff the stream said so.
Hence if we had a broken stream for whatever reason (e.g. hitting the deadline for a
huge response), HasMore would be false, which is semantically incorrect (e.g. it
will throw rename off).
Differential Revision: https://reviews.llvm.org/D101915
Kadir Cetinkaya [Wed, 5 May 2021 15:36:20 +0000 (17:36 +0200)]
[clangd][index-server] Limit results in response
This is to prevent the server from being DoS'd by possibly malicious
parties issuing requests that can yield huge responses.
One possible drawback is on the rename workflow, as it really requests all
occurrences, but it currently has an internal limit of 50 files.
We are putting the limit at 10000 elements per response. So for rename to regress,
one should have 10k refs to a symbol in fewer than 50 files. This seems unlikely,
and if there are complaints we can fix it by giving up on the response based on the
number of files covered instead.
Differential Revision: https://reviews.llvm.org/D101914
Tobias Gysi [Tue, 11 May 2021 05:51:45 +0000 (05:51 +0000)]
[mlir][linalg] Remove IndexedGenericOp support from Tiling...
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).
Differential Revision: https://reviews.llvm.org/D102176
Igor Kudrin [Thu, 6 May 2021 13:45:29 +0000 (20:45 +0700)]
[LLD] Improve reporting unresolved symbols in shared libraries
Currently, when reporting unresolved symbols in shared libraries, if an
undefined symbol is first seen in a regular object file, that reference shadows
the reference to the same symbol in a shared object. As a result, the
error for the unresolved symbol in the shared library is not reported.
If referencing sections in regular object files are discarded because of
'--gc-sections', no reports about such symbols are generated, and the
linker finishes successfully, generating an output image that fails on
the run.
The patch fixes the issue by keeping symbols, which should be checked,
for each shared library separately.
Differential Revision: https://reviews.llvm.org/D101996
Chris Lattner [Sun, 9 May 2021 01:46:30 +0000 (18:46 -0700)]
[OpAsmParser] Refactor parseOptionalInteger to support wide integers, NFC.
OpAsmParser (and DialectAsmParser) supports a pair of
parseInteger/parseOptionalInteger methods, which allow parsing a bare
integer into a C type of your choice (e.g. int8_t) using templates. It
was implemented in terms of a virtual method call that is hard coded to
int64_t because "that should be big enough".
Change the virtual method hook to return an APInt instead. This allows
asmparsers for custom ops to parse large integers if they want to, without
changing any of the clients of the fixed size C API.
Differential Revision: https://reviews.llvm.org/D102120
Carl Ritson [Tue, 11 May 2021 03:14:01 +0000 (12:14 +0900)]
[AMDGPU] Pre-commit tests for D102211
Hsiangkai Wang [Mon, 10 May 2021 14:29:00 +0000 (22:29 +0800)]
[RISCV] Fix the calculation of the offset of Zvlsseg spilling.
For Zvlsseg spilling, we need to convert the pseudo instructions
into multiple vector load/store instructions with appropriate offsets.
For example, for PseudoVSPILL3_M2, we need to convert it to
VS2R %v2, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v4, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v6, %base
We need to keep the size of the offset in the pseudo spilling instructions.
In this case, it is (vlenb x 2).
In the original implementation, we used the size of the frame object divided by the
number of vectors in the Zvlsseg type. The size of the frame object is not
necessarily exactly the same as the size of the spilled data; it may be larger.
So we change the offset to (VLENB x LMUL) in this patch. The calculation is
more direct and easier to understand.
Differential Revision: https://reviews.llvm.org/D101869
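The offset arithmetic above can be checked with a small sketch. The function names are hypothetical, and the concrete numbers used in the example (a 16-byte `vlenb` and a frame object padded to 128 bytes) are illustrative assumptions, not values from the patch.

```cpp
#include <cassert>
#include <vector>

// Byte offsets of each register group inside a Zvlsseg spill slot.
// `nf` is the number of vectors in the segment (3 for PseudoVSPILL3_M2),
// `lmul` the register group multiplier (2 for ..._M2) and `vlenb` the
// vector register size in bytes.
inline std::vector<unsigned> spillOffsets(unsigned nf, unsigned lmul,
                                          unsigned vlenb) {
  assert(nf > 0 && lmul > 0 && vlenb > 0);
  const unsigned stride = vlenb * lmul; // the (VLENB x LMUL) step
  std::vector<unsigned> offsets;
  for (unsigned i = 0; i < nf; ++i)
    offsets.push_back(i * stride);
  return offsets;
}

// The older computation described above: frame object size divided by nf.
inline unsigned strideFromFrameSize(unsigned frameObjectBytes, unsigned nf) {
  assert(nf > 0);
  return frameObjectBytes / nf;
}
```

With `nf = 3`, `lmul = 2` and an assumed `vlenb` of 16, the stride is 32 and the offsets are 0, 32 and 64. Dividing an exactly sized 96-byte object by 3 also gives 32, but a frame object padded to 128 bytes would give 42, which no longer matches the spilled data layout.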
Renaud-K [Mon, 10 May 2021 23:41:29 +0000 (16:41 -0700)]
Enable export of FIR includes into the install tree
https://reviews.llvm.org/D102040
Vitaly Buka [Tue, 11 May 2021 00:32:36 +0000 (17:32 -0700)]
[NFC][LSAN] Fix flaky multithreaded test
LLVM GN Syncbot [Tue, 11 May 2021 00:19:33 +0000 (00:19 +0000)]
[gn build] Port e5d483f28a3a
Lang Hames [Fri, 7 May 2021 04:50:53 +0000 (21:50 -0700)]
[ORC-RT] Add unit test infrastructure, extensible_rtti implementation, unit test
Add unit test infrastructure for the ORC runtime, plus a cut-down
extensible_rtti system and extensible_rtti unit test.
Removes the placeholder.cpp source file.
Differential Revision: https://reviews.llvm.org/D102080
zoecarver [Fri, 23 Apr 2021 19:34:22 +0000 (12:34 -0700)]
[libcxx][ranges] Add ranges::empty CPO.
Depends on D101079. Refs D101189.
Differential Revision: https://reviews.llvm.org/D101193
Aart Bik [Mon, 10 May 2021 19:56:15 +0000 (12:56 -0700)]
[mlir][linalg] remove the -now- obsolete sparse support in linalg
All glue and clutter in the linalg ops has been replaced by proper
sparse tensor type encoding. This code is no longer needed. Thanks
to ntv@ for giving us a temporary home in linalg.
So long, and thanks for all the fish.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D102098
Stanislav Mekhanoshin [Mon, 10 May 2021 22:42:47 +0000 (15:42 -0700)]
[AMDGPU] Constant fold Intrinsic::amdgcn_perm
Differential Revision: https://reviews.llvm.org/D102203
LLVM GN Syncbot [Mon, 10 May 2021 23:06:37 +0000 (23:06 +0000)]
[gn build] Port 3b8d2be52725
Sam Clegg [Sat, 27 Feb 2021 00:09:32 +0000 (16:09 -0800)]
Reland: "[lld][WebAssembly] Initial support merging string data"
This change was originally landed in:
5000a1b4b9edeb9e994f2a5b36da8d48599bea49
It was reverted in:
061e071d8c9b98526f35cad55a918a4f1615afd4
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it does not currently support debug sections
because they are not represented by data segments (they are custom
sections).
Differential Revision: https://reviews.llvm.org/D97657
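The "full tail merging" mentioned above ("lo" can be merged with "hello") boils down to a suffix test. The following is a minimal sketch of that test only, with invented helper names; it says nothing about how the linker actually stores or searches its strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if `shorter` is a tail (suffix) of `longer`, so that a
// NUL-terminated copy of `shorter` could share the storage of `longer`.
inline bool isTailOf(const std::string &shorter, const std::string &longer) {
  return shorter.size() <= longer.size() &&
         longer.compare(longer.size() - shorter.size(), shorter.size(),
                        shorter) == 0;
}

// Offset inside `longer` at which the merged copy of `shorter` would start.
inline std::size_t offsetInMerged(const std::string &shorter,
                                  const std::string &longer) {
  assert(isTailOf(shorter, longer));
  return longer.size() - shorter.size();
}
```

Here "lo" is a tail of "hello" and would live at offset 3 of the merged string, whereas "he" is a prefix and cannot share the terminating NUL.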
Benjamin Kramer [Mon, 10 May 2021 21:19:59 +0000 (23:19 +0200)]
[mlir][Tensor] Add folding for tensor.from_elements
This trivially folds into a constant when all operands are constant.
Differential Revision: https://reviews.llvm.org/D102199
Jessica Paquette [Mon, 3 May 2021 19:21:11 +0000 (12:21 -0700)]
[AArch64][GlobalISel] Add post-legalizer lowering for NEON vector fcmps
This is roughly equivalent to the floating point portion of
`AArch64TargetLowering::LowerVSETCC`. Main part that's missing is the v4s16 bit.
This also adds helpers equivalent to `EmitVectorComparison`, and
`changeVectorFPCCToAArch64CC`. This moves `changeFCMPPredToAArch64CC` out of
the selector into AArch64GlobalISelUtils for the sake of code reuse.
This is done in post-legalizer lowering with pseudos to simplify selection.
The imported patterns end up handling selection for us this way.
Differential Revision: https://reviews.llvm.org/D101782
Nico Weber [Mon, 10 May 2021 22:27:45 +0000 (18:27 -0400)]
Revert "[lld][WebAssembly] Initial support merging string data"
This reverts commit 5000a1b4b9edeb9e994f2a5b36da8d48599bea49.
Breaks tests, see https://reviews.llvm.org/D97657#2749151
Easily repros locally with `ninja check-llvm-mc-webassembly`.
Jessica Paquette [Mon, 10 May 2021 21:06:42 +0000 (14:06 -0700)]
[AArch64][GlobalISel] Enable memcpy family combines on minsize functions
The combines in `tryCombineMemCpyFamily` have heuristics (e.g.
`TLI.getMaxStoresPerMemset`) which consider size. So, theoretically, enabling
these combines on minsize functions shouldn't be harmful.
With this enabled we save 0.9% geomean on CTMark at -Oz, and 5.1% on Bullet.
There are no code size regressions.
Differential Revision: https://reviews.llvm.org/D102198
Guozhi Wei [Mon, 10 May 2021 21:47:54 +0000 (14:47 -0700)]
Pre-commit test case for D101970
This is a test case for D101970, which shows the optimization opportunity for
lea (reg1, reg2), reg3
sub reg3, reg4
to
sub reg1, reg4
sub reg2, reg4
Differential Revision: https://reviews.llvm.org/D102010
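The rewrite shown above relies on a simple wrap-around identity, which can be sketched as below. The sketch assumes AT&T operand order (destination last), models only the 64-bit result in `reg4`, and ignores flags; the function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// lea (reg1, reg2), reg3 ; sub reg3, reg4
constexpr std::uint64_t subViaLea(std::uint64_t reg1, std::uint64_t reg2,
                                  std::uint64_t reg4) {
  return reg4 - (reg1 + reg2);
}

// sub reg1, reg4 ; sub reg2, reg4
constexpr std::uint64_t subTwice(std::uint64_t reg1, std::uint64_t reg2,
                                 std::uint64_t reg4) {
  return (reg4 - reg1) - reg2;
}

// Hypothetical helper: both sequences leave the same value in reg4.
inline void checkSame(std::uint64_t reg1, std::uint64_t reg2,
                      std::uint64_t reg4) {
  assert(subViaLea(reg1, reg2, reg4) == subTwice(reg1, reg2, reg4));
}
```

Because unsigned arithmetic is modular, the identity also holds when `reg1 + reg2` overflows.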
Krzysztof Parzyszek [Mon, 10 May 2021 20:26:57 +0000 (15:26 -0500)]
[Hexagon] Handle loads and stores of scalar predicate vectors
Handle v2i1, v4i1, and v8i1.
David Blaikie [Mon, 10 May 2021 21:30:22 +0000 (14:30 -0700)]
Clangd Matchers.h: Fix -Wdeprecated-copy by making the defaulted copy ctor and deleted copy assignment operators explicit
David Blaikie [Mon, 10 May 2021 21:28:09 +0000 (14:28 -0700)]
Remove some unnecessary explicit defaulted copy ctors to cleanup -Wdeprecated-copy
These types also wanted to be/were copy assignable, and using the
implicit copy ctor is deprecated in the presence of an explicit copy
ctor.
Removing the explicit copy ctor provides the desired behavior - both
ctor and assignment operator are available implicitly.
Also while I was nearby there were some missing std::moves on shared
pointer parameters.
Sanjay Patel [Mon, 10 May 2021 21:20:10 +0000 (17:20 -0400)]
[InstCombine] fold extract subvector of bitcast insertelt
This is visible in the original example from:
https://llvm.org/PR50055
(but this change doesn't solve the bug)
https://alive2.llvm.org/ce/z/vM_Yq-
Sanjay Patel [Mon, 10 May 2021 20:32:52 +0000 (16:32 -0400)]
[InstCombine] add tests for extract-subvector of insert; NFC
Artem Dergachev [Mon, 3 May 2021 21:32:37 +0000 (14:32 -0700)]
[clang-tidy] Aliasing: Add support for aggregates with references.
When a variable is used in an initializer of an aggregate
for its reference-type field this counts as aliasing.
Differential Revision: https://reviews.llvm.org/D101791
Artem Dergachev [Mon, 26 Apr 2021 20:52:01 +0000 (13:52 -0700)]
[clang-tidy] Aliasing: Add more support for captures.
D96215 takes care of the situation where the variable is captured into
a nearby lambda. This patch takes care of the situation where
the current function is the lambda and the variable is one of its captures
from an enclosing scope.
The analogous problem for ^{blocks} is already handled automagically
by D96215.
Differential Revision: https://reviews.llvm.org/D101787
Artem Dergachev [Mon, 26 Apr 2021 20:47:36 +0000 (13:47 -0700)]
[clang-tidy] Aliasing: Add support for captures.
The utility function clang::tidy::utils::hasPtrOrReferenceInFunc() scans the
function for pointer/reference aliases to a given variable. It currently scans
for operator & over that variable and for declarations of references to that
variable.
This patch makes it also scan for C++ lambda captures by reference
and for Objective-C block captures.
Differential Revision: https://reviews.llvm.org/D96215
Roman Lebedev [Mon, 10 May 2021 20:44:53 +0000 (23:44 +0300)]
[X86] AMD Zen 3: same-reg AVX YMM VPCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs.
Still not zero-cycle :)
Roman Lebedev [Mon, 10 May 2021 20:43:59 +0000 (23:43 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX YMM VPCMP
Roman Lebedev [Mon, 10 May 2021 20:40:34 +0000 (23:40 +0300)]
[X86] AMD Zen 3: same-reg AVX XMM VPCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs.
Again, it's not zero-cycle.
Roman Lebedev [Mon, 10 May 2021 20:36:28 +0000 (23:36 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX XMM VPCMP
Roman Lebedev [Mon, 10 May 2021 20:36:08 +0000 (23:36 +0300)]
[X86] AMD Zen 3: same-reg SSE XMM PCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs.
Much like with MMX PCMP, it does actually have to execute, though.
Roman Lebedev [Mon, 10 May 2021 20:28:27 +0000 (23:28 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg XMM SSE PCMP
Roman Lebedev [Mon, 10 May 2021 19:52:15 +0000 (22:52 +0300)]
[X86] AMD Zen 3: same-reg PCMPEQ is an MMX all-ones dep breaking idiom
They are, however, not zero-cycle, and do actually execute.
As measured by exegesis, and confirmed by ref docs.
Roman Lebedev [Mon, 10 May 2021 19:46:20 +0000 (22:46 +0300)]
[NFC][X86][MCA] AMD Zen 3: add tests for same-reg MMX PCMPEQ
Christopher Di Bella [Mon, 12 Apr 2021 05:23:09 +0000 (05:23 +0000)]
[libcxx] removes `weak_equality` and `strong_equality` from <compare>
`weak_equality` and `strong_equality` were removed from the draft before
being standardised, so they need to be removed from libc++ as well.
Also adjusts `common_comparison_category` since its test needed
adjusting due to the equality deletions.
Differential Revision: https://reviews.llvm.org/D100283
Arthur Eubanks [Mon, 10 May 2021 20:18:00 +0000 (13:18 -0700)]
[test] Put aix-xcoff-huge-relocs.ll under expensive checks
It is an order of magnitude slower than the second slowest test
according to obj/llvm/test/.lit_test_times.txt.
The two slowest are:
2.870437e+02 CodeGen/PowerPC/aix-xcoff-huge-relocs.ll
2.850697e+01 tools/llvm-readobj/ELF/file-header-machine-types.test
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D102190
Stella Laurenzo [Mon, 10 May 2021 20:03:30 +0000 (20:03 +0000)]
[mlir] Fix windows build bot break due to use of `alloca` in a test.
Differential Revision: https://reviews.llvm.org/D102189
Stella Laurenzo [Mon, 10 May 2021 18:03:40 +0000 (18:03 +0000)]
[mlir][Python] Finish adding RankedTensorType support for encoding.
Differential Revision: https://reviews.llvm.org/D102184
Nikita Popov [Sat, 24 Apr 2021 14:18:56 +0000 (16:18 +0200)]
[InstCombine] Fold comparison of integers by parts
Let's say you represent (i32, i32) as an i64 from which the parts
are extracted with lshr/trunc. Then, if you compare two tuples by
parts you get something like A[0] == B[0] && A[1] == B[1], just
that the part extraction happens by lshr/trunc and not a narrow
load or similar.
The fold implemented here reduces such equality comparisons by
converting them into a comparison on a larger part of the integer
(which might be the whole integer). It handles both the "and of eq"
and the conjugated "or of ne" case.
I'm being conservative with one-use for now, though this could be
relaxed if profitable (the base pattern converts 11 instructions
into 5 instructions, but there are quite a few variations on how it
can play out).
Differential Revision: https://reviews.llvm.org/D101232
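The equivalence behind this fold can be sketched at the source level. This is a hedged illustration of the arithmetic identity only, not the InstCombine implementation, and all function names are hypothetical. Two 32-bit parts are extracted from a 64-bit integer with a shift and a truncation, then compared part by part.

```cpp
#include <cassert>
#include <cstdint>

// "and of eq" form: compare both 32-bit parts extracted via shift/truncate.
bool equalByParts(std::uint64_t A, std::uint64_t B) {
  std::uint32_t ALo = static_cast<std::uint32_t>(A);
  std::uint32_t AHi = static_cast<std::uint32_t>(A >> 32);
  std::uint32_t BLo = static_cast<std::uint32_t>(B);
  std::uint32_t BHi = static_cast<std::uint32_t>(B >> 32);
  return ALo == BLo && AHi == BHi;
}

// "or of ne" form: the negation of the comparison above.
bool notEqualByParts(std::uint64_t A, std::uint64_t B) {
  std::uint32_t ALo = static_cast<std::uint32_t>(A);
  std::uint32_t AHi = static_cast<std::uint32_t>(A >> 32);
  std::uint32_t BLo = static_cast<std::uint32_t>(B);
  std::uint32_t BHi = static_cast<std::uint32_t>(B >> 32);
  return ALo != BLo || AHi != BHi;
}

// Both by-parts forms agree with a single comparison of the whole integer.
void checkFormsAgree(std::uint64_t A, std::uint64_t B) {
  assert(equalByParts(A, B) == (A == B));
  assert(notEqualByParts(A, B) == (A != B));
}
```

Because the two parts together cover every bit of the integer, the by-parts comparison carries exactly the same information as `A == B`, which is what lets the fold replace it with one wider comparison.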
Florian Hahn [Mon, 10 May 2021 19:49:19 +0000 (20:49 +0100)]
[VecLib] Add support for vector fns from Darwin's libsystem.
This patch adds support for Darwin's libsystem math vector functions to
TLI. Darwin's libsystem provides a range of vector functions for libm
functions.
This initial patch only adds the 2 x double and 4 x float versions,
which are available on both X86 and ARM64. On X86, wider vector versions
are supported as well.
Reviewed By: jroelofs
Differential Revision: https://reviews.llvm.org/D101856
Sam Clegg [Sat, 27 Feb 2021 00:09:32 +0000 (16:09 -0800)]
[lld][WebAssembly] Initial support for merging string data
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.
Unlike the ELF linker, this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)
Like the ELF linker, merging is only performed at `-O1` and above.
This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it does not currently support debug sections
because they are not represented by data segments (they are custom
sections).
Differential Revision: https://reviews.llvm.org/D97657
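Tail merging of single-byte, NUL-terminated strings can be sketched as below. This is a naive illustration of the idea ("lo" sharing storage with the end of "hello"), not lld's algorithm, and `MergedStrings` and `tailMerge` are hypothetical names. It assumes the input strings contain no embedded NUL bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct MergedStrings {
  std::string Data;                 // NUL-terminated strings, back to back.
  std::vector<std::size_t> Offsets; // Offsets[I] is where input I starts.
};

// Naive tail merging: place longer strings first, then reuse storage for any
// string that already appears (with its terminator) as the tail of a placed one.
MergedStrings tailMerge(const std::vector<std::string> &Strings) {
  std::vector<std::size_t> Order(Strings.size());
  for (std::size_t I = 0; I < Order.size(); ++I)
    Order[I] = I;
  std::stable_sort(Order.begin(), Order.end(),
                   [&](std::size_t L, std::size_t R) {
                     return Strings[L].size() > Strings[R].size();
                   });

  MergedStrings Result;
  Result.Offsets.resize(Strings.size());
  for (std::size_t Index : Order) {
    // Assumption of this sketch: no embedded NUL bytes in the input.
    assert(Strings[Index].find('\0') == std::string::npos);
    std::string Needle = Strings[Index];
    Needle.push_back('\0');
    // A match that ends in the terminator is a tail of an earlier string.
    std::size_t Pos = Result.Data.find(Needle);
    if (Pos == std::string::npos) {
      Pos = Result.Data.size();
      Result.Data += Needle;
    }
    Result.Offsets[Index] = Pos;
  }
  return Result;
}
```

For the inputs "lo", "hello", and "world", this sketch stores only "hello" and "world" and points "lo" at offset 3 inside "hello". The search is quadratic in the worst case, so it shows the layout rather than a practical implementation.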
Arthur Eubanks [Sun, 2 May 2021 04:27:47 +0000 (21:27 -0700)]
[NFC] Use ArgListEntry indirect types more in ISel lowering
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().
A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.
The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.
Differential Revision: https://reviews.llvm.org/D101713
Lang Hames [Mon, 10 May 2021 19:34:52 +0000 (12:34 -0700)]
[ORC] Use a unique_function rather than std::function for dispatchTask.