Jan Svoboda [Fri, 26 Nov 2021 12:39:25 +0000 (13:39 +0100)]
[clang][deps] NFC: Extract function
This commit extracts a couple of nested conditions into a separate function with early returns, making the control flow easier to understand.
Stanislav Funiak [Fri, 26 Nov 2021 12:39:04 +0000 (18:09 +0530)]
Added line numbers to the debug output of PDL bytecode.
This is a small diff that splits out the debug output for PDL bytecode. When running bytecode with debug output on, it is useful to know the line numbers where the PDLInterp operations are performed. Usually, these are in a single MLIR file, so it's sufficient to print out the line number rather than the entire location (which tends to be quite verbose). This debug output is gated by `LLVM_DEBUG` rather than `#ifndef NDEBUG` to make it easier to test.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D114061
Stanislav Funiak [Fri, 26 Nov 2021 12:38:50 +0000 (18:08 +0530)]
Multi-root PDL matching using upward traversals.
This is commit 4 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).
This PR integrates the various components (root ordering algorithm, nondeterministic execution of PDL bytecode) to implement multi-root PDL matching. The main idea is for the pattern to specify multiple candidate roots. The PDL-to-PDLInterp lowering selects one of these roots and "hangs" the pattern from this root, traversing the edges downwards (from an operation to its operands) when possible and upwards (from a value to its uses) when needed. The root is selected by invoking the optimal matching multiple times, once for each candidate root, and the connectors are determined from the optimal matching. The costs in the directed graph are equal to the number of upward edges that need to be traversed when connecting the given two candidate roots. It can be shown that, for this choice of the cost function, "hanging" the pattern from an inner node is no better than from the optimal root.
The following four main additions were implemented as a part of this PR:
1. OperationPos predicate has been extended to allow tracing the operation accepting a value (the opposite of operation defining a value).
2. Predicate checking if two values are not equal - this is useful to ensure that we do not traverse the edge back downwards after we traversed it upwards.
3. Function for building the cost graph among the candidate roots.
4. Updated buildPredicateList, building the predicates once the optimal branching has been determined.
Testing: unit tests (an integration test to follow once the stack of commits has landed)
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D108550
Stanislav Funiak [Fri, 26 Nov 2021 12:38:42 +0000 (18:08 +0530)]
Implementation of the root ordering algorithm
This is commit 3 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).
We form a graph over the specified roots, provided in `pdl.rewrite`, where two roots are connected by a directed edge if the target root can be connected (via a chain of operations) in the underlying pattern to the source root. We place a restriction that the path connecting the two candidate roots must only contain the nodes in the subgraphs underneath these two roots. The cost of an edge is the smallest number of upward traversals (edges) required to go from the source to the target root, and the connector is a `Value` in the intersection of the two subtrees rooted at the source and target root that results in that smallest number of such upward traversals. Optimal root ordering is then formulated as the problem of finding a spanning arborescence (i.e., a directed spanning tree) of minimal weight.
To determine the spanning arborescence (directed spanning tree) of minimum weight, we use [Edmonds' algorithm](https://en.wikipedia.org/wiki/Edmonds%27_algorithm). The worst-case computational complexity of this algorithm is O(_N_^3) for a single root, where _N_ is the number of specified roots. The `pdl`-to-`pdl_interp` lowering calls this algorithm as a subroutine _N_ times (once for each candidate root), so the overall complexity of root ordering is O(_N_^4). If needed, this complexity could be reduced to O(_N_^3) with a more efficient algorithm. However, note that the underlying implementation is very efficient, and _N_ in our instances tends to be very small (<10). Therefore, we believe that the proposed (asymptotically suboptimal) implementation will suffice for now.
Testing: a unit test of the algorithm
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D108549
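For illustration only, the root selection described above can be sketched as a textbook Chu-Liu/Edmonds contraction loop that is run once per candidate root. This is not the MLIR implementation: `Edge`, `minArborescence`, and `bestRoot` are hypothetical names, nodes are plain integers, and edge costs are assumed to be non-negative (as they are when a cost counts upward traversals).

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical edge type: a directed edge between two candidate roots.
struct Edge {
  int from, to;
  long long cost;
};

// Returns the weight of a minimum spanning arborescence rooted at `root`,
// or -1 if some node cannot be reached from `root`. Edges are taken by value
// because the contraction step rewrites them.
long long minArborescence(int numNodes, int root, std::vector<Edge> edges) {
  assert(root >= 0 && root < numNodes);
  const long long kInf = std::numeric_limits<long long>::max();
  long long total = 0;
  while (true) {
    // 1. Pick the cheapest incoming edge of every node.
    std::vector<long long> minIn(numNodes, kInf);
    std::vector<int> parent(numNodes, -1);
    for (const Edge &e : edges)
      if (e.from != e.to && e.cost < minIn[e.to]) {
        minIn[e.to] = e.cost;
        parent[e.to] = e.from;
      }
    for (int v = 0; v < numNodes; ++v)
      if (v != root && minIn[v] == kInf)
        return -1;
    minIn[root] = 0;

    // 2. Account for the chosen edges and detect cycles among them.
    int numComponents = 0;
    std::vector<int> component(numNodes, -1), visitedBy(numNodes, -1);
    for (int start = 0; start < numNodes; ++start) {
      total += minIn[start];
      int v = start;
      while (v != root && visitedBy[v] != start && component[v] == -1) {
        visitedBy[v] = start;
        v = parent[v];
      }
      if (v != root && component[v] == -1) { // walked back into this path
        for (int u = parent[v]; u != v; u = parent[u])
          component[u] = numComponents;
        component[v] = numComponents++;
      }
    }
    if (numComponents == 0)
      return total; // no cycle: the chosen edges form the arborescence

    // 3. Contract every cycle into one node and reduce the edge costs.
    for (int v = 0; v < numNodes; ++v)
      if (component[v] == -1)
        component[v] = numComponents++;
    for (Edge &e : edges) {
      const int oldTo = e.to;
      e.from = component[e.from];
      e.to = component[e.to];
      if (e.from != e.to)
        e.cost -= minIn[oldTo];
    }
    numNodes = numComponents;
    root = component[root];
  }
}

// Tries every candidate root and returns the one with the cheapest
// arborescence, or -1 if no root reaches all nodes.
int bestRoot(int numNodes, const std::vector<Edge> &edges) {
  int best = -1;
  long long bestCost = -1;
  for (int root = 0; root < numNodes; ++root) {
    const long long cost = minArborescence(numNodes, root, edges);
    if (cost >= 0 && (best == -1 || cost < bestCost)) {
      best = root;
      bestCost = cost;
    }
  }
  return best;
}
```

Calling `bestRoot` repeats the single-root computation for each of the _N_ candidates, which mirrors the "once for each candidate root" structure described in the commit message.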
Stanislav Funiak [Fri, 26 Nov 2021 12:38:34 +0000 (18:08 +0530)]
Introduced iterative bytecode execution.
This is commit 2 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).
This commit implements the features needed for the execution of the new operations pdl_interp.get_accepting_ops, pdl_interp.choose_op:
1. The implementation of the generation and execution of the two ops.
2. The addition of Stack of bytecode positions within the ByteCodeExecutor. This is needed because in pdl_interp.choose_op, we iterate over the values returned by pdl_interp.get_accepting_ops until we reach finalize. When we reach finalize, we need to return back to the position marked in the stack.
3. The functionality to extend the lifetime of values that cross the nondeterministic choice. The existing bytecode generator allocates the values to memory positions by representing the liveness of values as a collection of disjoint intervals over the matcher positions. This is akin to register allocation, and substantially reduces the footprint of the bytecode executor. However, with the iterative operation pdl_interp.choose_op, execution "returns" back, so any values whose original liveness crosses the nondeterministic choice must have their lifetime extended until finalize.
Testing: pdl-bytecode.mlir test
Reviewed By: rriddle, Mogball
Differential Revision: https://reviews.llvm.org/D108547
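The role of the stack of bytecode positions can be illustrated with a toy interpreter. This is only a sketch under simplifying assumptions, not the ByteCodeExecutor: the opcodes, `Instr`, `Frame`, and `runMatcher` are hypothetical, values are plain integers, and a "match" is just the list of choices that were active when a record instruction ran.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy opcodes; the real bytecode has many more.
enum class OpCode { ChooseOp, RecordMatch, Finalize };

struct Instr {
  OpCode code;
  int arg; // for ChooseOp: index of the candidate range to iterate over
};

// One entry of the resume stack: where to return to, and which candidate
// of that choice point to try next.
struct Frame {
  std::size_t pc;
  std::size_t next;
};

std::vector<std::vector<int>>
runMatcher(const std::vector<Instr> &program,
           const std::vector<std::vector<int>> &ranges) {
  std::vector<std::vector<int>> matches;
  std::vector<Frame> resumeStack;
  std::vector<int> chosen; // the currently selected candidates
  std::size_t pc = 0;
  bool resuming = false;
  while (true) {
    // Running off the end of the program behaves like Finalize.
    const Instr instr =
        pc < program.size() ? program[pc] : Instr{OpCode::Finalize, 0};
    if (instr.code == OpCode::ChooseOp) {
      assert(instr.arg >= 0 &&
             static_cast<std::size_t>(instr.arg) < ranges.size());
      if (!resuming)
        resumeStack.push_back({pc, 0}); // first arrival: mark the position
      else
        chosen.pop_back(); // drop the candidate picked on the previous pass
      resuming = false;
      Frame &frame = resumeStack.back();
      const std::vector<int> &range = ranges[instr.arg];
      if (frame.next < range.size()) {
        chosen.push_back(range[frame.next++]);
        ++pc;
        continue;
      }
      resumeStack.pop_back(); // candidates exhausted: backtrack further
    } else if (instr.code == OpCode::RecordMatch) {
      matches.push_back(chosen);
      ++pc;
      continue;
    }
    // Finalize (or an exhausted choice): return to the marked position.
    if (resumeStack.empty())
      return matches;
    pc = resumeStack.back().pc;
    resuming = true;
  }
}
```

In this sketch the `chosen` vector plays the part of values that stay live across the choice: it has to survive until finalize because execution returns to the choice point and continues from there.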
Igor Kirillov [Wed, 24 Nov 2021 17:23:24 +0000 (17:23 +0000)]
[AArch64][SVEIntrinsicOpts] Fix: predicated SVE mul/fmul are not commutative
We can not swap multiplicand and multiplier because the sve intrinsics
are predicated. Imagine lanes in vectors having the following values:
pg = 0
multiplicand = 1 (from dup)
multiplier = 2
The resulting value should be 1, but if we swap multiplicand and multiplier it will become 2,
which is incorrect.
Differential Revision: https://reviews.llvm.org/D114577
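The lane example above can be reproduced with a scalar model. This is a sketch, assuming merging predication where an inactive lane keeps the value of the first data operand; `predicatedMul` is a hypothetical helper, not an SVE intrinsic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of a predicated multiply: active lanes compute op1 * op2,
// inactive lanes pass the first data operand through unchanged.
std::vector<int> predicatedMul(const std::vector<bool> &pg,
                               const std::vector<int> &op1,
                               const std::vector<int> &op2) {
  assert(pg.size() == op1.size() && op1.size() == op2.size());
  std::vector<int> result(op1.size());
  for (std::size_t i = 0; i < op1.size(); ++i)
    result[i] = pg[i] ? op1[i] * op2[i] : op1[i];
  return result;
}
```

With `pg = {false, true}`, a multiplicand of all ones and a multiplier of all twos, the inactive lane yields 1 in the original operand order and 2 after swapping, while the active lane yields 2 either way.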
Bhumitram Kumar [Fri, 26 Nov 2021 12:35:27 +0000 (18:05 +0530)]
[Docs] Removed /Zd flag still mentioned in documentation
https://reviews.llvm.org/D93458 removed the /Zd flag as MSVC doesn't support that syntax. Instead users should be using -gline-tables-only.
The /Zd flag is still mentioned at https://clang.llvm.org/docs/UsersManual.html#clang-cl : /Zd Emit debug line number tables only.
Fix PR52571
Reviewed By: xgupta
Differential Revision: https://reviews.llvm.org/D114632
Stanislav Funiak [Fri, 26 Nov 2021 12:27:30 +0000 (17:57 +0530)]
Defines new PDLInterp operations needed for multi-root matching in PDL.
This is commit 1 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).
These operations are:
* pdl_interp.get_accepting_ops: Returns a list of operations accepting the given value or a range of values at the specified position. Thus if there are two operations `%op1 = "foo"(%val)` and `%op2 = "bar"(%val)` accepting a value at position 0, `%ops = pdl_interp.get_accepting_ops of %val : !pdl.value at 0` will return both of them. This allows us to traverse upwards from a value to operations accepting the value.
* pdl_interp.choose_op: Iteratively chooses one operation from a range of operations. Therefore, writing `%op = pdl_interp.choose_op from %ops` in the example above will select either `%op1` or `%op2`.
Testing: Added the corresponding test cases to mlir/test/Dialect/PDLInterp/ops.mlir.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D108543
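As a rough model of the upward traversal, the lookup of accepting operations can be sketched over a flat list of operations. The types below are hypothetical stand-ins rather than MLIR classes: values are integer ids and an operation is a name plus its operand ids.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an operation.
struct Operation {
  std::string name;
  std::vector<int> operands; // ids of the values this operation accepts
};

// Returns the indices of all operations that accept `value` as their
// operand number `position`.
std::vector<std::size_t> getAcceptingOps(const std::vector<Operation> &ops,
                                         int value, std::size_t position) {
  assert(value >= 0 && "value ids are non-negative in this sketch");
  std::vector<std::size_t> users;
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (position < ops[i].operands.size() &&
        ops[i].operands[position] == value)
      users.push_back(i);
  return users;
}
```

For the "foo" and "bar" example above, querying the shared value at position 0 returns both operations, and a matcher would then try them one at a time.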
Daniel Kiss [Fri, 26 Nov 2021 12:26:19 +0000 (13:26 +0100)]
[libunwind][ARM] Handle end of stack during unwind
When an unwind step reaches the end of the stack, the forced unwind should notify the stop function.
This is not an error; it may simply mean that the thread has been cleaned up completely.
Reviewed By: #libunwind, mstorsjo
Differential Revision: https://reviews.llvm.org/D109856
Carl Ritson [Fri, 26 Nov 2021 10:45:23 +0000 (19:45 +0900)]
[AMDGPU] Add SIMemoryLegalizer comments to clarify bit usage
Attempt to further document the intended cache policies requested
by different combinations of GLC, SLC and DLC bits.
GFX10 non-temporal stores are updated to set GLC.
Reviewed By: t-tye
Differential Revision: https://reviews.llvm.org/D114351
Abinav Puthan Purayil [Tue, 23 Nov 2021 17:13:18 +0000 (22:43 +0530)]
[GlobalISel] Fold or of shifts to funnel shift.
This change folds a basic funnel shift idiom:
- (or (shl x, amt), (lshr y, sub(bw, amt))) -> fshl(x, y, amt)
- (or (shl x, sub(bw, amt)), (lshr y, amt)) -> fshr(x, y, amt)
This also helps in folding to rotate shift if x and y are equal since we
already have a funnel shift to rotate combine.
Differential Revision: https://reviews.llvm.org/D114499
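The two identities can be checked on 32-bit integers. This is a sketch with hypothetical helper names: the reference funnel shifts are modeled through a 64-bit concatenation of `x` and `y`, and the idiom helpers assume `0 < amt < 32` so that no C++ shift reaches the bit width.

```cpp
#include <cassert>
#include <cstdint>

// Reference model: shift the 64-bit concatenation x:y left, keep the high half.
std::uint32_t fshl32(std::uint32_t x, std::uint32_t y, std::uint32_t amt) {
  const std::uint64_t concat = (static_cast<std::uint64_t>(x) << 32) | y;
  return static_cast<std::uint32_t>((concat << (amt % 32)) >> 32);
}

// Reference model: shift the concatenation x:y right, keep the low half.
std::uint32_t fshr32(std::uint32_t x, std::uint32_t y, std::uint32_t amt) {
  const std::uint64_t concat = (static_cast<std::uint64_t>(x) << 32) | y;
  return static_cast<std::uint32_t>(concat >> (amt % 32));
}

// (or (shl x, amt), (lshr y, sub(bw, amt))) with bw = 32.
std::uint32_t orOfShiftsLeft(std::uint32_t x, std::uint32_t y,
                             std::uint32_t amt) {
  assert(amt > 0 && amt < 32);
  return (x << amt) | (y >> (32 - amt));
}

// (or (shl x, sub(bw, amt)), (lshr y, amt)) with bw = 32.
std::uint32_t orOfShiftsRight(std::uint32_t x, std::uint32_t y,
                              std::uint32_t amt) {
  assert(amt > 0 && amt < 32);
  return (x << (32 - amt)) | (y >> amt);
}
```

When `x` and `y` are the same value, both forms reduce to a rotate, which is the case the existing funnel-shift-to-rotate combine picks up.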
David Sherwood [Mon, 25 Oct 2021 14:26:54 +0000 (15:26 +0100)]
[LoopVectorize] When tail-folding, don't always predicate uniform loads
In VPRecipeBuilder::handleReplication if we believe the instruction
is predicated we then proceed to create new VP region blocks even
when the load is uniform and only predicated due to tail-folding.
I have updated isPredicatedInst to avoid treating a uniform load as
predicated when tail-folding, which means we can do a single scalar
load and a vector splat of the value.
Tests added here:
Transforms/LoopVectorize/AArch64/tail-fold-uniform-memops.ll
Differential Revision: https://reviews.llvm.org/D112552
Jan Svoboda [Thu, 25 Nov 2021 17:54:18 +0000 (18:54 +0100)]
[clang][deps] NFC: Clean up wording (ignored vs minimized)
The filesystem used during dependency scanning does two things: it caches file entries and minimizes source file contents. We use the term "ignored file" in a couple of places, but it's not clear what exactly that means. This commit clears up the semantics, explicitly spelling out this relates to minimization.
Jan Svoboda [Fri, 26 Nov 2021 11:03:19 +0000 (12:03 +0100)]
[clang][deps] NFC: Remove else after early return
Bradley Smith [Thu, 18 Nov 2021 15:00:33 +0000 (15:00 +0000)]
[AArch64][SVE] Generate ASRD instructions for power of 2 signed divides
Differential Revision: https://reviews.llvm.org/D113281
David Green [Fri, 26 Nov 2021 10:57:14 +0000 (10:57 +0000)]
[ARM] Generate VCTP from SETCC
This converts a vector SETCC([0,1,2,..], splat(n), ult) to vctp n, which
can be fewer instructions and prevent the need for constant pool loads.
Differential Revision: https://reviews.llvm.org/D114177
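The equivalence behind this conversion can be seen in a lane-by-lane model. The sketch below uses hypothetical helper names and plain `std::vector<bool>` masks; it only models the resulting predicate, not instruction selection.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lane-wise unsigned "less than" of the index vector [0, 1, 2, ...]
// against splat(n).
std::vector<bool> iotaUltSplat(std::size_t lanes, std::uint32_t n) {
  assert(lanes <= 16 && "small fixed-width vectors only in this sketch");
  std::vector<bool> mask(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    mask[i] = static_cast<std::uint32_t>(i) < n;
  return mask;
}

// Model of a tail predicate: the first min(n, lanes) lanes are active.
std::vector<bool> tailPredicate(std::size_t lanes, std::uint32_t n) {
  assert(lanes <= 16 && "small fixed-width vectors only in this sketch");
  const std::size_t active = n < lanes ? n : lanes;
  std::vector<bool> mask(lanes, false);
  for (std::size_t i = 0; i < active; ++i)
    mask[i] = true;
  return mask;
}
```

Both functions produce the same mask for every `n`, which is why the compare against a constant index vector (and its constant pool load) can be replaced by a single predicate-generating instruction.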
Simon Pilgrim [Fri, 26 Nov 2021 10:32:01 +0000 (10:32 +0000)]
[DAG] SimplifyDemandedVectorElts - attempt to handle ADD(x,x) as single use
If the ADD node is the only user of the repeated operand, then treat this as single use - allows us to peek through shl(x,1) patterns.
Venkata Ramanaiah Nalamothu [Fri, 26 Nov 2021 10:04:57 +0000 (15:34 +0530)]
[lldb] Fix 'memory write' to not allow specifying values when writing file contents
Currently the 'memory write' command allows specifying the values when
writing the file contents to memory but the values are actually ignored. This
patch fixes that by erroring out when values are specified in such cases.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D114544
Mikhail Maltsev [Fri, 26 Nov 2021 10:12:19 +0000 (10:12 +0000)]
[libcxx] Implement three-way comparison for std::reverse_iterator
This patch implements operator<=> for std::reverse_iterator and
also adds a test that checks that three-way comparison of different
instantiations of std::reverse_iterator works as expected (related to
D113417).
Reviewed By: ldionne, Quuxplusone, #libc
Differential Revision: https://reviews.llvm.org/D113695
Kadir Cetinkaya [Thu, 25 Nov 2021 19:11:46 +0000 (20:11 +0100)]
[clang] Fix crash on broken parameter declarators
Differential Revision: https://reviews.llvm.org/D114609
David Green [Fri, 26 Nov 2021 09:41:09 +0000 (09:41 +0000)]
[ARM] Add some vctp from setcc tests. NFC
Kirill Bobyrev [Fri, 26 Nov 2021 09:10:49 +0000 (10:10 +0100)]
[clang] Change ordering of PreambleCallbacks to make sure PP can be referenced in them
Currently, BeforeExecute is called before BeginSourceFile which does not allow
using PP in the callbacks. Change the ordering to ensure it is possible.
This is a prerequisite for D114370.
Originated from a discussion with @kadircet.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D114525
David Sherwood [Mon, 22 Nov 2021 16:41:18 +0000 (16:41 +0000)]
[CodeGen] Add scalable vector support for lowering of llvm.get.active.lane.mask
Currently the generic lowering of llvm.get.active.lane.mask is done
in SelectionDAGBuilder::visitIntrinsicCall and currently assumes
only fixed-width vectors are used. This patch changes the code to be
more generic and support scalable vectors too. I have added tests
for SVE here:
CodeGen/AArch64/active_lane_mask.ll
although the code quality leaves a lot to be desired. The code will
be improved significantly in a later patch that makes use of the
SVE whilelo instruction.
Differential Revision: https://reviews.llvm.org/D114541
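What a generic lowering has to compute can be written as a scalar model: lane `i` is active when `base + i` is unsigned-less-than `n`, with the addition performed so that it cannot wrap. The sketch below is not the SelectionDAG code; `laneCount` and `activeLaneMask` are hypothetical names, and the runtime lane count stands in for a scalable vector whose length is a multiple of a minimum number of lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// For a scalable vector the lane count is only known at run time:
// a minimum number of lanes multiplied by a runtime factor.
std::size_t laneCount(std::size_t minLanes, std::size_t vscale) {
  assert(minLanes > 0 && vscale > 0);
  return minLanes * vscale;
}

// mask[i] = (base + i) <u n, evaluated in 64 bits so the sum cannot wrap.
std::vector<bool> activeLaneMask(std::uint32_t base, std::uint32_t n,
                                 std::size_t lanes) {
  std::vector<bool> mask(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    mask[i] = static_cast<std::uint64_t>(base) + i < n;
  return mask;
}
```

The same function works for any lane count, which is the property the fixed-width-only lowering lacked.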
Tobias Gysi [Fri, 26 Nov 2021 07:32:34 +0000 (07:32 +0000)]
[mlir][linalg] Simplify the hoist padding tests.
Use primarily matvec instead of matmul to test hoist padding. Test the hoisting only starting from already padded IR. Use one-dimensional tiling only except for the tile_and_fuse test that exercises hoisting on a larger loop nest with fill and pad tensor operations in the backward slice.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114608
Balázs Kéri [Thu, 25 Nov 2021 14:55:35 +0000 (15:55 +0100)]
[clang][AST] Check context of record in structural equivalence.
The AST structural equivalence check did not differentiate between
a struct and a struct with same name in different namespace. When
type of a member is checked it is possible to encounter such a case
and wrongly decide that the types are similar. This problem is fixed
by checking the namespaces of a record declaration.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D113118
Michal Terepeta [Fri, 26 Nov 2021 07:14:07 +0000 (07:14 +0000)]
[mlir][Vector] Minor formatting fixes in Vector.md
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D113854
Dmitry Vyukov [Thu, 25 Nov 2021 18:34:35 +0000 (19:34 +0100)]
tsan: remember and print function that installed at_exit callbacks
Sometimes stacks for at_exit callbacks don't include any of the user functions/files.
For example, a race with a global std container destructor will only contain
the container type name and our at_exit_wrapper function. There is no sign of which
global variable this is.
Remember and include in reports the function that installed the at_exit callback.
This should give clues as to what variable is being destroyed.
Depends on D114606.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D114607
Dmitry Vyukov [Thu, 25 Nov 2021 17:56:42 +0000 (18:56 +0100)]
tsan: add a test for on_exit
Depends on D114605.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D114606
Dmitry Vyukov [Thu, 25 Nov 2021 17:55:42 +0000 (18:55 +0100)]
tsan: add test for __cxa_atexit
Add a test for a common C++ bug when a global object is destroyed
while background threads still use it.
Depends on D114604.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D114605
Dmitry Vyukov [Thu, 25 Nov 2021 17:54:05 +0000 (18:54 +0100)]
tsan: check stack in atexit4.cpp test
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D114604
Kazu Hirata [Fri, 26 Nov 2021 06:17:10 +0000 (22:17 -0800)]
[llvm] Use range-based for loops (NFC)
Christudasan Devadasan [Mon, 6 Sep 2021 03:40:10 +0000 (23:40 -0400)]
[AMDGPU] Make vector superclasses allocatable
The combined vector register classes with both
VGPRs and AGPRs are currently unallocatable.
This patch turns them into allocatable as a
prerequisite to enable copy between VGPR and
AGPR registers during regalloc.
Also, added the missing AV register classes from
192b to 1024b.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D109300
Fangrui Song [Fri, 26 Nov 2021 04:24:23 +0000 (20:24 -0800)]
[ELF] Rename BaseCommand to SectionCommand. NFC
BaseCommand was picked when PHDRS/INSERT/etc were not implemented. Rename it to
SectionCommand to match `sectionCommands` and make it clear that the commands
are used in SECTIONS (except a special case for SymbolAssignment).
Also, improve naming of some BaseCommand variables (base -> cmd).
Jessica Clarke [Fri, 26 Nov 2021 03:58:36 +0000 (03:58 +0000)]
[NFC] Fix typo in
95875d246acb
Matthias Springer [Fri, 26 Nov 2021 02:41:07 +0000 (11:41 +0900)]
[mlir][linalg][bufferize][NFC] Pass BufferizationState to PostAnalysisStep
Pass BufferizationState instead of BufferizationAliasInfo. Note: BufferizationState contains BufferizationAliasInfo.
Differential Revision: https://reviews.llvm.org/D114512
Matthias Springer [Fri, 26 Nov 2021 02:35:10 +0000 (11:35 +0900)]
[mlir][linalg][bufferize] Compose dialect-specific bufferization state
Use composition instead of inheritance for storing dialect-specific bufferization state. This is in preparation of adding "tensor dialect"-specific bufferization state.
Differential Revision: https://reviews.llvm.org/D114508
Matthias Springer [Fri, 26 Nov 2021 02:25:46 +0000 (11:25 +0900)]
[mlir][linalg][bufferize][NFC] Allow returning arbitrary memrefs
If `allowReturnMemref` is set to true, arbitrary memrefs may be returned from FuncOps. Also remove allocation hoisting code, which is only partly implemented at the moment.
The purpose of this commit is to untangle `bufferize` from `aliasInfo`. (Even with this change, they are not fully untangled yet.)
Differential Revision: https://reviews.llvm.org/D114507
Matthias Springer [Fri, 26 Nov 2021 01:10:08 +0000 (10:10 +0900)]
[mlir][linalg][bufferize][NFC] Extract func boundary bufferization
Bufferization of function boundaries is extracted from ComprehensiveBufferize into a separate file. This will become its own build target in the future.
Differential Revision: https://reviews.llvm.org/D114226
Fangrui Song [Fri, 26 Nov 2021 00:55:06 +0000 (16:55 -0800)]
[ELF] Make ExprValue smaller. NFC
Fangrui Song [Fri, 26 Nov 2021 00:47:07 +0000 (16:47 -0800)]
[ELF] Rename OutputSection::sectionCommands to commands. NFC
This partially reverts r315409: the description applies to LinkerScript, but not
to OutputSection.
The name "sectionCommands" is used in both LinkerScript::sectionCommands and
OutputSection::sectionCommands, which may lead to confusion.
"commands" in OutputSection has no ambiguity because there are no other types
of commands.
Matthias Springer [Fri, 26 Nov 2021 00:21:04 +0000 (09:21 +0900)]
[mlir][linalg][bufferize][NFC] Move Affine interface impl to new build target
This makes ComprehensiveBufferize entirely independent of the Affine dialect.
Differential Revision: https://reviews.llvm.org/D114222
Mehdi Amini [Fri, 26 Nov 2021 00:13:32 +0000 (00:13 +0000)]
Fix link to the other docs from the Bufferization dialect
Fangrui Song [Thu, 25 Nov 2021 22:42:22 +0000 (14:42 -0800)]
[ELF] Remove redundant part.dynSymTab creation. NFC
Fangrui Song [Thu, 25 Nov 2021 22:23:25 +0000 (14:23 -0800)]
[ELF] Simplify GnuHashSection::write. NFC
Fangrui Song [Thu, 25 Nov 2021 22:12:34 +0000 (14:12 -0800)]
[ELF] Simplify DynamicSection content computation. NFC
The new code computes the content twice, but avoids the tricky
std::function<uint64_t()>. Removed 13KiB code in a Release build.
Jeremy Morse [Thu, 25 Nov 2021 21:41:55 +0000 (21:41 +0000)]
[DebugInfo][InstrRef] Add extra indirection for NRVO tests
In some scenarios, usually involving NRVO, we can issue indirect DBG_VALUEs
after SelectionDAG, even in instruction referencing mode (if the variable
is an argument). If the corresponding argument value is spilt to the stack,
then we have:
* Indirection from it being on the stack,
* Indirection from it being a dbg.declare or a dbg.addr.
However InstrRefBasedLDV only emits one level of indirection. This patch
adds the second, by adding an extra DW_OP_deref if necessary. The two
tests modified fail otherwise -- they feature some NRVO, and require two
levels of indirection to be correct.
Differential Revision: https://reviews.llvm.org/D114364
Zarko Todorovski [Thu, 25 Nov 2021 20:52:28 +0000 (15:52 -0500)]
[clang][NFC] Inclusive terms: rename AccessDeclContextSanity to AccessDeclContextCheck
Rename function to more inclusive name.
Reviewed By: quinnp
Differential Revision: https://reviews.llvm.org/D114029
Quinn Pham [Wed, 17 Nov 2021 15:39:43 +0000 (09:39 -0600)]
[NFC] Inclusive language: rename master flag to main flag
[NFC] As part of using inclusive language within the llvm project, this patch
renames master flag to main flag in these comments.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D114090
Quinn Pham [Mon, 15 Nov 2021 15:52:20 +0000 (09:52 -0600)]
[NFC][flang] Inclusive language: remove instances of master
[NFC] As part of using inclusive language within the llvm project, this patch:
- replaces master with main in C++style.md to match the renaming of the master
branch,
- removes master from `FortranIR.md` where it is superfluous,
- renames a logical variable in `pre-fir-tree04.f90` containing master.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D113923
Jeremy Morse [Thu, 25 Nov 2021 19:27:08 +0000 (19:27 +0000)]
[DebugInfo][InstrRef] Avoid some quadratic behaviour in LiveDebugVariables
This is a performance patch -- LiveDebugVariables can behave quadratically
if a lot of debug instructions are inserted back into the same place, and
we have to repeatedly step over the ones we've already inserted.
To get around it, whenever we insert a debug instruction at a slot index,
check whether there are more debug instructions to insert at this point,
and insert them too. That avoids the repeated lookup and stepping through.
It relies on the container for unlinked debug instructions being recorded
in-order, which is how LiveDebugVariables currently does it.
Differential Revision: https://reviews.llvm.org/D114587
Louis Dionne [Mon, 22 Nov 2021 19:51:09 +0000 (14:51 -0500)]
[libunwind] Fix testing with sanitizers enabled
When testing with sanitizers enabled, we need to link against a plethora
of system libraries. Using `-nodefaultlibs` like we used to breaks this,
and we would have to add all these system libraries manually, which is
not portable and error prone. Instead, stop using `-nodefaultlibs` so
that we get the libraries added by default by the compiler.
The only caveat with this approach is that we are now relying on the
fact that `-L <path-to-local-libunwind>` will cause the just built
libunwind to be selected before the system implementation (either of
libunwind or libgcc_s.so), which is somewhat fragile.
This patch also turns the 32 bit multilib build into a soft failure
since we are in the process of removing it anyway, see D114473 for
details. This patch is incompatible with the 32 bit multilib build
because Ubuntu does not provide a proper libstdc++ for 32 bits, and
that is required when running with sanitizers enabled.
Differential Revision: https://reviews.llvm.org/D114385
Florian Hahn [Thu, 25 Nov 2021 20:18:33 +0000 (20:18 +0000)]
[ThreadPool] Use auto again for future with ENABLE_THREADS=Off.
This fixes a build failure with LLVM_ENABLE_THREADS=Off.
Michal Terepeta [Thu, 25 Nov 2021 20:10:02 +0000 (20:10 +0000)]
[mlir][Vector] Support 0-D vectors in `VectorPrintOpConversion`
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114549
Jake Egan [Thu, 25 Nov 2021 20:10:42 +0000 (15:10 -0500)]
[AIX] Disable unsupported offloading gpu tests
GPUs are not supported on AIX, so this patch sets these tests as unsupported.
Reviewed By: stevewan
Differential Revision: https://reviews.llvm.org/D114381
Florian Hahn [Thu, 25 Nov 2021 20:07:53 +0000 (20:07 +0000)]
Recommit [ThreadPool] Support returning futures with results.
This reverts commit
71a7c55f0f021b04b9a7303d0cd391b9161cf303.
The revert broke building llvm-reduce and it is not clear it fixes an
issue with LLVM_ENABLE_THREADS=OFF.
See discussion in https://reviews.llvm.org/D114183 for more details.
Jesses Gott [Thu, 25 Nov 2021 19:42:40 +0000 (19:42 +0000)]
[clang-format] Extend AllowShortBlocksOnASingleLine for else blocks
Extend AllowShortBlocksOnASingleLine for else blocks. See https://bugs.llvm.org/show_bug.cgi?id=49722
Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D114320
Quinn Pham [Thu, 18 Nov 2021 20:13:03 +0000 (14:13 -0600)]
[NFC][llvm] Inclusive language: replace master in llvm docs
[NFC] As part of using inclusive language within the llvm project, this patch
removes instances of master in these files.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D114187
mydeveloperday [Thu, 25 Nov 2021 19:34:37 +0000 (19:34 +0000)]
[clang-format] NFC update LLVM overall clang-formatted status
Whilst the % clang-formatted remains the same, the number
of files added to the LLVM project has risen by almost 259.
- 190 of them have been added clang-format clean.
- 69 files have been added unformatted. (lit tests should be excluded from this number)
- 291 files have been added to the list of files that are clang-format clean
- 101 files have either become unclean or have been removed
As this updates the clang-formatted-files there are now
8139 files that are clean, which can be used as a regression test when making changes to clang-format.
```
clang-format -verbose -n -files ./clang/docs/tools/clang-formatted-files.txt
```
Quinn Pham [Fri, 19 Nov 2021 17:00:14 +0000 (11:00 -0600)]
[NFC][compiler-rt] Inclusive language: replace master/slave with primary/secondary
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master and slave with primary and secondary respectively in
`sanitizer_mac.cpp`.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D114255
Uday Bondhugula [Thu, 25 Nov 2021 15:08:32 +0000 (20:38 +0530)]
[MLIR] NFC. Rename MLIR CAPI ExecutionEngine target for consistency
Rename MLIR CAPI ExecutionEngine target for consistency:
MLIRCEXECUTIONENGINE -> MLIRCAPIExecutionEngine in line with other
targets.
Differential Revision: https://reviews.llvm.org/D114596
Amy Kwan [Thu, 25 Nov 2021 17:17:37 +0000 (11:17 -0600)]
[PowerPC] Prevent the optimizer from producing wide vector types in IR.
This patch prevents the optimizer from producing wide vectors in the IR,
specifically the MMA types (v256i1, v512i1). The idea is that on Power, we
want to produce these types only if the vector_pair and vector_quad types
are used specifically.
To prevent the optimizer from producing these types in the IR,
vectorCostAdjustmentFactor() is updated to return an instruction cost factor or
an invalid instruction cost if the current type is that of an MMA type. An
invalid instruction cost returned by this function signifies to other cost
computing functions to return the maximum instruction cost to inform the
optimizer that producing these types within the IR is expensive, and should not
be produced in the first place.
This issue was first seen in the test case included within this patch.
Differential Revision: https://reviews.llvm.org/D113900
Quinn Pham [Wed, 17 Nov 2021 16:21:05 +0000 (10:21 -0600)]
[NFC][llvm] Inclusive language: replace master with main in dbg-call-site-spilled-arg.mir
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with main in `dbg-call-site-spilled-arg.mir`.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D114097
Pavel Kosov [Thu, 25 Nov 2021 18:27:02 +0000 (21:27 +0300)]
[LLDB] Provide target specific directories to libclang
On Linux some C++ and C include files reside in target specific directories, like /usr/include/x86_64-linux-gnu.
This patch adds them to libclang, so the LLDB JIT has a better chance of compiling expressions.
OS Laboratory. Huawei Russian Research Institute. Saint-Petersburg
Reviewed By: teemperor
Differential Revision: https://reviews.llvm.org/D110827
Louis Dionne [Wed, 24 Nov 2021 21:41:03 +0000 (16:41 -0500)]
[libc++] Fix ssize test that made an assumption about ptrdiff_t being 'long'
On some platforms like armv7m, the size() method of containers returns
unsigned long, while ptrdiff_t is just int. Hence, std::ssize_t ends up
being long, which is not the same as ptrdiff_t. This is usually not an
issue because std::ptrdiff_t is long, so everything works out, but it
breaks on some more exotic architectures.
Differential Revision: https://reviews.llvm.org/D114563
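The mismatch can be reproduced on any host by parameterizing the two types involved. The C++ standard specifies the return type of `std::ssize(c)` as the common type of `std::ptrdiff_t` and the signed counterpart of `decltype(c.size())`; the alias `SsizeResult` below is a hypothetical helper that computes the same thing for arbitrary stand-in types.

```cpp
#include <cstddef>
#include <type_traits>

// Result type of ssize for a platform whose ptrdiff_t is `PtrDiff` and whose
// containers report their size as `SizeType`.
template <class PtrDiff, class SizeType>
using SsizeResult = typename std::common_type<
    PtrDiff, typename std::make_signed<SizeType>::type>::type;

// Modeled "exotic" platform: ptrdiff_t is int, size() returns unsigned long.
static_assert(std::is_same<SsizeResult<int, unsigned long>, long>::value,
              "the result is long, which differs from the int ptrdiff_t");

// Modeled common platform: ptrdiff_t is long, size() returns unsigned long.
static_assert(std::is_same<SsizeResult<long, unsigned long>, long>::value,
              "the result coincides with ptrdiff_t");

// Whatever the host uses, the result is always a signed type.
static_assert(
    std::is_signed<SsizeResult<std::ptrdiff_t, std::size_t>>::value,
    "ssize yields a signed type");
```

A test that compares the result type against `std::ptrdiff_t` therefore only passes on platforms of the second kind; comparing against the computed common type is portable.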
Lei Huang [Wed, 17 Nov 2021 21:43:35 +0000 (15:43 -0600)]
[CMake] Add new cmake option to control adding comments in GenDAGISel
Add new cmake option `LLVM_OMIT_DAGISEL_COMMENTS` to control adding
of comments in GenDAGISel for non-debug builds.
Ref: https://reviews.llvm.org/D78884
Reviewed By: nemanjai, MaskRay, #powerpc
Differential Revision: https://reviews.llvm.org/D114122
Dmitry Vyukov [Tue, 27 Apr 2021 11:55:41 +0000 (13:55 +0200)]
tsan: new runtime (v3)
This change switches tsan to the new runtime which features:
- 2x smaller shadow memory (2x of app memory)
- faster fully vectorized race detection
- small fixed-size vector clocks (512b)
- fast vectorized vector clock operations
- unlimited number of alive threads/goroutines
Depends on D112602.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D112603
Daniel McIntosh [Thu, 25 Nov 2021 17:18:38 +0000 (12:18 -0500)]
Revert "[ThreadPool] Support returning futures with results."
This reverts commit
6149e57dc1313d32c85524f8009a1249e0b8f4d1.
The offending commit broke building with LLVM_ENABLE_THREADS=OFF.
Quinn Pham [Wed, 17 Nov 2021 16:48:25 +0000 (10:48 -0600)]
[NFC][clang-tools-extra] Inclusive language: replace master with main
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with main in `SubModule2.h`.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D114100
Kazu Hirata [Thu, 25 Nov 2021 16:55:16 +0000 (08:55 -0800)]
[llvm] Use range-based for loops (NFC)
Joe Loser [Thu, 25 Nov 2021 03:48:40 +0000 (22:48 -0500)]
[libc++] Avoid overload resolution in path comparison operators
Rework `std::filesystem::path::operator==` and friends to avoid overload
resolution and atomic constraint caching issues shown from
https://reviews.llvm.org/D113161.
Always call `__compare(string_view)` from the comparison operators which avoids
overload resolution.
Differential Revision: https://reviews.llvm.org/D114570
Joe Loser [Wed, 24 Nov 2021 20:35:15 +0000 (15:35 -0500)]
[libc++] Fix constraints for string_view's iterator/sentinel constructor
The `string_view` constructor taking an iterator/sentinel uses concepts
instead of type traits like the Standard states. Using `same_as` instead
of `is_same_v` should be harmless. Prefer `std::is_same_v` instead which is
cheaper to compile. Replace `convertible_to` with `is_convertible_v` as
well.
This observation came up while working on
https://reviews.llvm.org/D113161
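To make the type-trait half of that constraint concrete, here is a minimal C++17 sketch for a `char`-based view. The variable template `kTraitsAccept` is a hypothetical name, and the iterator/sentinel concepts (`contiguous_iterator`, `sized_sentinel_for`) that the real constructor also requires are deliberately left out.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical helper: true when the iterator's value type is char and the
// sentinel type is not convertible to a size (which would otherwise collide
// with a (pointer, length) constructor).
template <class It, class End>
inline constexpr bool kTraitsAccept =
    std::is_same_v<typename std::iterator_traits<It>::value_type, char> &&
    !std::is_convertible_v<End, std::size_t>;

// A pair of char pointers is accepted.
static_assert(kTraitsAccept<const char*, const char*>);
// A size as the second argument is rejected.
static_assert(!kTraitsAccept<const char*, std::size_t>);
// A different character type is rejected.
static_assert(!kTraitsAccept<const wchar_t*, const wchar_t*>);
```

Both traits are plain `bool` constants, so the compiler does not need to instantiate and normalize a concept to evaluate them.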
Differential Revision: https://reviews.llvm.org/D114561
Dmitry Vyukov [Thu, 25 Nov 2021 15:26:40 +0000 (16:26 +0100)]
tsan: fix another potential deadlock in fork
Linux/fork_deadlock.cpp currently hangs in debug mode in the following stack.
Disable memory access handling in OnUserAlloc/Free around fork.
1 0x000000000042c54b in __sanitizer::internal_sched_yield () at sanitizer_linux.cpp:452
2 0x000000000042da15 in __sanitizer::StaticSpinMutex::LockSlow (this=0x57ef02 <__sanitizer::internal_allocator_cache_mu>) at sanitizer_mutex.cpp:24
3 0x0000000000423927 in __sanitizer::StaticSpinMutex::Lock (this=0x57ef02 <__sanitizer::internal_allocator_cache_mu>) at sanitizer_mutex.h:32
4 0x000000000042354c in __sanitizer::GenericScopedLock<__sanitizer::StaticSpinMutex>::GenericScopedLock (this=this@entry=0x7ffcabfca0b8, mu=0x1) at sanitizer_mutex.h:367
5 0x0000000000423653 in __sanitizer::RawInternalAlloc (size=size@entry=72, cache=cache@entry=0x0, alignment=8, alignment@entry=0) at sanitizer_allocator.cpp:52
6 0x00000000004235e9 in __sanitizer::InternalAlloc (size=size@entry=72, cache=0x1, cache@entry=0x0, alignment=4, alignment@entry=0) at sanitizer_allocator.cpp:86
7 0x000000000043aa15 in __sanitizer::SymbolizedStack::New (addr=4802655) at sanitizer_symbolizer.cpp:45
8 0x000000000043b353 in __sanitizer::Symbolizer::SymbolizePC (this=0x7f578b77a028, addr=4802655) at sanitizer_symbolizer_libcdep.cpp:90
9 0x0000000000439dbe in __sanitizer::(anonymous namespace)::StackTraceTextPrinter::ProcessAddressFrames (this=this@entry=0x7ffcabfca208, pc=4802655) at sanitizer_stacktrace_libcdep.cpp:36
10 0x0000000000439c89 in __sanitizer::StackTrace::PrintTo (this=this@entry=0x7ffcabfca2a0, output=output@entry=0x7ffcabfca260) at sanitizer_stacktrace_libcdep.cpp:109
11 0x0000000000439fe0 in __sanitizer::StackTrace::Print (this=0x18) at sanitizer_stacktrace_libcdep.cpp:132
12 0x0000000000495359 in __sanitizer::PrintMutexPC (pc=4802656) at tsan_rtl.cpp:774
13 0x000000000042e0e4 in __sanitizer::InternalDeadlockDetector::Lock (this=0x7f578b1ca740, type=type@entry=2, pc=pc@entry=4371612) at sanitizer_mutex.cpp:177
14 0x000000000042df65 in __sanitizer::CheckedMutex::LockImpl (this=<optimized out>, pc=4) at sanitizer_mutex.cpp:218
15 0x000000000042bc95 in __sanitizer::CheckedMutex::Lock (this=0x600001000000) at sanitizer_mutex.h:127
16 __sanitizer::Mutex::Lock (this=0x600001000000) at sanitizer_mutex.h:165
17 0x000000000042b49c in __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock (this=this@entry=0x7ffcabfca370, mu=0x1) at sanitizer_mutex.h:367
18 0x000000000049504f in __tsan::TraceSwitch (thr=0x7f578b1ca980) at tsan_rtl.cpp:656
19 0x000000000049523e in __tsan_trace_switch () at tsan_rtl.cpp:683
20 0x0000000000499862 in __tsan::TraceAddEvent (thr=0x7f578b1ca980, fs=..., typ=__tsan::EventTypeMop, addr=4499472) at tsan_rtl.h:624
21 __tsan::MemoryAccessRange (thr=0x7f578b1ca980, pc=4499472, addr=135257110102784, size=size@entry=16, is_write=true) at tsan_rtl_access.cpp:563
22 0x000000000049853a in __tsan::MemoryRangeFreed (thr=thr@entry=0x7f578b1ca980, pc=pc@entry=4499472, addr=addr@entry=135257110102784, size=16) at tsan_rtl_access.cpp:487
23 0x000000000048f6bf in __tsan::OnUserFree (thr=thr@entry=0x7f578b1ca980, pc=pc@entry=4499472, p=p@entry=135257110102784, write=true) at tsan_mman.cpp:260
24 0x000000000048f61f in __tsan::user_free (thr=thr@entry=0x7f578b1ca980, pc=4499472, p=p@entry=0x7b0400004300, signal=true) at tsan_mman.cpp:213
25 0x000000000044a820 in __interceptor_free (p=0x7b0400004300) at tsan_interceptors_posix.cpp:708
26 0x00000000004ad599 in alloc_free_blocks () at fork_deadlock.cpp:25
27 __tsan_test_only_on_fork () at fork_deadlock.cpp:32
28 0x0000000000494870 in __tsan::ForkBefore (thr=0x7f578b1ca980, pc=pc@entry=4904437) at tsan_rtl.cpp:510
29 0x000000000046fcb4 in syscall_pre_fork (pc=1) at tsan_interceptors_posix.cpp:2577
30 0x000000000046fc9b in __sanitizer_syscall_pre_impl_fork () at sanitizer_common_syscalls.inc:3094
31 0x00000000004ad5f5 in myfork () at syscall.h:9
32 main () at fork_deadlock.cpp:46
Depends on D114595.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114597
Dmitry Vyukov [Thu, 25 Nov 2021 15:06:23 +0000 (16:06 +0100)]
tsan: fix Java heap block begin in reports
We currently use the wrong value for the heap block begin
(it only works for C++, not for Java).
Use the correct value (we already computed it, but forgot to use it).
Depends on D114593.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114595
Dmitry Vyukov [Thu, 25 Nov 2021 14:42:56 +0000 (15:42 +0100)]
tsan: add a benchmark for vector memory accesses
Depends on D114592.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114593
Dmitry Vyukov [Thu, 25 Nov 2021 14:41:15 +0000 (15:41 +0100)]
tsan: add a test for vector memory accesses
Add a basic test that checks races between vector/non-vector
read/write accesses of different sizes/offsets in different orders.
This gives coverage of __tsan_read/write16 callbacks.
Depends on D114591.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114592
Dmitry Vyukov [Thu, 25 Nov 2021 14:38:58 +0000 (15:38 +0100)]
tsan: enable -msse4 when compiling tests
Vector SSE accesses make the compiler emit __tsan_[unaligned_]read/write16 callbacks.
Make it possible to test these.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D114591
David Green [Thu, 25 Nov 2021 15:43:45 +0000 (15:43 +0000)]
[ARM] Convert fptoi.sat to fixed point multiply
This is a very small addition to the existing MVE fixed point vcvt code
to also create them from FP_TO_SINT_SAT and FP_TO_UINT_SAT nodes, which
should be equally valid for native saturating converts under MVE.
Differential Revision: https://reviews.llvm.org/D114360
Zarko Todorovski [Thu, 25 Nov 2021 13:18:08 +0000 (08:18 -0500)]
[llvm][ubsan] Inclusive language: replace use of blacklist in HandleLLVMOptions.cmake but use old option name
Retry of https://reviews.llvm.org/D113689, this time using the old option name
to support older versions of clang.
Reviewed By: bjope
Differential Revision: https://reviews.llvm.org/D114033
Jeremy Morse [Thu, 25 Nov 2021 14:46:34 +0000 (14:46 +0000)]
[DebugInfo][InstrRef] Track variable assignments in out-of-scope blocks
DBG_INSTR_REF's and DBG_VALUE's can end up in blocks that aren't in the
lexical scope of their variable. It's arguable as to what we should do
about this, however VarLocBasedLDV permits such variable locations to be
propagated, so let's allow it in InstrRefBasedLDV.
It's necessary for the modified test to work.
Differential Revision: https://reviews.llvm.org/D114578
David Green [Thu, 25 Nov 2021 14:41:20 +0000 (14:41 +0000)]
[ARM] Add fptosi.sat variants of the fixed point vcvt tests. NFC
Alok Kumar Sharma [Wed, 24 Nov 2021 05:08:19 +0000 (10:38 +0530)]
[clang][OpenMP][DebugInfo] Debug support for private variables inside an OpenMP task construct
Currently, variables appearing in the private/firstprivate/lastprivate
clauses of an OpenMP task construct are not visible in the lldb debugger.
This is because the compiler does not generate debug info for them.
Please consider the test case debug_private.c attached to the patch.
```
28 #pragma omp task shared(res) private(priv1, priv2) firstprivate(fpriv)
29 {
30 priv1 = n;
31 priv2 = n + 2;
32 printf("Task n=%d,priv1=%d,priv2=%d,fpriv=%d\n",n,priv1,priv2,fpriv);
33
-> 34 res = priv1 + priv2 + fpriv + foo(n - 1);
35 }
36 #pragma omp taskwait
37 return res;
(lldb) p priv1
error: <user expression 0>:1:1: use of undeclared identifier 'priv1'
priv1
^
(lldb) p priv2
error: <user expression 1>:1:1: use of undeclared identifier 'priv2'
priv2
^
(lldb) p fpriv
error: <user expression 2>:1:1: use of undeclared identifier 'fpriv'
fpriv
^
```
After the current patch, lldb is able to show the variables
```
(lldb) p priv1
(int) $0 = 10
(lldb) p priv2
(int) $1 = 12
(lldb) p fpriv
(int) $2 = 14
```
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D114504
Tres Popp [Mon, 22 Nov 2021 11:49:00 +0000 (12:49 +0100)]
Don't store nullptrs in mlir::FuncOp::getAll*Attrs' result
These methods for results and arguments would create an ArrayRef full
of nullptrs when there were no argument attributes. This is problematic
because this result could not be passed to the FuncOp::build creator
without causing a segfault. Now the list will have empty attributes.
Differential Revision: https://reviews.llvm.org/D114358
Simon Pilgrim [Thu, 25 Nov 2021 13:39:48 +0000 (13:39 +0000)]
[PowerPC] Regenerate fp128-bitcast-after-operation test checks
Alexey Bataev [Thu, 25 Nov 2021 13:17:30 +0000 (05:17 -0800)]
Revert "[SLP]Improve analysis/emission of vector operands for alternate nodes."
This reverts commit 496254cf802a21e1967b61dec48017b8ec831574 to fix
compiler crashes reported in D114101#3152982.
seongwon bang [Thu, 25 Nov 2021 12:23:48 +0000 (21:23 +0900)]
[MLIR] [docs] Fix misguided examples in memref.subview operation.
The examples in the `memref.subview` operation docs are misleading, because subview's strides operands mean "memref-rank number of strides that compose multiplicatively with the base memref strides in each dimension".
So the examples below should be changed from `Strides: [64, 4, 1]` to `Strides: [1, 1, 1]`.
Before changes
```
// Subview with constant offsets, sizes and strides.
%1 = memref.subview %0[0, 2, 0][4, 4, 4][64, 4, 1]
: memref<8x16x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2)> to
memref<4x4x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2 + 8)>
```
After changes
```
// Subview with constant offsets, sizes and strides.
%1 = memref.subview %0[0, 2, 0][4, 4, 4][1, 1, 1]
: memref<8x16x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2)> to
memref<4x4x4xf32, (d0, d1, d2) -> (d0 * 64 + d1 * 4 + d2 + 8)>
```
I also fixed some syntax issues in the docs related to the memref layout map and added a detailed explanation of the rank-reducing subview case.
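The arithmetic behind the corrected example can be checked with a small sketch. `StridedLayout` and `composeSubview` are hypothetical names used only for illustration and are not MLIR APIs; the helper multiplies the subview strides with the base strides per dimension and folds the offsets into a linear offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of a strided layout: one stride per dimension
// plus a linear offset.
struct StridedLayout {
  std::vector<std::int64_t> strides;
  std::int64_t offset;
};

// Hypothetical helper: the result stride in each dimension is the subview
// stride times the base stride, and the result offset adds each subview
// offset scaled by the base stride.
StridedLayout composeSubview(const StridedLayout& base,
                             const std::vector<std::int64_t>& offsets,
                             const std::vector<std::int64_t>& subviewStrides) {
  assert(offsets.size() == base.strides.size());
  assert(subviewStrides.size() == base.strides.size());
  StridedLayout result{{}, base.offset};
  for (std::size_t d = 0; d < base.strides.size(); ++d) {
    result.offset += offsets[d] * base.strides[d];
    result.strides.push_back(subviewStrides[d] * base.strides[d]);
  }
  return result;
}
```

For the base layout `d0 * 64 + d1 * 4 + d2` with offsets `[0, 2, 0]`, strides `[1, 1, 1]` give result strides 64, 4, 1 and offset 8, which matches the result type in the example. Strides `[64, 4, 1]` would instead give 4096, 16, 1, which does not.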
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D114500
Zarko Todorovski [Thu, 25 Nov 2021 02:56:49 +0000 (21:56 -0500)]
[NFC][llvm] Inclusive language: reword uses of sanity test and check
Part of continuing work to use more inclusive language. Reworded uses
of sanity check and sanity test in llvm/test/
Kirill Bobyrev [Thu, 25 Nov 2021 12:19:01 +0000 (13:19 +0100)]
[clangd] Move IncludeCleaner tracer to the actual computation
This way we won't get results with 0 ms for all the users with disabled
IncludeCleaner.
mydeveloperday [Thu, 25 Nov 2021 11:50:34 +0000 (11:50 +0000)]
[clang-format] [C++20] [Module] clang-format couldn't recognize partitions
https://bugs.llvm.org/show_bug.cgi?id=52517
clang-format is butchering modules; this could easily become a barrier to entry for modules given clang-format's widespread use.
Prevent the following from adding spaces around the `:` (cf was considering the ':' as an InheritanceColon)
Reviewed By: HazardyKnusperkeks, owenpan, ChuanqiXu
Differential Revision: https://reviews.llvm.org/D114151
Pavel Labath [Wed, 24 Nov 2021 10:20:44 +0000 (11:20 +0100)]
[lldb/gdb-remote] Ignore spurious ACK packets
Although I cannot find any mention of this in the specification, both
gdb and lldb agree on sending an initial + packet after establishing the
connection.
OTOH, gdbserver and lldb-server behavior is subtly different. While
lldb-server *expects* the initial ack, and drops the connection if it is
not received, gdbserver will just ignore a spurious ack at _any_ point
in the connection.
This patch changes lldb's behavior to match that of gdb. An ACK packet
is ignored at any point in the connection (except when expecting an ACK
packet, of course). This is in line with the "be strict in what you
generate, and lenient in what you accept" philosophy, and also enables
us to remove some special cases from the server code. I've extended the
same handling to NAK (-) packets, mainly because I don't see a reason to
treat them differently here.
(The background here is that we had a stub which was sending spurious
+ packets. This bug has since been fixed, but I think this change makes
sense nonetheless.)
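The lenient behavior described above can be sketched as follows. This is a minimal illustration, not lldb's actual code: `dropSpuriousAcks` and its `expectingAck` flag are hypothetical names, and the input is assumed to be the raw bytes received so far.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: when no acknowledgement is expected, skip any leading
// ACK ('+') or NAK ('-') bytes so that the next real packet starts the view.
// When an acknowledgement is expected, leave the buffer alone so the caller
// can consume it.
std::string_view dropSpuriousAcks(std::string_view buffer, bool expectingAck) {
  if (expectingAck)
    return buffer;
  std::size_t i = 0;
  while (i < buffer.size() && (buffer[i] == '+' || buffer[i] == '-'))
    ++i;
  return buffer.substr(i);
}
```

For example, `dropSpuriousAcks("+$OK#9a", false)` yields `$OK#9a`, while the same input with `expectingAck` set to `true` is returned unchanged.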
Differential Revision: https://reviews.llvm.org/D114520
Pavel Labath [Wed, 24 Nov 2021 13:11:22 +0000 (14:11 +0100)]
[lldb/gdb-remote] Remove initial pipe-draining workaround
This code, added in rL197579 (Dec 2013) is supposed to work around what
was presumably a qemu bug, where it would send unsolicited stop-reply
packets after the initial connection.
At present, qemu does not exhibit such behavior. Also, the 10ms delay
introduced by this code is sufficient to mask bugs in other stubs, but
it is not sufficient to *reliably* mask those bugs. This resulted in
flakiness in one of our stubs, which was (incorrectly) sending a +
packet at the start of the connection, resulting in a small-but-annoying
number of dropped connections.
Differential Revision: https://reviews.llvm.org/D114529
Simon Pilgrim [Thu, 25 Nov 2021 11:14:06 +0000 (11:14 +0000)]
[DAG] SimplifyDemandedBits - simplify rotl/rotr to shl/srl (REAPPLIED)
If we only demand bits from one half of a rotation pattern, see if we can simplify to a logical shift.
For the ARM/AArch64 rev16/32 patterns, I had to drop a fold to prevent srl(bswap()) -> rotr(bswap) -> srl(bswap) infinite loops. I've replaced this with an isel PatFrag which should do the same task.
Reapplied with fix for AArch64 rev patterns to match the ARM fix.
https://alive2.llvm.org/ce/z/iroxki (rol -> shl by amt iff demanded bits has at least as many trailing zeros as the shift amount)
https://alive2.llvm.org/ce/z/4ez_U- (ror -> shl by revamt iff demanded bits has at least as many trailing zeros as the reverse shift amount)
https://alive2.llvm.org/ce/z/cD7dR- (ror -> lshr by amt iff demanded bits has at least as many leading zeros as the shift amount)
https://alive2.llvm.org/ce/z/_XGHtQ (rol -> lshr by revamt iff demanded bits has at least as many leading zeros as the reverse shift amount)
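The first and last of these conditions can be checked by brute force on 8-bit values. The sketch below is only an illustration of the identities, not the DAG combine itself; `rotl8`, `rotlEqualsShl`, and `rotlEqualsLshr` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Rotate an 8-bit value left by amt, for amt in [1, 7].
std::uint8_t rotl8(std::uint8_t x, unsigned amt) {
  assert(amt >= 1 && amt <= 7);
  return static_cast<std::uint8_t>((x << amt) | (x >> (8 - amt)));
}

// True if rotl(x, amt) and shl(x, amt) agree on every demanded bit,
// for all 256 possible inputs.
bool rotlEqualsShl(std::uint8_t demanded, unsigned amt) {
  for (unsigned v = 0; v < 256; ++v) {
    auto x = static_cast<std::uint8_t>(v);
    auto shl = static_cast<std::uint8_t>(x << amt);
    if ((rotl8(x, amt) & demanded) != (shl & demanded))
      return false;
  }
  return true;
}

// True if rotl(x, amt) and lshr(x, 8 - amt) agree on every demanded bit,
// for all 256 possible inputs.
bool rotlEqualsLshr(std::uint8_t demanded, unsigned amt) {
  for (unsigned v = 0; v < 256; ++v) {
    auto x = static_cast<std::uint8_t>(v);
    auto lshr = static_cast<std::uint8_t>(x >> (8 - amt));
    if ((rotl8(x, amt) & demanded) != (lshr & demanded))
      return false;
  }
  return true;
}
```

With demanded bits `0xF0` (four trailing zeros), `rotlEqualsShl` holds for shift amounts 3 and 4 but fails for 5. With demanded bits `0x07` (five leading zeros), `rotlEqualsLshr` holds for amount 3 (reverse amount 5) but fails for amount 2 (reverse amount 6).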
Differential Revision: https://reviews.llvm.org/D114354
mydeveloperday [Thu, 25 Nov 2021 11:11:30 +0000 (11:11 +0000)]
[clang-format]NFC improve the comment to match the code
Missing from {D114519}
mydeveloperday [Thu, 25 Nov 2021 11:04:17 +0000 (11:04 +0000)]
[clang-format] [PR52595] clang-format does not recognize rvalue references to array
https://bugs.llvm.org/show_bug.cgi?id=52595
A space is missing in `T(&&)` but not in `T (&)`, due to `&&` being incorrectly treated as `UnaryOperator` rather than `PointerOrReference`
```
int operator()(T (&)[N]) { return 0; }
int operator()(T(&&)[N]) { return 1; }
```
Existing unit tests are changed because I think they were originally incorrect and are inconsistent with the (&) cases 4 or 5 lines above them.
Reviewed By: curdeius
Differential Revision: https://reviews.llvm.org/D114519
Alexander Belyaev [Thu, 25 Nov 2021 10:42:16 +0000 (11:42 +0100)]
[mlir] Move memref.[tensor_load|buffer_cast|clone] to "bufferization" dialect.
https://llvm.discourse.group/t/rfc-dialect-for-bufferization-related-ops/4712
Differential Revision: https://reviews.llvm.org/D114552
Tobias Gysi [Thu, 25 Nov 2021 10:42:09 +0000 (10:42 +0000)]
[mlir][linalg] Cleanup hoisting test (NFC).
Rename the check prefixes to HOIST21 and HOIST32 to clarify the different flag configurations.
Depends On D114438
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114442
Tobias Gysi [Thu, 25 Nov 2021 10:37:00 +0000 (10:37 +0000)]
[mlir][linalg] Perform checks early in hoist padding.
Instead of checking for unexpected operations (any operation with a region except for scf::For and `padTensorOp` or operations with a memory effect) while cloning the packing loop nest perform the checks early. Update `dropNonIndexDependencies` to check for unexpected operations. Additionally, check all of these operations have index type operands only.
Depends On D114428
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114438
Tobias Gysi [Thu, 25 Nov 2021 10:31:19 +0000 (10:31 +0000)]
[mlir][linalg] Limit hoist padding to constant paddings.
Limit hoist padding to pad tensor ops that depend only on a constant value. Supporting arbitrary padding values that depend on computations that are part of the backward slice to hoist requires complex analysis to ensure the computation can be hoisted.
Depends On D114420
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114428
Tobias Gysi [Thu, 25 Nov 2021 10:23:28 +0000 (10:23 +0000)]
[mlir][linalg] Add backward slice filtering in hoist padding.
Adapt hoist padding to filter the backward slice before cloning the packing loop nest. The filtering removes all operations that are not used to index the hoisted pad tensor op and its extract slice op. The filtering is needed to support the more complex loop nests created after fusion. For example, fusing the producer of an output operand can add linalg ops and pad tensor ops to the backward slice. These operations have regions and currently prevent hoisting.
The following example demonstrates the effect of the newly introduced `dropNonIndexDependencies` method that filters the backward slice:
```
%source = linalg.fill(%cst, %arg0)
scf.for %i
%unrelated = linalg.fill(%cst, %arg1) // not used to index %source!
scf.for %j (%arg2 = %unrelated)
scf.for %k // not used to index %source!
%ubi = affine.min #map(%i)
%ubj = affine.min #map(%j)
%slice = tensor.extract_slice %source [%i, %j] [%ubi, %ubj]
%padded_slice = linalg.pad_tensor %slice
```
dropNonIndexDependencies(%padded_slice, %slice)
removes [scf.for %k, linalg.fill(%cst, %arg1)] from backwardSlice.
Depends On D114175
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114420
Sheldon Neuberger [Thu, 25 Nov 2021 10:22:56 +0000 (11:22 +0100)]
[clangd] Add ObjC method support to prepareCallHierarchy
This fixes "textDocument/prepareCallHierarchy" in clangd for ObjC methods. Details at https://github.com/clangd/vscode-clangd/issues/247.
clangd uses Decl::isFunctionOrFunctionTemplate to check if the decl given in a prepareCallHierarchy request is eligible for prepareCallHierarchy. We change to use isFunctionOrMethod which includes functions and ObjC methods.
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D114058
David Green [Thu, 25 Nov 2021 10:19:29 +0000 (10:19 +0000)]
[SDAG] Allow Unknown sizes when refining MMO alignments. NFC
The changes in D113888 / 32b6c17b29079e7d altered the memory size of a
masked store, as it will store an unknown number of bytes, not the full
vector size. We can have situations where the masked store is legalized
and then turned into a normal store, as the mask is known to be all ones.
This creates a store with an unknown size MMO that was hitting this
assert.
The store created can be given a better size in a followup patch. This
currently adjusts the assert to handle unknown sizes.