Nick Lewycky [Wed, 28 Apr 2021 21:33:00 +0000 (14:33 -0700)]
NFC: Run clang-format over llvm-link.
Bardia Mahjour [Wed, 28 Apr 2021 21:24:26 +0000 (17:24 -0400)]
[LV] Consider Loop Unroll Hints When Making Interleave Decisions
This patch causes the loop vectorizer to not interleave loops that have
nounroll loop hints (llvm.loop.unroll.disable and llvm.loop.unroll_count(1)).
Note that if a particular interleave count is being requested
(through llvm.loop.interleave_count), it will still be honoured, regardless
of the presence of nounroll hints.
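The precedence described above can be sketched as a small decision function. This is only an illustrative model, not the loop vectorizer's code: the function name and all three parameters are hypothetical.

```python
def choose_interleave_count(requested_count, has_nounroll_hint, cost_model_count):
    """Toy model of the interleave decision; every name here is hypothetical."""
    # An explicit llvm.loop.interleave_count request is honoured first.
    if requested_count is not None:
        return requested_count
    # Otherwise a nounroll hint (unroll.disable or unroll_count(1))
    # suppresses interleaving, i.e. an interleave count of 1.
    if has_nounroll_hint:
        return 1
    # With no hints at all, fall back to whatever the cost model chose.
    return cost_model_count
```

Checking the explicit request before the nounroll hint is what keeps a requested interleave count in effect even when nounroll is present.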
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D101374
Petr Hosek [Wed, 15 Jul 2020 21:10:56 +0000 (14:10 -0700)]
[libc++] Support per-target __config_site in per-target runtime build
When using the per-target runtime build, it may be desirable to have
different __config_site headers for each target where all targets cannot
share a single configuration.
The layout used for libc++ headers after this change is:
```
include/
  c++/
    v1/
      <libc++ headers except for __config_site>
  <target1>/
    c++/
      v1/
        __config_site
  <target2>/
    c++/
      v1/
        __config_site
  <other targets>
```
This is the optimal layout since it avoids duplication: the only
header that is per-target is __config_site, and all other headers are
shared across targets. This also means that we now need two
-isystem flags: one for the target-agnostic headers and one for
the target-specific headers.
Differential Revision: https://reviews.llvm.org/D89013
Sanjay Patel [Wed, 28 Apr 2021 20:13:32 +0000 (16:13 -0400)]
[InstCombine] relax masking requirement for truncated funnel/rotate match
I was investigating a seemingly unrelated improvement in demanded
bits for shift-left, but that caused regressions on these tests
because we were able to look through/eliminate the mask.
https://alive2.llvm.org/ce/z/Ztdr22
define i8 @src(i32 %x, i32 %y, i32 %shift) {
  %and = and i32 %shift, 3
  %conv = and i32 %x, 255
  %shr = lshr i32 %conv, %and
  %sub = sub i32 8, %and
  %shl = shl i32 %y, %sub
  %or = or i32 %shr, %shl
  %conv2 = trunc i32 %or to i8
  ret i8 %conv2
}

define i8 @tgt(i32 %x, i32 %y, i32 %shift) {
  %x8 = trunc i32 %x to i8
  %y8 = trunc i32 %y to i8
  %shift8 = trunc i32 %shift to i8
  %and = and i8 %shift8, 3
  %conv2 = call i8 @llvm.fshr.i8(i8 %y8, i8 %x8, i8 %and)
  ret i8 %conv2
}

declare i8 @llvm.fshr.i8(i8,i8,i8)
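To see why the two functions agree, the IR above can be modelled with plain integer arithmetic. The following is a rough Python sketch and no substitute for the Alive2 proof: `src`, `tgt` and `fshr8` are illustrative names, and masking with 0xFFFFFFFF and 0xFF stands in for i32 and i8 wrap-around.

```python
def src(x, y, shift):
    """Model of @src: i32 shifts and an or, then truncation to i8."""
    amt = shift & 3                       # %and
    conv = x & 0xFF                       # %conv
    shr = conv >> amt                     # %shr
    shl = (y << (8 - amt)) & 0xFFFFFFFF   # %shl, wrapped to 32 bits
    return (shr | shl) & 0xFF             # %or followed by trunc to i8


def fshr8(hi, lo, amt):
    """Model of llvm.fshr.i8: shift the 16-bit value hi:lo right, keep 8 bits."""
    amt %= 8
    concat = ((hi & 0xFF) << 8) | (lo & 0xFF)
    return (concat >> amt) & 0xFF


def tgt(x, y, shift):
    """Model of @tgt: truncate everything first, then one funnel shift."""
    return fshr8(y & 0xFF, x & 0xFF, (shift & 0xFF) & 3)
```

Because the shift amount is masked to the range 0..3, only the low 8 bits of `%x` and `%y` can reach the truncated result, which is why narrowing before the funnel shift is safe here.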
Sanjay Patel [Wed, 28 Apr 2021 19:49:19 +0000 (15:49 -0400)]
[InstCombine] add tests for rotate/funnel; NFC
Jessica Paquette [Wed, 28 Apr 2021 18:13:19 +0000 (11:13 -0700)]
[AArch64][GlobalISel] Don't match thread-local globals in matchFoldGlobalOffset
SelectionDAG has separate ISD opcodes for regular global values and thread-local
global values, while GlobalISel does not.
This combine was ported from SDAG directly without knowing that. As a result,
it was running on TLS globals.
This makes it so that `matchFoldGlobalOffset` doesn't match on TLS globals, and
adds an assert to `selectTLSGlobalValue` to make sure that TLS globals never
have offsets.
Differential Revision: https://reviews.llvm.org/D101478
Philip Reames [Wed, 28 Apr 2021 20:46:26 +0000 (13:46 -0700)]
[tests] Precommit some extra tests for D100884
Mike Urbach [Sat, 24 Apr 2021 02:32:54 +0000 (20:32 -0600)]
[mlir][python] Update `PyOpResult.owner` to get the parent object.
Previously, this API would return the PyObjectRef, rather than the
underlying PyOperation.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D101416
Duncan P. N. Exon Smith [Wed, 28 Apr 2021 01:02:22 +0000 (18:02 -0700)]
Linker: Avoid scheduling the link of a global value twice due to an alias
3d4f3a0da90bd1a3 (https://reviews.llvm.org/D20586) avoided rescheduling
a global value that was materialized first through a regular value, and
then again through an alias. This commit catches the dual, avoiding
rescheduling when the global value is first materialized through an
alias.
Differential Revision: https://reviews.llvm.org/D101419
Radar-Id: rdar://75752728
Philip Reames [Wed, 28 Apr 2021 19:47:32 +0000 (12:47 -0700)]
[SCEV] Avoid range intersection idiom in getRangeForUnkownRecurrence [NFC]
Addresses a review comment from D101181
Arthur Eubanks [Wed, 28 Apr 2021 19:42:54 +0000 (12:42 -0700)]
Revert "[Clang] -Wunused-but-set-parameter and -Wunused-but-set-variable"
This reverts commit 9b0501abc7b515b740fb5ee929817442dd3029a5.
False positives reported in D100581.
Anirudh Prasad [Wed, 28 Apr 2021 19:42:23 +0000 (15:42 -0400)]
[AsmParser][SystemZ][z/OS] Use updated framework in AsmLexer to accept special tokens as Identifiers
- Previously, https://reviews.llvm.org/D99889 changed the framework in the AsmLexer to treat special tokens, if they occur at the start of the string, as Identifiers.
- These are used by the MASM Parser implementation in LLVM, and we can extend some of the changes made in the previous patch to SystemZ.
- In SystemZ, the special "tokens" referred to here are "_", "$", "@", "#". [_|$|@|#] are already supported as "part" of an Identifier.
- The changes in this patch ensure that these special tokens, when they occur at the start of the Identifier, are treated as Identifiers.
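As a rough illustration of the rule in the list above, here is a toy identifier check in Python. It is not the AsmLexer implementation: `is_identifier` and `SPECIAL_CHARS` are hypothetical names, and accepting a lone special character as an identifier is a simplification made for this sketch.

```python
SPECIAL_CHARS = set("_$@#")  # the SystemZ special "tokens" named above


def is_identifier(token):
    """Toy check: the special characters are allowed at the start as well."""
    if not token:
        return False
    first, rest = token[0], token[1:]
    # The first character may be an ASCII letter or one of the special chars.
    if not ((first.isascii() and first.isalpha()) or first in SPECIAL_CHARS):
        return False
    # Later characters may also be digits.
    return all((ch.isascii() and ch.isalnum()) or ch in SPECIAL_CHARS for ch in rest)
```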
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D100959
Philip Reames [Wed, 28 Apr 2021 19:35:22 +0000 (12:35 -0700)]
[SCEV] Compute ranges for ashr recurrences
Straightforward extension to the recently added infrastructure which was pioneered with shl. This was originally posted as part of D99687, but split off for ease of review.
(I also decided to exclude the unknown start sign case explicitly for simplicity of understanding.)
Differential Revision: https://reviews.llvm.org/D101181
Louis Dionne [Wed, 28 Apr 2021 19:28:37 +0000 (15:28 -0400)]
[libc++][NFC] Remove stray whitespace
This might have helped align static_asserts originally, but it doesn't
anymore since we use LIBCPP_STATIC_ASSERT.
Florian Hahn [Wed, 28 Apr 2021 19:02:47 +0000 (20:02 +0100)]
[LAA] Support pointer phis in loop by analyzing each incoming pointer.
SCEV does not look through non-header PHIs inside the loop. Such phis
can be analyzed by adding separate accesses for each incoming pointer
value.
This results in 2 more loops vectorized in SPEC2000/186.crafty and
avoids regressions when sinking instructions before vectorizing.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D101286
Craig Topper [Wed, 28 Apr 2021 18:13:10 +0000 (11:13 -0700)]
[TableGen] Store predicates in PatternToMatch as ListInit *. Add string for HwModeFeatures
This used to be how predicates were handled prior to HwMode being
added. When the Predicates were converted to a std::vector it
significantly increased the cost of a compare in GenerateVariants.
Since ListInits are uniquified by tablegen, we can use a simple
pointer comparison to check for identical lists.
In order to store the HwMode, we now add a separate string to
PatternToMatch. This will be appended separately to the predicate
string in getPredicateCheck. A new getPredicateRecords is added
to allow GlobalISel and getPredicateCheck to both get the sorted
list of Records. GlobalISel was ignoring any HwMode predicates
before and still is.
There is one slight change here: ListInits with different predicate
orders aren't sorted, so the filtering in GenerateVariants might
fail to detect two isomorphic patterns with different predicate
orders. This doesn't seem to be happening in tree today.
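The trade-off can be illustrated with a small interning table. This Python sketch is only an analogy for uniquified lists, not TableGen code: `get_list_init` is a hypothetical helper and the predicate names are used purely as sample data.

```python
_interned = {}  # maps list contents to the single shared object for them


def get_list_init(records):
    """Return the one shared tuple for this exact sequence of records."""
    key = tuple(records)
    return _interned.setdefault(key, key)


a = get_list_init(["HasAVX", "In64BitMode"])
b = get_list_init(["HasAVX", "In64BitMode"])
c = get_list_init(["In64BitMode", "HasAVX"])

same_order_identical = a is b    # identity check replaces element-wise compare
reordered_identical = a is c     # same predicates, different order
```

Here `same_order_identical` is true while `reordered_identical` is false, even though `a` and `c` hold the same predicates. That is the case in which an identity comparison can miss two equivalent lists.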
My hope is this will allow us to remove all the BitVector tracking
in GenerateVariants that was making up for predicates being
expensive to compare. There's a decent amount of heap allocations
there on large targets like X86, AMDGPU, and RISCV.
Differential Revision: https://reviews.llvm.org/D100691
Sanjay Patel [Wed, 28 Apr 2021 18:11:46 +0000 (14:11 -0400)]
[InstCombine] add tests for demand of shl op; NFC
Martin Storsjö [Tue, 27 Apr 2021 21:11:46 +0000 (00:11 +0300)]
[libcxx] Stop hardcoding the bash path in the Windows CI
The buildbots now have bash available in the path from the start.
Differential Revision: https://reviews.llvm.org/D101436
Ryan Santhirarajan [Wed, 28 Apr 2021 18:59:40 +0000 (11:59 -0700)]
[ARM] Neon Polynomial vadd Intrinsic fix
The Neon vadd intrinsics were added to the ARMSIMD intrinsic map;
however, because they were defined under an AArch64 guard in arm_neon.td,
they were not previously usable on ARM. This change rectifies that.
It is important to note that poly128 is not valid on ARM, thus it was
extracted out of the original arm_neon.td definition and separated
for the sake of AArch64.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D100772
MaheshRavishankar [Wed, 28 Apr 2021 18:01:22 +0000 (11:01 -0700)]
[mlir][Linalg] Avoid changing the rank of the result in canonicalizations of subtensor.
Canonicalizations for subtensor operations defaulted to use the
rank-reduced version of the operation, but the cast inserted to get
back the original type would be illegal if the rank was actually
reduced. Instead make the canonicalization not reduce the rank of the
operation.
Differential Revision: https://reviews.llvm.org/D101258
Jonas Devlieghere [Tue, 27 Apr 2021 01:20:49 +0000 (18:20 -0700)]
[dsymutil] Add flag to force a static variable to keep its enclosing function
Add a flag to change dsymutil's behavior and force a static variable to
keep its enclosing function. The test shows a situation where that could
be useful. I'm not convinced this behavior makes sense as a default,
which is why it's behind a flag.
rdar://74918374
Differential revision: https://reviews.llvm.org/D101337
Sam Clegg [Wed, 28 Apr 2021 17:42:08 +0000 (10:42 -0700)]
Fix typo from https://reviews.llvm.org/D101399
Joe Nash [Mon, 29 Mar 2021 18:58:29 +0000 (14:58 -0400)]
[AMDGPU] Make some VOP3 insts commutable
Note, only src0 and src1 will be commuted if the isCommutable flag
is set. This patch does not change that, it just makes it possible
to commute src0 and src1 of some U/I/B vop3 instructions.
This patch revises d35d8da7d6ac6c08578ec0569b072292631691e0.
It contains the commute opportunities excluding float insts.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D101474
Change-Id: I62938173d750453839f2457a3851661a29135faf
Alexander Belyaev [Wed, 28 Apr 2021 17:48:27 +0000 (19:48 +0200)]
[mlir] Fix canonicalization of tiled_loop if not all opresults fold.
The current canonicalization did not remap operation results correctly
and attempted to erase tiledLoop, which is incorrect if not all tensor
results are folded.
Sam Clegg [Tue, 27 Apr 2021 20:45:10 +0000 (13:45 -0700)]
[lld][WebAssembly] Allow relocations against non-live global symbols
Just as for function and data symbols, this is needed to
support relocations in debug info sections, which are allowed to contain
relocations against non-live symbols.
The motivating use case is an object file that contains debug info that
references `__stack_pointer` (a local symbol) but does not actually
contain any uses of `__stack_pointer`.
Fixes: https://github.com/emscripten-core/emscripten/issues/14025
Differential Revision: https://reviews.llvm.org/D101399
Mark de Wever [Wed, 28 Apr 2021 17:13:52 +0000 (19:13 +0200)]
[libc++][CI] Fix check-generated-output.
Previously, the script detected non-ASCII characters but let them pass. This
fixes the issue. I had a way to solve the issue, but during review @Quuxplusone
suggested a better alternative, and the patch has been changed to use it.
Intended failed builds:
- Not updated generated files https://buildkite.com/llvm-project/libcxx-ci/builds/2822
- Not updated generated files and non-ASCII usage https://buildkite.com/llvm-project/libcxx-ci/builds/2835
- Non-ASCII usage https://buildkite.com/llvm-project/libcxx-ci/builds/2836
Reviewed By: #libc, Quuxplusone, curdeius
Differential Revision: https://reviews.llvm.org/D101303
Craig Topper [Wed, 28 Apr 2021 16:55:36 +0000 (09:55 -0700)]
[RISCV] Add explanatory comment to RISCVOp::OPERAND_AVL.
Nico Weber [Tue, 20 Apr 2021 14:58:19 +0000 (10:58 -0400)]
[clang] Make libBasic not depend on MC
Reduces numbers of files built for clang-format from 575 to 449.
Requires two small changes:
1. Don't use llvm::ExceptionHandling in LangOptions. This isn't
even quite the right type since we don't use all of its values.
Tweaks the changes made in:
- https://reviews.llvm.org/D93215
- https://reviews.llvm.org/D93216
2. Move section name validation code added (long ago) in commit 30ba67439
out of libBasic into Sema and base the check on the triple. This is a bit less
OOP-y, but completely in line with what we do in many other places in Sema.
No behavior change.
Differential Revision: https://reviews.llvm.org/D101463
Roman Lebedev [Wed, 28 Apr 2021 15:28:39 +0000 (18:28 +0300)]
[SimplifyCFG] Try 2: sink all-indirect indirect calls
Note that we don't want to turn a partially-direct call
into an indirect one; that would break ICP, among other things.
Roman Lebedev [Wed, 28 Apr 2021 14:58:08 +0000 (17:58 +0300)]
[NFC][SimplifyCFG] Add common code sinking test with direct and indirect callees
This is the pattern ICP produces.
We shouldn't fold this back into an indirect call.
Florian Hahn [Wed, 28 Apr 2021 15:36:30 +0000 (16:36 +0100)]
[PhaseOrdering] Add test for vectorization requiring hoisting/sinking.
Valeriy Savchenko [Wed, 28 Apr 2021 15:55:20 +0000 (18:55 +0300)]
[analyzer][NFC] Fix tests failing after a rebase
Valeriy Savchenko [Wed, 21 Apr 2021 13:01:30 +0000 (16:01 +0300)]
[analyzer] Find better description for tracked symbolic values
When searching for stores and creating corresponding notes, the
analyzer is more specific about the target region of the store
as opposed to the stored value. While this description was tweaked
for constant and undefined values, it lacked in the most general
case of symbolic values.
This patch tries to find a memory region, where this value is stored,
to use it as a better alias for the value.
rdar://76645710
Differential Revision: https://reviews.llvm.org/D101041
Valeriy Savchenko [Tue, 20 Apr 2021 14:08:55 +0000 (17:08 +0300)]
[analyzer] Track leaking object through stores
Since we can report memory leaks on one variable while the originally
allocated object was stored into another one, we should explain
how it got there.
rdar://76645710
Differential Revision: https://reviews.llvm.org/D100852
Valeriy Savchenko [Fri, 16 Apr 2021 18:10:05 +0000 (21:10 +0300)]
[analyzer] Adjust the reported variable name in retain count checker
When reporting leaks, we try to attach the leaking object to some
variable, so it's easier to understand. Before the patch, we always
tried to use the first variable that stored the object in question.
This can get very confusing for the user, if that variable doesn't
contain that object at the moment of the actual leak. In many cases,
the warning is dismissed as false positive and it is effectively a
false positive when we fail to properly explain the warning to the
user.
This patch addresses the biggest issue in cases like this. Now we
check if the variable still contains the leaking symbolic object.
If not, we look for the last variable to actually hold it and use
that variable instead.
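The selection rule can be sketched over a simple list of store events. This is an illustrative Python model, not the checker's implementation: `variable_to_report` is a hypothetical name, and "last variable to hold it" is approximated here as the variable most recently assigned the object.

```python
def variable_to_report(stores, leaked):
    """stores: (variable, value) assignments in program order (toy model)."""
    current = {}          # what each variable holds after all stores
    first_holder = None   # the variable the pre-patch behaviour would report
    last_holder = None    # most recent variable that was assigned the object
    for var, value in stores:
        current[var] = value
        if value == leaked:
            if first_holder is None:
                first_holder = var
            last_holder = var
    # Keep the first variable only if it still contains the leaked object.
    if first_holder is not None and current.get(first_holder) == leaked:
        return first_holder
    return last_holder
```

For the stores `a = obj; b = obj; a = None` this reports `b`, whereas always using the first variable would have reported `a`, which no longer holds the object at the point of the leak.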
rdar://76645710
Differential Revision: https://reviews.llvm.org/D100839
Valeriy Savchenko [Fri, 16 Apr 2021 08:22:25 +0000 (11:22 +0300)]
[analyzer][NFC] Remove duplicated work from retain count leak report
Allocation site is the key location for the leak checker. It is a
uniqueing location for the report and a source of information for
the warning's message.
Before this patch, we calculated and used it twice: in the bug report and
in the bug report visitor. Such duplication is harmful not only
performance-wise (not much, but still) but also design-wise, because
any change to the end piece of the report had to be
repeated for the description as well.
Differential Revision: https://reviews.llvm.org/D100626
David Candler [Wed, 28 Apr 2021 14:16:01 +0000 (15:16 +0100)]
[ARM][AArch64] Require appropriate features for crypto algorithms
This patch changes the AArch32 crypto instructions (sha2 and aes) to
require the specific sha2 or aes features. These features have
already been implemented and can be controlled through the command
line, but do not have the expected result (i.e. `+noaes` will not
disable aes instructions). The crypto feature retains its existing
meaning of both sha2 and aes.
Several small changes are included due to the knock-on effect this has:
- The AArch32 driver has been modified to ensure sha2/aes is correctly
set based on arch/cpu/fpu selection and feature ordering.
- Crypto extensions are permitted for AArch32 v8-R profile, but not
enabled by default.
- ACLE feature macros have been updated with the fine grained crypto
algorithms. These are also used by AArch64.
- Various tests updated due to the change in feature lists and macros.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D99079
Frederik Gossen [Wed, 28 Apr 2021 15:16:02 +0000 (17:16 +0200)]
Revert "[MLIR][Shape] Concretize broadcast result type if possible"
This reverts commit dca536103592cf1e92aa8316ed23f33d75da25bc.
Nicolas Vasilache [Wed, 28 Apr 2021 13:22:34 +0000 (13:22 +0000)]
[mlir][python] Add basic python support for GPU dialect and passes
Differential Revision: https://reviews.llvm.org/D101449
Nicolas Vasilache [Tue, 27 Apr 2021 19:57:56 +0000 (19:57 +0000)]
[mlir][python] Add python support for async dialect and passes.
Since the `async` keyword is reserved in Python, the dialect is called async_dialect.
Differential Revision: https://reviews.llvm.org/D101447
Roman Lebedev [Wed, 28 Apr 2021 14:46:59 +0000 (17:46 +0300)]
Revert "[SimplifyCFG] Sinking indirect calls - they're already indirect anyways"
Seems to break indirect call promotion, LTO/Resolution/X86/load-sample-prof-icp.ll fails.
This reverts commit e57cf128b30a88c6dd42e8ef40deeedd0d7f116d.
Roman Lebedev [Wed, 28 Apr 2021 14:33:52 +0000 (17:33 +0300)]
[SimplifyCFG] Sinking indirect calls - they're already indirect anyways
Roman Lebedev [Wed, 28 Apr 2021 14:32:17 +0000 (17:32 +0300)]
[NFC][SimplifyCFG] Add test for sinking indirect calls
Dawid Jurczak [Wed, 28 Apr 2021 14:21:21 +0000 (10:21 -0400)]
[SimplifyLibCalls] Transform printf("%s", str) --> puts(str)/noop
Before this change, LLVM could not simplify printf in the following cases:
printf("%s", "") --> noop
printf("%s", str"\n") --> puts(str)
On the other hand, GCC has been able to perform such transformations for many years:
https://godbolt.org/z/7nnqbedfe
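The two rewrites can be modelled as a small rule over a constant string argument. This Python sketch only illustrates the rule and is not the SimplifyLibCalls code: `simplify_printf_s` is a hypothetical name, and the returned tuples are an ad hoc encoding of the resulting call.

```python
def simplify_printf_s(arg):
    """Model printf("%s", arg) for a constant string arg.

    Returns None when the call can be dropped, otherwise a tuple that
    describes the call to emit.
    """
    if arg == "":
        return None                      # printing nothing: noop
    if arg.endswith("\n"):
        # puts appends its own newline, so the trailing one is stripped.
        return ("puts", arg[:-1])
    return ("printf", "%s", arg)         # no simplification applies
```

The trailing newline is dropped from the `puts` argument because `puts` writes a newline after its string. The output therefore stays the same.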
Differential Revision: https://reviews.llvm.org/D100724
Nico Weber [Wed, 28 Apr 2021 14:21:31 +0000 (10:21 -0400)]
[clang] remove dead code after 2a1332245fc
Commit 2a1332245fc extracted this code to a new function checkSectionName() and
added a call to it, but didn't remove the original code. The original code
is dead since the checkSectionName() early return would fire when it would
trigger. (If it weren't dead, it'd make clang crash since
err_attribute_section_invalid_for_target now takes two args instead of just the
one that's passed.)
No behavior change.
Differential Revision: https://reviews.llvm.org/D101457
Krzysztof Parzyszek [Wed, 28 Apr 2021 13:57:20 +0000 (08:57 -0500)]
[Hexagon] Skip function in Hexagon vector combine if requested
Add a call to skipFunction().
David Sherwood [Fri, 5 Mar 2021 17:10:09 +0000 (17:10 +0000)]
[LoopVectorize][SVE] Fix crash when vectorising FP negation
This patch fixes a crash encountered when vectorising the following loop:
void foo(float *dst, float *src, long long n) {
  for (long long i = 0; i < n; i++)
    dst[i] = -src[i];
}
using scalable vectors. I've added a test to
Transforms/LoopVectorize/AArch64/sve-basic-vec.ll
as well as cleaned up the other tests in the same file.
Differential Revision: https://reviews.llvm.org/D98054
Arthur O'Dwyer [Wed, 28 Apr 2021 01:30:38 +0000 (21:30 -0400)]
[libc++] [test] Don't assume iterators are class types.
In particular, `span<int>::iterator` may be a raw pointer type
and thus have no nested typedef `iterator::value_type`. However,
we already know that the value_type we expect for `span<int>` is just `int`.
Fix up all other iterator_concept_conformance tests in the same way.
Differential Revision: https://reviews.llvm.org/D101420
David Goldman [Tue, 6 Apr 2021 17:31:09 +0000 (13:31 -0400)]
[clangd][ObjC] Improve support for class properties
Class properties are always implicit short-hands for the getter/setter
class methods.
We need to explicitly visit the interface decl `UIColor` in `UIColor.blueColor`,
otherwise we instead show the method decl even while hovering over
`UIColor` in the expression.
Differential Revision: https://reviews.llvm.org/D99975
Nico Weber [Wed, 28 Apr 2021 13:58:55 +0000 (09:58 -0400)]
[gn build] (port) 64bc44f5dd and f8de9aaef2f some more
Paul C. Anagnostopoulos [Mon, 26 Apr 2021 13:53:35 +0000 (09:53 -0400)]
[TableGen] Add the !find bang operator
!find searches a source string for a target string and returns the position.
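For readers more familiar with Python, the operation behaves much like `str.find`. In the sketch below, `bang_find` is a hypothetical stand-in. The optional start position and the -1 result for a missing target come from Python's `str.find` and are assumptions, since the message above does not state them.

```python
def bang_find(source, target, start=0):
    """Python analogue of searching `source` for `target` (assumed semantics)."""
    # str.find returns the index of the first match at or after `start`,
    # or -1 when there is no match.
    return source.find(target, start)
```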
Differential Revision: https://reviews.llvm.org/D101318
Tres Popp [Wed, 28 Apr 2021 13:46:07 +0000 (15:46 +0200)]
Silence unused variable warning
Alexey Bataev [Mon, 26 Apr 2021 15:02:06 +0000 (08:02 -0700)]
[SLP]Try to vectorize tiny trees with shuffled gathers.
If the first tree element is vectorized and the second is a gather, it
might still be profitable to vectorize it if the gather node contains
fewer scalars to vectorize than the original tree node. It might be
profitable to use shuffles.
Differential Revision: https://reviews.llvm.org/D101397
Roman Lebedev [Wed, 28 Apr 2021 13:19:30 +0000 (16:19 +0300)]
[NFC][InlineCost] Add tests for D101228
Utkarsh Saxena [Tue, 27 Apr 2021 18:36:05 +0000 (20:36 +0200)]
[clangd] Add SymbolID to LocatedSymbol.
This is useful for running in batch mode.
Getting the SymbolID via getSymbolInfo may give the SymbolID
of a symbol different from that located by LocateSymbolAt (they
have different semantics of choosing the symbol.)
Differential Revision: https://reviews.llvm.org/D101388
Anton Zabaznov [Thu, 22 Apr 2021 16:53:59 +0000 (19:53 +0300)]
[OpenCL] Introduce new method for validating OpenCL target
Language options are not available when a target is being created,
thus, a new method is introduced. Also, some refactoring is done,
such as removing OpenCL feature macros setting from TargetInfo.
Reviewed By: Anastasia
Differential Revision: https://reviews.llvm.org/D101087
Alexander Belyaev [Wed, 28 Apr 2021 12:57:50 +0000 (14:57 +0200)]
[mlir] Fix the postsubmit comments in https://reviews.llvm.org/D101445
Matt Arsenault [Sat, 17 Apr 2021 14:54:56 +0000 (10:54 -0400)]
GlobalISel: Relax verification of physical register copy types
This was picking a concrete size for a physical register, and
enforcing exact match on the virtual register's type size. Some
targets add multiple types to a register class, and some are smaller
than the full bit width. For example x86 adds f32 to 128-bit xmm
registers, and AMDGPU adds i16/f16 to 32-bit registers.
It might be better to represent these cases as a copy of the full
register and an extraction of the subpart, but a lot of code assumes
you can directly copy. This will help fix the current usage of the DAG
calling convention infrastructure which is incompatible with how
GlobalISel is now using it.
The API is somewhat cumbersome here, but I just mirrored the existing
functions, except now with LLTs (and allow returning null on failure,
unlike the MVT version). I think the concept of selecting register
classes based on type is flawed to begin with, but I'm trying to keep
this compatible with the existing handling.
David Sherwood [Wed, 10 Mar 2021 08:34:19 +0000 (08:34 +0000)]
[LoopVectorize] Simplify scalar cost calculation in getInstructionCost
This patch simplifies the calculation of certain costs in
getInstructionCost when isScalarAfterVectorization() returns a true value.
There are a few places where we multiply a cost by a number N, i.e.
unsigned N = isScalarAfterVectorization(I, VF) ? VF.getKnownMinValue() : 1;
return N * TTI.getArithmeticInstrCost(...
After some investigation it seems that there are only these cases that occur
in practice:
1. VF is a scalar, in which case N = 1.
2. VF is a vector. We can only get here if: a) the instruction is a
GEP/bitcast/PHI with scalar uses, or b) this is an update to an induction
variable that remains scalar.
I have changed the code so that N is assumed to always be 1. For GEPs
the cost is always 0, since this is calculated later on as part of the
load/store cost. PHI nodes are costed separately and were never previously
multiplied by VF. For all other cases I have added an assert that none of
the users needs scalarising, which didn't fire in any unit tests.
Only one test required fixing and I believe the original cost for the scalar
add instruction to have been wrong, since only one copy remains after
vectorisation.
I have also added a new test for the case when a pointer PHI feeds directly
into a store that will be scalarised as we were previously never testing it.
Differential Revision: https://reviews.llvm.org/D99718
Alexey Bataev [Mon, 29 Mar 2021 18:39:27 +0000 (11:39 -0700)]
[OPENMP]Fix PR49098: respect firstprivate of declare target variable.
Need to respect mapping/privatization of declare target variables in the
target regions if explicitly specified by the user.
Differential Revision: https://reviews.llvm.org/D99530
Alexander Belyaev [Wed, 28 Apr 2021 12:16:35 +0000 (14:16 +0200)]
[mlir] Add folding for tensor inputs and memref.cast in linalg.tiled_loop.
Tensor inputs, if not used in the body of TiledLoopOp, can be removed.
memref::CastOp can be folded into TiledLoopOp as well.
Differential Revision: https://reviews.llvm.org/D101445
Adrian Kuegel [Wed, 28 Apr 2021 10:49:57 +0000 (12:49 +0200)]
[MLIR] Add ComplexToStandard conversion pass.
So far, only a conversion for complex::AbsOp is done, but more will be added.
Differential Revision: https://reviews.llvm.org/D101442
Tres Popp [Wed, 28 Apr 2021 12:06:57 +0000 (14:06 +0200)]
Revert "tsan: refactor fork handling"
This reverts commit e1021dd1fdfebff77cfb205892ada6b6a900865f.
Sander de Smalen [Tue, 27 Apr 2021 12:18:01 +0000 (13:18 +0100)]
[LV] Calculate max feasible scalable VF.
This patch also refactors the way the feasible max VF is calculated,
although this is NFC for fixed-width vectors.
After this change scalable VF hints are no longer truncated/clamped
to a shorter scalable VF, nor does it drop the 'scalable flag' from
the suggested VF to vectorize with a similar VF that is fixed.
Instead, the hint is ignored which means the vectorizer is free
to find a more suitable VF, using the CostModel to determine the
best possible VF.
Reviewed By: c-rhodes, fhahn
Differential Revision: https://reviews.llvm.org/D98509
Alex Richardson [Wed, 28 Apr 2021 09:13:27 +0000 (10:13 +0100)]
[llvm-objdump] Fix dumping dynamic relative relocations for SHT_REL
Previously printing R_386_RELATIVE relocations would trigger
`error: can't read an entry at 0x40: it goes past the end of the section (0x40)`
I found this while writing a test case for LLD (D100490).
This also includes some minor cleanup in the elf-dynamic-relcos.test
llvm-objdump test based on the newly added test.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D100489
Alex Richardson [Wed, 28 Apr 2021 09:13:17 +0000 (10:13 +0100)]
[ELF] Update URL for MIPS TLS wiki page
The original page no longer works, so use a web.archive.org link instead.
Reviewed By: atanasyan
Differential Revision: https://reviews.llvm.org/D100949
Alex Richardson [Wed, 21 Apr 2021 11:20:06 +0000 (12:20 +0100)]
[builtins] Fix ABI-incompatibility with GCC for floating-point compare
While implementing support for the float128 routines on x86_64, I noticed
that __builtin_isinf() was returning true for 128-bit floating point
values that are not infinite when compiling with GCC and using the
compiler-rt implementation of the soft-float comparison functions.
After stepping through the assembly, I discovered that this was caused by
GCC assuming a sign-extended 64-bit -1 result, but our implementation
returns an enum (which then has zeroes in the upper bits) and therefore
causes the comparison with -1 to fail.
Fix this by using a CMP_RESULT typedef and add a static_assert that it
matches the GCC soft-float comparison return type when compiling with GCC
(GCC has a __libgcc_cmp_return__ mode that can be used for this purpose).
Also move the 3 copies of the same code to a shared .inc file.
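The mismatch can be reproduced with plain integer arithmetic. The Python sketch below models the register contents described above and is not the compiler-rt code: `as_register` is a hypothetical helper, and the 32-bit enum width is an assumption for illustration.

```python
MASK64 = (1 << 64) - 1


def as_register(value, bits, signed):
    """Model a `bits`-wide value widened into a 64-bit register."""
    low_mask = (1 << bits) - 1
    raw = value & low_mask
    if signed and raw >> (bits - 1):
        raw |= MASK64 & ~low_mask    # sign-extend into the upper bits
    return raw


enum_result = as_register(-1, 32, signed=False)  # 0x00000000FFFFFFFF
wide_result = as_register(-1, 64, signed=True)   # 0xFFFFFFFFFFFFFFFF
minus_one_64 = -1 & MASK64                       # what the caller compares with

enum_matches = enum_result == minus_one_64
wide_matches = wide_result == minus_one_64
```

Only `wide_matches` is true: the narrower result with zeroes in its upper bits never equals a 64-bit -1, which is the failed comparison the message describes.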
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D98205
Alex Richardson [Wed, 21 Apr 2021 11:19:08 +0000 (12:19 +0100)]
[update_(llc_)test_checks.py] Support pre-processing commands
This has been rather useful in our downstream CHERI target where we want
to run tests both with addrspace(0) and addrspace(200) pointers.
With this patch we can prefix the opt command with
`sed -e 's/addrspace(200)/addrspace(0)/g' -e 's/-A200-P200-G200//g'` to
test both cases using the same IR input.
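The effect of that sed command on the IR text can be mirrored in a few lines of Python. This is only an illustration of the two substitutions: `preprocess` is a hypothetical name, and the update scripts themselves run the real shell command.

```python
def preprocess(ir_text):
    """Apply the same two global substitutions as the sed command above."""
    # s/addrspace(200)/addrspace(0)/g
    text = ir_text.replace("addrspace(200)", "addrspace(0)")
    # s/-A200-P200-G200//g
    return text.replace("-A200-P200-G200", "")
```

Running the same input once unchanged and once through this rewrite gives the two address-space variants from a single IR file.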
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D95137
David Spickett [Wed, 28 Apr 2021 11:09:01 +0000 (12:09 +0100)]
[lldb] Correct format enum comment (NFC)
'.' is used for unprintable chars (see NON_PRINTABLE_CHAR).
Tres Popp [Wed, 28 Apr 2021 11:15:46 +0000 (13:15 +0200)]
Revert "[loop-idiom] Hoist loop memcpys to loop preheader"
This reverts commit 75d6b8bb4056d518d06b72e6411ce3749455e2e3.
The reasoning is mentioned in https://reviews.llvm.org/D97667
Roman Lebedev [Wed, 28 Apr 2021 11:02:28 +0000 (14:02 +0300)]
[NFC][SimplifyCFG] Move sink-common-code.ll into X86
There are post-commit notes for e4c61d5 that suggest
the test is failing on certain bots. It looks like
the code there isn't being moved, which suggests
cost-model involvement, which suggests that we need to
hardcode the target triple.
Hopefully this helps?
Roman Lebedev [Wed, 28 Apr 2021 10:58:22 +0000 (13:58 +0300)]
[NFC][Verifier] Split token1.ll into two, assert/non-assert versions
Lorenzo Chelini [Wed, 28 Apr 2021 10:50:21 +0000 (12:50 +0200)]
[mlir] Fix typos (NFC)
Kerry McLaughlin [Tue, 27 Apr 2021 13:50:27 +0000 (14:50 +0100)]
[LoopVectorize] Prevent multiple Phis being generated with in-order reductions
When using the -enable-strict-reductions flag where UF>1 we generate multiple
Phi nodes, though only one of these is used as an input to the vector.reduce.fadd
intrinsics. The unused Phi nodes are removed later by instcombine.
This patch changes widenPHIInstruction/fixReduction to only generate
one Phi, and adds an additional test for unrolling to strict-fadd.ll
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D100570
Nathan James [Wed, 28 Apr 2021 10:21:34 +0000 (11:21 +0100)]
[clang-query] Add check to prevent setting srcloc when no introspection is available.
Checks whether introspection support is available in the set-output-kind parser.
If it isn't present, auto-complete will not suggest `srcloc`, and an error query will be reported if a user tries to access it.
Reviewed By: steveire
Differential Revision: https://reviews.llvm.org/D101365
Jingu Kang [Tue, 27 Apr 2021 17:24:11 +0000 (18:24 +0100)]
[IRCE] Add tests for conservative bound check
Prevent cases in which the start value of IV is bigger than bound for
increasing.
Prevent cases in which the start value of IV is smaller than bound for
decreasing.
Differential Revision: https://reviews.llvm.org/D101174
Benjamin Kramer [Wed, 28 Apr 2021 09:54:00 +0000 (11:54 +0200)]
[ADT] Make TrackingStatistic's ctor constexpr
This lets clang diagnose unused statistics, so remove them.
Frederik Gossen [Wed, 28 Apr 2021 09:56:48 +0000 (11:56 +0200)]
[MLIR][Shape] Concretize broadcast result type if possible
As a canonicalization, infer the resulting shape rank if possible.
Differential Revision: https://reviews.llvm.org/D101377
Hans Wennborg [Wed, 28 Apr 2021 09:56:58 +0000 (11:56 +0200)]
Try to fix clang/test/Driver/cl-options.c on non-x86 hosts
The /QIntel-jcc-erratum flag only works when targeting x86,
so pass --target to the driver to do that also on non-x86 hosts.
Frederik Gossen [Wed, 28 Apr 2021 09:50:28 +0000 (11:50 +0200)]
[MLIR][Shape] Canonicalize casted extent tensor operands
Both `shape.broadcast` and `shape.cstr_broadcastable` accept dynamic and static
extent tensors. If their operands are cast, we can use the original value
instead.
Differential Revision: https://reviews.llvm.org/D101376
Qiu Chaofan [Wed, 28 Apr 2021 09:39:20 +0000 (17:39 +0800)]
[PowerPC] Fix SELECT_CC with i64 operand on PPC32
This patch fixes an infinite loop in the legalization of PPC32 SELECT_CC
with a 64-bit operand.
Stephen Tozer [Wed, 28 Apr 2021 09:32:08 +0000 (10:32 +0100)]
[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands
This patch fixes a crash in LiveDebugVariables for inputs where a
DBG_VALUE_LIST had 64 or more debug operands. The crash came from an
assert that was added under the assumption that only bad CodeGen would
hit such a limit. Relatively simple source files that produce these
very long debug values have since been found, so the assert has been
replaced with a condition that drops the debug value when the limit is
reached.
Differential Revision: https://reviews.llvm.org/D101373
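The change from an assert to a dropping condition can be sketched as follows. This is a toy model only: `MAX_DEBUG_OPERANDS` and `filter_debug_values` are hypothetical names, each debug value is modelled as a list of operands, and the threshold of 64 is taken from the message above rather than from the LLVM sources.
```python
# Assumption: "64 or more" operands is the point at which a value is dropped.
MAX_DEBUG_OPERANDS = 64


def filter_debug_values(debug_values):
    """Keep debug values below the operand limit and drop the rest.

    Before the change, an oversized value would have tripped an assert;
    here it is skipped instead of aborting.
    """
    kept = []
    for operands in debug_values:
        if len(operands) >= MAX_DEBUG_OPERANDS:
            continue  # drop the debug value rather than crash
        kept.append(operands)
    return kept
```
Dropping loses the location information for that one variable but lets compilation proceed for inputs that legitimately produce very long debug values.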
Hans Wennborg [Wed, 28 Apr 2021 09:09:50 +0000 (11:09 +0200)]
[clang-cl] Map /QIntel-jcc-erratum to -mbranches-within-32B-boundaries
Frederik Gossen [Wed, 28 Apr 2021 08:47:32 +0000 (10:47 +0200)]
[MLIR][Shape] Derive more concrete type for `shape.shape_of`
Also create all extent tensor constants with const_shape op.
Differential Revision: https://reviews.llvm.org/D99197
Joe Ellis [Mon, 26 Apr 2021 09:42:18 +0000 (09:42 +0000)]
[AArch64] Add missing UINT_TO_FP promotions for v16i8
Differential Revision: https://reviews.llvm.org/D101042
Wang, Pengfei [Wed, 28 Apr 2021 07:56:17 +0000 (15:56 +0800)]
[X86][AMX][NFC] Add more comments and remove unnecessary check found by Klocwork
Hans Wennborg [Wed, 28 Apr 2021 07:56:04 +0000 (09:56 +0200)]
Require asserts for llvm/test/Verifier/token1.ll
The test expects an assert, and that only works in asserts-enabled builds.
Diana Picus [Thu, 22 Apr 2021 09:02:40 +0000 (09:02 +0000)]
[flang] Remove interfaces for Character[Min|Max][Val|Loc]. NFC
MAXVAL, MINVAL, MAXLOC and MINLOC are already implemented in extrema.cpp
as MaxvalCharacter, MinvalDim etc. Therefore, the interfaces in
character.h are redundant and should be removed to avoid confusion.
Differential Revision: https://reviews.llvm.org/D101354
Hsiangkai Wang [Wed, 28 Apr 2021 07:53:17 +0000 (15:53 +0800)]
[RISCV] Remove riscv32 test cases for vector intrinsics.
Tobias Gysi [Wed, 28 Apr 2021 07:38:36 +0000 (07:38 +0000)]
[mlir][Python][Linalg] Fixing typos (NFC).
Petr Hosek [Wed, 28 Apr 2021 06:30:53 +0000 (23:30 -0700)]
[libcxx] Fix the libc++abi header path
This addresses an issue introduced in 775e55462a64.
Petr Hosek [Wed, 28 Apr 2021 05:31:36 +0000 (22:31 -0700)]
[Driver] Use normalized triples for per-target runtimes
This is a partial revert of b4537c3f51bc6c011ddd9c10b80043ac4ce16a01
based on the discussion in https://reviews.llvm.org/D101194. Rather
than using getMultiarchTriple, we use getTripleString.
Ranjith Kumar H [Wed, 28 Apr 2021 03:59:11 +0000 (03:59 +0000)]
[MLIR] Add and propagate section attribute for LLVM_GlobalOp
Add a section attribute to LLVM_GlobalOp. During module translation, the attribute value is propagated to LLVM.
Reviewed By: sgrechanik, ftynse, mehdi_amini
Differential Revision: https://reviews.llvm.org/D100947
RamNalamothu [Wed, 28 Apr 2021 00:27:29 +0000 (05:57 +0530)]
[NFC] Refactor how CFI section types are represented in AsmPrinter
In terms of readability, the `enum CFIMoveType` did not clearly document what it
is intended to convey, i.e. the type of CFI section that gets emitted.
Reviewed By: dblaikie, MaskRay
Differential Revision: https://reviews.llvm.org/D76519
Jennifer Chukwu [Wed, 28 Apr 2021 03:24:03 +0000 (08:54 +0530)]
Fixed Typos
Fixed typos in the release notes of Polly 13.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D100588
Jordan Rupprecht [Wed, 28 Apr 2021 03:06:56 +0000 (20:06 -0700)]
[lldb] Fix DataLayout reference after 0f1137ba79c0
Nico Weber [Mon, 19 Apr 2021 17:39:20 +0000 (13:39 -0400)]
[clang/Basic] Make TargetInfo.h not use DataLayout again
Reverts parts of https://reviews.llvm.org/D17183, but keeps the
resetDataLayout() API and adds an assert that checks that datalayout string and
user label prefix are in sync.
Approach 1 in https://reviews.llvm.org/D17183#2653279
Reduces the number of TUs built for 'clang-format' from 689 to 575.
I also implemented approach 2 in D100764. If someone feels motivated
to make us use DataLayout more, it's easy to revert this change here
and go with D100764 instead. I don't plan on doing more work in this
area though, so I prefer going with the smaller, more self-consistent change.
Differential Revision: https://reviews.llvm.org/D100776
Nico Weber [Wed, 28 Apr 2021 02:25:55 +0000 (22:25 -0400)]
[gn build] (manually) port 82d3c0759fa0
Mike Urbach [Sat, 24 Apr 2021 02:27:43 +0000 (20:27 -0600)]
[mlir] Support setting operand values in C and Python APIs.
This adds `mlirOperationSetOperand` to the IR C API, similar to the
function to get an operand.
In the Python API, this adds `operands[index] = value` syntax, similar
to the syntax to get an operand with `operands[index]`.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D101398
Mike Urbach [Thu, 22 Apr 2021 06:07:30 +0000 (00:07 -0600)]
[MLIR][Python] Add capsule methods for pybind11 to PyValue.
Add the `getCapsule()` and `createFromCapsule()` methods to the
PyValue class, as well as the necessary interoperability.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D101090