Florian Hahn [Mon, 22 Feb 2021 19:44:47 +0000 (19:44 +0000)]
[LV] Allow tryToCreateWidenRecipe to return a VPValue, use for blends.
Generalize the return value of tryToCreateWidenRecipe to return either a
newly created recipe or an existing VPValue. Use this to avoid creating
unnecessary VPBlendRecipes.
Fixes PR44800.
Nicolai Hähnle [Mon, 3 Aug 2020 13:03:18 +0000 (15:03 +0200)]
[AMDGPU][SelectionDAG] Don't combine uniform multiplies to MUL_[UI]24
Prefer to keep uniform (non-divergent) multiplies on the scalar ALU when
possible. This significantly improves some game cases by eliminating
v_readfirstlane instructions when the result feeds into a scalar
operation, like the address calculation for a scalar load or store.
Since isDivergent is only an approximation of whether a value is in
SGPRs, it can potentially regress some situations where a uniform value
ends up in a VGPR. These should be rare in real code, although the test
changes do contain a number of examples.
Most of the test changes are just using s_mul instead of v_mul/mad which
is generally better for both register pressure and latency (at least on
GFX10 where sgpr pressure doesn't affect occupancy and vector ALU
instructions have significantly longer latency than scalar ALU). Some
R600 tests now use MULLO_INT instead of MUL_UINT24.
GlobalISel appears to handle more scenarios in the desirable way,
although it can also be thrown off and fails to select the 24-bit
multiplies in some cases.
An alternative solution, considered and rejected, was to allow selecting
MUL_[UI]24 to S_MUL_I32. I've rejected this because those SD operations
are defined as don't-care on the most significant 8 bits, and this fact
is used in some combines via SimplifyDemandedBits.
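To illustrate why the two operations are not interchangeable, here is a
minimal sketch, not the backend code. It models "don't-care on the most
significant 8 bits" by masking them off; mulU24, mul32 and formsAgree are
hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 24-bit unsigned multiply: the top 8 bits of each operand are
// ignored (masked off here) and the product is truncated to 32 bits.
inline uint32_t mulU24(uint32_t a, uint32_t b) {
  const uint32_t Mask24 = 0x00FFFFFFu;
  return (a & Mask24) * (b & Mask24);
}

// A plain 32-bit multiply, which does look at the top 8 bits.
inline uint32_t mul32(uint32_t a, uint32_t b) { return a * b; }

// True when both forms produce the same 32-bit result for these operands.
inline bool formsAgree(uint32_t a, uint32_t b) {
  return mulU24(a, b) == mul32(a, b);
}

inline void mul24Example() {
  // Operands that fit in 24 bits: both forms agree.
  assert(formsAgree(1000u, 2000u));
  // Bit 24 is set in the first operand: the 24-bit form ignores it,
  // the 32-bit form does not.
  assert(mulU24(0x01000002u, 3u) == 6u);
  assert(mul32(0x01000002u, 3u) == 0x03000006u);
}
```

If a combine has already dropped the computation of the top 8 bits because
the 24-bit form does not demand them, a full 32-bit multiply would pick up
whatever is left there.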
Based on a patch by Nicolai Hähnle.
Differential Revision: https://reviews.llvm.org/D97063
Juneyoung Lee [Tue, 23 Feb 2021 14:35:12 +0000 (23:35 +0900)]
[JumpThreading] Update computeValueKnownInPredecessors to recognize logical and/or patterns
This allows JumpThreading's computeValueKnownInPredecessors to
recognize select form of and/or patterns as well.
Jay Foad [Tue, 23 Feb 2021 14:42:50 +0000 (14:42 +0000)]
[AMDGPU] Rename a prefix for sanity. NFC.
Nate Chandler [Mon, 22 Feb 2021 23:04:51 +0000 (15:04 -0800)]
Add @llvm.coro.async.size.replace intrinsic.
The new intrinsic replaces the size in one specified AsyncFunctionPointer with
the size in another. This ability is necessary for functions which merely
forward to async functions such as those defined for partial applications.
Reviewed By: aschwaighofer
Differential Revision: https://reviews.llvm.org/D97229
Jessica Clarke [Tue, 23 Feb 2021 14:17:15 +0000 (14:17 +0000)]
[Driver][NFC] Add explicit break to final case
Martin Storsjö [Mon, 2 Nov 2020 06:13:26 +0000 (08:13 +0200)]
[libcxx] [test] Define _CRT_STDIO_ISO_WIDE_SPECIFIERS while building tests
This matches how libc++ itself is built. This avoids errors due to
mismatch if linking libc++ statically.
Differential Revision: https://reviews.llvm.org/D97169
Nathan James [Tue, 23 Feb 2021 13:48:06 +0000 (13:48 +0000)]
[clang-tidy] Remove IncludeInserter from MoveConstructorInit check.
This check registers an IncludeInserter, however the check itself doesn't actually emit any fixes or includes, so the inserter is redundant.
From what I can tell the fixes were removed in D26453(rL290051) but the inserter was left in, probably an oversight.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D97243
Joe Ellis [Fri, 19 Feb 2021 17:09:50 +0000 (17:09 +0000)]
[clang][SVE] Don't warn on vector to sizeless builtin implicit conversion
This commit prevents warnings from -Wconversion when a clang vector type
is implicitly converted to a sizeless builtin type -- for example, when
implicitly converting a fixed-predicate to a scalable predicate.
The code below:
    #include <arm_sve.h>

    #define N __ARM_FEATURE_SVE_BITS
    #define FIXED_ATTR __attribute__((arm_sve_vector_bits (N)))
    typedef svbool_t fixed_svbool_t FIXED_ATTR;

    inline fixed_svbool_t foo(fixed_svbool_t p) {
      return svnot_z(svptrue_b64(), p);
    }
would previously raise this warning:
warning: implicit conversion turns vector to scalar: \
'fixed_svbool_t' (vector of 8 'unsigned char' values) to 'svbool_t' \
(aka '__SVBool_t') [-Wconversion]
Note that many cases of these implicit conversions were already
permitted because many functions inside arm_sve.h are spawned via
preprocessor macros, and the call to isInSystemMacro would cover us in
this case. This commit fixes the remaining cases.
Differential Revision: https://reviews.llvm.org/D97053
Balázs Kéri [Mon, 22 Feb 2021 16:16:51 +0000 (17:16 +0100)]
[clang-tidy] Extending bugprone-signal-handler with POSIX functions.
An option is added to the check to select which set of functions is
defined as asynchronous-safe functions.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D90851
Michał Górny [Tue, 23 Feb 2021 12:43:40 +0000 (13:43 +0100)]
[lldb] [test] Un-XFAIL TestBuiltinTrap on FreeBSD/aarch64
Michał Górny [Tue, 23 Feb 2021 12:34:04 +0000 (13:34 +0100)]
[lldb] [test] Un-XFAIL a test that no longer fails on FreeBSD
Simon Pilgrim [Tue, 23 Feb 2021 13:31:26 +0000 (13:31 +0000)]
[X86] Cleanup overflow test check prefixes. NFCI.
Tidy up the check prefixes to improve reuse.
Jay Foad [Fri, 19 Feb 2021 15:04:03 +0000 (15:04 +0000)]
[AMDGPU] Use divergent addresses for vector loads
Change some test cases to use divergent addresses for vector loads,
which should be the common case in real world code. Using uniform
addresses causes poor instruction selection for the surrounding
code which has to be fixed up post-register-allocation, and this causes
a lot of testsuite churn for a forthcoming patch to stop selecting
24-bit vector multiply instructions for uniform multiplies.
This shows up some problems in the idot tests where we fail to select
v_dot instructions because the patterns only match MUL_[UI]24 ISD nodes,
but the DAG contains i16 mul nodes instead.
Differential Revision: https://reviews.llvm.org/D97062
Sjoerd Meijer [Tue, 23 Feb 2021 12:58:03 +0000 (12:58 +0000)]
[ARM] do not consider sp as deprecated for ldm/stm
Early versions of the ARMv7 reference manuals considered the sp register
as a deprecated register for the ldm/stm family of instructions. However,
later versions such as ARM DDI 0406C.d added a note to the Appendix:
D9.3 Use of the SP as a general-purpose register
Most ARM instructions, unlike Thumb instructions, provide exactly the
same access to the SP as to R0-R12. This means that it is possible to
use the SP as a general-purpose register. Earlier issues of this manual
deprecated the use of SP in an ARM instruction, in any way that is
deprecated, not permitted, or not possible in the corresponding
Thumb instruction. However, user feedback indicates a number of cases
where these instructions are useful. Therefore, ARM no longer deprecates
these instruction uses.
Also Armv8 manuals no longer consider SP as deprecated register for ldm/
stm A32 instructions.
Furthermore, GNU as also does not print a deprecated warning when using
SP with those instructions.
Drop deprecation warning for pop/ldm/push/stm instructions.
Patch by: Stefan Agner.
Differential Revision: https://reviews.llvm.org/D82692
David Green [Tue, 23 Feb 2021 13:04:59 +0000 (13:04 +0000)]
[TTI] Change getOperandsScalarizationOverhead to take Type args
As a followup to D95291, getOperandsScalarizationOverhead was still
using a VF as a vector factor if the arguments were scalar, and would
assert on certain matrix intrinsics with differently sized vector
arguments. This patch removes the VF arg, instead passing the Types
through directly. This should allow it to more accurately compute the
cost without having to guess at which operands will be vectorized,
something difficult with more complex intrinsics.
This adjusts one SVE test as it is now calling the wrong intrinsic vs
veccall. Without invalid InstructCosts the cost of the scalarized
intrinsic is too low. This should get fixed when the cost of
scalarization is accounted for with scalable types.
Differential Revision: https://reviews.llvm.org/D96287
David Green [Tue, 23 Feb 2021 13:03:26 +0000 (13:03 +0000)]
[CostModel] Remove VF from IntrinsicCostAttributes
getIntrinsicInstrCost takes an IntrinsicCostAttributes holding various
parameters of the intrinsic being costed. It can either be called with a
scalar intrinsic (RetTy==Scalar, VF==1), with a vector instruction
(RetTy==Vector, VF==1) or from the vectorizer with a scalar type and
vector width (RetTy==Scalar, VF>1). A RetTy==Vector, VF>1 is considered
an error. Both of the vector modes are expected to be treated the same,
but because this is confusing many backends end up getting it wrong.
Instead of trying to work with those two values separately this removes the
VF parameter, widening the RetTy/ArgTys by VF when called from the
vectorizer. This keeps things simpler, but does require some other
modifications to keep things consistent.
Most backends look like this will be an improvement (or were not using
getIntrinsicInstrCost). AMDGPU needed the most changes to keep the code
from c230965ccf36af5c88c working. ARM removed the fix in
dfac521da1b90db683, WebAssembly happens to get a fixup for an SLP cost
issue and both X86 and AArch64 seem to now be using better costs from
the vectorizer.
Differential Revision: https://reviews.llvm.org/D95291
Nathan James [Tue, 23 Feb 2021 13:01:16 +0000 (13:01 +0000)]
[clang-tidy] Update checks list.
Timm Bäder [Tue, 23 Feb 2021 12:20:28 +0000 (13:20 +0100)]
[clang][parse][NFC] Remove dead ProhibitAttributes() call
GNU-style attributes in enum bodies are allowed (and used by several
tests), and this call to ProhibitAttributes() was dead code.
Differential Revision: https://reviews.llvm.org/D97271
Florian Schmaus [Tue, 23 Feb 2021 12:38:11 +0000 (12:38 +0000)]
[clang-tidy] Install run-clang-tidy.py in bin/ as run-clang-tidy
The run-clang-tidy.py helper script is supposed to be used by the
user, hence it should be placed in the user's PATH. Some
distributions, like Gentoo [1], won't have it in PATH unless it is
installed in bin/.
Furthermore, installed scripts in PATH usually do not carry a filename
extension, since there is no need to know that this is a Python
script. For example Debian and Ubuntu already install this script as
'run-clang-tidy' [2] and hence build systems like Meson also look for
this name first [3]. Hence we install run-clang-tidy.py as
run-clang-tidy, as suggested by Sylvestre Ledru [4].
1: https://bugs.gentoo.org/753380
2: https://salsa.debian.org/pkg-llvm-team/llvm-toolchain/-/blob/60aefb14171ab5c3867a0081844b507fc9f6e015/debian/clang-tidy-X.Y.links.in#L2
3: https://github.com/mesonbuild/meson/blob/b6dc4d5e5c6e838de0b52e62d982ba2547eb366d/mesonbuild/scripts/clangtidy.py#L44
4: https://reviews.llvm.org/D90972#2380640
Reviewed By: sylvestre.ledru, JonasToth
Differential Revision: https://reviews.llvm.org/D90972
Matteo Favaro [Tue, 23 Feb 2021 10:22:53 +0000 (10:22 +0000)]
[DSE] Allow ptrs defined in the entry block in IsGuaranteedLoopInvariant.
The **IsGuaranteedLoopInvariant** function is making sure to check if the
incoming pointer is guaranteed to be loop invariant, therefore I think
the case where the pointer is defined in the entry block of a function
automatically guarantees the pointer to be loop invariant, as the entry
block of a function cannot have predecessors or be part of a loop.
I implemented this small patch and tested it using
**ninja check-llvm-unit** and **ninja check-llvm**. I added a contained test
file that shows the problem and used **opt -O3 -debug** on it to make sure
the case is not currently handled (in fact the debug log is showing that
the DSE pass is bailing out when testing if the killer store is able to
clobber the dead store).
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D96979
Hsiangkai Wang [Tue, 23 Feb 2021 05:49:18 +0000 (13:49 +0800)]
[RISCV] vle1.v/vse1.v should be unmasked instructions.
vle1.v/vse1.v should be unmasked instructions. The vm encoding is 1 for
unmasked instructions.
Differential Revision: https://reviews.llvm.org/D97237
Anastasia Stulova [Tue, 23 Feb 2021 11:44:13 +0000 (11:44 +0000)]
[OpenCL][Docs] Change description for the OpenCL standard headers.
After updating the user interface in D96515, update the docs
reflecting the new approach.
Tags: #clang
Differential Revision: https://reviews.llvm.org/D96616
Nicolas Vasilache [Tue, 23 Feb 2021 08:52:55 +0000 (08:52 +0000)]
[mlir][Linalg] Retire hoistViewAllocOps.
This transformation was only used for quick experimentation and is not general enough.
Retire it.
Differential Revision: https://reviews.llvm.org/D97266
Simon Pilgrim [Tue, 23 Feb 2021 11:41:51 +0000 (11:41 +0000)]
Fix Wdocumentation parameter warning. NFCI.
Nicolas Vasilache [Tue, 23 Feb 2021 11:01:05 +0000 (11:01 +0000)]
[mlir] NFC - Use declarative assembly for scf::YieldOp
Raphael Isemann [Tue, 23 Feb 2021 11:10:39 +0000 (12:10 +0100)]
[lldb][NFC] Remove unused ValueObject::LogValueObject functions
Those functions aren't called anywhere. For debugging purposes we usually
have Dump() methods (which already exist in some semi-functional form in
ValueObject).
Alexey Lapshin [Mon, 8 Feb 2021 15:11:39 +0000 (18:11 +0300)]
[Support] Add reserve() method to the raw_ostream.
If the resulting size of the output stream is already known,
then the space for the stream data can be allocated in advance
in some cases. For example, raw_string_ostream can
preallocate the space for the target string, which
avoids reallocations while writing into the stream.
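The effect can be sketched with a plain std::string standing in for the
target string. This is an illustration only and does not use raw_ostream;
countGrowths is a hypothetical helper that counts how often the buffer
capacity changes while appending.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends `total` characters one at a time and counts how often the string's
// capacity changes during the appends, i.e. how often the buffer had to grow.
inline std::size_t countGrowths(std::size_t total, bool reserveFirst) {
  std::string buffer;
  if (reserveFirst)
    buffer.reserve(total); // final size is known up front
  std::size_t growths = 0;
  std::size_t lastCapacity = buffer.capacity();
  for (std::size_t i = 0; i < total; ++i) {
    buffer.push_back('x');
    if (buffer.capacity() != lastCapacity) {
      ++growths;
      lastCapacity = buffer.capacity();
    }
  }
  return growths;
}

inline void reserveExample() {
  // With the space reserved up front, appending never reallocates.
  assert(countGrowths(4096, /*reserveFirst=*/true) == 0);
  // Without it, the buffer has to grow at least once.
  assert(countGrowths(4096, /*reserveFirst=*/false) > 0);
}
```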
Differential Revision: https://reviews.llvm.org/D91693
Raphael Isemann [Tue, 23 Feb 2021 11:01:29 +0000 (12:01 +0100)]
[lldb][NFC] Clean up ValueObject comments
* Remove commented out code.
* Doxygenify comments that serve as documentation.
* Use the LLVM comment style where possible.
David Green [Tue, 23 Feb 2021 10:53:22 +0000 (10:53 +0000)]
[ARM] Add pre/post inc tests of various sizes. NFC
Andy Wingo [Tue, 23 Feb 2021 10:23:31 +0000 (11:23 +0100)]
Revert "[WebAssembly] call_indirect issues table number relocs"
This reverts commit 861dbe1a021e6439af837b72b219fb9c449a57ae. It broke
emscripten -- see https://reviews.llvm.org/D90948#2578843.
Fraser Cormack [Thu, 18 Feb 2021 16:48:49 +0000 (16:48 +0000)]
[RISCV] Support insertion of misaligned subvectors
This patch extends the support for RVV INSERT_SUBVECTOR to cover those
which don't align to a vector register boundary. Like the support for
EXTRACT_SUBVECTOR in D96959, it accomplishes this by extracting the
nearest register-sized subvector (a subregister operation), then sliding
the vector down with VSLIDEDOWN, inserting the subvector to the first
position, and sliding the vector back up again afterwards.
Unlike subvector extraction, for vectors that occupy less than a full
vector register we must preserve the untouched elements. We do this by
lowering to an LMUL=1 INSERT_SUBVECTOR using the above method and
lowering that to a VSLIDEUP with a zero offset. This uses a
tail-undisturbed policy and so has the effect of "sliding in" the
subvector elements while preserving the surrounding ones.
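The slide-down / insert / slide-up sequence can be modelled on plain
arrays. This is a sketch of the idea only, not the lowering code: slideDown,
slideUp and insertSubvector are hypothetical helpers, vectors are
std::vector<int>, and elements slid in from past the end are zero in this
model.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// out[i] = v[i + offset]; positions with no source element are zero here.
inline Vec slideDown(const Vec &v, std::size_t offset) {
  Vec out(v.size(), 0);
  for (std::size_t i = 0; i + offset < v.size(); ++i)
    out[i] = v[i + offset];
  return out;
}

// out[i] = src[i - offset] for i >= offset; elements below the offset keep
// the values of `dest`. Both vectors are assumed to have the same length.
inline Vec slideUp(const Vec &dest, const Vec &src, std::size_t offset) {
  Vec out = dest;
  for (std::size_t i = offset; i < out.size(); ++i)
    out[i] = src[i - offset];
  return out;
}

// Inserts `sub` into `vec` at element index `idx` using only slides and an
// insert at position 0.
inline Vec insertSubvector(const Vec &vec, const Vec &sub, std::size_t idx) {
  assert(idx + sub.size() <= vec.size());
  Vec tmp = slideDown(vec, idx);         // element idx is now at position 0
  for (std::size_t i = 0; i < sub.size(); ++i)
    tmp[i] = sub[i];                     // insert at the first position
  return slideUp(vec, tmp, idx);         // slide back, keep elements below idx
}
```

For example, inserting {9, 9} at index 3 of {0, 1, 2, 3, 4, 5, 6, 7} yields
{0, 1, 2, 9, 9, 5, 6, 7} in this model.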
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D96972
Frederik Gossen [Tue, 23 Feb 2021 10:17:40 +0000 (11:17 +0100)]
Fix unused variable
Sven van Haastregt [Tue, 23 Feb 2021 10:18:14 +0000 (10:18 +0000)]
[OpenCL] Move remaining defines to opencl-c-base.h
Move any remaining preprocessor defines from `opencl-c.h` to
`opencl-c-base.h`, such that they are shared with
`-fdeclare-opencl-builtins` too.
In particular, move:
- the `as_type` and `as_typen` definitions, and
- the `kernel_exec` and `__kernel_exec` definitions.
Also clang-format the changes.
Differential Revision: https://reviews.llvm.org/D96948
Raphael Isemann [Tue, 23 Feb 2021 09:38:48 +0000 (10:38 +0100)]
[lldb][NFC] Give CompilerType's IsArrayType/IsVectorType/IsBlockPointerType out-parameters default values
We already do this for most functions that have out-parameters, so let's do
the same here and avoid all the `nullptr, nullptr, nullptr` in every call.
Martin Liska [Tue, 23 Feb 2021 09:11:07 +0000 (10:11 +0100)]
Fix UBSAN in __ubsan::Value::getSIntValue
/home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77:25: runtime error: left shift of 0x0000000000000000fffffffffffffffb by 96 places cannot be represented in type '__int128'
#0 0x7ffff754edfe in __ubsan::Value::getSIntValue() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.cpp:77
#1 0x7ffff7548719 in __ubsan::Value::isNegative() const /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_value.h:190
#2 0x7ffff7542a34 in handleShiftOutOfBoundsImpl /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:338
#3 0x7ffff75431b7 in __ubsan_handle_shift_out_of_bounds /home/marxin/Programming/gcc2/libsanitizer/ubsan/ubsan_handlers.cpp:370
#4 0x40067f in main (/home/marxin/Programming/testcases/a.out+0x40067f)
#5 0x7ffff72c8b24 in __libc_start_main (/lib64/libc.so.6+0x27b24)
#6 0x4005bd in _start (/home/marxin/Programming/testcases/a.out+0x4005bd)
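The report above is about left-shifting a negative signed value. A common
way to avoid that class of error, sketched here on 64-bit integers, is to
do the left shift on the unsigned representation and only then convert
back. This is an assumption about the general technique, not a copy of the
patch; signExtend is a hypothetical helper. Before C++20 the conversion to
signed and the right shift of a negative value are implementation-defined
rather than undefined; mainstream compilers use two's complement and an
arithmetic shift.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extends the low `bits` bits of `raw` to a 64-bit signed value.
inline int64_t signExtend(uint64_t raw, unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  const unsigned extra = 64 - bits;
  // Shift left in the unsigned domain, where overflow is well defined ...
  const uint64_t shifted = raw << extra;
  // ... then convert and shift right to replicate the sign bit.
  return static_cast<int64_t>(shifted) >> extra;
}
```

With this helper, signExtend(0xFFFFFFFB, 32) gives -5 without ever
left-shifting a negative number.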
Differential Revision: https://reviews.llvm.org/D97263
Luís Marques [Tue, 23 Feb 2021 09:23:37 +0000 (09:23 +0000)]
[Sanitizer][NFC] Fix typo
Raphael Isemann [Tue, 23 Feb 2021 09:14:43 +0000 (10:14 +0100)]
[lldb][NFC] Don't inherit from UserID in ValueObject
ValueObject inherits from UserID which is just a bad idea:
* The inheritance gives ValueObject some member functions that are at best
misleading (such as `Clear()` which doesn't clear any value besides `id`).
* It allows passing ValueObject to the overloaded operators for UserID (such as
`==` or `<<` which won't actually compare or print anything in the ValueObject).
* It exposes the `SetID` and `Clear` which both allow users to change the
internal id value.
Similar to D91699 which did the same for Process
Reviewed By: #lldb, JDevlieghere
Differential Revision: https://reviews.llvm.org/D97205
Liu, Chen3 [Tue, 23 Feb 2021 05:53:47 +0000 (13:53 +0800)]
[X86] Support amx-int8 intrinsic.
Adding support for intrinsics of TDPBSUD/TDPBUSD/TDPBUUD.
Differential Revision: https://reviews.llvm.org/D97259
River Riddle [Tue, 23 Feb 2021 08:51:57 +0000 (00:51 -0800)]
[mlir] Add support for DebugCounters using the new DebugAction infrastructure
DebugCounters allow for selectively enabling the execution of a debug action based upon a "counter". This counter is comprised of two components that are used in the control of execution of an action, a "skip" value and a "count" value. The "skip" value is used to skip a certain number of initial executions of a debug action. The "count" value is used to prevent a debug action from executing after it has executed for a set number of times (not including any executions that have been skipped). For example, a counter for a debug action with `skip=47` and `count=2`, would skip the first 47 executions, then execute twice, and finally prevent any further executions.
This is effectively the same as the DebugCounter infrastructure in LLVM, but using the DebugAction infrastructure in MLIR. We can't simply reuse the DebugCounter support already present in LLVM due to its heavy reliance on global constructors (which are not allowed in MLIR). The DebugAction infrastructure already nicely supports the debug counter use case, and promotes the separation of policy and mechanism design philosophy.
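The skip/count behaviour described above can be modelled in a few lines.
This is a standalone sketch, not the MLIR API; Counter, shouldExecute and
executedAttempts are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal model of a counter with a "skip" and a "count" component.
struct Counter {
  int64_t skip;  // executions still to be skipped
  int64_t count; // executions still allowed once skipping is done

  // Returns true if the action should execute on this attempt.
  bool shouldExecute() {
    if (skip > 0) {
      --skip;
      return false;
    }
    if (count > 0) {
      --count;
      return true;
    }
    return false; // count exhausted: prevent any further executions
  }
};

// Returns the zero-based indices of the attempts that were allowed to run.
inline std::vector<int> executedAttempts(int64_t skip, int64_t count,
                                         int attempts) {
  Counter counter{skip, count};
  std::vector<int> executed;
  for (int i = 0; i < attempts; ++i)
    if (counter.shouldExecute())
      executed.push_back(i);
  return executed;
}
```

With `skip=47` and `count=2`, only attempts 47 and 48 (zero-based) run in
this model, matching the example in the description.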
Differential Revision: https://reviews.llvm.org/D96395
River Riddle [Tue, 23 Feb 2021 08:51:49 +0000 (00:51 -0800)]
[mlir] Add a new debug action framework.
This revision adds the infrastructure for `Debug Actions`. This is a DEBUG only
API that allows for external entities to control various aspects of compiler
execution. This is conceptually similar to something like DebugCounters in LLVM, but at a lower level. This framework doesn't make any assumptions about how the higher level driver is controlling the execution, it merely provides a framework for connecting the two together. This means that on top of DebugCounter functionality, we could also provide more interesting drivers such as interactive execution. A high level overview of the workflow surrounding debug actions is
shown below:
* Compiler developer defines an `action` that is taken by a pass,
transformation, or utility that they are developing.
* Depending on the needs, the developer dispatches various queries, pertaining
to this action, to an `action manager` that will provide an answer as to
what behavior the action should do.
* An external entity registers an `action handler` with the action manager,
and provides the logic to resolve queries on actions.
The exact definition of an `external entity` is left opaque, to allow for more
interesting handlers.
This framework was proposed here: https://llvm.discourse.group/t/rfc-debug-actions-in-mlir-debug-counters-for-the-modern-world
Differential Revision: https://reviews.llvm.org/D84986
Kadir Cetinkaya [Fri, 19 Feb 2021 12:14:55 +0000 (13:14 +0100)]
[clang][DeclPrinter] Pass Context into StmtPrinter whenever possible
ASTContext was only passed to the StmtPrinter in some places, while it
is always available in DeclPrinter. The context is used by StmtPrinter to better
print statements in some cases, like printing constants as written.
Differential Revision: https://reviews.llvm.org/D97043
Raphael Isemann [Tue, 23 Feb 2021 08:38:34 +0000 (09:38 +0100)]
[lldb][NFC] Cleanup ValueObject construction code
Just code cleanup for ValueObject constructors:
* Use default member initializers where possible.
* Doxygenify the comments for members and constructors where needed.
* Delete the default constructor which isn't defined.
* Initialize the bitfields via a utility struct instead of doing this in the
different constructors.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D97199
Craig Topper [Tue, 23 Feb 2021 08:26:56 +0000 (00:26 -0800)]
[RISCV] Add test case for missed opportunity to use bgez for the canonical form X > -1. NFC
Juneyoung Lee [Tue, 23 Feb 2021 08:32:28 +0000 (17:32 +0900)]
[SimplifyCFG] Minor tweaks to the added tests (NFC)
Juneyoung Lee [Tue, 23 Feb 2021 08:15:17 +0000 (17:15 +0900)]
[SimplifyCFG] Add tests for D97244 (NFC)
Jean Perier [Tue, 23 Feb 2021 08:00:48 +0000 (09:00 +0100)]
[flang][NFC] Add source line to lowering TODO messages
- Add a fatal error handler that can print a message with source location
before aborting.
- Update TODO macro to take an mlir location argument and to use the
newly introduced fatal error handler.
- Introduce TODO_NOLOC for the few places where no source location is
easily accessible.
Reviewed By: schweitz
Differential Revision: https://reviews.llvm.org/D97190
Petr Hosek [Tue, 23 Feb 2021 06:19:55 +0000 (22:19 -0800)]
[CMake][profile] Don't use `TARGET lld` to avoid ordering issues
Depending on the order in which lld and compiler-rt projects are
processed by CMake, `TARGET lld` might evaluate to `TRUE` or `FALSE`
even though `lld-available` lit stanza is always set because lld is
being built. We check whether lld project is enabled instead which
is used by other compiler-rt tests.
The ideal solution here would be to use CMake generator expressions,
but those cannot be used for dependencies yet, see:
https://gitlab.kitware.com/cmake/cmake/-/issues/19467
Differential Revision: https://reviews.llvm.org/D97256
KareemErgawy-TomTom [Tue, 16 Feb 2021 06:42:41 +0000 (07:42 +0100)]
[MLIR][LinAlg] Start detensoring implementation.
This commit is the first baby step towards detensoring in
linalg-on-tensors.
Detensoring is the process through which a tensor value is converted to one
or potentially more primitive value(s). During this process, operations with
such detensored operands are also converted to an equivalent form that works
on primitives.
The detensoring process is driven by linalg-on-tensor ops. In particular, a
linalg-on-tensor op is checked to see whether *all* its operands can be
detensored. If so, those operands are converted to their primitive
counterparts and the linalg op is replaced by an equivalent op that takes
those new primitive values as operands.
This works towards handling github/google/iree#1159.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D96271
Mark de Wever [Mon, 22 Feb 2021 19:31:47 +0000 (20:31 +0100)]
[NFC][libc++] Fix _LIBCPP_HAS_BITSCAN64 usage.
Seems line was accidentally left in
llvm-svn: 290924 86eebc5b658b5c2ccf2f4fbc16e8aee9880919a5
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D97211
Kamlesh Kumar [Tue, 23 Feb 2021 06:57:23 +0000 (22:57 -0800)]
[builtins] Replace __SOFT_FP__ with __SOFTFP__
Fix PR46294
Differential Revision: https://reviews.llvm.org/D82014
Lang Hames [Tue, 23 Feb 2021 06:37:36 +0000 (17:37 +1100)]
[docs][ORC] Fix section title and reference.
Anton Afanasyev [Tue, 23 Feb 2021 04:55:55 +0000 (07:55 +0300)]
[SLP][Test] Add test for PR49081.ll
Mehdi Amini [Tue, 23 Feb 2021 00:56:01 +0000 (00:56 +0000)]
Move the MLIR integration tests as a subdirectory of test (NFC)
This does not change the behavior directly: the tests only run when
`-DMLIR_INCLUDE_INTEGRATION_TESTS=ON` is configured. However running
`ninja check-mlir` will now run all the tests within a single
lit invocation. The previous behavior would wait for all the integration
tests to complete before starting to run the first regular test. The
test results were also reported separately. This change unifies all
of this and allows concurrent execution of the integration tests with
regular non-regression and unit-tests.
Differential Revision: https://reviews.llvm.org/D97241
Siva Chandra Reddy [Tue, 23 Feb 2021 05:39:12 +0000 (21:39 -0800)]
[libc][NFC] Eliminate couple of dependencies on llvm/ADT/StringExtras.h.
Juneyoung Lee [Fri, 19 Feb 2021 07:41:19 +0000 (16:41 +0900)]
[BuildLibCalls] Add noundef to allocator fns' size
This is a patch to explicitly mark the size parameter of allocator functions like malloc/realloc/... as noundef.
For C/C++: undef can be created from reading an uninitialized variable or padding.
Calling a function with uninitialized variable is already UB.
Calling malloc with padding value is.. something that's not expected. Padding bits may appear in a coerced aggregate, which doesn't apply to malloc's size.
Therefore, malloc's size can be marked as noundef.
For transformations that introduce malloc/realloc/..: I ran LLVM unit tests with an updated Alive2 semantics, and found no regression, so it seems okay.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97045
Arthur Eubanks [Mon, 22 Feb 2021 21:36:29 +0000 (13:36 -0800)]
Only verify LazyCallGraph under expensive checks
These verify calls are causing a lot of slowdown on some files, up to 8x.
The LazyCallGraph infra has been tested a lot over the years, so I'm fairly confident that we don't always need to run the verify calls.
These verifies took >90% of total time in one of the compilations I looked at.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D97225
Kazu Hirata [Tue, 23 Feb 2021 04:17:18 +0000 (20:17 -0800)]
[Analysis] Use range-based for loops (NFC)
Kazu Hirata [Tue, 23 Feb 2021 04:17:16 +0000 (20:17 -0800)]
[llvm] Use llvm::drop_begin (NFC)
Kazu Hirata [Tue, 23 Feb 2021 04:17:15 +0000 (20:17 -0800)]
[Analysis] Use ListSeparator (NFC)
Raman Tenneti [Tue, 23 Feb 2021 03:14:21 +0000 (19:14 -0800)]
[libc] [Obvious] Fix.
River Riddle [Tue, 23 Feb 2021 03:01:01 +0000 (19:01 -0800)]
[mlir][pdl][NFC] Extract the execution of each bytecode operation into its own function
This makes the implementation of each bytecode operation much easier to reason about, and lets the compiler decide which implementations are beneficial to inline into the main switch.
Differential Revision: https://reviews.llvm.org/D95716
River Riddle [Tue, 23 Feb 2021 03:00:54 +0000 (19:00 -0800)]
[mlir][pdl] Fix bug when ordering predicates
We should be ordering predicates with higher primary/secondary sums first, but we are currently ordering them last. This allows for predicates more frequently encountered to be checked first.
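The intended order can be sketched as a descending sort. This is an
illustration only: PredicateStats and orderedIds are hypothetical, and
treating the primary sum as the first key with the secondary sum as a
tie-breaker is an assumption, not something stated in the description.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a predicate and its accumulated sums.
struct PredicateStats {
  unsigned id;
  unsigned primary;
  unsigned secondary;
};

// Returns the predicate ids ordered so that higher sums come first.
inline std::vector<unsigned> orderedIds(std::vector<PredicateStats> preds) {
  std::stable_sort(preds.begin(), preds.end(),
                   [](const PredicateStats &a, const PredicateStats &b) {
                     // '>' (not '<') puts the larger sums at the front.
                     return std::tie(a.primary, a.secondary) >
                            std::tie(b.primary, b.secondary);
                   });
  std::vector<unsigned> ids;
  for (const PredicateStats &p : preds)
    ids.push_back(p.id);
  return ids;
}
```

Flipping the comparison to `<` reproduces the kind of bug described: the
most frequently encountered predicates would be checked last.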
Differential Revision: https://reviews.llvm.org/D95715
ksyx [Wed, 17 Feb 2021 08:43:34 +0000 (16:43 +0800)]
[GVN] Fix a typo in comment
NFC.
Differential Revision: https://reviews.llvm.org/D97200
Reviewed By: fhahn
Raman Tenneti [Tue, 23 Feb 2021 02:03:11 +0000 (18:03 -0800)]
Changes to mktime to handle invalid dates, overflow and underflow and calculating the correct date and the number of seconds even if invalid dates are passed as arguments.
Added tests for invalid dates like the following
Date 1970-01-01 00:00:-1 is treated as 1969-12-31 23:59:59 and seconds
are returned for the modified date.
Tested the code by doing ninja check-libc (and cmake).
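The normalization in the example falls out naturally if the seconds are
computed linearly from the fields. The sketch below is not the libc
implementation: daysFromCivil and secondsSinceEpoch are hypothetical
helpers, the day count uses the well-known civil-date algorithm, the
calendar is proleptic Gregorian in UTC, and the month is assumed to be in
1..12 (other fields may be out of range).

```cpp
#include <cassert>

// Days between 1970-01-01 and the given civil date (negative before 1970).
// `m` must be in 1..12; `d` may be out of range and is applied linearly.
inline long long daysFromCivil(long long y, long long m, long long d) {
  y -= (m <= 2) ? 1 : 0;                       // count years from March
  const long long era = (y >= 0 ? y : y - 399) / 400;
  const long long yoe = y - era * 400;         // year of era, 0..399
  const long long mp = (m > 2) ? m - 3 : m + 9; // March = 0 ... February = 11
  const long long doy = (153 * mp + 2) / 5 + d - 1;
  const long long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
  return era * 146097 + doe - 719468;
}

// Seconds since 1970-01-01 00:00:00; out-of-range hours, minutes and
// seconds simply carry into the neighbouring fields.
inline long long secondsSinceEpoch(long long year, long long month,
                                   long long day, long long hour,
                                   long long minute, long long second) {
  return daysFromCivil(year, month, day) * 86400 + hour * 3600 +
         minute * 60 + second;
}

inline void mktimeExample() {
  // 1970-01-01 00:00:-1 and 1969-12-31 23:59:59 are the same instant.
  assert(secondsSinceEpoch(1970, 1, 1, 0, 0, -1) == -1);
  assert(secondsSinceEpoch(1969, 12, 31, 23, 59, 59) == -1);
}
```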
Reviewed By: sivachandra, rtenneti
Differential Revision: https://reviews.llvm.org/D96684
Jianzhou Zhao [Sun, 21 Feb 2021 19:38:56 +0000 (19:38 +0000)]
[dfsan] Propagate origins at non-memory/phi/call instructions
This is a part of https://reviews.llvm.org/D95835.
Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D97200
Rahman Lavaee [Tue, 23 Feb 2021 01:50:14 +0000 (17:50 -0800)]
[obj2yaml,yaml2obj] Add NumBlocks to the BBAddrMapEntry yaml field.
As discussed in D95511, this allows us to encode invalid BBAddrMap
sections to be used in more rigorous testing.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D96831
Richard Howell [Tue, 23 Feb 2021 01:53:18 +0000 (17:53 -0800)]
[lldb] add check for libcxx runtime
When enabling LLDB tests with `LLVM_ENABLE_RUNTIMES=libcxx` CMake will
fail with:
```
LLDB test suite requires libc++, but it is currently disabled.
```
The issue is that the targets in LLVM_ENABLE_RUNTIMES are configured
after the targets in LLVM_ENABLE_PROJECTS, so at this point the check
for the `cxx` target will fail. CMake will add a dependency for a target
that does not exist yet however, so by first checking for `libcxx` in
LLVM_ENABLE_RUNTIMES we ensure that the `cxx` target will be present at
build time.
Tested with:
```
% cmake -G Ninja \
-C ~/local/llvm-project/lldb/cmake/caches/Apple-lldb-macOS.cmake \
-DLLVM_ENABLE_PROJECTS="clang;lldb" -DLLVM_ENABLE_RUNTIMES="libcxx" \
-DLIBCXX_INCLUDE_TESTS=NO ~/local/llvm-project/llvm
% ninja check-lldb
```
Reviewed By: smeenai, JDevlieghere
Differential Revision: https://reviews.llvm.org/D97227
River Riddle [Tue, 23 Feb 2021 01:30:19 +0000 (17:30 -0800)]
[mlir][IR] Refactor the `getChecked` and `verifyConstructionInvariants` methods on Attributes/Types
`verifyConstructionInvariants` is intended to allow for verifying the invariants of an attribute/type on construction, and `getChecked` is intended to enable more graceful error handling aside from an assert. There are a few problems with the current implementation of these methods:
* `verifyConstructionInvariants` requires an mlir::Location for emitting errors, which is prohibitively costly in the situations that would most likely use them, e.g. the parser.
This creates an unfortunate code duplication between the verifier code and the parser code, given that the parser operates on llvm::SMLoc and it is an undesirable overhead to pre-emptively convert from that to an mlir::Location.
* `getChecked` effectively requires duplicating the definition of the `get` method, creating a quite clunky workflow due to the subtle difference in its signature.
This revision aims to tackle the above problems by refactoring the implementation to use a callback for error emission. Using a callback allows for deferring the costly part of error emission until it is actually necessary.
Due to the necessary signature change in each instance of these methods, this revision also takes this opportunity to cleanup the definition of these methods by:
* restructuring the signature of `getChecked` such that it can be generated from the same code block as the `get` method.
* renaming `verifyConstructionInvariants` to `verify` to match the naming scheme of the rest of the compiler.
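The callback idea can be sketched as follows. This is a standalone model
and not the MLIR API: IntType, Diagnostics, EmitErrorFn, parseIntType and
the width limit are hypothetical, and the counter only stands in for the
costly location conversion.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical type; width 0 models a "null" result.
struct IntType {
  unsigned width = 0;
  explicit operator bool() const { return width != 0; }
};

// Hypothetical diagnostic sink that counts expensive location conversions.
struct Diagnostics {
  int locationsBuilt = 0;
  std::vector<std::string> errors;
};

using EmitErrorFn = std::function<void(const std::string &)>;

// Verifies the construction invariants; only calls emitError on failure.
inline bool verify(const EmitErrorFn &emitError, unsigned width) {
  const unsigned kMaxWidth = 128; // arbitrary limit for this sketch
  if (width == 0 || width > kMaxWidth) {
    emitError("invalid integer width");
    return false;
  }
  return true;
}

// Same parameters as a plain `get` would take, plus the error callback.
inline IntType getChecked(const EmitErrorFn &emitError, unsigned width) {
  if (!verify(emitError, width))
    return IntType{};
  IntType result;
  result.width = width;
  return result;
}

// Parser-side use: the costly conversion happens inside the callback, so it
// is only paid when an error is actually emitted.
inline IntType parseIntType(Diagnostics &diag, unsigned width) {
  EmitErrorFn emitError = [&diag](const std::string &message) {
    ++diag.locationsBuilt; // models converting a cheap loc to a full Location
    diag.errors.push_back(message);
  };
  return getChecked(emitError, width);
}
```

In this model a successful parseIntType call leaves locationsBuilt at 0;
only a failing call pays for the conversion.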
Differential Revision: https://reviews.llvm.org/D97100
Jessica Paquette [Tue, 23 Feb 2021 01:36:17 +0000 (17:36 -0800)]
Revert "[AArch64][GlobalISel] Match G_SHUFFLE_VECTOR -> insert elt + extract elt"
This reverts commit 867e379c0e14527eb7aa68485a10324693e35f5d.
For some reason this is upsetting Linux/Windows bots. Reverting while I try to
reproduce.
Cassie Jones [Mon, 22 Feb 2021 22:11:58 +0000 (17:11 -0500)]
[Test][AArch64] Test SADDE/SSUBE/UADDE/USUBE narrowing legalization
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D96676
Cassie Jones [Mon, 22 Feb 2021 22:11:46 +0000 (17:11 -0500)]
[AArch64][GlobalISel] Make overflow legalization use clampScalar
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D96674
Cassie Jones [Mon, 22 Feb 2021 22:11:35 +0000 (17:11 -0500)]
[GlobalISel] Implement narrowScalar for SADDE/SSUBE/UADDE/USUBE
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D96673
Cassie Jones [Mon, 22 Feb 2021 22:11:23 +0000 (17:11 -0500)]
[GlobalISel] Implement narrowScalar for SADDO/SSUBO
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D96672
Cassie Jones [Mon, 22 Feb 2021 22:10:58 +0000 (17:10 -0500)]
[GlobalISel] Implement narrowScalar for UADDO/USUBO
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D96671
Jonas Devlieghere [Tue, 23 Feb 2021 00:52:30 +0000 (16:52 -0800)]
[lldb] Reinstate support for LLDB_VERSION_STRING
Reinstate support for specifying -DLLDB_VERSION_STRING="best-lldb"
which seems to have gotten accidentally removed in the past.
rdar://38983903
Differential revision: https://reviews.llvm.org/D97235
Kazu Hirata [Tue, 23 Feb 2021 00:54:57 +0000 (16:54 -0800)]
[MacroExpansionContext] Fix a warning.
This patch fixes:
error: private field 'PP' is not used [-Werror,-Wunused-private-field]
Ryan Prichard [Tue, 23 Feb 2021 00:35:38 +0000 (16:35 -0800)]
[libunwind] unw_* alias fixes for ELF and Mach-O
Rename the CMake option, LIBUNWIND_HERMETIC_STATIC_LIBRARY, to
LIBUNWIND_HIDE_SYMBOLS. Rename the C macro define,
_LIBUNWIND_DISABLE_VISIBILITY_ANNOTATIONS, to _LIBUNWIND_HIDE_SYMBOLS,
because now the macro adds a .hidden directive rather than merely
suppressing visibility annotations.
For ELF, when LIBUNWIND_HIDE_SYMBOLS is enabled, mark unw_getcontext as
hidden. This symbol is the only one defined using src/assembly.h's
WEAK_ALIAS macro. Other unw_* weak aliases are defined in C++ and are
already hidden.
Mach-O doesn't support weak aliases, so remove .weak_reference and
weak_import. When LIBUNWIND_HIDE_SYMBOLS is enabled, output
.private_extern for the unw_* aliases.
In assembly.h, add missing SYMBOL_NAME macro invocations, which are
used to prefix symbol names with '_' on some targets.
Fixes PR46709.
Reviewed By: #libunwind, phosek, compnerd, steven_wu
Differential Revision: https://reviews.llvm.org/D93003
Aart Bik [Sun, 21 Feb 2021 02:34:07 +0000 (18:34 -0800)]
[sparse][mlir] simplify lattice optimization logic
Simplifies the way lattices are optimized with fewer, but more
powerful, rules. This also fixes an inaccuracy where too many
lattices resulted (expecting a non-existing universal index).
Also puts no-side-effects on all proper getters and unifies
bufferization flags order in integration tests (for future,
more complex use cases).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D97134
Lang Hames [Mon, 22 Feb 2021 23:35:54 +0000 (10:35 +1100)]
[JITLink] Add a getFixupAddress convenience method to Block.
Lang Hames [Mon, 22 Feb 2021 06:16:19 +0000 (17:16 +1100)]
[JITLink] Don't allow creation of sections with duplicate names.
LLVM GN Syncbot [Mon, 22 Feb 2021 23:50:19 +0000 (23:50 +0000)]
[gn build] Port 8f48ddd19358
Luo, Yuanke [Sat, 20 Feb 2021 07:05:07 +0000 (15:05 +0800)]
[X86][AMX] Lower tile copy instruction.
Since there is no tile copy instruction, we need to store the tile
register to the stack and load it from the stack into another tile
register. We need an extra GR to hold the stride, and we need a stack
slot to hold the tile register's data. We run this pass after copy
propagation, so that we don't miss copy optimization, and before
prolog/epilog insertion, so that we can allocate the stack slot.
Differential Revision: https://reviews.llvm.org/D97112
James Y Knight [Mon, 22 Feb 2021 23:35:53 +0000 (18:35 -0500)]
DebugInfo: Emit "LocalToUnit" flag on local member function decls.
Follow-up to fe2dcd89acfd9301a230e38e9030734553baa8dc.
Update test per review comments, restoring the "D" type to its
original state, and adding new "L" type. (Sorry, this was intended to
be included in the prior commit)
Differential Revision: https://reviews.llvm.org/D96044
Andy Kaylor [Thu, 4 Feb 2021 02:16:04 +0000 (18:16 -0800)]
Add auto-upgrade support for annotation intrinsics
The llvm.ptr.annotation and llvm.var.annotation intrinsics were changed
since the 11.0 release to add an additional parameter. This patch
auto-upgrades IR containing the four-parameter versions of these
intrinsics, adding a null pointer as the fifth argument.
Differential Revision: https://reviews.llvm.org/D95993
Stanislav Mekhanoshin [Mon, 22 Feb 2021 23:02:37 +0000 (15:02 -0800)]
[AMDGPU] Move RPT::getLiveRegs() check under EXPENSIVE_CHECKS
This is too expensive even for debug builds. It doubles
scheduling time if enabled.
Differential Revision: https://reviews.llvm.org/D97232
Arthur Eubanks [Fri, 12 Feb 2021 09:55:26 +0000 (01:55 -0800)]
[CMake] Don't optimize tests so much under ThinLTO
This drops check-llvm under -DLLVM_ENABLE_LTO=Thin from 13m to 10m20s on Windows and 20m to 15m35s on Linux.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D96618
Craig Topper [Mon, 22 Feb 2021 22:36:44 +0000 (14:36 -0800)]
[RISCV] Have sexti32 also recognize AssertZExt from types smaller than i32.
An i64 AssertZExt from a type smaller than i32 has at least 33
leading zeros, which means it has at least 33 sign bits.
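As a worked check of that arithmetic, here is a small standalone sketch. `numSignBits` and `looksSignExtendedFromI32` are hypothetical helpers written for this illustration, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Counts the leading bits equal to the sign bit, including the sign bit.
unsigned numSignBits(int64_t V) {
  uint64_t U = static_cast<uint64_t>(V);
  uint64_t Sign = U >> 63;
  unsigned N = 1;
  for (int Bit = 62; Bit >= 0 && ((U >> Bit) & 1) == Sign; --Bit)
    ++N;
  return N;
}

// A 64-bit value equals the sign extension of its low 32 bits exactly when
// bits 63..31 all agree, i.e. when it has at least 33 sign bits.
bool looksSignExtendedFromI32(int64_t V) { return numSignBits(V) >= 33; }
```

A value zero-extended from i16 such as 0xFFFF has 48 leading zeros and passes the check; 0xFFFFFFFF, zero-extended from a full i32, has only 32 and does not.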
Since we have a couple patterns that use two sexti32, I've
switched to a ComplexPattern so tablegen didn't have to generate
9 different permutations.
As noted in the FIXME, maybe we should just call computeNumSignBits,
but we don't have tests that benefit from that yet.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D97130
James Y Knight [Wed, 3 Feb 2021 20:56:52 +0000 (15:56 -0500)]
DebugInfo: Emit "LocalToUnit" flag on local member function decls.
Previously, the definition was so-marked, but the declaration was
not. This resulted in LLVM's dwarf emission treating the function as
being external, and incorrectly emitting DW_AT_external.
Differential Revision: https://reviews.llvm.org/D96044
Jessica Paquette [Fri, 19 Feb 2021 17:25:31 +0000 (09:25 -0800)]
[AArch64][GlobalISel] Match G_SHUFFLE_VECTOR -> insert elt + extract elt
Match a G_SHUFFLE_VECTOR with a mask that allows it to be represented as a
G_INSERT_VECTOR_ELT and a G_EXTRACT_VECTOR_ELT.
This ports `isINSMask` from AArch64ISelLowering and the portion of
`AArch64TargetLowering::LowerVECTOR_SHUFFLE` which handles the equivalent
transformation.
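For illustration, the kind of mask check involved can be sketched as follows. This is a simplified standalone version written for this note, not the code from AArch64ISelLowering; `matchInsertMask` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Returns true if Mask keeps one source vector unchanged in every lane but
// one. Mask entries index the concatenation of both inputs (LHS lanes are
// 0..N-1, RHS lanes are N..2N-1); a negative entry means "undefined" and
// matches either side. On success, DstIsLeft names the preserved source and
// Lane is the single lane that must be inserted.
bool matchInsertMask(const std::vector<int> &Mask, bool &DstIsLeft, int &Lane) {
  assert(!Mask.empty() && "expected a non-empty shuffle mask");
  const int N = static_cast<int>(Mask.size());
  int LHSMatches = 0, RHSMatches = 0;
  int LastLHSMismatch = -1, LastRHSMismatch = -1;
  for (int I = 0; I < N; ++I) {
    if (Mask[I] < 0) {
      ++LHSMatches;
      ++RHSMatches;
      continue;
    }
    if (Mask[I] == I)
      ++LHSMatches;
    else
      LastLHSMismatch = I;
    if (Mask[I] == I + N)
      ++RHSMatches;
    else
      LastRHSMismatch = I;
  }
  if (LHSMatches == N - 1) {
    DstIsLeft = true;
    Lane = LastLHSMismatch;
    return true;
  }
  if (RHSMatches == N - 1) {
    DstIsLeft = false;
    Lane = LastRHSMismatch;
    return true;
  }
  return false;
}
```

For the mask {0, 1, 6, 3}, the left source is preserved and lane 2 is replaced by element 6 of the concatenated inputs, which is one extract plus one insert.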
This provides more opportunities for matching DUP. We don't have all of the
necessary combines to actually make DUP out of these yet, but this is better for
size than the full TBL expansion for G_SHUFFLE_VECTOR.
This is a -0.1% code size improvement on CTMark/Bullet at -Os.
IR example: https://godbolt.org/z/sdcevT
Differential Revision: https://reviews.llvm.org/D97214
Craig Topper [Mon, 22 Feb 2021 22:34:06 +0000 (14:34 -0800)]
[ValueTracking] Improve ComputeNumSignBits for SRem.
The result will have the same sign as the dividend unless the
result is 0. The magnitude of the result will always be less
than or equal to the dividend. So the result will have at least
as many sign bits as the dividend.
Previously we would do this if the divisor was a positive constant,
but that isn't required.
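The claim can be checked exhaustively on 8-bit values. The helpers below are a standalone sketch written for this note (hypothetical names, not ValueTracking code); C++ `%` truncates toward zero, which matches srem.

```cpp
#include <cassert>
#include <cstdint>

// Counts the leading bits equal to the sign bit of an 8-bit value.
unsigned numSignBits8(int8_t V) {
  uint8_t U = static_cast<uint8_t>(V);
  unsigned Sign = U >> 7;
  unsigned N = 1;
  for (int Bit = 6; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++N;
  return N;
}

// Returns true if, for every i8 dividend and every non-zero i8 divisor
// (positive or negative), the remainder has at least as many sign bits as
// the dividend.
bool sremKeepsSignBitsForAllI8() {
  for (int A = -128; A <= 127; ++A) {
    for (int B = -128; B <= 127; ++B) {
      if (B == 0)
        continue; // Remainder by zero is undefined.
      int R = A % B; // Computed in int, so -128 % -1 is well defined.
      assert(R >= -128 && R <= 127);
      if (numSignBits8(static_cast<int8_t>(R)) <
          numSignBits8(static_cast<int8_t>(A)))
        return false;
    }
  }
  return true;
}
```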
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D97170
Peter Collingbourne [Tue, 22 Dec 2020 02:39:03 +0000 (18:39 -0800)]
scudo: Support memory tagging in the secondary allocator.
This patch enhances the secondary allocator to be able to detect buffer
overflow, and (on hardware supporting memory tagging) use-after-free
and buffer underflow.
Use-after-free detection is implemented by setting memory page
protection to PROT_NONE on free. Because this must be done immediately
rather than after the memory has been quarantined, we no longer use the
combined allocator quarantine for secondary allocations. Instead, a
quarantine has been added to the secondary allocator cache.
Buffer overflow detection is implemented by aligning the allocation
to the right of the writable pages, so that any overflows will
spill into the guard page to the right of the allocation, which
will have PROT_NONE page protection. Because this would require the
secondary allocator to produce a header at the correct position,
the responsibility for ensuring chunk alignment has been moved to
the secondary allocator.
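The right-alignment arithmetic can be sketched as below. This is an illustration under stated assumptions, not scudo's code: `rightAlignedStart` is a hypothetical helper, and the addresses in the usage note are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Places an allocation of Size bytes as close as possible to the guard page
// that starts at GuardPage, while keeping the start aligned to Alignment
// (which must be a power of two).
uintptr_t rightAlignedStart(uintptr_t GuardPage, size_t Size,
                            size_t Alignment) {
  assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0 &&
         "alignment must be a power of two");
  assert(Size <= GuardPage && "allocation must fit below the guard page");
  // Round the ideal start address down to the required alignment.
  return (GuardPage - Size) & ~(static_cast<uintptr_t>(Alignment) - 1);
}
```

With a guard page at 0x20000, a 100-byte allocation aligned to 16 starts at 130960 and ends 12 bytes short of the guard page, so in this sketch only an overflow of more than 12 bytes would fault. A 4096-byte allocation ends exactly at the guard page.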
Buffer underflow detection has been implemented on hardware supporting
memory tagging by tagging the memory region between the start of the
mapping and the start of the allocation with a non-zero tag. Due to
the cost of pre-tagging secondary allocations and the memory bandwidth
cost of tagged accesses, the allocation itself uses a tag of 0 and
only the first four pages have memory tagging enabled.
Differential Revision: https://reviews.llvm.org/D93731
Shafik Yaghmour [Mon, 22 Feb 2021 17:58:26 +0000 (09:58 -0800)]
Modify TypePrinter to differentiate between anonymous struct and unnamed struct
Currently, TypePrinter lumps anonymous classes and unnamed classes into one group, "anonymous". This is not correct and can be confusing in some contexts.
Differential Revision: https://reviews.llvm.org/D96807
Sam McCall [Tue, 16 Feb 2021 08:16:10 +0000 (09:16 +0100)]
[clangd] Shutdown sequence for modules, and doc threading requirements
This allows modules to do work on non-TUScheduler background threads.
Differential Revision: https://reviews.llvm.org/D96755
Sam McCall [Wed, 17 Feb 2021 11:59:25 +0000 (12:59 +0100)]
[clangd] Narrow and document a loophole in blockUntilIdle
blockUntilIdle of a parent can't always be correctly implemented as
return ChildA.blockUntilIdle() && ChildB.blockUntilIdle()
The problem is that B can schedule work on A while we're waiting on it.
I believe this is theoretically possible today between CDB and background index.
Modules open more possibilities and it's hard to reason about all of them.
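The problem can be reproduced in a single-threaded model. This is a simplified sketch for illustration only: `Worker` and the functions below are hypothetical and do not correspond to clangd classes.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical worker: a queue of tasks that blockUntilIdle() drains.
struct Worker {
  std::deque<std::function<void()>> Queue;

  bool blockUntilIdle() {
    while (!Queue.empty()) {
      std::function<void()> Task = std::move(Queue.front());
      Queue.pop_front();
      Task();
    }
    return true;
  }
};

// The composition from the text: wait for A, then for B.
bool naiveBlockUntilIdle(Worker &A, Worker &B) {
  return A.blockUntilIdle() && B.blockUntilIdle();
}

// Returns true if the naive composition reports "idle" while A still has
// pending work that B scheduled after A had already been drained.
bool naiveCheckIsSpurious() {
  Worker A, B;
  B.Queue.push_back([&A] { A.Queue.push_back([] {}); });
  bool ReportedIdle = naiveBlockUntilIdle(A, B);
  return ReportedIdle && !A.Queue.empty();
}
```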
I don't have a perfect fix, and the abstraction is too good to lose. This patch:
- calls out why we block on workscheduler first, and asserts correctness
- documents the issue
- reduces the practical possibility of spuriously returning true significantly
This function is ultimately only for testing, so we're driving down flake rate.
Differential Revision: https://reviews.llvm.org/D96856
Petr Hosek [Sat, 13 Jul 2019 21:02:07 +0000 (14:02 -0700)]
[InstrProfiling] Use ELF section groups for counters, data and values
__start_/__stop_ references retain C identifier name sections such as
__llvm_prf_*. Putting these into a section group disables this logic.
The ELF section group semantics ensures that group members are retained
or discarded as a unit. When a function symbol is discarded, this allows
the linker to discard counters, data and values associated with that
function symbol as well.
Note that `noduplicates` COMDAT is lowered to zero-flag section group in
ELF. We only set this for functions that aren't already in a COMDAT and
for those that don't have available_externally linkage since we already
use regular COMDAT groups for those.
Differential Revision: https://reviews.llvm.org/D96757
Amara Emerson [Fri, 19 Feb 2021 20:50:59 +0000 (12:50 -0800)]
[GlobalISel] Support lowering <3 x i8> arguments in multiple parts.
Differential Revision: https://reviews.llvm.org/D97086
Amara Emerson [Fri, 19 Feb 2021 06:54:59 +0000 (22:54 -0800)]
[AArch64][GlobalISel] Support lowering <1 x i8> arguments.
We don't yet have working codegen for the resulting unmerges, and if
we did it would probably be horrible.
Differential Revision: https://reviews.llvm.org/D97035
Nico Weber [Mon, 22 Feb 2021 19:29:55 +0000 (14:29 -0500)]
[lld-link] Add /reproduce: support for several flags
/reproduce: now works correctly with:
- /call-graph-ordering-file:
- /def:
- /natvis:
- /order:
- /pdbstream:
I went through all instances of MemoryBuffer::getFile() and made sure
everything that didn't already do so called takeBuffer().
For natvis, that wasn't possible since DebugInfo/PDB wants to take
ownership of the natvis buffer. For that case, I'm manually adding the
tar file entry.
/natvis: and /pdbstream: are slightly awkward, since createResponseFile()
always adds these flags to the response file but createPDB() (which
ultimately adds the files referenced by the flags) is only called if
/debug is also passed. So when using /natvis: without /debug with
/reproduce:, lld won't warn, but when linking using the response
file from the archive, it won't find the natvis file since it's not
in the tar. This isn't a new issue though, and after this patch things
at least work when using /natvis: _with_ /debug with /reproduce:.
(Same for /pdbstream:)
Differential Revision: https://reviews.llvm.org/D97212
Heejin Ahn [Mon, 22 Feb 2021 05:12:37 +0000 (21:12 -0800)]
[WebAssembly] Remap branch dests after fixCatchUnwindMismatches
Fixing catch unwind mismatches can sometimes invalidate existing branch
destinations. This CL remaps those destinations after placing
try-delegates.
Fixes https://github.com/emscripten-core/emscripten/issues/13515.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D97178