Arthur Eubanks [Thu, 24 Jun 2021 15:21:24 +0000 (08:21 -0700)]
[OpaquePtr] Introduce option to force all pointers to be opaque pointers
We don't want to start updating tests to use opaque pointers until we're
close to the opaque pointer transition. However, before the transition
we want to run tests as if pointers are opaque pointers to see if there
are any crashes.
At some point when we have a flag to only create opaque pointers in the
bitcode and textual IR readers, and when we have fixed all places that
try to read a pointee type, this flag will be useless. However, until
then, this can help us find issues more easily.
Since the cl::opt is read into LLVMContext, we need to make sure
LLVMContext is created after cl::ParseCommandLineOptions().
Previously ValueEnumerator would visit the value types of global values
via the pointer type, but with opaque pointers we have to manually visit
the value type.
Reviewed By: nikic, dexonsmith
Differential Revision: https://reviews.llvm.org/D103503
Arthur Eubanks [Wed, 23 Jun 2021 01:53:12 +0000 (18:53 -0700)]
[WPD] Don't optimize calls more than once
WPD currently assumes that there is a one to one correspondence between
type test assume sequences and virtual calls. However, with
-fstrict-vtable-pointers this may not be true. This ends up causing
crashes when we try to optimize a virtual call more than once (
applyUniformRetValOpt()/applyUniqueRetValOpt()/applyVirtualConstProp()/applySingleImplDevirt()).
applySingleImplDevirt() actually didn't previously crash because it would
replace the devirtualized call with the same direct call. Adding an
assert that the call is indirect causes the corresponding test to crash
with the rest of the patch.
This makes Chrome successfully build with -fstrict-vtable-pointers + WPD.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D104798
Seraphime Kirkovski [Thu, 24 Jun 2021 19:24:59 +0000 (21:24 +0200)]
[Clang-Format] Add ReferenceAlignment directive
This introduces ReferenceAlignment style option modeled around
PointerAlignment.
Style implementors can specify Left, Right, Middle or Pointer to
follow whatever the PointerAlignment option specifies.
Differential Revision: https://reviews.llvm.org/D104096
Nemanja Ivanovic [Thu, 24 Jun 2021 19:44:17 +0000 (14:44 -0500)]
[PowerPC] Combine 64-bit bswap(load) without LDBRX
When targeting CPUs that don't have LDBRX, we end up producing code that is
very inefficient and large for this common idiom. This patch just
optimizes it to two 32-bit LWBRX instructions along with a merge.
This fixes https://bugs.llvm.org/show_bug.cgi?id=49610
Differential revision: https://reviews.llvm.org/D104836
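The equivalence this lowering relies on can be checked with a small model. This is only a sketch, not the backend code: `lwbrx` is a hypothetical Python helper that models a 32-bit byte-reversed load as a little-endian read, and the merge order assumes a big-endian target.

```python
def bswap64(x: int) -> int:
    """Reverse the byte order of a 64-bit value."""
    return int.from_bytes(x.to_bytes(8, "big"), "little")

def load64_be(mem: bytes, addr: int) -> int:
    """Plain 64-bit load as seen by a big-endian target (assumption)."""
    return int.from_bytes(mem[addr:addr + 8], "big")

def lwbrx(mem: bytes, addr: int) -> int:
    """Hypothetical model of a 32-bit byte-reversed load."""
    return int.from_bytes(mem[addr:addr + 4], "little")

def bswap_load64_via_lwbrx(mem: bytes, addr: int) -> int:
    """Build bswap(load64) from two 32-bit byte-reversed loads and a merge."""
    lo = lwbrx(mem, addr)        # bytes 0..3 become the low word
    hi = lwbrx(mem, addr + 4)    # bytes 4..7 become the high word
    return (hi << 32) | lo       # the merge step

mem = bytes(range(0x10, 0x20))
```

In this model the word loaded from the lower address ends up in the low half of the result, which is why the two halves have to be merged rather than simply concatenated in load order.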
Nikita Popov [Thu, 24 Jun 2021 19:46:55 +0000 (21:46 +0200)]
[InstCombine] Make varargs cast transform compatible with opaque ptrs
The whole transform can be dropped once we have fully transitioned
to opaque pointers (as its purpose is to remove no-op pointer
casts). For now, make sure that it handles opaque pointers correctly.
Jonas Paulsson [Wed, 9 Jun 2021 21:35:22 +0000 (23:35 +0200)]
[BuildLibCalls/SimplifyLibCalls] Fix attributes on created CallInst instructions.
- When emitting libcalls, do not only pass the calling convention from the
function prototype but also the attributes.
- Do not pass attributes from e.g. libc memcpy to llvm.memcpy.
Review: Reid Kleckner, Eli Friedman, Arthur Eubanks
Differential Revision: https://reviews.llvm.org/D103992
Björn Schäpers [Thu, 24 Jun 2021 19:19:14 +0000 (21:19 +0200)]
[clang-format][NFC] Fix documentation
This amends
64cf5eba06bd4f81954253b1e7a10be6fe92403e.
Akira Hatanaka [Thu, 24 Jun 2021 18:45:52 +0000 (11:45 -0700)]
[CodeGen] Don't create fake FunctionDecls when generating block/byref
copy/dispose helper functions
We found out that these fake functions would cause clang to crash if the
changes proposed in https://reviews.llvm.org/D98799 were made.
The original patch was reverted in
f681fd927e883301658dcac9a78109ee0aba12a8
because debug locations were missing in the body of the block byref
helper functions. This patch fixes the bug by calling CreateArtificial
after the calls to StartFunction.
Differential Revision: https://reviews.llvm.org/D104082
Roman Lebedev [Thu, 24 Jun 2021 18:33:43 +0000 (21:33 +0300)]
[NFC][Codegen] Autogenerate Thumb2/setjmp_longjmp.ll test
Aakanksha Patil [Thu, 24 Jun 2021 18:32:41 +0000 (14:32 -0400)]
[AMDGPU] Add gfx1035 target
Differential Revision: https://reviews.llvm.org/D104804
Roman Lebedev [Thu, 24 Jun 2021 18:25:06 +0000 (21:25 +0300)]
[SimplifyCFG] Tail-merging all blocks with `resume` terminator
Similar to what we already do for `ret` terminators.
As noted by @rnk, clang seems to already generate a single `ret`/`resume`,
so this isn't likely to cause widespread changes.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D104849
zoecarver [Thu, 24 Jun 2021 18:05:50 +0000 (11:05 -0700)]
[libcxx][nfc] Update the synopsis comment in <ranges> to include drop_view.
zoecarver [Thu, 24 Jun 2021 18:05:14 +0000 (11:05 -0700)]
[libcxx][ranges] Enable borrowed range for drop view when T has borrowing enabled.
LLVM GN Syncbot [Thu, 24 Jun 2021 18:02:44 +0000 (18:02 +0000)]
[gn build] Port
6adbc83ee9e4
Christopher Di Bella [Sat, 5 Jun 2021 02:47:47 +0000 (02:47 +0000)]
[libcxx][modularisation] moves <utility> content out of <type_traits>
Moves:
* `std::move`, `std::forward`, `std::declval`, and `std::swap` into
`__utility/${FUNCTION_NAME}`.
* `std::swap_ranges` and `std::iter_swap` into
`__algorithm/${FUNCTION_NAME}`
Differential Revision: https://reviews.llvm.org/D103734
Christopher Di Bella [Wed, 23 Jun 2021 01:25:31 +0000 (01:25 +0000)]
[libcxx][NFC] removes `swap`'s dependency on `swap_ranges`
Under the as-if rule, we can directly implement the array overload for
`std::swap`. By removing this circular dependency where `swap` is
implemented in terms of `swap_ranges` and `swap_ranges` is defined in
terms of `swap`, we can split them into their own headers. This will:
* limit the surface area in which Hyrum's law can bite us;
* force users to include the correct headers;
* make finding the definitions trivial (`swap` is a utility;
`swap_ranges` is an algorithm).
Differential Revision: https://reviews.llvm.org/D104760
Mehdi Amini [Thu, 24 Jun 2021 17:47:50 +0000 (17:47 +0000)]
Attempt to disable MLIR JIT tests on PowerPC to unbreak the bot
This is until we figure out how to turn on the large code size model.
zoecarver [Thu, 24 Jun 2021 17:45:25 +0000 (10:45 -0700)]
[libcxx][nfc] Add one more test case for contiguous_range.
If the `data` member function is different enough, `ranges::data` won't pick it, so the range remains a contiguous_range.
zoecarver [Mon, 14 Jun 2021 20:08:15 +0000 (16:08 -0400)]
[libcxx][ranges] Add contiguous_range.
Differential Revision: https://reviews.llvm.org/D104262
Roman Lebedev [Thu, 24 Jun 2021 17:31:28 +0000 (20:31 +0300)]
[NFC][SimplifyCFG] Revisit tail-merge-resume.ll test
Add an already somewhat-common resume block
Pablo Barrio [Tue, 15 Jun 2021 14:04:15 +0000 (15:04 +0100)]
[AArch64][v8.3A] Avoid inserting implicit landing pads (PACI*SP)
PACI*SP have the advantage that they are in HINT space, meaning
they can be run successfully in hardware without PAuth support -
they will just behave as a NOP. However, PACI*SP are also implicit
landing pads (think of an extra BTI jc). Therefore, they allow
indirect jumps of all kinds into them, potentially inserting new
gadgets. This patch replaces PACI*SP by PACI* LR, SP when
compiling explicitly for hardware with full PAuth support. PACI*
is not in the HINT space, therefore it will fault when run in
hardware without PAuth support, but it is also not a landing pad,
making programs safer in newer HW.
Differential Revision: https://reviews.llvm.org/D101920
Sanjay Patel [Thu, 24 Jun 2021 17:19:07 +0000 (13:19 -0400)]
[InstSimplify] move extract with undef index fold; NFC
This puts it closer to the other undef query check and
will avoid a potential ordering problem if we allow
folding non-constant-int indexes.
Anna Thomas [Thu, 24 Jun 2021 17:09:40 +0000 (13:09 -0400)]
Precommit tests for context sensitive attribute dropping
Precommit tests from D104641.
The patch will fix the callsites by dropping the context-sensitive
attributes.
Reviewed-By: Self
William S. Moses [Thu, 24 Jun 2021 16:33:54 +0000 (12:33 -0400)]
[MLIR][SCF] Inline single block ExecuteRegionOp
This commit adds a canonicalization pass which inlines any single block execute region
Differential Revision: https://reviews.llvm.org/D104865
Sanjay Patel [Thu, 24 Jun 2021 16:51:45 +0000 (12:51 -0400)]
[InstSimplify][test] add test for extract of splat; NFC
This is shown in:
https://llvm.org/PR50817
Sanjay Patel [Thu, 24 Jun 2021 16:44:41 +0000 (12:44 -0400)]
[InstSimplify][test] move tests that don't require InstCombine; NFC
These are existing/missing simplifications, so the tests
don't need the full power of InstCombine.
Craig Topper [Thu, 24 Jun 2021 16:08:57 +0000 (09:08 -0700)]
[TargetLowering][ARM] Don't alter opaque constants in TargetLowering::ShrinkDemandedConstant.
We don't constant fold based on demanded bits elsewhere in
SimplifyDemandedBits, so I don't think we should shrink them either.
The affected ARM test changes because a constant became non-opaque
and eventually enabled some constant folding. This no longer happens.
I checked and InstCombine is able to simplify this test. I'm not sure exactly
what it was trying to test.
Reviewed By: lebedev.ri, dmgreen
Differential Revision: https://reviews.llvm.org/D104832
Petr Hosek [Fri, 21 May 2021 06:58:07 +0000 (23:58 -0700)]
[CMake] Don't LTO optimize targets on Darwin either
This is a follow up to D102732 which also expands the logic to Darwin.
Differential Revision: https://reviews.llvm.org/D104764
Anirudh Prasad [Thu, 24 Jun 2021 16:49:38 +0000 (12:49 -0400)]
[AsmParser][SystemZ][z/OS] Support for emitting labels in upper case
- Currently, the emitting of labels in the parsePrimaryExpr function is case sensitive. It just takes the identifier and emits it.
- However, for HLASM the emitting of labels is case independent. We emit them in upper case only, to enforce case independence. So we need to ensure that at the time of parsing the label we emit it in upper case (in `parseAsHLASMLabel`), and also that when we process a PC-relative relocatable expression we emit it in upper case (in `parsePrimaryExpr`).
- To achieve this a new MCAsmInfo attribute has been introduced which corresponding targets can override if needed.
Reviewed By: abhina.sreeskantharajan, uweigand
Differential Revision: https://reviews.llvm.org/D104715
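The intended behaviour can be pictured with a tiny model. This is a sketch under assumptions: `emit_label` and the `emit_labels_in_upper_case` flag are hypothetical names standing in for the target-overridable MCAsmInfo attribute, not the actual LLVM identifiers.

```python
def emit_label(identifier: str, emit_labels_in_upper_case: bool) -> str:
    """Return the symbol name that would be emitted for a parsed label.

    `emit_labels_in_upper_case` is a hypothetical stand-in for the
    per-target attribute described in the commit message.
    """
    return identifier.upper() if emit_labels_in_upper_case else identifier

# With the attribute enabled, a label definition and a later reference
# that differ only in case resolve to the same emitted symbol.
definition = emit_label("loopStart", emit_labels_in_upper_case=True)
reference = emit_label("LOOPSTART", emit_labels_in_upper_case=True)
```

Applying the same normalisation at both the definition and the reference is what makes the two parse paths agree.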
Geoffrey Martin-Noble [Thu, 24 Jun 2021 16:42:14 +0000 (09:42 -0700)]
Update Bazel build for
929189a499
Updates Bazel build files to match
https://github.com/llvm/llvm-project/commit/
929189a499
Differential Revision: https://reviews.llvm.org/D104864
David Spickett [Fri, 19 Feb 2021 16:31:46 +0000 (16:31 +0000)]
[lldb][AArch64] Add "memory tag read" command
This new command looks much like "memory read"
and mirrors its basic behaviour.
(lldb) memory tag read new_buf_ptr new_buf_ptr+32
Logical tag: 0x9
Allocation tags:
[0x900fffff7ffa000, 0x900fffff7ffa010): 0x9
[0x900fffff7ffa010, 0x900fffff7ffa020): 0x0
Important properties:
* The end address is optional and defaults to reading
1 tag if omitted
* It is an error to try to read tags if the architecture
or process doesn't support it, or if the range asked
for is not tagged.
* It is an error to read an inverted range (end < begin)
(logical tags are removed for this check so you can
pass tagged addresses here)
* The range will be expanded to fit the tagging granule,
so you can get more tags than simply (end-begin)/granule size.
Whatever you get back will always cover the original range.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D97285
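The range expansion described above can be sketched as follows. This is a minimal model, not lldb's implementation: the helper names are hypothetical, and the 16-byte granule is the MTE granule size implied by the 0x10 steps in the example output.

```python
GRANULE = 16  # MTE granule size in bytes (matches the 0x10 steps above)

def expand_to_granules(begin: int, end: int, granule: int = GRANULE):
    """Align begin down and end up so the range covers whole granules."""
    aligned_begin = begin - (begin % granule)
    aligned_end = -(-end // granule) * granule  # round up to a granule multiple
    return aligned_begin, aligned_end

def tag_count(begin: int, end: int, granule: int = GRANULE) -> int:
    """Number of tags returned for [begin, end) after expansion."""
    aligned_begin, aligned_end = expand_to_granules(begin, end, granule)
    return (aligned_end - aligned_begin) // granule
```

For example, a 10-byte range that straddles a granule boundary yields two tags, even though (end-begin)/granule rounds down to zero.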
Alex Zinenko [Thu, 24 Jun 2021 16:34:49 +0000 (18:34 +0200)]
[mlir] remove repeated use of TypeToLLVM.cpp in cmake targets
David Spickett [Fri, 19 Feb 2021 15:57:29 +0000 (15:57 +0000)]
[lldb][AArch64] Add MTE memory tag reading to lldb
This adds GDB client support for the qMemTags packet
which reads memory tags, following the design
which was recently committed to GDB.
https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html#General-Query-Packets
(look for qMemTags)
lldb commands will use the new Process methods
GetMemoryTagManager and ReadMemoryTags.
The former takes a range and checks that:
* The current process architecture has an architecture plugin
* That plugin provides a MemoryTagManager
* That the range of memory requested lies in a tagged range
(it will expand it to granules for you)
If all that was true you get a MemoryTagManager you
can give to ReadMemoryTags.
This two step process is done to allow commands to get the
tag manager without having to read tags as well. For example
you might just want to remove a logical tag, or error early
if a range with tagged addresses is inverted.
Note that getting a MemoryTagManager doesn't mean that the process
or a specific memory range is tagged. Those are separate checks.
Having a tag manager just means this architecture *could* have
a tagging feature enabled.
An architecture plugin has been added for AArch64 which
will return a MemoryTagManagerAArch64MTE, which was added in a
previous patch.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D95602
Alexander Yermolovich [Thu, 24 Jun 2021 15:02:45 +0000 (08:02 -0700)]
[LLD][LLVM] CG Graph profile using relocations
Currently when .llvm.call-graph-profile is created by llvm it explicitly encodes the symbol indices. This section is basically a black box for post processing tools. For example, if we run strip -s on the object files the symbol table changes, but indices in that section do not. In the non-visible case the indices point to the wrong symbols. In the visible case the indices point outside of the symbol table: "invalid symbol index".
This patch changes the format by using R_*_NONE relocations to indicate the from/to symbols. The Frequency (Weight) will still be in the .llvm.call-graph-profile, but symbol information will be in relocation section. In LLD information from both sections is used to reconstruct call graph profile. Relocations themselves will never be applied.
With this approach post processing tools that handle relocations correctly work for this section also. Tools can add/remove symbols and as long as they handle relocation sections with this approach information stays correct.
In a quick experiment with clang-13, the aggregate size of all the input sections went up from 107KB to 322KB. Size of clang-13 binary is ~118MB. For users of -fprofile-use/-fprofile-sample-use the size of object files will go up slightly, but it will not impact final binary size.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D104080
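The failure mode and the fix can be illustrated with a toy model. This is only a sketch: the list-based "sections", the `strip_symbols` helper and the symbol names are hypothetical and do not reflect the real ELF encoding.

```python
def strip_symbols(symtab, keep):
    """Drop symbols not in `keep`; return the new table and an old->new index map."""
    new_symtab = [name for name in symtab if name in keep]
    remap = {old: new_symtab.index(name)
             for old, name in enumerate(symtab) if name in keep}
    return new_symtab, remap

symtab = ["unused", "main", "helper"]

# Old layout (modeled): symbol indices stored directly in the section contents.
old_section = [(1, 2, 100)]   # main -> helper, weight 100

# New layout (modeled): only weights in the section, symbols in relocations.
new_section = [100]
relocs = [(1, 2)]             # one from/to pair of relocations per weight

new_symtab, remap = strip_symbols(symtab, keep={"main", "helper"})

# A tool that handles relocations rewrites their symbol indices,
# but leaves the opaque section contents untouched.
new_relocs = [(remap[f], remap[t]) for f, t in relocs]

call_graph = [(new_symtab[f], new_symtab[t], w)
              for (f, t), w in zip(new_relocs, new_section)]
```

After the symbol table shrinks, the stale indices in `old_section` show both problems at once: index 1 now names the wrong symbol and index 2 is out of range, while the relocation-based layout still resolves to the original edge.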
William S. Moses [Tue, 22 Jun 2021 17:57:04 +0000 (13:57 -0400)]
[MLIR][LLVM] Expose type translator from LLVM to MLIR Type
This commit moves the type translator from LLVM to MLIR to a public header for use by external projects or other code.
Unlike a previous attempt (https://reviews.llvm.org/D104726), this patch moves the type conversion into separate files which remedies the linker error which was only caught by CI.
Differential Revision: https://reviews.llvm.org/D104834
David Spickett [Fri, 19 Feb 2021 15:57:59 +0000 (15:57 +0000)]
[lldb][AArch64] Add memory tag reading to lldb-server
This adds memory tag reading using the new "qMemTags"
packet and ptrace on AArch64 Linux.
This new packet is following the one used by GDB.
(https://sourceware.org/gdb/current/onlinedocs/gdb/General-Query-Packets.html)
On AArch64 Linux we use ptrace's PEEKMTETAGS to read
tags and we assume that lldb has already checked that the
memory region actually has tagging enabled.
We do not assume that lldb has expanded the requested range
to granules and expand it again to be sure.
(although lldb will be sending aligned ranges because it happens
to need them client side anyway)
Also we don't assume untagged addresses. So for AArch64 we'll
remove the top byte before using them. (the top byte includes
MTE and other non address data)
To do the ptrace read NativeProcessLinux will ask the native
register context for a memory tag manager based on the
type in the packet. This also gives you the ptrace numbers you need.
(it's called a register context but it also has non register data,
so it saves adding another per platform sub class)
The only supported platform for this is AArch64 Linux and the only
supported tag type is MTE allocation tags. Anything else will
error.
Ptrace can return a partial result but for lldb-server we will
be treating that as an error. To succeed we need to get all the tags
we expect.
(Note that the protocol leaves room for logical tags to be
read via qMemTags but this is not going to be implemented for lldb
at this time.)
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D95601
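Removing the top byte before using an address can be sketched as below. This is an illustration under assumptions, not the lldb-server code: the helper names are hypothetical, and the example address is the tagged pointer from the "memory tag read" output elsewhere in this log, with the MTE logical tag taken from bits 56-59.

```python
ADDRESS_BITS = 56  # the top byte (bits 56..63) carries non-address data

def remove_top_byte(addr: int) -> int:
    """Clear the top byte of a 64-bit AArch64 address."""
    return addr & ((1 << ADDRESS_BITS) - 1)

def logical_tag(addr: int) -> int:
    """Extract the 4-bit MTE logical tag stored in bits 56..59."""
    return (addr >> ADDRESS_BITS) & 0xF

tagged = 0x0900FFFFF7FFA000
```

Here `remove_top_byte(tagged)` gives the plain address that can be handed to ptrace, and `logical_tag(tagged)` recovers the 0x9 shown as the logical tag in that output.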
Florian Hahn [Thu, 24 Jun 2021 15:38:33 +0000 (16:38 +0100)]
[VPlan] Fix indentation of check lines in sinking test (NFC).
Nico Weber [Thu, 24 Jun 2021 15:06:15 +0000 (11:06 -0400)]
[gn build] Fix a comment typo and a comment copy-pasto
Nicolas Vasilache [Thu, 24 Jun 2021 14:39:50 +0000 (14:39 +0000)]
[mlir][Linalg] Add support for scf::ForOp in comprehensive bufferization (7/n)
scf::ForOp bufferization analysis proceeds just like for any other op (including FuncOp) at its boundaries; i.e. if:
1. The tensor operand is inplaceable.
2. The matching result has no subsequent read (i.e. all reads dominate the scf::ForOp).
3. Bufferizing inplace does not create a RAW interference.
then it can bufferize inplace.
Still there are a few differences:
1. bbArgs for an scf::ForOp are always considered inplaceable when seen from ops inside the body. This is because a) either the matching tensor operand is not inplaceable and an alloc will be inserted (which makes bbArg itself inplaceable); or b) the tensor operand and bbArg are both already inplaceable.
2. Bufferization within the scf::ForOp body has implications for the outside world: the scf.yield terminator may well ping-pong values of the same type. This muddies the water for alias analysis and is not supported at the moment. Such cases result in a pass failure.
Differential revision: https://reviews.llvm.org/D104490
Sjoerd Meijer [Wed, 23 Jun 2021 12:40:46 +0000 (13:40 +0100)]
[AArch64] Precommit extending load tests for D104782. NFC.
David Spickett [Thu, 24 Jun 2021 14:52:02 +0000 (15:52 +0100)]
[lldb][AArch64] Fix unpack tags test case
Use %zu to print size_t vars.
Saurabh Jha [Wed, 23 Jun 2021 14:47:01 +0000 (15:47 +0100)]
Add documentation for compound assignment and type conversion of matrix types
David Spickett [Fri, 19 Feb 2021 15:32:09 +0000 (15:32 +0000)]
[lldb][AArch64] Add memory-tagging qSupported feature
This feature "memory-tagging+" indicates that lldb-server
supports memory tagging packets. (added in a later patch)
We check HWCAP2_MTE to decide whether to enable this
feature for Linux.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D97282
Nico Weber [Thu, 24 Jun 2021 14:17:13 +0000 (10:17 -0400)]
[gn build] Remove an unneeded -I flag
Everything includes clang/Config/config.h by qualified "clang/Config/config.h"
path, so there's no need for `-Igen/clang/include/clang/Config/clang/include`.
No behavior change.
Tobias Gysi [Thu, 24 Jun 2021 13:45:18 +0000 (13:45 +0000)]
[mlir][linalg][python] Add shape-only tensor support to OpDSL.
Add an index_dim annotation to specify the shape to loop mapping of shape-only tensors. A shape-only tensor is not accessed within the body of the operation but is required to span the iteration space of certain operations such as pooling.
Differential Revision: https://reviews.llvm.org/D104767
David Spickett [Fri, 19 Feb 2021 13:37:37 +0000 (13:37 +0000)]
[lldb][AArch64] Add class for managing memory tags
This adds the MemoryTagManager class and a specialisation
of that class for AArch64 MTE tags. It provides a generic
interface for various tagging operations.
Adding/removing tags, diffing tagged pointers, etc.
Later patches will use this manager to handle memory tags
in generic code in both lldb and lldb-server.
Since it will be used in both, the base class header is in
lldb/Target.
(MemoryRegionInfo is another example of this pattern)
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D97281
Brendon Cahoon [Mon, 14 Jun 2021 17:10:50 +0000 (13:10 -0400)]
[GlobalISel] Describe undefined values for G_SBFX/G_UBFX operands
Differential Revision: https://reviews.llvm.org/D104245
Florian Hahn [Thu, 24 Jun 2021 12:55:08 +0000 (13:55 +0100)]
[LV] Support sinking recipe in replicate region after another region.
This patch handles sinking a replicate region after another replicate
region. In that case, we can connect the sink region after the target
region. This properly handles the case for which an assertion has been
added in
337d7652823f.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34842.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D103514
Tobias Gysi [Thu, 24 Jun 2021 09:56:16 +0000 (09:56 +0000)]
[mlir][linalg][python] Add attribute support to the YAML codegen.
Extend the yaml code generation to support the index attributes that https://reviews.llvm.org/D104711 added to the OpDSL.
Differential Revision: https://reviews.llvm.org/D104712
Stephen Tozer [Wed, 23 Jun 2021 14:04:37 +0000 (15:04 +0100)]
[DebugInfo] Enable variadic debug value salvaging
This patch enables the salvaging of debug values that may be calculated
from more than one SSA value, such as with binary operators that do not
use a constant argument. The actual functionality for this behaviour is
added in a previous commit (
c7270567), but with the ability to actually
emit the resulting debug values switched off.
The reason for this is that the prior patch has been reverted several
times due to issues discovered downstream, some time after the actual
landing of the patch. The patch in question is rather large and touches
several widely used header files, and all issues discovered are more
related to the handling of variadic debug values as a whole rather than
the details of the patch itself. Therefore, to minimize the build time
impact and risk of conflicts involved in any potential future
revert/reapply of that patch, this significantly smaller patch (that
touches no header files) will instead be used as the capstone to enable
variadic debug value salvaging.
The review linked to this patch is mostly implemented by the previous
commit,
c7270567, but also contains the changes in this patch.
Differential Revision: https://reviews.llvm.org/D91722
David Green [Thu, 24 Jun 2021 12:09:11 +0000 (13:09 +0100)]
[ARM] Extend narrow values to allow using truncating scatters
As a minor adjustment to the existing lowering of offset scatters, this
extends any smaller-than-legal vectors into full vectors using a zext,
so that the truncating scatters can be used. Due to the way MVE
legalizes the vectors this should be cheap in most situations, and will
prevent the vector from being scalarized.
Differential Revision: https://reviews.llvm.org/D103704
Roman Lebedev [Thu, 24 Jun 2021 12:08:20 +0000 (15:08 +0300)]
[NFC][SimplifyCFG] Add basic test for tail-merging `resume` function terminators
Jay Foad [Thu, 24 Jun 2021 10:00:13 +0000 (11:00 +0100)]
[MCA] Allow unlimited cycles in the timeline view
Change --max-timeline-cycles=0 to mean no limit on the number of cycles.
Use this in AMDGPU tests to show all instructions in the timeline view
instead of having it arbitrarily truncated.
Differential Revision: https://reviews.llvm.org/D104846
Florian Hahn [Thu, 24 Jun 2021 11:24:04 +0000 (12:24 +0100)]
[X86] Exclude invalid element types for bitcast/broadcast folding.
It looks like the fold introduced in
63f3383ece25efa can cause crashes
if the type of the bitcasted value is not a valid vector element type,
like x86_mmx.
To resolve the crash, reject invalid vector element types. The way it is
done in the patch is a bit clunky. Perhaps there's a better way to
check?
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D104792
Florian Hahn [Thu, 24 Jun 2021 09:00:27 +0000 (10:00 +0100)]
[SCEV] Generalize MatchBinaryAddToConst to support non-add expressions.
This patch generalizes MatchBinaryAddToConst to support matching
(A + C1), (A + C2), instead of just matching (A + C1), A.
The existing cases can be handled by treating non-add expressions A as
A + 0.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D104634
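The generalized matching can be sketched on a toy expression representation. This is a minimal model, assuming expressions are either opaque strings or `("add", base, constant)` tuples; the function names are hypothetical and this is not the SCEV implementation.

```python
def split_add_const(expr):
    """Return (base, constant), treating a non-add expression A as A + 0."""
    if isinstance(expr, tuple) and expr[0] == "add" and isinstance(expr[2], int):
        return expr[1], expr[2]
    return expr, 0

def match_binary_add_to_const(x, y):
    """Return (C1, C2) if x == A + C1 and y == A + C2 for a common A, else None."""
    base_x, c1 = split_add_const(x)
    base_y, c2 = split_add_const(y)
    if base_x != base_y:
        return None
    return c1, c2
```

The previously supported shape, (A + C1) against a bare A, falls out as the special case C2 = 0.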
Rosie Sumpter [Tue, 15 Jun 2021 09:29:27 +0000 (10:29 +0100)]
[CostModel][AArch64] Improve cost model for vector reduction intrinsics
OR, XOR and AND entries are added to the cost table. An extra cost
is added when vector splitting occurs.
This is done to address the issue of a missed SLP vectorization
opportunity due to unreasonably high costs being attributed to the vector
Or reduction (see: https://bugs.llvm.org/show_bug.cgi?id=44593).
Differential Revision: https://reviews.llvm.org/D104538
Nicolas Vasilache [Thu, 24 Jun 2021 06:41:07 +0000 (06:41 +0000)]
[mlir][Linalg] Add basic lowering test to library calls
This test shows how convert-linalg-to-std rewrites named linalg ops as library calls.
This can be coupled with a C++ shim to connect to existing libraries such as https://gist.github.com/nicolasvasilache/
691ef992404c49dc9b5d543c4aa6db38.
Everything can then be linked together with mlir-cpu-runner and MLIR can call C++ (which can itself call MLIR if needed).
This should evolve into specific rewrite patterns that can be applied on op instances independently rather than having to use a full conversion.
Differential Revision: https://reviews.llvm.org/D104842
Muhammad Omair Javaid [Thu, 24 Jun 2021 10:46:41 +0000 (15:46 +0500)]
[Clang] XFAIL sanitize-coverage-old-pm.c on 32bit Armv8l
sanitize-coverage-old-pm.c started failing on arm 32 bit where
underlying architecture reported is armv8l for 32bit arm.
This patch XFAILS sanitize-coverage-old-pm.c on armv8l similar
to armv7 and thumbv7.
Simon Pilgrim [Thu, 24 Jun 2021 10:15:44 +0000 (11:15 +0100)]
[X86] Fold nested select_cc to select (cmp*ge/le Cond0, Cond1), LHS, Y)
select (cmpeq Cond0, Cond1), LHS, (select (cmpugt Cond0, Cond1), LHS, Y) --> (select (cmpuge Cond0, Cond1), LHS, Y)
etc,
We already perform this fold in DAGCombiner for MVT::i1 comparison results, but these can still appear after legalization (in x86 case with MVT::i8 results), where we need to be more careful about generating new comparison codes.
Pulled out of D101074 to help address the remaining regressions.
Differential Revision: https://reviews.llvm.org/D104707
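The fold for the unsigned greater-than case can be checked exhaustively on small values. This is a sketch of the logical identity only, using non-negative Python integers to stand in for unsigned operands; it says nothing about how the DAG combine is implemented.

```python
def select(cond, true_value, false_value):
    """Scalar select: pick true_value when cond holds, else false_value."""
    return true_value if cond else false_value

def nested(cond0, cond1, lhs, y):
    """select (cmpeq C0, C1), LHS, (select (cmpugt C0, C1), LHS, Y)"""
    return select(cond0 == cond1, lhs, select(cond0 > cond1, lhs, y))

def folded(cond0, cond1, lhs, y):
    """select (cmpuge C0, C1), LHS, Y"""
    return select(cond0 >= cond1, lhs, y)

# Exhaustive check over a small unsigned range.
all_equal = all(
    nested(c0, c1, "LHS", "Y") == folded(c0, c1, "LHS", "Y")
    for c0 in range(8)
    for c1 in range(8)
)
```

The identity holds because "equal or greater" is exactly "greater or equal", so both selects that return LHS collapse into one comparison.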
Sander de Smalen [Thu, 24 Jun 2021 08:58:21 +0000 (09:58 +0100)]
[GlobalISel] NFC: Change LLT::vector to take ElementCount.
This also adds new interfaces for the fixed- and scalable case:
* LLT::fixed_vector
* LLT::scalable_vector
The strategy for migrating to the new interfaces was as follows:
* If the new LLT is a (modified) clone of another LLT, taking the
same number of elements, then use LLT::vector(OtherTy.getElementCount())
or if the number of elements is halved/doubled, it uses .divideCoefficientBy(2)
or operator*. That is because there is no reason to specifically restrict
the types to 'fixed_vector'.
* If the algorithm works on the number of elements (as unsigned), then
just use fixed_vector. This will need to be fixed up in the future when
modifying the algorithm to also work for scalable vectors, and will
then need additional tests to confirm the behaviour works the same for
scalable vectors.
* If the test used the `/*Scalable=*/true` flag of LLT::vector, then
this is replaced by LLT::scalable_vector.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D104451
Roman Lebedev [Thu, 24 Jun 2021 10:15:39 +0000 (13:15 +0300)]
[SimplifyCFG] Tail-merging all blocks with `ret` terminator
Based on top of D104598, which is a NFCI-ish refactoring.
Here, a restriction, that only empty blocks can be merged, is lifted.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D104597
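The effect of tail-merging `ret` blocks can be pictured on a toy CFG. This is a sketch under assumptions: the dictionary-based block representation, the `tail_merge_rets` helper and the `common.ret` block name are all hypothetical and only model the idea of one shared return block fed by a phi.

```python
def tail_merge_rets(blocks):
    """Redirect every `ret` block to a single shared return block.

    blocks: dict mapping block name -> (instructions, terminator),
    where a return terminator is ("ret", value).
    """
    ret_blocks = [name for name, (_, term) in blocks.items() if term[0] == "ret"]
    if len(ret_blocks) < 2:
        return blocks  # nothing to merge
    merged = {}
    incoming = []
    for name, (insts, term) in blocks.items():
        if name in ret_blocks:
            incoming.append((term[1], name))             # value and predecessor
            merged[name] = (insts, ("br", "common.ret"))  # non-empty blocks are fine
        else:
            merged[name] = (insts, term)
    merged["common.ret"] = ([("phi", incoming)], ("ret", "phi"))
    return merged

cfg = {
    "entry": ([], ("condbr", "a", "b")),
    "a": ([("call", "f")], ("ret", "x")),
    "b": ([], ("ret", "y")),
}
merged_cfg = tail_merge_rets(cfg)
```

Note that block "a" is not empty, which is the case the lifted restriction is about.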
Roman Lebedev [Thu, 24 Jun 2021 09:50:41 +0000 (12:50 +0300)]
[NFC][AArch64] Un-autogenerate swifterror.ll tests
It appears the change needed in D104597 is minimal and obvious,
so let's not make them so verbose.
Tobias Gysi [Thu, 24 Jun 2021 09:21:12 +0000 (09:21 +0000)]
[mlir][linalg][python] Add attribute support to the OpDSL.
Extend the OpDSL with index attributes. After tensors and scalars, index attributes are the third operand type. An index attribute represents a compile-time constant that is limited to index expressions. Use cases are the strides and dilations defined by convolution and pooling operations.
The patch only updates the OpDSL. The C++ yaml codegen is updated by a followup patch.
Differential Revision: https://reviews.llvm.org/D104711
Denys Petrov [Wed, 16 Jun 2021 13:44:36 +0000 (16:44 +0300)]
[analyzer] Added a test case for PR46264
Summary: The issue (https://bugs.llvm.org/show_bug.cgi?id=46264) is not reproducible with the latest sources. Add the reported test case to catch the problem if it occurs.
Differential Revision: https://reviews.llvm.org/D104381
Prevent: https://bugs.llvm.org/show_bug.cgi?id=46264
Fraser Cormack [Wed, 23 Jun 2021 09:11:13 +0000 (10:11 +0100)]
[RISCV] Lower RVV vector SELECTs to VSELECTs
This patch optimizes the code generation of vector-type SELECTs (LLVM
select instructions with scalar conditions) by custom-lowering to
VSELECTs (LLVM select instructions with vector conditions) by splatting
the condition to a vector. This avoids the default expansion path which
would either introduce control flow or fully scalarize.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D104772
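The lowering idea can be modeled with plain lists. This is a sketch of the semantics only, with hypothetical helper names; it does not reflect the RISC-V backend code.

```python
def select_scalar_cond(cond, a, b):
    """Vector-type SELECT: one scalar condition picks a whole vector."""
    return list(a) if cond else list(b)

def splat(value, length):
    """Broadcast a scalar to every lane of a vector."""
    return [value] * length

def vselect(mask, a, b):
    """VSELECT: a per-lane condition picks each element."""
    return [x if m else y for m, x, y in zip(mask, a, b)]

def lower_select(cond, a, b):
    """Lower SELECT to VSELECT by splatting the scalar condition."""
    return vselect(splat(cond, len(a)), a, b)
```

Since every lane of the splatted mask carries the same condition, the per-lane select makes the same choice as the scalar one, with no control flow or scalarization involved.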
Abid Malik [Thu, 24 Jun 2021 08:42:46 +0000 (09:42 +0100)]
[MLIR][OpenMP]Basic OpenMP target operation
This includes a basic implementation for the OpenMP target
operation. Currently, the if, thread_limit, private, shared, device, and nowait clauses are included in this implementation.
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Reviewed By: ftynse, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D102816
Florian Mayer [Thu, 17 Jun 2021 14:23:19 +0000 (15:23 +0100)]
[hwasan] print exact mismatch offset for short granules.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D104463
Stephen Tozer [Thu, 17 Jun 2021 15:35:17 +0000 (16:35 +0100)]
Partial Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
This is a partial reapply of the original commit and the followup commit
that were previously reverted; this reapply also includes a small fix
for a potential source of non-determinism, but also has a small change
to turn off variadic debug value salvaging, to ensure that any future
revert/reapply steps to disable and re-enable this feature do not risk
causing conflicts.
Differential Revision: https://reviews.llvm.org/D91722
This reverts commit
386b66b2fc297cda121a3cc8a36887a6ecbcfc68.
Florian Hahn [Thu, 24 Jun 2021 08:19:28 +0000 (09:19 +0100)]
[SLP] Add some tests that require memory runtime checks.
Dmitry Vyukov [Sat, 19 Jun 2021 10:52:26 +0000 (12:52 +0200)]
tsan: re-enable mmap_stress.cpp test
The comment says it was flaky in 2016,
but it wasn't possible to debug it back then.
Re-enable the test at least on linux/x86_64.
It will either work, or at least we should
see failure output from lit today.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D104592
Dmitry Vyukov [Sat, 19 Jun 2021 10:49:37 +0000 (12:49 +0200)]
tsan: fix mmap atomicity
Mmap interceptor is not atomic in the sense that it
exposes unmapped shadow for a brief period of time.
This breaks programs that mmap over another mmap
and access the region concurrently.
Don't unmap shadow in the mmap interceptor to fix this.
Just mapping new shadow on top should be enough to zero it.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D104593
Vitaly Buka [Thu, 24 Jun 2021 07:07:24 +0000 (00:07 -0700)]
[scudo] Fix test on arch without MTE
Vitaly Buka [Thu, 24 Jun 2021 06:58:09 +0000 (23:58 -0700)]
[scudo] Avoid ifdef in test
Vitaly Buka [Thu, 24 Jun 2021 06:52:47 +0000 (23:52 -0700)]
[scudo] Fix use of ScopedDisableMemoryTagChecks in test
Walter Erquinigo [Thu, 24 Jun 2021 06:03:26 +0000 (23:03 -0700)]
[NFC][trace] remove dead function
The Trace::GetCursorPosition function was never really implemented well and it's being replaced by a more correct TraceCursor object.
Vitaly Buka [Sun, 30 May 2021 00:11:36 +0000 (17:11 -0700)]
[scudo] Enabled MTE before the first allocator
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D103726
Walter Erquinigo [Wed, 16 Jun 2021 21:09:46 +0000 (14:09 -0700)]
[trace] Add a TraceCursor class
As a follow up of D103588, I'm reinitiating the discussion with a new proposal for traversing instructions in a trace which uses the feedback gotten in that diff.
See the embedded documentation in TraceCursor for more information. The idea is to offer an OOP way to traverse instructions exposing a minimal interface that makes no assumptions on:
- the number of instructions in the trace (i.e. having indices for instructions might be impractical for gigantic intel-pt traces, as it would require decoding the entire trace). This renders the use of indices to point to instructions impractical. Traces are big and expensive, and the consumer should try to do linear lookups (forwards and/or backwards) and avoid random accesses (the API could be extended though, but for now I want to discard that functionality and leave the API extensible if needed).
- the way the instructions are represented internally by each Trace plug-in. They could be mmap'ed from a file, exist in a plain vector, or be generated on the fly as the user requests the data.
- the actual data structure used internally for each plug-in. Ideas like having a struct TraceInstruction have been discarded because that would make the plug-in follow a certain data type, which might be costly. Instead, the user can ask the cursor for each independent property of the instruction it's pointing at.
The way to get a cursor is to ask Trace.h for the end or begin cursor of a thread's trace.
There are some benefits of this approach:
- there's little cost to create a cursor, and this allows for lazily decoding a trace as the user requests data.
- each trace plug-in could decide how to cache the instructions it generates. For example, if a trace is small, it might decide to keep everything in memory, or if the trace is massive, it might decide to keep around the last thousands of instructions to speed up local searches.
- a cursor can outlive a stop point, which makes trace comparison for live processes feasible. An application of this is to compare profiling data of two runs of the same function, which should be doable with intel pt.
Differential Revision: https://reviews.llvm.org/D104422
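As a rough illustration of the cursor idea in the TraceCursor commit above, the sketch below shows a cursor that exposes only linear movement and per-property getters, with storage left to the plug-in. This is not the actual lldb TraceCursor API: InstructionCursor, VectorCursor, CountRecentInRange and all member names are hypothetical, and the vector-backed plug-in is just one of the representations the message mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal cursor interface: no instruction indices and no
// instruction struct, only linear movement and per-property queries.
class InstructionCursor {
public:
  virtual ~InstructionCursor() = default;
  // Move one instruction forwards/backwards; return false at either end.
  virtual bool Next() = 0;
  virtual bool Prev() = 0;
  // Query a single property of the instruction the cursor points at.
  virtual uint64_t GetLoadAddress() const = 0;
};

// One possible plug-in: instruction addresses held in a plain vector.
// The vector must be non-empty and must outlive the cursor.
class VectorCursor : public InstructionCursor {
public:
  // Starts at the last instruction, like an "end" cursor.
  explicit VectorCursor(const std::vector<uint64_t> &addrs)
      : m_addrs(addrs), m_pos(addrs.size() - 1) {
    assert(!addrs.empty() && "trace must be non-empty");
  }
  bool Next() override {
    if (m_pos + 1 >= m_addrs.size())
      return false;
    ++m_pos;
    return true;
  }
  bool Prev() override {
    if (m_pos == 0)
      return false;
    --m_pos;
    return true;
  }
  uint64_t GetLoadAddress() const override { return m_addrs[m_pos]; }

private:
  const std::vector<uint64_t> &m_addrs;
  size_t m_pos;
};

// Consumer code: walk backwards from the cursor position and count how many
// consecutive instructions fall in [lo, hi), using no random access.
size_t CountRecentInRange(InstructionCursor &cursor, uint64_t lo, uint64_t hi) {
  size_t count = 0;
  do {
    uint64_t addr = cursor.GetLoadAddress();
    if (addr < lo || addr >= hi)
      break;
    ++count;
  } while (cursor.Prev());
  return count;
}
```

Because the consumer only sees InstructionCursor, a plug-in that decodes lazily or reads from an mmap'ed file could be swapped in without changing CountRecentInRange.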
Greg McGary [Tue, 22 Jun 2021 16:10:20 +0000 (09:10 -0700)]
[lld-macho] add tests for ICF, plus cleanups
Add tests for pending TODOs, plus some global cleanups:
* No fold: func has personality/LSDA
* Fold: reference to absolute symbol with different name but identical value
* No fold: reloc references to absolute symbols with different values
* No fold: N_ALT_ENTRY symbols
Differential Revision: https://reviews.llvm.org/D104721
Carl Ritson [Thu, 24 Jun 2021 00:59:55 +0000 (09:59 +0900)]
[AMDGPU] Add 224-bit vector types and link 192-bit types to MVTs
Add SReg_224, VReg_224, AReg_224, etc.
Link 224-bit types with v7i32/v7f32.
Link existing 192-bit types to newly added v3i64/v3f64/v6i32/v6f32.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D104622
Carl Ritson [Thu, 24 Jun 2021 00:59:25 +0000 (09:59 +0900)]
[ValueTypes] Define MVTs for v3i64/v3f64 to complement v6i32/v6f32
Having type symmetry with these is somewhat necessary when implementing support for 192-bit values.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D104621
Yaxun (Sam) Liu [Wed, 16 Jun 2021 19:40:27 +0000 (15:40 -0400)]
[HIP] Defer operator overloading errors
Although clang is able to defer overload resolution
diagnostics for common functions, it does not defer
diagnostics caused by overload resolution for overloaded
operators.
This patch extends the existing deferred
diagnostic mechanism to defer diagnostics caused
by overloaded operators.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D104505
Kai Luo [Thu, 24 Jun 2021 03:20:35 +0000 (03:20 +0000)]
[PowerPC] Add test to show passes in O3 pipeline. NFC.
Arthur Eubanks [Wed, 23 Jun 2021 20:16:04 +0000 (13:16 -0700)]
[docs][NewPM] Add some instructions on how to invoke opt
Also add link to blog post.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D104812
Peter Collingbourne [Thu, 24 Jun 2021 02:25:01 +0000 (19:25 -0700)]
gn build: Build ubsan_minimal on Android.
Zequan Wu [Thu, 24 Jun 2021 02:24:05 +0000 (19:24 -0700)]
Revert "ThinLTO: Fix inline assembly references to static functions with CFI"
This causes a compiler crash: Assertion `materialized_use_empty() && "Uses remain when a value is destroyed!"'
This reverts commit
e3d24b45b8f808ec66213e134c4ceda5202fbe31.
Peter Collingbourne [Fri, 5 Feb 2021 00:14:04 +0000 (16:14 -0800)]
AST: Create __va_list in the std namespace even in C.
This ensures that the mangled type names match between C and C++,
which is significant when using -fsanitize=cfi-icall. Ideally we
wouldn't have created this namespace at all, but it's now part of
the ABI (e.g. in mangled names), so we can't change it.
Differential Revision: https://reviews.llvm.org/D104830
Evgenii Stepanov [Tue, 22 Jun 2021 23:27:11 +0000 (16:27 -0700)]
[hwasan] Respect llvm.asan.globals.
This enables the no_sanitize C++ attribute to exclude globals from hwasan
testing, and automatically excludes other sanitizers' globals (such as
ubsan location descriptors).
Differential Revision: https://reviews.llvm.org/D104825
Jon Chesterfield [Thu, 24 Jun 2021 01:33:50 +0000 (02:33 +0100)]
Revert "[AMDGPU] [IndirectCalls] Don't propagate attributes to address taken functions and their callees"
This reverts commit
6a3beb1f68d6791a4cd0190f68b48510f754a00a.
Test case that triggers an infinite loop before the revert is at
the review for D103138.
Anthony Canino [Thu, 24 Jun 2021 01:00:46 +0000 (01:00 +0000)]
Implement an scf.for range folding optimization pass.
In cases where arithmetic (addi/muli) ops are performed on an scf.for loop's induction variable with a single use, we can fold those ops directly into the scf.for loop.
For example, in the following code:
```
scf.for %i = %c0 to %arg1 step %c1 {
%0 = addi %arg2, %i : index
%1 = muli %0, %c4 : index
%2 = memref.load %arg0[%1] : memref<?xi32>
%3 = muli %2, %2 : i32
memref.store %3, %arg0[%1] : memref<?xi32>
}
```
we can lift `%0` up into the scf.for loop range, as it is the only user of %i:
```
%lb = addi %arg2, %c0 : index
%ub = addi %arg2, %arg1 : index
scf.for %i = %lb to %ub step %c1 {
%1 = muli %i, %c4 : index
%2 = memref.load %arg0[%1] : memref<?xi32>
%3 = muli %2, %2 : i32
memref.store %3, %arg0[%1] : memref<?xi32>
}
```
Reviewed By: mehdi_amini, ftynse, Anthony
Differential Revision: https://reviews.llvm.org/D104289
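To see why the scf.for range folding above preserves behavior, the following sketch models both loop forms in plain C++ and collects the memref indices each one would touch. It is only an analogy to the MLIR in the message, not the pass itself: n stands in for %arg1, offset for %arg2, the constant 4 for %c4, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Original form: the loop runs over [0, n) and the body adds `offset`
// before scaling by 4.
std::vector<int64_t> indicesBeforeFolding(int64_t n, int64_t offset) {
  std::vector<int64_t> out;
  for (int64_t i = 0; i < n; ++i)
    out.push_back((offset + i) * 4);
  return out;
}

// Folded form: the addition has been moved into the loop bounds, so the
// body uses the induction variable directly.
std::vector<int64_t> indicesAfterFolding(int64_t n, int64_t offset) {
  std::vector<int64_t> out;
  const int64_t lb = offset + 0;
  const int64_t ub = offset + n;
  for (int64_t i = lb; i < ub; ++i)
    out.push_back(i * 4);
  return out;
}

// Both forms must visit exactly the same indices in the same order.
void checkFoldingPreservesIndices(int64_t n, int64_t offset) {
  assert(indicesBeforeFolding(n, offset) == indicesAfterFolding(n, offset));
}
```

The single-use requirement matters here: if the body also used the unshifted induction variable elsewhere, shifting the bounds would change that other use.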
Carl Ritson [Thu, 24 Jun 2021 00:36:58 +0000 (09:36 +0900)]
[LVI] Remove recursion from getValueForCondition (NFCI)
Convert getValueForCondition to a worklist model instead of using
recursion.
In pathological cases getValueForCondition recurses heavily.
Stack frames are quite expensive on x86-64, and some operating
systems (e.g. Windows) have relatively low stack size limits.
Using a worklist avoids potential failures from stack overflow.
Differential Revision: https://reviews.llvm.org/D104191
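The recursion-to-worklist conversion described in the LVI commit above follows a common pattern, sketched below on a toy condition tree. This is not the LVI code: the Cond struct and evaluate function are hypothetical, and the real getValueForCondition computes value ranges rather than booleans.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical condition tree: a leaf carries a truth value, an inner node
// is the logical AND or OR of its two children.
struct Cond {
  enum Kind { Leaf, And, Or } kind;
  bool value;       // only meaningful for Leaf
  const Cond *lhs;  // only meaningful for And/Or
  const Cond *rhs;  // only meaningful for And/Or
};

// Evaluate `root` without recursion. A node stays on the worklist until both
// of its children have results in `done`, so native stack use stays constant
// regardless of how deep the tree is.
bool evaluate(const Cond *root) {
  assert(root && "root must not be null");
  std::map<const Cond *, bool> done;
  std::vector<const Cond *> worklist{root};
  while (!worklist.empty()) {
    const Cond *cur = worklist.back();
    if (cur->kind == Cond::Leaf) {
      done[cur] = cur->value;
      worklist.pop_back();
      continue;
    }
    auto l = done.find(cur->lhs);
    auto r = done.find(cur->rhs);
    if (l == done.end() || r == done.end()) {
      // Children not ready yet: schedule them and revisit `cur` later.
      if (l == done.end())
        worklist.push_back(cur->lhs);
      if (r == done.end())
        worklist.push_back(cur->rhs);
      continue;
    }
    done[cur] = cur->kind == Cond::And ? (l->second && r->second)
                                       : (l->second || r->second);
    worklist.pop_back();
  }
  return done[root];
}
```

The worklist lives on the heap, so a pathologically deep chain of conditions costs memory proportional to its depth but cannot overflow the thread's stack.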
Whitney Tsang [Thu, 24 Jun 2021 00:22:06 +0000 (00:22 +0000)]
[AIX] Emitting diagnostics error for profile options
Only LLVM-based instrumentation profiling is supported on AIX,
and it currently must be used with full LTO.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D104803
modimo [Thu, 24 Jun 2021 00:15:12 +0000 (17:15 -0700)]
[Clang] Check for returns_nonnull when deciding to add allocation null checks
Non-throwing allocators currently will always get null-check code. However, if the non-throwing allocator is explicitly annotated with returns_nonnull the null check should be elided.
Testing:
ninja check-all
added test case correctly elides the null check
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D102820
modimo [Thu, 24 Jun 2021 00:08:59 +0000 (17:08 -0700)]
[NFC] [DwarfEHPrepare] Add additional stats for EH
Stats added:
1. NumCleanupLandingPadsUnreachable: how many cleanup landing pads were optimized as unreachable
2. NumCleanupLandingPadsRemaining: how many cleanup landing pads remain
3. NumNoUnwind: Number of functions with nounwind attribute
4. NumUnwind: Number of functions with unwind attribute
DwarfEHPrepare is always run a single time as part of `TargetPassConfig::addISelPasses()` which makes it an ideal place near the end of the pipeline to record this information.
Example output from clang built with exceptions cumulative during thinLTO backend (NumCleanupLandingPadsUnreachable was not incremented):
"dwarfehprepare.NumCleanupLandingPadsRemaining": 123660,
"dwarfehprepare.NumNoUnwind": 323836,
"dwarfehprepare.NumUnwind": 472893,
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D104161
Nick Desaulniers [Wed, 23 Jun 2021 23:28:36 +0000 (16:28 -0700)]
[LangRef] add note to warn-frame-size about ODR
As suggested by @dblaikie in D104342.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104736
Bill Wendling [Tue, 22 Jun 2021 18:40:03 +0000 (11:40 -0700)]
[llvm-diff] Explicitly check ConstantStructs for differences
A ConstantStruct is renamed when the LLVM context sees a new one. This
makes global variable initializers appear different when they aren't.
Instead, check the ConstantStruct for equivalence.
Differential Revision: https://reviews.llvm.org/D104734
Craig Topper [Wed, 23 Jun 2021 22:38:03 +0000 (15:38 -0700)]
[CGP][RISCV] Teach CodeGenPrepare::optimizeSwitchInst to honor isSExtCheaperThanZExt.
This optimization pre-promotes the input and constants for a
switch instruction to a legal type so that all the generated compares
share the same extend. Since RISCV prefers sext for i32 to i64
extends, we should honor that to use sext.w instead of a pair
of shifts.
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D104612
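The CodeGenPrepare change above relies on the fact that a switch's equality compares give the same answer whether the input and the case constants are both zero extended or both sign extended. The sketch below checks that property for i32 to i64 promotion in plain C++; the function names are hypothetical and nothing here reflects the actual CodeGenPrepare code.

```cpp
#include <cassert>
#include <cstdint>

// Compare a 32-bit switch input against a 32-bit case constant after
// promoting both to 64 bits by zero extension.
bool matchesWithZExt(int32_t input, int32_t caseValue) {
  return static_cast<uint64_t>(static_cast<uint32_t>(input)) ==
         static_cast<uint64_t>(static_cast<uint32_t>(caseValue));
}

// The same comparison with both sides sign extended instead.
bool matchesWithSExt(int32_t input, int32_t caseValue) {
  return static_cast<int64_t>(input) == static_cast<int64_t>(caseValue);
}

// Both promotions must agree with the original 32-bit comparison.
void checkPromotionsAgree(int32_t input, int32_t caseValue) {
  assert(matchesWithZExt(input, caseValue) == (input == caseValue));
  assert(matchesWithSExt(input, caseValue) == (input == caseValue));
}
```

Since either extension is correct as long as it is applied consistently to the input and the constants, the choice can be left to whichever one the target reports as cheaper.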
Xun Li [Wed, 23 Jun 2021 22:33:55 +0000 (15:33 -0700)]
[SjLj] Insert UnregisterFn before musttail call
When inserting UnregisterFn, if there is a musttail call, we must insert before the call so that we don't break the musttail call contract.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D104807
Xun Li [Wed, 23 Jun 2021 22:31:35 +0000 (15:31 -0700)]
Revert "[SjLj] Insert UnregisterFn before musttail call"
This reverts commit
f36703ada3dc18388ef5cdcbb8f39f74c27ad8e9.
Test failure: https://lab.llvm.org/buildbot#builders/104/builds/3450
Saleem Abdulrasool [Sun, 13 Jun 2021 18:07:34 +0000 (11:07 -0700)]
mailmap: add mappings for myself
Add aliases for various alternative email addresses.
Patrick Holland [Wed, 23 Jun 2021 20:03:16 +0000 (13:03 -0700)]
[MCA][TimelineView] Fixed a bug that was causing instructions outside of the timeline-max-cycles to still be printed.
Differential Revision: https://reviews.llvm.org/D104815