Alexey Bataev [Thu, 29 Apr 2021 19:39:48 +0000 (12:39 -0700)]
Revert "[COST] Improve shuffle kind detection if shuffle mask is provided."
This reverts commit
92399322217917e67c0d72a55ec51ddc82251cf6 to fix
a compiler crash on mask checks.
Jez Ng [Thu, 29 Apr 2021 19:32:43 +0000 (15:32 -0400)]
[lld-macho] Remove stray file
Sriraman Tallam [Thu, 29 Apr 2021 18:48:11 +0000 (11:48 -0700)]
Basic block sections for functions with implicit-section-name attribute
Functions can have section names set via #pragma or section attributes;
basic block sections should be correctly named for such functions.
With #pragma, the expectation is that all functions in that file are placed
in the same section in the final binary. Basic block sections should be
correctly named with the unique flag set so that the final binary has all the
basic blocks of the function in that named section. This patch fixes the bug
by calling getExplicitSectionGlobal when the implicit-section-name attribute is set
to make sure the function's basic blocks get the correct section name.
Differential Revision: https://reviews.llvm.org/D101311
Jez Ng [Thu, 29 Apr 2021 02:45:03 +0000 (22:45 -0400)]
[lld-macho][nfc] Clean up header.s test
I don't think it's super worthwhile to test the dylib headers outputs of
all the different archs when x86_64 is the only one that has interesting
behavior.
Motivated by my upcoming addition of arm32...
Jez Ng [Thu, 29 Apr 2021 19:09:01 +0000 (15:09 -0400)]
[lld-macho] Make everything PIE by default
Modern versions of macOS (>= 10.7) and in general all modern Mach-O
target archs want PIEs by default. ld64 defaults to PIE for iOS >= 4.3,
as well as for all versions of watchOS and simulators. Basically all the
platforms LLD is likely to target want PIE. So instead of cluttering LLD's
code with legacy version checks, I think it's simpler to just default to
PIE for everything.
Note that `-no_pie` still works, so users can still opt out of it.
Reviewed By: #lld-macho, thakis, MaskRay
Differential Revision: https://reviews.llvm.org/D101513
Aart Bik [Thu, 29 Apr 2021 01:15:11 +0000 (18:15 -0700)]
[mlir][sparse] migrate sparse operations into new sparse tensor dialect
This is the very first step toward removing the glue and clutter from linalg and
replacing it with proper sparse tensor types. This revision migrates the LinalgSparseOps
into SparseTensorOps of a sparse tensor dialect. This also provides a new home for
sparse tensor related transformation.
NOTE: the actual replacement with sparse tensor types (and removal of linalg glue/clutter)
will follow but I am trying to keep the amount of changes per revision manageable.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D101488
Sanjay Patel [Thu, 29 Apr 2021 18:02:50 +0000 (14:02 -0400)]
[InstCombine] narrow popcount with zext operand
https://llvm.org/PR50141
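As a rough illustration of why this kind of narrowing is sound (reading the
title as "ctpop(zext X) can become zext(ctpop X)"; the exact pattern matched is
in the patch, not in this message): zero-extension only adds zero bits, so the
bit count is the same at either width. The helper names below are made up for
this sketch and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count (Kernighan's loop); stands in for llvm.ctpop.
template <typename T>
int popCount(T v) {
  int n = 0;
  while (v != 0) {
    v = static_cast<T>(v & (v - 1)); // clear the lowest set bit
    ++n;
  }
  return n;
}

// ctpop applied after zero-extending an 8-bit value to 32 bits.
int ctpopOfZext(std::uint8_t x) {
  return popCount(static_cast<std::uint32_t>(x));
}

// ctpop applied in the narrow 8-bit type; the small result is widened after.
int zextOfCtpop(std::uint8_t x) {
  return popCount(x);
}

// Exhaustively compares both forms over every 8-bit input.
bool narrowingPreservesResult() {
  for (unsigned v = 0; v <= 0xFF; ++v) {
    auto x = static_cast<std::uint8_t>(v);
    assert(zextOfCtpop(x) <= 8); // an 8-bit value has at most 8 set bits
    if (ctpopOfZext(x) != zextOfCtpop(x))
      return false;
  }
  return true;
}
```

The check is exhaustive only for the 8-bit case; the same argument applies to
any source width, since the added high bits are always zero.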
Sanjay Patel [Thu, 29 Apr 2021 17:46:10 +0000 (13:46 -0400)]
[InstCombine] add tests for popcount with zext operand; NFC
PR50141
Tim Northover [Thu, 29 Apr 2021 18:58:51 +0000 (19:58 +0100)]
Revert "RegAlloc: do not consider liveins to EH-pad successors as liveout."
Some liveins *can* come from this block (e.g. any SSA value except the call),
it's only the ones that produce `landingpad` values that can't and I didn't
think it through properly.
Petar Avramovic [Wed, 28 Apr 2021 11:11:44 +0000 (13:11 +0200)]
AMDGPU/GlobalISel: Fix selection of image intrinsics with unused return
When the return value of an atomic image intrinsic is unused, the register class for
the destination of a sub-register copy of the return value ends up not being set.
This copy then hits the 'Register class not set' assert later.
If the return value has uses, the register class is determined by the use instruction.
The fix is to not create the sub-register copy when the image intrinsic destination has
no uses, because it would be deleted by dead-mi-elimination later anyway.
Differential Revision: https://reviews.llvm.org/D101448
Dan Liew [Wed, 28 Apr 2021 21:22:08 +0000 (14:22 -0700)]
[ASan] Rename `-fsanitize-address-destructor-kind=` to drop the `-kind` suffix.
Renaming the option is based on discussions in https://reviews.llvm.org/D101122.
It is normally not a good idea to rename driver flags but this flag is
new enough and obscure enough that it is very unlikely to have adopters.
While we're here also drop the `<kind>` metavar. It's not necessary and
is actually inconsistent with the documentation in
`clang/docs/ClangCommandLineReference.rst`.
Differential Revision: https://reviews.llvm.org/D101491
Tim Northover [Thu, 29 Apr 2021 11:31:22 +0000 (12:31 +0100)]
RegAlloc: do not consider liveins to EH-pad successors as liveout.
These registers get defined by the runtime, not the block being allocated, and
treating them as preassigned in RegAllocFast adds extra pressure, sometimes
enough to make the function unallocatable.
Victor Huang [Wed, 28 Apr 2021 18:57:16 +0000 (13:57 -0500)]
[AIX][TLS] Add ASM portion changes to support TLSGD relocations to XCOFF objects
- Add new variantKinds for the symbol's variable offset and region handle
- Print the proper relocation specifier @gd in the asm streamer when emitting
the TC Entry for the variable offset for the symbol
- Fix the switch section failure between the TC Entry of variable offset and
region handle
- Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property
Reviewed by: sfertile
Differential Revision: https://reviews.llvm.org/D100956
Roman Lebedev [Thu, 29 Apr 2021 16:20:06 +0000 (19:20 +0300)]
[SimplifyCFG] Common code sinking: fix application of profitability check
The profitability check is: we don't want to create more than a single PHI
per instruction sunk. We need to create the PHI unless we'll sink
all of its would-be incoming values.
But there is a caveat.
This profitability check doesn't converge on the first iteration!
If we first decide that we want to sink 10 instructions,
but then determine that the 5th one is unprofitable to sink,
we end up not sinking some instructions whose sinking
was the reason some other instruction
was determined to be profitable to sink, so that instruction may become unprofitable.
So we need to iterate until we converge, that is, until we determine
that all leftover instructions are profitable to sink.
But, the direct approach of just re-iterating seems dumb,
because in the worst case we'd find that the last instruction
is unprofitable, which would result in revisiting instructions
many many times.
Instead, I think we can get away with just two passes - forward and backward.
However then it isn't obvious what is the most performant way to update
InstructionsToSink.
Sam Clegg [Mon, 5 Apr 2021 15:00:30 +0000 (08:00 -0700)]
[lld][WebAssembly] Add `--export-if-defined`
Unlike the existing `--export` option, this will not cause errors
or warnings if the specified symbol is not defined.
See: https://github.com/emscripten-core/emscripten/issues/13736
Differential Revision: https://reviews.llvm.org/D99887
Mark de Wever [Sat, 27 Feb 2021 15:52:39 +0000 (16:52 +0100)]
[libc++] Fixes std::to_chars for bases != 10.
While working on D70631, Microsoft's unit tests discovered an issue.
Our `std::to_chars` implementation for bases != 10 uses the range
`[first,last)` as temporary buffer. This violates the contract for
to_chars:
[charconv.to.chars]/1 http://eel.is/c++draft/charconv#to.chars-1
`to_chars_result to_chars(char* first, char* last, see below value, int base = 10);`
"If the member ec of the return value is such that the value is equal to
the value of a value-initialized errc, the conversion was successful and
the member ptr is the one-past-the-end pointer of the characters
written."
Our implementation modifies the range `[member ptr, last)`, which causes
Microsoft's test to fail. Their test verifies the buffer
`[member ptr, last)` is unchanged. (The test is only done when the
conversion is successful.)
While looking at the code I noticed the performance for bases != 10 also
is suboptimal. This is tracked in D97705.
This patch fixes the issue and adds a benchmark. This benchmark will be
used as baseline for D97705.
Reviewed By: #libc, Quuxplusone, zoecarver
Differential Revision: https://reviews.llvm.org/D100722
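A minimal sketch of the kind of check described above, not Microsoft's actual
test: fill a buffer with a sentinel, call `std::to_chars` with a non-decimal
base, and verify that the bytes in `[ptr, last)` still hold the sentinel. The
struct and function names, the buffer size, and the '#' sentinel are all
choices made for this example.

```cpp
#include <array>
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

struct ToCharsCheck {
  bool ok;            // conversion reported success
  std::string text;   // characters written, i.e. [first, ptr)
  bool tailUntouched; // true if [ptr, last) still holds the sentinel
};

// Fills a buffer with a sentinel, converts `value` in `base`, then inspects
// the bytes past the returned pointer.
ToCharsCheck checkToChars(unsigned value, int base) {
  assert(base >= 2 && base <= 36); // precondition of std::to_chars
  std::array<char, 40> buf;        // large enough for 32 binary digits
  buf.fill('#');
  char *first = buf.data();
  char *last = first + buf.size();

  std::to_chars_result res = std::to_chars(first, last, value, base);

  ToCharsCheck out{res.ec == std::errc(), std::string(), true};
  if (out.ok)
    out.text.assign(first, res.ptr);
  for (const char *p = res.ptr; p != last; ++p)
    if (*p != '#')
      out.tailUntouched = false;
  return out;
}
```

With an implementation that uses `[first, last)` as scratch space, `text`
would still be correct but `tailUntouched` would come back false, which is the
failure mode this commit describes.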
Petr Hosek [Thu, 29 Apr 2021 17:18:02 +0000 (10:18 -0700)]
[CMake] Set correct CXX_FLAGS for relative-vtables variants
We override CXX_FLAGS to enable relative vtables, but doing so
overwrites the generic Fuchsia CXX_FLAGS, leading to a build failure
on Windows.
Differential Revision: https://reviews.llvm.org/D101551
Raphael Isemann [Thu, 29 Apr 2021 17:13:21 +0000 (19:13 +0200)]
[lldb] Make the NSSet formatter faster and less prone to infinite recursion
Right now, to get the `NSSet *` pointer value we first dereference it and then take
the address of the result.
Besides being inefficient, this can potentially cause an infinite recursion if the
`pointer` value we get is a pointer of a type that the TypeSystem can't
dereference. If the pointer is, for example, some form of `void *` that the dynamic
type resolution can't resolve to an actual type, then the `Dereference` call goes
back to asking the formatters how to dereference it. If the NSSet formatter then
checks if it's an NSSet variation under the hood, we end up recursing
infinitely.
In practice this seems to happen with some form of Builtin.RawPointer we get
from a NSDictionary in Swift.
FWIW, no other formatter is doing the same deref->addressOf as here and there
doesn't seem to be any specific reason to do so in the git history (it's just
part of the initial formatter commit)
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D101537
LLVM GN Syncbot [Thu, 29 Apr 2021 16:59:58 +0000 (16:59 +0000)]
[gn build] Port
df323ba445f7
Benjamin Kramer [Thu, 29 Apr 2021 16:51:34 +0000 (18:51 +0200)]
Revert "[X86] Support AMX fast register allocation"
This reverts commit
3b8ec86fd576b9808dc63da620d9a4f7bbe04372.
Revert "[X86] Refine AMX fast register allocation"
This reverts commit
c3f95e9197643b699b891ca416ce7d72cf89f5fc.
This pass breaks using LLVM in a multi-threaded environment by
introducing global state.
Vitaly Buka [Thu, 29 Apr 2021 16:55:28 +0000 (09:55 -0700)]
Revert "[scudo] Use require_constant_initialization"
This reverts commit
7ad4dee3e733d820115f44cecce73ceb64c76450.
Sanjay Patel [Thu, 29 Apr 2021 16:53:26 +0000 (12:53 -0400)]
[ConstantFolding] propagate poison through vector reduction intrinsics
Martin Storsjö [Wed, 28 Apr 2021 08:08:12 +0000 (11:08 +0300)]
[libcxx] [test] Include more libraries that normally are linked automatically
As the libcxx tests link with -nostdlib, libraries that normally
are added by default by the compiler driver have to be added
manually.
The "oldnames" library is automatically added when driving linking
with clang-cl. When linking with the plain clang driver, as the
libcxx tests do, the clang driver does the same (but only since Clang
12.0). But when linking with -nostdlib, like the libcxx tests do,
the driver defaults aren't added at all, and we need to specify the
defaults manually.
This allows removing a TODO from the Windows CI setup; it turns out
that upgrading to Clang 12.0 didn't help here as expected, sorry about
that mixup.
Differential Revision: https://reviews.llvm.org/D101434
Vitaly Buka [Thu, 29 Apr 2021 08:19:51 +0000 (01:19 -0700)]
[scudo] Use require_constant_initialization
The attribute guarantees safe static initialization of globals.
Differential Revision: https://reviews.llvm.org/D101514
Craig Topper [Thu, 29 Apr 2021 16:39:21 +0000 (09:39 -0700)]
[RISCV] Teach DAG combine to fold (and (select_cc lhs, rhs, cc, -1, c), x) -> (select_cc lhs, rhs, cc, x, (and x, c))
Similar for or/xor with 0 in place of -1.
This is the canonical form produced by InstCombine for something like `c ? x & y : x;`. Since we have to use control flow to expand select, we'll usually end up with a mv in a basic block. By folding this we may be able to pull the and/or/xor into the block instead and avoid a mv instruction.
The code here is based on code from ARM that uses this to create predicated instructions. I'm doing it on SELECT_CC so it happens late, but we could do it on select earlier which is what ARM does. I'm not sure if we lose any combine opportunities if we do it earlier.
I left out add and sub because this can separate sext.w from the add/sub. It also made a conditional i64 addition/subtraction on RV32 worse. I guess both of those would be fixed by doing this earlier on select.
The select-binop-identity.ll test has not been committed yet, but I made the diff show the changes to it.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D101485
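The fold in the title rests on a plain bitwise identity, which can be checked
outside of SelectionDAG. The sketch below models select_cc with a `bool cond`
and the ternary operator; the function names are hypothetical and nothing here
is the actual DAG combine code.

```cpp
#include <cassert>
#include <cstdint>

// Before the fold: the select picks the identity value (-1 for and, 0 for
// or/xor) or the constant c, and the binary op is applied afterwards.
std::uint32_t andBefore(bool cond, std::uint32_t c, std::uint32_t x) {
  return (cond ? ~std::uint32_t{0} : c) & x;
}
std::uint32_t orBefore(bool cond, std::uint32_t c, std::uint32_t x) {
  return (cond ? std::uint32_t{0} : c) | x;
}
std::uint32_t xorBefore(bool cond, std::uint32_t c, std::uint32_t x) {
  return (cond ? std::uint32_t{0} : c) ^ x;
}

// After the fold: the select picks x unchanged or the binary op of x and c.
std::uint32_t andAfter(bool cond, std::uint32_t c, std::uint32_t x) {
  return cond ? x : (x & c);
}
std::uint32_t orAfter(bool cond, std::uint32_t c, std::uint32_t x) {
  return cond ? x : (x | c);
}
std::uint32_t xorAfter(bool cond, std::uint32_t c, std::uint32_t x) {
  return cond ? x : (x ^ c);
}

// True if all three folds give the same result before and after.
bool foldIsEquivalent(bool cond, std::uint32_t c, std::uint32_t x) {
  return andBefore(cond, c, x) == andAfter(cond, c, x) &&
         orBefore(cond, c, x) == orAfter(cond, c, x) &&
         xorBefore(cond, c, x) == xorAfter(cond, c, x);
}
```

In the "after" form the binary op only runs on one side of the select, which
is what lets it move into the conditionally executed block.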
Craig Topper [Thu, 29 Apr 2021 16:15:08 +0000 (09:15 -0700)]
[RISCV] Add test cases for D101485. NFC
Alexey Bataev [Tue, 20 Apr 2021 14:05:30 +0000 (07:05 -0700)]
[COST] Improve shuffle kind detection if shuffle mask is provided.
Added an extra analysis for better choosing of shuffle kind in
getShuffleCost functions for better cost estimation if mask was
provided.
Differential Revision: https://reviews.llvm.org/D100865
Fangrui Song [Thu, 29 Apr 2021 16:37:58 +0000 (09:37 -0700)]
[unittest] Fix Frontend/OpenMPIRBuilderTest.cpp -Wsign-compare after D89671
Fangrui Song [Thu, 29 Apr 2021 16:35:48 +0000 (09:35 -0700)]
[DebugInfo] Add tests that we emit .eh_frame instead of .debug_frame
Add tests which can catch the issue in
0ce723cb228bc1d1a0f5718f3862fb836145a333
(If any function needs CFISection::EH, the module should use CFISection::EH).
Reviewed By: echristo
Differential Revision: https://reviews.llvm.org/D101339
Sanjay Patel [Thu, 29 Apr 2021 16:20:59 +0000 (12:20 -0400)]
[ConstProp] add tests for vector reductions of poison; NFC
Sanjay Patel [Thu, 29 Apr 2021 16:07:51 +0000 (12:07 -0400)]
[ConstantFolding] refactor helper for vector reductions; NFC
We should handle other cases (undef/poison), so reduce
the duplication of repeated switches.
Sanjay Patel [Thu, 29 Apr 2021 14:43:11 +0000 (10:43 -0400)]
[ADT] fix typo in code block comment; NFC
Anirudh Prasad [Thu, 29 Apr 2021 15:27:56 +0000 (11:27 -0400)]
[AsmParser][SystemZ][z/OS] Reject "Dot" as current PC on z/OS
- Currently, the "." (Dot) character, when not identifying an Identifier or a Constant, refers to the current PC (Program Counter)
- However, in z/OS, for the HLASM dialect, it strictly accepts only the "*" as the current PC (Support for this will be put up in a follow-up patch)
- The changes in this patch allow individual platforms to choose whether they would like to use the "." (Dot) character as a marker for the current PC or not.
- It is achieved by introducing a new field in MCAsmInfo.h called `DotIsPC` (similar to `DollarIsPC`)
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D100975
Fangrui Song [Thu, 29 Apr 2021 15:51:09 +0000 (08:51 -0700)]
[ELF] Support .rela.eh_frame with unordered r_offset values
GNU ld -r can create .rela.eh_frame with unordered r_offset values.
(With LLD, we can craft such a case by reordering sections in .eh_frame.)
This is currently unsupported and will trigger
`assert(pieces[i].inputOff <= off ...` in `OffsetGetter::get`
(the content is corrupted in a -DLLVM_ENABLE_ASSERTIONS=off build).
This patch supports this case.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D101116
Craig Topper [Thu, 29 Apr 2021 15:10:39 +0000 (08:10 -0700)]
[RISCV] Enable SPLAT_VECTOR for fixed vXi64 types on RV32.
This replaces D98479.
This allows type legalization to form SPLAT_VECTOR_PARTS so we don't
lose the splattedness when the scalar type is split.
I'm handling SPLAT_VECTOR_PARTS for fixed vectors separately so
we can continue using non-VL nodes for scalable vectors.
I limited to RV32+vXi64 because DAGCombiner::visitBUILD_VECTOR likes
to form SPLAT_VECTOR before seeing if it can replace the BUILD_VECTOR
with other operations. Especially interesting is a splat BUILD_VECTOR of
the extract_vector_elt which can become a splat shuffle, but won't if
we form SPLAT_VECTOR first. We either need to reorder visitBUILD_VECTOR
or add visitSPLAT_VECTOR.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D100803
Craig Topper [Thu, 29 Apr 2021 15:00:10 +0000 (08:00 -0700)]
[RISCV] Teach computeKnownBits that vsetvli returns number less than 2^31.
This seems like a reasonable upper bound on VL. WG discussions for
the V spec would probably allow us to use 2^16 as an upper bound
on VLEN, but this is good enough for now.
This allows us to remove sext and zext if user happens to assign
the size_t result into an int and then uses it as a VL intrinsic
argument which is size_t.
Reviewed By: frasercrmck, rogfer01, arcbbb
Differential Revision: https://reviews.llvm.org/D101472
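To see why the bound matters, consider a 64-bit target where the size_t result
is stored in an int and later passed back as size_t. That is a truncate to 32
bits followed by a sign-extend, and it is a no-op exactly when bit 31 and above
are zero. The sketch below models this with fixed-width integers; the function
names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Models `int n = vl; use((size_t)n);` on a 64-bit target: truncate to
// 32 bits, reinterpret as signed, then sign-extend back to 64 bits.
// (The unsigned-to-signed conversion is modular on mainstream compilers
// and is defined that way from C++20 on.)
std::uint64_t throughInt(std::uint64_t vl) {
  auto narrowed = static_cast<std::int32_t>(static_cast<std::uint32_t>(vl));
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(narrowed));
}

// True if the round trip through int leaves the value unchanged.
bool roundTripIsIdentity(std::uint64_t vl) {
  return throughInt(vl) == vl;
}
```

Any value below 2^31 survives the round trip, so once known bits prove that
range the extension can be dropped; 2^31 itself is the first value that does
not survive.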
Sander de Smalen [Thu, 29 Apr 2021 14:37:57 +0000 (15:37 +0100)]
Revert "[LV] Calculate max feasible scalable VF."
Temporarily reverting this patch due to some unexpected issue found
by one of the PPC buildbots.
This reverts commit
584e9b6e4b4987b882719923e640eed854613d91.
Jay Foad [Thu, 29 Apr 2021 15:03:00 +0000 (16:03 +0100)]
[AMDGPU] Add a v_swap_b32 test case to be fixed
Chirag Khandelwal [Thu, 29 Apr 2021 13:36:07 +0000 (19:06 +0530)]
[Clang][OpenMP] Frontend work for sections - D89671
This patch is a child of D89671 and contains the clang
implementation to use the OpenMP IRBuilder's section
construct.
Co-author: @anchu-rajendran
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D91054
David Zarzycki [Thu, 29 Apr 2021 14:01:37 +0000 (10:01 -0400)]
Unbreak no-asserts testing
Anastasia Stulova [Thu, 29 Apr 2021 13:08:21 +0000 (14:08 +0100)]
[OpenCL][Docs] Misc updates to C++ for OpenCL and offline compilation
Differential Revision: https://reviews.llvm.org/D101092
Chirag Khandelwal [Thu, 29 Apr 2021 13:08:24 +0000 (18:38 +0530)]
[LLVM][OpenMP] Adding support for OpenMP sections construct in OpenMPIRBuilder
This patch adds section support in the OpenMP IRBuilder module, along with a test for the same.
Reviewed By: fghanim
Differential Revision: https://reviews.llvm.org/D89671
Anastasia Stulova [Thu, 29 Apr 2021 13:02:29 +0000 (14:02 +0100)]
[OpenCL][Docs] Describe extension for legacy atomics with generic addr space.
This extension is primarily targeting SPIR-V compilations flow
as the IR translation is the same between 1.x and 2.x atomics.
Differential Revision: https://reviews.llvm.org/D101089
Florian Hahn [Thu, 29 Apr 2021 12:17:37 +0000 (13:17 +0100)]
[VPlan] Add getVPSingleValue helper.
As suggested in D99294, this adds a getVPSingleValue helper to use for
recipes that are guaranteed to define a single value. This replaces uses
of getVPValue() which used to default to I = 0.
Arnamoy Bhattacharyya [Thu, 29 Apr 2021 12:29:58 +0000 (08:29 -0400)]
[flang][OpenMP] Add semantic checks for strict nesting inside `teams` construct.
Alex Zinenko [Thu, 29 Apr 2021 11:26:54 +0000 (13:26 +0200)]
[mlir] fix shared-lib build
Bradley Smith [Fri, 23 Apr 2021 15:34:26 +0000 (16:34 +0100)]
[AArch64][SVE] Use SIMD variant of INSR when scalar is the result of a vector extract
At the intrinsic layer the sve.insr operation takes a scalar. When this
scalar is an integer we are forcing a data transition between GPRs and
ZPRs that is potentially costly.
Often the integer scalar is the result of a vector extract, when
performing a reduction for example. In such cases we should keep all
data within the ZPRs.
Co-authored-by: Paul Walker <paul.walker@arm.com>
Differential Revision: https://reviews.llvm.org/D101169
Bradley Smith [Fri, 23 Apr 2021 12:55:42 +0000 (13:55 +0100)]
[AArch64][SVE] Convert svdup(vec, SV_VL1, elm) to insertelement(vec, elm, 0)
By converting the SVE intrinsic to a normal LLVM insertelement we give
the code generator a better chance to remove transitions between GPRs
and VPRs
Co-authored-by: Paul Walker <paul.walker@arm.com>
Depends on D101302
Differential Revision: https://reviews.llvm.org/D101167
Bradley Smith [Mon, 26 Apr 2021 15:19:25 +0000 (16:19 +0100)]
[AArch64][SVE] Move convert.{from,to}.svbool optimization into InstCombine
As part of this the ptrue coalescing done in SVEIntrinsicOpts has been
modified to not introduce redundant converts, since the convert removal
will no longer run after that optimisation to clean up.
Differential Revision: https://reviews.llvm.org/D101302
Alex Zinenko [Thu, 29 Apr 2021 11:15:21 +0000 (13:15 +0200)]
[mlir] support max/min lower/upper bounds in affine.parallel
This makes it possible to express more complex parallel loops in the affine framework,
for example, in cases of tiling by sizes not dividing loop trip counts perfectly
or inner wavefront parallelism, among others. One can't use affine.max/min
and supply values to the nested loop bounds since the results of such
affine.max/min operations aren't valid symbols. Making them valid symbols
isn't an option since they would introduce selection trees into memref
subscript arithmetic as an unintended and undesired consequence. Also
add support for converting such loops to SCF. Drop some API that isn't used in
the core repo from AffineParallelOp since its semantics become ambiguous in
the presence of max/min bounds. Loop normalization is currently unavailable for
such loops.
Depends On D101171
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D101172
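The tiling case mentioned above is the usual reason a min shows up in a loop
bound. As a plain C++ analogy (not MLIR, and not code from this patch), the
inner loop's upper bound is min(outer + tile, n), so the last partial tile does
not run past the end. The function name is made up for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Visits [0, n) in tiles of `tile` elements and counts how often each index
// is touched. The inner upper bound is min(outer + tile, n).
std::vector<int> visitTiled(int n, int tile) {
  assert(n >= 0 && tile > 0);
  std::vector<int> visits(static_cast<std::size_t>(n), 0);
  for (int outer = 0; outer < n; outer += tile) {
    int upper = std::min(outer + tile, n); // the "min" upper bound
    for (int i = outer; i < upper; ++i)
      ++visits[static_cast<std::size_t>(i)];
  }
  return visits;
}
```

With n = 10 and tile = 4 the tiles are [0,4), [4,8) and [8,10); without the
min, the third tile would cover indices 10 and 11 as well.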
Alex Zinenko [Thu, 29 Apr 2021 11:14:58 +0000 (13:14 +0200)]
[mlir] Affine: parallelize affine loops with reductions
Introduce basic support for parallelizing affine loops with reductions
expressed using iteration arguments. Affine parallelism detector now has a flag
to assume such reductions are parallel. The transformation handles a subset of
parallel reductions that can be expressed using affine.parallel:
integer/float addition and multiplication. This requires detecting the
reduction operation since affine.parallel only supports a fixed set of
reduction operators.
Reviewed By: chelini, kumasento, bondhugula
Differential Revision: https://reviews.llvm.org/D101171
Lorenzo Chelini [Thu, 29 Apr 2021 11:06:40 +0000 (13:06 +0200)]
[mlir] Fix top-level comments (NFC)
Nathan Sidwell [Wed, 28 Apr 2021 11:03:11 +0000 (04:03 -0700)]
Update libstdc++ hack comment
This libstdc++ hack isn't ready for removal. Updating the comment to
note what I found. While I have not proven Ville's
__is_throw_swappable patch made this go away, that patch did remove
the use of noexcept(noexcept(swap(....))). I'm not sure when gcc grew
deferred noexcept parsing.
Differential Revision: https://reviews.llvm.org/D101441
Sebastian Neubauer [Thu, 29 Apr 2021 10:52:29 +0000 (12:52 +0200)]
[AMDGPU] Allow buildSpillLoadStore in empty bb
This allows calling buildSpillLoadStore for an empty basic block, where
MI points at the end of the block instead of to an instruction.
This only happens with downstream CFI changes, so I was not able to
create a testcase that works with upstream LLVM.
Differential Revision: https://reviews.llvm.org/D101356
Amara Emerson [Thu, 29 Apr 2021 10:40:50 +0000 (03:40 -0700)]
Try to fix bots. We shouldn't be setting the entrybuilder's DL to a null one.
This was causing a DILocation verifier error; the old code path didn't try to do
this when building constants via the finishPendingPhis() method.
Fraser Cormack [Thu, 29 Apr 2021 10:38:10 +0000 (11:38 +0100)]
[RISCV][NFC] Combine identical RV32 and RV64 test checks
Tres Popp [Thu, 29 Apr 2021 10:20:28 +0000 (12:20 +0200)]
[mlir] Add LinalgTransforms dependency on Complex
David Green [Thu, 29 Apr 2021 09:59:14 +0000 (10:59 +0100)]
[ARM] Ensure CSINC has one use in CSINV combine
Otherwise the CMP glue may be used in multiple nodes, needing to be
emitted multiple times. Currently this either increases instruction
count or fails as it attempts to insert the same node multiple times.
Tres Popp [Tue, 6 Apr 2021 10:39:07 +0000 (12:39 +0200)]
[mlir] Support complex numbers in Linalg promotion
FillOp allows complex ops, and filling a properly sized buffer with
a default zero complex number is implemented.
Differential Revision: https://reviews.llvm.org/D99939
Serguei Katkov [Thu, 29 Apr 2021 05:22:19 +0000 (12:22 +0700)]
[Greedy RA] Replace the ll test with a mir test to make the error check more stable.
Alex Zinenko [Thu, 22 Apr 2021 15:32:10 +0000 (17:32 +0200)]
[mlir] Split out Python bindings entry point into a separate file
This will allow the bindings to be built as a library and reused in out-of-tree
projects that want to provide bindings on top of MLIR bindings.
Reviewed By: stellaraccident, mikeurbach
Differential Revision: https://reviews.llvm.org/D101075
David Spickett [Thu, 29 Apr 2021 08:54:03 +0000 (09:54 +0100)]
[NVPTX] Fix unused var warning with asserts disabled
<...>/llvm-project/llvm/lib/Target/NVPTX/NVPTXLowerArgs.cpp:191:15:
warning: unused variable ‘ASC’ [-Wunused-variable]
191 | if (auto *ASC =
dyn_cast<AddrSpaceCastInst>(I.OldInstruction)) {
| ^~~
Nick Lewycky [Wed, 28 Apr 2021 20:15:39 +0000 (13:15 -0700)]
Improve error messages for attributes in the wrong context.
verifyFunctionAttrs has a comment that the value V is printed in error messages. The recently added errors for attributes didn't print V. Make them print V.
Change the stringification of AttributeList. Firstly they started with 'PAL[' which stood for ParamAttrsList. Change that to 'AttributeList[' matching its current name AttributeList. Print out semantic meaning of the index instead of the raw index value (i.e. 'return', 'function' or 'arg(n)').
Differential revision: https://reviews.llvm.org/D101484
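A small sketch of the index-to-name mapping described above. The three
constants mirror the AttributeList index conventions (ReturnIndex,
FunctionIndex, FirstArgIndex) as I understand them; the function name is
hypothetical, and treating the n in 'arg(n)' as the zero-based argument number
is an assumption of this example rather than something the message states.

```cpp
#include <cassert>
#include <string>

// Assumed to match llvm::AttributeList's index conventions.
constexpr unsigned kReturnIndex = 0;
constexpr unsigned kFunctionIndex = ~0u;
constexpr unsigned kFirstArgIndex = 1;

// Returns a human-readable name for an attribute list index instead of the
// raw number.
std::string describeAttrIndex(unsigned index) {
  if (index == kReturnIndex)
    return "return";
  if (index == kFunctionIndex)
    return "function";
  assert(index >= kFirstArgIndex);
  return "arg(" + std::to_string(index - kFirstArgIndex) + ")";
}
```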
Qiu Chaofan [Thu, 29 Apr 2021 08:28:34 +0000 (16:28 +0800)]
[SPE] Support constrained float operations on SPE
This patch enables support on SPE for constrained arithmetic and
comparison operations. This fixes bugzilla 50070.
One thing not covered is fcmp vs. fcmps on SPE. Some condition codes
generate signaling comparisons while others do not. In this patch, all are
considered signaling. So there might still be some issues when
compiling from C code.
Reviewed By: jhibbits
Differential Revision: https://reviews.llvm.org/D101282
David Spickett [Wed, 14 Apr 2021 16:11:26 +0000 (17:11 +0100)]
[lldb][AArch64] Don't check for VmFlags in smaps files
AArch64 kernel builds default to having /smaps and
the "VmFlags" line was added in 3.8, long before MTE
was supported.
So we can assume that if you're AArch64 with MTE,
you can run this test.
The previous method of checking had a race condition
where the process we read smaps for could finish before
we get to read the file.
I explored some alternatives but in the end I think
it's fine to just assume we have what we need.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D100493
Vitaly Buka [Thu, 29 Apr 2021 08:23:44 +0000 (01:23 -0700)]
[NFC][scudo] Suppress "division by zero" warning
Fraser Cormack [Mon, 22 Mar 2021 15:54:04 +0000 (15:54 +0000)]
[RISCV] Fix stack slot for argument types (Bug 49500)
This is a complementary/alternative fix for D99068. It takes a slightly
different approach by explicitly summing up all of the required split
part type sizes and ensuring we allocate enough space for them. It also
takes the maximum alignment of each part.
Compared with D99068 there are fewer changes to the stack objects in
existing tests. However, @luismarques has shown in that patch that there
are opportunities to reduce our stack usage in the future.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D99087
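The sizing rule described above can be sketched in a few lines: add up the
sizes of the split parts and take the largest alignment among them. The types
and names below are invented for illustration and the sketch ignores any
padding between parts; the real logic lives in the RISC-V lowering code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one split part of an argument.
struct PartType {
  unsigned sizeBytes;
  unsigned alignBytes;
};

// Hypothetical description of the stack slot to allocate.
struct StackSlot {
  unsigned sizeBytes;
  unsigned alignBytes;
};

// Sums the part sizes and takes the maximum part alignment.
StackSlot slotForParts(const std::vector<PartType> &parts) {
  StackSlot slot{0, 1};
  for (const PartType &p : parts) {
    assert(p.alignBytes != 0); // an alignment of zero is not meaningful
    slot.sizeBytes += p.sizeBytes;
    slot.alignBytes = std::max(slot.alignBytes, p.alignBytes);
  }
  return slot;
}
```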
Frederik Gossen [Thu, 29 Apr 2021 08:07:20 +0000 (10:07 +0200)]
[MLIR][Shape] Fix `shape.broadcast` to standard lowering
Differential Revision: https://reviews.llvm.org/D101456
Marek Kurdej [Thu, 29 Apr 2021 08:06:46 +0000 (10:06 +0200)]
[clang-format] Fix build on gcc < 7 introduced in rG9363aa9.
This fixes another bogus build error on gcc, e.g. https://lab.llvm.org/buildbot/#/builders/110/builds/2974.
/home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/clang/lib/Format/TokenAnnotator.cpp:3412:34: error: binding ‘const clang::format::FormatStyle’ to reference of type ‘clang::format::FormatStyle&’ discards qualifiers
auto ShouldAddSpacesInAngles = [&Style = this->Style,
^
Amara Emerson [Thu, 29 Apr 2021 07:59:12 +0000 (00:59 -0700)]
[GlobalISel] Bump CallLoweringInfo::OrigArgs initial size to 32. NFC.
We spend some time during sqlite3 compilation regrowing this vector,
bump it up to avoid this.
Gives around 1-2% improvement in codegen-only time for sqlite3 at -O0.
Fraser Cormack [Wed, 28 Apr 2021 15:23:13 +0000 (16:23 +0100)]
[Utils][vim] Highlight 'vscale' constant
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D101466
Marek Kurdej [Thu, 29 Apr 2021 07:56:11 +0000 (09:56 +0200)]
[clang-format] Fix build on gcc < 7 introduced in rG9363aa9.
This fixes a bogus build error on gcc, e.g. https://lab.llvm.org/buildbot/#/builders/110/builds/2973.
/home/ssglocal/clang-cmake-x86_64-avx2-linux/clang-cmake-x86_64-avx2-linux/llvm/clang/lib/Format/TokenAnnotator.cpp:3097:53: error: binding ‘const clang::SourceRange’ to reference of type ‘clang::SourceRange&’ discards qualifiers
auto HasExistingWhitespace = [&Whitespace = Right.WhitespaceRange]() {
^
Nicolas Vasilache [Wed, 28 Apr 2021 21:52:30 +0000 (21:52 +0000)]
[mlir][Linalg] Generalize linalg vectorization
This revision adds support for vectorizing more general linalg operations with projected permutation maps.
This is achieved by eagerly broadcasting the intermediate vector to the common size
of the iteration domain of the linalg op. This allows a much more natural expression of
generalized vectorization but may introduce additional computations until all the
proper canonicalizations are implemented.
This generalization modifies the vector.transfer_read/write permutation logic and
exposes the fact that the logic employed in vector.contract was too ad-hoc.
As a consequence, changes occur in the permutation / transposition logic for contraction. In turn this prompts supporting more cases in the lowering of contract
to matrix intrinsics, which is required to make the corresponding tests pass.
Differential revision: https://reviews.llvm.org/D101165
Sjoerd Meijer [Thu, 29 Apr 2021 07:29:57 +0000 (08:29 +0100)]
Follow up of rGddb3b26a1269: added 'requires asserts' to test case.
Harald van Dijk [Thu, 29 Apr 2021 07:33:22 +0000 (08:33 +0100)]
[X32][CET] Fix handling of indirect branches
As X32 uses 32-bit pointers without having 32-bit indirect branch
instructions, we need to fix up indirect branches by extending the
branch targets to 64 bits. This was already done for BRIND but not yet
for NT_BRIND. The same logic works for both, so this applies that
existing logic to NT_BRIND as well.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D101499
Evgeny Leviant [Thu, 29 Apr 2021 07:29:42 +0000 (10:29 +0300)]
[NewPM] Add an option to dump pass structure
Patch adds -debug-pass-structure option to dump pass structure when
new pass manager is used.
Differential revision: https://reviews.llvm.org/D99599
Tobias Gysi [Thu, 29 Apr 2021 06:45:34 +0000 (06:45 +0000)]
[mlir][Python][Linalg] Adding const, capture, and index support to the OpDSL.
The patch extends the OpDSL with support for:
- Constant values
- Capture scalar parameters
- Access the iteration indices using the index operation
- Provide predefined floating point and integer types.
Up to now the patch only supports emitting the new nodes. The C++/yaml path is not fully implemented. The fill_rng_2d operation defined in emit_structured_generic.py makes use of the new DSL constructs.
Differential Revision: https://reviews.llvm.org/D101364
Marek Kurdej [Thu, 29 Apr 2021 06:57:33 +0000 (08:57 +0200)]
[clang-format] Add `SpacesInAngles: Leave` option to keep spacing inside angle brackets as is.
A need for such an option came up in a few libc++ reviews. That's because libc++ has both code in C++03 and newer standards.
Currently, it uses `Standard: C++03` setting for clang-format, but this breaks e.g. u8"string" literals.
Also, angle brackets are the only place where C++03-specific formatting needs to be applied.
Reviewed By: MyDeveloperDay, HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D101344
Amara Emerson [Thu, 29 Apr 2021 06:16:54 +0000 (23:16 -0700)]
[GlobalISel][IRTranslator] Move line zero DebugLoc creation to constant translation. NFC.
This is a compile-time optimization. DILocation::get() is expensive to call, and
we were calling it to create a line zero debug loc for *every* instruction we
translated. We only really need to do this just before we build constants in the
entry block, so I moved this code there. This reduces the LLVM -O0 codegen time
of sqlite3 IR by around 0.7% in instructions executed and by about 2% in CPU time.
We can probably do better with a more involved change, since the reason we need
to create one for each new constant is that we're using the debug scope and
inlined-at loc. If we just use a single instruction's scope and drop the
inlined-at, we can just cache these and have them be free.
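The effect of the move described above can be pictured with a toy model. This is a minimal Python sketch, not the IRTranslator code: `make_line_zero_loc`, `translate_eager`, `translate_lazy`, and the dictionary-based instruction format are all hypothetical names invented for illustration. It only shows that creating the location just before a constant is built produces the same constants with fewer expensive calls.

```python
def make_line_zero_loc(stats, scope):
    """Hypothetical stand-in for an expensive DILocation::get()-style call."""
    stats["calls"] += 1
    return (0, scope)  # (line, scope)


def translate_eager(instructions, stats):
    """Old shape: create a line-zero location for every instruction."""
    constants = []
    for inst in instructions:
        loc = make_line_zero_loc(stats, inst["scope"])  # paid even if unused
        for value in inst["constants"]:
            constants.append((value, loc))
    return constants


def translate_lazy(instructions, stats):
    """New shape: create the location only just before building a constant."""
    constants = []
    for inst in instructions:
        for value in inst["constants"]:
            loc = make_line_zero_loc(stats, inst["scope"])
            constants.append((value, loc))
    return constants
```

In this model an instruction that materializes no constants costs nothing in the lazy variant, while each new constant still pays for one lookup, which matches the remaining cost the commit message mentions.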
David Green [Thu, 29 Apr 2021 06:44:04 +0000 (07:44 +0100)]
[ARM] Use just ARM::t2B in ARMBlockPlacementPass
The ARMConstantIsland pass will convert any t2B to tB if they are within
range after it has added or moved any constant pools. They don't need to
be deliberately converted beforehand, and the pass doesn't deal very well
with needing to convert tB to t2B.
Dmitry Vyukov [Wed, 28 Apr 2021 06:36:03 +0000 (08:36 +0200)]
tsan: fix warnings in tests
Fix format specifier.
Fix warnings about non-standard attribute placement.
Make free_race2.c test a bit more interesting:
test access with/without an offset.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101424
Reshabh Sharma [Thu, 29 Apr 2021 05:34:23 +0000 (11:04 +0530)]
[ASAN] NFC: Use addrspace cast for pointers in non-zero addrspace
Pointers in non-zero address spaces need to be address-space cast
before being appended to the used list.
Reviewed by: vitalybuka
Differential Revision: https://reviews.llvm.org/D101363
Dmitry Vyukov [Fri, 23 Apr 2021 13:47:21 +0000 (15:47 +0200)]
tsan: increase dense slab alloc capacity
We've got a user report about heap block allocator overflow.
Bump the L1 capacity of all dense slab allocators to maximum
and be careful to not page the whole L1 array in from .bss.
If the OS uses huge pages, this may still cause a limited RSS increase
due to boundary huge pages, but avoiding that looks hard.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101161
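As a rough picture of why raising the L1 capacity raises the overall limit, here is a minimal Python sketch of a two-level table. It is a toy model under stated assumptions, not the tsan allocator: the class `TwoLevelTable`, its methods, and the sizes used are hypothetical, and Python cannot model paging from .bss. It only shows that total capacity is the L1 size times the L2 block size and that L2 blocks are created on demand.

```python
class TwoLevelTable:
    """Toy two-level index space: l1_size slots, each an l2_size block."""

    def __init__(self, l1_size, l2_size):
        self.l1_size = l1_size
        self.l2_size = l2_size
        self.l1 = [None] * l1_size  # slots only; no L2 block exists yet
        self.next_index = 0

    def capacity(self):
        # Total number of indices the table can ever hand out.
        return self.l1_size * self.l2_size

    def alloc(self):
        """Hand out the next index, creating its L2 block on first use."""
        if self.next_index >= self.capacity():
            raise MemoryError("allocator overflow")
        index = self.next_index
        self.next_index += 1
        l1_slot = index // self.l2_size
        if self.l1[l1_slot] is None:
            self.l1[l1_slot] = [None] * self.l2_size
        return index

    def allocated_blocks(self):
        # Number of L2 blocks that have actually been created.
        return sum(block is not None for block in self.l1)
```

With `TwoLevelTable(4, 2)`, three allocations touch only two of the four L1 slots; the untouched slots stay empty, which is the property the commit tries to preserve for the real L1 array.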
Reshabh Sharma [Thu, 29 Apr 2021 04:49:12 +0000 (10:19 +0530)]
[ASAN] NFC: Copy address space when creating globals with redzones
This patch makes sure that globals in supported address spaces
will be replaced by globals with red zones in the same address
space by copying the address space.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D101362
Dan Liew [Wed, 28 Apr 2021 21:06:28 +0000 (14:06 -0700)]
[NFC] Rename SanitizeAddressDtorKind codegen opt to not have `Kind` suffix.
This is a post-commit follow-up based on discussions in
https://reviews.llvm.org/D101122.
Differential Revision: https://reviews.llvm.org/D101490
(cherry picked from commit
f4c7e82d1b21e637c4e0c53125b126c407d8bdbf)
Mike Urbach [Sat, 24 Apr 2021 02:54:04 +0000 (20:54 -0600)]
[mlir][python] Add `destroy` method to PyOperation.
This adds a method to directly invoke `mlirOperationDestroy` on the
MlirOperation wrapped by a PyOperation.
Reviewed By: stellaraccident, mehdi_amini
Differential Revision: https://reviews.llvm.org/D101422
Roland McGrath [Wed, 28 Apr 2021 23:45:20 +0000 (16:45 -0700)]
[gwp_asan] Use __sanitizer_fast_backtrace on Fuchsia
Reviewed By: phosek, cryptoad, hctim
Differential Revision: https://reviews.llvm.org/D101407
John Demme [Wed, 28 Apr 2021 23:16:45 +0000 (16:16 -0700)]
[mlir] Move PyConcreteType to header. NFC.
This allows out-of-tree users to derive from PyConcreteType to bind custom
types.
The Type version of https://reviews.llvm.org/D101063/new/
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D101496
Alexander Shaposhnikov [Wed, 28 Apr 2021 23:27:53 +0000 (16:27 -0700)]
[llvm-objcopy][MachO] Add support for LC_THREAD/LC_UNIXTHREAD
Add support for LC_THREAD/LC_UNIXTHREAD
(these load commands can be copied over without any modifications).
Test plan: make check-all
Differential revision: https://reviews.llvm.org/D101384
Craig Topper [Wed, 28 Apr 2021 22:43:44 +0000 (15:43 -0700)]
[TableGen] Remove predicate filtering from GenerateVariants.
After D100691, predicates should be cheap to compare again so
we don't need to filter anymore.
This is mostly just a revert of several patches going back to 2018.
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D100695
Amanieu d'Antras [Mon, 12 Apr 2021 17:05:18 +0000 (18:05 +0100)]
[ConstantMerge] Don't merge thread_local constants with non-thread_local constants
Fixes PR49932
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D100322
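The rule in the entry above can be sketched as a grouping step. This is a minimal Python model, not the ConstantMerge pass: the function `merge_groups` and the `(name, initializer, is_thread_local)` tuple format are hypothetical. That thread_local constants may still be merged with each other is an assumption of this sketch, not something the commit message states.

```python
from collections import defaultdict


def merge_groups(constants):
    """Group mergeable constants.

    `constants` is a list of (name, initializer, is_thread_local) tuples.
    Two constants land in the same group only if they share both the
    initializer and the thread_local flag, so a thread_local constant is
    never folded into a non-thread_local one.
    """
    groups = defaultdict(list)
    for name, initializer, is_thread_local in constants:
        groups[(initializer, is_thread_local)].append(name)
    # Only groups with at least two members represent an actual merge.
    return [names for names in groups.values() if len(names) > 1]
```

For example, given `a` and `b` as ordinary constants and `t` as a thread_local constant, all initialized to 42, only `a` and `b` end up in a merge group.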
Roman Lebedev [Wed, 28 Apr 2021 22:23:44 +0000 (01:23 +0300)]
[SimplifyCFG] Common code sinking: fixup variable name
As noticed in post-commit review.
I've gone through several iterations of that name,
and somehow managed to end up with an incorrect one.
Dávid Bolvanský [Wed, 28 Apr 2021 22:16:57 +0000 (00:16 +0200)]
[BuildLibCalls] Remove inaccessiblememonly inference for calloc
Solves regression mentioned in PR50143.
As noted in D101440, proper modelling for calloc would require a new attribute, inaccessible_or_returned_memonly.
Denys Petrov [Mon, 26 Apr 2021 16:17:56 +0000 (19:17 +0300)]
[analyzer] Wrong type cast occurs during pointer dereferencing after type punning
Summary: During pointer dereferencing, CastRetrievedVal uses the wrong type from the Store after type punning. Namely, the pointer is cast to another type and then assigned a value of yet another type. This produces a NonLoc value where a Loc is expected.
Differential Revision: https://reviews.llvm.org/D89055
Fixes:
https://bugs.llvm.org/show_bug.cgi?id=37503
https://bugs.llvm.org/show_bug.cgi?id=49007
Roman Lebedev [Wed, 28 Apr 2021 19:47:15 +0000 (22:47 +0300)]
[SimplifyCFG] Common code sinking: relax restriction on non-uncond predecessors
While we have a known profitability issue for sinking in the presence of
non-unconditional predecessors, there aren't any known issues
with having multiple such non-unconditional predecessors,
so said restriction appears to be artificial. Lift it.
Roman Lebedev [Wed, 28 Apr 2021 19:56:45 +0000 (22:56 +0300)]
[NFC][SimplifyCFG] Add test for sinking common code with multiple cond predecessors
Roman Lebedev [Wed, 28 Apr 2021 21:49:21 +0000 (00:49 +0300)]
[NFC][SimplifyCFG] Add test showing that profitability check for sinking is broken
Essentially, we can't promise that the instruction is sinkable without
introducing PHIs until we know that it is profitable to sink.
Roman Lebedev [Wed, 28 Apr 2021 20:40:58 +0000 (23:40 +0300)]
[NFC][SimplifyCFG] Common code sinking: check profitability once
We can just eagerly pre-check that all the instructions that we *could*
sink are ones we'd actually want to sink, clamping the number of
instructions that we'll sink to stop just before the first unprofitable one.
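The clamping step described above amounts to taking the longest profitable prefix of the candidate list. The following is a minimal Python sketch of that idea only; `clamp_to_profitable_prefix` and the `is_profitable` callback are hypothetical names, and the real pass works on LLVM instructions with its own profitability logic.

```python
def clamp_to_profitable_prefix(candidates, is_profitable):
    """Return how many leading candidates should be sunk.

    `candidates` lists the instructions that *could* be sunk, in the order
    they would be sunk. Counting stops just before the first one for which
    `is_profitable` returns False, so profitability is checked once up
    front rather than while sinking.
    """
    count = 0
    for inst in candidates:
        if not is_profitable(inst):
            break
        count += 1
    return count
```

If the very first candidate is unprofitable the result is zero and nothing is sunk; if every candidate is profitable the full list is kept.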
Roman Lebedev [Wed, 28 Apr 2021 19:40:07 +0000 (22:40 +0300)]
[NFC][SimplifyCFG] SinkCommonCodeFromPredecessors(): reword comment about PR30244
Vitaly Buka [Wed, 28 Apr 2021 21:56:08 +0000 (14:56 -0700)]
[NFC][scudo] Add reference to a QEMU bug
D101031 added a workaround for the bug.