Igor Kudrin [Thu, 17 Jun 2021 10:08:13 +0000 (17:08 +0700)]
[ELF] Restore arm-branch.s test
After D77330, the comments are inconsistent with the disassembled code.
As the value of `far` has been changed, a thunk to reach it is now
generated, and target addresses of branch instructions are different
from what was initially expected.
The patch fixes that and makes the test closer to what it was originally.
Differential Revision: https://reviews.llvm.org/D104286
Martin Storsjö [Wed, 16 Jun 2021 13:56:33 +0000 (16:56 +0300)]
[LLD] [COFF] Remove a stray duplicate comment. NFC.
The following class isn't part of the export table; there's a
second correctly placed comment about the things that actually
belong to the export table.
Martin Storsjö [Mon, 14 Jun 2021 10:21:54 +0000 (13:21 +0300)]
[llvm-dlltool] Imply the target arch from a tool triple prefix
Also use the default LLVM target as default for dlltool. This
matches how GNU dlltool behaves; it is compiled with one default
target, which is used if no option is provided.
Extend the anonymous namespace in the implementation file instead
of using static functions.
Based on a patch by Mateusz Mikuła.
The effect of the default LLVM target, if neither the -m option
nor a tool triple prefix is provided, isn't tested, as we can't
make assumptions about what it is set to.
(We could make the default be forced to one of the four supported
architectures if the default triple is another arch, and then just
test that llvm-dlltool without an -m option is able to produce an
import library, without checking the actual architecture though.)
Differential Revision: https://reviews.llvm.org/D104212
Martin Storsjö [Mon, 14 Jun 2021 10:28:33 +0000 (13:28 +0300)]
[llvm-dlltool] [test] Add a testcase for all machine option types. NFC.
The existing tests only check that some options (but not e.g. arm)
are accepted; they don't test the functional effect of those options
on the generated object files.
Differential Revision: https://reviews.llvm.org/D104215
Martin Storsjö [Mon, 14 Jun 2021 10:25:28 +0000 (13:25 +0300)]
[llvm-dlltool] [test] Remove superfluous --coff-exports option to llvm-readobj. NFC.
The --coff-exports option to llvm-readobj prints the exported symbols
from a DLL/EXE; it doesn't do anything with regard to an import
library.
Differential Revision: https://reviews.llvm.org/D104214
Martin Storsjö [Mon, 14 Jun 2021 10:23:01 +0000 (13:23 +0300)]
[llvm-dlltool] [test] Test both short and long forms of options. NFC.
Differential Revision: https://reviews.llvm.org/D104213
Martin Storsjö [Wed, 16 Jun 2021 12:22:28 +0000 (15:22 +0300)]
[libcxx] Fix a case of -Wundef warnings regarding _POSIX_TIMERS
Differential Revision: https://reviews.llvm.org/D104372
Alex Zinenko [Tue, 15 Jun 2021 13:27:01 +0000 (15:27 +0200)]
[mlir] separable registration of operation interfaces
This is similar to attribute and type interfaces and mostly the same mechanism
(FallbackModel / ExternalModel, ODS generation). There are minor differences in
how the concept-based polymorphism is implemented for operations that are
accounted for by ODS backends, and this essentially adds a test and exposes the
API.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D104294
Sjoerd Meijer [Wed, 16 Jun 2021 13:26:57 +0000 (14:26 +0100)]
[FuncSpec] Don't specialise functions with attribute NoDuplicate.
Differential Revision: https://reviews.llvm.org/D104378
Fraser Cormack [Mon, 14 Jun 2021 10:00:25 +0000 (11:00 +0100)]
[RISCV][VP] Lower FP VP ISD nodes to RVV instructions
With the exception of `frem`, this patch supports the current set of VP
floating-point binary intrinsics by lowering them to RVV instructions. It
does so by using the existing `RISCVISD *_VL` custom nodes as an intermediate
layer. Both scalable and fixed-length vectors are supported by using this
method.
The `frem` node is unsupported due to a lack of available instructions. For
fixed-length vectors we could scalarize but that option is not (currently)
available for scalable-vector types. The support is intentionally left out so
that it is equivalent for both vector types.
The matching of vector/scalar forms is currently lacking, as scalable vector
types do not lower to the custom `VFMV_V_F_VL` node. We could either make
floating-point scalable vector splats lower to this node, or support the
matching of multiple kinds of splat via a `ComplexPattern`, much like we do for
integer types.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D104237
Balázs Kéri [Thu, 17 Jun 2021 07:12:36 +0000 (09:12 +0200)]
[clang][AST] Set correct DeclContext in ASTImporter lookup table for template params.
Template parameters are created in ASTImporter with the translation unit as DeclContext.
The DeclContext is later updated (by the create function of template classes).
ASTImporterLookupTable was not updated after these changes of the DC. The patch
adds update of the DeclContext in ASTImporterLookupTable.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D103792
David Green [Thu, 17 Jun 2021 08:53:33 +0000 (09:53 +0100)]
[InterleaveAccess] Copy fast math flags when adjusting binary operators in interleave access pass
The Interleave Access pass will convert shuffle(binop(load, load)) to
binop(shuffle(load), shuffle(load)), in order to create more
interleaving load patterns (VLD2/3/4) that might have been messed up by
instcombine. As shown in D104247 we were missing copying IR flags to the
new instruction though, which should just be kept the same as the
original instruction.
Differential Revision: https://reviews.llvm.org/D104255
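The rewrite described above relies on a lane-wise binary operator commuting
with a shuffle. A minimal Python sketch of that identity, using plain lists as
stand-in vectors. The helper names `shuffle` and `binop` are hypothetical and
are not LLVM APIs, and fast-math flags are not modelled here:

```python
from operator import add


def shuffle(vec, mask):
    """Pick elements of vec in the order given by mask."""
    return [vec[i] for i in mask]


def binop(op, a, b):
    """Apply op lane by lane to two equal-length vectors."""
    return [op(x, y) for x, y in zip(a, b)]


load_a = [10, 11, 12, 13, 14, 15, 16, 17]
load_b = [1, 2, 3, 4, 5, 6, 7, 8]
even_lanes = [0, 2, 4, 6]  # a de-interleaving mask picking every second lane

# Original form: shuffle(binop(load, load))
before = shuffle(binop(add, load_a, load_b), even_lanes)
# Rewritten form: binop(shuffle(load), shuffle(load))
after = binop(add, shuffle(load_a, even_lanes), shuffle(load_b, even_lanes))
```

Both forms compute the same lanes, which is why the new binary operator can
simply inherit the IR flags of the original one.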
Tomasz Miąsko [Thu, 17 Jun 2021 08:30:04 +0000 (10:30 +0200)]
[Demangle] Support Rust v0 mangling scheme in llvm::demangle
The llvm::demangle is currently used by llvm-objdump and llvm-readobj,
so this effectively adds support for Rust v0 mangling to those
applications.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D104340
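As a rough illustration of how a demangler entry point can tell schemes apart,
the sketch below classifies a symbol by its prefix only. Rust v0 symbols start
with `_R` and Itanium C++ symbols start with `_Z`; the function name
`mangling_scheme` is hypothetical, no actual demangling is performed, and the
real dispatch logic in llvm::demangle may differ:

```python
def mangling_scheme(symbol: str) -> str:
    """Guess the mangling scheme of a symbol from its prefix alone."""
    if symbol.startswith("_R"):
        return "rust-v0"
    if symbol.startswith("_Z"):
        return "itanium"
    return "unknown"


# Illustrative inputs; only the prefixes matter for this sketch.
examples = ["_RNvC7mycrate3foo", "_ZN3foo3barEv", "main"]
schemes = [mangling_scheme(s) for s in examples]
```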
Florian Hahn [Thu, 17 Jun 2021 07:53:36 +0000 (08:53 +0100)]
[VPlan] Support PHIs as LastInst when inserting scalars in ::get().
At the moment, we create insertelement instructions directly after
LastInst when inserting scalar values in a vector in
VPTransformState::get.
This results in invalid IR when LastInst is a phi, followed by another
phi. In that case, the new instructions should be inserted just after
the last PHI node in the block.
At the moment, I don't think the problematic case can be triggered, but
it can happen once predicate regions are merged and multiple
VPredInstPHI recipes are in the same block (D100260).
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D104188
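The insertion rule described above can be sketched as follows. This is a toy
model, assuming a basic block is a list of opcode strings and that PHIs are
grouped at the start of the block; `insertion_index` is a hypothetical helper,
not the VPlan code:

```python
def insertion_index(block, last_inst_index):
    """Return the index at which new instructions can be placed after
    block[last_inst_index] without splitting a run of PHI nodes."""
    idx = last_inst_index + 1
    if block[last_inst_index] == "phi":
        # Skip any further PHIs so the new code lands after the last one.
        while idx < len(block) and block[idx] == "phi":
            idx += 1
    return idx


block = ["phi", "phi", "add", "br"]
after_first_phi = insertion_index(block, 0)  # LastInst is a PHI followed by a PHI
after_add = insertion_index(block, 2)        # LastInst is an ordinary instruction
```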
Yilong Guo [Thu, 17 Jun 2021 08:33:00 +0000 (09:33 +0100)]
[Format] Fix incorrect pointer/reference detection
https://llvm.org/PR50568
When an overloaded operator is called, its argument must be an
expression.
Before:
void f() { a.operator()(a *a); }
After:
void f() { a.operator()(a * a); }
Reviewed By: HazardyKnusperkeks, curdeius, MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D103678
Kirstóf Umann [Fri, 21 May 2021 12:02:03 +0000 (14:02 +0200)]
[analyzer] Make checker silencing work for non-pathsensitive bug reports
D66572 separated BugReport and BugReporter into basic and path sensitive
versions. As a result, checker silencing, which worked deep in the path
sensitive report generation facilities became specific to it. DeadStoresChecker,
for instance, despite being in the static analyzer, emits non-pathsensitive
reports, and was impossible to silence.
This patch moves the corresponding code before the call to the virtual function
generateDiagnosticForConsumerMap (which is overridden by the specific kinds of
bug reporters). Although we see bug reporting as relatively lightweight compared
to the analysis, this will get rid of several steps we used to throw away.
Quoting from D65379:
At a very high level, this consists of 3 steps:
1. For all BugReports in the same BugReportEquivClass, collect all their error
nodes in a set. With that set, create a new, trimmed ExplodedGraph whose leafs
are all error nodes.
2. Until a valid report is found, construct a bug path, which is yet another
ExplodedGraph, that is linear from a given error node to the root of the graph.
3. Run all visitors on the constructed bug path. If in this process the report got
invalidated, start over from step 2.
Checker silencing used to kick in after all of these. Now it does before any of
them :^)
Differential Revision: https://reviews.llvm.org/D102914
Change-Id: Ice42939304516f2bebd05a1ea19878b89c96a25d
Alex Zinenko [Wed, 16 Jun 2021 14:31:17 +0000 (16:31 +0200)]
[mlir] ODS: emit interface traits outside of the interface class
ODS currently emits the interface trait class as a nested class inside the
interface class. As an unintended consequence, the default implementations of
interface methods have implicit access to static fields of the interface class,
e.g. those declared in `extraClassDeclaration`, including private methods (!),
or in the parent class. This may break the use of default implementations for
external models, which are not defined in the interface class, and generally
complexifies the abstraction.
Emit interface traits outside of the interface class itself to avoid accidental
implicit visibility. Public static fields can still be accessed via explicit
qualification with a class name, e.g., `MyOpInterface::staticMethod()` instead
of `staticMethod`.
Update the documentation to clarify the role of `extraClassDeclaration` in
interfaces.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D104384
Raphael Isemann [Thu, 17 Jun 2021 07:52:08 +0000 (09:52 +0200)]
[lldb] Skip variant/optional libc++ tests for Clang 5/6
Clang 5 and Clang 6 can no longer parse newer versions of libc++. As we can't
specify the specific libc++ version in the decorator, let's only allow Clang
versions that can parse all currently available libc++ versions.
Bjorn Pettersson [Fri, 26 Mar 2021 20:02:26 +0000 (21:02 +0100)]
Update @llvm.powi to handle different int sizes for the exponent
This can be seen as a follow up to commit
0ee439b705e82a4fe20e2,
that changed the second argument of __powidf2, __powisf2 and
__powitf2 in compiler-rt from si_int to int. That was to align with
how those runtimes are defined in libgcc.
One thing that seems to have been missing in that patch was to make
sure that the rest of LLVM also handles that the argument now depends
on the size of int (not using the si_int machine mode for 32-bit).
When using __builtin_powi for a target with 16-bit int clang crashed.
And when emitting libcalls to those rtlib functions (typically when
lowering @llvm.powi), the backend would always prepare the exponent
argument as an i32 which caused miscompiles when the rtlib was
compiled with 16-bit int.
The solution used here is to use an overloaded type for the second
argument in @llvm.powi. This way clang can use the "correct" type
when lowering __builtin_powi, and then later when emitting the libcall
it is assumed that the type used in @llvm.powi matches the rtlib
function.
One thing that needed some extra attention was that when vectorizing
calls several passes did not support that several arguments could
be overloaded in the intrinsics. This patch allows overload of a
scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with
an entry for powi.
Differential Revision: https://reviews.llvm.org/D99439
Kadir Cetinkaya [Mon, 31 May 2021 06:24:06 +0000 (08:24 +0200)]
[clangd] Fix feature modules to drop diagnostics
Ignored diagnostics were only checked after level adjusters, on the
assumption that the level would stay the same afterwards. But it can also be
modified by FeatureModules.
Differential Revision: https://reviews.llvm.org/D103387
Kadir Cetinkaya [Fri, 19 Mar 2021 09:01:14 +0000 (10:01 +0100)]
[clangd] Use command line adjusters for inserting compile flags
This fixes issues with `--` in the compile flags.
Fixes https://github.com/clangd/clangd/issues/632.
Differential Revision: https://reviews.llvm.org/D99523
Kristof Beyls [Tue, 15 Jun 2021 12:37:08 +0000 (13:37 +0100)]
Avoid unnecessary AArch64 DSB in __clear_cache in some situations.
The dsb after instruction cache invalidation only needs to be executed
if any instruction cache invalidation did happen.
Without this change, if the CTR_EL0.DIC bit indicates that instruction
cache invalidation is not needed, __clear_cache would execute two dsb
instructions in a row; with the second one being unnecessary.
Differential Revision: https://reviews.llvm.org/D104371
MaheshRavishankar [Thu, 17 Jun 2021 05:12:16 +0000 (22:12 -0700)]
[mlir] Move `memref.dim` canonicalization using `InferShapedTypeOpInterface` to a separate pass.
Based on discussion in
[this](https://llvm.discourse.group/t/remove-canonicalizer-for-memref-dim-via-shapedtypeopinterface/3641)
thread the pattern to resolve the `memref.dim` of a value that is a
result of an operation that implements the
`InferShapedTypeOpInterface` is moved to a separate pass instead of
running it as a canonicalization pass. This allows shape resolution to
happen when explicitly required, instead of automatically through a
canonicalization.
Differential Revision: https://reviews.llvm.org/D104321
Lang Hames [Wed, 16 Jun 2021 11:58:26 +0000 (21:58 +1000)]
[ORC] Switch from uint8_t to char buffers for TargetProcessControl::runWrapper.
This matches WrapperFunctionResult's char buffer, cutting down on the number of
pointer casts needed.
Mehdi Amini [Thu, 17 Jun 2021 01:28:17 +0000 (01:28 +0000)]
Improve error reporting on pass registration collision (NFC)
Differential Revision: https://reviews.llvm.org/D104430
Xuanda Yang [Thu, 17 Jun 2021 02:13:15 +0000 (10:13 +0800)]
[lld][MachO] Sort symbols in parallel in -map
source: https://bugs.llvm.org/show_bug.cgi?id=50689
When writing a map file, sort symbols in parallel using parallelSort.
Use address name to break ties if two symbols have the same address.
Reviewed By: thakis, int3
Differential Revision: https://reviews.llvm.org/D104346
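A serial Python sketch of the ordering described above. It assumes the
tie-break is on the symbol name, which is one reading of "address name" in the
message; the `Symbol` type and `sort_for_map` helper are hypothetical, and the
actual patch sorts in parallel with parallelSort:

```python
from typing import NamedTuple


class Symbol(NamedTuple):
    address: int
    name: str


def sort_for_map(symbols):
    """Order symbols by address, using the name to break ties so the
    output is deterministic regardless of input order."""
    return sorted(symbols, key=lambda s: (s.address, s.name))


symbols = [
    Symbol(0x2000, "_main"),
    Symbol(0x1000, "_b_alias"),
    Symbol(0x1000, "_a_alias"),
]
ordered = sort_for_map(symbols)
```

A total order matters more for a parallel sort than for a serial one, since
without a tie-break the relative order of equal-address symbols could vary
between runs.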
Haruki Imai [Thu, 17 Jun 2021 01:37:32 +0000 (18:37 -0700)]
[mlir] Fixed dynamic operand storage on big-endian machines.
Many tests fail after D101969 (https://reviews.llvm.org/D101969)
on big-endian machines. This patch changes the bit order of
TrailingOperandStorage on big-endian machines. This patch
works on System Z (Triple = "s390x-ibm-linux", CPU = "z14").
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D104225
Kevin Athey [Wed, 16 Jun 2021 05:54:04 +0000 (22:54 -0700)]
Remove obsolete call to AsyncSignalSafeLazyInitiFakeStack.
Code was originally added for Myriad D46626 which was removed
with D104279.
related to: https://github.com/google/sanitizers/issues/1394
Reviewed By: vitalybuka, morehouse
Differential Revision: https://reviews.llvm.org/D104419
River Riddle [Thu, 17 Jun 2021 01:21:47 +0000 (18:21 -0700)]
[mlir-vscode] Add a link to mlir.llvm.org at the top of the vscode extension doc
River Riddle [Thu, 17 Jun 2021 01:21:39 +0000 (18:21 -0700)]
[mlir-lsp-server] Add an explicit blurb on where to send code contributions.
When the vscode extension is published, it may be unclear how to contribute improvements to the extension. This revision makes it clear that contributions should follow the traditional LLVM guidelines.
Adrian Prantl [Thu, 17 Jun 2021 01:17:58 +0000 (18:17 -0700)]
Relax language comparison when matching up C++ forward decls with definitions
when dealing with -gmodules debug info.
This fixes the bot failures on Darwin.
A recent clang change (presumably https://reviews.llvm.org/D104291)
introduced a bug where .pcm files would identify themselves as
DW_LANG_C_plus_plus, but the .o that references them would identify as
DW_LANG_C_plus_plus_14. While that bug needs to be fixed, too, it
shows that the current strict comparison also isn't meaningful.
rdar://79423225
peter klausler [Tue, 15 Jun 2021 22:17:16 +0000 (15:17 -0700)]
[flang] Complain about more cases of calls to insufficiently defined procedures
When a function is called in a specification expression, it must be
sufficiently defined, and cannot be a recursive call (10.1.11(5)).
The best fix for this is to change the contract for the procedure
characterization infrastructure to catch and report such errors,
and to guarantee that it does emit errors on failed characterizations.
Some call sites were adjusted to avoid cascades.
Differential Revision: https://reviews.llvm.org/D104330
River Riddle [Thu, 17 Jun 2021 00:58:24 +0000 (17:58 -0700)]
[mlir-lsp-server][Docs] Tweak the documentation for the visual studio code extension
This revision updates the feature set, and cleans up the contributing section a little.
Mehdi Amini [Thu, 17 Jun 2021 00:22:35 +0000 (00:22 +0000)]
Improve error message on pass registration failures to include the faulty pass name
Matheus Izvekov [Sun, 6 Jun 2021 01:20:25 +0000 (03:20 +0200)]
[clang] use correct builtin type for defaulted comparison analyzer
Fixes PR50591.
When analyzing classes with members which have user-defined conversion
operators to builtin types, the defaulted comparison analyzer was
picking the member type instead of the type for the builtin operator
which was selected as the best match.
This could either result in wrong comparison category being selected,
or a crash when runtime checks are enabled.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D103760
Stanislav Mekhanoshin [Wed, 16 Jun 2021 22:38:39 +0000 (15:38 -0700)]
[AMDGPU] Fixed constexpr expansion to handle multiple uses
The recently added convertConstantExprsToInstructions() does not handle
the case where the same ConstantExpr is used multiple times in the same
instruction. The first use is replaced, and the rest of the uses in the
instruction are replaced as well by replaceUsesOfWith(). The function
then attempts to replace a constant that has already been destroyed.
So far this interface is only used by the AMDGPU BE.
Differential Revision: https://reviews.llvm.org/D104425
Matheus Izvekov [Fri, 19 Mar 2021 02:32:06 +0000 (03:32 +0100)]
[clang] NRVO: Improvements and handling of more cases.
This expands NRVO propagation for more cases:
Parse analysis improvement:
* Lambdas and Blocks with dependent return type can have their variables
marked as NRVO Candidates.
Variable instantiation improvements:
* Fixes crash when instantiating NRVO variables in Blocks.
* Functions, Lambdas, and Blocks which have auto return type have their
variables' NRVO status propagated. For Blocks with non-auto return type,
as a limitation, this propagation does not consider the actual return
type.
This also implements exclusion of VarDecls which are references to
dependent types.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: Quuxplusone
Differential Revision: https://reviews.llvm.org/D99696
Mehdi Amini [Wed, 16 Jun 2021 23:42:13 +0000 (23:42 +0000)]
Migrate MLIR test passes to the new registration API
Make sure they all define getArgument()/getDescription().
Depends On D104421
Differential Revision: https://reviews.llvm.org/D104426
Mehdi Amini [Wed, 16 Jun 2021 23:41:23 +0000 (23:41 +0000)]
Decouple registering passes from specifying argument/description
This patch changes the (not recommended) static registration API from:
static PassRegistration<MyPass> reg("my-pass", "My Pass Description.");
to:
static PassRegistration<MyPass> reg;
And the explicit registration from:
void registerPass("my-pass", "My Pass Description.",
[] { return createMyPass(); });
To:
void registerPass([] { return createMyPass(); });
It is expected that Pass implementations override the getArgument() method
instead. This will ensure that pipeline description can be printed and parsed
back.
Differential Revision: https://reviews.llvm.org/D104421
peter klausler [Wed, 16 Jun 2021 23:37:20 +0000 (16:37 -0700)]
[flang] Fix ARM/POWER test failure (folding20.f90)
Recent code for folding MINVAL() didn't allow for architectures
whose C/C++ char type is unsigned, so the value of the maximum
Fortran character was incorrect. This was caught by the
folding20.f90 test. The fix is to avoid numeric_limits<> and
use hard values for max signed integers of various character kinds.
Pushing into llvm-project/main to restore ARM/POWER buildbots.
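To illustrate why a fixed table is safer than querying the host `char` type,
the sketch below computes the maximum signed integer for a character of a given
byte width. It assumes character kinds of 1, 2, and 4 bytes; the names
`max_signed` and `MAX_CHAR_BY_KIND` are hypothetical and are not taken from the
flang sources:

```python
def max_signed(kind_bytes: int) -> int:
    """Largest value of a signed two's-complement integer of this width."""
    return (1 << (8 * kind_bytes - 1)) - 1


# Fixed values, independent of whether the host's C/C++ char is signed.
MAX_CHAR_BY_KIND = {kind: max_signed(kind) for kind in (1, 2, 4)}

# What a host-dependent query could return for a 1-byte char:
signed_char_max = 127    # hosts where char is signed
unsigned_char_max = 255  # hosts where char is unsigned
```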
Ben Shi [Wed, 16 Jun 2021 23:02:33 +0000 (07:02 +0800)]
[RISCV][test] Add new tests of SH*ADD in the zba extension
These tests will show the following optimization by future patches.
Rx + Ry * 6 => (SH1ADD (SH2ADD Rx, Ry), Ry)
Rx + Ry * 10 => (SH1ADD (SH3ADD Rx, Ry), Ry)
Rx + Ry * 12 => (SH2ADD (SH3ADD Rx, Ry), Ry)
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D104210
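The three decompositions above can be checked with a few lines of Python. The
sketch reads `SHnADD a, b` as `a + (b << n)`, which is the operand order the
expressions in this message imply; that reading, and the helper name `shadd`,
are assumptions made for illustration:

```python
def shadd(n, base, index):
    """Model SHnADD as base + (index << n)."""
    return base + (index << n)


def add_mul6(rx, ry):
    # (SH1ADD (SH2ADD Rx, Ry), Ry) = Rx + 4*Ry + 2*Ry
    return shadd(1, shadd(2, rx, ry), ry)


def add_mul10(rx, ry):
    # (SH1ADD (SH3ADD Rx, Ry), Ry) = Rx + 8*Ry + 2*Ry
    return shadd(1, shadd(3, rx, ry), ry)


def add_mul12(rx, ry):
    # (SH2ADD (SH3ADD Rx, Ry), Ry) = Rx + 8*Ry + 4*Ry
    return shadd(2, shadd(3, rx, ry), ry)
```

Each constant is split into two powers of two (6 = 4 + 2, 10 = 8 + 2,
12 = 8 + 4), so two shift-and-add steps replace a multiply and an add.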
Robert David [Wed, 16 Jun 2021 22:31:20 +0000 (15:31 -0700)]
[mlir] Make Type::print and Type::dump const
Nico Weber [Wed, 16 Jun 2021 22:04:46 +0000 (18:04 -0400)]
[gn build] (manually) port f9aba9a5afe
peter klausler [Tue, 15 Jun 2021 22:19:51 +0000 (15:19 -0700)]
[flang] Implement runtime for IALL & IANY
We had IPARITY (xor-reduction) but I missed IALL (and)
and IANY (or).
Differential Revision: https://reviews.llvm.org/D104339
Joachim Meyer [Thu, 6 May 2021 18:18:11 +0000 (20:18 +0200)]
Use `-cfg-func-name` value as filter for `-view-cfg`, etc.
Currently the value is only used when calling `F->viewCFG()` which is missing out on its potential and usefulness.
So I added the check to the printer passes as well.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D102011
Adrian Prantl [Wed, 16 Jun 2021 21:26:02 +0000 (14:26 -0700)]
Move the definition of LLVM_SUPPORT_XCODE_SIGNPOSTS into llvm-config.h
since it is now used by a public header file (Signposts.h).
This fixes the standalone LLDB build.
peter klausler [Tue, 15 Jun 2021 22:19:18 +0000 (15:19 -0700)]
[flang] Use a "double-double" accumulator in SUM
Use a "double-double" accumulator, a/k/a Kahan summation,
in the SUM intrinsic in the runtime for real & complex.
This seems to be the best-recommended technique for reducing
error, as opposed to the initial implementation of SUM's
distinct accumulators for positive and negative items.
Differential Revision: https://reviews.llvm.org/D104338
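For readers unfamiliar with the technique, here is a minimal sketch of classic
Kahan (compensated) summation in Python. It is not the flang runtime code, and
the function name `kahan_sum` is hypothetical; it only shows how a second
variable carries the low-order bits that a plain accumulator drops:

```python
import math


def kahan_sum(values):
    """Sum floats while tracking the rounding error of each addition."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # re-inject the previously lost bits
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


# One large item followed by many items too small to change it individually.
values = [1.0] + [1e-16] * 10_000

naive = 0.0
for x in values:
    naive += x

compensated = kahan_sum(values)
reference = math.fsum(values)  # correctly rounded sum from the standard library
```

In this input the plain accumulator never moves past 1.0, while the compensated
sum stays close to the correctly rounded reference.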
Kostya Kortchinsky [Wed, 16 Jun 2021 17:51:51 +0000 (10:51 -0700)]
[scudo] Ensure proper allocator alignment in TSD test
The `MockAllocator` used in `ScudoTSDTest` wasn't allocated
properly aligned, which resulted in the `TSDs` of the shared
registry not being aligned either. This led to some failures
like: https://reviews.llvm.org/D103119#2822008
This changes how the `MockAllocator` is allocated, same as
Vitaly did in the combined tests, properly aligning it, which
results in the `TSDs` being aligned as well.
Add a `DCHECK` in the shared registry to check that it is.
Differential Revision: https://reviews.llvm.org/D104402
peter klausler [Tue, 15 Jun 2021 22:18:41 +0000 (15:18 -0700)]
[flang] Fold MAXVAL & MINVAL
Implement constant folding for the reduction transformational
intrinsic functions MAXVAL and MINVAL.
In anticipation of more folding work to follow, with (I hope)
some common infrastructure, these two have been implemented in a
new header file.
Differential Revision: https://reviews.llvm.org/D104337
Andrzej Warzynski [Wed, 16 Jun 2021 21:00:13 +0000 (21:00 +0000)]
[flang][driver] Add missing `! REQUIRES` LIT directive
The test added in https://reviews.llvm.org/D104305 will only work with
the new driver and should be marked as such.
Sending this without a review as it's fairly straightforward and fixes
test failures for developers that don't want to build the new driver.
peter klausler [Tue, 15 Jun 2021 22:18:01 +0000 (15:18 -0700)]
[flang] Cope with errors with array constructors
When a program attempts to put something like a subprogram
into an array constructor, emit an error rather than crashing.
Differential Revision: https://reviews.llvm.org/D104336
Terry Wilmarth [Fri, 21 May 2021 22:06:09 +0000 (17:06 -0500)]
[OpenMP] Add Two-level Distributed Barrier
Two-level distributed barrier is a new experimental barrier designed
for Intel hardware that has better performance in some cases than the
default hyper barrier.
This barrier is designed to handle fine granularity parallelism where
barriers are used frequently with little compute and memory access
between barriers. There is no need to use it for codes with few
barriers and large granularity compute, or memory intensive
applications, as little difference will be seen between this barrier
and the default hyper barrier. This barrier is designed to work
optimally with a fixed number of threads, and has a significant setup
time, so should NOT be used in situations where the number of threads
in a team is varied frequently.
The two-level distributed barrier is off by default -- hyper barrier
is used by default. To use this barrier, you must set all barrier
patterns to use this type, because it will not work with other barrier
patterns. Thus, to turn it on, the following settings are required:
KMP_FORKJOIN_BARRIER_PATTERN=dist,dist
KMP_PLAIN_BARRIER_PATTERN=dist,dist
KMP_REDUCTION_BARRIER_PATTERN=dist,dist
Branching factors (set with KMP_FORKJOIN_BARRIER, KMP_PLAIN_BARRIER,
and KMP_REDUCTION_BARRIER) are ignored by the two-level distributed
barrier.
Differential Revision: https://reviews.llvm.org/D103121
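Since all three patterns must be switched together, one way to avoid a partial
configuration is to set them from a single table. A small Python sketch, where
the helper name `env_with_dist_barrier` and the launched binary are
hypothetical:

```python
import os

# All three barrier patterns must use "dist"; mixing with other patterns
# is not supported according to the description above.
DIST_BARRIER_ENV = {
    "KMP_FORKJOIN_BARRIER_PATTERN": "dist,dist",
    "KMP_PLAIN_BARRIER_PATTERN": "dist,dist",
    "KMP_REDUCTION_BARRIER_PATTERN": "dist,dist",
}


def env_with_dist_barrier(base=None):
    """Return a copy of the environment with the distributed barrier enabled."""
    env = dict(os.environ if base is None else base)
    env.update(DIST_BARRIER_ENV)
    return env


# Possible use (binary name is a placeholder):
#   subprocess.run(["./my_openmp_app"], env=env_with_dist_barrier())
```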
Yitzhak Mandelbaum [Wed, 16 Jun 2021 14:52:13 +0000 (14:52 +0000)]
[libTooling] Change `access` stencil to recognize use of `operator*`.
Currently, `access` doesn't recognize a dereferenced smart pointer. So,
`access(e, "field")` where `e = *x`, yields:
* `x->field`, for normal-pointer x,
* `(*x).field`, for smart-pointer x.
This patch normalizes handling of smart pointer to match normal pointer, when
the smart pointer type supports `->`.
Differential Revision: https://reviews.llvm.org/D104390
Gus Smith [Wed, 16 Jun 2021 20:04:09 +0000 (13:04 -0700)]
Add sparse matrix multiplication integration test
Adds an integration test for the SPMM (sparse matrix multiplication) kernel, which multiplies a sparse matrix by a dense matrix, resulting in a dense matrix. This is just a simple modification on the existing matrix-vector multiplication kernel.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D104334
Yitzhak Mandelbaum [Wed, 16 Jun 2021 14:47:18 +0000 (14:47 +0000)]
[ASTMatchers] Fix bug in `hasUnaryOperand`
Currently, `hasUnaryOperand` fails for the overloaded `operator*`. This patch fixes the bug and
adds tests for this case.
Differential Revision: https://reviews.llvm.org/D104389
Uday Bondhugula [Thu, 27 May 2021 19:38:30 +0000 (01:08 +0530)]
[MLIR] Make store to load fwd condition less conservative
Make store to load fwd condition for -memref-dataflow-opt less
conservative. Post dominance info is not really needed. Add additional
check for common cases.
Differential Revision: https://reviews.llvm.org/D104174
Prashant Kumar [Wed, 16 Jun 2021 19:45:35 +0000 (01:15 +0530)]
[MLIR] Fix affine parallelize pass.
To control the number of outer parallel loops, we need to process the
outer loops first and hence pre-order walk fixes the issue.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D104361
Jacques Pienaar [Wed, 16 Jun 2021 19:53:21 +0000 (12:53 -0700)]
Add hook for dialect specializing processing blocks post inlining calls
This allows for dialects to do different post-processing depending on operations with the inliner (my use case requires different attribute propagation rules depending on call op). This hook runs before the regular processInlinedBlocks method.
Differential Revision: https://reviews.llvm.org/D104399
peter klausler [Tue, 15 Jun 2021 22:15:34 +0000 (15:15 -0700)]
[flang] Fix crashes on calls to non-procedures
When a procedure reference is attempted to an entity that just
isn't a procedure, say so.
Differential Revision: https://reviews.llvm.org/D104329
Min-Yih Hsu [Wed, 16 Jun 2021 17:34:57 +0000 (10:34 -0700)]
[MCA] Anchoring the vtable of CustomBehaviour
Put the dtor of mca::CustomBehaviour into the cpp file to avoid
undefined vtable when linking libLLVMMCACustomBehaviourAMDGPU as shared
library.
Differential Revision: https://reviews.llvm.org/D104401
Mehdi Amini [Wed, 16 Jun 2021 19:42:41 +0000 (19:42 +0000)]
Use early exit and simplify a condition in Block SuccessorRange (NFC)
Mehdi Amini [Wed, 16 Jun 2021 18:59:43 +0000 (18:59 +0000)]
Fix verifier crashing on some invalid IR
In a region with multiple blocks the verifier will try to look for
dominance and may get successor list for blocks, even though a block
may be empty or does not end with a terminator.
Differential Revision: https://reviews.llvm.org/D104411
Eli Friedman [Wed, 16 Jun 2021 19:30:45 +0000 (12:30 -0700)]
[NFC][ScalarEvolution] Refactor createNodeForSelectOrPHI
In preparation for D103660.
peter klausler [Tue, 15 Jun 2021 22:14:16 +0000 (15:14 -0700)]
[flang] Don't crash on some bogus expressions
Recover more gracefully from user errors in expressions.
Differential Revision: https://reviews.llvm.org/D104326
Jez Ng [Wed, 16 Jun 2021 19:23:07 +0000 (15:23 -0400)]
[lld-macho] Put DATA_IN_CODE immediately after FUNCTION_STARTS
codesign checks for this.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D104354
Jez Ng [Wed, 16 Jun 2021 19:23:06 +0000 (15:23 -0400)]
[lld-macho] Handle multiple LC_LINKER_OPTIONs
We previously only parsed the first one.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D104352
Jez Ng [Wed, 16 Jun 2021 19:23:04 +0000 (15:23 -0400)]
[lld-macho][nfc] Put back shouldOmitFromOutput() asserts
I removed them in rG5de7467e982 but @thakis pointed out that
they were useful to keep, so here they are again. I've also converted
the `!isCoalescedWeak()` asserts into `!shouldOmitFromOutput()` asserts,
since the latter check subsumes the former.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D104169
Fangrui Song [Wed, 16 Jun 2021 19:09:49 +0000 (12:09 -0700)]
[llvm-objcopy][MachO] Copy LC_LINKER_OPTIMIZATION_HINT
This fixes `error: unsupported load command (cmd=0x2e)`
Hongtao Yu [Tue, 15 Jun 2021 22:59:06 +0000 (15:59 -0700)]
[CSSPGO] Report zero-count probe in profile instead of dangling probes.
Previously dangling samples were represented by INT64_MAX in sample profile while probes never executed were not reported. This was based on an observation that dangling probes made up a smaller portion than zero-count probes. However, with compiler optimizations, dangling probes end up becoming a large portion of all probes in general and reporting them does not make sense from a profile size point of view. This change flips sample reporting by reporting zero-count probes instead. This enables a dangling probe to be represented by none (a missing entry in the profile). This has a couple of benefits:
1. Reducing sample profile size in optimize mode, even when the number of non-executed probes exceeds the number of dangling probes, since INT64_MAX takes more space than 0 to encode.
2. Binary size savings. No need to encode dangling probes anymore, since missing probes are treated as dangling in the profile reader.
3. Reducing compiler work to track dangling probes. However, for probes that are really dead and removed, we still need the compiler to identify them so that they can be reported as zero-count, instead of being mistreated as dangling probes.
4. Improving counts quality by respecting the counts already collected on the non-dangling copy of a probe. A probe, when duplicated, gets two copies at runtime. If one of them is dangling while the other is not, merging the two probes at profile generation time will cause the real samples collected on the non-dangling one to be discarded. Not reporting the dangling counterpart will keep the real samples.
5. Better readability.
6. Being consistent with the non-CS DWARF line-number-based profile. Zero counts are trusted by the compiler's count inference, while missing counts will be inferred by the compiler.
Note that the current patch does not include any work for #3. There will be follow-up changes.
For #1, I've seen that for a large Facebook service the text profile is reduced by 7%. For the extbinary profile, the size of LBRProfileSection is reduced by 35%.
For #4, I have seen general counts quality for SPEC2017 is improved by 10%.
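As a rough illustration of benefit #1, the sketch below counts how many bytes a value needs under a ULEB128-style variable-length encoding. It assumes the binary profile stores counts this way; `ulebSize` is a hypothetical helper written for this note, not a function from the patch.

```cpp
#include <cstdint>

// Hypothetical helper: number of bytes needed to encode Value as ULEB128
// (7 payload bits per byte, at least one byte even for zero).
constexpr unsigned ulebSize(std::uint64_t Value) {
  unsigned Size = 0;
  do {
    Value >>= 7;
    ++Size;
  } while (Value != 0);
  return Size;
}

// A zero count fits in one byte; INT64_MAX has 63 significant bits and
// therefore needs nine bytes.
static_assert(ulebSize(0) == 1, "zero encodes in a single byte");
static_assert(ulebSize(INT64_MAX) == 9, "INT64_MAX needs nine bytes");
```

In the text format the gap is similar in spirit: "0" is one character, while the decimal form of INT64_MAX is nineteen.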
Reviewed By: wenlei, wlei, wmi
Differential Revision: https://reviews.llvm.org/D104129
Aart Bik [Tue, 15 Jun 2021 22:56:32 +0000 (15:56 -0700)]
[mlir][sparse] support new kind of scalar in sparse linalg generic op
We have several ways of introducing a scalar invariant value into
linalg generic ops (should we limit this somewhat?). This revision
makes sure we handle all of them correctly in the sparse compiler.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D104335
Sanjay Patel [Wed, 16 Jun 2021 17:36:19 +0000 (13:36 -0400)]
[ValueTracking] add FP intrinsics to test for propagatesPoison; NFC
I'm not sure what behavior we want if the FP environment is
not default (also not sure if there's a way to enumerate
the full list of intrinsics programmatically), but currently
these are all defaulting to 'false' (doesn't propagate).
Fangrui Song [Wed, 16 Jun 2021 17:42:43 +0000 (10:42 -0700)]
RISCVFixupKinds.h: Don’t duplicate function or class name at the beginning of the comment && fix some comments
peter klausler [Tue, 15 Jun 2021 22:09:04 +0000 (15:09 -0700)]
[flang] Correct the subscripts used for arguments to character intrinsics
When chasing down another unrelated bug, I noticed that the
implementations of various character intrinsic functions assume
that the lower bounds of (some of) their arguments were 1.
This isn't necessarily the case, so I've cleaned them up, tweaked
the unit tests to exercise the fix, and regularized the allocation
pattern used for results to use SetBounds() before Allocate() rather
than the old original Descriptor::Allocate() wrapper around
CFI_allocate().
Since there were few other remaining uses of the old original
Descriptor::Allocate() wrapper, I also converted them to the
new one and deleted the old one.
Differential Revision: https://reviews.llvm.org/D104325
Ben Langmuir [Wed, 16 Jun 2021 00:44:06 +0000 (17:44 -0700)]
[index] Fix performance regression with indexing macros
When using FileIndexRecord with macros, symbol references can be seen
out of source order, which caused a performance regression when inserting
the symbols into a vector. Instead, we now lazily sort the vector. The
impact is small on most code, but in very large files with many macro
references (M) near the beginning of the file followed by many decl
references (D) it was O(M*D). A particularly bad protobuf-generated
header was observed with a 100% regression in practice.
rdar://78628133
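A minimal sketch of the lazy-sort idea described above, not the actual FileIndexRecord code: occurrences are appended in arrival order and sorted once on the first read. The `Occurrence` and `LazySortedOccurrences` names are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one recorded symbol occurrence.
struct Occurrence {
  unsigned Offset; // position in the file
  int SymbolID;    // which symbol was referenced
};

class LazySortedOccurrences {
  std::vector<Occurrence> Items;
  bool Sorted = true;

public:
  // Append in arrival order; only note whether order was violated.
  void add(Occurrence O) {
    if (!Items.empty() && O.Offset < Items.back().Offset)
      Sorted = false;
    Items.push_back(O);
  }

  // Sort at most once per batch of out-of-order inserts, on first read.
  const std::vector<Occurrence> &get() {
    if (!Sorted) {
      std::stable_sort(Items.begin(), Items.end(),
                       [](const Occurrence &A, const Occurrence &B) {
                         return A.Offset < B.Offset;
                       });
      Sorted = true;
    }
    return Items;
  }
};
```

Inserting each out-of-order element at its sorted position shifts the tail of the vector every time, which is where the O(M*D) cost comes from; deferring to one sort makes the total O((M+D) log(M+D)).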
Fangrui Song [Wed, 16 Jun 2021 17:08:20 +0000 (10:08 -0700)]
[llvm-objcopy] Make ihex writer similar to binary writer
There is no need to differentiate whether `UseSegments` is true or
false. Unifying the cases makes the behavior closer to BinaryWriter.
This improves compatibility with objcopy because SHF_ALLOC sections not in
a PT_LOAD will not be skipped. Such cases are usually erroneous input, though.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D104186
Sushma Unnibhavi [Wed, 16 Jun 2021 16:45:12 +0000 (10:45 -0600)]
[M68k][GlobalISel] Adding initial GlobalISel infrastructure
Wiring up GlobalISel for the M68k backend
Differential Revision: https://reviews.llvm.org/D101819
Christopher Di Bella [Sat, 12 Jun 2021 06:13:44 +0000 (06:13 +0000)]
Revert "Revert "[libcxx][module-map] creates submodules for private headers""
This reverts commit d9633f229c36f292dab0e5f510ac635cfaf3a798 as a
workaround was discovered.
Differential Revision: https://reviews.llvm.org/D104170
Sanjay Patel [Wed, 16 Jun 2021 16:12:19 +0000 (12:12 -0400)]
[ValueTracking] add tests for propagatesPoison with FP ops; NFC
Verify that this matches the behavior in InstSimplify:
D104383 / ce95200b7942
We still need to add code/tests for FP intrinsics.
LLVM GN Syncbot [Wed, 16 Jun 2021 15:57:43 +0000 (15:57 +0000)]
[gn build] Port ef16c8eaa5cd
Patrick Holland [Wed, 16 Jun 2021 15:22:54 +0000 (16:22 +0100)]
Reapply "[MCA] Adding the CustomBehaviour class to llvm-mca".
The original change was pushed in main as commit f7a23ecece52.
It was then reverted by commit a04f01bab2 because it caused linker failures
on buildbots that don't build the AMDGPU target.
--
Some instructions are not defined well enough within the target’s scheduling
model for llvm-mca to be able to properly simulate their behaviour. The ideal
solution to this situation is to modify the scheduling model, but that’s not
always a viable strategy. Maybe other parts of the backend depend on that
instruction being modelled the way that it is. Or maybe the instruction is quite
complex and it’s difficult to fully capture its behaviour with tablegen. The
CustomBehaviour class (which I will refer to as CB frequently) is designed to
provide intuitive scaffolding for developers to implement the correct modelling
for these instructions.
More details are available in the original commit log message (f7a23ecece52).
Differential Revision: https://reviews.llvm.org/D104149
Vyacheslav Zakharin [Wed, 16 Jun 2021 15:30:35 +0000 (08:30 -0700)]
[NFC][libomptarget] Reduce the dependency on libelf
This change-set removes libelf usage from elf_common part of the plugins.
libelf is still used in x86_64 generic plugin code and in some plugins
(e.g. amdgpu) - these will have to be cleaned up in separate checkins.
Differential Revision: https://reviews.llvm.org/D103545
Sanjay Patel [Wed, 16 Jun 2021 15:22:15 +0000 (11:22 -0400)]
[InstSimplify] propagate poison through FP ops
We already have this fold:
fadd float poison, 1.0 --> poison
...via ConstantFolding, so this makes the behavior consistent
if the other operand(s) are non-constant.
The fold for undef was added before poison existed as a
value/type in IR.
This came up in D102673 / D103169
because we're trying to sort out the more complicated handling
for constrained math ops.
We should have the handling for the regular instructions done
first, so we can build on that (or diverge as needed).
Differential Revision: https://reviews.llvm.org/D104383
Sjoerd Meijer [Wed, 16 Jun 2021 15:21:16 +0000 (16:21 +0100)]
[FuncSpec] Fixed prefix typo in test function-specialization-noexec.ll. NFC.
Yitzhak Mandelbaum [Tue, 15 Jun 2021 19:56:32 +0000 (19:56 +0000)]
[libTooling][NFC] Refactor implementation of Transformer Stencils to use standard OOP
Currently, the implementation combines OOP and overloads, using a template to
tie the two together. In practice, this has proven confusing with no
benefits. This patch simplifies the code to use standard OOP design (a
collection of classes deriving from an interface).
Differential Revision: https://reviews.llvm.org/D104317
Jez Ng [Wed, 16 Jun 2021 15:06:14 +0000 (11:06 -0400)]
[lld-macho] Downgrade version mismatch to warning
It's a warning in ld64. While having LLD be stricter would be nice, it
makes it harder for it to be a drop-in replacement into existing builds.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D104333
Dylan Fleming [Wed, 16 Jun 2021 14:38:31 +0000 (15:38 +0100)]
[SVE] Selection failure with scalable insertelements
Reviewed By: efriedma, CarolineConcatto
Differential Revision: https://reviews.llvm.org/D104244
James Henderson [Wed, 16 Jun 2021 13:57:44 +0000 (14:57 +0100)]
[obj2yaml] Address D104035 review comments
Accidentally missed from commit 5c1639fe064b.
Differential Revision: https://reviews.llvm.org/D104035
Jay Foad [Wed, 16 Jun 2021 13:29:36 +0000 (14:29 +0100)]
[AMDGPU] Set VOP3P flag on Real instructions
This does not affect codegen but might benefit llvm-mca.
David Spickett [Wed, 9 Jun 2021 16:36:39 +0000 (16:36 +0000)]
[llvm][AArch64] Handle arrays of struct properly (from IR)
This only applies to FastIsel. GlobalIsel seems to sidestep
the issue.
This fixes https://bugs.llvm.org/show_bug.cgi?id=46996
One of the things we do in llvm is decide if a type needs
consecutive registers. Previously, we just checked if it
was an array or not.
(plus an SVE specific check that is not changing here)
This causes some confusion when you have arbitrary IR like:
```
%T1 = type { double, i1 };
define [ 1 x %T1 ] @foo() {
entry:
ret [ 1 x %T1 ] zeroinitializer
}
```
We see it is an array so we call CC_AArch64_Custom_Block
which bails out when it sees the i1, a type we don't want
to put into a block.
This leaves the location of the double in some kind of
intermediate state and leads to odd codegen. Which then crashes
the backend because it doesn't know how to implement
what it's been asked for.
You get this:
```
renamable $d0 = FMOVD0
$w0 = COPY killed renamable $d0
```
Rather than this:
```
$d0 = FMOVD0
$w0 = COPY $wzr
```
The backend knows how to copy 64 bit to 64 bit registers,
but not 64 to 32. It can certainly be taught how but the real
issue seems to be us even trying to assign a register block
in the first place.
This change makes the logic of
AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters
a bit more in depth. If we find an array, also check that all the
nested aggregates in that array have a single member type.
Then CC_AArch64_Custom_Block's assumption of a type that looks
like [ N x type ] will be valid and we get the expected codegen.
New tests have been added to exercise these situations. Note that
some of the output is not ABI compliant. The aim of this change is
to simply handle these situations and not to make our processing
of arbitrary IR ABI compliant.
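The refined check can be sketched on a toy type model as below. This is an illustration under assumptions, not the code in AArch64TargetLowering: the `Type` struct, `allLeavesMatch`, and `needsConsecutiveRegisters` are hypothetical names, and scalar types are identified by plain integers.

```cpp
#include <vector>

enum class Kind { Scalar, Array, Struct };

// Toy type model (hypothetical, not llvm::Type).
struct Type {
  Kind K;
  int ScalarID;                      // meaningful only for Kind::Scalar
  std::vector<const Type *> Members; // struct members, or the one element
                                     // type of an array
};

// True if every scalar leaf under T has the same ID. Seen carries the first
// leaf ID found so far; a negative value means no leaf has been seen yet.
static bool allLeavesMatch(const Type &T, int &Seen) {
  if (T.K == Kind::Scalar) {
    if (Seen < 0)
      Seen = T.ScalarID;
    return Seen == T.ScalarID;
  }
  for (const Type *M : T.Members)
    if (!allLeavesMatch(*M, Seen))
      return false;
  return true;
}

// Only arrays whose nested aggregates share a single member type qualify.
static bool needsConsecutiveRegisters(const Type &T) {
  if (T.K != Kind::Array)
    return false;
  int Seen = -1;
  return allLeavesMatch(T, Seen);
}
```

With this model, an array of `{ double, i1 }` is rejected because its leaves differ, while an array of `{ double, double }` is accepted.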
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D104123
Louis Dionne [Tue, 15 Jun 2021 20:08:38 +0000 (16:08 -0400)]
[libc++] Undeprecate the std::allocator<void> specialization
While the std::allocator<void> specialization was deprecated by
https://wg21.link/p0174#2.2, the *use* of std::allocator<void> by users
was not. The intent was that std::allocator<void> could still be used
in C++17 and C++20, but starting with C++20 (with the removal of the
specialization), std::allocator<void> would use the primary template.
That intent was called out in wg21.link/p0619r4#3.9.
As a result of this patch, _LIBCPP_ENABLE_CXX20_REMOVED_ALLOCATOR_MEMBERS
will also not control whether the explicit specialization is provided or
not. It shouldn't matter, since in C++20, one can simply use the primary
template.
Fixes http://llvm.org/PR50299
Differential Revision: https://reviews.llvm.org/D104323
Andrea Di Biagio [Wed, 16 Jun 2021 13:39:14 +0000 (14:39 +0100)]
[MCA][InstrBuilder] Always check for implicit uses of resource units (PR50725).
When instructions are issued to the underlying pipeline resources, the
mca::ResourceManager should also check for the presence of extra uses induced by
the explicit consumption of multiple partially overlapping group resources.
Fixes PR50725
Nicolas Vasilache [Wed, 16 Jun 2021 11:10:00 +0000 (11:10 +0000)]
[mlir] NFC - Drop newline from BlockArgument printing.
Differential Revision: https://reviews.llvm.org/D104368
Simon Pilgrim [Wed, 16 Jun 2021 12:42:11 +0000 (13:42 +0100)]
[X86][AVX] Regenerate pr15296.ll tests
Exposes some really bad shift lowering codegen in shiftInput___canonical
Ben Dunbobbin [Fri, 11 Jun 2021 12:35:04 +0000 (13:35 +0100)]
[llvm-symbolizer] improve test and fix doc example after recent --print-source-context-lines behaviour change
I believe that after https://reviews.llvm.org/D102355 the behaviour of --print-source-context-lines has changed.
Before: --print-source-context-lines=3 prints 4 lines.
After: --print-source-context-lines=3 prints 3 lines.
Adjust the example in the docs for this change and make the testing a little more robust.
Differential Revision: https://reviews.llvm.org/D104114
Jay Foad [Wed, 16 Jun 2021 12:35:29 +0000 (13:35 +0100)]
[AMDGPU] Set SALU, VALU and other instruction type flags on Real instructions
This does not affect codegen but might benefit llvm-mca.
Amilendra Kodithuwakku [Wed, 16 Jun 2021 12:18:01 +0000 (13:18 +0100)]
[libcxx] Fix exception raised during downstream bare-metal libunwind tests
Fix for the following exception.
AttributeError: 'TestingConfig' object has no attribute 'target_triple'
Related revision: https://reviews.llvm.org/D102012
Reviewed By: #libunwind, miyuki, danielkiss, mstorsjo
Differential Revision: https://reviews.llvm.org/D103140
Guillaume Chatelet [Tue, 15 Jun 2021 07:57:13 +0000 (07:57 +0000)]
[libc] Add a set of elementary operations
Resubmission of D100646, now making sure that we handle cases where `__builtin_memcpy_inline` is not available.
Original commit message:
Each of these elementary operations can be assembled to support higher order constructs (Overlapping access, Loop, Aligned Loop).
The patch does not compile yet as it depends on other ones (D100571, D100631), but it allows the conversation to get started.
A self-contained version of this code is available at https://godbolt.org/z/e1x6xdaxM
Dylan Fleming [Wed, 16 Jun 2021 12:01:58 +0000 (13:01 +0100)]
[SVE] Fix PromoteIntRes_TRUNCATE not to call getVectorNumElements
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D104115
Raphael Isemann [Wed, 16 Jun 2021 12:04:31 +0000 (14:04 +0200)]
[lldb] Require Clang 8 for gpubnames test
This test is using -gpubnames which is only available since Clang 8. The
original Clang 7 requirement was based on the availability of
-accel-tables=Dwarf (which the test initially used before being changed to
-gpubnames in commit 15a6df52efaa7).
AndreyChurbanov [Wed, 16 Jun 2021 11:47:29 +0000 (14:47 +0300)]
[OpenMP] libomp: fixed implementation of OMP 5.1 inoutset task dependence type
Refactored code of dependence processing and added new inoutset dependence type.
The compiler can set the dependence flag to 0x8 when calling __kmpc_omp_task_with_deps.
All dependence flags the library gets so far and the corresponding dependence types:
1 - IN, 2 - OUT, 3 - INOUT, 4 - MUTEXINOUTSET, 8 - INOUTSET.
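For reference, the flag-to-type mapping listed above can be written as a small lookup. This is a sketch only; `DepKind` and `depKindName` are hypothetical names and do not reflect how libomp declares these flags internally.

```cpp
#include <string>

// Flag values as listed in this commit message (names are hypothetical).
enum class DepKind : unsigned {
  In = 1,
  Out = 2,
  InOut = 3, // numerically In | Out
  MutexInOutSet = 4,
  InOutSet = 8, // the new OMP 5.1 type, 0x8
};

// Map a raw dependence flag to its type name; unknown values are reported
// rather than guessed.
static std::string depKindName(unsigned Flag) {
  switch (Flag) {
  case 1:
    return "IN";
  case 2:
    return "OUT";
  case 3:
    return "INOUT";
  case 4:
    return "MUTEXINOUTSET";
  case 8:
    return "INOUTSET";
  default:
    return "UNKNOWN";
  }
}
```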
Differential Revision: https://reviews.llvm.org/D97085