Philip Reames [Mon, 27 Jun 2022 19:53:57 +0000 (12:53 -0700)]
[LV] Allow scalable vectorization with vscale = 1
This change is a bit subtle. If we have a type like <vscale x 1 x i64>, the vectorizer will currently reject vectorization. The reason is that a type like <1 x i64> is likely to get simply rescalarized, and the vectorizer doesn't want to be in the game of simple unrolling.
(I've given the example in terms of 1 x types which use a single register, but the same issue exists for any N x types which use N registers. e.g. RISCV LMULs.)
This change distinguishes scalable types from fixed types under the reasoning that converting to a scalable type isn't unrolling. Because the actual vscale isn't known until runtime, using a vscale type is potentially very profitable.
This makes an important, but unchecked, assumption. Specifically, the scalable type is assumed to only be legal per the cost model if there's actually a scalable register class which is distinct from the scalar domain. This is, to my knowledge, true for all targets which return non-invalid costs for scalable vector ops today, but in theory, we could have a target decide to lower scalable to fixed length vector or even scalar registers. If that ever happens, we'd need to revisit this code.
In practice, this patch unblocks scalable vectorization for ELEN types on RISCV.
Let me sketch one alternate implementation I considered. We could have restricted this to when we know a minimum value for vscale. Specifically, for the default +v extension for RISCV, we actually know that vscale >= 2 for ELEN types. However, doing it this way means we can't generate scalable vectors when using the various embedded vector extensions which have a minimum vscale of 1.
Differential Revision: https://reviews.llvm.org/D128542
Roy Sundahl [Fri, 24 Jun 2022 23:25:09 +0000 (16:25 -0700)]
Add wait for child process(es) to exit. (amended+clang-formatted)
It was possible for the parent process to exit before the
forked child process had finished. In some shells, this
causes the pipe to close and FileCheck misses some output
from the child. Waiting for the child process to exit before
exiting the parent ensures that all output from stdout and
stderr is combined and forwarded through the pipe to FileCheck.
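The pattern described above can be sketched roughly as follows. This is not the code from the patch; `sketchRunChildAndWait` is a hypothetical name, and the sketch assumes a POSIX system with `fork` and `waitpid`. The parent blocks until the child has exited, so everything the child wrote has reached the pipe before the parent returns.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that writes one line to stdout, then
// blocks until that child has exited. Returns the child's exit code, or -1
// on failure.
int sketchRunChildAndWait() {
  // Flush buffered output so the child does not inherit and re-emit it.
  std::fflush(nullptr);

  pid_t Pid = fork();
  if (Pid < 0)
    return -1;

  if (Pid == 0) {
    // Child: write some output and leave without running atexit handlers.
    const char Msg[] = "output from child\n";
    ssize_t Written = write(STDOUT_FILENO, Msg, sizeof(Msg) - 1);
    _exit(Written < 0 ? 1 : 0);
  }

  // Parent: wait for the child, retrying if a signal interrupts the wait.
  int Status = 0;
  pid_t Reaped;
  do {
    Reaped = waitpid(Pid, &Status, 0);
  } while (Reaped < 0 && errno == EINTR);

  if (Reaped < 0)
    return -1;
  assert(Reaped == Pid && "waitpid reaped an unexpected process");
  return WIFEXITED(Status) ? WEXITSTATUS(Status) : -1;
}
```

Without the `waitpid` loop the parent could return while the child is still writing, which is the race described above.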
rdar://95241490
Differential Revision: https://reviews.llvm.org/D128565
Xing Xue [Mon, 27 Jun 2022 20:07:27 +0000 (16:07 -0400)]
[libc++][lit][AIX] Port tests for getting time to AIX
Summary:
This patch ports libc++ LIT test cases for getting time in various locales to AIX.
Reviewed by: philnik, Mordante, libc++
Differential Revision: https://reviews.llvm.org/D128087
Xing Xue [Mon, 27 Jun 2022 19:57:54 +0000 (15:57 -0400)]
[libc++][lit][AIX] Port tests for money format to AIX
Summary:
This patch ports libc++ LIT test cases for money formats to AIX. On AIX, the money format of locale zh_CN.UTF-8 is similar to that of en_US.UTF-8, i.e., sign, symbol, none, value.
Reviewed by: Mordante, DiggerLin, libc++
Differential Revision: https://reviews.llvm.org/D128220
Philip Reames [Mon, 27 Jun 2022 19:46:09 +0000 (12:46 -0700)]
[RISCV] Remove a use of getMinVLen in favor of getRealMinVLen
The latter is possibly greater than the former, and thus the assert was overly strong when a wider VLEN was set at the command line.
Philip Reames [Sun, 26 Jun 2022 19:50:57 +0000 (12:50 -0700)]
[RISCV] Cost model for scalable reductions
This extends the existing cost model for reductions for scalable vectors.
The existing cost model assumes that reductions are roughly logarithmic in cost for unordered variants and linear for ordered ones. This change keeps that same basic model, and extends it out to the maximum number of elements a scalable vector could possibly have.
This results in costs which aren't terribly high for unordered reductions, but are for ordered ones. This seems about right; we want to strongly bias away from using scalable ordered reductions if the cost might be linear in VL.
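As a rough illustration of the shape of this model (not the RISCV cost-model code itself), the sketch below uses a hypothetical `sketchReductionCost` helper with unit costs per step. Both the helper and the unit costs are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper. MaxElts is the largest number of elements the scalable
// vector could possibly hold. Unordered reductions are modelled as a tree of
// ceil(log2(MaxElts)) steps; ordered reductions as one step per element.
uint64_t sketchReductionCost(uint64_t MaxElts, bool Ordered) {
  assert(MaxElts > 0 && "a vector holds at least one element");
  if (Ordered)
    return MaxElts; // linear in the element count

  // Count the bits of (MaxElts - 1), which equals ceil(log2(MaxElts)).
  uint64_t Steps = 0;
  for (uint64_t N = MaxElts - 1; N != 0; N >>= 1)
    ++Steps;
  return Steps;
}
```

With an upper bound of 1024 elements, this sketch gives 10 for the unordered case and 1024 for the ordered one, which matches the intent of biasing away from ordered scalable reductions.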
Differential Revision: https://reviews.llvm.org/D127447
Vitaly Buka [Mon, 27 Jun 2022 19:42:08 +0000 (12:42 -0700)]
Revert "[X86] Support `_Float16` on SSE2 and up"
Breaks buildbot
https://lab.llvm.org/buildbot/#/builders/37/builds/14334
This reverts commit f5d781d6273cc56dd8b44ee9e4cfb2ae5579bb04.
Matthias Springer [Mon, 27 Jun 2022 19:34:09 +0000 (21:34 +0200)]
[mlir][bufferize] Improve to_tensor/to_memref folding
Differential Revision: https://reviews.llvm.org/D128615
Vitaly Buka [Mon, 27 Jun 2022 19:36:33 +0000 (12:36 -0700)]
[test] Add workaround for flaky error we see on Windows bots
Alex Langford [Mon, 27 Jun 2022 19:32:18 +0000 (12:32 -0700)]
[NFC][lldb] Correct Module::FindFunctions documentation
Looks like a copy/paste from ModuleList::FindCompileUnits.
Yuanfang Chen [Mon, 27 Jun 2022 19:22:02 +0000 (12:22 -0700)]
Fix sphinx docs build
Fix "Title underline too short."
Yuanfang Chen [Mon, 27 Jun 2022 18:36:32 +0000 (11:36 -0700)]
[Coroutine] Remove the '!func_sanitize' metadata for split functions
There is no proper RTTI for these split functions. So just delete the
metadata.
Fixes https://github.com/llvm/llvm-project/issues/49689.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D116130
Yuanfang Chen [Mon, 27 Jun 2022 18:33:45 +0000 (11:33 -0700)]
[ubsan] Using metadata instead of prologue data for function sanitizer
Information in the function `Prologue Data` is intentionally opaque.
When a function with `Prologue Data` is duplicated, the self (global
value) references inside `Prologue Data` still point to the
original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`.
This patch detaches the information from function `Prologue Data`
and attaches it to a function metadata node.
This and D116130 fix https://github.com/llvm/llvm-project/issues/49689.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D115844
Wei Yi Tee [Mon, 27 Jun 2022 19:05:36 +0000 (21:05 +0200)]
[clang][dataflow] Add `buildAndSubstituteFlowCondition` to `DataflowEnvironment`
Depends On D128658
Reviewed By: gribozavr2, xazax.hun
Differential Revision: https://reviews.llvm.org/D128659
Wei Yi Tee [Mon, 27 Jun 2022 18:53:07 +0000 (20:53 +0200)]
[clang][dataflow] Do not allow substitution of true/false boolean literals in `buildAndSubstituteFlowCondition`
Reviewed By: gribozavr2, xazax.hun
Differential Revision: https://reviews.llvm.org/D128658
Aart Bik [Mon, 27 Jun 2022 18:36:35 +0000 (11:36 -0700)]
[mlir][sparse] remove redundant whitespace
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D128673
Groverkss [Mon, 27 Jun 2022 18:35:20 +0000 (19:35 +0100)]
[MLIR][Parser] Fix AffineParser colliding bare identifiers with primitive types
The parser currently can't parse bare identifiers like 'i0' in affine
maps and sets, and similarly ids like f16/f32. But these bare ids are
part of the grammar - although they are primitive types.
```
error: expected bare identifier
set = affine_set<(i0, i1) : ()>
^
```
This patch allows the parser for AffineMap/IntegerSet to parse bare
identifiers as defined by the grammar.
Reviewed By: bondhugula, rriddle
Differential Revision: https://reviews.llvm.org/D127076
Peiming Liu [Mon, 27 Jun 2022 17:59:46 +0000 (10:59 -0700)]
[mlir][sparse]more integration test cases for sparse_tensor.BinaryOp
Adding more test cases for sparse_tensor.BinaryOp, including different cases when overlap/left/right region is implemented/empty/identity
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D128383
Daniel Thornburgh [Mon, 27 Jun 2022 18:13:52 +0000 (11:13 -0700)]
[Symbolize] Fix MarkupFilter tests for Windows.
The tests use in-band ANSI color codes, while the Windows cmd console
uses an out-of-band interface for color.
Daniel Thornburgh [Mon, 27 Jun 2022 18:13:52 +0000 (11:13 -0700)]
[Symbolize] Fix llvm-symbolizer --filter-markup test on Windows.
The tests use in-band ANSI color codes, while the Windows cmd console
uses an out-of-band interface for color.
Amir Ayupov [Mon, 27 Jun 2022 17:30:30 +0000 (10:30 -0700)]
[BOLT] Restrict icp-inline to callsites
ICP peel for inline mode only makes sense for calls, not jump tables.
Plus, add a check that the Target BinaryFunction is found.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128404
Peiming Liu [Mon, 27 Jun 2022 17:17:58 +0000 (10:17 -0700)]
[mlir][sparse]Add more integration tests for sparse_tensor.unary
Previously, the sparse_tensor.unary integration test did not contain cases that use `linalg.index` (previously unsupported). This commit adds test cases that use `linalg.index` operators.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D128460
LLVM GN Syncbot [Mon, 27 Jun 2022 17:44:48 +0000 (17:44 +0000)]
[gn build] Port eb5af0acf054
Daniel Thornburgh [Wed, 4 May 2022 22:47:42 +0000 (22:47 +0000)]
[Symbolize] Add log markup --filter to llvm-symbolizer.
This adds a --filter option to llvm-symbolizer. This takes log-bearing
symbolizer markup from stdin and writes a human-readable version to
stdout.
For now, this only implements the "symbol" markup tag; all others are
passed through unaltered. This is a proof-of-concept bit of
functionality; implementing the various tags is more-or-less just a matter
of hooking up various parts of the Symbolize library to the architecture
established here.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D126980
Chris Bieneman [Thu, 16 Jun 2022 19:34:57 +0000 (14:34 -0500)]
[Docs] Update clang & llvm release notes for HLSL
Adding release note entries for LLVM & Clang to introduce the HLSL &
DirectX support that is being added.
Reviewed By: aaron.ballman, MaskRay
Differential Revision: https://reviews.llvm.org/D127890
Joseph Huber [Mon, 27 Jun 2022 17:23:25 +0000 (13:23 -0400)]
[OpenMP] Only strip runtime attributes if needed
Summary:
Currently in OpenMPOpt we strip `noinline` attributes from runtime
functions. This is here because the device bitcode library that we link
has problems with needed definitions getting prematurely optimized out.
This is only necessary for OpenMP offloading to GPUs so we should narrow
the scope for where we spend time doing this. In the future this
shouldn't be necessary as we move to using a linked library rather than
pulling in a bitcode library in Clang.
Kirill Okhotnikov [Mon, 27 Jun 2022 17:30:31 +0000 (19:30 +0200)]
[libc][docs] Added fmod performance results.
Amir Ayupov [Mon, 27 Jun 2022 17:29:10 +0000 (10:29 -0700)]
[BOLT][NFC] Add aliases for ICP flags
- `indirect-call-promotion` -> `icp`
- `indirect-call-promotion-mispredict-threshold` -> `icp-mp-threshold`
- `indirect-call-promotion-use-mispredicts` -> `icp-use-mp`
- `indirect-call-promotion-topn` -> `icp-topn`
- `indirect-call-promotion-calls-topn` -> `icp-calls-topn`
- `indirect-call-promotion-jump-tables-topn` -> `icp-jt-topn`
- `icp-jump-table-targets` -> `icp-jt-targets`
This also fixes an inconsistency in ICP flag names that some start with
`indirect-call-promotion` while others start with `icp`.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128375
Amir Ayupov [Mon, 27 Jun 2022 17:26:55 +0000 (10:26 -0700)]
[BOLT][NFC] Use llvm::less_first
Follow the case of https://reviews.llvm.org/D126068 and simplify call sites
with `llvm::less_first`.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D128242
Paul Walker [Wed, 22 Jun 2022 15:03:44 +0000 (16:03 +0100)]
[NFC][SVE] Add more tests of vector compares and selects taking an immediate operand.
Increases coverage of predicated compares (int and fp) along with
predicated zeroing of active floating point lanes.
Matt Arsenault [Thu, 9 Jun 2022 15:17:09 +0000 (11:17 -0400)]
llvm-reduce: Check shouldKeep before trying to reduce operands
No point doing the more complicated check first.
Jonas Devlieghere [Mon, 27 Jun 2022 17:00:05 +0000 (10:00 -0700)]
[lldb] Add a log dump command
Add a log dump command to dump logs to a file. This only works for
channels that have a log handler associated that supports dumping. For
now that's limited to the circular log handler, but more could be added
in the future.
Differential revision: https://reviews.llvm.org/D128557
Patrick Walton [Mon, 27 Jun 2022 17:00:43 +0000 (10:00 -0700)]
Round up zero-sized symbols to 1 byte in `.debug_aranges` (without breaking other logic).
This commit modifies the AsmPrinter to avoid emitting any zero-sized symbols to
the .debug_aranges table, by rounding their size up to 1. Entries with zero
length violate the DWARF 5 spec, which states:
> Each descriptor is a triple consisting of a segment selector, the beginning
> address within that segment of a range of text or data covered by some entry
> owned by the corresponding compilation unit, followed by the non-zero length
> of that range.
In practice, these zero-sized entries produce annoying warnings in lld and
cause GNU binutils to truncate the table when parsing it.
Other parts of LLVM, such as DWARFDebugARanges in the DebugInfo module
(specifically the appendRange method), already avoid emitting zero-sized
symbols to .debug_aranges, but not comprehensively in the AsmPrinter. In fact,
the AsmPrinter does try to avoid emitting such zero-sized symbols when labels
aren't involved, but doesn't when the symbol to be emitted is a difference of two
labels; this patch extends that logic to handle the case in which the symbol is
defined via labels.
Furthermore, this patch fixes a bug in which `available_externally` symbols
would cause unpredictable values to be emitted into the `.debug_aranges` table
under certain circumstances. In practice I don't believe that this caused
issues up until now, but the root cause of this bug--an invalid DenseMap
lookup--triggered failures in Chromium when combined with an earlier version of
this patch. Therefore, this patch fixes that bug too.
This is a revised version of diff D126257, which was reverted due to breaking
tests. The now-reverted version of this patch didn't distinguish between
symbols that didn't have their size reported to the DwarfDebug handler and
those that had their size reported to be zero. This new version of the patch
instead restricts the special handling only to the symbols whose size is
definitively known to be zero.
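The distinction drawn in the last paragraph can be sketched as below. This is not the AsmPrinter code; `sketchArangeLength` is a hypothetical helper in which an empty `std::optional` stands for "size never reported" and a value of 0 stands for "size reported as zero".

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper. Only a size that is definitively known to be zero is
// rounded up to 1; an unreported size is passed through untouched so that
// other logic can decide what to do with it.
std::optional<uint64_t> sketchArangeLength(std::optional<uint64_t> Reported) {
  if (!Reported)
    return std::nullopt; // size unknown: no special handling here

  uint64_t Length = (*Reported == 0) ? 1 : *Reported;
  assert(Length != 0 && "aranges entries must have a non-zero length");
  return Length;
}
```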
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D126835
Louis Dionne [Mon, 27 Jun 2022 15:37:01 +0000 (11:37 -0400)]
[libc++] Add a few missing min/max macro push/pop
Also, improve the test for nasty macros to define min and max, so this
will be caught in the future.
Differential Revision: https://reviews.llvm.org/D128655
Min-Yih Hsu [Tue, 24 May 2022 22:24:23 +0000 (15:24 -0700)]
[mlir][LLVMIR] Memorize compatible LLVM types
This patch memorizes compatible LLVM types in `LLVM::isCompatibleType` in
order to avoid redundant work.
This is especially useful when the size of program is big and there are
multiple occurrences of some deeply nested LLVM struct types, in which
case we can gain quite some speedups with this patch.
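The caching idea can be illustrated with a toy model. Nothing below is MLIR API: `ToyType` and `CompatChecker` are hypothetical, the type graph is assumed to be acyclic, and caching only positive results is a choice made for this sketch.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical toy type: a leaf when Elements is empty, otherwise a
// struct-like aggregate of other types.
struct ToyType {
  bool LeafCompatible = true;
  std::vector<const ToyType *> Elements;
};

// Remembers types already proven compatible so that a deeply nested type that
// occurs many times is only walked once.
class CompatChecker {
  std::unordered_set<const ToyType *> KnownCompatible;

public:
  unsigned Visits = 0; // number of uncached checks, for illustration only

  bool isCompatible(const ToyType *T) {
    assert(T && "expected a non-null type");
    if (KnownCompatible.count(T))
      return true; // cache hit: skip the recursive walk

    ++Visits;
    bool Result = T->Elements.empty() ? T->LeafCompatible : true;
    for (const ToyType *E : T->Elements) {
      if (!isCompatible(E)) {
        Result = false;
        break;
      }
    }
    if (Result)
      KnownCompatible.insert(T);
    return Result;
  }
};
```

For a type that contains the same nested struct twice, the second occurrence is answered from the cache, so `Visits` grows with the number of distinct types rather than the number of occurrences.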
Differential Revision: https://reviews.llvm.org/D127918
Min-Yih Hsu [Mon, 23 May 2022 20:52:38 +0000 (13:52 -0700)]
[mlir][LLVMIR] Add support for va_start/copy/end intrinsics
This patch adds three new LLVM intrinsic operations, llvm.intr.vastart/copy/end,
and their translation from LLVM IR.
This effectively removes a restriction, imposed by 0126dcf1f0a1, where
non-external functions in LLVM dialect cannot be variadic. At that time
it was not clear how LLVM intrinsics are going to be modeled, which
indirectly affects va_start/copy/end, the core intrinsics used in
variadic functions. But since we have LLVM intrinsics as normal
MLIR operations, it's not a problem anymore.
Differential Revision: https://reviews.llvm.org/D127540
Snehasish Kumar [Fri, 24 Jun 2022 23:34:18 +0000 (16:34 -0700)]
[memprof] Return an error for unsupported symbolization.
Add a check to detect that the profiled binary was built with position
independent code. Add a test with a pie binary that can be reused
later when support is added. Also clean up the error messages with
trailing colons.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D128564
Michał Górny [Mon, 27 Jun 2022 16:22:15 +0000 (18:22 +0200)]
[lldb] [llgs] Skip new vCont test on Windows
Sponsored by: The FreeBSD Foundation
gbreynoo [Mon, 27 Jun 2022 16:10:11 +0000 (17:10 +0100)]
[llvm-ar] Fix MRI ADDLIB command when used with thin archives
We did not properly handle using CREATETHIN in an MRI script and
attempting to use ADDLIB to add the contents of a regular archive. This
fix outputs a meaningful error message in this case and provides some
more testing.
Differential Revision: https://reviews.llvm.org/D128067
Aaron Ballman [Mon, 27 Jun 2022 16:02:34 +0000 (12:02 -0400)]
Silence an "illegal conversion" diagnostic
MSVC was issuing "illegal conversion; more than one user-defined
conversion has been implicitly applied" as a warning on this code.
Explicitly calling .str() causes a StringRef to be materialized so
that a second user-defined conversion is not required.
Chi Chun Chen [Mon, 27 Jun 2022 16:02:39 +0000 (11:02 -0500)]
[Clang][OpenMP] Claim nowait clause on taskwait
Aaron Ballman [Mon, 27 Jun 2022 15:52:41 +0000 (11:52 -0400)]
Silence some format specifier warnings
This solves a format specifier warning for char32_t being converted to
an unsigned integer type, and multiple format specifier warnings for
size_t being converted to long.
Mark de Wever [Mon, 27 Jun 2022 15:42:37 +0000 (17:42 +0200)]
[libc++][doc] Fixes a broken table entry.
Jay Foad [Fri, 24 Jun 2022 12:26:50 +0000 (13:26 +0100)]
[AMDGPU] Cluster stores as well as loads for GFX11
Differential Revision: https://reviews.llvm.org/D128517
Frederic Cambus [Sun, 26 Jun 2022 22:26:22 +0000 (00:26 +0200)]
[Driver][test] Add libclang_rt.profile{{.*}}.a tests for NetBSD
Differential Revision: https://reviews.llvm.org/D128620
Ritanya B Bharadwaj [Mon, 27 Jun 2022 15:35:30 +0000 (10:35 -0500)]
Adding support for target in_reduction
Implementing target in_reduction by wrapping target task with host task with in_reduction and if clause. This is in compliance with OpenMP 5.0 section: 2.19.5.6.
So, this
```
for (int i=0; i<N; i++) {
res = res+i;
}
```
will become
```
#pragma omp task in_reduction(+:res) if(0)
#pragma omp target map(res)
for (int i=0; i<N; i++) {
res = res+i;
}
```
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D125669
Michał Górny [Fri, 3 Jun 2022 18:21:26 +0000 (20:21 +0200)]
[lldb] [llgs] Support "t" vCont action
Implement support for the "t" action that is used to stop a thread.
Normally this action is used only in non-stop mode. However, there's
no technical reason why it couldn't be also used in all-stop mode,
e.g. to express "resume all threads except ..." (`t:...;c`).
While at it, add a more complete test for vCont correctly resuming
a subset of program's threads.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.llvm.org/D126983
Andrzej Warzynski [Fri, 17 Jun 2022 13:08:42 +0000 (13:08 +0000)]
[flang][driver] Use `-O{0|1|2|3}` to define LLVM backend pass pipeline
Support for optimisation flags in LLVM Flang was originally added in
https://reviews.llvm.org/D128043. That patch focused on LLVM
middle-end/optimisation pipelines. With this patch, Flang will
additionally configure LLVM backend pass pipelines accordingly. This
behavior is consistent with Clang.
New hook is added to translate compiler optimisation flags (e.g. `-O3`)
into backend optimisation level: `getCGOptLevel`. Identical hooks are
available in Clang and LLVM. In other words, the meaning of these
optimisation flags remains consistent with other sub-projects that use
LLVM backends.
Differential Revision: https://reviews.llvm.org/D128050
Matthias Springer [Mon, 27 Jun 2022 15:02:45 +0000 (17:02 +0200)]
[mlir][SCF][bufferize] Small simplification and more comments
Differential Revision: https://reviews.llvm.org/D128651
Nikita Popov [Fri, 24 Jun 2022 14:02:28 +0000 (16:02 +0200)]
[GlobalOpt] Fix memset handling in global ctor evaluation (PR55859)
The global ctor evaluator currently handles memset by checking whether the
memset memory is already zero, and skips it in that case. However,
it only actually checks the first byte of the memory being set.
This patch extends the code to check all bytes being set. This is
done byte-by-byte to avoid converting undef values to zeros in
larger reads. However, the handling is still not completely correct,
because there might still be padding bytes (though probably this
doesn't matter much in practice, as I'd expect global variable
padding to be zero-initialized in practice).
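The difference between checking only the first byte and checking every byte can be sketched with a toy memory model. This is not the evaluator's code: `ToyByte` and `sketchRangeIsKnownZero` are hypothetical, an empty `std::optional` models an undef byte, and treating undef as "not known to be zero" is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of one byte of a global initializer; std::nullopt
// stands for an undef byte.
using ToyByte = std::optional<uint8_t>;

// Returns true only if every byte in [Offset, Offset + Len) is a known zero.
// Each byte is inspected on its own, so undef bytes are never folded into a
// wider zero read.
bool sketchRangeIsKnownZero(const std::vector<ToyByte> &Mem, size_t Offset,
                            size_t Len) {
  assert(Offset <= Mem.size() && Len <= Mem.size() - Offset &&
         "range must lie inside the modelled memory");
  for (size_t I = 0; I < Len; ++I) {
    const ToyByte &B = Mem[Offset + I];
    if (!B || *B != 0)
      return false;
  }
  return true;
}
```

A check that looked only at `Mem[Offset]` would wrongly accept a range whose first byte is zero but whose later bytes are not.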
Mostly fixes https://github.com/llvm/llvm-project/issues/55859.
Differential Revision: https://reviews.llvm.org/D128532
Lewuathe [Mon, 27 Jun 2022 12:29:15 +0000 (14:29 +0200)]
[mlir][complex] complex.arg op to calculate the angle of complex number
Add complex.arg op which calculates the angle of complex number. The op name is inspired by the function carg in libm.
See: https://sourceware.org/newlib/libm.html#carg
Differential Revision: https://reviews.llvm.org/D128531
Nikita Popov [Mon, 27 Jun 2022 14:36:10 +0000 (16:36 +0200)]
[GlobalOpt] Add tests for memset with non-zero value (NFC)
Matthias Springer [Mon, 27 Jun 2022 14:28:38 +0000 (16:28 +0200)]
[mlir][bufferize] Infer memory space in all bufferization patterns
This change updates all remaining bufferization patterns (except for scf.while) and the remaining bufferization infrastructure to infer the memory space whenever possible instead of falling back to "0". (If a default memory space is set in the bufferization options, we still fall back to that value if the memory space could not be inferred.)
Differential Revision: https://reviews.llvm.org/D128423
Than McIntosh [Mon, 27 Jun 2022 13:28:18 +0000 (09:28 -0400)]
tsan: add missing guard for DumpProcessMap call
Add a missing "#if !SANITIZER_GO" guard for a call to DumpProcessMap
in the Finalize hook (needed to build an updated Go race detector syso
image).
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D128641
Matthias Springer [Mon, 27 Jun 2022 13:59:00 +0000 (15:59 +0200)]
[mlir][bufferize][NFC] Change signature of allocateTensorForShapedValue
Add a failure return value and bufferization options argument. This is to keep a subsequent change smaller.
Differential Revision: https://reviews.llvm.org/D128278
Phoebe Wang [Mon, 27 Jun 2022 13:02:57 +0000 (21:02 +0800)]
[X86] Support `_Float16` on SSE2 and up
This is split from D113107 to address #56204 and https://discourse.llvm.org/t/how-to-build-compiler-rt-for-new-x86-half-float-abi/63366
Reviewed By: zahiraam, rjmccall, bkramer
Differential Revision: https://reviews.llvm.org/D128571
Louis Dionne [Mon, 27 Jun 2022 13:36:52 +0000 (09:36 -0400)]
[libc++][NFC] Remove trailing whitespace
Florian Hahn [Mon, 27 Jun 2022 13:17:53 +0000 (14:17 +0100)]
[Clang] Remove unused function declaration after 77475ffd22418ca72.
Louis Dionne [Mon, 27 Jun 2022 13:15:28 +0000 (09:15 -0400)]
[libc++] Remove dummy command in Dockerfile
It turns out that the Docker images on CI instances are not updated
based on what's in this file, but instead when a new image is pushed
to ldionne/libcxx-builder on DockerHub. So this is effectively useless.
Javier Setoain [Mon, 6 Jun 2022 10:55:04 +0000 (11:55 +0100)]
[mlir][llvm] Add vector insert/extract intrinsics
These intrinsics will be needed to convert between fixed-length vectors
and scalable vectors.
This operation will be needed for VLS (vector-length specific)
vectorization, when interfacing with vector functions or intrinsics that
take scalable vectors as operands in a context where the length of our
vectors is known or assumed at compile time, but we still want to
generate scalable vector instructions.
Differential Revision: https://reviews.llvm.org/D127100
Koakuma [Mon, 27 Jun 2022 12:52:10 +0000 (14:52 +0200)]
[SPARC] Don't do leaf optimization on procedures with inline assembly
On SPARC, leaf function optimization omits the register window sliding (and the associated register name changes). This might result in miscompilation of procedures containing inline assembly, as some of the register constraints used may interfere with the register usage of optimized functions, so we disable leaf procedure optimization on those procedures to prevent it from happening.
This is a continuation of patch D102342 by @LemonBoy, the original comment is reproduced below:
> Leaf functions allow the compiler to omit the setup and teardown of a frame pointer, therefore avoiding the exchange of the in/out register. According to the SPARC architecture manual every reference to %i0-%i5 should be replaced with %o0-%o5, if the target register is already in use a further remapping step to %g1-%g7 is required to free the output register.
>
> Add a simple check to make sure not to stomp on any output register that's already in use.
Reviewed By: dcederman
Differential Revision: https://reviews.llvm.org/D128263
Lucas Prates [Fri, 6 May 2022 09:31:11 +0000 (10:31 +0100)]
[ARM][Thumb] Command-line option to ensure AAPCS compliant Frame Records
Currently an AAPCS compliant frame record is not always created for
functions when it should be. Although a consistent frame record might not
be required in some cases, there are still scenarios where applications
may want to make use of the call hierarchy made available through it.
In order to enable the use of AAPCS compliant frame records whilst keeping
backwards compatibility, this patch introduces a new command-line option
(`-mframe-chain=[none|aapcs|aapcs+leaf]`) for AArch32 and Thumb backends.
The option allows users to explicitly select when to use it, and is also
useful to ensure the extra overhead introduced by the frame records is
only introduced when necessary, in particular for Thumb targets.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125094
Valentin Clement [Mon, 27 Jun 2022 13:00:07 +0000 (15:00 +0200)]
[flang][NFC] Add IO lowering tests
These tests were left behind or only partially upstreamed during
the lower code upstreaming.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D128634
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Tim Northover [Mon, 27 Jun 2022 12:59:35 +0000 (13:59 +0100)]
ARM: don't try to load function pointer before long call.
Deciding to load an arbitrary global based on whether the entire module is
being built for long calls is pretty clearly spurious, and in fact the existing
indirect logic is sufficient.
Nikita Popov [Mon, 27 Jun 2022 12:54:03 +0000 (14:54 +0200)]
[IndVars] Add test for PR56242 (NFC)
Matt Arsenault [Wed, 22 Jun 2022 23:04:07 +0000 (19:04 -0400)]
MIR: Fix parse error on empty CustomRegMask
LLVM GN Syncbot [Mon, 27 Jun 2022 12:35:34 +0000 (12:35 +0000)]
[gn build] Port 633d1d0df766
Louis Dionne [Mon, 6 Jun 2022 18:01:38 +0000 (14:01 -0400)]
[libc++] Use bounded iterators in std::span when the debug mode is enabled
Previously, we'd use raw pointers when the debug mode was enabled,
which means we wouldn't get out-of-range checking with std::span's
iterators.
This patch introduces a new class called __bounded_iter which can
be used to wrap iterators and make them carry around bounds-related
information. This allows iterators to assert when they are dereferenced
outside of their bounds.
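The general idea of an iterator that carries its bounds can be sketched as follows. This is not libc++'s `__bounded_iter`; `BoundedPtr` is a hypothetical, much reduced wrapper around a raw pointer, written only to show where the checks go.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper: a pointer that remembers the [Begin, End) range it
// belongs to and asserts when dereferenced or advanced outside of it.
template <typename T> class BoundedPtr {
  T *Cur;
  T *Begin;
  T *End;

public:
  BoundedPtr(T *Current, T *First, T *Last)
      : Cur(Current), Begin(First), End(Last) {
    assert(Begin <= Cur && Cur <= End && "iterator must start inside range");
  }

  // True when the iterator points at an element, i.e. not at End.
  bool dereferenceable() const { return Begin <= Cur && Cur < End; }

  T &operator*() const {
    assert(dereferenceable() && "dereferencing out-of-bounds iterator");
    return *Cur;
  }

  BoundedPtr &operator++() {
    assert(Cur < End && "incrementing past the end of the range");
    ++Cur;
    return *this;
  }

  friend bool operator==(const BoundedPtr &A, const BoundedPtr &B) {
    return A.Cur == B.Cur;
  }
  friend bool operator!=(const BoundedPtr &A, const BoundedPtr &B) {
    return !(A == B);
  }
};
```

Because `BoundedPtr<int>` is a distinct class type rather than `int *`, code using it cannot accidentally rely on raw-pointer properties, which is the second benefit mentioned above.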
As a fly-by change, this commit removes the _LIBCPP_ABI_SPAN_POINTER_ITERATORS
knob. Indeed, not using a raw pointer as the iterator type is useful to
avoid users depending on properties of raw pointers in their code.
This is an alternative to D127401.
Differential Revision: https://reviews.llvm.org/D127418
Louis Dionne [Thu, 23 Jun 2022 19:09:52 +0000 (15:09 -0400)]
[libc++] Improve Lit's buildhost=XXXX feature on a few platforms
Differential Revision: https://reviews.llvm.org/D128455
Valentin Clement [Mon, 27 Jun 2022 12:19:23 +0000 (14:19 +0200)]
[flang][NFC] Add array lowering tests
These tests were left behind during the upstreaming of parts lowering.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D128632
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Wei Yi Tee [Mon, 27 Jun 2022 12:14:01 +0000 (14:14 +0200)]
[clang][dataflow] Singleton pointer values for null pointers.
When a `nullptr` is assigned to a pointer variable, it is wrapped in a `ImplicitCastExpr` with cast kind `CK_NullTo(Member)Pointer`. This patch assigns singleton pointer values representing null to these expressions.
For each pointee type, a singleton null `PointerValue` is created and stored in the `NullPointerVals` map of the `DataflowAnalysisContext` class. The pointee type is retrieved from the implicit cast expression, and used to initialise the `PointeeLoc` field of the `PointerValue`. The `PointeeLoc` created is not mapped to any `Value`, reflecting the absence of value indicated by null pointers.
Reviewed By: gribozavr2, sgatev, xazax.hun
Differential Revision: https://reviews.llvm.org/D128056
Nicolas Vasilache [Fri, 24 Jun 2022 09:26:22 +0000 (02:26 -0700)]
[SCF] Add thread_dim_mapping attribute to scf.foreach_thread
An optional thread_dim_mapping index array attribute specifies for each
virtual thread dimension, how it remaps 1-1 to a set of concrete processing
element resources (e.g. a CUDA grid dimension or a level of concrete nested
async parallelism). At this time, the specification is backend-dependent and
is not verified by the op, beyond being an index array attribute.
It is the responsibility of the lowering to interpret the index array in the
context of the concrete target the op is lowered to, or to ignore it when
the specification is ill-formed or unsupported for a particular target.
Differential Revision: https://reviews.llvm.org/D128633
Matthias Springer [Mon, 27 Jun 2022 11:46:37 +0000 (13:46 +0200)]
[mlir][bufferization][NFC] Add error handling to getBuffer
This is in preparation of adding memory space support.
Differential Revision: https://reviews.llvm.org/D128277
Matthias Springer [Mon, 27 Jun 2022 11:41:18 +0000 (13:41 +0200)]
[mlir][bufferization][NFC] Fix typo in AllocTensorOp builders
Matthias Springer [Mon, 27 Jun 2022 11:25:52 +0000 (13:25 +0200)]
[mlir][SCF][bufferize][NFC] Bufferize scf.for terminator separately
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.
Differential Revision: https://reviews.llvm.org/D128422
Matthias Springer [Mon, 27 Jun 2022 11:17:45 +0000 (13:17 +0200)]
[mlir][SCF][bufferize] Bufferize scf.if/execute_region terminators separately
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.
Differential Revision: https://reviews.llvm.org/D128581
Matthias Springer [Mon, 27 Jun 2022 11:10:00 +0000 (13:10 +0200)]
[mlir][SCF][bufferize][NFC] Bufferize parallel_insert_slice separately
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.
Differential Revision: https://reviews.llvm.org/D128580
Jay Foad [Mon, 27 Jun 2022 11:15:25 +0000 (12:15 +0100)]
[AMDGPU] Regenerate MIR checks. NFC.
Aaron Ballman [Mon, 27 Jun 2022 11:12:36 +0000 (07:12 -0400)]
Fix clang docs build; NFC
This should address the break from:
https://lab.llvm.org/buildbot/#/builders/92/builds/28769
Matthias Springer [Mon, 27 Jun 2022 10:52:59 +0000 (12:52 +0200)]
[mlir][shape][bufferize][NFC] Bufferize block terminators separately
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.
Differential Revision: https://reviews.llvm.org/D128579
Dmitry Preobrazhensky [Mon, 27 Jun 2022 11:01:09 +0000 (14:01 +0300)]
[AMDGPU][GFX9][DOC][NFC] Update assembler syntax description
Summary of changes:
- Updated MUBUF lds syntax (see https://reviews.llvm.org/D124485).
- Updated SMEM syntax (see https://reviews.llvm.org/D127314).
- Enabled src0=literal for v_madak*, v_madmk* (see https://reviews.llvm.org/D111067).
- Removed SYSMSG_OP_HOST_TRAP_ACK message.
- Minor bug fixing and improvements.
Edd Barrett [Mon, 27 Jun 2022 10:53:38 +0000 (11:53 +0100)]
[STACKMAPS] Document+test UINT64_MAX stack size.
When a function does a dynamic stack allocation, the function's stack
size (in the stack map) is reported as UINT64_MAX.
This change tests and documents this property.
Differential Revision: https://reviews.llvm.org/D128525
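A consumer of the stack map presumably has to treat this value as a sentinel rather than as a real frame size. A minimal sketch of such a check follows; the `FunctionRecord` struct and `hasStaticStackSize` helper are hypothetical names used only for illustration and are not part of LLVM.

```cpp
#include <cstdint>

// Hypothetical in-memory mirror of one per-function stack map entry.
// The field names are illustrative; they are not taken from an LLVM header.
struct FunctionRecord {
  std::uint64_t FunctionAddress;
  std::uint64_t StackSize;
  std::uint64_t RecordCount;
};

// Returns true when the recorded stack size is a real, statically known
// size, and false when it is the UINT64_MAX sentinel that is reported for
// functions performing a dynamic stack allocation.
bool hasStaticStackSize(const FunctionRecord &R) {
  return R.StackSize != UINT64_MAX;
}
```

A reader that ignores the sentinel would interpret the frame as being close to 16 EiB, so checking before any arithmetic on the size seems prudent.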
Bradley Smith [Thu, 16 Jun 2022 14:45:28 +0000 (14:45 +0000)]
[IR] Move vector.insert/vector.extract out of experimental namespace
These intrinsics are now fundamental for SVE code generation and have been
present for a year and a half, hence move them out of the experimental
namespace.
Differential Revision: https://reviews.llvm.org/D127976
Simon Pilgrim [Mon, 27 Jun 2022 10:46:42 +0000 (11:46 +0100)]
[X86] combineConcatVectorOps - IsConcatFree must check extraction index
Identified in the regression reported by @alexfh on rGb5d7beeb9792 - IsConcatFree wasn't ensuring the subvector extraction index matched the position it would be concatenated back into.
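The condition described above can be modelled outside of SelectionDAG. The sketch below is a simplified standalone illustration, not the X86 backend code: `ExtractedSubvector` and this `isConcatFree` are hypothetical, and each concat operand is assumed to be a subvector extracted from some source vector at a known element index.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of one concat operand: a subvector taken from the
// source vector identified by SourceId, starting at element ExtractIndex.
struct ExtractedSubvector {
  int SourceId;
  std::size_t ExtractIndex;
};

// The concatenation only reproduces the original source vector if every
// operand comes from the same source AND operand I was extracted from the
// position it is concatenated back into, i.e. element I * NumSubElts.
bool isConcatFree(const std::vector<ExtractedSubvector> &Ops,
                  std::size_t NumSubElts) {
  if (Ops.empty())
    return false;
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    if (Ops[I].SourceId != Ops[0].SourceId)
      return false;
    if (Ops[I].ExtractIndex != I * NumSubElts)
      return false;
  }
  return true;
}
```

With two 4-element halves, operands extracted at indices 0 and 4 are accepted, while the swapped order (4, then 0) is rejected because the concatenation would reorder the halves.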
Matthias Springer [Mon, 27 Jun 2022 10:42:07 +0000 (12:42 +0200)]
[mlir][bufferization][NFC] Bufferize with PostOrder traversal
This is useful because the result type of an op can sometimes be inferred from its body (e.g., `scf.if`). This will be utilized in subsequent changes.
Also introduces a new `getBufferType` interface method on BufferizableOpInterface. This method is useful for computing a bufferized block argument type with respect to OpOperand types of the parent op.
Differential Revision: https://reviews.llvm.org/D128420
Jolanta Jensen [Mon, 13 Jun 2022 13:37:19 +0000 (14:37 +0100)]
[AArch64] Define __FP_FAST_FMA[F]
Libraries use this flag to decide whether to use the fma builtin.
Author: Paul Walker
Differential Revision: https://reviews.llvm.org/D127655
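As an illustration of how a library might consume these macros, the sketch below selects the fused operation only when the compiler advertises it as fast. The helper names `mulAdd` and `mulAddF` are hypothetical; only the `__FP_FAST_FMA` and `__FP_FAST_FMAF` macro names come from the change itself.

```cpp
#include <cmath>

// Computes A * B + C for double, using the fused operation only when the
// compiler defines __FP_FAST_FMA.
double mulAdd(double A, double B, double C) {
#ifdef __FP_FAST_FMA
  return std::fma(A, B, C); // fused: a single rounding step
#else
  return A * B + C;         // separate multiply and add
#endif
}

// Same idea for float, keyed on __FP_FAST_FMAF.
float mulAddF(float A, float B, float C) {
#ifdef __FP_FAST_FMAF
  return std::fma(A, B, C); // resolves to the float overload of std::fma
#else
  return A * B + C;
#endif
}
```

Both branches agree whenever the product is exactly representable; they can differ in the last bit otherwise, which is why libraries gate the choice on the macro rather than calling `fma` unconditionally.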
Matthias Springer [Mon, 27 Jun 2022 10:31:55 +0000 (12:31 +0200)]
[mlir][bufferization] Add `memory_space` op attribute
This attribute is currently supported on AllocTensorOp only. Future changes will add support to other ops. Furthermore, the memory space is not propagated properly in all bufferization patterns and some of the core bufferization infrastructure. This will be addressed in a subsequent change.
Differential Revision: https://reviews.llvm.org/D128274
gbreynoo [Mon, 27 Jun 2022 10:08:36 +0000 (11:08 +0100)]
[llvm-ar] Improve MRI script CREATE command handling
I discovered that when compared to GNU the llvm-ar MRI script parsing of
CREATE could lead to some strange behaviour. This fix improves the error
message in the case when no archive name is given and will not allow the
adding of members until CREATE is called. Along with this change I added
more testing of the CREATE command.
Differential Revision: https://reviews.llvm.org/D128055
Andrzej Warzynski [Mon, 6 Jun 2022 09:44:21 +0000 (09:44 +0000)]
[flang][driver] Add support for `-O{0|1|2|3}`
This patch adds support for the most common optimisation compiler flags:
`-O{0|1|2|3}`. This is implemented in both the compiler and frontend
drivers. At this point, these options are only used to configure the
LLVM optimisation pipelines (aka middle-end). LLVM backend or MLIR/FIR
optimisations are not supported yet.
Previously, the middle-end pass manager was only required when
generating LLVM bitcode (i.e. for `flang-new -c -emit-llvm <file>` or
`flang-new -fc1 -emit-llvm-bc <file>`). With this change, it becomes
required for all frontend actions that are represented as
`CodeGenAction` and `CodeGenAction::executeAction` is refactored
accordingly (in the spirit of better code re-use).
Additionally, the `-fdebug-pass-manager` option is enabled to facilitate
testing. This flag can be used to configure the pass manager to print
the middle-end passes that are being run. A similar option exists in Clang
and the semantics in Flang are identical. This option translates to
extra configuration when setting up the pass manager. This is
implemented in `CodeGenAction::runOptimizationPipeline`.
This patch also adds some boilerplate code to manage code-gen options
("code-gen" refers to generating machine code in LLVM in this context).
This was extracted from Clang. In Clang, it simplifies defining code-gen
options and enables option marshalling. In Flang, option marshalling is
not yet supported (we might do at some point), but being able to
auto-generate some code with macros is beneficial. This will become
particularly apparent when we start adding more options (at least in
Clang, the list of code-gen options is rather long).
Differential Revision: https://reviews.llvm.org/D128043
Wei Yi Tee [Mon, 27 Jun 2022 09:18:01 +0000 (11:18 +0200)]
[clang][dataflow] Implement functionality for flow condition variable substitution.
This patch introduces `buildAndSubstituteFlowCondition` - given a flow condition token, this function returns the expression of constraints defining the flow condition, with values substituted where specified.
As an example:
Say we have tokens `FC1`, `FC2`, `FC3`:
```
FlowConditionConstraints: {
FC1: C1,
FC2: C2,
FC3: (FC1 v FC2) ^ C3,
}
```
`buildAndSubstituteFlowCondition(FC3, /*Substitutions:*/{{C1 -> C1'}})`
returns a value corresponding to `(C1' v C2) ^ C3`.
Note:
This function returns the flow condition expressed directly as its constraints, which differs from how we currently represent the flow condition as a token bound to a set of constraints and dependencies. Making the representation consistent may be an option to consider in the future.
Depends On D128357
Reviewed By: gribozavr2, xazax.hun
Differential Revision: https://reviews.llvm.org/D128363
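The `FC1`/`FC2`/`FC3` example above can be reproduced with a small standalone model. This is a sketch under stated assumptions, not the dataflow framework's implementation: `Formula`, `atom`, `token`, `conj`, `disj` and `buildAndSubstitute` are hypothetical names, formulas are rendered as strings, and substitutions map atom names to replacement names.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

struct Formula;
using FormulaPtr = std::shared_ptr<Formula>;

// Hypothetical formula node: an atomic constraint, a flow condition token,
// or a binary conjunction/disjunction.
struct Formula {
  enum Kind { Atom, Token, And, Or };
  Kind K;
  std::string Name;    // used by Atom and Token
  FormulaPtr LHS, RHS; // used by And and Or
};

FormulaPtr make(Formula::Kind K, std::string Name, FormulaPtr L = nullptr,
                FormulaPtr R = nullptr) {
  return std::make_shared<Formula>(
      Formula{K, std::move(Name), std::move(L), std::move(R)});
}
FormulaPtr atom(std::string N) { return make(Formula::Atom, std::move(N)); }
FormulaPtr token(std::string N) { return make(Formula::Token, std::move(N)); }
FormulaPtr conj(FormulaPtr L, FormulaPtr R) {
  return make(Formula::And, "", std::move(L), std::move(R));
}
FormulaPtr disj(FormulaPtr L, FormulaPtr R) {
  return make(Formula::Or, "", std::move(L), std::move(R));
}

// Expands tokens into their defining constraints and applies substitutions
// to atoms, rendering the result as a fully parenthesised string.
std::string
buildAndSubstitute(const FormulaPtr &F,
                   const std::map<std::string, FormulaPtr> &Constraints,
                   const std::map<std::string, std::string> &Substitutions) {
  switch (F->K) {
  case Formula::Atom: {
    auto It = Substitutions.find(F->Name);
    return It != Substitutions.end() ? It->second : F->Name;
  }
  case Formula::Token:
    // A token is replaced by the constraints that define it.
    return buildAndSubstitute(Constraints.at(F->Name), Constraints,
                              Substitutions);
  case Formula::And:
    return "(" + buildAndSubstitute(F->LHS, Constraints, Substitutions) +
           " ^ " + buildAndSubstitute(F->RHS, Constraints, Substitutions) +
           ")";
  case Formula::Or:
    return "(" + buildAndSubstitute(F->LHS, Constraints, Substitutions) +
           " v " + buildAndSubstitute(F->RHS, Constraints, Substitutions) +
           ")";
  }
  return "";
}
```

Building `FC3` as `conj(disj(token("FC1"), token("FC2")), atom("C3"))` and substituting `C1 -> C1'` yields the string `((C1' v C2) ^ C3)` in this model, matching the expression in the commit message up to the extra outer parentheses.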
Andrzej Warzynski [Thu, 23 Jun 2022 09:29:12 +0000 (09:29 +0000)]
[flang] Update the release notes
Document changes introduced in https://reviews.llvm.org/D126164.
Differential Revision: https://reviews.llvm.org/D128413
Wei Yi Tee [Mon, 27 Jun 2022 09:12:37 +0000 (11:12 +0200)]
[clang][dataflow] Move logic for `createStorageLocation` from `DataflowEnvironment` to `DataflowAnalysisContext`.
`createStorageLocation` in `DataflowEnvironment` is now a trivial wrapper around the logic in `DataflowAnalysisContext`.
Additionally, `getObjectFields` and `getFieldsFromClassHierarchy` (required for the implementation of `createStorageLocation`) are also moved to `DataflowAnalysisContext`.
Reviewed By: gribozavr2, sgatev
Differential Revision: https://reviews.llvm.org/D128359
Siva Chandra Reddy [Fri, 24 Jun 2022 17:45:02 +0000 (17:45 +0000)]
[libc] Add a simple arm32 config.
This will be expanded in future as more functions are brought up on arm32.
Sven van Haastregt [Mon, 27 Jun 2022 08:55:44 +0000 (09:55 +0100)]
[OpenCL] Reduce emitting candidate notes for builtins
When overload resolution fails, clang emits a note diagnostic for each
candidate. For OpenCL builtins this often leads to many repeated note
diagnostics with no new information. Stop emitting such notes.
Update a test that was relying on counting those notes to check how
many builtins are available for certain extension configurations.
Differential Revision: https://reviews.llvm.org/D127961
Nikita Popov [Mon, 27 Jun 2022 08:50:33 +0000 (10:50 +0200)]
[SCEV] Assert that GEP source element type is sized (NFC)
This is checked by the IR verifier, so replace the condition with
an assert.
Jay Foad [Thu, 23 Jun 2022 12:35:02 +0000 (13:35 +0100)]
[AMDGPU] Fix assertion failure on mad with negative immediate addend
Without this, the new test case would fail with:
AMDGPUInstPrinter.cpp:545: void llvm::AMDGPUInstPrinter::printImmediate64(uint64_t, const llvm::MCSubtargetInfo &, llvm::raw_ostream &): Assertion `isUInt<32>(Imm) || Imm == 0x3fc45f306dc9c882' failed.
Differential Revision: https://reviews.llvm.org/D128435
Siva Chandra Reddy [Sat, 25 Jun 2022 08:28:57 +0000 (08:28 +0000)]
[libc][NFC] Make the support thread library an object library.
It was previously a header library. Making it an object library will
allow us to declare thread local variables which can be used to set up a
thread's self object.
Matthias Springer [Mon, 27 Jun 2022 08:01:58 +0000 (10:01 +0200)]
[mlir][bufferization][NFC] Change signature of getMemRefType
These functions now accept unsigned attributes for address spaces instead of Attributes.
Differential Revision: https://reviews.llvm.org/D128275
Simon Tatham [Mon, 27 Jun 2022 08:36:20 +0000 (09:36 +0100)]
[libunwind,EHABI,ARM] Fix get/set of RA_AUTH_CODE.
According to EHABI32 §8.5.2, the PAC for the return address of a
function described in an exception table is supposed to be addressed
in the _Unwind_VRS_{Get,Set} API by setting regclass=_UVRSC_PSEUDO and
regno=0. (The space of 'regno' values is independent for each
regclass, and for _UVRSC_PSEUDO, there is only one valid regno so far.)
That is indeed what libunwind's _Unwind_VRS_{Get,Set} functions expect
to receive. But at two call sites, the wrong values are passed in:
regno is being set to UNW_ARM_RA_AUTH_CODE (0x8F) instead of 0, and in
one case, regclass is _UVRSC_CORE instead of _UVRSC_PSEUDO.
As a result, those calls to _Unwind_VRS_{Get,Set} return
_UVRSR_FAILED, which their callers ignore. So if you compile in the
AUTG instruction that actually validates the PAC, it will try to
validate what's effectively an uninitialised register as an
authentication code, and trigger a CPU fault even on correct exception
unwinding.
Reviewed By: danielkiss
Differential Revision: https://reviews.llvm.org/D128522
Florian Hahn [Mon, 27 Jun 2022 08:33:04 +0000 (09:33 +0100)]
[SCEV] Use SCEVUnknown(poison) instead of SCEVUnknown(undef).
Use poison instead of undef for SCEVUnknown of unreachable values.
This should be in line with the movement to replace undef with poison
when possible.
Suggested in D114650.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D128586