Joseph Huber [Tue, 17 Jan 2023 20:25:29 +0000 (14:25 -0600)]
[Clang] Configure definitions for amdgpu/nvptx arch query tools
Summary:
These tools are built unconditionally now. However, there seemed to be
problems where the headers would be found during cross compilation, but
no libraries present. To combat this, have CMake indicate whether to use
the dynamic library method or link directly, rather than relying on
`__has_include`.
Joe Loser [Mon, 16 Jan 2023 21:52:16 +0000 (14:52 -0700)]
[llvm][ADT] Mark `makeMutableArrayRef` as deprecated
Now that all of the uses of `makeMutableArrayRef` are replaced in-tree with use
of deduction guides (see
https://github.com/llvm/llvm-project/commit/a288d7f937708cf67d960962bfa22ffae37ddbf4),
mark `makeMutableArrayRef` as deprecated.
Also remove the old tests for `makeMutableArrayRef` in favor of the ones
introduced with the deduction guides in
https://github.com/llvm/llvm-project/commit/38791259c1165cedfa313e06dc20e443f1e20634.
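As a rough illustration of the pattern described above, here is a minimal C++17 sketch. `MutableSpan` and `makeMutableSpan` are hypothetical stand-ins invented for this example; they are not the real `llvm::MutableArrayRef` or `makeMutableArrayRef`, and the real deduction guides may look different.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a mutable array view; not the real ADT class.
template <typename T> class MutableSpan {
public:
  MutableSpan(T *Data, std::size_t Size) : Data(Data), Size(Size) {}
  template <std::size_t N>
  MutableSpan(T (&Arr)[N]) : Data(Arr), Size(N) {}

  std::size_t size() const { return Size; }
  T &operator[](std::size_t I) const {
    assert(I < Size && "index out of range");
    return Data[I];
  }

private:
  T *Data;
  std::size_t Size;
};

// Deduction guide: callers can write MutableSpan(Arr) without naming T.
template <typename T, std::size_t N>
MutableSpan(T (&)[N]) -> MutableSpan<T>;

// Old-style helper, kept only to show where a deprecation marker would go.
template <typename T, std::size_t N>
[[deprecated("use the MutableSpan deduction guide instead")]]
MutableSpan<T> makeMutableSpan(T (&Arr)[N]) {
  return MutableSpan<T>(Arr);
}

// Doubles every element through a view built via the deduction guide
// and returns the sum of the doubled values.
inline int doubleAllAndSum(int (&Arr)[3]) {
  MutableSpan View(Arr); // deduces MutableSpan<int>
  int Sum = 0;
  for (std::size_t I = 0; I < View.size(); ++I) {
    View[I] *= 2;
    Sum += View[I];
  }
  return Sum;
}
```

With a guide in place the factory function adds nothing, which is presumably why it can be deprecated once no callers remain.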
Differential Revision: https://reviews.llvm.org/D141872
Sanjay Patel [Tue, 17 Jan 2023 19:37:18 +0000 (14:37 -0500)]
[InstCombine] factor difference-of-squares to reduce multiplication
(X * X) - (Y * Y) --> (X + Y) * (X - Y)
https://alive2.llvm.org/ce/z/BAuRCf
The no-wrap propagation could be relaxed in some cases,
but there does not seem to be an obvious rule for that.
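The identity behind this fold can be checked with a small sketch, assuming 32-bit unsigned arithmetic where wrap-around is well defined. The function names are made up for this example and do not correspond to anything in InstCombine.

```cpp
#include <cassert>
#include <cstdint>

// Original form: two multiplications and one subtraction.
inline std::uint32_t diffOfSquares(std::uint32_t X, std::uint32_t Y) {
  return X * X - Y * Y;
}

// Factored form: one multiplication, one addition, one subtraction.
inline std::uint32_t diffOfSquaresFactored(std::uint32_t X, std::uint32_t Y) {
  return (X + Y) * (X - Y);
}

// Asserts that both forms agree for one pair of inputs (modulo 2^32).
inline void checkFormsAgree(std::uint32_t X, std::uint32_t Y) {
  assert(diffOfSquares(X, Y) == diffOfSquaresFactored(X, Y));
}
```

The two forms agree modulo 2^32 even when intermediate results wrap. The no-wrap flags are a separate question, which is what the remark about propagation refers to.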
Sanjay Patel [Tue, 17 Jan 2023 18:32:50 +0000 (13:32 -0500)]
[InstCombine] add tests for difference-of-squares; NFC
Craig Topper [Tue, 17 Jan 2023 19:50:54 +0000 (11:50 -0800)]
[RISCV] Remove MCRegisterInfo dependency from compressInst/uncompressInst/isCompressibleInst.
This was being used to lookup the register class for a register number,
but those live in a tablegened array. We can index that array directly
just like RISCVAsmParser does.
Differential Revision: https://reviews.llvm.org/D141951
Sergei Barannikov [Sun, 15 Jan 2023 10:02:26 +0000 (13:02 +0300)]
[MC] Use MCRegister instead of unsigned in MCInstPrinter (NFC)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D140654
Craig Topper [Tue, 17 Jan 2023 19:32:08 +0000 (11:32 -0800)]
[RISCV] Use Zvl*b as a lower bound for VScaleRange.
The backend has a fatal error in RISCVSubtarget::getMinRVVVectorSizeInBits
if RVVVectorBitsMin is less than the Zvl length from -march. Now that
RVVVectorBitsMin is connected to VScaleRange in the backend, we
can trip this fatal error.
This patch adds the Zvl*b length as a lower bound to protect this.
The test is updated to test vscale-min with Zvl64b instead of V.
I'd like to do a proper diagnostic for this, but I don't think we
can do that from this function. Since -mvscale-min is an internal cc1
option, I'm not sure it's a big deal.
I'm planning to add a driver option -msve-vector-bits. I will
probably implement a diagnostic for that.
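A minimal sketch of the clamping idea, assuming vscale is expressed in units of 64-bit RVV blocks (so Zvl64b corresponds to vscale 1 and Zvl128b to vscale 2). The function and constant names are hypothetical and are not the ones used in Clang.

```cpp
#include <algorithm>
#include <cassert>

// Assumption for this sketch: vscale = VLEN / 64 for RVV.
constexpr unsigned BitsPerBlock = 64;

// Returns the vscale lower bound after applying the Zvl*b minimum.
inline unsigned effectiveVScaleMin(unsigned RequestedVScaleMin,
                                   unsigned ZvlMinVLenBits) {
  assert(ZvlMinVLenBits >= BitsPerBlock &&
         ZvlMinVLenBits % BitsPerBlock == 0 &&
         "Zvl length must be a multiple of the block size");
  unsigned ZvlVScale = ZvlMinVLenBits / BitsPerBlock;
  // Never report a minimum below what -march already guarantees.
  return std::max(RequestedVScaleMin, ZvlVScale);
}
```

A request of 1 combined with Zvl128b is raised to 2, while a request of 4 with Zvl64b stays at 4.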
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D141459
Aaron Ballman [Tue, 17 Jan 2023 19:26:29 +0000 (14:26 -0500)]
Diagnose extensions in 'offsetof'
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2350.htm made it very
clear that defining a type within offsetof is undefined behavior.
Clang supports defining a type as the first argument as a conforming
extension due to how many projects use the construct in C99 and earlier
to calculate the alignment of a type. GCC also supports defining a type
as the first argument.
This adds extension warnings and documentation for the functionality
Clang explicitly supports.
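For reference, the alignment idiom mentioned above can be written without defining a type inside offsetof. The sketch below is C++, where such an in-place definition is not allowed anyway, so the probe struct is named up front. `AlignProbe` and the helper functions are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>

// Named probe type: a char followed by T, so T's offset reflects its alignment.
template <typename T> struct AlignProbe {
  char Pad;
  T Value;
};

// Alignment of T as observed through offsetof on the named probe struct.
template <typename T> constexpr std::size_t alignViaOffsetof() {
  return offsetof(AlignProbe<T>, Value);
}

// Same query with a sanity check that the result is a power of two.
template <typename T> std::size_t checkedAlignment() {
  std::size_t A = alignViaOffsetof<T>();
  assert(A != 0 && (A & (A - 1)) == 0 && "alignment should be a power of two");
  return A;
}
```

On common ABIs this gives the same answer as `alignof` for simple types, which is why older C code used the inline form before `_Alignof` existed.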
Fixes #57065
Co-authored-by: Yingchi Long <i@lyc.dev>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Paul Robinson [Tue, 17 Jan 2023 19:26:51 +0000 (11:26 -0800)]
[PS5] Handle visibility options same as PS4
This update was missed in the initial rounds of upstreaming PS5.
Paul Robinson [Tue, 17 Jan 2023 18:47:07 +0000 (10:47 -0800)]
[PS4] NFC: rewrite a test to use lit's DEFINE feature
Preparatory to running the same test for PS5.
Frederik Gossen [Tue, 17 Jan 2023 19:07:33 +0000 (14:07 -0500)]
[MLIR] Add return type inference to scf.if builder
Differential Revision: https://reviews.llvm.org/D141928
Noah Goldstein [Tue, 17 Jan 2023 18:41:31 +0000 (10:41 -0800)]
Add additional tests for ctlz{_zero_undef} to test folding with xor; NFC
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D141549
Ashay Rane [Tue, 17 Jan 2023 18:55:02 +0000 (10:55 -0800)]
[mlir] fix dereferencing of optional sym_name attribute
`sym_name` is an optional attribute of `ModuleOp`, so it is unsafe to
fetch the underlying value without checking whether it is non-empty.
Such unsafe dereferencing causes the lower-host-to-llvm-calls_fail.mlir
test to segfault. Although this bug existed for four months, it wasn't
triggered, since previous tests executed a code path that used a default
value instead of one fetched from the module attribute.
This patch makes the code use a default value if the optional attribute
does not have a value.
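The shape of the fix can be sketched with `std::optional`. `FakeModule` and `moduleNameOrDefault` are hypothetical stand-ins for this example and do not reproduce the MLIR `ModuleOp` API.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a module whose symbol name is optional.
struct FakeModule {
  std::optional<std::string> SymName;
};

// Returns the module's name, or Default when no name is set. Reading
// *M.SymName directly would be undefined behavior for an unnamed module.
inline std::string moduleNameOrDefault(const FakeModule &M,
                                       const std::string &Default) {
  assert(!Default.empty() && "fallback name must not be empty");
  return M.SymName.value_or(Default);
}
```

The unchecked dereference only misbehaves on the unnamed path, which matches the observation that the bug stayed hidden until a test exercised it.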
Reviewed By: stella.stamenova
Differential Revision: https://reviews.llvm.org/D141941
Joseph Huber [Tue, 17 Jan 2023 15:34:49 +0000 (09:34 -0600)]
[OpenMP] Make `-Xarch_host` and `-Xarch_device` work for OpenMP offloading
Clang currently supports the `-Xarch_host` and `-Xarch_device` variants
to handle passing arguments to only one part of the offloading
toolchain. This was previously only supported fully for HIP / CUDA. This
patch simply updates the logic to make it work for any offloading kind.
Fixes #59799
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D141935
Joseph Huber [Tue, 17 Jan 2023 14:55:55 +0000 (08:55 -0600)]
[Libomptarget] Replace Nvidia arch lookup with 'nvptx-arch'
This method to look up the CUDA architecture is deprecated in newer
versions of CMake. We also have our own way to query this information
that we control now via the `nvptx-arch` program, which should always be
present in LLVM builds with clang going forward. This is currently only
used for testing so I think we should be okay with the dependency.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D141933
Augusto Noronha [Fri, 13 Jan 2023 21:30:41 +0000 (13:30 -0800)]
[lldb] Only allow SymbolFiles to construct Types
SymbolFiles should own Types by keeping them in their TypeList. This
patch makes the Type constructor private to guarantee that every created
Type is kept in the SymbolFile's type list.
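The ownership rule can be sketched with a private constructor and a friend declaration. `TypeSketch` and `SymbolFileSketch` are hypothetical names for this example; the real lldb classes have a much larger interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class SymbolFileSketch;

class TypeSketch {
public:
  const std::string &name() const { return Name; }

private:
  friend class SymbolFileSketch; // the only class allowed to construct types
  explicit TypeSketch(std::string Name) : Name(std::move(Name)) {}
  std::string Name;
};

class SymbolFileSketch {
public:
  // Creates a type and records it in the owning list in one step.
  TypeSketch &makeType(std::string Name) {
    assert(!Name.empty() && "types need a name in this sketch");
    // std::make_unique cannot reach the private constructor, so use new.
    TypeList.push_back(
        std::unique_ptr<TypeSketch>(new TypeSketch(std::move(Name))));
    return *TypeList.back();
  }
  std::size_t numTypes() const { return TypeList.size(); }

private:
  std::vector<std::unique_ptr<TypeSketch>> TypeList;
};
```

Code outside `SymbolFileSketch` cannot write `TypeSketch("int")`, so every instance necessarily ends up in some type list.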
Mitch Phillips [Fri, 13 Jan 2023 00:01:06 +0000 (16:01 -0800)]
Reland: [GWP-ASan] Add recoverable mode.
The GWP-ASan recoverable mode allows a process to continue to function
after a GWP-ASan error is detected. The error will continue to be
dumped, but GWP-ASan now has APIs that a signal handler (like the
example optional crash handler) can call in order to allow the
continuation of a process.
When an error occurs with an allocation, the slot used for that
allocation will be permanently disabled. This means that free() of that
pointer is a no-op, and use-after-frees will succeed (writing and
reading the data present in the page).
For heap-buffer-overflow/underflow, the guard page is marked as accessible
and buffer-overflows will succeed (writing and reading the data present
in the now-accessible guard page). This does impact adjacent
allocations: buffer-underflows and buffer-overflows from adjacent
allocations will no longer touch an inaccessible guard page. This could
be improved in future by having two guard pages between each adjacent
allocation, but that's out of scope of this patch.
Each allocation only ever has a single error report generated. It's
whatever came first between invalid-free, double-free, use-after-free or
heap-buffer-overflow, but only one.
Reviewed By: eugenis, fmayer
Differential Revision: https://reviews.llvm.org/D140173
Slava Zakharin [Tue, 17 Jan 2023 17:08:43 +0000 (09:08 -0800)]
[flang] Generate TBAA information.
This is the initial version of TBAA information generation for Flang
generated IR. The desired behavior is that TBAA type descriptors
are generated for FIR types during FIR to LLVM types conversion,
and then TBAA access tags are attached to memory accessing operations
when they are converted to LLVM IR dialect.
In the initial version the type conversion is not producing
TBAA type descriptors, and all memory accesses are just partitioned
into two sets of box and non-box accesses, which can never alias.
The TBAA generation is enabled by default at >O0 optimization levels.
TBAA generation may also be enabled via `apply-tbaa` option of
`fir-to-llvm-ir` conversion pass. `-mllvm -disable-tbaa` engineering
option allows disabling TBAA generation to override Flang's default
(e.g. when -O1 is used).
SPEC CPU2006/437.leslie3d speeds up by more than 2x on Icelake.
Reviewed By: jeanPerier, clementval
Differential Revision: https://reviews.llvm.org/D141820
Anshil Gandhi [Tue, 17 Jan 2023 17:36:05 +0000 (10:36 -0700)]
[InstCombine] Handle PHI nodes in PtrReplacer
This patch adds on to the functionality implemented
in rG42ab5dc5a5dd6c79476104bdc921afa2a18559cf,
where PHI nodes are supported in the use-def traversal
algorithm to determine if an alloca is ever overwritten
in addition to a memmove/memcpy. This patch implements
the support needed by the PointerReplacer to collect
all (indirect) users of the alloca in cases where a PHI
is involved. Finally, a new PHI is defined in the replace
method which takes in replaced incoming values and
updates the WorkMap accordingly.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D136201
Lorenzo Chelini [Tue, 17 Jan 2023 17:47:36 +0000 (18:47 +0100)]
[MLIR][SCF] Fix comment in `TestTilingInterface.cpp` (NFC)
The method is called `tileConsumerAndFuseProducerGreedilyUsingSCFForOp`
and not `tileAndFuseGreedilyUsingSCFForOp`.
Thurston Dang [Thu, 12 Jan 2023 23:18:17 +0000 (23:18 +0000)]
tsan: fix broken aarch64_39/42 mappings and expand them
The aarch64 39- and 42-bit mappings were broken: mappings to meta and shadow were not fully invertible. This CL introduces a working set of mappings, and also increases the size of some app regions:
* aarch64, 39-bit (2^39 == 512GB):
- Low: (Old) 4GB -> (New) 20GB
- Mid: 4GB -> 20GB
- Heap: 4GB -> 12GB
- High: 8GB -> 12GB
* aarch64, 42-bit (2^42 == 4TB):
- Low: 64GB -> 128GB
- Mid: 4GB -> 88GB
- Heap: 64GB -> 192GB
- High: 64GB
Additionally, this CL improves the code comments for all the linux aarch64 mappings.
Differential Revision: https://reviews.llvm.org/D141640
Thomas Raoux [Tue, 17 Jan 2023 17:25:51 +0000 (17:25 +0000)]
[mlir][vector] Fix extract op canonicalization for 0d vector
Fix ExtractOpFromBroadcast when the broadcast source is a 0d vector.
Differential Revision: https://reviews.llvm.org/D141735
Thomas Raoux [Fri, 13 Jan 2023 19:53:56 +0000 (19:53 +0000)]
[mlir][gpu] Improve foreach_thread distribution
Replace Ids with 0 when block dim is 1 when distributing foreach_thread.
Differential Revision: https://reviews.llvm.org/D141718
Thomas Raoux [Tue, 17 Jan 2023 17:05:11 +0000 (17:05 +0000)]
[mlir][vector] Add extra lowering for more transfer_write maps
Add a pattern to lower transfer_write ops whose permutation map is not a
permutation of a minor identity map.
Differential Revision: https://reviews.llvm.org/D141815
Christopher Bate [Thu, 8 Dec 2022 00:28:27 +0000 (17:28 -0700)]
[mlir][EmitC] Remove Pure trait from `emitc.include`
The op `emitc.include` does not have results and thus will be elided
during canonicalization, which is not correct behavior. This change
removes the 'Pure' trait and adds a canonicalization test.
Reviewed By: jpienaar, marbre
Differential Revision: https://reviews.llvm.org/D141704
Thomas Raoux [Sun, 15 Jan 2023 07:25:00 +0000 (07:25 +0000)]
[mlir][vector] Fix lowering of permutation maps for transfer_write op
The lowering of transfer write permutation maps didn't match the op definition:
https://github.com/llvm/llvm-project/blob/93ccccb00d9717b58ba93f0942a243ba6dac4ef6/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td#L1476
Fix the lowering and add a case to the integration test in
order to enforce the correct semantic.
Differential Revision: https://reviews.llvm.org/D141801
Alex Brachet [Tue, 17 Jan 2023 17:02:50 +0000 (17:02 +0000)]
[scudo] Fix -Wsign-compare warning
Mehdi Amini [Tue, 17 Jan 2023 08:28:08 +0000 (08:28 +0000)]
Fix crash in LLVM Dialect inliner interface: add support for llvm.return
The LLVM inliner was missing the `handleTerminator` method in the
Dialect interface implementation.
Fixes #60093
Differential Revision: https://reviews.llvm.org/D141901
Mehdi Amini [Tue, 17 Jan 2023 16:20:20 +0000 (16:20 +0000)]
Fix crash in scf.parallel verifier
Fixes #59989
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D141911
Kadir Cetinkaya [Tue, 17 Jan 2023 16:11:02 +0000 (17:11 +0100)]
[clangd] Disable modernize-macro-to-enum tidy check
Check relies on seeing PP-directives from preamble, hence it's unusable.
See https://github.com/clangd/clangd/issues/1464.
Nikita Popov [Tue, 17 Jan 2023 15:53:59 +0000 (16:53 +0100)]
[CVP] Avoid duplicate range calculation (NFC)
Calculate the range once for all the sdiv/srem transforms.
David Green [Tue, 17 Jan 2023 15:49:29 +0000 (15:49 +0000)]
[AArch64][SVE] Implement isVScaleKnownToBeAPowerOfTwo
According to https://developer.arm.com/documentation/102105/ia-00/?lang=en
> Arm is making a retrospective change to the SVE architecture to remove
> the capability of selecting a non-power-of-two vector length in
> non-Streaming SVE as well as in Streaming SVE mode. Specific updates as
> a result of this change will be communicated in due course.
This patch implements the isVScaleKnownToBeAPowerOfTwo method to teach
DAG Combines that VScale will be known to be a power of 2, which helps
reduce or simplify some expressions (notably the udiv in vector trip
count expressions).
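Why the power-of-two fact helps can be seen in a small sketch: a division by a known power of two can be replaced by a shift. The helper names are hypothetical, and the sketch assumes the per-block element count is itself a power of two.

```cpp
#include <cassert>
#include <cstdint>

inline bool isPowerOfTwo(std::uint64_t V) {
  return V != 0 && (V & (V - 1)) == 0;
}

// Exact base-2 logarithm; only valid for powers of two.
inline unsigned log2Exact(std::uint64_t V) {
  assert(isPowerOfTwo(V) && "log2Exact needs a power of two");
  unsigned Shift = 0;
  while ((V >> Shift) != 1)
    ++Shift;
  return Shift;
}

// Computes N / (VScale * ElementsPerBlock) with a shift instead of a udiv.
inline std::uint64_t tripCountByShift(std::uint64_t N, std::uint64_t VScale,
                                      std::uint64_t ElementsPerBlock) {
  return N >> log2Exact(VScale * ElementsPerBlock);
}
```

Without the guarantee, the divisor could be any multiple of the block size and the general division would have to stay.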
Differential Revision: https://reviews.llvm.org/D141486
Nikita Popov [Tue, 17 Jan 2023 15:39:27 +0000 (16:39 +0100)]
[CVP] Avoid duplicate range calculation (NFC)
Calculate the range once and use it in processURem() and
narrowUDivOrURem().
Nikita Popov [Tue, 17 Jan 2023 15:22:57 +0000 (16:22 +0100)]
[CVP] Handle use-site conditions in domain-based folds
As a side-effect, this switches them to use getConstantRange() rather
than getPredicateAt(). getPredicateAt() is not supposed to be more
powerful than getConstantRange() for non-equality comparisons (as
long as block values are used).
Erich Keane [Tue, 17 Jan 2023 14:12:40 +0000 (06:12 -0800)]
Revert "[clang] Instantiate concepts with sugared template arguments"
This reverts commit
b8064374b217db061213c561ec8f3376681ff9c8.
Based on the report here:
https://github.com/llvm/llvm-project/issues/59271
this produces a significant increase in memory use of the compiler and a
large compile-time regression. This patch reverts this so that we don't
branch for release with that issue.
Nikita Popov [Tue, 17 Jan 2023 15:10:41 +0000 (16:10 +0100)]
[CVP] Handle use-site conditions in more folds
Valentin Clement [Tue, 17 Jan 2023 15:11:45 +0000 (16:11 +0100)]
[flang] Support allocate with source for polymorphic entities
Apply the source type spec to the descriptor for
polymorphic entities.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D141822
Florian Hahn [Tue, 17 Jan 2023 15:11:37 +0000 (15:11 +0000)]
[VPlan] Remove duplicated VPValue IDs (NFCI).
At the moment, both VPValue and VPDef have an ID used when casting via
classof. This duplication is cumbersome, because it requires adding IDs
for new recipes twice and also requires setting them twice. In a few
cases, there's only a VPDef ID and no VPValue ID, which can cause some
confusion.
To simplify things, remove the VPValue IDs for different recipes.
Instead, only retain the generic VPValue ID (= used VPValues without a
corresponding defining recipe) and VPVRecipe for VPValues that are
defined by recipes that inherit from VPValue.
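For readers unfamiliar with classof-based casting, here is a miniature sketch where each recipe carries exactly one ID and classof consults only that ID. All class names are hypothetical and greatly simplified; they are not the real VPlan classes.

```cpp
#include <cassert>

// Base class holding the single ID used for casting.
class DefSketch {
public:
  enum DefID { WidenID, ReplicateID };
  explicit DefSketch(DefID ID) : ID(ID) {}
  virtual ~DefSketch() = default;
  DefID getDefID() const { return ID; }

private:
  const DefID ID;
};

class WidenSketch : public DefSketch {
public:
  WidenSketch() : DefSketch(WidenID) {}
  static bool classof(const DefSketch *D) { return D->getDefID() == WidenID; }
};

class ReplicateSketch : public DefSketch {
public:
  ReplicateSketch() : DefSketch(ReplicateID) {}
  static bool classof(const DefSketch *D) {
    return D->getDefID() == ReplicateID;
  }
};

// Simplified dyn_cast: returns null when the dynamic kind does not match.
template <typename To> const To *dynCast(const DefSketch *D) {
  assert(D && "dynCast on a null pointer");
  return To::classof(D) ? static_cast<const To *>(D) : nullptr;
}
```

Adding a new recipe kind in this scheme means adding one enumerator and setting it once, which is the simplification the message describes.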
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D140848
Nicolas Vasilache [Tue, 17 Jan 2023 14:25:21 +0000 (06:25 -0800)]
[mlir][Transform] Add a transform.get_consumers_of_result navigation op
Differential Revision: https://reviews.llvm.org/D141930
Francesco Petrogalli [Tue, 17 Jan 2023 11:02:08 +0000 (12:02 +0100)]
[MIScheduler] Print top/down cycle in the SUnit dump.
Add an extra command line option to `llc` that allows checking at what cycle an instruction has been scheduled by the machine scheduler.
Differential Revision: https://reviews.llvm.org/D141289
Valentin Clement [Tue, 17 Jan 2023 14:51:04 +0000 (15:51 +0100)]
[flang] Lower allocation with MOLD
Lower allocate statement with MOLD= to calls to the Fortran
runtime. PointerApplyMold and AllocatableApplyMold are called
depending on the object to be allocated.
Reviewed By: jeanPerier, PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D141843
Raghu Maddhipatla [Thu, 12 Jan 2023 06:32:06 +0000 (00:32 -0600)]
[Flang] [OpenMP] Refine parser restrictions for OMP TARGET UPDATE clauses.
In Parser, move some clauses of OMP TARGET UPDATE to allowedOnceClauses so that restrictions will be imposed.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D141567
Amy Wang [Tue, 17 Jan 2023 14:33:36 +0000 (09:33 -0500)]
[MLIR][Transform] Introduce loop.coalesce transform op.
This patch made a minor refactor of LoopCoalescing.cpp's walkLoops
templated method and placed it in Affine's LoopUtils.cpp/h.
This method is also renamed as coalescePerfectlyNestedLoops method. This
minor change enables this method to be invoked
by both the original LoopCoalescing pass as well as the newly introduced
loop.coalesce transform op.
The loop.coalesce transform op has the ability to coalesce affine, and
scf loop nests, leveraging existing LoopCoalescing
mechanism. I have created it inside the SCFTransformOps.td instead of
AffineTransformOps.td as it feels to be similar
in spirit as the loop.unroll op that can handle both scf and affine
loops. Please let me know if you feel that this op
should be moved into AffineTransformOps.td instead.
The added test case illustrates the loop.coalesce transform op working for
scf and affine loops (inner, outer), and shows that the
coalesced loop can be further unrolled (achieving composability).
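What coalescing a perfect loop nest means can be sketched in plain C++: the two-level nest is replaced by one loop over the product of the trip counts, and the original indices are recovered with division and remainder. This is an illustration of the general technique, not the MLIR implementation; the function names are made up.

```cpp
#include <cassert>
#include <vector>

// Reference: a perfectly nested 2-D loop recording I * 100 + J per iteration.
inline std::vector<int> visitNested(int Rows, int Cols) {
  std::vector<int> Out;
  for (int I = 0; I < Rows; ++I)
    for (int J = 0; J < Cols; ++J)
      Out.push_back(I * 100 + J);
  return Out;
}

// Coalesced: a single loop whose index is split back into (I, J).
inline std::vector<int> visitCoalesced(int Rows, int Cols) {
  assert(Rows >= 0 && Cols >= 0 && "trip counts must be non-negative");
  std::vector<int> Out;
  for (int K = 0; K < Rows * Cols; ++K) {
    int I = K / Cols; // outer index
    int J = K % Cols; // inner index
    Out.push_back(I * 100 + J);
  }
  return Out;
}
```

Both functions visit the same iterations in the same order, so the single loop can then be handed to other transformations such as unrolling.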
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D141202
Alex Bradbury [Tue, 17 Jan 2023 14:28:15 +0000 (14:28 +0000)]
[clang-repl] XFAIL riscv targets in simple-exception test case
This test fails for RISC-V. Arm targets are already XFAILed, so add
RISC-V to the XFAIL list.
Differential Revision: https://reviews.llvm.org/D141380
luxufan [Tue, 17 Jan 2023 13:21:11 +0000 (21:21 +0800)]
[InstCombine] Don't combine smul of i1 type constant one
Fixes: https://github.com/llvm/llvm-project/issues/59876
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D141214
Leonardo Sandoval [Tue, 17 Jan 2023 12:54:19 +0000 (12:54 +0000)]
[flang] fix FIRLangRef.md path
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D141416
Nicolas Vasilache [Tue, 17 Jan 2023 13:40:23 +0000 (05:40 -0800)]
[mlir][Linalg] Fix post-commit typo for
5443743ca1874acfe2d5654fedd4a0c0bed6777e
Nicolas Vasilache [Mon, 16 Jan 2023 17:06:14 +0000 (09:06 -0800)]
[mlir][Linalg] Add a transform.structured.pack operation
This revision introduces a `transform.structured.pack` operation to
transform any Linalg operation to a higher-dimensional Linalg operation on
packed operands.
`tensor.pack` (resp. `tensor.unpack`) operations are inserted for the operands
(resp. results) that need to be packed (resp. unpacked) according to the
`packed_sizes` specification.
At the moment, the packing operation always pads with `getZeroAttr` which will
need to be adjusted depending on the consumers.
Packing is limited to those dimensions that are indexed only by AffineDimExpr.
Packing more advanced indexings requires modular arithmetic that is outside the
scope of a `linalg.generic` at the moment.
Differential Revision: https://reviews.llvm.org/D141860
Paul Walker [Tue, 17 Jan 2023 13:06:14 +0000 (13:06 +0000)]
[AArch64][SVE] Fix typo after post review change to D141471.
Weining Lu [Tue, 17 Jan 2023 12:53:58 +0000 (20:53 +0800)]
[docs] Add llvm & clang release notes for LoongArch
Reviewed By: rengolin
Differential Revision: https://reviews.llvm.org/D141750
zhongyunde [Tue, 17 Jan 2023 12:43:05 +0000 (20:43 +0800)]
[AArch64][SVE] Fix crash for DestructiveBinaryComm zero merging
This fix is similar to D124325. The DestructiveBinaryComm operation type
may also be allocated the same register, so insert the LSL.
movprfx z0.s, p0/z, z0.s
lsl z0.b, p0/m, z0.b, #0
fmul z0.s, p0/m, z0.s, z0.s
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D141471
Jean Perier [Tue, 17 Jan 2023 12:40:44 +0000 (13:40 +0100)]
[flang][hlfir] Lower some character elemental references
Lower character elemental user procedures with constant length, and
both dynamic and constant length ADJUSTL, ADJUSTR, and MERGE references
(which leaves out MIN/MAX).
Character elemental user procedures with dynamic length are a bit more
involved, and since this is an edge case that is not currently supported,
I will take this on later.
Differential Revision: https://reviews.llvm.org/D141847
Abid Malik [Tue, 17 Jan 2023 11:54:23 +0000 (11:54 +0000)]
[flang][OpenMP] Parser support for the unroll construct (5.1)
Added parser support for the unroll construct.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D138229
Markus Böck [Tue, 10 Jan 2023 19:40:25 +0000 (20:40 +0100)]
[mlir][Tensor][NFC] Migrate Tensor dialect to the new fold API
See https://discourse.llvm.org/t/psa-new-improved-fold-method-signature-has-landed-please-update-your-downstream-projects/67618 for context
Differential Revision: https://reviews.llvm.org/D141530
Sebastian Neubauer [Tue, 17 Jan 2023 12:18:26 +0000 (13:18 +0100)]
[BitcodeReader] Allow reading pointer types from old IR
When opaque pointers are enabled and old IR with typed pointers is read,
the BitcodeReader automatically upgrades all typed pointers to opaque
pointers. This is a lossy conversion, i.e. when a function argument is a
pointer and unused, it’s impossible to reconstruct the original type
behind the pointer.
There are cases where the type information of pointers is needed. One is
reading DXIL, which is bitcode of old LLVM IR and makes a lot of use of
pointers in function signatures.
We’d like to keep using up-to-date llvm to read in and process DXIL, so
in the face of opaque pointers, we need some way to access the type
information of pointers from the read bitcode.
This patch allows extracting type information by supplying functions to
parseBitcodeFile that get called for each function signature or metadata
value. The function can access the type information via the reader’s
type IDs and the getTypeByID and getContainedTypeID functions.
The test shows, by way of example, how type info from pointers can be stored in
metadata for use after the BitcodeReader has finished.
Differential Revision: https://reviews.llvm.org/D127728
Johannes Reifferscheid [Tue, 17 Jan 2023 12:17:56 +0000 (13:17 +0100)]
Fix bazel build overlay.
Florian Hahn [Tue, 17 Jan 2023 12:12:10 +0000 (12:12 +0000)]
[VPlan] Add test for VPAllSuccessorIterator directly. (NFC)
Additional test coverage for D140511.
Abid Malik [Mon, 16 Jan 2023 17:41:30 +0000 (17:41 +0000)]
[flang][OpenMP] Added parser support for Tile Construct (OpenMP 5.1)
Added parser support for Tile Construct.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D136359
Florian Hahn [Tue, 17 Jan 2023 11:44:49 +0000 (11:44 +0000)]
[VPlan] Remove unnecessary getNumSuccessors call (NFC).
If ParentWithSuccs is nullptr, the number of successors is guaranteed to
be 0. Simplify the code as suggested by @Ayal in D140511.
David Green [Tue, 17 Jan 2023 11:29:51 +0000 (11:29 +0000)]
[ARM] Fix i1 shuffle lowering with multiple operands.
The existing lowering of i1 vector shuffle was only considering
single-source shuffles, always assuming the second was undef. This
extends that to properly handle both operands.
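The two-operand semantics can be sketched as follows: mask indices below N select from the first source and indices from N to 2N-1 select from the second. This models only the abstract shuffle, not the ARM lowering, and it does not handle undef mask elements. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shuffles two equally sized i1 vectors according to Mask.
inline std::vector<bool> shuffleTwoSources(const std::vector<bool> &A,
                                           const std::vector<bool> &B,
                                           const std::vector<int> &Mask) {
  assert(A.size() == B.size() && "sources must have the same length");
  const std::size_t N = A.size();
  std::vector<bool> Out;
  for (int Idx : Mask) {
    assert(Idx >= 0 && static_cast<std::size_t>(Idx) < 2 * N &&
           "mask index out of range");
    const std::size_t U = static_cast<std::size_t>(Idx);
    // Indices in [0, N) read from A; indices in [N, 2N) read from B.
    Out.push_back(U < N ? A[U] : B[U - N]);
  }
  return Out;
}
```

A lowering that treats the second operand as undef would be free to pick any value for lanes whose mask index is N or larger.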
Nikita Popov [Tue, 17 Jan 2023 11:28:41 +0000 (12:28 +0100)]
[Linker] Convert test to opaque pointers (NFC)
Remove pointer indirections to preserve test intent.
Nikita Popov [Tue, 17 Jan 2023 11:20:43 +0000 (12:20 +0100)]
[Linker] Convert test to opaque pointers (NFC)
Removing pointer indirections to at least somewhat preserve test
intent. I wasn't aware this kind of directly co-recursive type
is even legal.
Quentin Colombet [Mon, 16 Jan 2023 14:34:29 +0000 (14:34 +0000)]
[mlir][vector] Share enums with the transform dialect
Refactor the definition of the enums that are used in the lower_vectors
operation of the transformation dialect.
This avoids duplicating the definition of all the configurations that
this operation can trigger.
NFC
Differential Revision: https://reviews.llvm.org/D141867
Guillaume Chatelet [Tue, 17 Jan 2023 11:05:50 +0000 (11:05 +0000)]
[libc] Fix memcpy inefficiency
Nikita Popov [Tue, 17 Jan 2023 10:54:53 +0000 (11:54 +0100)]
[Linker] Convert test to opaque pointers (NFC)
To at least somewhat preserve the test intent, remove some
pointer indirections and make types structurally different.
Jean Perier [Tue, 17 Jan 2023 10:40:09 +0000 (11:40 +0100)]
[flang] Lower elemental and transformational clean-up in HLFIR
In lowering to hlfir, no clean-up was added yet for
the created hlfir.elemental. Add the needed hlfir.destroy.
Regarding transformational lowering, clean-ups were created because
they are lowered in memory, but this is inconvenient because this
prevented lowering to hlfir from "moving" the created variable to
an expression. Add a new entry point in IntrinsicCall.h that keeps
track of whether or not the returned storage needs to be deallocated,
but does not insert the deallocation in the StatementContext.
This allows using the newly added hlfir.as_expr "move" aspect to be
used and save creating a copy.
Depends on D141839
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D141841
chenglin.bi [Tue, 17 Jan 2023 10:40:35 +0000 (18:40 +0800)]
Revert "[AArch64] fold subs ugt/ult to ands when the second operand is a mask"
This reverts commit
4a64024c1410692197e4b54e27e7b269a67c78f4.
The original commit made a mistake: the reverse of ugt should be ule.
Samuel Parker [Tue, 17 Jan 2023 10:34:43 +0000 (10:34 +0000)]
[NFC][WebAssembly] Update test
Run update_llc_test_checks.py on address-offsets.ll
Jean Perier [Tue, 17 Jan 2023 10:24:40 +0000 (11:24 +0100)]
[flang][hlfir] Add hlfir.destroy operation.
Add the operation to mark the end of life of hlfir.expr.
As described in its description this is the easiest solution
to deploy given lowering "knows" where expression values are last
used.
However, inserting these points in lowering will probably make
it harder to do some IR transformation that would move the code
using or creating hlfir.expr (no use should be moved after an
hlfir.destroy).
Once the dust settles with the HLFIR change, it will be worth assessing
the situation and see if an analysis could do a better and safer job at
finding those destruction points.
Depends on D141832
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D141839
Jean Perier [Tue, 17 Jan 2023 10:22:33 +0000 (11:22 +0100)]
[flang][hlfir] Add move semantics to hlfir.as_expr.
hlfir.as_expr allows turning an array, character, or derived type
variable into a value when the usage requires an hlfir.expr (e.g.,
when returning the element value inside an hlfir.elemental).
The default implementation of this operation in bufferization is to
make a copy of the variable into a temporary buffer.
This adds a time and memory overhead in cases where such copy is not
needed because the variable is already a temporary that was created
in lowering to compute the expression value, and the "as_expr" is
the sole usage of the variable.
This is for instance the case for many transformational intrinsics
that do not have hlfir.expr operation (at least for now, but some may
never benefit from having one) and must be implemented "on memory"
in lowering.
This patch adds a way to "move" the variable storage along with its value.
It allows the bufferization to re-use the variable storage for the
hlfir.expr created by hlfir.as_expr, and in exchange, the
responsibility of deallocating the buffer (if the variable was heap
allocated) is passed along to the hlfir.expr, and will need to be
done after the last hlfir.expr usage.
Differential Revision: https://reviews.llvm.org/D141832
Nikita Popov [Tue, 17 Jan 2023 10:20:03 +0000 (11:20 +0100)]
[MLIR] Convert some tests to opaque pointers (NFC)
Nikita Popov [Tue, 17 Jan 2023 09:59:32 +0000 (10:59 +0100)]
[MLIR] Convert test to opaque pointers (NFC)
Nikita Popov [Tue, 17 Jan 2023 10:12:54 +0000 (11:12 +0100)]
[MLIR] Don't verify opaque pointer type in cmpxchg
We should not check the element type for opaque pointers. We should
still check that the value operands have the same type though.
This causes a verifier error when converting instructions.ll to
opaque pointers.
Nikita Popov [Tue, 17 Jan 2023 10:06:23 +0000 (11:06 +0100)]
[MLIR] Don't verify opaque pointer type in atomicrmw
If the pointer type is opaque, we should not check the element type.
This causes a verifier failure when converting instructions.ll to
opaque pointers.
Nikita Popov [Tue, 17 Jan 2023 09:53:22 +0000 (10:53 +0100)]
[MLIR] Don't verify call signature for indirect opaque ptr call
Fixes a crash when converting the instructions.ll test to opaque
pointers.
Chuanqi Xu [Tue, 17 Jan 2023 09:26:48 +0000 (17:26 +0800)]
[C++20] [Modules] Only diagnose the non-inline external variable
definitions in header units
Address part of https://github.com/llvm/llvm-project/issues/60079.
Since the declaration of a non-inline static data member in its
class definition is not a definition, the following form:
```
class A {
public:
  static const int value = 43;
};
```
should be fine to appear in a header unit. From the perspective of
implementation, it looks like we simply forgot to check if the variable
is a definition...
Reviewed By: iains
Differential Revision: https://reviews.llvm.org/D141905
Nikita Popov [Tue, 17 Jan 2023 09:35:31 +0000 (10:35 +0100)]
[MLIR] Convert test to opaque pointers (NFC)
Nikita Popov [Tue, 17 Jan 2023 09:12:02 +0000 (10:12 +0100)]
[Polly] Convert some tests to opaque pointers (NFC)
Nikita Popov [Tue, 17 Jan 2023 09:08:33 +0000 (10:08 +0100)]
[Clang] Convert test to opaque pointers (NFC)
Florian Hahn [Tue, 17 Jan 2023 09:08:33 +0000 (09:08 +0000)]
[VPlan] Remove unneeded VPUser::classof(const VPDef *) (NFC).
This specialization is not needed any longer as VPRecipeBase inherits
from VPUser and getDefiningRecipe returns a VPRecipeBase.
Mariusz Sikora [Mon, 16 Jan 2023 11:54:56 +0000 (12:54 +0100)]
[AMDGPU] v_fmac_f64 encoding tests for gfx940
Differential Revision: https://reviews.llvm.org/D141857
Nikita Popov [Mon, 16 Jan 2023 11:57:26 +0000 (12:57 +0100)]
[Clang] Convert test to opaque pointers (NFC)
Nikita Popov [Mon, 16 Jan 2023 14:03:35 +0000 (15:03 +0100)]
[Support] Fix alternation support in backreferences (PR60073)
backref() always performs a full match on the remaining string,
and as such also needs to be matched against the whole remaining
strip. For alternations, the match was performed against just the
sub-strip for one alternative, which would of course fail to match
the whole string.
This can be done by skipping the part of the strip between OOR1
and O_CH, so that only the first alternative in the strip is
matched, and the remaining ones are skipped. Indeed, the necessary
OOR1 skipping code was already implemented in the easy-path of
backref(), so this is clearly how it was supposed to work.
However, there were two bugs: First, under this scheme we should
be passing the stop point of the original strip, not just the
alternative sub-strip. Second, while skipping for OOR1 was
implemented, handling for O_CH was missing. This would occur when
the last alternative matches, as O_CH is preceded by an implicit
OOR1 only.
Fixes https://github.com/llvm/llvm-project/issues/60073.
Rainer Orth [Tue, 17 Jan 2023 08:41:00 +0000 (09:41 +0100)]
[sanitizer_common] Don't intercept __tls_get_addr on Solaris
When building/testing ASan inside the GCC tree on Solaris while using GNU
`ld` instead of Solaris `ld`, a large number of tests SEGV on both sparc
and x86 like this:
Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0xfe014cfc in __sanitizer::atomic_load<__sanitizer::atomic_uintptr_t>
(a=0xfc602a58, mo=__sanitizer::memory_order_acquire) at
sanitizer_common/sanitizer_atomic_clang_x86.h:46
46 v = a->val_dont_use;
1: x/i $pc
=> 0xfe014cfc
<_ZN11__sanitizer11atomic_loadINS_16atomic_uintptr_tEEENT_4TypeEPVKS2_NS_12memory_orderE+62>:
mov (%eax),%eax
(gdb) bt
#0 0xfe014cfc in __sanitizer::atomic_load<__sanitizer::atomic_uintptr_t>
(a=0xfc602a58, mo=__sanitizer::memory_order_acquire) at
sanitizer_common/sanitizer_atomic_clang_x86.h:46
#1 0xfe0bd1d7 in __sanitizer::DTLS_NextBlock (cur=0xfc602a58) at
sanitizer_common/sanitizer_tls_get_addr.cpp:53
#2 0xfe0bd319 in __sanitizer::DTLS_Find (id=1) at
sanitizer_common/sanitizer_tls_get_addr.cpp:77
#3 0xfe0bd466 in __sanitizer::DTLS_on_tls_get_addr (arg_void=0xfeffd068,
res=0xfe602a18, static_tls_begin=0, static_tls_end=0) at
sanitizer_common/sanitizer_tls_get_addr.cpp:116
#4 0xfe063f81 in __interceptor___tls_get_addr (arg=0xfeffd068) at
sanitizer_common/sanitizer_common_interceptors.inc:5501
#5 0xfe0a3054 in __sanitizer::CollectStaticTlsBlocks (info=0xfeffd108,
size=40, data=0xfeffd16c) at
sanitizer_common/sanitizer_linux_libcdep.cpp:366
#6 0xfe6ba9fa in dl_iterate_phdr () from /usr/lib/ld.so.1
#7 0xfe0a3132 in __sanitizer::GetStaticTlsBoundary (addr=0xfe608020,
size=0xfeffd244, align=0xfeffd1b0) at
sanitizer_common/sanitizer_linux_libcdep.cpp:382
#8 0xfe0a33f7 in __sanitizer::GetTls (addr=0xfe608020, size=0xfeffd244)
at sanitizer_common/sanitizer_linux_libcdep.cpp:482
#9 0xfe0a34b1 in __sanitizer::GetThreadStackAndTls (main=true,
stk_addr=0xfe608010, stk_size=0xfeffd240, tls_addr=0xfe608020,
tls_size=0xfeffd244) at sanitizer_common/sanitizer_linux_libcdep.cpp:565
The address being accessed is unmapped. However, even when the tests
`PASS` with Solaris `ld`, `ASAN_OPTIONS=verbosity=2` shows
==6582==__tls_get_addr: Can't guess glibc version
Given that the code is strictly `glibc`-specific according to
`sanitizer_tls_get_addr.h`, there seems little point in using the
interceptor on non-`glibc` targets.
That's what this patch does. Tested on `i386-pc-solaris2.11` and
`sparc-sun-solaris2.11` inside the GCC tree.
Differential Revision: https://reviews.llvm.org/D141385
Sergey Kachkov [Thu, 22 Dec 2022 13:59:06 +0000 (16:59 +0300)]
[GVN] Refactor handling of pointer-select in GVN pass
This patch extends Def memory dependency with support of select
instructions to consistently handle pointer-select conversion.
Differential Revision: https://reviews.llvm.org/D141619
Kadir Cetinkaya [Tue, 17 Jan 2023 08:08:46 +0000 (09:08 +0100)]
[clangd] Disable ScopedMemoryLimit on tsan builds
This is causing flakiness, see https://lab.llvm.org/buildbot/#/builders/131/builds/39272
Fangrui Song [Tue, 17 Jan 2023 07:57:44 +0000 (23:57 -0800)]
[ARM] Properly fix -Wsign-compare after D141791
chendewen [Tue, 17 Jan 2023 07:24:06 +0000 (15:24 +0800)]
Revert "[AArch64][SVE] Add more intrinsics in 'isZeroingInactiveLanes'."
This reverts commit
6ef6b2b5162ef48a63fb2697d77cffa6d7b1f7e7.
Noah Goldstein [Tue, 17 Jan 2023 02:51:08 +0000 (18:51 -0800)]
Transform AtomicRMW logic operations to BT{R|C|S} if only changing/testing a single bit.
This essentially expands on the optimizations added in D120199,
but applies the optimization to cases where the bit being changed /
tested is not an IMM but is a provable power of 2.
The only case currently added is for patterns like:
`__atomic_fetch_xor(p, 1 << c, __ATOMIC_RELAXED) & (1 << c)`
which, instead of using a `cmpxchg` loop, can be done with `btcl; setcc; shl`.
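The source-level pattern above can be sketched as a small C++ function. This is only an illustration: the helper name `toggle_and_test_bit` is hypothetical, and `std::atomic::fetch_xor` stands in for the `__atomic_fetch_xor` builtin quoted in the message. Whether a given compiler emits `btc` for it depends on the target and compiler version.

```cpp
#include <atomic>
#include <cassert>
#include <limits>

// Hypothetical helper: atomically toggles bit `c` of `word` and returns the
// previous state of that bit, still shifted into position `c`. The mask is
// not an immediate, but it is provably a power of two.
inline unsigned toggle_and_test_bit(std::atomic<unsigned> &word, unsigned c) {
  assert(c < static_cast<unsigned>(std::numeric_limits<unsigned>::digits));
  const unsigned mask = 1u << c;
  return word.fetch_xor(mask, std::memory_order_relaxed) & mask;
}
```

Only the single bit selected by `mask` is both changed and inspected, which is what makes a bit-test instruction a candidate replacement for a compare-exchange loop.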
There are still a variety of missed cases that could/should be
addressed in the future. This commit documents many of those
cases with TODOs.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D140939
Noah Goldstein [Tue, 17 Jan 2023 02:50:15 +0000 (18:50 -0800)]
Add tests for BMI patterns across non-adjacent and associative instructions.
I.e., for blsi we match (and (sub 0, x), x), but we currently miss valid
patterns like (and (and (sub 0, x), y), x).
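The equivalence behind the missed pattern can be checked with plain integer arithmetic. The sketch below uses hypothetical function names and is not taken from the tests in the patch; it only shows that the split form computes the same value as the lowest-set-bit isolation followed by an extra AND.

```cpp
#include <cassert>
#include <cstdint>

// Adjacent form: (and (sub 0, x), x) isolates the lowest set bit of x.
constexpr std::uint32_t lowest_set_bit(std::uint32_t x) {
  return (0u - x) & x;
}

// Non-adjacent form from the message: (and (and (sub 0, x), y), x).
constexpr std::uint32_t lowest_set_bit_split(std::uint32_t x, std::uint32_t y) {
  return ((0u - x) & y) & x;
}

// AND is associative and commutative, so the split form equals
// lowest_set_bit(x) & y.
static_assert(lowest_set_bit_split(0x58u, 0x0Fu) ==
                  (lowest_set_bit(0x58u) & 0x0Fu),
              "split form must match the adjacent form");
```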
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D141178
Jonathan Peyton [Mon, 5 Dec 2022 15:06:01 +0000 (09:06 -0600)]
[OpenMP][libomp] Add topology information to thread structure
Each time a thread gets a new affinity assigned, it will not
only assign its mask, but also topology information including
which socket, core, thread and core-attributes (if available)
it is now assigned. This occurs for all non-disabled KMP_AFFINITY
values as well as OMP_PLACES/OMP_PROC_BIND.
The information regarding which socket, core, etc. can take on three
values:
1) The actual ID of the unit (0 - (N-1)), given N units
2) UNKNOWN_ID (-1) which indicates it does not know which ID
3) MULTIPLE_ID (-2) which indicates the thread is spread across
multiple of this unit (e.g., affinity mask is spread across
multiple hardware threads)
This new information is stored in the th_topology_ids[] array. For example,
to get the socket ID, one would read th_topology_ids[KMP_HW_SOCKET].
This could be expanded in the future to something more descriptive for
the "multiple" case, like a range of values. For now, the single
value suffices.
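A reader of these IDs has to handle all three kinds of value. The sketch below uses stand-in types (`HwUnit`, `ThreadTopology`, `describe`) that are hypothetical and are not libomp's definitions; only the sentinel values -1 and -2 come from the description above.

```cpp
#include <array>
#include <cassert>
#include <string>

// Stand-in types for illustration only; these are NOT libomp's definitions.
enum HwUnit { HW_SOCKET, HW_CORE, HW_THREAD, HW_UNIT_COUNT };

constexpr int UNKNOWN_ID = -1;   // value given in the commit message
constexpr int MULTIPLE_ID = -2;  // value given in the commit message

struct ThreadTopology {
  std::array<int, HW_UNIT_COUNT> ids;  // plays the role of th_topology_ids[]
};

// Describe which unit of the given kind a thread is currently assigned to.
inline std::string describe(const ThreadTopology &t, HwUnit unit) {
  const int id = t.ids[unit];
  if (id == UNKNOWN_ID) return "unknown";
  if (id == MULTIPLE_ID) return "multiple";
  assert(id >= 0 && "remaining values must be real unit IDs");
  return std::to_string(id);
}
```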
The information regarding the core attributes can take on two values:
1) The actual core-type or core-eff
2) KMP_HW_CORE_TYPE_UNKNOWN if the core type is unknown, and
UNKNOWN_CORE_EFF (-1) if the core eff is unknown.
This new information is stored in th_topology_attrs. For example,
to get the core type, one would read
th_topology_attrs.core_type.
Differential Revision: https://reviews.llvm.org/D139854
Shilei Tian [Tue, 17 Jan 2023 04:55:17 +0000 (23:55 -0500)]
[OpenMP] Fix the wrong format string used in `__kmpc_error`
This patch fixes the wrong format string used in `__kmpc_error`, which could
cause a segmentation fault at runtime.
Reviewed By: jlpeyton
Differential Revision: https://reviews.llvm.org/D141889
Jonathan Peyton [Mon, 12 Dec 2022 17:33:52 +0000 (11:33 -0600)]
[OpenMP][libomp] Fix macOS 12 library destruction
When building the library with icc and using it on macOS 12,
the library destruction process is skipped, which causes many OMPT tests
to fail on macOS 12. This change registers the
__kmp_internal_end_library() call for atexit() which will be a
harmless, redundant call for macOS 11 and below and the only destructor
called for macOS 12+.
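The general technique, registering an idempotent teardown routine with atexit() so that an extra call is harmless, can be sketched as follows. All names here (`runtime_shutdown`, `register_runtime_shutdown`, the two globals) are hypothetical and do not reflect libomp's code.

```cpp
#include <cassert>
#include <cstdlib>

namespace {
bool g_runtime_shut_down = false;  // hypothetical runtime state
int g_shutdown_work_count = 0;     // counts how often real teardown ran

// Hypothetical teardown routine. It is safe to call more than once, so a
// second call from another destructor path does nothing.
void runtime_shutdown() {
  if (g_runtime_shut_down) return;
  g_runtime_shut_down = true;
  ++g_shutdown_work_count;
}
}  // namespace

// Register the teardown during initialization.
// std::atexit returns 0 on success.
inline bool register_runtime_shutdown() {
  return std::atexit(runtime_shutdown) == 0;
}
```

Because the routine guards itself, it does not matter whether the platform also runs the regular library destructor: the teardown work happens exactly once.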
Differential Revision: https://reviews.llvm.org/D139857
chenglin.bi [Tue, 17 Jan 2023 04:01:41 +0000 (12:01 +0800)]
[AArch64] fold subs ugt/ult to ands when the second operand is a mask
https://alive2.llvm.org/ce/z/pLhHI9
Fix: https://github.com/llvm/llvm-project/issues/59598
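One arithmetic identity consistent with the title is that, for a mask of the form 2^n - 1, an unsigned greater-than compare against the mask is equivalent to testing whether any bit above the mask is set. The sketch below only checks that identity; the function names are hypothetical, and it is an assumption that this matches the exact fold implemented in the patch.

```cpp
#include <cassert>
#include <cstdint>

// True if m has the form 2^n - 1 (a run of low one bits, or zero).
constexpr bool is_low_bit_mask(std::uint32_t m) {
  return (m & (m + 1u)) == 0u;
}

// Compare form: x >u mask.
constexpr bool ugt_via_compare(std::uint32_t x, std::uint32_t mask) {
  return x > mask;
}

// AND form: true when some bit above the mask is set in x.
constexpr bool ugt_via_and(std::uint32_t x, std::uint32_t mask) {
  return (x & ~mask) != 0u;
}

static_assert(is_low_bit_mask(0xFFu), "0xFF is a low-bit mask");
static_assert(ugt_via_compare(0x100u, 0xFFu) == ugt_via_and(0x100u, 0xFFu),
              "both forms agree for a low-bit mask");
```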
Reviewed By: samtebbs
Differential Revision: https://reviews.llvm.org/D141829
Chuanqi Xu [Tue, 17 Jan 2023 03:31:24 +0000 (11:31 +0800)]
[C++20] [Coroutines] Disallow taking the address of labels in coroutines
Closing https://github.com/llvm/llvm-project/issues/56436
We can't support the GNU address-of-label extension well in coroutines
with the current architecture. Coroutines are split into pieces in the
middle end, so the addresses of labels are ambiguous at that point.
To avoid any further misunderstanding, we emit an error here.
Differential Revision: https://reviews.llvm.org/D131938
Shilei Tian [Tue, 17 Jan 2023 03:34:14 +0000 (22:34 -0500)]
[Clang][OpenMP] Fix the issue that a functor is not captured properly in a task region
This patch fixes the issue that a functor is not captured properly if
it is used in a task region. The issue was introduced by https://reviews.llvm.org/D114546
where `CallExpr` is treated specially, but the callee itself is not properly visited.
https://reviews.llvm.org/D115902 already fixed one case. This patch
fixes another case.
Fix #57757.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D141873
Freddy Ye [Tue, 17 Jan 2023 02:29:47 +0000 (10:29 +0800)]
[NFC][X86] clang-format change for avx512vlbwintrin.h
chendewen [Tue, 17 Jan 2023 01:47:35 +0000 (09:47 +0800)]
[AArch64][SVE] Add more intrinsics in 'isZeroingInactiveLanes'.
The REINTERPRET_CAST operation generates `and` and `ptrue` instructions.
For some instructions these are redundant, because their inactive lanes are zeroed by construction.
For example, codegen before:
```
facgt p2.d, p0/z, z4.d, z1.d
ptrue p1.d
and p1.b, p2/z, p2.b, p1.b
```
After:
```
facgt p1.d, p0/z, z4.d, z1.d
```
ref: https://reviews.llvm.org/D129851
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D141469
Arthur Eubanks [Tue, 17 Jan 2023 01:50:46 +0000 (17:50 -0800)]
[bolt][test] Add REQUIRES: asserts to jt-symbol-disambiguation-3.s
Or else it unexpectedly passes in non-assert builds of bolt.
Arthur Eubanks [Wed, 11 Jan 2023 00:16:04 +0000 (16:16 -0800)]
[docs][NewPM] Clarify more status of legacy PM + optimization pipeline
Reviewed By: asbirlea, nikic
Differential Revision: https://reviews.llvm.org/D141443