Argyrios Kyrtzidis [Fri, 23 Jun 2023 22:33:39 +0000 (15:33 -0700)]
[clang/HeaderSearch] Make sure `loadSubdirectoryModuleMaps` doesn't cause loading of regular files
`HeaderSearch::loadSubdirectoryModuleMaps` `stat`s all the files in a directory which causes the dependency scanning
service to load and cache their contents. This is problematic because a file may be in the process of being generated
and could be cached by the dep-scan service while it is still incomplete.
To address this, change `loadSubdirectoryModuleMaps` to ignore regular files.
Differential Revision: https://reviews.llvm.org/D153670
Wenlei He [Sun, 25 Jun 2023 23:39:16 +0000 (16:39 -0700)]
[NFC] Generalize llvm-profgen message to cover both AutoFDO and CSSPGO
Update llvm-profgen profile density message to cover both AutoFDO and CSSPGO.
Differential Revision: https://reviews.llvm.org/D153730
Nicolas Vasilache [Wed, 21 Jun 2023 14:35:30 +0000 (14:35 +0000)]
[mlir][Transform] Add support for mma.sync m16n8k16 f16 rewrite.
This PR adds support for the m16n8k16 f16 case.
At this point, the support is mostly mechanical and could be Tablegen'd to all cases.
Until then, this can be populated as needed on a case-by-case basis.
Depends on: D153420
Differential Revision: https://reviews.llvm.org/D153428
Ahmed Bougacha [Mon, 24 Oct 2022 15:33:30 +0000 (08:33 -0700)]
[AArch64][PAC] Select MOVK for ptrauth.blend intrinsic.
Blend combines two discriminator values used by other ptrauth ops.
On AArch64 here, it does that by replacing the high 16 bits of the
LHS with the low 16 bits of the RHS.
Usually the RHS is a constant, which lets us do this efficiently in
a single MOVK. When the RHS isn't constant, we can do a BFI.
In a sense, this is implementing an ABI decision (how to lower the
software construct of "blend"), but if there are interesting variants to
consider, this could be made object-file-format-specific in some way.
Differential Revision: https://reviews.llvm.org/D132384
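The bit manipulation described in this message can be sketched in plain C++. This is only a model of the arithmetic (high 16 bits of the LHS replaced by the low 16 bits of the RHS); `blendDiscriminator` is a hypothetical name and is not an LLVM API or the intrinsic itself.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): models the blend described above by
// replacing the high 16 bits of `lhs` with the low 16 bits of `rhs`.
constexpr std::uint64_t blendDiscriminator(std::uint64_t lhs, std::uint64_t rhs) {
  constexpr std::uint64_t low48 = (std::uint64_t{1} << 48) - 1;
  // Keep the low 48 bits of lhs, then place the low 16 bits of rhs on top.
  return (lhs & low48) | ((rhs & 0xFFFF) << 48);
}
```

With a constant RHS this is the shape a single MOVK into the top 16-bit lane can express, since MOVK leaves the remaining bits of the register untouched.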
Sam McCall [Mon, 26 Jun 2023 16:35:39 +0000 (18:35 +0200)]
Revert "[dataflow] avoid more accidental copies of Environment"
This reverts commit
ae54f01dd8c53d18c276420b23f0d0ab7afefff1.
Accidentally committed without review :-(
Nikolas Klauser [Mon, 26 Jun 2023 03:19:01 +0000 (20:19 -0700)]
[clang] __is_trivially_equality_comparable should return false for arrays
When comparing two arrays, their pointers are compared instead of their elements, which means that they are not trivially equality comparable.
Fixes #63371
Reviewed By: cor3ntin
Spies: cor3ntin, cfe-commits
Differential Revision: https://reviews.llvm.org/D153737
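A small sketch of the distinction this message relies on: comparing two built-in arrays compares the addresses they decay to, not their contents. The helper names below are hypothetical and the array length of 3 is arbitrary.

```cpp
#include <algorithm>
#include <iterator>

// Compares the addresses of the first elements, which is what `a == b` on two
// built-in arrays does after array-to-pointer decay.
inline bool sameStorage(const int (&a)[3], const int (&b)[3]) {
  return static_cast<const int *>(a) == static_cast<const int *>(b);
}

// Element-wise comparison, which is what a caller usually wants.
inline bool sameElements(const int (&a)[3], const int (&b)[3]) {
  return std::equal(std::begin(a), std::end(a), std::begin(b));
}
```

Two distinct arrays with identical contents satisfy `sameElements` but not `sameStorage`, which is why a byte-wise comparison of the objects does not match what `==` does.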
Nikolas Klauser [Mon, 26 Jun 2023 16:35:34 +0000 (09:35 -0700)]
[NFC] Add clang whitespace removal patch to .git-blame-ignore-revs
Nikolas Klauser [Mon, 26 Jun 2023 01:59:56 +0000 (18:59 -0700)]
[clang][NFC] Remove trailing whitespaces and enforce it in lib, include and docs
A lot of editors remove trailing whitespaces. This patch removes any trailing whitespaces and makes sure that no new ones are added.
Reviewed By: erichkeane, paulkirth, #libc, philnik
Spies: wangpc, aheejin, MaskRay, pcwang-thead, cfe-commits, libcxx-commits, dschuff, nemanjai, arichardson, kbarton, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, s.egerton, sameer.abuasal, apazos, luismarques, martong, frasercrmck, steakhal, luke
Differential Revision: https://reviews.llvm.org/D151963
walter erquinigo [Thu, 15 Jun 2023 20:57:07 +0000 (15:57 -0500)]
[LLDB] Add DWARF definitions for the new Mojo language
The new language Mojo recently received a proper DWARF code, which can be seen in https://dwarfstd.org/languages.html, and this patch adds the basic definitions for using this language in DWARF.
Differential Revision: https://reviews.llvm.org/D153073
Nicolas Vasilache [Wed, 21 Jun 2023 12:01:15 +0000 (12:01 +0000)]
[mlir][Transform] Introduce nvgpu transform extensions
Mapping to NVGPU operations such as mma.sync with mixed precision and ldmatrix with transposes and
various data types involves complex matchings from low-level IR.
This is akin to raising complex patterns after unnecessarily having lost structural information.
To avoid such unnecessary complexity, introduce a direct mapping step from a matmul on memrefs
to distributed NVGPU vector abstractions.
In this context, mapping to specific mma.sync operations is trivial and consists in simply
translating the documentation into indexing expressions.
Correctness is demonstrated with an end-to-end integration test.
Differential Revision: https://reviews.llvm.org/D153420
Valentin Clement [Mon, 26 Jun 2023 16:19:43 +0000 (09:19 -0700)]
[flang][openacc] Support array reduction for min in lowering
Add lowering support for array reduction with the
min operator.
Depends on D153650
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D153661
Mark de Wever [Mon, 26 Jun 2023 16:15:58 +0000 (18:15 +0200)]
[libc++] Silences an invalid compiler diagnostic.
When the value is not initialized it's never used. However silencing the
warning is trivial, as suggested by BlamKiwi.
Fixes https://llvm.org/PR63421
Matthias Springer [Mon, 26 Jun 2023 16:04:23 +0000 (18:04 +0200)]
[mlir][transform] Fix TrackingListener in regions that are isolated from above
When an operation is removed/replaced, the TrackingListener updates the internal transform state mapping between handles and payload IR. All handles must be updated, even the ones that are defined in a region that is beyond the most recent region that is isolated from above.
This fixes a bug, where a payload op was erased in a named sequence. Not only handles defined inside of the named region must be updated, but also all other handles such as the ones where the sequence is included.
Differential Revision: https://reviews.llvm.org/D153767
Matthias Springer [Mon, 26 Jun 2023 15:49:31 +0000 (17:49 +0200)]
[mlir][transform] Remove redundant handle check in `replacePayload...`
Differential Revision: https://reviews.llvm.org/D153766
Arthur Eubanks [Mon, 26 Jun 2023 15:45:17 +0000 (08:45 -0700)]
[gn build] Port
f2123af1e7d7
Arthur Eubanks [Mon, 26 Jun 2023 15:45:15 +0000 (08:45 -0700)]
[gn build] Port
8de9f2b558a0
Simon Pilgrim [Mon, 26 Jun 2023 15:50:03 +0000 (16:50 +0100)]
[X86] combineMul - ensure getTargetConstantFromNode splat extraction is the correct element width
The extracted Constant and Constant::getSplatValue can both be any bitwidth - they don't necessarily match the original ConstantSDNode type
Fixes #63507
Simon Pilgrim [Mon, 26 Jun 2023 14:31:29 +0000 (15:31 +0100)]
[X86] lowerV8I16Shuffle - use PACKSS(SEXT_INREG(X),SEXT_INREG(Y)) for pre-SSSE3 truncation shuffles
The comment about PSHUFLW+PSHUFHW+PSHUFD was outdated as that referred to a single input case, but that is now always handled earlier.
Another step towards removing premature combines to vector truncation combines to PACK.
Corentin Jabot [Mon, 26 Jun 2023 15:42:25 +0000 (17:42 +0200)]
[Clang] Fix invalid runline in test
Joseph Huber [Fri, 23 Jun 2023 12:42:19 +0000 (07:42 -0500)]
[libc] Allow the RPC client to be initialized via a H2D memcpy
The RPC client must be initialized to set a pointer to the underlying
buffer. This is currently done with the `reset` method which may not be
ideal for the use-case. We want runtimes to be able to initialize this
without needing to call a kernel. Recent changes allowed the `Client`
type to be trivially copyable. That means we can create a client on the
server side and then copy it over. To that end we take the existing
externally visible symbol and initialize it to the client's pointer.
Therefore we can look up the symbol and copy it over once loaded.
No test currently, I tested with a demo OpenMP application but couldn't think of
how to put that in-tree.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D153633
Valentin Clement [Mon, 26 Jun 2023 15:37:18 +0000 (08:37 -0700)]
[flang][openacc] Generate loop nest as column major
Address comment from D153455
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D153650
Matthias Springer [Mon, 26 Jun 2023 15:29:57 +0000 (17:29 +0200)]
[mlir][transform][NFC] Store all Mappings in region stack
Do not swap the Mappings when entering a region that is isolated from above. Simply push another Mappings struct to the stack and prevent invalid accesses during lookups.
Differential Revision: https://reviews.llvm.org/D153765
Philip Reames [Mon, 26 Jun 2023 15:26:25 +0000 (08:26 -0700)]
[RISCV] Rename all TA variants of VPseudoUnaryMask and VPatUnaryMask [NFC]
All of these pseudos take their policy from the policy operand via the normal mechanisms, and aren't "tail agnostic" in any particular sense.
Note that the existing VPatUnaryMask class was unused, and thus this is just a rename.
Matthias Springer [Mon, 26 Jun 2023 15:21:35 +0000 (17:21 +0200)]
[mlir][transform] Add notifyPayloadOperationReplaced to TransformRewriter
This function allows users to update payload op mappings in cases where such replacements cannot be performed automatically by the rewriter/listener interface.
Differential Revision: https://reviews.llvm.org/D153764
Philip Reames [Mon, 26 Jun 2023 14:59:39 +0000 (07:59 -0700)]
[RISCV] Combine VPseudoUnaryMask and VPseudoUnaryMaskTA [NFC]
The only difference between these classes was the existence of a policy operand on the latter. We can use the policy operand version for the one place which used the non-TA suffixed one. I then renamed to remove TA as these aren't tail agnostic; they take their policy from the operand.
Note that this wouldn't be strictly NFC except that the one user of the class being removed wasn't in the masked pseudo table, and thus doesn't go through mask to unmasked conversion in ISEL. That's a missed optimization we may want to fix at some point.
David Spickett [Mon, 26 Jun 2023 15:19:06 +0000 (15:19 +0000)]
[clang][OpenMP] Fix unused var warning
This was added by
453e02ca0903c9f65529d21c513925ab0fdea1e1. Use
isa instead since we don't use the result.
Fixes:
<..>SemaOpenMP.cpp:23149:13: warning: unused variable ‘TargetVarDecl’ [-Wunused-variable]
23149 | if (auto *TargetVarDecl = dyn_cast_or_null<VarDecl>(TargetDecl))
| ^~~~~~~~~~~~~
Which came up when building with GCC 9.
Mark de Wever [Mon, 26 Jun 2023 15:09:22 +0000 (17:09 +0200)]
[libc++][doc] Fixes a typo.
Thanks to ChuanqiXu for spotting it.
Kelvin Li [Wed, 14 Jun 2023 13:50:56 +0000 (09:50 -0400)]
[flang] Add PPC vec_max, vec_min, vec_madd and vec_nmsub intrinsic
Differential Revision: https://reviews.llvm.org/D152938
Philip Reames [Mon, 26 Jun 2023 14:42:08 +0000 (07:42 -0700)]
[RISCV] Split usage of VPseudoUnaryNoMask with GPR destination
These instructions don't have a passthrough operand or any of the policy behaviors, while all the other ones do. Split them out into their own class to make this separation clear, and rename the mask variant to match. (We'd already done the same for the mask variant.)
Differential Revision: https://reviews.llvm.org/D153596
Corentin Jabot [Sat, 10 Sep 2022 21:03:05 +0000 (23:03 +0200)]
[Clang] Implement P2738R1 - constexpr cast from void*
Reviewed By: #clang-language-wg, erichkeane
Differential Revision: https://reviews.llvm.org/D153702
Mehdi Amini [Mon, 26 Jun 2023 14:10:31 +0000 (16:10 +0200)]
Add missing dependent test dialect in a MLIR test pass
Fix #62317
Tue Ly [Mon, 26 Jun 2023 14:31:51 +0000 (14:31 +0000)]
[libc][Obvious] Fix docs warning.
Sam McCall [Mon, 26 Jun 2023 14:29:47 +0000 (16:29 +0200)]
[dataflow] fix test after conflict between
ae54f01dd8c53d1 &
f2123af1e7d75
Sam McCall [Thu, 22 Jun 2023 03:03:24 +0000 (05:03 +0200)]
[dataflow] avoid more accidental copies of Environment
This is clunky but greatly improves debugging of flow conditions - each
copy adds more indirections in the form of flow condition tokens.
(LatticeEffect presumably once did something here, but it's now both
unused and untested.)
For the exit flow condition of:
```
void target(base::Optional<int*> opt) {
  if (opt.value_or(nullptr) != nullptr) {
    opt.value();
  } else {
    opt.value(); // unsafe
  }
}
```
Before:
```
(B0:1 = V15)
(B1:1 = V8)
(B2:1 = V10)
(B3:1 = (V4 & (!V7 => V6)))
(V10 = (B3:1 & !V7))
(V12 = B1:1)
(V13 = B2:1)
(V15 = (V12 | V13))
(V3 = V2)
(V4 = V3)
(V8 = (B3:1 & !!V7))
B0:1
V2
```
After D153491:
```
(B0:1 = (V9 | V10))
(B1:1 = (B3:1 & !!V6))
(B2:1 = (B3:1 & !V6))
(B3:1 = (V3 & (!V6 => V5)))
(V10 = B2:1)
(V3 = V2)
(V9 = B1:1)
B0:1
V2
```
After this patch, we can finally see the relations between the flow
conditions directly:
```
(B0:1 = (B2:1 | B1:1))
(B1:1 = (B3:1 & !!V6))
(B2:1 = (B3:1 & !V6))
(B3:1 = (V3 & (!V6 => V5)))
(V3 = V2)
B0:1
V2
```
(I believe V2 is the FC for the InitEnv, and V3 is introduced when
computing the input state for B3 - not sure how to eliminate it)
Differential Revision: https://reviews.llvm.org/D153493
Martin Braenne [Tue, 20 Jun 2023 08:00:01 +0000 (08:00 +0000)]
[clang][dataflow] Perform deep copies in copy and move operations.
This serves two purposes:
- Because, today, we only copy the `StructValue`, modifying the destination of
the copy also modifies the source. This is demonstrated by the new checks
added to `CopyConstructor` and `MoveConstructor`, which fail without the
deep copy.
- It lays the groundwork for eliminating the redundancy between
`AggregateStorageLocation` and `StructValue`, which will happen as part of the
ongoing migration to strict handling of value categories (see
https://discourse.llvm.org/t/70086 for details). This will involve turning
`StructValue` into essentially just a wrapper for `AggregateStorageLocation`;
under this scheme, the current "shallow" copy (copying a `StructValue` from
one `AggregateStorageLocation` to another) will no longer be possible.
Because we now perform deep copies, tests need to perform a deep equality
comparison instead of just comparing for equal identity of the `StructValue`s.
The new function `recordsEqual()` provides such a deep equality comparison.
Reviewed By: xazax.hun
Differential Revision: https://reviews.llvm.org/D153006
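The aliasing problem and the need for a deep equality check can be illustrated with a minimal sketch. `Record`, `shallowCopy`, `deepCopy` and `recordsEqualSketch` are hypothetical stand-ins and do not reflect the real `StructValue`, `AggregateStorageLocation` or `recordsEqual()` definitions.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a record value; not the dataflow framework's types.
struct Record {
  std::map<std::string, int> fields;
};

// Shallow copy: source and destination share one Record, so a write through
// either handle is visible through the other.
inline std::shared_ptr<Record> shallowCopy(const std::shared_ptr<Record> &src) {
  return src;
}

// Deep copy: the destination gets its own Record with equal contents.
inline std::shared_ptr<Record> deepCopy(const std::shared_ptr<Record> &src) {
  return std::make_shared<Record>(*src);
}

// Deep equality: equal contents, regardless of object identity.
inline bool recordsEqualSketch(const Record &a, const Record &b) {
  return a.fields == b.fields;
}
```

After a deep copy the two handles no longer compare equal by identity, so a test has to compare contents instead, which is the role the message assigns to `recordsEqual()`.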
Alex Brachet [Mon, 26 Jun 2023 13:49:22 +0000 (13:49 +0000)]
[CMake][Fuchsia] Enable standalone libatomic
BUILTINS_${target}_COMPILER_RT_BUILD_STANDALONE_LIBATOMIC
actually builds libatomic, and
RUNTIMES_${target}_COMPILER_RT_BUILD_STANDALONE_LIBATOMIC
tells the compiler-rt tests that we built it and it is
safe to use in tests.
Differential Revision: https://reviews.llvm.org/D151681
Alex Brachet [Mon, 26 Jun 2023 13:40:22 +0000 (13:40 +0000)]
[compiler-rt] Stop using system ldd to detect libc version
The system libc may be different from the libc passed in
CMAKE_SYSROOT. Instead of using the ldd in PATH to detect
glibc version, use the features.h header file.
Differential Revision: https://reviews.llvm.org/D151678
Graham Hunter [Fri, 10 Mar 2023 11:17:04 +0000 (11:17 +0000)]
[AArch64][CodeGen] Lower (de)interleave2 intrinsics to ld2/st2
The InterleavedAccess pass currently matches (de)interleaving
shufflevector instructions with loads or stores, and calls into
target lowering to generate ldN or stN instructions.
Since we can't use shufflevector for scalable vectors (besides a
splat with zeroinitializer), we have interleave2 and deinterleave2
intrinsics. This patch extends InterleavedAccess to recognize those
intrinsics and if possible replace them with ld2/st2.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D146218
Louis Dionne [Mon, 5 Jun 2023 19:27:38 +0000 (12:27 -0700)]
[libc++][filesystem] Avoid using anonymous namespaces in support headers
This avoids using anonymous namespaces in headers and ensures that
the various helper functions get deduplicated across the TUs
implementing <filesystem>. Otherwise, we'd get a definition of
these helper functions in each TU where they are used, which is
entirely unnecessary.
Differential Revision: https://reviews.llvm.org/D152378
Zain Jaffal [Mon, 26 Jun 2023 12:07:56 +0000 (13:07 +0100)]
[YAMLParser] Support block nodes when parsing YAML strings.
Previously if a string is in the format
```
|
val
val2
val3
```
the YAML parser would error out without parsing the string. The pattern above is a valid YAML string and should be parsed.
Differential Revision: https://reviews.llvm.org/D153760
Nikita Popov [Mon, 26 Jun 2023 13:26:13 +0000 (15:26 +0200)]
[SCEV] Print block dispositions on mismatch (NFC)
Sam McCall [Sat, 24 Jun 2023 00:45:17 +0000 (02:45 +0200)]
[dataflow] Disallow implicit copy of Environment, use fork() instead
Environments are heavyweight, and copies are observably different from the
original: they introduce new SAT variables, which degrade performance &
debugging. Copies should only be done deliberately, where justified.
Empirically there are several places in the framework where we perform dubious
copies, sometimes entirely accidentally. (see e.g. D153491). Making these
explicit makes this mistake harder.
This patch forces copies to go through fork(); the copy constructor is now private.
This requires changes to existing callsites: some are correct and call fork(),
some are incorrect and are fixed, others are difficult and I left a FIXME.
Differential Revision: https://reviews.llvm.org/D153674
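The pattern described here (a private copy constructor with an explicit `fork()`) can be sketched as follows. `Env` is a hypothetical stand-in, not the real `Environment` class, and its `tokens` member only exists to make copies observable.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a heavyweight environment; not the real class.
class Env {
public:
  Env() = default;
  Env(Env &&) = default;
  Env &operator=(Env &&) = default;

  // The only way to duplicate an Env: every copy is visible at the call site.
  Env fork() const { return Env(*this); }

  void addToken(int t) { tokens.push_back(t); }
  std::size_t size() const { return tokens.size(); }

private:
  Env(const Env &) = default;            // private: `Env b = a;` is rejected outside the class
  Env &operator=(const Env &) = delete;  // no implicit copy-assignment either
  std::vector<int> tokens;
};

// From outside the class the type is not copy-constructible.
static_assert(!std::is_copy_constructible<Env>::value,
              "implicit copies of Env should not compile");
```

A caller writes `Env child = parent.fork();`, which keeps moves cheap and implicit while making each copy a deliberate, searchable operation.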
Christian Sigg [Mon, 26 Jun 2023 12:55:03 +0000 (14:55 +0200)]
[Bazel][llvm] Fix after 8de9f2b
Nikita Popov [Fri, 16 Jun 2023 15:16:52 +0000 (17:16 +0200)]
[LCSSA] Compute SCEV of LCSSA phi if original instruction had SCEV
The backstory is that the LCSSA invalidation we perform here is not
really necessary from a SCEV perspective. However, other code may
rely on the fact that invalidating only LCSSA phi nodes is sufficient
for transforms like loop peeling
(see https://reviews.llvm.org/D149331#4398582 for more details).
However, performing invalidation during LCSSA construction also
means that SCEV expansion (which may need to construct LCSSA) can
invalidate SCEV, which is somewhat unexpected and code may not be
prepared to deal with it (see the added test case, reported at
https://reviews.llvm.org/D149435#4428219).
Instead of invalidating SCEV, ensure that the LCSSA phi node also
has cached SCEV if the original instruction did. This means that
later invalidation of LCSSA phi nodes will work as expected. This
should avoid both the above issues and be more efficient.
Differential Revision: https://reviews.llvm.org/D153145
Leandro Lupori [Wed, 21 Jun 2023 19:32:48 +0000 (19:32 +0000)]
[NFC][flang] Fix PushSemantics macro
Add and use the CONCAT macro to force the expansion of __LINE__ in
PushSemantics body.
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D153460
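Why an extra macro level is needed can be shown with a short sketch. The macro names below are hypothetical (they are not the ones in the flang sources), and `FORTY_TWO` stands in for `__LINE__` so the result is deterministic; `__LINE__` is subject to the same expansion rule.

```cpp
#include <cstring>

#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// Pasting directly: `b` is an operand of ##, so it is NOT macro-expanded first.
#define PASTE_DIRECT(a, b) a##b

// Extra level of indirection: the arguments are expanded before they reach ##.
#define CONCAT_IMPL(a, b) a##b
#define CONCAT(a, b) CONCAT_IMPL(a, b)

#define FORTY_TWO 42

// Yields the spelling of the pasted token without expansion.
inline const char *directName() { return STRINGIFY(PASTE_DIRECT(guard_, FORTY_TWO)); }
// Yields the spelling of the pasted token after the argument was expanded.
inline const char *concatName() { return STRINGIFY(CONCAT(guard_, FORTY_TWO)); }
```

The direct paste produces the identifier `guard_FORTY_TWO`, whereas the two-level form produces `guard_42`; with `__LINE__` the former would give the same name on every line and defeat the purpose of a per-line unique identifier.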
Leandro Lupori [Fri, 16 Jun 2023 15:59:52 +0000 (15:59 +0000)]
[flang] Fix lowering of array paths in elemental calls
Elemental procedures may need their array arguments to be passed by
address. This is done by setting ArrayExprLowering::semant to a
value that corresponds to this semantics. Later, member functions
such as applyPathToArrayLoad() read this variable to generate FIR
instructions that match the needed behavior. The problem is that
the semant variable also affects how array paths are lowered. Thus,
if an index of the path is an array element, this will cause its
address to be used instead of its value, which usually results in a
segmentation fault at runtime.
Example: b(i:i) = elem_func(a(v(i):v(i)))
To fix this, ArrayExprLowering::nextPathSemant was added. When it's
set, the next array path is handled with the semantics specified by
it, while the elemental argument retains its original semantics.
Fixes https://github.com/llvm/llvm-project/issues/62981
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D153454
Aaron Ballman [Mon, 26 Jun 2023 12:04:20 +0000 (08:04 -0400)]
Diagnose incorrect syntax for #pragma clang diagnostic
We would previously fail to diagnose unexpected tokens after a 'push'
or 'pop' directive.
Fixes https://github.com/llvm/llvm-project/issues/13920
serge-sans-paille [Mon, 26 Jun 2023 07:52:59 +0000 (09:52 +0200)]
[Remarks] Make sure -fdiagnostics-hotness-threshold implies -fdiagnostics-show-hotness
When asking for -fdiagnostics-hotness-threshold, we currently require
-fdiagnostics-show-hotness otherwise we silently display nothing.
I don't see a scenario where that makes sense, so have one option imply
the other.
Differential Revision: https://reviews.llvm.org/D153746
Simon Pilgrim [Mon, 26 Jun 2023 11:39:43 +0000 (12:39 +0100)]
[X86] Generalize combineVectorTruncationWithPACKUS/combineVectorTruncationWithPACKSS and reuse in LowerTRUNCATE
Rename combineVectorTruncationWithPACK* to truncateVectorWithPACK* and split the operands so LowerTRUNCATE can more easily use them.
Noticed while investigating some regressions in D152928 due to us trying to truncate to PACKUS/PACKSS instructions too early
Jean Perier [Mon, 26 Jun 2023 11:23:12 +0000 (13:23 +0200)]
[flang][hlfir] user defined assignment codegen
Add codegen support for hlfir.region_assign with user defined
assignment.
It is currently a bit pessimistic, because outside of forall, it
does not use the PURE aspect, if any, of the assignment routine to
rule out that the routine can write to something else than the LHS that
could overlap with the RHS.
However, the current lowering is anyway adding parenthesis around the
RHS, so this should not cause performance regressions.
Differential Revision: https://reviews.llvm.org/D153516
Jean Perier [Mon, 26 Jun 2023 11:06:43 +0000 (13:06 +0200)]
[flang][hlfir] Lower user defined assignment
Lower user defined assignment inside the hlfir.region_assign
"userDefinedAssignment" mlir region.
This is done by adding an entry point to ConvertCall.h in order
to call genUserCall with the region block arguments as arguments.
The codegen for hlfir.region_assign with user defined assignment
will be added in a later patch.
Differential Revision: https://reviews.llvm.org/D153404
Théo Degioanni [Mon, 26 Jun 2023 10:49:54 +0000 (12:49 +0200)]
[mlir][llvm] Introduce some constant folding.
This revision introduces some constant folding features to the LLVM
dialect. This specific choice of operations to cover is intended to
allow the elimination of logic generated by mem2reg with memset in the
common case of memsets of constant values.
This also introduces new verifiers for integer extension operations.
This led to a fix in SPIRV to LLVM conversion, as it would sometimes
generate invalid ZExt and SExt operations.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D153135
Nikita Popov [Thu, 22 Jun 2023 07:14:08 +0000 (09:14 +0200)]
[BasicAA] Don't short-circuit non-capturing arguments
This is an alternative to D153464. BasicAA currently assumes that
an unescaped alloca cannot be read through non-nocapture arguments
of a call, based on the argument that if the argument were based on
the alloca, it would not be unescaped.
This currently fails in the case where the call is an ephemeral value
and as such does not count as a capture. It also happens for calls
that are readonly+nounwind+void, though that case tends to not matter
in practice, because such calls will get DCEd anyway.
Differential Revision: https://reviews.llvm.org/D153511
Florian Hahn [Mon, 26 Jun 2023 10:14:33 +0000 (11:14 +0100)]
[ConstraintElim] Add extra phi use tests.
Add additional tests for D153660.
Benjamin Kramer [Mon, 26 Jun 2023 10:04:24 +0000 (12:04 +0200)]
[bazel] Add TargetParser dep to tblgen after
8de9f2b558a046da15cf73191da627bdd83676ca
Quentin Colombet [Mon, 26 Jun 2023 09:42:55 +0000 (11:42 +0200)]
[Reassociation] Only form CSE expressions for local operands
# TL;DR #
This patch constrains how much freedom the heuristic that tries to form CSE
expressions has. The added constraint is that the CSE-able expressions must be
within the same basic block as the expressions they get moved before.
# Details #
The reassociation pass currently tweaks the rewrite of the final expression
towards surfacing pairs of operands that would be CSE-able.
This heuristic applies after the regular ordering of the expression.
The regular ordering uses the program structure to choose in which order each
subexpression is materialized. That order follows the topological order.
Now, to expose more CSE opportunities, this heuristic effectively bypasses the
previous ordering normally defined by the program and pushes up sub-expressions
that are arbitrary deep in the CFG.
E.g., let's say the program order (top to bottom) gives `(((a*b)*c)*d)*e` and
`b*e` appears the most in the program. The expression will be reordered into
`(((b*e)*a)*c)*d`
This reordering implies that all the sub expressions (in this example `xx*a`,
then `yy*c`, etc.) will need to appear after the CSE-able expression.
This may over-constrain where the (sub) expressions may live and in particular
it may create loop-dependent expressions.
This patch only allows moving expressions up the expression chain when the
related values are defined in the same basic block as the ones they
"push-down".
This constraint is far from being perfect but at least it avoids accidentally
creating loop dependent variables.
If we really want to expose CSE-able expressions in a proper way, we would need
a profitability metric and also make the decision globally as opposed to one
chain at a time.
I've put the new constraint behind an option to make comparing the old and new
versions easy. However, I believe that even if we find cases where the old
version performs better it is probably by accident. What I am aiming for with
this change is more predictability, then we can improve if need be.
This fixes www.llvm.org/PR61458
Differential Revision: https://reviews.llvm.org/D147457
Lorenzo Chelini [Wed, 21 Jun 2023 08:53:43 +0000 (10:53 +0200)]
[MLIR][Linalg] Avoid generalizing `linalg.map`
We cannot trivially generalize `linalg.map`, as it does not use the
output as a region argument in the block, while `linalg.generic` expects
many region arguments as the input/output operands.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D153442
Piotr Zegar [Mon, 26 Jun 2023 09:19:01 +0000 (09:19 +0000)]
Revert "[clang-tidy] Add modernize-printf-to-std-print check"
This reverts commit
ec89cb9a81529fd41fb37b8e62203a2e9f23bd54.
Emilia Kond [Mon, 26 Jun 2023 09:35:47 +0000 (12:35 +0300)]
[clang-format] Preserve AmpAmpTokenType in nested parentheses
When parsing a requires clause, the UnwrappedLineParser would delegate to
parseParens with an AmpAmpTokenType set to BinaryOperator. However,
parseParens would not carry this over into any nested parens, meaning it
could assign a different token type to an && in a requires clause.
This patch makes sure parseParens inherits its parameter when performing
a recursive call.
Fixes https://github.com/llvm/llvm-project/issues/63251
Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D153641
Jens Massberg [Tue, 20 Jun 2023 11:25:56 +0000 (13:25 +0200)]
[clangd][c++20]Consider rewritten binary operators in TargetFinder
In C++20 some binary operations can be rewritten, e.g. `a != b`
can be rewritten to `!(a == b)` if `!=` is not explicitly defined.
The `TargetFinder` didn't consider the corresponding `CXXRewrittenBinaryOperator` yet. As a result, the definition of such operators couldn't be found
when navigating to such a `!=` operator, see https://github.com/clangd/clangd/issues/1476.
In this patch we add support of `CXXRewrittenBinaryOperator` in `FindTarget`.
In such a case we redirect to the inner binary operator of the decomposed form.
E.g. in case that `a != b` has been rewritten to `!(a == b)` we go to the
`==` operator. The `==` operator might be implicitly defined (e.g. by a `<=>`
operator), but this case is already handled, see the new test.
I'm not sure if the hover test which is added in this patch is the right one,
but at least it passes with this patch and fails without it :)
Note that it might be a bit misleading that hovering over a `!=` refers to
"instance method operator==".
Differential Revision: https://reviews.llvm.org/D153331
Job Noorman [Mon, 26 Jun 2023 08:26:56 +0000 (10:26 +0200)]
Move SubtargetFeature.h from MC to TargetParser
SubtargetFeature.h is currently part of MC while it doesn't depend on
anything in MC. Since some LLVM components might have the need to work
with target features without necessarily needing MC, it might be
worthwhile to move SubtargetFeature.h to a different location. This will
reduce the dependencies of said components.
Note that I chose TargetParser as the destination because that's where
Triple lives and SubtargetFeatures feels related to that.
This issue came up during a JITLink review (D149522). JITLink would
like to avoid a dependency on MC while still needing to store target
features.
Reviewed By: MaskRay, arsenm
Differential Revision: https://reviews.llvm.org/D150549
Emilia Kond [Mon, 26 Jun 2023 09:26:03 +0000 (12:26 +0300)]
[clang-format][NFC] Use correct test method for new tests
7a38b3bfeb56 landed after
20b4df1ed611, which refactored how tests are
structured in FormatTest. This quick fix-up unifies the tests added in
7a38b3bfeb56 to comply with this new format.
Mike Crowe [Mon, 26 Jun 2023 05:44:14 +0000 (05:44 +0000)]
[clang-tidy] Add modernize-printf-to-std-print check
Add FormatStringConverter utility class that is capable of converting
printf-style format strings into std::print-style format strings along
with recording a set of casts to wrap the arguments as required and
removing now-unnecessary calls to std::string::c_str() and
std::string::data().
Use FormatStringConverter to implement a new clang-tidy check that is
capable of converting calls to printf, fprintf, absl::PrintF,
absl::FPrintF, or any functions configured by an option to calls to
std::print and std::println, or other functions configured by options.
In other words, the check turns:
fprintf(stderr, "The %s is %3d\n", description.c_str(), value);
into:
std::println(stderr, "The {} is {:3}", description, value);
if it can.
std::print and std::println can do almost anything that standard printf
can, but the conversion has some limitations that are described in
the documentation. If conversion is not possible then the call remains
unchanged.
Depends on D153716
Reviewed By: PiotrZSL
Differential Revision: https://reviews.llvm.org/D149280
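To give a feel for the kind of rewriting involved, here is a deliberately tiny sketch of a format-string conversion. It is not clang-tidy's FormatStringConverter: `convertFormat` is a hypothetical function that only understands `%s`, `%d` with an optional width, and `%%`, and it reports failure for anything else so the caller can leave the call unchanged. It does not handle the trailing-newline-to-`println` step or argument casts.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical sketch, not clang-tidy's FormatStringConverter: handles only
// %s, %d (with an optional width) and %%, and escapes literal braces.
// Returns false when it meets a conversion it does not understand.
inline bool convertFormat(const std::string &in, std::string &out) {
  out.clear();
  for (std::size_t i = 0; i < in.size(); ++i) {
    const char c = in[i];
    if (c == '{' || c == '}') {
      out += c;  // braces are special in std::format-style strings,
      out += c;  // so a literal brace has to be doubled
    } else if (c != '%') {
      out += c;
    } else if (i + 1 < in.size() && in[i + 1] == '%') {
      out += '%';  // "%%" is a literal percent sign
      ++i;
    } else {
      std::string width;
      std::size_t j = i + 1;
      while (j < in.size() && std::isdigit(static_cast<unsigned char>(in[j])))
        width += in[j++];
      if (j >= in.size() || (in[j] != 's' && in[j] != 'd'))
        return false;  // unsupported conversion: caller keeps the original call
      out += width.empty() ? std::string("{}") : "{:" + width + "}";
      i = j;  // continue after the conversion character
    }
  }
  return true;
}
```

For the format string in the message above, minus its trailing newline, this yields `The {} is {:3}`; the real check additionally decides between `std::print` and `std::println` and drops the `.c_str()` call.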
eopXD [Thu, 22 Jun 2023 08:35:34 +0000 (01:35 -0700)]
[Clang][RISCV] Check type support for local variable declaration of RVV type
Guard local variable declaration for RVV intrinsic types.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153510
Ingo Müller [Mon, 26 Jun 2023 08:22:22 +0000 (08:22 +0000)]
[mlir][LLVMIR] Allow !llvm.ptr<ptr> operands in atomicrmw xchg op.
Previously, llvm.atomicrmw only allowed operands that are pointers to
LLVM floats or integers. However, according to the LLVM IR Language
Reference, that op allows pointer to pointer operands in its `xchg`
mode. This patch allows those operands also in MLIR's LLVM dialect and
adapts the tests accordingly.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D153747
Luke Lau [Fri, 23 Jun 2023 13:56:56 +0000 (14:56 +0100)]
[RISCV] Teach doPeepholeMaskedRVV to handle vslide{up,down}
We already handle vslide1{up,down}, so this extends it to vslide{up,down}.
This was unintentionally added in https://reviews.llvm.org/D150463 and
then removed in
37cfcfcef76bb615b941d7077ca81168bd7ad080, but unless I'm
missing something this should still be ok as the mask only controls what
destination elements are written to.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153631
Luke Lau [Fri, 23 Jun 2023 13:31:45 +0000 (14:31 +0100)]
[RISCV] Add test cases for vmerge peephole with vslides
Currently vslide1{up,down}s can have vmerges folded into them, but not
vslide{up,down}s.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153630
Ingo Müller [Sun, 25 Jun 2023 09:04:49 +0000 (09:04 +0000)]
[mlir][affine][doc] Fix example snippet for AffineParallelOp. (NFC)
There were various syntax errors; all have pretty trivial fixes but
might distract novice users (like me).
Reviewed By: ingomueller-net
Differential Revision: https://reviews.llvm.org/D153720
Haohai Wen [Mon, 26 Jun 2023 07:27:06 +0000 (15:27 +0800)]
Reland [COFF] Support -gsplit-dwarf for COFF on Windows
This relands 3eee5aa528abd67bb6d057e25ce1980d0d38c445 with fixes.
Job Noorman [Mon, 26 Jun 2023 07:09:36 +0000 (09:09 +0200)]
[JITLink][RISCV] Adjust offsets of non-relaxable edges
The relaxation algorithm used to only update offsets of relaxable edges.
This caused non-relaxable edges that appear after a relaxed instruction
to have an incorrect offset and be applied at the wrong location. This
patch fixes this by updating the offsets of all edges.
Note that this bug was caused by an incorrect translation of LLD's
relaxation algorithm. LLD always uses all edges during relaxation while
I decided to keep only the relaxable edges to avoid having to iterate
non-relaxable edges at each step. However, this had the side effect of
only updating offsets of relaxable edges. This patch leaves the
filtering of relaxable edges as-is but iterates all edges when updating
offsets.
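The offset update described above can be sketched as follows. This is a simplified model, not the JITLink code: the `Edge` and `Removal` types and the `adjustOffsets` helper are hypothetical names introduced only for illustration. The point is that every edge located after removed bytes must move, whether or not the edge itself is relaxable.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins; these are not the JITLink types.
struct Edge {
  uint32_t Offset; // byte offset of the fixup within the block
  bool Relaxable;  // true for edges the relaxation pass may shrink
};

// Bytes deleted by relaxing the instruction sequence that starts at Offset.
struct Removal {
  uint32_t Offset;
  uint32_t Bytes;
};

// Shift every edge, relaxable or not, left by the number of bytes removed
// at strictly smaller offsets.
void adjustOffsets(std::vector<Edge> &Edges,
                   const std::vector<Removal> &Removals) {
  for (Edge &E : Edges) {
    uint32_t Delta = 0;
    for (const Removal &R : Removals)
      if (R.Offset < E.Offset)
        Delta += R.Bytes;
    E.Offset -= Delta;
  }
}
```

With a call at offset 0 shrunk by 4 bytes, a non-relaxable edge that used to sit at offset 8 ends up at offset 4 in this model. Updating only the relaxable edges would leave it at 8, which is the bug described above.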
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D153515
Job Noorman [Mon, 26 Jun 2023 07:09:29 +0000 (09:09 +0200)]
[JITLink][RISCV] Expose relaxation pass publicly
This is useful for contexts where shouldAddDefaultTargetPasses returns
false but that still want to perform relaxation.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D153538
Job Noorman [Mon, 26 Jun 2023 07:09:19 +0000 (09:09 +0200)]
[JITLink][RISCV] Support relaxable edges without relaxation pass
Relaxable edges are created unconditionally, even when the relaxation
pass will not run. However, they were not recognized by applyFixup
causing them to not be applied.
To support configurations without the relaxation pass, this patch adds
these relaxable edges to applyFixup:
- CallRelaxable: Can be treated as R_RISCV_CALL
- AlignRelaxable: Can simply be ignored
An alternative could be to unconditionally run the relaxation pass, even
in contexts where shouldAddDefaultTargetPasses returns false. However, I
could imagine there being use cases for disabling relaxation which
wouldn't be possible anymore then.
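For context, treating an edge as R_RISCV_CALL means patching an auipc+jalr pair with the usual hi20/lo12 split from the RISC-V psABI. The sketch below illustrates that arithmetic only; it is not copied from applyFixup, and `CallPair`, `applyCall` and `decodeCall` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two 32-bit instructions of a call sequence.
struct CallPair {
  uint32_t Auipc;
  uint32_t Jalr;
};

// Patch the pair so that it jumps to PC + Value.
CallPair applyCall(CallPair P, int32_t Value) {
  uint32_t V = static_cast<uint32_t>(Value);
  // Adding 0x800 compensates for jalr sign-extending its 12-bit immediate.
  uint32_t Hi20 = (V + 0x800) & 0xFFFFF000;
  uint32_t Lo12 = V & 0xFFF;
  P.Auipc = (P.Auipc & 0x00000FFF) | Hi20;       // auipc imm[31:12]
  P.Jalr = (P.Jalr & 0x000FFFFF) | (Lo12 << 20); // jalr imm[11:0]
  return P;
}

// Recompute the displacement encoded in a patched pair.
int32_t decodeCall(CallPair P) {
  uint32_t Hi = P.Auipc & 0xFFFFF000;
  int32_t Lo = static_cast<int32_t>((P.Jalr >> 20) & 0xFFF);
  if (Lo & 0x800)
    Lo -= 0x1000; // sign-extend the 12-bit immediate
  return static_cast<int32_t>(Hi + static_cast<uint32_t>(Lo));
}
```

An edge that can simply be ignored, such as AlignRelaxable in the list above, needs no such patching in this model.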
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D153541
pvanhout [Fri, 23 Jun 2023 10:26:57 +0000 (12:26 +0200)]
[NFC][GlobalISel] Don't return `bool` from apply functions
There is no case where those functions return false; they always return true.
Even if they were to return false, I don't think it is something we should rely on.
With the current combiner implementation, it would just make `tryCombineAll` return false without trying any more rules.
I also believe that if an apply function were to return false, it would mean that the match function is not good enough. Asserting on failure in an apply function is a better idea, IMO.
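A minimal sketch of the convention argued for above, using a toy instruction rather than the GlobalISel API. `Add`, `matchAddZero`, `applyAddZero` and `tryCombine` are hypothetical names: the match step carries every check, and the apply step returns `void` and asserts its precondition.

```cpp
#include <cassert>

// Hypothetical toy "instruction": an add of two operands.
struct Add {
  int LHS;
  int RHS;
  int Result;
  bool Erased;
};

// Match step: all legality checks live here.
bool matchAddZero(const Add &I, int &Replacement) {
  if (I.RHS != 0)
    return false;
  Replacement = I.LHS;
  return true;
}

// Apply step: cannot fail, so it returns void and asserts instead.
void applyAddZero(Add &I, int Replacement) {
  assert(I.RHS == 0 && "match function should have rejected this");
  I.Result = Replacement;
  I.Erased = true;
}

// Driver: success is decided by the match step alone.
bool tryCombine(Add &I) {
  int Replacement = 0;
  if (!matchAddZero(I, Replacement))
    return false;
  applyAddZero(I, Replacement);
  return true;
}
```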
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D153619
WANG Xuerui [Mon, 26 Jun 2023 07:07:21 +0000 (15:07 +0800)]
[Clang][LoongArch] Consume and check -mabi and -mfpu even if -m*-float is present
This kind of CLI flag duplication can sometimes be convenient for build
systems that may have to tinker with these.
For example, in the Linux kernel we almost always want to ensure no FP
instruction is emitted, so `-msoft-float` is present by default; but
sometimes we do want to allow FPU usage (e.g. certain parts of amdgpu DC
code), in which case we want the `-msoft-float` stripped and `-mfpu=64`
added. Here we face a dilemma without this change:
* Either `-mabi` is not supplied by `arch/loongarch` Makefile, in which
case the correct ABI has to be supplied by the driver Makefile
(otherwise the ABI will become double-float due to `-mfpu`), which is
arguably not appropriate for a driver;
* Or `-mabi` is still supplied by `arch/loongarch` Makefile, and the
build immediately errors out because
`-Werror=unused-command-line-argument` is unconditionally set for
Clang builds.
To solve this, simply make sure to check `-mabi` and `-mfpu` (and gain
some useful diagnostics in case of conflicting settings) when
`-m*-float` is successfully parsed.
Reviewed By: SixWeining, MaskRay
Differential Revision: https://reviews.llvm.org/D153707
David CARLIER [Sun, 25 Jun 2023 15:21:15 +0000 (16:21 +0100)]
sanitizer: enable getentropy interception on Linux/GLIBC 2.25 and onwards
https://man7.org/linux/man-pages/man3/getentropy.3.html
Reviewers: melver
Reviewed-By: melver
Differential Revision: https://reviews.llvm.org/D153723
Tobias Gysi [Mon, 26 Jun 2023 06:40:02 +0000 (06:40 +0000)]
[mlir] Avoid expensive LLVM IR import warnings
The revision adds a flag to the LLVM IR import
that avoids emitting expensive warnings about
unsupported debug intrinsics and unhandled
metadata.
Reviewed By: Dinistro
Differential Revision: https://reviews.llvm.org/D153625
Chuanqi Xu [Mon, 26 Jun 2023 06:27:05 +0000 (14:27 +0800)]
[C++] [Coroutines] Assume the allocation doesn't return nullptr
When 'get_return_object_on_allocation_failure' is declared, the
compiler is required to call 'operator new(size_t, nothrow_t)' and then
handle the failure case by calling
'get_return_object_on_allocation_failure()'. But the failure case should
be rare and we can assume the allocation is successful and pass the
information to the optimizer.
Craig Topper [Mon, 26 Jun 2023 05:58:33 +0000 (22:58 -0700)]
[RISCV] Properly handle partial writes in isConvertibleToVMV_V_V.
We were only checking for the previous instructions to write exactly
the register or a super register. We ignored writes to a subregister
and continued searching for the producing instruction. We need to
abort instead.
There's another check inside the if body to abort if the registers
don't match exactly. So we just need to check for overlap so we
enter the if body.
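The difference between the two conditions can be modelled with registers as lane bitmasks. This is an illustrative model only, not the code in isConvertibleToVMV_V_V; `LaneMask`, `covers`, `overlaps`, `classifyOld` and `classifyNew` are hypothetical names, and `FoundProducer` stands in for the further checks the real code performs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: one bit per sub-register lane.
using LaneMask = uint8_t;

// True if A is B itself or a super-register of B.
bool covers(LaneMask A, LaneMask B) { return (A & B) == B; }
// True if A and B share at least one lane.
bool overlaps(LaneMask A, LaneMask B) { return (A & B) != 0; }

enum class Scan { KeepSearching, Abort, FoundProducer };

// Old condition: a write to a sub-register of Src is skipped over.
Scan classifyOld(LaneMask Def, LaneMask Src) {
  if (!covers(Def, Src))
    return Scan::KeepSearching;
  return Def == Src ? Scan::FoundProducer : Scan::Abort;
}

// New condition: any overlap enters the body, where a non-exact match aborts.
Scan classifyNew(LaneMask Def, LaneMask Src) {
  if (!overlaps(Def, Src))
    return Scan::KeepSearching;
  return Def == Src ? Scan::FoundProducer : Scan::Abort;
}
```

For a two-lane source `0x3` and a partial write `0x1`, the old condition keeps searching while the new one aborts. Exact matches, super-register writes and unrelated writes are classified the same by both.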
Reviewed By: fakepaper56
Differential Revision: https://reviews.llvm.org/D153490
Craig Topper [Mon, 26 Jun 2023 05:58:18 +0000 (22:58 -0700)]
[RISCV] Add test case for D153490. NFC
Craig Topper [Mon, 26 Jun 2023 05:53:58 +0000 (22:53 -0700)]
[RISCV] Use unsigned types for orc_b builtins.
Inspired by D153235, I think bit manipulation makes more
sense on unsigned types.
Reviewed By: Jim
Differential Revision: https://reviews.llvm.org/D153403
Jim Lin [Mon, 26 Jun 2023 02:27:55 +0000 (10:27 +0800)]
[RISCV] Change the type of argument to clz and ctz from ZiZi/WiWi to iUZi/iUWi
The input argument of clz and ctz should be an unsigned type and the return value
should be an integer, like `__builtin_clz` and `__builtin_ctz` defined in clang/include/clang/Basic/Builtins.def.
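For reference, the generic GCC/Clang builtins `__builtin_clz` and `__builtin_ctz` already follow this shape: unsigned argument, `int` result. The sketch below checks it at compile time; the wrappers `leadingZeros` and `trailingZeros` are hypothetical names, and the RISC-V specific builtins are not used here.

```cpp
#include <cassert>
#include <type_traits>

// The generic builtins take an unsigned int and return int.
static_assert(std::is_same<decltype(__builtin_clz(1u)), int>::value,
              "clz returns int");
static_assert(std::is_same<decltype(__builtin_ctz(1u)), int>::value,
              "ctz returns int");

// Thin wrappers; the result is undefined for X == 0.
int leadingZeros(unsigned X) { return __builtin_clz(X); }
int trailingZeros(unsigned X) { return __builtin_ctz(X); }
```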
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D153235
Craig Topper [Mon, 26 Jun 2023 04:59:27 +0000 (21:59 -0700)]
[RISCV] Remove WriteJmpReg. Use WriteJalr in its place.
It was only used for the compressed instruction c.jr which expands
to jalr with rd=x0. Use WriteJalr instead to match jalr.
Aiden Grossman [Mon, 26 Jun 2023 03:15:43 +0000 (03:15 +0000)]
Revert "[llvm-exegesis] Introduce SubprocessMemory Utility Class"
This reverts commit 0b6b400b98b921279fc08c63a2a68ebfcb12a3e2.
The sporadic test failures were fixed during this land, but I forgot to
fix the build failures on certain platforms (seems to mostly be
AArch64/PPC) that result from them not being able to find the symbols
for shm_open and shm_unlink.
WANG Xuerui [Mon, 26 Jun 2023 01:55:13 +0000 (09:55 +0800)]
[llvm-objcopy] Add LoongArch support
Apart from general feature parity, this is also necessary for enabling
ClangBuiltLinux that defaults to using LLVM tools.
While at it, add a missing comment for the Hexagon definition directly
above, so it doesn't get confused with the SPARC definitions.
Reviewed By: SixWeining, MaskRay, jhenderson
Differential Revision: https://reviews.llvm.org/D153609
Weining Lu [Mon, 26 Jun 2023 01:26:15 +0000 (09:26 +0800)]
[LoongArch] Remove AssemblerPredicate for features: f/d/lsx/lasx/lvz/lbt
Linux LoongArch port [1] uses `-msoft-float` (implies no FPU) in its
`cflags` while it also uses floating-point insns in asm sources [2].
GAS allows this usage while IAS currently does not.
This patch removes `AssemblerPredicate`s for floating-point insns to
make IAS compatible with GAS. Similarly, also remove
`AssemblerPredicate`s for other ISA extensions, i.e. lsx/lasx/lvz/lbt.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/loongarch/Makefile?h=v6.4-rc1#n49
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/loongarch/kernel/fpu.S?h=v6.4-rc1#n29
Reviewed By: xen0n, hev
Differential Revision: https://reviews.llvm.org/D150196
Weining Lu [Mon, 26 Jun 2023 01:50:36 +0000 (09:50 +0800)]
[doc][LoongArch] Add missed release note about `ual` feature addition
I meant to fold this into 47601815ec3a4f31c797c75748af08acfabc46dc
but failed to do so.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D152671
Aiden Grossman [Sat, 20 May 2023 09:50:43 +0000 (09:50 +0000)]
[llvm-exegesis] Introduce SubprocessMemory Utility Class
This patch introduces the SubprocessMemory class to llvm-exegesis. This
class contains several utilities that are needed for managing memory to
set up an execution environment for memory annotations.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D151022
Wang Rui [Mon, 26 Jun 2023 02:32:24 +0000 (10:32 +0800)]
[LoongArch] Optimize conditional selection of integer
This patch optimizes code generation by leveraging the zeroing behavior of the `maskeqz`/`masknez` instructions.
```
int sel(int a, int b)
{
  return (a < b) ? a : 0;
}
```
```
slt $a1,$a0,$a1
masknez $a2,$r0,$a1
maskeqz $a0,$a0,$a1
or $a0,$a0,$a2
```
=>
```
slt $a1,$a0,$a1
maskeqz $a0,$a0,$a1
```
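Why the two-instruction form is equivalent can be checked with a small software model. This is a sketch under the assumption that `maskeqz rd, rj, rk` yields 0 when `rk` is zero and `rj` otherwise, and that `masknez` is its complement; the helper names `selBefore` and `selAfter` are hypothetical. Because the false operand is `$r0`, the `masknez` result is always 0 and the `or` is a no-op.

```cpp
#include <cassert>
#include <cstdint>

// Software model of the instructions (assumed semantics, see above).
uint64_t maskeqz(uint64_t rj, uint64_t rk) { return rk == 0 ? 0 : rj; }
uint64_t masknez(uint64_t rj, uint64_t rk) { return rk != 0 ? 0 : rj; }
uint64_t slt(int64_t a, int64_t b) { return a < b ? 1 : 0; }

// Four-instruction sequence emitted before the patch.
uint64_t selBefore(int64_t a, int64_t b) {
  uint64_t c = slt(a, b);
  uint64_t f = masknez(0, c); // $r0 is the hard-wired zero register
  uint64_t t = maskeqz(static_cast<uint64_t>(a), c);
  return t | f;
}

// Two-instruction sequence emitted after the patch.
uint64_t selAfter(int64_t a, int64_t b) {
  uint64_t c = slt(a, b);
  return maskeqz(static_cast<uint64_t>(a), c);
}
```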
Reviewed By: SixWeining
Differential Revision: https://reviews.llvm.org/D153193
Weining Lu [Mon, 26 Jun 2023 02:30:42 +0000 (10:30 +0800)]
Revert "[LoongArch] Optimize conditional selection of integer"
This reverts commit 3dd319ecf3be64598ea84d1730033854cade7123.
Sorry, I forgot to amend the author name and email when merging this
patch.
Aiden Grossman [Sat, 20 May 2023 09:23:27 +0000 (09:23 +0000)]
[llvm-exegesis] Introduce Subprocess Executor Mode
This patch introduces the subprocess executor mode. Currently, this new
mode doesn't do anything fancy, just executing the same code that the
inprocess executor would do, but within a subprocess. This sets up the
ability to add in many more memory-related features in the future.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D151021
Younan Zhang [Sun, 25 Jun 2023 16:33:16 +0000 (00:33 +0800)]
[clang] Fix a crash on invalid destructor
This is a follow-up patch to D126194 in order to
fix https://github.com/llvm/llvm-project/issues/63503.
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D153724
Jie Fu [Mon, 26 Jun 2023 00:59:37 +0000 (08:59 +0800)]
[SimplifyCFG] Remove unused variable 'Inc' (NFC)
/data/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:6051:10: error: unused variable 'Inc' [-Werror,-Wunused-variable]
bool Inc, Wrapped = false;
^
1 error generated.
Aiden Grossman [Sun, 25 Jun 2023 23:46:22 +0000 (23:46 +0000)]
[llvm-exegesis] Add ability to assign perf counters to specific PID
This patch gives the ability to assign performance counters within
llvm-exegesis to a specific process by passing its PID. This is needed
later on for implementing a subprocess executor. Defaults to zero, the
current process, for the InProcessFunctionExecutorImpl.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D151020
khei4 [Mon, 19 Jun 2023 03:57:29 +0000 (12:57 +0900)]
[SimplifyCFG] add nsw on BuildLookuptable LinearMap calculation
Differential Revision: https://reviews.llvm.org/D150943
khei4 [Mon, 19 Jun 2023 03:57:07 +0000 (12:57 +0900)]
[SimplifyCFG] precommit test for LinearMap nsw (NFC)
Differential Revision: https://reviews.llvm.org/D153238
Matt Arsenault [Sun, 25 Jun 2023 15:23:15 +0000 (11:23 -0400)]
RegAllocGreedy: Fix assert with remarks on unassigned subregisters
This tried to query the physical subregister on virtual registers
if they were left unassigned.
Matt Arsenault [Sun, 25 Jun 2023 21:16:10 +0000 (17:16 -0400)]
AMDGPU: Handle the easy parts of strict fptrunc
f64->f16 is hard. The expansion is all integer but we need
to raise exceptions. Also doesn't handle the illegal f16 targets.
Matt Arsenault [Sun, 25 Jun 2023 20:39:15 +0000 (16:39 -0400)]
AMDGPU: Handle constrained fpext
Amaury Séchet [Sun, 25 Jun 2023 22:56:42 +0000 (22:56 +0000)]
[NFC] Autogenerate CodeGen/AMDGPU/combine-reg-or-const.ll
Amaury Séchet [Sun, 25 Jun 2023 21:19:27 +0000 (21:19 +0000)]
[NFC] Autogenerate CodeGen/PowerPC/tail-dup-break-cfg.ll