Antonio Frighetto [Thu, 29 Jun 2023 18:32:55 +0000 (20:32 +0200)]
[ConstraintElimination] Handle solving-only `ICMP_NE` predicates
Simplification of non-equality predicates for solving constraint
systems is now supported by checking the validity of related
inequalities and equalities.
Differential Revision: https://reviews.llvm.org/D152684
Sam McCall [Thu, 29 Jun 2023 19:20:53 +0000 (21:20 +0200)]
[dataflow] fix compile on gcc7
Reported on https://reviews.llvm.org/D153674
This returned expression is move-eligible, this is a bug in old GCC.
Jennifer Yu [Wed, 21 Jun 2023 23:26:35 +0000 (16:26 -0700)]
[OMP5.2] Initial support for doacross clause.
Yaxun (Sam) Liu [Thu, 29 Jun 2023 13:00:39 +0000 (09:00 -0400)]
[HIP] Fix version detection for old HIP-PATH
ROCm used to install components under individual directories,
e.g. HIP installed to /opt/rocm/hip and rocblas installed to
/opt/rocm/rocblas. ROCm has transitioned to a flat directory
structure where all components are installed to /opt/rocm.
HIP-PATH and --hip-path are supposed to be /opt/rocm, as
clang detects the HIP version from /opt/rocm/share/hip/version.
However, some existing HIP apps still use HIP-PATH=/opt/rocm/hip.
To avoid regression, clang will also try to detect share/hip/version
under the parent directory of HIP-PATH or --hip-path.
This way, the detection will work for both new HIP-PATH and
old HIP-PATH.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D154077
Fixes: SWDEV-407757
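The lookup order described above can be sketched roughly as follows. This is only an illustration, not clang's driver code: `hipVersionCandidates` is a hypothetical helper that lists the two places a `share/hip/version` file would be searched for.

```cpp
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not clang's actual code): list the locations where a
// share/hip/version file would be looked up for a given HIP path.
std::vector<fs::path> hipVersionCandidates(const fs::path &HipPath) {
  std::vector<fs::path> Candidates;
  // New flat layout: HIP-PATH=/opt/rocm -> /opt/rocm/share/hip/version.
  Candidates.push_back(HipPath / "share" / "hip" / "version");
  // Old layout: HIP-PATH=/opt/rocm/hip -> also try the parent directory,
  // i.e. /opt/rocm/share/hip/version.
  Candidates.push_back(HipPath.parent_path() / "share" / "hip" / "version");
  return Candidates;
}
```

With the old-style path /opt/rocm/hip, the second candidate is the file the flat layout provides, so detection would succeed for both settings.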
Alexey Bataev [Thu, 29 Jun 2023 18:44:51 +0000 (11:44 -0700)]
[SLP][NFC]Add a test for buildvector with reused scalars and
extractelements.
John Harrison [Thu, 29 Jun 2023 18:49:17 +0000 (14:49 -0400)]
[lldb-vscode] Prior to running the launchCommands during a launch request set the launch info so the configured launch information is accessible by the launch commands.
Reviewed By: wallace
Differential Revision: https://reviews.llvm.org/D154028
John Harrison [Thu, 29 Jun 2023 16:56:50 +0000 (12:56 -0400)]
Creating a startDebugging reverse DAP request handler in lldb-vscode.
Adds support for a reverse DAP request to startDebugging. The new request can be used to launch child processes from lldb scripts. For example, it would be straightforward to configure a debug configuration for a server and a client, allowing you to launch both processes with a single debug configuration.
Reviewed By: wallace, ivanhernandez13
Differential Revision: https://reviews.llvm.org/D153447
Aart Bik [Wed, 28 Jun 2023 19:54:12 +0000 (12:54 -0700)]
[mlir][sparse] Start migration to new surface syntax for STEA
We are in the process of migrating to a much improved surface syntax for the Sparse Tensor Encoding Attribute (STEA).
You can see a preview of this in the StableHLO RFC at
https://github.com/openxla/stablehlo/blob/main/rfcs/20230210-sparsity.md
This design is courtesy of Wren Romano.
This initial revision
(1) Introduces the first version of a new parser written by Wren Romano
(2) Introduces a simple "migration plan" using NEW_SYNTAX on the STEA, which will allow us to test the new parser with new examples, as well as migrate existing examples over without the need to rewrite them all
This first "drop" merely provides the entry points to parse the new syntax. The parser is still under active development. For example, we need to address the "lookahead" issue when parsing the lvl spec (viz. do we see l0 = d0 or a direct d0). Another larger task is to actually implement "affine" parsing (since the MLIR affine parser is not accessible in other parts of the tree).
EXAMPLE:
Currently, CSR looks like
#CSR = #sparse_tensor.encoding<{
lvlTypes = ["dense","compressed"],
dimToLvl = affine_map<(i,j) -> (i,j)>
}>
but you can "force" the new parser with
#CSR = #sparse_tensor.encoding<{
NEW_SYNTAX =
(d0, d1) -> (l0 = d0 : dense, l1 = d1 : compressed)
}>
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D153997
Arthur Eubanks [Thu, 29 Jun 2023 18:27:15 +0000 (11:27 -0700)]
[gn build] Fix tablegen dependencies
The source_set needs to depend on Support so llvm-config files are generated first.
Joseph Huber [Thu, 29 Jun 2023 17:03:40 +0000 (12:03 -0500)]
[libc] Fix the implementation of exit on the GPU
The RPC calls all have delays associated with them. Currently the `exit`
function does an async send and immediately exits the GPU. This can have
the effect that the RPC server never sees the exit call and we continue.
This patch changes that to first sync with the server before continuing
to perform its exit. There is still a hazard here, where the kernel can
complete before the RPC call reads back its response, but this is simply
a multi-threaded hazard. This change ensures that the server *will*
always exit some time after the GPU exits.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D154112
Aiden Grossman [Thu, 29 Jun 2023 18:16:41 +0000 (18:16 +0000)]
[llvm-exegesis] Change map address in memory annotation tests
Test failures have been reported by some LLVM developers in regard to
the low value of the location where the memory is being mapped into
the virtual address space, as it causes problems with some default
configurations of vm.mmap_min_addr. This patch sets it to 2^20 (1048576)
to alleviate this issue, as most distros seem to use a default value of
65536.
wlei [Thu, 29 Jun 2023 00:21:40 +0000 (17:21 -0700)]
[CSSPGO] Enable stale profile matching by default for CSSPGO
We tested stale profile matching on several of Meta's internal services, and all results are positive. For instance, in one service that refreshes its profile every one or two weeks, it consistently gave a 1~2% performance improvement. We also observed an instance where a trivial refactoring caused a 2% regression, and the matching successfully recovered the whole regression. Therefore, we'd like to turn it on by default for CSSPGO.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D154027
Slava Zakharin [Thu, 29 Jun 2023 17:39:52 +0000 (10:39 -0700)]
[flang][hlfir] Set/propagate 'unordered' attribute for elementals.
This patch adds 'unordered' attribute handling to the HLFIR elementals'
builders and fixes the attribute handling in lowering and transformations.
Depends on D154031, D154032
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154035
Slava Zakharin [Thu, 29 Jun 2023 17:39:40 +0000 (10:39 -0700)]
[flang][hlfir] Do not inline ordered elementals.
This patch just disables inlining of ordered hlfir.elemental operations.
Proving the safeness of inlining is left for future development.
Depends on D154032
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154034
Noah Goldstein [Thu, 29 Jun 2023 07:37:23 +0000 (02:37 -0500)]
[InstCombine] Canonicalize `(icmp eq/ne (and x, C), x)` -> `(icmp eq/ne (and x, ~C), 0)`
This increases the likelihood `x` is single-use and is typically
easier to analyze.
Proofs: https://alive2.llvm.org/ce/z/8ZpS2W
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D154004
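As a sanity check of the equivalence behind this canonicalization (separate from the Alive2 proof linked above), a brute-force sketch over all 8-bit values might look like this. The function name is made up for illustration.

```cpp
#include <cstdint>

// Exhaustively check, for 8-bit values, that
//   (x & C) == x   is equivalent to   (x & ~C) == 0.
// Returns true if the two forms agree for every x and C.
bool canonicalFormAgrees() {
  for (unsigned CVal = 0; CVal < 256; ++CVal) {
    for (unsigned XVal = 0; XVal < 256; ++XVal) {
      std::uint8_t X = static_cast<std::uint8_t>(XVal);
      std::uint8_t C = static_cast<std::uint8_t>(CVal);
      // Original form: masking with C leaves x unchanged.
      bool Original = static_cast<std::uint8_t>(X & C) == X;
      // Canonical form: x has no bits set outside of C.
      bool Canonical = static_cast<std::uint8_t>(X & ~C) == 0;
      if (Original != Canonical)
        return false;
    }
  }
  return true;
}
```

The `ne` case follows by negating both sides of the equivalence.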
Mark de Wever [Thu, 29 Jun 2023 17:56:28 +0000 (19:56 +0200)]
[NFC][libc++] Use a better type_trait to show the intention.
Igor Kirillov [Mon, 12 Jun 2023 10:18:16 +0000 (10:18 +0000)]
[LV] Add mask support for vectorizing interleaved groups
This patch extends LoopVectorize to handle the vectorization of interleaved
memory accesses with scalable vectors when a mask is required and/or predicated
tail folding is enabled.
Differential Revision: https://reviews.llvm.org/D152258
Luke Lau [Thu, 29 Jun 2023 16:04:41 +0000 (16:04 +0000)]
[SLP] Explicitly pass AccessTy to getGEPCost
Building on D149889, this patch updates SLP to pass the vector type as
the AccessTy to getGEPCost.
This should have the effect of GEPs being costed more often instead
of being treated as foldable into the address mode and thus free, as
some architectures, notably RISC-V, do not have offset+reg addressing
modes for vector memory accesses.
Note that in SLP, GEPs are costed in two places: getPointersChainCost
and GetGEPCostDiff.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D153570
Luke Lau [Thu, 29 Jun 2023 12:26:55 +0000 (12:26 +0000)]
[RISCV][SLP] Add tests for unprofitable SLP vectorization due to GEP. NFC
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D149888
Slava Zakharin [Thu, 29 Jun 2023 16:41:44 +0000 (09:41 -0700)]
[flang][hlfir] Codegen for unordered elemental operations.
Depends on D154031, D154032
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154033
Slava Zakharin [Thu, 29 Jun 2023 16:41:43 +0000 (09:41 -0700)]
[flang][hlfir] Parse unordered attribute for elemental operations.
By default, `hlfir.elemental` and `hlfir.elemental_addr` must process
the elements in order. The `unordered` attribute may be set,
if it is safe to process the elements out of order.
This patch just adds parsing support for the new attribute.
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154032
Slava Zakharin [Thu, 29 Jun 2023 16:41:36 +0000 (09:41 -0700)]
[flang][hlfir] Lower ordered elemental subroutine calls.
This patch sets `unordered` `fir.do_loop` attribute during lowering
of elemental subroutine calls to HLFIR, when it is safe to do so.
Proper handling of `hlfir.elemental` will be done in a separate patch.
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154031
Sergei Barannikov [Sat, 24 Jun 2023 10:34:12 +0000 (13:34 +0300)]
[clang][CodeGen] Remove no-op EmitCastToVoidPtr (NFC)
Reviewed By: JOE1994
Differential Revision: https://reviews.llvm.org/D153694
Craig Topper [Thu, 29 Jun 2023 17:23:39 +0000 (10:23 -0700)]
[RISCV] Add a helper class for creating GPR register classes.
Reduces the amount of repeated template parameters for every class.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D154042
Joseph Huber [Thu, 29 Jun 2023 16:32:50 +0000 (11:32 -0500)]
[OpenMP] Adjust using the NVPTX architecture detection tool
A previous patch by @arsenm adjusted these to find the `amdgpu-arch`
tool correctly if we do a `LLVM_ENABLE_PROJECTS` build. This patch
applies the same to `nvptx-arch` tool to keep it consistent.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D154107
Scott Todd [Thu, 22 Jun 2023 19:27:41 +0000 (12:27 -0700)]
[mlir][docgen] Handle Windows line endings in doc generation.
The `printReindented` function searches for Unix style line endings (`\n`), but strings may have Windows style line endings (`\r\n`). Prior to this change, generated document sections could have extra indentation, which some markdown renderers interpret as code blocks rather than paragraphs.
Differential Revision: https://reviews.llvm.org/D153591
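To illustrate the kind of handling involved, here is a minimal sketch that splits text into lines while tolerating both `\n` and `\r\n`. It is not the MLIR implementation; `splitLines` is a hypothetical helper.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not the actual MLIR code): split text into lines on
// '\n' and drop a trailing '\r' so that "\r\n" input behaves like "\n".
std::vector<std::string> splitLines(std::string_view Text) {
  std::vector<std::string> Lines;
  while (!Text.empty()) {
    std::size_t End = Text.find('\n');
    std::string_view Line = Text.substr(0, End);
    // A Windows line ending leaves a '\r' at the end of the line; strip it.
    if (!Line.empty() && Line.back() == '\r')
      Line.remove_suffix(1);
    Lines.emplace_back(Line);
    if (End == std::string_view::npos)
      break;
    Text.remove_prefix(End + 1);
  }
  return Lines;
}
```

Without the `'\r'` stripping step, each line would carry a stray carriage return, which can confuse later indentation measurements.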
Emilia Kond [Thu, 29 Jun 2023 16:46:04 +0000 (19:46 +0300)]
[clang-format] Correctly annotate operator free function call
The annotator correctly annotates an overloaded operator call when
called as a member function, like `x.operator+(y)`, however, when called
as a free function, like `operator+(x, y)`, the annotator assumed it was
an overloaded operator function *declaration*, instead of a call.
This patch allows for a free function call to correctly be annotated as
a call, but only if the current line cannot be a declaration, usually
within the bodies of a function.
Fixes https://github.com/llvm/llvm-project/issues/49973
Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay, Nuullll
Differential Revision: https://reviews.llvm.org/D153798
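For reference, the two call forms mentioned above look like this in C++. The types `Vec` and `Num` are invented for the example.

```cpp
struct Vec {
  int X;
};

// Overloaded operator declared as a free function.
Vec operator+(Vec A, Vec B) { return Vec{A.X + B.X}; }

struct Num {
  int V;
  // Overloaded operator declared as a member function.
  Num operator+(Num O) const { return Num{V + O.V}; }
};

int callForms() {
  Vec A{1}, B{2};
  Num M{3}, N{4};
  Vec S = operator+(A, B); // free function call form
  Num T = M.operator+(N);  // member function call form
  return S.X + T.V;
}
```

Inside `callForms`, `operator+(A, B)` cannot be a declaration, which is the situation in which the annotator can now treat it as a call.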
Carlos Eduardo Seo [Thu, 29 Jun 2023 14:08:10 +0000 (11:08 -0300)]
Replace sprintf by snprintf
The macOS toolchain deprecated sprintf in favor of snprintf. This was blocking
the build on macOS. Replaced all instances of sprintf by snprintf.
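A typical replacement has the following shape. The buffer size and format string are illustrative only and are not taken from the patch.

```cpp
#include <cstdio>
#include <string>

// Format an integer into a fixed-size buffer. snprintf takes the buffer
// size and never writes past it, unlike sprintf.
std::string formatId(int Id) {
  char Buf[16];
  // Before: std::sprintf(Buf, "id-%d", Id);
  std::snprintf(Buf, sizeof(Buf), "id-%d", Id);
  return std::string(Buf);
}
```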
Valentin Clement [Thu, 29 Jun 2023 16:45:12 +0000 (09:45 -0700)]
[openacc] Allow async, wait and device_type on the data construct
From OpenACC 3.2 specification:
The async, wait, and device_type clauses may be specified on data
constructs.
This patch adds these clauses in the ACC.td file and adds some tests
for them in flang parsing.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D154013
Jean Perier [Thu, 29 Jun 2023 16:38:18 +0000 (18:38 +0200)]
[flang] Fix array substring emboxing code generation
The code generation of the fir.embox op creating descriptors for
array substring with a non constant length base was using the
substring length to compute the first dimension result stride.
Fix it to use the input length instead.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D154086
Fangrui Song [Thu, 29 Jun 2023 16:39:57 +0000 (09:39 -0700)]
[RISCV] Make linker-relaxable instructions terminate MCDataFragment
`MCExpr::evaluateAsAbsolute` has a longstanding bug. When the MCAssembler is
non-null and the MCAsmLayout is null, it may incorrectly fold A-B even if A and
B are separated by a linker-relaxable instruction. This behavior can suppress
some ADD/SUB relocations and lead to wrong results if the linker performs
relaxation.
To fix the bug, ensure that linker-relaxable instructions only appear at the end
of an MCDataFragment, thereby making them terminate the fragment. When computing
A-B, suppress folding if A and B are separated by a linker-relaxable
instruction.
* `.subsection` now correctly gives errors for non-foldable expressions.
* gen-dwarf.s will pass even if we add back the .debug_line or .eh_frame/.debug_frame code from D150004
* This will fix suppressed relocation when we add R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128.
In the future, we should investigate the desired behavior for
`MCExpr::evaluateAsAbsolute` when both MCAssembler and MCAsmLayout are non-null.
(Note: MCRelaxableFragment is only for assembler-relaxation. If we ever need
linker-relaxable MCRelaxableFragment, we would need to adjust RISCVMCExpr.cpp
(D58943/D73211).)
Depends on D153096
Differential Revision: https://reviews.llvm.org/D153097
Arthur Eubanks [Mon, 26 Jun 2023 16:51:22 +0000 (09:51 -0700)]
[PassBuilder] Add textual representation for function simplification pipeline
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D153784
Jean Perier [Thu, 29 Jun 2023 16:36:05 +0000 (18:36 +0200)]
[flang] Support CHARACTER(4) pointer targets
fir.rebox is emitting an llvm.sdiv to compute the character length
given the byte size from the input descriptor.
Inside a fir.global, this is not needed given the target length must
be accessible via the type, and it caused MLIR to fail LLVM IR
code generation (and crash).
Use the input type length when available instead.
Reviewed By: PeteSteinfeld, vzakhari
Differential Revision: https://reviews.llvm.org/D154072
Arthur Eubanks [Thu, 29 Jun 2023 16:36:10 +0000 (09:36 -0700)]
[NFC] Update stale comment after D154001
Arthur Eubanks [Wed, 28 Jun 2023 21:08:38 +0000 (14:08 -0700)]
[ScalarEvolution] Analyze ranges for heap allocations
Followup to D153624. Allows for better exit count calculations for loops checking heap allocations against null.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D154001
Arthur Eubanks [Wed, 28 Jun 2023 21:04:57 +0000 (14:04 -0700)]
[test] Precommit SCEV test
Johannes Doerfert [Tue, 20 Jun 2023 04:37:21 +0000 (21:37 -0700)]
[Attributor][NFC] Introduce a flag to skip liveness checks
While we can disallow AAs, liveness checks are everywhere and if the
user doesn't want them it is costly to go through just to find out
everything is assumed live.
Johannes Doerfert [Wed, 28 Jun 2023 00:31:11 +0000 (17:31 -0700)]
[OpenMPOpt] Properly check AA pointers
The interface was changed to return pointers, so we need to check them
for null now (as they might actually be null in the future).
Johannes Doerfert [Wed, 28 Jun 2023 00:29:31 +0000 (17:29 -0700)]
[AAAMDAttributes] AAPointerInfo depends on AAUnderlyingObjects
Johannes Doerfert [Wed, 28 Jun 2023 00:20:21 +0000 (17:20 -0700)]
[AMDGPUAttributor][NFC] Make the debug output meaningful
Zhiheng Xie [Thu, 29 Jun 2023 11:37:49 +0000 (04:37 -0700)]
[OpenMP] Fix lvalue reference type generation in untied task loop
For variables with lvalue reference type in an untied task loop,
the actual type is currently wrongly set to ElementType. It should
be converted to a pointer type.
It fixes https://github.com/llvm/llvm-project/issues/62965
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D153321
Scott Linder [Tue, 27 Jun 2023 19:03:37 +0000 (19:03 +0000)]
[NFC][AMDGPU] Refactor AMDGPUDisassembler
Clean up ahead of a patch to fix bugs in the AMDGPUDisassembler.
Use split-file to simplify and extend existing kernel-descriptor
disassembly tests.
Add a comment to AMDHSAKernelDescriptor.h, as at least one small step
towards keeping all kernel-descriptor sensitive code in sync.
Reviewed By: MaskRay, kzhuravl, arsenm
Differential Revision: https://reviews.llvm.org/D130105
Valentin Clement [Thu, 29 Jun 2023 16:05:21 +0000 (09:05 -0700)]
[flang][openacc] Fix name resolution for fct name in acc routine
Name resolution was failing when the routine name is
a function/subroutine in the parent scope.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D154002
sstwcw [Thu, 29 Jun 2023 15:37:06 +0000 (15:37 +0000)]
[clang-format] Indent Verilog struct literal on new line
Before:
```
c = //
'{default: 0};
```
After:
```
c = //
    '{default: 0};
```
If the line has to be broken, the continuation part should be
indented. Before this fix, it was not the case if the continuation
part was a struct literal. The rule that caused the problem was added
in 783bac6b. It was intended for aligning the field labels in
ProtoBuf. The type `TT_DictLiteral` was only for colons back then, so
the program didn't have to check whether the token was a colon when it
was already type `TT_DictLiteral`. Now the type applies to more
things including the braces enclosing a dictionary literal. In
Verilog, struct literals start with a quote. The quote is regarded as
an identifier by the program. So the rule for aligning the fields in
ProtoBuf applied to this situation by mistake.
Reviewed By: HazardyKnusperkeks
Differential Revision: https://reviews.llvm.org/D152623
Arnold Schwaighofer [Thu, 29 Jun 2023 15:07:11 +0000 (08:07 -0700)]
Add a type_checked_load_relative to support relative function pointer tables
This adds a type_checked_load_relative intrinsic whose semantics are to
load a relative function pointer.
A relative function pointer is a pointer to a 32bit value that when
added to its address yields the address of the function.
Differential Revision: https://reviews.llvm.org/D143204
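The address computation for a relative pointer can be sketched with a toy model in which "addresses" are byte offsets into a buffer. The helper names are hypothetical and this does not model the intrinsic's type-checking part.

```cpp
#include <cstdint>
#include <cstring>

// Toy model: "addresses" are byte offsets into Image. A slot at SlotAddr
// holds a signed 32-bit value; the target is SlotAddr plus that value.
std::int64_t resolveRelative(const unsigned char *Image,
                             std::int64_t SlotAddr) {
  std::int32_t Rel;
  std::memcpy(&Rel, Image + SlotAddr, sizeof(Rel));
  return SlotAddr + Rel;
}

// Store the offset that makes the slot at SlotAddr refer to TargetAddr.
void storeRelative(unsigned char *Image, std::int64_t SlotAddr,
                   std::int64_t TargetAddr) {
  std::int32_t Rel = static_cast<std::int32_t>(TargetAddr - SlotAddr);
  std::memcpy(Image + SlotAddr, &Rel, sizeof(Rel));
}
```

Because the stored value is relative to the slot itself, a table of such entries needs only 32 bits per entry and stays valid if the whole image is moved.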
Peter Klausler [Fri, 23 Jun 2023 18:01:33 +0000 (11:01 -0700)]
[flang] Honor #line and related preprocessing directives
Extend the SourceFile class to take account of #line directives
when computing source file positions for error messages.
Adjust the output of #line directives to -E output so that they
reflect any #line directives that were in the input.
Differential Revision: https://reviews.llvm.org/D153910
Jeffrey Byrnes [Fri, 23 Jun 2023 21:42:30 +0000 (14:42 -0700)]
[HIP]: Add -fhip-emit-relocatable to override link job creation for -fno-gpu-rdc
Differential Revision: https://reviews.llvm.org/D153667
Change-Id: Idcc5c7c25dc350b8dc9a1865fd67982904d06ecd
Haojian Wu [Thu, 29 Jun 2023 11:57:36 +0000 (13:57 +0200)]
[clangd] Don't show header for namespace decl in Hover
The header for namespace symbol is barely useful.
Differential Revision: https://reviews.llvm.org/D154068
Arnold Schwaighofer [Wed, 28 Jun 2023 20:23:04 +0000 (13:23 -0700)]
WholeProgramDevirt: Fix call target propagation for ptrauth architectures
We can't have a call with a constant target with a ptrauth bundle. Remove the
ptrauth bundle operand in such a case.
rdar://105696396
Differential Revision: https://reviews.llvm.org/D144581
Zahira Ammarguellat [Thu, 29 Jun 2023 14:16:14 +0000 (10:16 -0400)]
Fix test regression on 32-bit x86.
Differential Revision: https://reviews.llvm.org/D153770
Matthias Springer [Thu, 29 Jun 2023 14:32:11 +0000 (16:32 +0200)]
[mlir][Transforms][NFC] CSE: Add non-pass entry point
Add an additional entry point so that CSE can be used without a pass. This allows CSE to be used from the Transform dialect without invalidating all handles.
* All IR modifications are done with a rewriter.
* The C++ entry point takes a `RewriterBase &`, which may have a listener attached to it. This allows users to track all IR modifications.
Differential Revision: https://reviews.llvm.org/D145226
Philip Reames [Thu, 29 Jun 2023 14:24:54 +0000 (07:24 -0700)]
[RISCV] Remove legacy TA/TU pseudo distinction for unary instructions
This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295. In D153155, we started removing the legacy distinction between unsuffixed (TA) and _TU pseudos. This patch continues that effort for the unary instruction families.
The change consists of a few interacting pieces:
* Adding a vector policy operand to VPseudoUnaryNoMaskTU.
* Then using VPseudoUnaryNoMaskTU for all cases where VPseudoUnaryNoMask was previously used and deleting the unsuffixed form.
* Then renaming VPseudoUnaryNoMaskTU to VPseudoUnaryNoMask, and adjusting the RISCVMaskedPseudo table to use the combined pseudo.
* Fixing up two places in C++ code which manually construct VMV_V_* instructions.
Normally, I'd try to factor this into a couple of changes, but in this case, the table structure is tied to naming and thus we can't really separate the otherwise NFC bits.
As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.
Differential Revision: https://reviews.llvm.org/D153899
Simon Pilgrim [Thu, 29 Jun 2023 14:27:28 +0000 (15:27 +0100)]
[X86] LowerTRUNCATE - attempt to use PACKSS/PACKUS on AVX512 targets if the truncation source is concatenating from smaller subvectors
Don't just use AVX512 truncation ops if PACKSS/PACKUS can do this more cheaply
Simon Pilgrim [Thu, 29 Jun 2023 13:58:56 +0000 (14:58 +0100)]
[X86] Add isFreeToSplitVector helper to detect nodes that we can freely split/extract subvectors from.
Helper wrapper around the existing collectConcatOps method.
LLVM GN Syncbot [Thu, 29 Jun 2023 14:13:28 +0000 (14:13 +0000)]
[gn build] Port cfa096d9c92e
Christian Trott [Thu, 29 Jun 2023 14:06:47 +0000 (10:06 -0400)]
[libc++][mdspan] Implement layout_right
This commit implements layout_right in support of C++23 mdspan
(https://wg21.link/p0009). layout_right is a layout mapping policy
whose index mapping corresponds to the memory layout of multidimensional
C-arrays, and is thus also referred to as the C-layout.
Co-authored-by: Damien L-G <dalg24@gmail.com>
Differential Revision: https://reviews.llvm.org/D151267
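The index mapping that layout_right describes is the usual row-major one. A standalone sketch for rank 3, independent of the libc++ implementation and using a made-up function name, could be:

```cpp
#include <array>
#include <cstddef>

// Row-major (C-layout) offset for a rank-3 index: the rightmost index is
// contiguous, matching how a C array T a[E0][E1][E2] is laid out.
std::size_t layoutRightOffset(const std::array<std::size_t, 3> &Extents,
                              const std::array<std::size_t, 3> &Idx) {
  std::size_t Offset = 0;
  // Horner-style accumulation: ((i0 * E1) + i1) * E2 + i2.
  for (std::size_t R = 0; R < 3; ++R)
    Offset = Offset * Extents[R] + Idx[R];
  return Offset;
}
```

For extents (2, 3, 4) and index (1, 2, 3) this gives (1 * 3 + 2) * 4 + 3 = 23, the same position as `a[1][2][3]` in `int a[2][3][4]`.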
Takuya Shimizu [Thu, 29 Jun 2023 14:02:09 +0000 (23:02 +0900)]
[clang][Sema] Remove dead diagnostic for loss of __unaligned qualifier
D120936 has made the loss of `__unaligned` qualifier NOT a bad-conversion.
Because of this, the bad-conversion note about the loss of this qualifier does not take effect.
e.g.
```
void foo(int *ptr);
void func(const __unaligned int *var) { foo(var); }
```
BEFORE this patch:
```
source.cpp:3:41: error: no matching function for call to 'foo'
3 | void func(const __unaligned int *var) { foo(var); }
| ^~~
source.cpp:1:6: note: candidate function not viable: 1st argument ('const __unaligned int *') would lose __unaligned qualifier
1 | void foo(int *ptr);
| ^
2 |
3 | void func(const __unaligned int *var) { foo(var); }
| ~~~
```
AFTER this patch:
```
source.cpp:3:41: error: no matching function for call to 'foo'
3 | void func(const __unaligned int *var) { foo(var); }
| ^~~
source.cpp:1:6: note: candidate function not viable: 1st argument ('const __unaligned int *') would lose const qualifier
1 | void foo(int *ptr);
| ^
2 |
3 | void func(const __unaligned int *var) { foo(var); }
| ~~~
```
Please note the different mentions of `__unaligned` and `const` in notes.
Reviewed By: cjdb, rnk
Differential Revision: https://reviews.llvm.org/D153690
Mike Crowe [Wed, 28 Jun 2023 15:36:42 +0000 (15:36 +0000)]
[clang-tidy] Fix modernize-use-std-print check when return value used
The initial implementation of the modernize-use-std-print check was
capable of converting calls to printf (etc.) which used the return value
to calls to std::print which has no return value, thus breaking the
code.
Use code inspired by the implementation of bugprone-unused-return-value
check to ignore cases where the return value is used. Add appropriate
lit test cases and documentation.
Reviewed By: PiotrZSL
Differential Revision: https://reviews.llvm.org/D153860
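The distinction the check now has to make can be shown with two small functions (names invented for the example):

```cpp
#include <cstdio>

// The return value of printf (the number of characters written) is used
// here, so a mechanical rewrite to std::print, which returns void, would
// not compile. Calls like this one should be left alone.
int printAndCount(int Value) {
  int Written = std::printf("value=%d\n", Value);
  return Written;
}

// The return value is discarded, so this call remains a candidate for
// conversion to std::print("value={}\n", Value).
void printOnly(int Value) {
  std::printf("value=%d\n", Value);
}
```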
Alexey Lapshin [Tue, 27 Jun 2023 19:27:34 +0000 (21:27 +0200)]
[DWARFv5][DWARFLinker] Remove dsymutil-classic compatibility feature as it leads to an error.
DWARFLinker has a compatibility feature with dsymutil-classic.
It may keep a location expression attribute even if it does not
reference a live address. Currently, llvm-dwarfdump --verify
reports an error if a variable references an address but is not
added into the .debug_names table.
error: Name Index @ 0x0: Entry for DIE @ 0xf35 (DW_TAG_variable) with name seed missing.
DW_TAG_variable
DW_AT_name ("seed")
DW_AT_type (0x00000000000047b7 "uint64_t")
DW_AT_location (DW_OP_addr 0x9ff8) <<<< dead address
DWARFLinker does not add the variable into the .debug_names table
because it references a dead address. To have a valid variable and a
consistent accelerator table, it is necessary to remove the location expression
referencing the dead address. This patch removes the dsymutil-classic
compatibility feature.
Differential Revision: https://reviews.llvm.org/D153988
Nikita Popov [Thu, 29 Jun 2023 13:31:18 +0000 (15:31 +0200)]
[X86] Add tests for PR63475 (NFC)
pvanhout [Mon, 26 Jun 2023 15:59:55 +0000 (17:59 +0200)]
[MCP] Optimize copies from undef
Revert D152502 and instead optimize away copies from undef, but clear the undef flag on the original copy.
Apparently, not optimizing the COPY can cause performance issues in some cases.
Fixes SWDEV-405813, SWDEV-405899
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D153838
Sean Perry [Thu, 29 Jun 2023 12:54:20 +0000 (08:54 -0400)]
[SystemZ][z/OS] Add support for z/OS link step (executable and shared libs)
Add support for performing a link step on z/OS. This will support C & C++ building executables and shared libs.
Reviewed By: zibi, abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D153580
Luke Lau [Thu, 29 Jun 2023 12:54:57 +0000 (13:54 +0100)]
[RISCV] Add tests for cost modelling constants in phis
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D149168
pvanhout [Tue, 27 Jun 2023 14:50:01 +0000 (16:50 +0200)]
[AMDGPU] Handle Additional Cases in tryFoldPhiAGPR
Sometimes PHIs have different incoming values, such as:
```
%1:vgpr_256 = COPY %0:agpr_256
%2:vgpr_32 = COPY %1:vgpr_256.sub0
```
Those weren't handled, which could lead to massive performance issues if break-large-PHIs kicked in + AGPRs were used (MFMA)
Fixes SWDEV-407986
Reviewed By: #amdgpu, arsenm
Differential Revision: https://reviews.llvm.org/D153879
Luke Lau [Thu, 4 May 2023 18:42:05 +0000 (19:42 +0100)]
[TTI] Use users of GEP to guess access type in getGEPCost
Currently getGEPCost uses the target type of the GEP as a heuristic for
the type that will be accessed, to pass onto isLegalAddressingMode.
Targets use this to work out if a GEP can then be folded into the
load/store instruction that uses the GEP.
For example, on RISC-V loads and stores can have an offset added to a
base register folded into a single instruction, so the following GEP is
free:
%p = getelementptr i32, ptr %base, i32 42 ; getInstructionCost = 0
%x = load i32, ptr %p ; getInstructionCost = 1
------------------------------------------------------------------------
lw t0, a0(42)
However vector loads and stores cannot have an offset folded into them,
so the following GEP is costed:
%p = getelementptr <2 x i32>, ptr %base, i32 42 ; getInstructionCost = 1
%x = load <2 x i32>, ptr %p ; getInstructionCost = 1
------------------------------------------------------------------------
addi a0, 42
vle32 v8, (a0)
The issue arises whenever there is a mismatch between the target type of
the GEP and the type that is actually accessed:
%p = getelementptr i32, ptr %base, i32 42 ; getInstructionCost = 0
%x = load <2 x i32>, ptr %p ; getInstructionCost = 1
------------------------------------------------------------------------
addi a0, 42
vle32 v8, (a0)
Even though this GEP will result in an add instruction, because TTI
thinks it's loading an i32, it will think it can be folded and not
charge for it.
The target type can become mismatched with the memory access during
transformations, notably during SLP where a scalar base pointer will
be reused to perform a vector load or store.
This patch adds an optional AccessType argument to getGEPCost which
allows the type of memory accessed by users to be passed in as a hint,
so that we can more accurately determine if the GEP can be folded into
its users.
If AccessType is not provided, getGEPCost falls back to the old
behaviour of using the PointeeType to guess the memory access type. This
can be revisited in a later patch.
Also for now, only GEPs with exactly one user use the access type hint.
Whilst we could look through all users and use all access types to
determine if we can fold the GEP, this patch avoids doing so to prevent
O(N) behaviour.
Differential Revision: https://reviews.llvm.org/D149889
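The fallback behaviour described above can be modelled with a deliberately simplified sketch. None of these names are LLVM's: `Ty`, `canFoldOffset` and `gepCost` are toy stand-ins, and the "target" is assumed to fold offsets only into scalar accesses, as in the RISC-V example.

```cpp
#include <optional>

// Toy type lattice: just enough to distinguish the two cases above.
enum class Ty { Scalar, Vector };

// Toy stand-in for a target hook: on a RISC-V-like target only scalar
// accesses can fold a base+offset address into the load/store.
bool canFoldOffset(Ty Access) { return Access == Ty::Scalar; }

// Toy GEP cost: 0 if the offset folds into the access, 1 otherwise.
// When no access type hint is given, fall back to the GEP's own element
// type as the guess for what will be accessed.
int gepCost(Ty PointeeTy, std::optional<Ty> AccessTy = std::nullopt) {
  Ty Guess = AccessTy.value_or(PointeeTy);
  return canFoldOffset(Guess) ? 0 : 1;
}
```

In this model, `gepCost(Ty::Scalar)` is 0 while `gepCost(Ty::Scalar, Ty::Vector)` is 1, mirroring the mismatched case where a scalar GEP feeds a vector load.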
Luke Lau [Thu, 4 May 2023 18:36:48 +0000 (19:36 +0100)]
[RISCV][SLP] Add tests for GEP costs
This patch updates the tests in gep.ll to have explicit memory
accesses using them, to illustrate the new behaviour in D149889.
New tests have also been added for mismatched pointer types and memory
access types, and gep-zero-indices.ll has also been added to make sure
that we always cost GEPs with all zero indices as free.
Ivan Butygin [Fri, 23 Jun 2023 17:38:32 +0000 (19:38 +0200)]
[mlir][memref] Add some missing interfaces to memref ops.
Add `ViewLikeOpInterface` to `ExtractStridedMetadataOp` as it returns its buffer as one of the results.
Add mem Read/Write attributes to atomic ops.
Differential Revision: https://reviews.llvm.org/D153647
David Green [Thu, 29 Jun 2023 12:29:34 +0000 (13:29 +0100)]
[AArch64] Add and cmp cost model tests. NFC
See D153611. Tests for the cost of icmp(and, 0) are added, in addition to
expanding the extractelements-to-shuffle.ll test, which has always been a bit
simple, to include a more complete example with both a vector and scalar
version. The icmp(and, 0) costs are targeted at improving the second when the
cost of vector inserts and extracts is lowered.
eopXD [Thu, 29 Jun 2023 08:39:25 +0000 (01:39 -0700)]
[Clang][RISCV] Fix RISC-V vector / SiFive intrinsic inclusion in SemaLookup
The existing code assumes that both `DeclareRISCVVBuiltins` and
`DeclareRISCVSiFiveVectorBuiltins` are set when coming into the if-statement
under SemaLookup.cpp.
This is not the case and causes issue #63571.
This patch resolves the issue.
Reviewed By: 4vtomat, kito-cheng
Differential Revision: https://reviews.llvm.org/D154050
Guillaume Chatelet [Thu, 29 Jun 2023 12:21:26 +0000 (12:21 +0000)]
[libc][NFC] Use SIZE_MAX instead of size_t(-1)
Paul Walker [Thu, 22 Jun 2023 14:03:28 +0000 (14:03 +0000)]
[Clang] Allow C++11 style initialisation of SVE types.
Fixes https://github.com/llvm/llvm-project/issues/63223
Differential Revision: https://reviews.llvm.org/D153560
Ben Shi [Fri, 23 Jun 2023 07:15:57 +0000 (15:15 +0800)]
[CSKY][test][NFC] Add tests of ANDI/ORI
These tests will be optimized with BSETI32/BCLRI32
in the future.
Reviewed By: zixuan-wu
Differential Revision: https://reviews.llvm.org/D153613
Ben Shi [Wed, 21 Jun 2023 07:43:06 +0000 (15:43 +0800)]
[CSKY][NFC] Simplify code with multiclass
Reviewed By: zixuan-wu
Differential Revision: https://reviews.llvm.org/D153402
Alex Brachet [Thu, 29 Jun 2023 11:22:00 +0000 (11:22 +0000)]
[ELF][NFC] Change comment terminology
Differential Revision: https://reviews.llvm.org/D153978
Quentin Colombet [Thu, 29 Jun 2023 10:25:15 +0000 (12:25 +0200)]
[mlir][Linalg] Add a softmax op
This patch adds a softmax op.
For now, nothing interesting happens; we can only do a round trip.
Later patches will add the tiling interface and the lowering of this op to
a sequence of simpler ops.
This graduates the linalg_ext.softmax op from IREE to LLVM.
Original implementation from Harsh Menon <harsh@nod-labs.com>
Nicolas Vasilache <nicolas.vasilache@gmail.com> co-authored this patch.
Differential Revision: https://reviews.llvm.org/D153422
Joel Wee [Thu, 29 Jun 2023 10:46:54 +0000 (12:46 +0200)]
[mlir][GreedyPatternRewriter] Add out param to detect changes in IR in `applyPatternsAndFoldGreedily`
This allows users of `applyPatternsAndFoldGreedily` to detect if any MLIR changes have occurred. An example use-case is where we expect the `applyPatternsAndFoldGreedily` to change the IR and want to validate that it indeed does change it.
Differential Revision: https://reviews.llvm.org/D153986
Florian Hahn [Thu, 29 Jun 2023 10:18:43 +0000 (11:18 +0100)]
[ConstraintElim] Add ptr phi tests with upper bounds with const offsets.
Extra tests for D152730.
Ivan Kosarev [Thu, 29 Jun 2023 09:51:37 +0000 (10:51 +0100)]
[AMDGPU][AsmParser][NFC] Simplify instruction operand definitions.
This addresses the trivial cases that only require removing the
operand classes and renaming related entities.
Part of <https://github.com/llvm/llvm-project/issues/62629>.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D153965
Liren Peng [Thu, 29 Jun 2023 08:08:06 +0000 (16:08 +0800)]
Revert "[ScalarEvolution] Infer loop max trip count from array accesses"
This reverts commit
57e093162e27334730d8ed8f7b25b1b6f65ec8c8.
Jie Fu [Thu, 29 Jun 2023 08:58:12 +0000 (16:58 +0800)]
[RISCV] Remove unused variables in RISCVISelDAGToDAG.cpp (NFC)
/Users/jiefu/llvm-project/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp:97:33: error: unused variable 'FuncInfo' [-Werror,-Wunused-variable]
RISCVMachineFunctionInfo *FuncInfo =
^
/Users/jiefu/llvm-project/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp:106:29: error: unused variable 'TLI' [-Werror,-Wunused-variable]
const TargetLowering &TLI = CurDAG->getTargetLoweringInfo();
^
2 errors generated
Yunze Zhu [Thu, 29 Jun 2023 06:38:45 +0000 (14:38 +0800)]
[RISCV] Use temporary stack in expanding SPLAT_VECTOR_SPLIT_I64_VL node
This fixes the issue reported at https://github.com/llvm/llvm-project/issues/63515
When expanding a SPLAT_VECTOR_SPLIT_I64_VL node, only the memoperand is used to create the dependency.
However, ScheduleDAGNodes checks dependencies through the chain only, which breaks the order of the store/load instructions.
In the llvm.bitreverse.nxv2i64 intrinsic, the SPLAT_VECTOR_SPLIT_I64_VL nodes appear to be processed in parallel,
so no chain should be added to these nodes.
Using a temporary stack slot when expanding the SPLAT_VECTOR_SPLIT_I64_VL node lets the vlse instruction read the correct value
even if the order of the store instructions changes.
Differential Revision: https://reviews.llvm.org/D153743
Florian Hahn [Thu, 29 Jun 2023 08:35:35 +0000 (09:35 +0100)]
[ConstraintElim] Add pointer induction tests with struct types.
Extra tests for D152730.
David Spickett [Wed, 28 Jun 2023 08:01:14 +0000 (08:01 +0000)]
Reland "[LLDB] Fix the use of "platform process launch" with no extra arguments"
This reverts commit
3254623d73fb7252385817d8057640c9d5d5ffd1.
One test has been updated to add the "-s" flag which along with
86fd957af981f146a306831608d7ad2de65b9560 should fix the tests on MacOS.
An assert on the hijack listener added in that patch was removed; it seems
to be correct on MacOS but not on Linux.
Arthur Eubanks [Thu, 29 Jun 2023 08:18:07 +0000 (10:18 +0200)]
[PhaseOrdering] Add test with gep null compare in loop (NFC)
Test from D153392 in both the alloca and malloc variants.
Florian Hahn [Thu, 29 Jun 2023 08:17:37 +0000 (09:17 +0100)]
[ConstraintElim] Allow and check preconditions in doesHold.
Delegate checking of the constraint & its preconditions to the existing
::isValid. This reduces duplication and allows additional optimizations
together with D152730.
Christian Ulmann [Thu, 29 Jun 2023 07:53:18 +0000 (07:53 +0000)]
[mlir][llvm] Dominance violating debug intrinsic import
Debug intrinsics are allowed to violate SSA dominance and might thus
cause the LLVM import to produce invalid LLVM dialect. This commit
ensures that the debug intrinsics are emitted right after the definition
of their SSA operands.
As the position of debug intrinsics has no meaning, changing it has no
semantic implication.
Reviewed By: gysit
Differential Revision: https://reviews.llvm.org/D153984
Michael Platings [Thu, 29 Jun 2023 08:07:12 +0000 (09:07 +0100)]
[Clang][Driver] Change missing multilib error to warning
The error could be awkward to work around when experimenting with flags
that didn't have a matching multilib. It also broke many tests when
multilib.yaml was present in the build directory.
Reviewed By: simon_tatham, MaskRay
Differential Revision: https://reviews.llvm.org/D153885
Michael Platings [Wed, 28 Jun 2023 07:44:41 +0000 (08:44 +0100)]
[test] Replace aarch64-*-eabi with aarch64
Also replace aarch64_be-*-eabi with aarch64_be
Using "eabi" for aarch64 targets is a common mistake that the Clang driver warns about.
We want to avoid it elsewhere as well. Just use the common "aarch64" without
other triple components.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D153943
Nikita Popov [Thu, 29 Jun 2023 07:58:49 +0000 (09:58 +0200)]
[FunctionAttrs] Regenerate test checks (NFC)
Juan Manuel MARTINEZ CAAMAÑO [Thu, 29 Jun 2023 07:31:34 +0000 (09:31 +0200)]
[InlineCost][TargetTransformInfo][AMDGPU] Consider cost of alloca instructions in the caller (2/2)
Before this patch, the compiler gave a bump to the inline-threshold
when the total size of the allocas passed as arguments to the
callee was below 256 bytes.
This heuristic ignores that some of these allocas could have been removed
by SROA if inlining was applied.
Ideally, this bonus would be attributed to the threshold once the
size of all the allocas that could not be handled by SROA is known:
at the end of the InlineCost analysis.
However, we may never reach this point if the inline-cost analysis exits
early when the inline cost goes over the threshold mid-analysis.
This patch proposes:
* Attribute the bonus in the inline-threshold when allocas are passed
as arguments (regardless of their total size).
* Assign a cost to each alloca proportional to its size,
such that the cost of all the allocas cancels the bonus.
Potential problems:
* This patch assumes that removing alloca instructions with SROA is
always profitable. This may not be the case if the total size of the
allocas is still too big to be promoted to registers/LDS.
* Redundant calls to getTotalAllocaSize
* Awkwardly, the threshold attributed contributes to the single-bb and
vector bonus.
Reviewed By: scchan
Differential Revision: https://reviews.llvm.org/D149741
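The proportional cost assignment described in the commit above can be illustrated with a small standalone sketch. It is not the code from D149741: the names `allocaCost` and `totalAllocaCost`, the integer types, and the rounding behaviour are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: the cost of one alloca is its proportional share
// of the threshold bonus, based on its size relative to the total size.
uint64_t allocaCost(uint64_t allocaSize, uint64_t totalSize, uint64_t bonus) {
  assert(totalSize > 0 && allocaSize <= totalSize &&
         "sketch assumes a non-empty set of allocas");
  return bonus * allocaSize / totalSize;
}

// Hypothetical helper: sums the per-alloca costs. With integer division the
// sum can fall slightly short of the bonus, but never exceeds it.
uint64_t totalAllocaCost(const std::vector<uint64_t> &sizes, uint64_t bonus) {
  uint64_t totalSize = 0;
  for (uint64_t size : sizes)
    totalSize += size;
  uint64_t cost = 0;
  for (uint64_t size : sizes)
    cost += allocaCost(size, totalSize, bonus);
  return cost;
}
```

Under these assumptions, allocas of 64 and 192 bytes with a bonus of 1000 are charged 250 and 750, so the bonus is fully cancelled if neither alloca is removed by SROA.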
Juan Manuel MARTINEZ CAAMAÑO [Thu, 29 Jun 2023 07:11:54 +0000 (09:11 +0200)]
[InlineCost][TargetTransformInfo][AMDGPU] Consider cost of alloca instructions in the caller (1/2)
On AMDGPU, alloca instructions have a penalty that can
be avoided when SROA is applied after inlining.
This patch introduces the default implementation of
TargetTransformInfo::getCallerAllocaCost.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D149740
mgrzywac [Thu, 29 Jun 2023 07:41:08 +0000 (07:41 +0000)]
[libunwind] Add cached compile and link flags to libunwind
Add flags that allow the compile flags and libraries provided in the cache to be used with libunwind.
Similar flags are already present in libc++ and libc++abi CMakeLists files.
Differential Revision: https://reviews.llvm.org/D150252
Craig Topper [Thu, 29 Jun 2023 07:20:47 +0000 (00:20 -0700)]
[RISCV] Do a more complete job of disabling extending loads and truncating stores for fixed vector types.
We weren't marking some combinations as Expand if one of the
types wasn't legal.
Fixes #63596.
Hanbum Park [Thu, 29 Jun 2023 07:13:18 +0000 (09:13 +0200)]
[InstSimplify] Fold icmp of allocas based on offset difference
Strengthen the fold for icmps of non-overlapping storage, by
working on the difference of offsets, rather than considering
both offsets independently. In particular, this allows handling
comparisons of pointers to the end of equal-sized allocations.
Proofs: https://alive2.llvm.org/ce/z/Po2nL4
Differential Revision: https://reviews.llvm.org/D153752
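The reasoning behind working on the offset difference in the commit above can be sketched outside of LLVM. This is a minimal illustration, not the InstSimplify implementation: `provablyUnequal` is a hypothetical name, plain 64-bit integers stand in for arbitrary-precision offsets, and the sketch assumes non-empty allocations and offsets small enough that the subtraction does not overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true when the pointers A + offA and B + offB
// can be proven unequal, given that A and B are distinct, non-overlapping
// allocations of sizeA and sizeB bytes.
bool provablyUnequal(int64_t offA, uint64_t sizeA,
                     int64_t offB, uint64_t sizeB) {
  assert(sizeA > 0 && sizeB > 0 && "sketch assumes non-empty allocations");
  // If the two pointers were equal, then B == A + dist.
  int64_t dist = offA - offB;
  if (dist >= 0)
    // B would start inside [A, A + sizeA), contradicting non-overlap.
    return static_cast<uint64_t>(dist) < sizeA;
  // Otherwise A == B + |dist| would start inside [B, B + sizeB).
  return (uint64_t{0} - static_cast<uint64_t>(dist)) < sizeB;
}
```

With two 4-byte allocations, `provablyUnequal(4, 4, 4, 4)` holds because the difference is 0, which covers the end-pointer comparison mentioned in the commit message. `provablyUnequal(4, 4, 0, 4)` does not hold, since the end of one allocation may coincide with the start of the other.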
Martin Braenne [Thu, 29 Jun 2023 06:39:39 +0000 (06:39 +0000)]
[clang][dataflow] Don't crash when creating pointers to members.
The newly added tests crash without the other changes in this patch.
Reviewed By: sammccall, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D153960
Nikita Popov [Fri, 23 Jun 2023 10:50:27 +0000 (12:50 +0200)]
[SCEV] Make use of non-null pointers for range calculation
We know that certain pointers (e.g. non-extern-weak globals or
allocas in default address space) are not null, in which case the
lowest address they can be allocated at is their alignment.
This allows us to calculate better exit counts for loops that have
an additional null check in the guarding condition
(see alloca_icmp_null_exit_count).
Differential Revision: https://reviews.llvm.org/D153624
Tobias Gysi [Thu, 29 Jun 2023 06:31:03 +0000 (06:31 +0000)]
[mlir][llvm] Add debug label intrinsic
This revision adds support for the llvm.dbg.label.intrinsic
and the corresponding DILabel metadata.
Reviewed By: Dinistro
Differential Revision: https://reviews.llvm.org/D153975
Jianjian GUAN [Thu, 29 Jun 2023 03:15:14 +0000 (11:15 +0800)]
[RISCV] Update computeKnownBitsForTargetNode for FPCLASS.
The fclass instruction only sets one of the low 10 bits.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154040
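The known-bits consequence of the fclass commit above can be shown with a standalone sketch. It does not use LLVM's KnownBits API: both function names are hypothetical, and a 64-bit result register is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mask of bits known to be zero in an fclass result.
// Only the low 10 bits can be set, so bits [63:10] are known zero.
constexpr uint64_t fclassKnownZeroMask() { return ~uint64_t{0x3FF}; }

// Hypothetical helper: checks that a value has exactly one bit set and that
// this bit lies within the low 10 bits.
constexpr bool isPlausibleFClassResult(uint64_t value) {
  bool highBitsClear = (value & fclassKnownZeroMask()) == 0;
  bool exactlyOneBit = value != 0 && (value & (value - 1)) == 0;
  return highBitsClear && exactlyOneBit;
}
```

For example, `uint64_t{1} << 9` passes the check, while `uint64_t{1} << 10`, `3`, and `0` do not.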
Mikael Holmen [Thu, 29 Jun 2023 05:51:15 +0000 (07:51 +0200)]
[StructuralHash] Ignore global variable declarations
Ignore declarations of global variables, just as we do with declarations
of functions.
Done as a follow up to the comments in https://reviews.llvm.org/D149209
Differential Revision: https://reviews.llvm.org/D153855
# Conflicts:
# llvm/lib/IR/StructuralHash.cpp
Han Shen [Thu, 29 Jun 2023 05:18:53 +0000 (22:18 -0700)]
[Analysis] Refactor MBB hotness/coldness into templated PSI functions.
Currently, to use PSI->isFunctionHotInCallGraph, we first need to
calculate BPI->BFI, which is expensive. Instead, we can implement this
directly with MBFI. Also, as @wenlei mentioned in another patch review,
MachineSizeOpts already has isFunctionColdInCallGraph,
isFunctionHotInCallGraphNthPercentile, etc. implemented. These can be
refactored so that they can be reused across the MachineFunctionSplitting
and MachineSizeOpts passes.
This CL does this - it refactors out those internal static functions
into PSI as templated functions, so they can be accessed easily.
Differential Revision: https://reviews.llvm.org/D153927