David Blaikie [Thu, 5 May 2022 18:09:34 +0000 (18:09 +0000)]
DWARFVerifier: Verify CU/TU index overlap issues
Discovered in a large object that would need a 64 bit index (but the
cu/tu index format doesn't include a 64 bit offset/length mode in
DWARF64 - a spec bug) but instead binutils dwp overflowed the offsets
causing overlapping regions.
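The kind of overlap described above can be illustrated with a small interval check. This is only a sketch under assumptions, not the DWARFVerifier code: `Contribution`, `hasOverlap` and `truncateTo32` are hypothetical names, and a contribution is modelled as a half-open byte range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one index entry: the byte range [Offset, Offset + Length).
struct Contribution {
  uint64_t Offset;
  uint64_t Length;
};

// Returns true if any two non-empty contributions share at least one byte.
bool hasOverlap(std::vector<Contribution> Entries) {
  std::sort(Entries.begin(), Entries.end(),
            [](const Contribution &A, const Contribution &B) {
              return A.Offset < B.Offset;
            });
  uint64_t End = 0; // end of the furthest-reaching contribution seen so far
  for (const Contribution &C : Entries) {
    assert(C.Offset + C.Length >= C.Offset && "range must not wrap in 64 bits");
    if (C.Length == 0)
      continue;
    if (C.Offset < End)
      return true;
    End = std::max(End, C.Offset + C.Length);
  }
  return false;
}

// Models a 32-bit offset field: offsets past 4 GiB wrap around.
uint64_t truncateTo32(uint64_t Offset) {
  return static_cast<uint32_t>(Offset);
}
```

An offset just past 4 GiB wraps to a small value once it is stored in a 32-bit field, so it can land inside an earlier contribution.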
Nick Desaulniers [Thu, 5 May 2022 18:06:09 +0000 (11:06 -0700)]
[X86SchedSandyBridge] update cost of COPY to 1 cycle from 0
To match the cost of other scheduling models. This is expected to
schedule mov instructions around INLINEASM less frequently for the
default machineschedule (pre-RA scheduling).
Suggested by Craig Topper.
Link: https://github.com/llvm/llvm-project/issues/41914
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D122350
Serge Pavlov [Thu, 5 May 2022 18:05:05 +0000 (01:05 +0700)]
Revert "[InstCombine] Remove side effect of replaced constrained intrinsics"
This reverts commit 83914ee96fc2d828e1cfb8913f5d156d39150e2c.
The change caused discussion: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20220502/1034841.html
Peter Kasting [Thu, 5 May 2022 17:52:12 +0000 (19:52 +0200)]
[libc++] Avoid a Microsoft SAL macro.
Bug: https://github.com/llvm/llvm-project/issues/55195
Reviewed By: #libc, Mordante
Differential Revision: https://reviews.llvm.org/D124695
Nick Desaulniers [Thu, 5 May 2022 17:59:49 +0000 (10:59 -0700)]
[x86][scheduler] Add MIR test for 41914
Generated via:
$ clang -fno-omit-frame-pointer -m32 -mregparm=3 -O2 crash.c -emit-llvm -S
$ llc -print-before=machine-scheduler -mcpu=sandybridge crash.mir
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D122348
Andrzej Warzynski [Thu, 5 May 2022 17:58:08 +0000 (17:58 +0000)]
[flang][driver] Add missing parentheses in an assert
The assert in https://reviews.llvm.org/D124665 was missing parentheses,
which triggered a warning in GCC (verified with GCC 11). As `-Werror` is
on by default in Flang, that triggered build errors, see e.g. [1].
The fix is rather straightforward, so I am sending this without a
review.
[1] https://lab.llvm.org/buildbot/#/builders/160/builds/7016
Differential Revision: https://reviews.llvm.org/D125027
Aaron Ballman [Thu, 5 May 2022 17:53:03 +0000 (13:53 -0400)]
No longer accept scoped enumerations in C
We had a think-o that would allow a user to declare a scoped
enumeration in C language modes "as a C++11 extension". This is a
think-o because there's no way for the user to spell the name of the
enumerators; C does not have '::' for a fully-qualified name. See
commit d0d87b597259a2b74ae5c2825a081c7e336cb1d0 for details on why this
is unintentional for C.
Fixes #42372
Joe Nash [Thu, 14 Apr 2022 13:29:25 +0000 (09:29 -0400)]
[AMDGPU] Split FeatureAtomicFaddInsts
FeatureAtomicFaddInsts is replaced with three more granular features.
Contributors:
Petar Avramovic <Petar.Avramovic@amd.com>
Patch 3/N for upstreaming of AMDGPU gfx11 architecture
Depends on D124537
Reviewed By: foad, #amdgpu, arsenm
Differential Revision: https://reviews.llvm.org/D124538
Amir Ayupov [Thu, 5 May 2022 17:38:31 +0000 (10:38 -0700)]
[BOLT][CMAKE] Check build target architecture for runtime libs
Account for cross-compilation build scenarios (X86 to ARM, Linux
to Windows, etc).
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124712
Christopher Bate [Wed, 4 May 2022 20:12:07 +0000 (14:12 -0600)]
[mlir][nvvm] Fix support for tf32 data type in mma.sync
The NVVM dialect test coverage for all possible type/shape combinations
in the `nvvm.mma.sync` op is mostly complete. However, there were tests
missing for TF32 datatype support. This change adds tests for the one
relevant shape/type combination. This uncovered a small bug in the op
verifier, which this change also fixes.
Differential Revision: https://reviews.llvm.org/D124975
Sam McCall [Thu, 5 May 2022 16:50:44 +0000 (18:50 +0200)]
[clangd] Fix inlayhints crash, don't assume functions have FunctionTypeLocs
Fixes https://github.com/clangd/clangd/issues/1140
Sanjay Patel [Thu, 5 May 2022 16:47:11 +0000 (12:47 -0400)]
[InstCombine] fix typo in test name; NFC
Sanjay Patel [Thu, 5 May 2022 16:41:32 +0000 (12:41 -0400)]
[InstCombine] add scalable vector test for logical select; NFC
D124997 shows that the code is not ready to handle scalable vectors,
so add some more coverage for a potential crashing case.
Craig Topper [Thu, 5 May 2022 16:40:10 +0000 (09:40 -0700)]
[SelectionDAG] Constant fold (sext_inreg undef, VT) to 0 instead of undef.
The result of sign_extend_inreg needs to have as many sign bits
as requested by the VT argument. The easiest way to guarantee this
is to fold it to 0.
SystemZ test was modified to avoid using undef.
Fixes https://github.com/llvm/llvm-project/issues/55178
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D124696
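The sign-bit requirement behind the fold above can be sketched on plain 32-bit integers. This is a sketch under assumptions, not SelectionDAG code: `signExtendInReg` and `numSignBits` are hypothetical helpers, and the conversions assume the usual two's complement behaviour (guaranteed from C++20).

```cpp
#include <cassert>
#include <cstdint>

// Sign-extends the low `Bits` bits of `X` to a full 32-bit value, which is
// what sign_extend_inreg computes for an i32 result.
int32_t signExtendInReg(uint32_t X, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 32 && "width must fit in the register");
  unsigned Shift = 32 - Bits;
  return static_cast<int32_t>(X << Shift) >> Shift;
}

// Counts how many top bits are copies of the sign bit, including the sign bit.
unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Sign = U >> 31;
  unsigned Count = 1;
  for (int Bit = 30; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++Count;
  return Count;
}
```

Extending from 8 bits always yields at least 32 - 8 + 1 = 25 sign bits. An arbitrary bit pattern standing in for undef need not have that many, while 0 has all 32, so it satisfies any requested width.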
Andrzej Warzynski [Thu, 5 May 2022 16:35:59 +0000 (16:35 +0000)]
[flang] Fix triple in a couple of driver tests
In https://reviews.llvm.org/D124667, I added tests that check the
generated assembly. I verified the assembly on AArch64 and X86_64, but
the PPC Flang buildbot [1] started failing (i.e. the assembly was not
generic enough).
In order to fix this, I'm changing these tests to be run only on
AArch64 - that's the architecture that most of the public Flang buildbots
use.
I'm hoping that this is straightforward enough and am merging it without
a review.
[1] https://lab.llvm.org/buildbot/#/builders/21/builds/40256
Craig Topper [Thu, 5 May 2022 07:19:10 +0000 (00:19 -0700)]
[DAGCombiner] Fold (sext/zext undef) -> 0 and aext(undef) -> undef.
Differential Revision: https://reviews.llvm.org/D124988
Craig Topper [Thu, 5 May 2022 06:38:24 +0000 (23:38 -0700)]
[DAGCombiner] Fold (max/min X, X) -> X.
Differential Revision: https://reviews.llvm.org/D124951
Craig Topper [Thu, 5 May 2022 06:37:00 +0000 (23:37 -0700)]
[RISCV] Add integer min/max intrinsic tests. NFC
Add basic tests and some tests for same operands and all undef
operands inspired by PR55271.
i32 umin/umax is using signext to match the RISC-V ABI. i8/i16 are
using signext/zeroext to match the operation.
Differential Revision: https://reviews.llvm.org/D124948
AndreyChurbanov [Thu, 5 May 2022 16:27:48 +0000 (11:27 -0500)]
[OpenMP] libomp: Add itt notifications to sync dependent tasks.
Intel Inspector uses itt notifications to analyze code execution, and it
reports race conditions in dependent tasks.
This patch fixes the issue by notifying Inspector of task dependency
synchronizations.
Differential Revision: https://reviews.llvm.org/D123042
Amara Emerson [Thu, 5 May 2022 16:09:51 +0000 (09:09 -0700)]
[AArch64][GlobalISel] Add undef combines to postlegalizer combiner.
Aaron Ballman [Thu, 5 May 2022 16:10:06 +0000 (12:10 -0400)]
Silence a false positive about an unevaluated expr w/side effects
If the operand to `sizeof` is an expression of VLA type, the operand is
still evaluated, so we should not issue a diagnostic about ignoring the
side effects in this case, as they're not actually ignored.
Fixes #48010
Ilya Biryukov [Thu, 5 May 2022 16:09:23 +0000 (16:09 +0000)]
[clang] Fix Clang release notes
I have forgotten a space by mistake in the previous commit.
AndreyChurbanov [Thu, 5 May 2022 16:00:12 +0000 (11:00 -0500)]
[OpenMP] libomp: cleanup - remove duplicate check
The identical check remains 20 lines above in the code.
Differential Revision: https://reviews.llvm.org/D123046
AndreyChurbanov [Thu, 5 May 2022 15:55:52 +0000 (10:55 -0500)]
[OpenMP] libomp: cleanup dead code
Differential Revision: https://reviews.llvm.org/D123047
Ilya Biryukov [Thu, 5 May 2022 15:52:04 +0000 (15:52 +0000)]
[Driver] Remove -fno-concept-satisfaction-caching
The flag was added when the C++20 draft did not allow for concept
caching. The final C++20 standard permits the caching, so the flag is
redundant. See http://wg21.link/p2104r0.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D125014
Brian Tracy [Thu, 5 May 2022 15:49:23 +0000 (17:49 +0200)]
Fix "the the" typo in documentation and user facing strings
There are many more instances of this pattern, but I chose to limit this change to .rst files (docs), anything in libcxx/include, and string literals. These have the highest chance of being seen by end users.
Reviewed By: #libc, Mordante, martong, ldionne
Differential Revision: https://reviews.llvm.org/D124708
Tomasz Kamiński [Thu, 5 May 2022 15:48:49 +0000 (17:48 +0200)]
[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible
This PR changes the `SymIntExpr` so the expression that uses a
negative value as `RHS`, for example: `x +/- (-N)`, is modeled as
`x -/+ N` instead.
This avoids producing a very large `RHS` when the symbol is cast to
an unsigned number, and as a consequence makes the value more robust in
the presence of casts.
Note that this change is not applied if `N` is the lowest negative
value for which negation would not be representable.
Reviewed By: steakhal
Patch By: tomasz-kaminski-sonarsource!
Differential Revision: https://reviews.llvm.org/D124658
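The canonicalization described in the commit above can be sketched as follows. This is a minimal illustration, not the analyzer's implementation: `Op`, `SymIntRHS` and `canonicalize` are hypothetical names, and the right-hand side is assumed to be a 64-bit signed integer.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

enum class Op { Add, Sub };

// Hypothetical stand-in for the operator and right-hand side of `x Op RHS`.
struct SymIntRHS {
  Op Opcode;
  int64_t RHS;
};

// Rewrites `x + (-N)` as `x - N` and `x - (-N)` as `x + N`.
// The minimum value is left alone because its negation is not representable.
SymIntRHS canonicalize(SymIntRHS E) {
  if (E.RHS >= 0 || E.RHS == std::numeric_limits<int64_t>::min())
    return E;
  E.Opcode = (E.Opcode == Op::Add) ? Op::Sub : Op::Add;
  E.RHS = -E.RHS;
  assert(E.RHS > 0 && "negation must be representable");
  return E;
}
```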
Sam McCall [Wed, 4 May 2022 19:37:24 +0000 (21:37 +0200)]
[clang-tidy] Make header-guard check a little looser on comment whitespace
Currently it rejects "// FOO_BAR_H" as an endif comment due to the extra space.
A user complained that this is too picky, which seems fair enough.
Differential Revision: https://reviews.llvm.org/D124955
Louis Dionne [Tue, 26 Apr 2022 21:02:40 +0000 (15:02 -0600)]
[libc++] Add a few _LIBCPP_ASSERTs in __tree
Several helper functions specify preconditions as comments, but we never
check them. I ran across a bug report (without a reproducer) in this code,
and I thought that having these assertions in place would make it easier
to troubleshoot.
Differential Revision: https://reviews.llvm.org/D124477
Andrzej Warzynski [Fri, 22 Apr 2022 09:07:31 +0000 (09:07 +0000)]
[flang][driver] Add support for consuming LLVM IR/BC files
This change makes sure that Flang's driver recognises LLVM IR and BC as
supported file formats. To this end, `isFortran` is extended and renamed
as `isSupportedByFlang` (the latter better reflects the new
functionality).
New tests are added to verify that the target triple is correctly
overridden by the frontend driver's default value or the value specified
with `-triple`. Strictly speaking, this is not a functionality that's
new in this patch (it was added in D124664). This patch simply enables
us to write such tests and hence I'm including them here.
Differential Revision: https://reviews.llvm.org/D124667
Ilya Biryukov [Thu, 5 May 2022 15:02:36 +0000 (15:02 +0000)]
[Sema] Replace invalid FIXME about memory leak. NFC
Added in my previous patch by mistake.
Thomas Preud'homme [Tue, 5 Apr 2022 21:34:40 +0000 (22:34 +0100)]
[MachinePipeliner] Fix unscheduled instruction
Prior to ordering instructions to be scheduled, the machine pipeliner
updates recurrence node sets in groupRemainingNodes() by adding to a
given node set any node on the dependency path from a node set with
higher priority to the given node set. The function computePath() that
determines what constitutes a path follows artificial dependencies.
However, when ordering the nodes in the resulting node sets,
computeNodeOrder() calls ignoreDependence when looking at dependencies
which ignores artificial dependencies. This can cause a node not to be
scheduled which then causes wrong code generation and in the case of a
debug build will lead to an assert failure in generatePhis() in
ModuloScheduler.cpp.
This commit adds calls to ignoreDependence() in computePath() to not add
any node in groupRemainingNodes() that would not be ordered by
computeNodeOrder().
Reviewed By: sgundapa
Differential Revision: https://reviews.llvm.org/D124267
David Green [Thu, 5 May 2022 14:56:55 +0000 (15:56 +0100)]
[PowerPC] Add extra v2i64 splat load tests. NFC
In service of D123801, this adds some tests targeting a v2i64 splat of a
load, and regenerates vsx_shuffle_le.ll for easier updating.
Sam McCall [Wed, 4 May 2022 23:15:28 +0000 (01:15 +0200)]
[Driver] Make "upgrade" of -include to include-pch optional; disable in clangd
If clang is passed "-include foo.h", it will rewrite to "-include-pch foo.h.pch"
before passing it to cc1, if foo.h.pch exists.
Existence is checked, but validity is not. This is probably a reasonable
assumption for the compiler itself, but not for clang-based tools where the
actual compiler may be a different version of clang, or even GCC.
In the end, we lose our -include, we gain a -include-pch that can't be used,
and the file often fails to parse.
I would like to turn this off for all non-clang invocations (i.e.
createInvocationFromCommandLine), but we have explicit tests of this behavior
for libclang and I can't work out the implications of changing it.
Instead this patch:
- makes it optional in the driver, default on (no change)
- makes it optional in createInvocationFromCommandLine, default on (no change)
- changes driver to do IO through the VFS so it can be tested
- tests the option
- turns the option off in clangd where the problem was reported
Subsequent patches should make libclang opt in explicitly and flip the default
for all other tools. It's probably also time to extract an options struct
for createInvocationFromCommandLine.
Fixes https://github.com/clangd/clangd/issues/856
Fixes https://github.com/clangd/vscode-clangd/issues/324
Differential Revision: https://reviews.llvm.org/D124970
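The rewrite described in the commit above can be sketched as a small argument transformation. This is a sketch under assumptions, not the driver's code: `rewriteIncludes` and its parameters are hypothetical, and the `Exists` callback stands in for a query against the VFS so the logic can be exercised without touching the disk.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: rewrites "-include X" to "-include-pch X.pch" when
// the upgrade is enabled and X.pch exists according to `Exists`.
std::vector<std::string>
rewriteIncludes(const std::vector<std::string> &Args, bool UpgradeToPCH,
                const std::function<bool(const std::string &)> &Exists) {
  assert(Exists && "existence callback must be callable");
  std::vector<std::string> Out;
  for (std::size_t I = 0; I < Args.size(); ++I) {
    if (Args[I] == "-include" && I + 1 < Args.size()) {
      std::string PCH = Args[I + 1] + ".pch";
      if (UpgradeToPCH && Exists(PCH)) {
        // Only existence is checked; whether the PCH is usable is not.
        Out.push_back("-include-pch");
        Out.push_back(PCH);
      } else {
        Out.push_back(Args[I]);
        Out.push_back(Args[I + 1]);
      }
      ++I; // skip the header name that was just consumed
      continue;
    }
    Out.push_back(Args[I]);
  }
  return Out;
}
```

With the option off, the arguments pass through unchanged, which is the behaviour the commit selects for clangd.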
Philip Reames [Thu, 5 May 2022 14:35:09 +0000 (07:35 -0700)]
[riscv] Use X0 for destination of VSETVLI instruction if result unused
If the GPR destination register of a VSETVLI instruction is unused, we can replace it with X0. This discards the result, and thus reduces register pressure.
Since after the core insertion/lowering algorithm has run, many user-written VSETVLIs will have their GPR result unused (as VTYPE/VLEN is now explicitly read instead), this kicks in for most tests which involve a vsetvli intrinsic for fixed length vectorization. (vscale vectorization generally uses the GPR result to know how far to e.g. advance pointers in a loop and these uses are not removed.) When inserting VSETVLIs to lower pseudos, we prefer the X0 form anyway.
Differential Revision: https://reviews.llvm.org/D124961
David Green [Thu, 5 May 2022 14:27:44 +0000 (15:27 +0100)]
[ARM][AArch64] Add some extra shuffle conversion test coverage. NFC
This adds a big endian run line for the AArch64 TRN tests and
regenerates the check lines, along with adding an extra MVE VMOVN case
and regenerating vector-DAGCombine.ll for easier updating.
Peter Steinfeld [Thu, 5 May 2022 01:30:22 +0000 (18:30 -0700)]
[flang][nfc] Use a message class for "not yet implemented" messages
Following a previous suggestion from Peter Klausler.
Differential Revision: https://reviews.llvm.org/D124972
Benjamin Kramer [Thu, 5 May 2022 14:05:11 +0000 (16:05 +0200)]
[IR] Simplify code. NFCI.
Andrzej Warzynski [Fri, 22 Apr 2022 09:07:31 +0000 (09:07 +0000)]
[flang][driver] Re-organise the code-gen actions (nfc)
All frontend actions that generate code (MLIR, LLVM IR/BC,
Assembly/Object Code) are re-factored as essentially one action,
`CodeGenAction`, with minor specialisations. To facilitate all this,
`CodeGenAction` is extended to hold `TargetMachine` and backend action
type (MLIR vs LLVM IR vs LLVM BC vs Assembly vs Object Code).
`CodeGenAction` is no longer a pure abstract class and the
corresponding `ExecuteAction` is implemented so that it covers all use
cases. All this allows a much better code re-use.
Key functionality is extracted into some helpful hooks:
* `SetUpTargetMachine`
* `GetOutputStream`
* `EmitObjectCodeHelper`
* `EmitBCHelper`
I hope that this clarifies the overall structure. I suspect that we may
need to revisit this again as the functionality grows in complexity.
Differential Revision: https://reviews.llvm.org/D124665
Fred Tingaud [Thu, 5 May 2022 13:03:12 +0000 (15:03 +0200)]
In MSVC compatibility mode, handle unqualified templated base class initialization
Before C++20, MSVC allowed omitting the template arguments of the base class in the initializer of a class that inherits from a templated base class.
So the following code compiled correctly:
```
template <class T>
class Base {
};
template <class T>
class Derived : public Base<T> {
public:
  Derived() : Base() {}
};
void test() {
  Derived<int> d;
}
```
See https://godbolt.org/z/Pxxe7nccx for a conformance view.
This patch adds support for such a construct when in MSVC compatibility mode.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D124666
Benjamin Kramer [Thu, 5 May 2022 13:50:33 +0000 (15:50 +0200)]
[ConstantFold] Use getFltSemantics instead of manually checking the type
Simplifies the code and makes fpext/fptrunc constant folding not crash
when the result is bf16.
Marco Elver [Thu, 5 May 2022 13:21:35 +0000 (15:21 +0200)]
[ThreadSanitizer] Add fallback DebugLocation for instrumentation calls
When building with debug info enabled, some load/store instructions do
not have a DebugLocation attached. When using the default IRBuilder, it
attempts to copy the DebugLocation from the insertion-point instruction.
When there's no DebugLocation, no attempt is made to add one.
This is problematic for inserted calls, where the enclosing function has
debug info but the call ends up without a DebugLocation in e.g. LTO
builds that verify that both the enclosing function and calls to
inlinable functions have debug info attached.
This issue was noticed in Linux kernel KCSAN builds with LTO and debug
info enabled:
| ...
| inlinable function call in a function with debug info must have a !dbg location
| call void @__tsan_read8(i8* %432)
| ...
To fix, ensure that all calls to the runtime have a DebugLocation
attached, where the possibility exists that the insertion-point might
not have any DebugLocation attached to it.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D124937
Sam McCall [Thu, 5 May 2022 00:15:24 +0000 (02:15 +0200)]
[Frontend] give createInvocationFromCommandLine an options struct
It's accumulating way too many optional params (see D124970)
While here, improve the name and the documentation.
Differential Revision: https://reviews.llvm.org/D124971
Alexey Bataev [Tue, 14 Dec 2021 18:02:06 +0000 (10:02 -0800)]
[SLP]Further improvement of the cost model for scalars used in buildvectors.
Further improvement of the cost model for the scalars used in
buildvectors sequences. The main functionality is outlined into
a separate function.
The cost is calculated in the following way:
1. If the Base vector is not an undef vector, resize the very first mask to
have a common VF and perform the action for 2 input vectors (including the
non-undef Base). Other shuffle masks are combined with the result of the
first stage and processed as a shuffle of 2 elements.
2. If the Base is an undef vector and there is only 1 shuffle mask, perform the
action only for 1 vector with the given mask, if it is not the identity
mask.
3. If > 2 masks are used, perform a series of shuffle actions for 2 vectors,
combining the masks properly between the steps.
The original implementation misses the very first analysis for the Base
vector, so the cost might be too optimistic in some cases. But it improves
the cost for the insertelements which are part of the current SLP graph.
Part of D107966.
Differential Revision: https://reviews.llvm.org/D115750
Xing Xue [Thu, 5 May 2022 13:01:36 +0000 (09:01 -0400)]
[XCOFF][AIX] Use unique section names for LSDA and EH info sections with -ffunction-sections
Summary:
When -ffunction-sections is on, this patch makes the compiler generate unique LSDA and EH info sections for functions on AIX by appending the function name to the section name as a suffix. This will allow the AIX linker to garbage-collect unused functions.
Reviewed by: MaskRay, hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D124855
Peter Waller [Tue, 3 May 2022 08:36:07 +0000 (08:36 +0000)]
[AArch64] Add -aarch64-insert-extract-base-cost
The new flag -aarch64-insert-extract-base-cost can be used to
set the value of AArch64Subtarget::getVectorInsertExtractBaseCost(),
for the purposes of experimentation.
Differential Revision: https://reviews.llvm.org/D124835
Jay Foad [Thu, 21 Apr 2022 11:01:59 +0000 (12:01 +0100)]
[AMDGPU] Combine DPP mov even if old reg def is in different BB
Given a DPP mov like this:
%2:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
...
%3:vgpr_32 = V_MOV_B32_dpp %2, %1, 1, 1, 1, 0, implicit $exec
this patch just removes a check that %2 (the "old reg") was defined in
the same BB as the DPP mov instruction. GCNDPPCombine requires that the
MIR is in SSA form so I don't understand why the BB matters.
This lets the optimization work in more real world cases when the
definition of %2 gets hoisted out of a loop.
Differential Revision: https://reviews.llvm.org/D124182
Tobias Burnus [Thu, 5 May 2022 09:30:10 +0000 (10:30 +0100)]
sanitizer_common: Define FP_XSTATE_MAGIC1 for old glibc
D116208 (commit 1298273e8206a8fc2) added FP_XSTATE_MAGIC1.
However, when building with glibc < 2.16 for backward-dependency
compatibility, it is not defined - and the build breaks.
Note: The define comes from Linux's asm/sigcontext.h but the
file uses signal.h which includes glibc's bits/sigcontext.h - which
is synced from the kernel's file but lags behind.
Solution: For backward compatibility with ancient systems, define
FP_XSTATE_MAGIC1 if undefined.
//For the old systems, we were building with Linux kernel 3.19 but to support really old glibc systems, we build with a sysroot of glibc 2.12. While our kernel (and the users' kernels) have FP_XSTATE_MAGIC1, glibc 2.12 is too old. – With this patch, building the sanitizer libs works again. This showed up for us today as GCC mainline/13 has now synced the sanitizer libs.//
Reviewed By: #sanitizers, vitalybuka
Differential Revision: https://reviews.llvm.org/D124927
einvbri [Sun, 24 Apr 2022 19:05:11 +0000 (14:05 -0500)]
[analyzer] Get direct binding for specific punned case
Region store was not able to see through this case to the actual
initialized value of STRUCT ff. This change addresses this case by
getting the direct binding. This was found and debugged in a downstream
compiler, with debug guidance from @steakhal. A positive and negative
test case is added.
The specific case where this issue was exposed:
typedef struct {
  int a:1;
  int b[2];
} STRUCT;
int main() {
  STRUCT ff = {0};
  STRUCT* pff = &ff;
  int a = ((int)pff + 1);
  return a;
}
Reviewed By: steakhal, martong
Differential Revision: https://reviews.llvm.org/D124349
Florian Hahn [Thu, 5 May 2022 09:49:51 +0000 (10:49 +0100)]
[LICM] Add test to exercise assertion from D123473.
Add a test case that triggers an assertion with earlier versions of
D123473.
Jay Foad [Thu, 5 May 2022 09:35:20 +0000 (10:35 +0100)]
RegAllocGreedy: Common up part of the priority calculation. NFC.
Nikita Popov [Wed, 4 May 2022 15:35:18 +0000 (17:35 +0200)]
[DAGCombine] Fold (X & ~Y) | Y with truncated not
This extends the (X & ~Y) | Y to X | Y fold to also work if ~Y is
a truncated not (when taking into account the mask X). This is
done by exporting the infrastructure added in D124856 and reusing
it here.
I've retained the old value of AllowUndefs=false, though probably
this can be switched to true with extra test coverage.
Differential Revision: https://reviews.llvm.org/D124930
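The base identity that the commit above extends, (X & ~Y) | Y == X | Y, can be checked exhaustively for small widths. This sketch covers only the plain form, not the truncated-not case the patch adds, and `andNotOrFoldHolds` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively compares (X & ~Y) | Y with X | Y for all values that fit in
// `Bits` bits and reports whether the two forms always agree.
bool andNotOrFoldHolds(unsigned Bits) {
  assert(Bits >= 1 && Bits <= 12 && "keep the exhaustive search small");
  const uint32_t Mask = (1u << Bits) - 1;
  for (uint32_t X = 0; X <= Mask; ++X)
    for (uint32_t Y = 0; Y <= Mask; ++Y) {
      uint32_t Original = ((X & ~Y) | Y) & Mask;
      uint32_t Folded = (X | Y) & Mask;
      if (Original != Folded)
        return false;
    }
  return true;
}
```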
Florian Hahn [Thu, 5 May 2022 08:44:07 +0000 (09:44 +0100)]
[SimpleLoopUnswitch] Add freeze if branch execs for partial unswitching.
We cannot skip freezing the condition if the unswitched branch
executes, if the condition is a chain of ANDs/ORs. For example, if we
have an AND %c1, %c2 with %c1 == undef and %c2 == 0, there would be no
branch on undef in the original code, but a branch on undef if we
unswitch %c1.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D124603
Jean Perier [Thu, 5 May 2022 08:32:40 +0000 (10:32 +0200)]
[flang] use 1-based dim in transformational runtime error msg
Flang transformational runtime was previously reporting conformity
issues in a zero-based fashion to describe which dimension is
non-conformant. This may confuse Fortran users, especially when the message
is about a dimension other than the first one.
Differential Revision: https://reviews.llvm.org/D124941
Adrian Kuegel [Thu, 5 May 2022 07:57:45 +0000 (09:57 +0200)]
[clang] Add static_cast to fix Bazel build.
Differential Revision: https://reviews.llvm.org/D124995
Matthias Springer [Thu, 5 May 2022 07:49:56 +0000 (16:49 +0900)]
[mlir][scf][bufferize] Update verifyAnalysis error message
The previous error message was technically incorrect. We do not compare equivalence of YieldOp operands and ForOp operands.
Differential Revision: https://reviews.llvm.org/D124934
Matthias Springer [Thu, 5 May 2022 07:49:37 +0000 (16:49 +0900)]
[mlir][scf][bufferize][NFC] Split ForOp bufferization into smaller functions
This is in preparation of WhileOp bufferization, which reuses these functions.
Differential Revision: https://reviews.llvm.org/D124933
Matthias Springer [Thu, 5 May 2022 07:49:18 +0000 (16:49 +0900)]
[mlir][scf][bufferize][NFC] Simplify verifyAnalysis implementation
Differential Revision: https://reviews.llvm.org/D124928
Nikita Popov [Wed, 4 May 2022 10:43:31 +0000 (12:43 +0200)]
[SCEV] Fold umin_seq to umin using implied poison reasoning
Similar to how we convert logical and/or to bitwise and/or, we should
also convert umin_seq to umin based on implied poison reasoning. In
%x umin_seq %y, if %y being poison implies %x being poison, then we
don't need the sequential evaluation: Having %y contribute towards
the result will never make the result more poisonous. An important
corollary of this is that if %y is never poison, we also don't need
the sequential evaluation.
This avoids some of the regressions in D124910.
Differential Revision: https://reviews.llvm.org/D124921
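The reasoning in the commit above can be illustrated with a toy model in which poison is an empty `std::optional`. This is an assumption made for illustration only, not how SCEV represents values, and `Value`, `umin`, `uminSeq` and `checkUminSeqModel` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Toy model: an empty optional stands for poison.
using Value = std::optional<uint32_t>;

// Plain umin: poison in either operand poisons the result.
Value umin(Value X, Value Y) {
  if (!X || !Y)
    return std::nullopt;
  return std::min(*X, *Y);
}

// Sequential umin: if X is 0 the result is 0 and Y is never looked at.
Value uminSeq(Value X, Value Y) {
  if (!X)
    return std::nullopt;
  if (*X == 0)
    return 0u;
  return umin(X, Y);
}

// Small self-check of the claims under this model.
void checkUminSeqModel() {
  // The two forms differ when X is 0 and Y is poison.
  assert(uminSeq(0u, std::nullopt) == Value(0u));
  assert(!umin(0u, std::nullopt).has_value());
  // They agree whenever Y is not poison.
  assert(uminSeq(0u, 7u) == umin(0u, 7u));
  assert(uminSeq(std::nullopt, 7u) == umin(std::nullopt, 7u));
}
```

In this model the only disagreement is the case where X is a non-poison 0 and Y is poison, which is exactly the case ruled out when Y being poison implies X being poison.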
serge-sans-paille [Mon, 2 May 2022 10:19:48 +0000 (12:19 +0200)]
[lldb] Fix ppc64 detection in lldb
Currently, ppc64le and ppc64 (defaulting to big endian) have the same
descriptor, thus the linear scan always returns ppc64le. Handle that through
subtype.
This is a recommit of f114f009486816ed4b3bf984f0fbbb8fc80914f6 with a new test
setup that doesn't involve (unsupported) corefiles.
Differential Revision: https://reviews.llvm.org/D124760
Chuanqi Xu [Thu, 5 May 2022 07:15:09 +0000 (15:15 +0800)]
[NFC] [Pipelines] Hoist CoroCleanup as Module Pass
This is similar to previous patch https://reviews.llvm.org/D123925. It
could also reduce the time we call declaresCoroCleanupIntrinsics. And it
is helpful for further changes.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D124362
Chuanqi Xu [Thu, 21 Apr 2022 09:35:56 +0000 (17:35 +0800)]
[Pipelines] Hoist CoroCleanup to avoid blocking optimizations
CoroCleanup is designed to lower all the remaining coroutine
intrinsics. It is required to run after CoroSplit only. However, the
position of CoroCleanup now is far too late. The downside here is that
the unlowered coroutine intrinsics might block other optimizations
too. So it should be a pure win to hoist the position of CoroCleanup.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D124360
Zakk Chen [Thu, 5 May 2022 01:26:03 +0000 (18:26 -0700)]
[RISCV][Clang] add more tests for clang driver. (NFC)
Test experimental arch, Zfh, Zfmin and Zve arch.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D124611
Serge Pavlov [Thu, 5 May 2022 05:02:42 +0000 (12:02 +0700)]
[InstCombine] Remove side effect of replaced constrained intrinsics
If a constrained intrinsic call was replaced by some value, it was not
removed in some cases. The dangling instruction resulted in useless
instructions being executed at runtime. This happened because constrained
intrinsics usually have a side effect, which is used to model the interaction
with the floating-point environment. In some cases this is correct behavior,
but often the side effect is actually absent or can be ignored.
This change adds specific treatment of constrained intrinsics so that
their side effect can be removed if it is actually absent.
Differential Revision: https://reviews.llvm.org/D118426
Mariusz Sikora [Mon, 25 Apr 2022 11:57:27 +0000 (12:57 +0100)]
[AMDGPU] Use d16 flag for image.sample instructions
The image.sample instruction can be forced to return a half type instead of
float when the d16 flag is enabled.
This patch adds a new pattern in InstCombine to detect if the output of
image.sample is used later only by an fptrunc which converts the type
from float to half. If the pattern is detected, then fptrunc and image.sample
are combined into a single image.sample which returns a half type.
Later, in the lowering part, the d16 flag is added to the image sample intrinsic.
Differential Revision: https://reviews.llvm.org/D124232
Wael Yehia [Tue, 3 May 2022 14:27:15 +0000 (10:27 -0400)]
[AIX][PGO] Enable linux style PGO on AIX
This patch switches the PGO implementation on AIX from using the runtime
registration-based section tracking to the __start_SECNAME/__stop_SECNAME
based one. In order to enable the recognition of __start_SECNAME/__stop_SECNAME
symbols in the AIX linker, the -bdbg:namedsects:ss option needs to be used.
Reviewed By: jsji, MaskRay, davidxl
Differential Revision: https://reviews.llvm.org/D124857
Eric Li [Wed, 4 May 2022 17:15:00 +0000 (17:15 +0000)]
[clang][dataflow] Add flowConditionIsTautology function
Provide a way for users to check if a flow condition is
unconditionally true.
Differential Revision: https://reviews.llvm.org/D124943
Patryk Wychowaniec [Thu, 5 May 2022 03:07:41 +0000 (03:07 +0000)]
[AVR] Always expand STDSPQRr & STDWSPQRr
Currently, STDSPQRr and STDWSPQRr are expanded only during
AVRFrameLowering - this means that if any of those instructions happen
to appear _outside_ of the typical FrameSetup / FrameDestroy
context, they wouldn't get substituted, eventually leading to a crash:
```
LLVM ERROR: Not supported instr: <MCInst XXX <MCOperand Reg:1>
<MCOperand Imm:15> <MCOperand Reg:53>>
```
This commit fixes this issue by moving expansion of those two opcodes
into AVRExpandPseudo.
This bug was originally discovered due to the Rust compiler_builtins
library. Its 0.1.37 release contained a 128-bit software
division/remainder routine that exercised this buggy branch in the code.
Reviewed By: benshi001
Differential Revision: https://reviews.llvm.org/D123528
Lian Wang [Fri, 29 Apr 2022 08:37:45 +0000 (08:37 +0000)]
[RISCV][NFC] Use true_mask replace riscv_vmset_vl in defined patterns.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D124660
Phoebe Wang [Thu, 5 May 2022 02:58:23 +0000 (10:58 +0800)]
[X86] Add `void` to void function. NFC
Luo, Yuanke [Wed, 4 May 2022 04:39:45 +0000 (12:39 +0800)]
[X86][AMX] Replace PXOR instruction with SET0 in AMX pre config.
To generate a zero value, the PXOR instruction needs 3 operands that are
tied to the same vreg. This is not good in SSA form, and with an undef value
the two-address instruction pass may convert
`%0:vr128 = PXORrr undef %0, undef %0`
to `%1:vr128 = PXORrr undef %1:vr128(tied-def 0), undef %0:vr128`.
This is not expected.
It can be simplified to the SET0 instruction, which takes only 1 destination
operand. It should be more friendly to the two-address instruction pass and
the register allocation pass.
`%0:vr128 = V_SET0`
Also add an AVX1 code path so that it is consistent with other code.
Differential Revision: https://reviews.llvm.org/D124903
Ben Shi [Thu, 5 May 2022 02:18:09 +0000 (02:18 +0000)]
[Disassembler][AVR] Remove unused static functions
The unused static functions cause failures on some build machines.
Craig Topper [Thu, 5 May 2022 01:51:25 +0000 (18:51 -0700)]
[X86] Call initializeX86PreTileConfigPass from LLVMInitializeX86Target.
Without this, the pass doesn't show up in print-before/after-all.
Differential Revision: https://reviews.llvm.org/D124973
Craig Topper [Thu, 5 May 2022 01:29:15 +0000 (18:29 -0700)]
[SelectionDAG] Use llvm::any_of to simplify a loop. NFC
Ben Shi [Mon, 11 Apr 2022 01:44:49 +0000 (01:44 +0000)]
[MC][AVR] Implement decoding ST/LD
Reviewed By: aykevl, dylanmckay
Differential Revision: https://reviews.llvm.org/D123476
Ben Shi [Sat, 9 Apr 2022 01:45:22 +0000 (01:45 +0000)]
[MC][AVR] Implement decoding STD/LDD
Reviewed By: aykevl, dylanmckay
Differential Revision: https://reviews.llvm.org/D123442
Alexander Shaposhnikov [Thu, 5 May 2022 00:50:33 +0000 (00:50 +0000)]
[InstCombine] Fold ((A&B)^C)|B
Fold ((A&B)^C)|B into C|B.
https://alive2.llvm.org/ce/z/zSGSor
This addresses the issue https://github.com/llvm/llvm-project/issues/55169
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D124710
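The fold can also be checked by brute force on a small bit width. The sketch below is not part of the patch and its function names are made up for illustration. It compares both forms for every 4-bit A, B and C. The identity holds bit by bit: where a bit of B is set, the final OR forces the result to 1; where it is clear, A&B is 0 there, so the XOR passes C through unchanged.

```python
from itertools import product


def before_fold(a: int, b: int, c: int) -> int:
    """Original expression: ((A & B) ^ C) | B."""
    return ((a & b) ^ c) | b


def after_fold(b: int, c: int) -> int:
    """Folded expression: C | B (A is no longer needed)."""
    return c | b


def fold_is_sound(bits: int = 4) -> bool:
    """Exhaustively compare both forms over all `bits`-wide inputs."""
    values = range(1 << bits)
    return all(
        before_fold(a, b, c) == after_fold(b, c)
        for a, b, c in product(values, repeat=3)
    )
```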
Ayke van Laethem [Wed, 4 May 2022 22:46:27 +0000 (00:46 +0200)]
[compiler-rt][AVR] Fix avr_SOURCES CMake variable
D123200 did not include the generic sources, which means that only the
AVR-specific sources were compiled. With this change, generic sources
are included as expected.
Tested with the following commands:
cmake -G Ninja -DCOMPILER_RT_DEFAULT_TARGET_TRIPLE=avr -DCOMPILER_RT_BAREMETAL_BUILD=1 -DCMAKE_C_COMPILER=clang-14 -DCMAKE_C_FLAGS="--target=avr -mmcu=avr5 -nostdlibinc -mdouble=64" ../path/to/builtins
ninja
Differential Revision: https://reviews.llvm.org/D124969
Craig Topper [Thu, 5 May 2022 00:19:43 +0000 (17:19 -0700)]
[RISCV] Use movImm when multiplying by simm12 in getVLENFactoredAmount.
No reason to special case simm12, movImm handles all immediates.
This also fixes a bug where we weren't passing the frame-setup/destroy
flag to movImm when we were calling it.
Alexander Shaposhnikov [Thu, 5 May 2022 00:07:49 +0000 (00:07 +0000)]
[InstCombine][NFC] Update comment in and-xor-or.ll
Alexander Shaposhnikov [Thu, 5 May 2022 00:04:33 +0000 (00:04 +0000)]
[InstCombine][NFC] Add baseline tests for folds of ((A&B)^C)|B
Differential revision: https://reviews.llvm.org/D124709
Test plan: make check-all
Nico Weber [Fri, 22 Apr 2022 15:55:50 +0000 (11:55 -0400)]
[lld/mac] Support writing zippered dylibs and bundles
With -platform_version flags for two distinct platforms,
this writes a LC_BUILD_VERSION header for each.
The motivation is that this is needed for self-hosting with lld as linker
after D124059.
To create a zippered dylib at the clang driver level, pass
-target arm64-apple-macos -darwin-target-variant arm64-apple-ios-macabi
to the driver.
(In Xcode's clang, `-darwin-target-variant` is spelled just `-target-variant`.)
(If you pass `-target arm64-apple-ios-macabi -target-variant arm64-apple-macos`
instead, ld64 crashes!)
This results in two -platform_version flags being passed to the linker.
ld64 also verifies that the iOS SDK version is at least 13.1. We don't do that
yet. But ld64 also does that for other platforms and we don't. So we need to
do that at some point, but not in this patch.
Only dylib and bundle outputs can be zippered.
I verified that a Catalyst app linked against a dylib created with
clang -shared foo.cc -o libfoo.dylib \
-target arm64-apple-macos \
-target-variant arm64-apple-ios-macabi \
-Wl,-install_name,@rpath/libfoo.dylib \
-fuse-ld=$PWD/out/gn/bin/ld64.lld
runs successfully. (The app calls a function `f()` in libfoo.dylib
that returns a const char* "foo", and NSLog(@"%s")s it.)
ld64 is a bit more permissive when writing zippered outputs,
see references to "unzippered twins". That's not implemented yet.
(If anybody wants to implement that, D124275 is a good start.)
Differential Revision: https://reviews.llvm.org/D124887
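As a rough illustration of "one LC_BUILD_VERSION header per platform", the sketch below packs two such load commands in Python. It is not lld's code. The helper names are hypothetical, the version numbers are example values, and it assumes the 24-byte `build_version_command` layout and constants from `<mach-o/loader.h>` (LC_BUILD_VERSION = 0x32, PLATFORM_MACOS = 1, PLATFORM_MACCATALYST = 6) with no trailing tool entries.

```python
import struct

LC_BUILD_VERSION = 0x32      # load command id (assumed from <mach-o/loader.h>)
PLATFORM_MACOS = 1
PLATFORM_MACCATALYST = 6


def encode_version(major: int, minor: int = 0, patch: int = 0) -> int:
    """Pack X.Y.Z into the xxxx.yy.zz layout used for minos/sdk fields."""
    return (major << 16) | (minor << 8) | patch


def build_version_command(platform: int, minos: int, sdk: int) -> bytes:
    # Fields: cmd, cmdsize, platform, minos, sdk, ntools.
    # ntools is 0 here, so cmdsize is just the 24-byte fixed header.
    return struct.pack("<6I", LC_BUILD_VERSION, 24, platform, minos, sdk, 0)


def zippered_commands(primary: tuple, variant: tuple) -> list:
    """One LC_BUILD_VERSION per -platform_version flag, primary first."""
    return [build_version_command(*primary), build_version_command(*variant)]
```

For example, `zippered_commands((PLATFORM_MACOS, encode_version(11), encode_version(11)), (PLATFORM_MACCATALYST, encode_version(13, 1), encode_version(13, 1)))` yields two 24-byte commands that differ only in the platform and version fields.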
Nico Weber [Wed, 4 May 2022 13:08:58 +0000 (09:08 -0400)]
[llvm-otool] Make `llvm-otool -l` output compatible with otool for LC_BUILD_VERSION
Namely, only "symbolize" platform and tool names if `-v` is passed.
(`llvm-otool -lv` output still isn't quite the same as `otool -lv` output, but
`-v` output is arguably for consumption by humans, so I'm not changing that
at this point. Someone else could change it if it was important to them.)
Differential Revision: https://reviews.llvm.org/D124920
Craig Topper [Wed, 4 May 2022 23:13:06 +0000 (16:13 -0700)]
[PowerPC] Re-run update_mir_test_checks.py on nofpexcept.ll. NFC
This test was previously generated by the script, but the script
now uses CHECK-NEXT instead of CHECK.
This is preparation for a strictfp related patch I'm working on.
H.J. Lu [Wed, 4 May 2022 21:53:05 +0000 (14:53 -0700)]
[sanitizer] Use newfstatat for x32
Since newfstatat is supported on x32, use it for x32.
Differential Revision: https://reviews.llvm.org/D124968
Jason Molenda [Wed, 4 May 2022 22:27:09 +0000 (15:27 -0700)]
Remove expected fail for TestStepNoDebug on AArch64
My fix in https://reviews.llvm.org/D124492 should fix
this - I got an "unexpected pass" failure from an
AArch64 Ubuntu bot when I landed my fix.
Junfeng Dong [Wed, 4 May 2022 22:21:30 +0000 (15:21 -0700)]
[DebugInfo] Give warning instead of error for premature terminator in .debug_aranges section.
llvm-profgen gives an error message when the input binary contains a premature terminator in the .debug_aranges section. These zero-length items point to some rodata with a zero-size type in an embedded Rust library. Zero-Sized Types are a valid feature in Rust, so these are not real errors. This change turns the "error:" message into a warning to avoid misleading users.
Why do we still want a warning in such a case? Because it doesn't follow the DWARF standard. https://bugs.llvm.org/show_bug.cgi?id=46805 contains earlier discussion.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D124121
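A minimal sketch of the "warn and keep going" behaviour, written in Python rather than taken from the patch. It assumes the tuples of one .debug_aranges set have already been decoded into (address, length) pairs and that a (0, 0) pair is the terminator. The function name and message text are hypothetical.

```python
def read_arange_tuples(tuples):
    """Collect the ranges of one set, tolerating early (0, 0) terminators.

    Returns (ranges, diagnostics). A (0, 0) pair in the last position is the
    normal terminator; one found earlier is reported as a warning and skipped
    instead of aborting the whole parse.
    """
    ranges = []
    diagnostics = []
    last = len(tuples) - 1
    for index, (address, length) in enumerate(tuples):
        if address == 0 and length == 0:
            if index == last:
                break  # well-formed terminator at the end of the set
            diagnostics.append(f"warning: premature terminator at tuple {index}")
            continue
        ranges.append((address, length))
    return ranges, diagnostics
```

Keeping the diagnostic, rather than silently skipping the tuple, matches the intent stated above: the input still deviates from the DWARF standard, but the remaining ranges stay usable.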
Jez Ng [Wed, 4 May 2022 22:01:34 +0000 (18:01 -0400)]
[lld-macho][nfc] Set test min version to 11.0
The arm64-apple-macos triple is only valid for versions >= 11.0. (If
one passes arm64-apple-macos10.15 to llvm-mc, the output's min version is still
11.0). In order to write tests easily for both target archs, let's up the
default min version in our tests.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D124562
Jason Molenda [Wed, 4 May 2022 21:43:42 +0000 (14:43 -0700)]
Update the CFA to use $sp when $fp is restored on arm64
In UnwindAssemblyInstEmulation we correctly recognize when an LDP
restores the fp & lr in an epilogue, and mark them as having the
caller's contents now, but we don't update the CFA register rule
at that point to indicate that the CFA is now calculated in terms
of $sp. This doesn't impact the backtrace because the register
contents are all <same> now, but it can confuse the stepper when
the StackID changes mid-epilogue.
Differential Revision: https://reviews.llvm.org/D124492
rdar://92064415
Zixu Wang [Wed, 4 May 2022 19:29:45 +0000 (12:29 -0700)]
Revert "Revert "[clang][extract-api] Use relative includes""
Reapply the change after fixing sanitizer errors.
The original problem was that `StringRef`s in `Matches` are pointing to
temporary local `std::string`s created by `path::convert_to_slash` in
the regex match call. This patch does the conversion up front in
container `FilePath`.
This reverts commit 2966f0fa505266735dbc8324b8821b7f0aa901ff.
Differential Revision: https://reviews.llvm.org/D124964
Philip Reames [Tue, 3 May 2022 21:00:51 +0000 (14:00 -0700)]
[RISCV] Add a version of insertVSETVLI which uses an iterator [NFC]
This is to simplify the final version of D124869.
Stanislav Mekhanoshin [Wed, 27 Apr 2022 19:10:16 +0000 (12:10 -0700)]
[AMDGPU] Handle LDS DMA and LDS_DIRECT hazards
There shall be 1 wait state between M0 write and LDS DMA/LDS_DIRECT use.
Differential Revision: https://reviews.llvm.org/D124550
Jon Chesterfield [Wed, 4 May 2022 21:42:05 +0000 (22:42 +0100)]
[amdgpu] Elide module lds allocation in kernels with no callees
Introduces a string attribute, amdgpu-requires-module-lds, to allow
eliding the module.lds block from kernels. Will allocate the block as before
if the attribute is missing or has its default value of true.
Patch uses the new attribute to detect the simplest possible instance of this,
where a kernel makes no calls and thus cannot call any functions that use LDS.
Tests updated to match; coverage was already good. The interesting case is in
lower-module-lds-offsets where annotating the kernel allows the backend to pick
a different (in this case better) variable ordering than previously. A later
patch will avoid moving kernel variables into module.lds when the kernel can
have this attribute, allowing optimal ordering and locally unused variable
elimination.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122091
Craig Topper [Wed, 4 May 2022 21:26:44 +0000 (14:26 -0700)]
[RISCV] Add a special case to treat riscv-v-vector-bits-min=-1 as meaning use Zvl*b value.
riscv-v-vector-bits-min is primarily used to opt-in to the
autovectorizer. The vector width can be determined from Zvl*b.
This patch adds support for treating -1 as meaning use Zvl*b so we can
still opt-in to autovectorization without needing to repeat a
vector width already given by Zvl*b or -mcpu.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D124960
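The special case amounts to a small lookup. The Python sketch below only illustrates the rule described above and is not the backend's code. The function and parameter names are hypothetical, and any validation the real option performs is left out.

```python
def resolve_min_vector_bits(option_value: int, zvl_bits: int) -> int:
    """Resolve riscv-v-vector-bits-min against the Zvl*b-implied width.

    -1 is the sentinel meaning "use whatever Zvl*b (or -mcpu) implies";
    any other value is passed through unchanged in this sketch.
    """
    if option_value == -1:
        return zvl_bits
    return option_value
```

With Zvl128b in effect, `resolve_min_vector_bits(-1, 128)` gives 128, so the width does not have to be repeated on the command line.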
Congzhe Cao [Wed, 4 May 2022 21:09:13 +0000 (17:09 -0400)]
[LoopCacheAnalysis][NFC] Add a test case for improved loop cache analysis cost calculation
Added a motivating test case for D123400 where the loopnest has a
suboptimal loop order j-i-k. After D123400 we ensure that the order
of loop cache analysis output is loop i-j-k, despite the suboptimal
order in the original loopnest.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D122776
David Green [Wed, 4 May 2022 21:12:09 +0000 (22:12 +0100)]
[ARM] Delay creation of MVE Imm shifts to legalization
The reasoning for creating VSHLIMM/VSHRsIMM/VSHRuIMM nodes in a combine
- because matching i64 constants is difficult - does not apply for MVE,
as there are no v2i64 shifts. Delaying the creation of the nodes can
allow extra transforms on target independent shl/shr.
Amir Ayupov [Wed, 4 May 2022 21:07:42 +0000 (14:07 -0700)]
[BOLT][NFC] Move getInliningInfo out of Inliner class
`getInliningInfo` is useful in other passes that need to check inlining
eligibility for some function. Move the declaration and InliningInfo definition
out of Inliner class. Prepare for subsequent use in ICP.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124899
Amir Ayupov [Wed, 4 May 2022 21:03:24 +0000 (14:03 -0700)]
[BOLT][NFC] Minor cleanup in ICP getCallTargets and canPromoteCallsite
Minor refactoring. NFC.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D124898
Yitzhak Mandelbaum [Wed, 4 May 2022 18:47:08 +0000 (18:47 +0000)]
[clang-tidy] Escape diagnostic messages before passing to `diag` in Transformer.
Messages generated by Transformer rules may have `%` in them, which
needs to be escaped before being passed to `diag`, which interprets them
specially (and crashes if they are misused).
Differential Revision: https://reviews.llvm.org/D124952
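The escaping step can be sketched as below. This is Python used for illustration only, not the clang-tidy change itself. The function name is hypothetical, and the sketch assumes that `%%` is how the diagnostic format string spells a literal percent sign.

```python
def escape_for_diag(message: str) -> str:
    """Double every '%' so the formatter treats it as literal text."""
    return message.replace("%", "%%")
```

A rule message such as "use 100% of the buffer" becomes "use 100%% of the buffer" before it reaches `diag`, so the formatter no longer tries to read "% " as a format specifier.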
Ayke van Laethem [Wed, 4 May 2022 16:37:28 +0000 (18:37 +0200)]
[compiler-rt][AVR] Use correct return value for __ledf2 etc
Previously the default was long, which is 32-bit on AVR. But avr-gcc
expects a smaller value: it reads the return value from r24.
This is actually a regression from https://reviews.llvm.org/D98205.
Before D98205, the return value was an enum (which was 2 bytes in size)
which was compatible with the 1-byte return value that avr-gcc was
expecting. But long is 4 bytes and thus places the significant return
value in a different register.
Differential Revision: https://reviews.llvm.org/D124939
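The register mismatch can be made concrete with a small model of the avr-gcc return convention. The Python sketch below is illustrative only. It assumes that a return value's size is rounded up to an even number of bytes and that the value is placed so it ends at r25, with the least significant byte in the lowest register. The function name is hypothetical.

```python
def return_registers(size_bytes: int) -> list:
    """Registers holding a return value, least significant byte first."""
    rounded = size_bytes + (size_bytes % 2)   # round up to an even size
    first = 26 - rounded                      # value is aligned to end at r25
    return list(range(first, first + size_bytes))
```

Under that model a 1-byte or 2-byte result starts at r24, while a 4-byte long starts at r22, so a caller reading r24 sees the third byte of the long instead of the comparison result.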