Arthur Eubanks [Sat, 10 Apr 2021 18:59:04 +0000 (11:59 -0700)]
[Inliner] Propagate SROA analysis through invariant group intrinsics
SROA can handle invariant group intrinsics, let the inliner know that
for better heuristics when the intrinsics are present.
This fixes size issues in a couple files when turning on
-fstrict-vtable-pointers in Chrome.
Reviewed By: rnk, mtrofin
Differential Revision: https://reviews.llvm.org/D100249
Hamza Sood [Mon, 12 Apr 2021 17:47:14 +0000 (10:47 -0700)]
Replace uses of std::iterator with explicit using
This patch removes all uses of `std::iterator`, which was deprecated in C++17.
While this isn't currently an issue while compiling LLVM, it's useful for those using LLVM as a library.
For some reason there're a few places that were seemingly able to use `std` functions unqualified, which no longer works after this patch. I've updated those places, but I'm not really sure why it worked in the first place.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D67586
Fraser Cormack [Thu, 8 Apr 2021 10:22:46 +0000 (11:22 +0100)]
[RISCV] Support vector SET[U]LT and SET[U]GE with splatted immediates
This patch adds more optimized codegen for the above SETCC forms,
by matching the '.vi' vector forms when the immediate is a 5-bit signed
immediate plus 1. The immediate can be decremented and the corresponding
SET[U]LE or SET[U]GT forms can be matched.
This work was left as a TODO from D94168.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100096
Arjun P [Mon, 12 Apr 2021 17:31:15 +0000 (23:01 +0530)]
[MLIR] PresburgerSet emptiness check: remove assertions that there are no symbols
Symbols are now supported in the integer emptiness check. Remove some outdated assertions checking that there are no symbols.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D100327
Aart Bik [Mon, 12 Apr 2021 16:28:41 +0000 (09:28 -0700)]
[mlir] introduce "encoding" attribute to tensor type
This CL introduces a generic attribute (called "encoding") on tensors.
The attribute currently does not carry any concrete information, but the type
system already correctly determines that tensor<8xi1,123> != tensor<8xi1,321>.
The attribute will be given meaning through an interface in subsequent CLs.
See ongoing discussion on discourse:
[RFC] Introduce a sparse tensor type to core MLIR
https://llvm.discourse.group/t/rfc-introduce-a-sparse-tensor-type-to-core-mlir/2944
A sparse tensor will look something like this:
```
// named alias with all properties we hold dear:
#CSR = {
// individual named attributes
}
// actual sparse tensor type:
tensor<?x?xf64, #CSR>
```
I see the following rough 5 step plan going forward:
(1) introduce this format attribute in this CL, currently still empty
(2) introduce attribute interface that gives it "meaning", focused on sparse in first phase
(3) rewrite sparse compiler to use new type, remove linalg interface and "glue"
(4) teach passes to deal with new attribute, by rejecting/asserting on non-empty attribute as simplest solution, or doing meaningful rewrite in the longer run
(5) add FE support, document, test, publicize new features, extend "format" meaning to other domains if useful
Reviewed By: stellaraccident, bondhugula
Differential Revision: https://reviews.llvm.org/D99548
Emilio Cota [Mon, 12 Apr 2021 17:15:35 +0000 (19:15 +0200)]
[mlir] Rename AVX512 dialect to X86Vector
We will soon be adding non-AVX512 operations to MLIR, such as AVX's rsqrt. In https://reviews.llvm.org/D99818 several possibilities were discussed, namely to (1) add non-AVX512 ops to the AVX512 dialect, (2) add more dialects (e.g. AVX dialect for AVX rsqrt), and (3) expand the scope of the AVX512 to include these SIMD x86 ops, thereby renaming the dialect to something more accurate such as X86Vector.
Consensus was reached on option (3), which this patch implements.
Reviewed By: aartbik, ftynse, nicolasvasilache
Differential Revision: https://reviews.llvm.org/D100119
MaheshRavishankar [Mon, 12 Apr 2021 15:49:45 +0000 (08:49 -0700)]
[mlir][Linalg] Disable const -> linalg.generic when fused op is illegal.
Fusing a constant with a linalg.generic operation can result in the
fused operation being illegal since the loop bound computation
fails. Avoid such fusions.
Differential Revision: https://reviews.llvm.org/D100272
Mitch Phillips [Mon, 12 Apr 2021 16:49:28 +0000 (09:49 -0700)]
[asan] Replaceable new/delete is unsupported in Windows.
Mark the test as unsupported to bring the bot online. Could probably be
permanently fixed by using one of the workarounds already present in
compiler-rt.
Alexander Kornienko [Mon, 12 Apr 2021 16:28:01 +0000 (18:28 +0200)]
Fix nits.
Jens Massberg [Mon, 12 Apr 2021 16:25:29 +0000 (18:25 +0200)]
[clang-tidy] Add option to ignore macros in readability-function-cognitive-complexity check.
(this was originally part of https://reviews.llvm.org/D96281 and has been split off into its own patch)
If a macro is used within a function, the code inside the macro
doesn't make the code less readable. Instead, for a reader a macro is
more like a function that is called. Thus the code inside a macro
shouldn't increase the complexity of the function in which it is called.
Thus the flag 'IgnoreMacros' is added. If set to 'true' code inside
macros isn't considered during analysis.
This isn't perfect, as now the code of a macro isn't considered at all,
even if it has a high cognitive complexity itself. It might be better if
a macro is considered in the analysis like a function and gets its own
cognitive complexity. Implementing such an analysis seems to be very
complex (if possible at all with the given AST), so we give the user the
option to either ignore macros completely or to let the expanded code
count to the calling function's complexity.
See the code example from vgeof (originally added as note in https://reviews.llvm.org/D96281)
bool doStuff(myClass* objectPtr){
if(objectPtr == nullptr){
LOG_WARNING("empty object");
return false;
}
if(objectPtr->getAttribute() == nullptr){
LOG_WARNING("empty object");
return false;
}
use(objectPtr->getAttribute());
}
The LOG_WARNING macro itself might have a high complexity, but it do not make the
the function more complex to understand like e.g. a 'printf'.
By default 'IgnoreMacros' is set to 'false', which is the original behavior of the check.
Reviewed By: lebedev.ri, alexfh
Differential Revision: https://reviews.llvm.org/D98070
Tim Keith [Mon, 12 Apr 2021 16:40:51 +0000 (09:40 -0700)]
[flang] Fix narrowing warning on macos
With clang 11 on macos we were getting this warning:
```
flang/runtime/random.cpp:61:30: error: non-constant-expression cannot be narrowed from type 'unsigned long long' to 'runtime::GeneratedWord' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]
GeneratedWord word{(generator() - generator.min()) & rangeMask};
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
flang/runtime/random.cpp:99:5: note: in instantiation of function template specialization 'runtime::Generate<double, 53>' requested here
Generate<CppTypeFor<TypeCategory::Real, 8>, 53>(harvest);
^
```
Changing the type of `rangeMask` fixes it.
Differential Revision: https://reviews.llvm.org/D100320
Artem Belevich [Thu, 8 Apr 2021 20:12:10 +0000 (13:12 -0700)]
Allow applying attributes to subset of allowed subjects.
Differential Revision: https://reviews.llvm.org/D100136
Yuanfang Chen [Thu, 1 Apr 2021 07:10:43 +0000 (00:10 -0700)]
[InstCombine] when calling conventions are compatible, don't convert the call to undef idiom
D24453 enabled libcalls simplication for ARM PCS. This may cause
caller/callee calling conventions mismatch in some situations such as
LTO. This patch makes instcombine aware that the compatible calling
conventions differences are benign (not emitting undef idom).
Differential Revision: https://reviews.llvm.org/D99773
Arthur O'Dwyer [Mon, 5 Apr 2021 18:56:03 +0000 (14:56 -0400)]
[libc++] Implement D2351R0 "Mark all library static cast wrappers as [[nodiscard]]"
These [[nodiscard]] annotations are added as a conforming extension;
it's unclear whether the paper will actually be adopted and make them
mandatory, but they do seem like good ideas regardless.
https://isocpp.org/files/papers/D2351R0.pdf
This patch implements the paper's effect on:
- std::to_integer, std::to_underlying
- std::forward, std::move, std::move_if_noexcept
- std::as_const
- std::identity
The paper also affects (but libc++ does not yet have an implementation of):
- std::bit_cast
Differential Revision: https://reviews.llvm.org/D99895
Arthur O'Dwyer [Sun, 11 Apr 2021 22:38:24 +0000 (18:38 -0400)]
[libc++] [test] Detect an improperly noexcept'ed __decay_copy.
`__decay_copy` is used by `std::thread`'s constructor to copy its arguments
into the new thread. If `__decay_copy` claims to be noexcept, but then
copying the argument does actually throw, we'd call std::terminate instead
of passing this test. (And I've verified that adding an unconditional `noexcept`
to `__decay_copy` does indeed fail this test.)
Differential Revision: https://reviews.llvm.org/D100277
Sanjay Patel [Mon, 12 Apr 2021 16:20:32 +0000 (12:20 -0400)]
[PassManager][PhaseOrdering] lower expects before running simplifyCFG
If we run passes before lowering llvm.expect intrinsics to metadata,
then those passes have no way to act on the hints provided by llvm.expect.
SimplifyCFG is the known offender, and we made it smarter about profile
metadata in D98898.
In the motivating example from https://llvm.org/PR49336 , this means we
were ignoring the recommended method for a programmer to tell the compiler
that a compare+branch is expensive. This change appears to solve that case -
the metadata survives to the backend, the compare order is as expected in IR,
and the backend does not do anything to reverse it.
We make the same change to the old pass manager to keep things synchronized.
Differential Revision: https://reviews.llvm.org/D100213
David Green [Mon, 12 Apr 2021 16:23:02 +0000 (17:23 +0100)]
[ARM] Add a number of intrinsics for MVE lane interleaving
Add a number of intrinsics which natively lower to MVE operations to the
lane interleaving pass, allowing it to efficiently interleave the lanes
of chucks of operations containing these intrinsics.
Differential Revision: https://reviews.llvm.org/D97293
Stephen Tozer [Thu, 11 Mar 2021 15:01:37 +0000 (15:01 +0000)]
Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
The causes of the previous build errors have been fixed in revisions
aa3e78a59fdf3b211be72f1b3221af831665e67d, and
140757bfaaa00110a92d2247a910c847e6e3bcc8
This reverts commit
f40976bd01032f4905dde361e709166704581077.
Louis Dionne [Fri, 9 Apr 2021 15:41:28 +0000 (11:41 -0400)]
[libc++] Divorce the std Lit feature from the -std=XXX compiler flag
After this patch, we can use `--param std=c++20` even if the compiler only
supports -std=c++2a. The test suite will handle that for us. The only Lit
feature that isn't fully baked will always be the "in development" one,
since we don't know exactly what year the standard will be ratified in.
This is another take on https://reviews.llvm.org/D99789.
Differential Revision: https://reviews.llvm.org/D100210
LLVM GN Syncbot [Mon, 12 Apr 2021 15:51:13 +0000 (15:51 +0000)]
[gn build] Port
6a1ac88fc19a
LLVM GN Syncbot [Mon, 12 Apr 2021 15:51:12 +0000 (15:51 +0000)]
[gn build] Port
26beecfe470b
LLVM GN Syncbot [Mon, 12 Apr 2021 15:51:11 +0000 (15:51 +0000)]
[gn build] Port
0b439e4cc9db
Louis Dionne [Mon, 12 Apr 2021 15:49:43 +0000 (11:49 -0400)]
[libc++] NFC: Remove duplicate synopsis from <__string>
Louis Dionne [Fri, 9 Apr 2021 16:58:00 +0000 (12:58 -0400)]
[libc++] Split std::get_temporary_buffer out of <memory>
Differential Revision: https://reviews.llvm.org/D100216
Louis Dionne [Fri, 9 Apr 2021 16:48:34 +0000 (12:48 -0400)]
[libc++] Split std::allocator out of <memory>
Differential Revision: https://reviews.llvm.org/D100216
Louis Dionne [Fri, 9 Apr 2021 16:44:26 +0000 (12:44 -0400)]
[libc++] Split auto_ptr out of <memory>
Differential Revision: https://reviews.llvm.org/D100216
Kristof Beyls [Mon, 12 Apr 2021 15:07:02 +0000 (17:07 +0200)]
[docs] Add Windows/COFF call info
Simon Pilgrim [Mon, 12 Apr 2021 14:31:48 +0000 (15:31 +0100)]
[InstCombine] Regenerate select-ctlz-to-cttz.ll tests
Correctly test !range metadata
Simon Pilgrim [Mon, 12 Apr 2021 13:56:10 +0000 (14:56 +0100)]
[X86] Fold cmpeq/ne(trunc(logic(x)),0) --> cmpeq/ne(logic(x),0)
Fixes the issues noted in PR48768, where the and/or/xor instruction had been promoted to avoid i8/i16 partial-dependencies, but the test against zero had not.
We can almost certainly relax this fold to work for any truncation, although it breaks a number of existing folds (notable movmsk folds which tend to rely on the truncate to determine the demanded bits/elts in the source vector).
There is a reverse combine in TargetLowering.SimplifySetCC so we must wait until after legalization before attempting this.
Daniel Kiss [Mon, 12 Apr 2021 15:02:16 +0000 (17:02 +0200)]
[compiler-rt][aarch64] Add PAC-RET/BTI support to HWASAN.
Support for -mbranch-protection.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D100143
Kadir Cetinkaya [Thu, 8 Apr 2021 13:50:50 +0000 (15:50 +0200)]
[clangd] Provide a way to disable external index
Users can reset any external index set by previous fragments by
putting a `None` for the external block, e.g:
```
Index:
External: None
```
Differential Revision: https://reviews.llvm.org/D100106
Wang, Pengfei [Mon, 12 Apr 2021 14:08:27 +0000 (22:08 +0800)]
[X86][AMX] Hoist ldtilecfg
The previous code calculated the first ldtilecfg by dominating all AMX registers' def. This may result in the ldtilecfg being inserted into a loop.
This patch try to calculate the nearest point where all shapes of AMX registers are reachable.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D99010
David Green [Mon, 12 Apr 2021 14:28:13 +0000 (15:28 +0100)]
[ARM] Add FP handling for MVE lane interleaving
FP16 to FP32 converts can be handled in MVE lane interleaving, much like
the sext/zext lowering we do. This expands the pass with fpext and
fptrunc handling, and basic fp operations allowing more efficient
lowering of fp vectors.
Differential Revision: https://reviews.llvm.org/D97292
Nathan James [Mon, 12 Apr 2021 14:13:59 +0000 (15:13 +0100)]
[NFC] Remove redundant string copy
Malhar Jajoo [Mon, 12 Apr 2021 13:46:23 +0000 (14:46 +0100)]
[ARM] Updates to arm-block-placement pass
The patch makes two updates to the arm-block-placement pass:
- Handle arbitrarily nested loops
- Extends the search (for t2WhileLoopStartLR) to the predecessor of the
preHeader.
Differential Revision: https://reviews.llvm.org/D99649
Paul C. Anagnostopoulos [Fri, 9 Apr 2021 15:59:18 +0000 (11:59 -0400)]
[TableGen] Fix bug in recent change to ListInit::convertInitListSlice()
Differential Revision: https://reviews.llvm.org/D100247
Tobias Gysi [Mon, 12 Apr 2021 13:13:27 +0000 (13:13 +0000)]
[mlir][linalg] adding operation to access the iteration index of enclosing linalg ops.
The `linalg.index` operation provides access to the iteration indexes of immediately enclosing linalg operations. It takes a dimension `dim` attribute and returns the iteration index in the given dimension. Having `linalg.index` allows us to unify `linalg.generic` and `linalg.indexed_generic` and also enables index access in named operations.
Differential Revision: https://reviews.llvm.org/D100292
Andrew Savonichev [Mon, 12 Apr 2021 13:28:49 +0000 (16:28 +0300)]
Revert "[AArch64] Add Machine InstCombiner patterns for FMUL indexed variant"
This reverts commit
cca9b5985c0c7e3c34da7f2db7cc8e7e707b0e2e.
Buildbot reported an error for CodeGen/AArch64/machine-combiner-fmul-dup.mir:
*** Bad machine code: Virtual register killed in block, but needed live out. ***
- function: indexed_2s
- basic block: %bb.0 entry (0x640fee8)
Virtual register %7 is used after the block.
*** Bad machine code: Virtual register defs don't dominate all uses. ***
- function: indexed_2s
- v. register: %7
LLVM ERROR: Found 2 machine code errors.
Andrew Savonichev [Wed, 31 Mar 2021 12:25:27 +0000 (15:25 +0300)]
[AArch64] Add Machine InstCombiner patterns for FMUL indexed variant
This patch adds DUP+FMUL => FMUL_indexed pattern to InstCombiner.
FMUL_indexed is normally selected during instruction selection, but it
does not work in cases when VDUP and VMUL are in different basic
blocks.
Differential Revision: https://reviews.llvm.org/D99662
Raphael Isemann [Mon, 12 Apr 2021 12:40:58 +0000 (14:40 +0200)]
[lldb] Delete dead StackFrameList::Merge
That code is unused since it's check-in in 2010 (and I believe it would leak
memory when called as it releases the passed unique_ptr), so let's delete it.
Reviewed By: vsk
Differential Revision: https://reviews.llvm.org/D100212
Raphael Isemann [Mon, 12 Apr 2021 12:32:38 +0000 (14:32 +0200)]
[lldb] Don't recursively load types of static member variables in the DWARF AST parser
When LLDB's DWARF parser is parsing the member DIEs of a struct/class it
currently fully resolves the types of static member variables in a class before
adding the respective `VarDecl` to the record.
For record types fully resolving the type will also parse the member DIEs of the
respective class. The other way of resolving is just 'forward' resolving the type
which will try to load only the minimum amount of information about the type
(for records that would only be the name/kind of the type). Usually we always
resolve types on-demand so it's rarely useful to speculatively fully resolve
them on the first use.
This patch changes makes that we only 'forward' resolve the types of static
members. This solves the fact that LLDB unnecessarily loads debug information
to parse the type if it's maybe not needed later and it also avoids a crash where
the parsed type might in turn reference the surrounding class that is currently
being parsed.
The new test case demonstrates the crash that might happen. The crash happens
with the following steps:
1. We parse class `ToLayout` and it's members.
2. We parse the static class member and fully resolve its type
(`DependsOnParam2<ToLayout>`).
3. That type has a non-static class member `DependsOnParam1<ToLayout>` for which
LLDB will try to calculate the size.
4. The layout (and size)`DependsOnParam1<ToLayout>` turns depends on the
`ToLayout` size/layout.
5. Clang will calculate the record layout/size for `ToLayout` even though we are
currently parsing it and it's missing it's non-static member.
The created is missing the offset for the yet unparsed non-static member. If we
later try to get the offset we end up hitting different asserts. Most common is
the one in `TypeSystemClang::DumpValue` where it checks that the record layout
has offsets for the current FieldDecl.
```
assert(field_idx < record_layout.getFieldCount());
```
Fixed rdar://
67910011
Reviewed By: shafik
Differential Revision: https://reviews.llvm.org/D100180
Alexey Lapshin [Mon, 12 Apr 2021 11:27:14 +0000 (14:27 +0300)]
Fix chrome os failure after
021de7cf80268091cf13485a538b611b37d0b33e.
chrome os build failed after D98511:
https://bugs.chromium.org/p/chromium/issues/detail?id=1197970
This patch fixes permission issue appeared after D98511.
Sebastian Neubauer [Mon, 12 Apr 2021 12:20:03 +0000 (14:20 +0200)]
[AMDGPU] Kill temporary register after restoring
Not a correctness issue, but the temporary register is not used
afterwards and should be dead.
Differential Revision: https://reviews.llvm.org/D100295
Bradley Smith [Tue, 30 Mar 2021 12:14:36 +0000 (13:14 +0100)]
[AArch64][SVE] Remove redundant PTEST of MATCH/NMATCH results
Co-authored-by: Paul Walker <paul.walker@arm.com>
Differential Revision: https://reviews.llvm.org/D99584
Stephen Tozer [Thu, 8 Apr 2021 16:56:25 +0000 (17:56 +0100)]
Reapply "[DebugInfo] Correctly track SDNode dependencies for list debug values"
Fixed memory leak error by using BumpAllocator for SDDbgValue arrays.
This reverts commit
1b589172bd19b83e8137185ed11f50bba06e8766.
Esme-Yi [Mon, 12 Apr 2021 11:05:55 +0000 (11:05 +0000)]
Reland [DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions.""
This reverts commit
c965e14a12955355ead9ea093989a8fcbf03a8c1.
Esme-Yi [Mon, 12 Apr 2021 10:36:46 +0000 (10:36 +0000)]
Revert "[DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions."
This reverts commit
62fa9b9388aa114e3b1a58bbdbcd966ae3492ba5.
Tobias Gysi [Mon, 12 Apr 2021 09:35:26 +0000 (09:35 +0000)]
[mlir][linalg] fixing hard-coded variable names in a test (NFC)
The patch fixes hard-coded variable names in the vector-to-loops test.
Dmitry Preobrazhensky [Mon, 12 Apr 2021 10:30:29 +0000 (13:30 +0300)]
[AMDGPU][MC][NFC] Removed extra spaces
Fixed bugs 49646, 49647.
Differential Revision: https://reviews.llvm.org/D100173
Simon Pilgrim [Mon, 12 Apr 2021 10:20:32 +0000 (11:20 +0100)]
[IR] Fix Wdocumentation warning. NFCI.
Sander de Smalen [Mon, 12 Apr 2021 08:49:00 +0000 (09:49 +0100)]
[AArch64] ACLE: Fix issue for mismatching enum types with builtins.
This patch fixes an issue with the SVE prefetch and qinc/qdec intrinsics
that take an `enum` argument, but where the builtin prototype encodes
these as `int`. Some code in SemaDecl found the mismatch and chose
to forget about the builtin altogether, which meant that any future
code using that builtin would fail. The code that forgets about the
builtin was actually obsolete after D77491 and should have been removed.
This patch now removes that code.
This patch also fixes another issue with the SVE prefetch intrinsic
when built with C++, where the builtin didn't accept the correct
pointer type, which should be `const void *`.
Reviewed By: tambre
Differential Revision: https://reviews.llvm.org/D100046
Sebastian Neubauer [Mon, 12 Apr 2021 10:10:32 +0000 (12:10 +0200)]
[AMDGPU] Fix ubsan error
The RegScavenger can be null sometimes, so a pointer is needed.
Fixes UBSan error introduced in
f9a8c6a0e505.
Muhammad Omair Javaid [Mon, 12 Apr 2021 10:10:41 +0000 (15:10 +0500)]
[LLDB] Fix buildbots breakage due to TestGuessLanguage.py
Fix LLDB buidbot breakage due to D99250.
Differential Revision: https://reviews.llvm.org/D99250
Sebastian Neubauer [Mon, 12 Apr 2021 09:47:16 +0000 (11:47 +0200)]
[AMDGPU] Fix saving fp and bp
Spilling the fp or bp to scratch could overwrite VGPRs of inactive
lanes. Fix that by using only the active lanes of the scavenged VGPR.
This builds on the assumptions that
1. a function is never called with exec=0
2. lanes do not die in a function, i.e. exec!=0 in the function epilog
3. no new lanes are active when exiting the function, i.e. exec in the
epilog is a subset of exec in the prolog.
Differential Revision: https://reviews.llvm.org/D96869
Sebastian Neubauer [Mon, 12 Apr 2021 09:51:28 +0000 (11:51 +0200)]
[AMDGPU] Autogenerate test. NFC
Sebastian Neubauer [Mon, 12 Apr 2021 09:19:04 +0000 (11:19 +0200)]
[AMDGPU] Unify spill code
Instead of reimplementing spilling in prolog and epilog, reuse
buildSpillLoadStore.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D99269
Sebastian Neubauer [Mon, 12 Apr 2021 08:25:54 +0000 (10:25 +0200)]
[AMDGPU] Save VGPR of whole wave when spilling
Spilling SGPRs to scratch uses a temporary VGPR. LLVM currently cannot
determine if a VGPR is used in other lanes or not, so we need to save
all lanes of the VGPR. We even need to save the VGPR if it is marked as
dead.
The generated code depends on two things:
- Can we scavenge an SGPR to save EXEC?
- And can we scavenge a VGPR?
If we can scavenge an SGPR, we
- save EXEC into the SGPR
- set the needed lane mask
- save the temporary VGPR
- write the spilled SGPR into VGPR lanes
- save the VGPR again to the target stack slot
- restore the VGPR
- restore EXEC
If we were not able to scavenge an SGPR, we do the same operations, but
everytime the temporary VGPR is written to memory, we
- write VGPR to memory
- flip exec (s_not exec, exec)
- write VGPR again (previously inactive lanes)
Surprisingly often, we are able to scavenge an SGPR, even though we are
at the brink of running out of SGPRs.
Scavenging a VGPR does not have a great effect (saves three instructions
if no SGPR was scavenged), but we need to know if the VGPR we use is
live before or not, otherwise the machine verifier complains.
Differential Revision: https://reviews.llvm.org/D96336
Sven van Haastregt [Mon, 12 Apr 2021 08:30:06 +0000 (09:30 +0100)]
[OpenCL] Accept .rgba in OpenCL 3.0
The .rgba vector component accessors are supported in OpenCL C 3.0.
Previously, the diagnostic would check `OpenCLVersion` for version 2.2
(value 220) and report those accessors are an OpenCL 2.2 feature.
However, there is no "OpenCL C version 2.2", so change the check and
diagnostic text to 3.0 only.
A spurious `OpenCLVersion` argument was passed into the diagnostic;
remove that.
Differential Revision: https://reviews.llvm.org/D99969
Stelios Ioannou [Fri, 9 Apr 2021 16:36:20 +0000 (17:36 +0100)]
[AArch64] Adds memory operands for indexed loads.
This patch adds the memory operands for indexed loads so
that certain optimizations can take place.
Differential Revision: https://reviews.llvm.org/D100215/
Change-Id: I539fcf046ca4ad1e7df1d893f57d751419d8364d
Esme-Yi [Mon, 12 Apr 2021 07:42:54 +0000 (07:42 +0000)]
[DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions.
Summary: The tags DW_LANG_C_plus_plus_14 and DW_LANG_C_plus_plus_11, introduced in Dwarf-5, are unexpected in previous versions. Fixing the mismathing doesn't have any drawbacks for any other debuggers, but helps dbx.
Reviewed By: aprantl, shchenz
Differential Revision: https://reviews.llvm.org/D99250
Balázs Kéri [Mon, 12 Apr 2021 06:52:40 +0000 (08:52 +0200)]
[clang][AST] Handle overload callee type in CallExpr::getCallReturnType.
The function did not handle every case. In some cases this
caused assertion failure.
After the fix the function returns DependentTy if the exact
return type can not be determined.
It seems that clang itself does not call the function in the
affected cases but some checker or other code may call it.
Reviewed By: hokein
Differential Revision: https://reviews.llvm.org/D95244
Zhang Qing Shan [Mon, 12 Apr 2021 06:55:03 +0000 (14:55 +0800)]
[NFC][Debug] Fix unnecessary deep-copy for vector to save compiling time
We saw some big compiling time impact after enabling the debug entry value
feature for X86 platform(D73534). Compiling time goes from 900s->1600s with
our testcase. It is caused by allocating/freeing the memory busily.
'using FwdRegWorklist = MapVector<unsigned, SmallVector<FwdRegParamInfo, 2>>;'
The value for this map is vector, and we miss the reference when access the
element. The same happens for `auto CalleesMap = MF->getCallSitesInfo();` which is a DenseMap.
Reviewed by: djtodoro, flychen50
Differential Revision: https://reviews.llvm.org/D100162
Mikael Holmen [Mon, 12 Apr 2021 06:23:47 +0000 (08:23 +0200)]
[libtooling][clang-tidy] Fix compiler warnings in testcase [NFC]
Without the fix we get:
06:31:09 In file included from ../../clang-tools-extra/unittests/clang-tidy/ClangTidyDiagnosticConsumerTest.cpp:3:
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1392:11: error: comparison of integers of different signs: 'const int' and 'const unsigned int' [-Werror,-Wsign-compare]
06:31:09 if (lhs == rhs) {
06:31:09 ~~~ ^ ~~~
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1421:12: note: in instantiation of function template specialization 'testing::internal::CmpHelperEQ<int, unsigned int>' requested here
06:31:09 return CmpHelperEQ(lhs_expression, rhs_expression, lhs, rhs);
06:31:09 ^
06:31:09 ../../clang-tools-extra/unittests/clang-tidy/ClangTidyDiagnosticConsumerTest.cpp:60:3: note: in instantiation of function template specialization 'testing::internal::EqHelper<false>::Compare<int, unsigned int>' requested here
06:31:09 EXPECT_EQ(4, Errors[0].Message.FileOffset);
06:31:09 ^
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1924:63: note: expanded from macro 'EXPECT_EQ'
06:31:09 EqHelper<GTEST_IS_NULL_LITERAL_(val1)>::Compare, \
06:31:09 ^
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1392:11: error: comparison of integers of different signs: 'const int' and 'const unsigned long' [-Werror,-Wsign-compare]
06:31:09 if (lhs == rhs) {
06:31:09 ~~~ ^ ~~~
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1421:12: note: in instantiation of function template specialization 'testing::internal::CmpHelperEQ<int, unsigned long>' requested here
06:31:09 return CmpHelperEQ(lhs_expression, rhs_expression, lhs, rhs);
06:31:09 ^
06:31:09 ../../clang-tools-extra/unittests/clang-tidy/ClangTidyDiagnosticConsumerTest.cpp:64:3: note: in instantiation of function template specialization 'testing::internal::EqHelper<false>::Compare<int, unsigned long>' requested here
06:31:09 EXPECT_EQ(1, Errors[0].Message.Ranges.size());
06:31:09 ^
06:31:09 ../utils/unittest/googletest/include/gtest/gtest.h:1924:63: note: expanded from macro 'EXPECT_EQ'
06:31:09 EqHelper<GTEST_IS_NULL_LITERAL_(val1)>::Compare, \
06:31:09 ^
06:31:09 2 errors generated.
Jim Lin [Mon, 12 Apr 2021 06:10:52 +0000 (14:10 +0800)]
[NFC] [Clang]: fix spelling mistake in assert message
Reviewed By: Jim
Differential Revision: https://reviews.llvm.org/D71541
Jim Lin [Mon, 12 Apr 2021 06:04:38 +0000 (14:04 +0800)]
fix typo in a CMake SANITIZER_CAN_USE_CXXABI variable initial definition
The current variable name isn't used anywhere else, which indicates it's
a typo. Let's fix it before someone copy+pastes it somewhere else.
Reviewed By: Jim
Differential Revision: https://reviews.llvm.org/D39157
Bing1 Yu [Wed, 24 Mar 2021 08:53:12 +0000 (16:53 +0800)]
[X86] Pass to transform tdpbsud&tdpbusd&tdpbuud intrinsics to scalar operation
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D99244
Evgeniy Brevnov [Fri, 9 Apr 2021 08:41:27 +0000 (15:41 +0700)]
[NARY] Don't optimize min/max if there are side uses
Say we have
%1=min(%a,%b)
%2=min(%b,%c)
%3=min(%2,%a)
The optimization will try to reassociate the later one so that we can rewrite it to %3=min(%1, %c) and remove %2.
But if %2 has another uses outside of %3 then we can't remove %2 and end up with:
%1=min(%a,%b)
%2=min(%b,%c)
%3=min(%1, %c)
This doesn't harm by itself except it is not profitable and changes IR for no good reason.
What is bad it triggers next iteration which finds out that optimization is applicable to %2 and %3 and generates:
%1=min(%a,%b)
%2=min(%b,%c)
%3=min(%1,%c)
%4=min(%2,%a)
and so on...
The solution is to prevent optimization in the first place if intermediate result (%2) has side uses and
known to be not removed.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D100170
Freddy Ye [Mon, 12 Apr 2021 02:36:08 +0000 (10:36 +0800)]
[X86] Remove FeatureCLWB from FeaturesICLClient
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100279
Jez Ng [Mon, 12 Apr 2021 03:23:37 +0000 (23:23 -0400)]
[lld-macho][nfc] Convert tabs to spaces
Chen Zheng [Mon, 12 Apr 2021 03:13:17 +0000 (23:13 -0400)]
[Debug-Info] make fortran CHARACTER(1) type as valid unsigned type
This resolves https://bugs.llvm.org/show_bug.cgi?id=49872
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D100015
yifeng.dongyifeng [Mon, 12 Apr 2021 02:59:22 +0000 (10:59 +0800)]
[Clang][Coroutine][DebugInfo] In c++ coroutine, clang will emit different debug info variables for parameters and move-parameters.
The first one is the real parameters of the coroutine function, the
other one just for copying parameters to the coroutine frame.
Considering the following c++ code:
```
struct coro {
...
};
coro foo(struct test & t) {
...
co_await suspend_always();
...
co_await suspend_always();
...
co_await suspend_always();
}
int main(int argc, char *argv[]) {
auto c = foo(...);
c.handle.resume();
...
}
```
Function foo is the standard coroutine function, and it has only
one parameter named t (ignoring this at first),
when we use the llvm code to compile this function, we can get the
following ir:
```
!2921 = distinct !DISubprogram(name: "foo", linkageName:
"_ZN6Object3fooE4test", scope: !2211, file: !45, li\
ne: 48, type: !2329, scopeLine: 48, flags: DIFlagPrototyped |
DIFlagAllCallsDescribed, spFlags: DISPFlagDefi\
nition | DISPFlagOptimized, unit: !44, declaration: !2328,
retainedNodes: !2922)
!2924 = !DILocalVariable(name: "t", arg: 2, scope: !2921, file: !45,
line: 48, type: !838)
...
!2926 = !DILocalVariable(name: "t", scope: !2921, type: !838, flags:
DIFlagArtificial)
```
We can find there are two `the same` DIVariable named t in the same
dwarf scope for foo.resume.
And when we try to use llvm-dwarfdump to dump the dwarf info of this
elf, we get the following output:
```
0x00006684: DW_TAG_subprogram
DW_AT_low_pc (0x00000000004013a0)
DW_AT_high_pc (0x00000000004013a8)
DW_AT_frame_base (DW_OP_reg7 RSP)
DW_AT_object_pointer (0x0000669c)
DW_AT_GNU_all_call_sites (true)
DW_AT_specification (0x00005b5c "_ZN6Object3fooE4test")
0x000066a5: DW_TAG_formal_parameter
DW_AT_name ("t")
DW_AT_decl_file ("/disk1/yifeng.dongyifeng/my_code/llvm/build/bin/coro-debug-1.cpp")
DW_AT_decl_line (48)
DW_AT_type (0x00004146 "test")
0x000066ba: DW_TAG_variable
DW_AT_name ("t")
DW_AT_type (0x00004146 "test")
DW_AT_artificial (true)
```
The elf also has two 't' in the same scope.
But unluckily, it might let the debugger
confused. And failed to print parameters for O0 or above.
This patch will make coroutine parameters and move
parameters use the same DIVar and try to fix the problems
that I mentioned before.
Test Plan: check-clang
Reviewed By: aprantl, jmorse
Differential Revision: https://reviews.llvm.org/D97533
Qiu Chaofan [Mon, 12 Apr 2021 02:31:07 +0000 (10:31 +0800)]
[PowerPC] Lower f128 SETCC/SELECT_CC as libcall if p9vector disabled
XSCMPUQP is not available for pre-P9 subtargets. This patch will lower
them into libcall for correct behavior on power7/power8.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D92083
Zakk Chen [Thu, 8 Apr 2021 17:15:09 +0000 (10:15 -0700)]
[RISCV][Clang] Add some RVV Permutation intrinsic functions.
Support the following instructions.
1. Vector Slide Instructions
2. Vector Register Gather Instructions
3. Vector Compress Instruction
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100127
Zakk Chen [Thu, 8 Apr 2021 15:28:15 +0000 (08:28 -0700)]
[RISCV][Clang] Add all RVV Mask intrinsic functions.
1. Redefine vpopc and vfirst IR intrinsic so it could adapt on
clang tablegen generator which always appends a type for vl
in IntrinsicType of clang codegen.
2. Remove `c` type transformer and add `u` and `l` for unsigned long
and long type.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100120
Zakk Chen [Sun, 11 Apr 2021 14:25:06 +0000 (07:25 -0700)]
[RISCV][Clang] Add more RVV load/store intrinsic functions.
Support the following instructions.
1. Mask load and store
2. Vector Strided Instructions
3. Vector Indexed Store Instructions
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99965
Zakk Chen [Tue, 6 Apr 2021 14:57:41 +0000 (07:57 -0700)]
[RISCV][Clang] Add all RVV Reduction intrinsic functions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99964
Zakk Chen [Tue, 6 Apr 2021 14:23:30 +0000 (07:23 -0700)]
[RISCV][Clang] Add RVV merge intrinsic functions.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99963
Zakk Chen [Thu, 1 Apr 2021 16:21:11 +0000 (09:21 -0700)]
[RISCV][Clang] Add RVV Type-Convert intrinsic functions.
Fix extension macro condition.
Support below instructions:
1. Single-Width Floating-Point/Integer Type-Convert Instructions
2. Widening Floating-Point/Integer Type-Convert Instructions
3. Narrowing Floating-Point/Integer Type-Convert Instructions
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D99742
Zakk Chen [Thu, 8 Apr 2021 15:21:06 +0000 (08:21 -0700)]
[RISCV][Clang] Add some RVV Floating-Point intrinsic functions.
Support vfclass, vfmerge, vfrec7, vfrsqrt7, vfsqrt instructions.
Reviewed By: craig.topper
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99741
Zakk Chen [Thu, 8 Apr 2021 15:09:42 +0000 (08:09 -0700)]
[RISCV][Clang] Add more RVV Floating-Point intrinsic functions.
Support below instructions.
1. Vector Widening Floating-Point Add/Subtract Instructions
2. Vector Widening Floating-Point Multiply
3. Vector Single-Width Floating-Point Fused Multiply-Add Instructions
4. Vector Widening Floating-Point Fused Multiply-Add Instructions
5. Vector Floating-Point Compare Instructions
Reviewed By: craig.topper, HsiangKai
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99669
Zakk Chen [Thu, 8 Apr 2021 14:29:59 +0000 (07:29 -0700)]
[RISCV][Clang] Add some RVV Floating-Point intrinsic functions.
Support the following instructions which have the same class.
1. Vector Single-Width Floating-Point Subtract Instructions
2. Vector Single-Width Floating-Point Multiply/Divide Instructions
3. Vector Floating-Point MIN/MAX Instructions
4. Vector Floating-Point Sign-Injection Instructions
Reviewed By: craig.topper
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99668
Zakk Chen [Tue, 6 Apr 2021 10:26:44 +0000 (03:26 -0700)]
[RISCV][Clang] Add RVV Widening Integer Add/Subtract intrinsic functions.
Reviewed By: craig.topper
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Differential Revision: https://reviews.llvm.org/D99526
Jim Lin [Mon, 12 Apr 2021 02:15:35 +0000 (10:15 +0800)]
[RISCV][NFC] Remove unneeded explict XLenVT type on codegen patterns
Customized SDNode has been specified the explict XLenVT type.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D100190
Craig Topper [Mon, 12 Apr 2021 00:54:20 +0000 (17:54 -0700)]
[RISCV] Update computeKnownBitsForTargetNode to treat READ_VLENB as being 16 byte aligned.
According to the 0.10 spec, VLEN is at least 128 bits and is a
power of 2.
Craig Topper [Sun, 11 Apr 2021 18:57:52 +0000 (11:57 -0700)]
[RISCV] Use SLLI/SRLI instead of SLLIW/SRLIW for (srl (and X, 0xffff), C) custom isel on RV64.
We don't need the sign extending behavior here and SLLI/SRLI
are able to compress to C.SLLI/C.SRLI.
Roman Lebedev [Sun, 11 Apr 2021 20:44:24 +0000 (23:44 +0300)]
[NFCI][SimplifyCFG] PerformValueComparisonIntoPredecessorFolding(): improve Dominator Tree updating
Same as with previous patches.
Roman Lebedev [Sun, 11 Apr 2021 20:25:40 +0000 (23:25 +0300)]
[NFCI][SimplifyCFG] mergeEmptyReturnBlocks(): improve Dominator Tree updating
Same as with previous patches.
Roman Lebedev [Sun, 11 Apr 2021 20:17:48 +0000 (23:17 +0300)]
[NFCI][Local] MergeBasicBlockIntoOnlyPred(): improve Dominator Tree updating
Same as with TryToSimplifyUncondBranchFromEmptyBlock()/MergeBlockIntoPredecessor() patch.
Roman Lebedev [Sun, 11 Apr 2021 20:08:19 +0000 (23:08 +0300)]
[NFCI][BasicBlockUtils] MergeBlockIntoPredecessor(): improve Dominator Tree updating
Same as with TryToSimplifyUncondBranchFromEmptyBlock() patch.
Roman Lebedev [Sun, 11 Apr 2021 19:39:22 +0000 (22:39 +0300)]
[NFCI][Local] TryToSimplifyUncondBranchFromEmptyBlock(): improve Dominator Tree updating
First, we don't need vector-ness for the predecessor lists.
Secondly, like elsewhere, do insertions before deletions.
Lastly, the check that we actually need to insert an edge,
that it doesn't exist already, is backwards. Instead of
looking at successors of every single 'PredOfBB',
just always look at predecessors of the 'Succ'.
The result is always the same, but we avoid *really* inefficient code.
Roman Lebedev [Sun, 11 Apr 2021 19:02:57 +0000 (22:02 +0300)]
[NFCI][DomTreeUpdater] applyUpdates(): reserve space for updates first
While, indeed, we may end up pushing less updates that we'd reserve space
for, self-dominating updates aren't often enough for that to matter.
But this should matter for normal updates.
Florian Hahn [Sat, 10 Apr 2021 14:23:47 +0000 (15:23 +0100)]
[LoopUnroll] Add AArch64 test case with large vector ops.
Add test case to illustrate over-eager unrolling on AArch64, due to the
cost-model not estimating the size of vector loads/stores accurately.
Florian Hahn [Sun, 11 Apr 2021 15:51:37 +0000 (16:51 +0100)]
[VectorCombine] Add tests for load/extract scalarization.
Add tests where scalarizing a vector load + extract is profitable.
Simon Pilgrim [Sun, 11 Apr 2021 19:06:53 +0000 (20:06 +0100)]
[X86][AVX512] Fold not(kmov(x)) -> kmov(not(x)) and not(widen_subvector(x)) -> widen_subvector(not(x))
Improve AVX512 mask inversion, rG38c799bce801 exposed some missing opportunities to move scalar not() back onto the boolvector types for folding with setcc etc.
Thomas Lively [Sun, 11 Apr 2021 18:13:16 +0000 (11:13 -0700)]
[WebAssembly] Update v128.any_true
In the final SIMD spec, there is only a single v128.any_true instruction, rather
than one for each lane interpretation because the semantics do not depend on the
lane interpretation.
Differential Revision: https://reviews.llvm.org/D100241
Simon Pilgrim [Sun, 11 Apr 2021 18:01:59 +0000 (19:01 +0100)]
[X86] combineXor - Pull out repeated getOperand() calls. NFCI.
Simon Pilgrim [Sun, 11 Apr 2021 17:41:51 +0000 (18:41 +0100)]
[X86] Fold cmpeq/ne(and(X,Y),Y) --> cmpeq/ne(and(~X,Y),0)
Followup to D100177, handle an similar (demorgan inverse style) case from PR47797 as well
The AVX512 test cases could be further improved if we folded not(iX bitcast(vXi1)) -> (iX bitcast(not(vXi1)))
Alive2: https://alive2.llvm.org/ce/z/AnA_-W
Craig Topper [Sun, 11 Apr 2021 17:19:43 +0000 (10:19 -0700)]
[RISCV] Drop earlyclobber constraint from vwadd(u).wx, vwsub(u).wx, vfwadd.wf and vfwsub.wf.
The first source has the same EEW as the destination and the other
source is a scalar so the overlap constraints don't apply to
the unmasked version.
For the masked version we have a constraint that the destination
can't be V0 so that covers the only overlap issue there.
Reviewed By: khchen
Differential Revision: https://reviews.llvm.org/D100217
Craig Topper [Sun, 11 Apr 2021 16:51:45 +0000 (09:51 -0700)]
[RISCV] Teach targetShrinkDemandedConstant to preserve (and X, 0xffff) when zext.h is supported.
Similar to what we do for zext.w.
Disable the (srl (and X, 0xffff), C) custom isel when zext.h is
available.
Craig Topper [Sun, 11 Apr 2021 16:40:13 +0000 (09:40 -0700)]
[RISCV] Add i8 and i16 srli and srai tests to Zbb/Zbp test files. NFC
These require the input to be zero or sign extended. If we have
sext.b, sext.h or zext.h instructions we can use them. Otherwise
we need to use a pair of shifts to accomplish the zero/sign extend
and the final shift.
We don't currently use zext.h when it is available.