Victor Huang [Thu, 22 Jul 2021 15:05:30 +0000 (10:05 -0500)]
[PowerPC] Add PowerPC "__stbcx" builtin and intrinsic for XL compatibility
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtin and intrinsic for "__stbcx".
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D106484
Anastasia Stulova [Thu, 22 Jul 2021 15:43:18 +0000 (16:43 +0100)]
[OpenCL][NFC] Refactors lang version check in test.
Fixed test to use predefined version marco instead
of passing extra macro in the command line.
Patch by Topotuna (Justas Janickas)!
Differential Revision: https://reviews.llvm.org/D106254
Alexey Bataev [Thu, 22 Jul 2021 15:11:49 +0000 (08:11 -0700)]
[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments.
Added missed arguments in
__tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime
functions calls.
Differential Revision: https://reviews.llvm.org/D106542
Nico Weber [Thu, 22 Jul 2021 14:31:39 +0000 (10:31 -0400)]
[lld/mac] Move handling of special undefineds later
treatUndefinedSymbol() was previously called before gatherInputSections()
and markLive() for these special symbols, but after them for normal
undefineds.
For PR50760, treatUndefinedSymbol() will have to potentially create
sections, so it's good to move treatUndefinedSymbol() for special
undefineds later, so that it can assume that gatherInputSections()
and markLive() has already been called always.
No intended behavior change, but part of PR50760 (and covered in
tests in the patch for the full feature).
Differential Revision: https://reviews.llvm.org/D106552
Alexey Bataev [Thu, 22 Jul 2021 15:06:03 +0000 (08:06 -0700)]
Revert "[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments."
This reverts commit
b455f7f22564a096c043b02fa159ab16669c121c to fix
buildbots.
Raphael Isemann [Thu, 22 Jul 2021 14:56:32 +0000 (16:56 +0200)]
[lldb] Remove a wrong assert in TestStructTypes that checks that empty structs in C always have size 0
D105471 fixes the way we assign sizes to empty structs in C mode. Instead of
just giving them a size 0, we instead use the size we get from DWARF if possible.
After landing D105471 the TestStructTypes test started failing on Windows. The
tests checked that the size of an empty C struct is 0 while the size LLDB now
reports is 4 bytes. It turns out that 4 bytes are the actual size Clang is using
for C structs with the MicrosoftRecordLayoutBuilder. The commit that introduced
that behaviour is
00a061dccc6671c96412d7b28ab2012963208579.
This patch removes that specific check from TestStructTypes. Note that D105471
added a series of tests that already cover this case (and the added checks
automatically adjust to whatever size the target compiler chooses for empty
structs).
Alexey Bataev [Wed, 21 Jul 2021 20:35:09 +0000 (13:35 -0700)]
[OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments.
Added missed arguments in
__tgt_target_teams_nowait_mapper/__tgt_target_nowait_mapper runtime
functions calls.
Differential Revision: https://reviews.llvm.org/D106542
Aaron En Ye Shi [Wed, 21 Jul 2021 17:21:04 +0000 (17:21 +0000)]
[HIP] Fix no matching constructor for init of shared_ptr and malloc
Allow standard header versions of malloc and free to be defined
before introducing the device versions.
Fixes: SWDEV-295901
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D106463
Jon Chesterfield [Thu, 22 Jul 2021 14:24:19 +0000 (15:24 +0100)]
[libomptarget][nfc] Improve static assert message in dlwrap
Revision of D102858. Raise dlwrap arity argument to template argument
so the correct value is given in the error message. E.g. '2 == 1' instead of
'2 == trait<>::nargs'.
Arity higher than it should be:
Before diff
```
$/plugins/cuda/dynamic_cuda/cuda.cpp:23:1: error:
static_assert failed due to requirement '2 == trait<cudaError_enum (*)(unsigned int)>::nargs'
"Arity Error"
DLWRAP_INTERNAL(cuInit, 2);
^~~~~~~~~~~~~~~~~~~~~~~~~~
...
$/include/dlwrap.h:166:3: note: expanded from macro
'DLWRAP_COMMON'
static_assert(ARITY == trait<decltype(&SYMBOL)>::nargs, "Arity Error"); \
```
After diff
In file included from $/plugins/cuda/dynamic_cuda/cuda.cpp:16:
```
$/include/dlwrap.h:131:3: error: static_assert failed due to
requirement '2UL == 1UL' "Arity Error"
static_assert(Requested == Required, "Arity Error");
^ ~~~~~~~~~~~~~~~~~~~~~
$/plugins/cuda/dynamic_cuda/cuda.cpp:23:1: note: in
instantiation of function template specialization 'dlwrap::verboseAssert<2UL, 1UL>' requested
here
DLWRAP_INTERNAL(cuInit, 2);
```
Arity lower than it should be:
Before diff
```
$/plugins/cuda/dynamic_cuda/cuda.cpp:131:10: error: no
matching function for call to 'dlwrap_cuInit'
return dlwrap_cuInit(X);
^~~~~~~~~~~~~
$/plugins/cuda/dynamic_cuda/cuda.cpp:23:1: note: candidate
function not viable: requires 0 arguments, but 1 was provided
DLWRAP_INTERNAL(cuInit, 0);
```
After diff
In file included from $/plugins/cuda/dynamic_cuda/cuda.cpp:16:
```
$/include/dlwrap.h:131:3: error: static_assert failed due to
requirement '0UL == 1UL' "Arity Error"
static_assert(Requested == Required, "Arity Error");
^ ~~~~~~~~~~~~~~~~~~~~~
$/plugins/cuda/dynamic_cuda/cuda.cpp:23:1: note: in
instantiation of function template specialization 'dlwrap::verboseAssert<0UL, 1UL>' requested
here
DLWRAP_INTERNAL(cuInit, 0);
```
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106543
Cullen Rhodes [Thu, 22 Jul 2021 13:46:19 +0000 (13:46 +0000)]
[AArch64][SME] Improve diagnostic for vector select register
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D106540
Melanie Blower [Thu, 22 Jul 2021 13:40:54 +0000 (09:40 -0400)]
Revert "[clang][fpenv][patch] Change clang option -ffp-model=precise to select ffp-contract=on"
This reverts commit
b9b696bba6702a31ac0995a494cd31c730ade5ec.
Buildbot failures see https://lab.llvm.org/buildbot#builders/118/builds/4138
and https://lab.llvm.org/buildbot#builders/110/builds/5112
Raphael Isemann [Thu, 22 Jul 2021 13:31:12 +0000 (15:31 +0200)]
[lldb] Fix TestCompletion by using SIGPIPE instead of SIGINT as test signal
The test I added in commit
078003482e90ff5c7ba047a3d3152f0b0c392b31 was using
SIGINT for testing the tab completion. The idea is to have a signal that only
has one possible completion and I ended up picking SIGIN -> SIGINT for the test.
However on non-Linux systems there is SIGINFO which is a valid completion for
`SIGIN' and so the test fails there.
This replaces SIGIN -> SIGINT with SIGPIP -> SIGPIPE completion which according
to LLDB's signal list in Host.cpp is the only valid completion.
Kazu Hirata [Thu, 22 Jul 2021 13:30:39 +0000 (06:30 -0700)]
[Transforms] Remove getOrCreateInitFunction (NFC)
The last use was removed on Jan 16, 2019 in commit
81101de5853b4ed64640220a086a67b16f36f153.
Joseph Huber [Thu, 22 Jul 2021 12:27:36 +0000 (08:27 -0400)]
[OpenMP] Fix warnings for uninitialized block counts
Summary:
Fixes some warning given for uninitialized block counts if the exection mode is
not recognized. This shouldn't happen in practice because the execution mode is
checked when it's read from the device.
Nico Weber [Thu, 22 Jul 2021 13:11:07 +0000 (09:11 -0400)]
[gn build] (manually) port
78bda894129 from 2012 because
924d62ca4a85 added it to check-llvm
Aaron Ballman [Thu, 22 Jul 2021 13:10:36 +0000 (09:10 -0400)]
Implement _ExtInt conversion rules
Clang implemented the _ExtInt datatype as a bit-precise integer type,
which was then proposed to WG14. WG14 has accepted the proposal
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2709.pdf), but Clang
requires some additional work as a result.
In the original Clang implementation, we elected to disallow implicit
conversions involving these types until after WG14 finalized the rules.
This patch implements the rules decided by WG14: no integer promotion
for bit-precise types, conversions prefer the larger of the two types
and in the event of a tie (say _ExtInt(32) and a 32-bit int), the
standard type wins.
There are more changes still needed to conform to N2709, but those will
be handled in follow-up patches.
Raphael Isemann [Thu, 22 Jul 2021 12:18:57 +0000 (14:18 +0200)]
[lldb][NFC] Allow range-based for loops over DWARFDIE's children
This patch adds the ability to get a DWARFDIE's children as an LLVM range.
This way we can use for range loops to iterate over them and we can use LLVM's
algorithms like `llvm::all_of` to query all children.
The implementation has to do some small shenanigans as the iterator needs to
store a DWARFDIE, but a DWARFDIE container is also a DWARFDIE so it can't return
the iterator by value. I just made the `children` getter a templated function to
avoid the cyclic dependency.
Reviewed By: #lldb, werat, JDevlieghere
Differential Revision: https://reviews.llvm.org/D103172
Med Ismail Bennani [Tue, 20 Jul 2021 13:00:54 +0000 (13:00 +0000)]
[lldb/Plugins] Add ScriptedProcess Process Plugin
This patch introduces Scripted Processes to lldb.
The goal, here, is to be able to attach in the debugger to fake processes
that are backed by script files (in Python, Lua, Swift, etc ...) and
inspect them statically.
Scripted Processes can be used in cooperative multithreading environments
like the XNU Kernel or other real-time operating systems, but it can
also help us improve the debugger testing infrastructure by writting
synthetic tests that simulates hard-to-reproduce process/thread states.
Although ScriptedProcess is not feature-complete at the moment, it has
basic execution capabilities and will improve in the following patches.
rdar://
65508855
Differential Revision: https://reviews.llvm.org/D100384
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Melanie Blower [Thu, 22 Jul 2021 11:58:50 +0000 (07:58 -0400)]
[clang][fpenv][patch] Change clang option -ffp-model=precise to select ffp-contract=on
Change the ffp-model=precise to enables -ffp-contract=on (previously
-ffp-model=precise enabled -ffp-contract=fast). This is a follow-up
to Andy Kaylor's comments in the llvm-dev discussion "Floating Point
semantic modes". From the same email thread, I put Andy's distillation
of floating point options and floating point modes into UsersManual.rst
Also fixes bugs.llvm.org/show_bug.cgi?id=50222
Reviewed By: rjmccall, andrew.kaylor
Differential Revision: https://reviews.llvm.org/D74436
Raphael Isemann [Thu, 22 Jul 2021 11:37:34 +0000 (13:37 +0200)]
[lldb] Fix that `process signal` completion always returns all signals
`CompletionRequest::AddCompletion` adds the given string as completion of the
current command token. `CompletionRequest::TryCompleteCurrentArg` only adds it
if the current token is a prefix of the given string. We're using
`AddCompletion` for the `process signal` handler which means that `process
signal SIGIN` doesn't get uniquely completed to `process signal SIGINT` as we
unconditionally add all other signals (such as `SIGABRT`) as possible
completions.
By using `TryCompleteCurrentArg` we actually do the proper filtering which will
only add `SIGINT` (as that's the only signal with the prefix 'SIGIN' in the
example above).
Reviewed By: mib
Differential Revision: https://reviews.llvm.org/D105028
Caroline Concatto [Wed, 5 May 2021 14:20:16 +0000 (15:20 +0100)]
[LoopVectorize] Fix crash for predicated instruction with scalable VF
This patch avoids computing discounts for predicated instructions when the
VF is scalable.
There is no support for vectorization of loops with division because the
vectorizer cannot guarantee that zero divisions will not happen.
This loop now does not use VF scalable
```
for (long long i = 0; i < n; i++)
if (cond[i])
a[i] /= b[i];
```
Differential Revision: https://reviews.llvm.org/D101916
Paulo Matos [Fri, 16 Jul 2021 08:34:44 +0000 (10:34 +0200)]
Add support for zero-sized Scalars as a LowLevelType
Opaque values (of zero size) can be stored in memory with the
implemention of reference types in the WebAssembly backend. Since
MachineMemOperand uses LLTs we need to be able to support
zero-sized scalars types in LLTs.
Differential Revision: https://reviews.llvm.org/D105423
Raphael Isemann [Thu, 22 Jul 2021 11:31:23 +0000 (13:31 +0200)]
[lldb][NFCI] Remove redundant accessibility heuristic in the DWARF parser
LLDB's DWARF parser has some heuristics for guessing and fixing up the
accessibility of C++ class/struct members after they were already created in the
internal Clang AST. The heuristic is that if a struct/class has a base class,
then it's actually a class and it's members are private unless otherwise
specified.
From what I can see this heuristic isn't sound and also unnecessary. The idea
that inheritance implies that the `class` keyword was used and the default
visibility is `private` is incorrect. Also both GCC and Clang use
`DW_TAG_structure_type` and `DW_TAG_class_type` for `struct` and `class` types
respectively, so the default visibility we infer from that information is always
correct and there is no need to fix it up.
And finally, the access specifiers we set in the Clang AST are anyway unused
within LLDB. The expression parser explicitly ignores them to give users access
to private members and there is not SBAPI functionality that exposes this
information.
This patch removes all the heuristic code for the reasons above and instead
just relies on the access values we infer from the tag kind and explicit
annotations in DWARF.
This patch is NFCI.
Reviewed By: werat
Differential Revision: https://reviews.llvm.org/D105463
Raphael Isemann [Thu, 22 Jul 2021 11:25:16 +0000 (13:25 +0200)]
[lldb] Generalize empty record size computation to avoid giving empty C++ structs a size of 0
C doesn't allow empty structs but Clang/GCC support them and give them a size of 0.
LLDB implements this by checking the tag kind and if it's `DW_TAG_structure_type` then
we give it a size of 0 via an empty external RecordLayout. This is done because our
internal TypeSystem is always in C++ mode (which means we would give them a size
of 1).
The current check for when we have this special case is currently too lax as types with
`DW_TAG_structure_type` can also occur in C++ with types defined using the `struct`
keyword. This means that in a C++ program with `struct Empty{};`, LLDB would return
`0` for `sizeof(Empty)` even though the correct size is 1.
This patch removes this special case and replaces it with a generic approach that just
assigns empty structs the byte_size as specified in DWARF. The GCC/Clang special
case is handles as they both emit an explicit `DW_AT_byte_size` of 0. And if another
compiler decides to use a different byte size for this case then this should also be
handled by the same code as long as that information is provided via `DW_AT_byte_size`.
Reviewed By: werat, shafik
Differential Revision: https://reviews.llvm.org/D105471
Florian Mayer [Thu, 22 Jul 2021 11:15:54 +0000 (12:15 +0100)]
Revert "[hwasan] Use stack safety analysis."
This reverts commit
bde9415fef25e9ff6e10595a2f4f5004dd62f10a.
Dawid Jurczak [Thu, 22 Jul 2021 09:42:59 +0000 (11:42 +0200)]
[LoopIdiom] Transform memmove-like loop into memmove (PR46179)
The purpose of patch is to learn Loop idiom recognition pass how to recognize simple memmove patterns
in similar way like GCC: https://godbolt.org/z/fh95e83od
LoopIdiomRecognize already has machinery for memset and memcpy recognition, patch tries to extend exisiting capabilities with minimal effort.
Differential Revision: https://reviews.llvm.org/D104464
Florian Mayer [Tue, 20 Jul 2021 09:38:52 +0000 (10:38 +0100)]
[hwasan] Use stack safety analysis.
This avoids unnecessary instrumentation.
Reviewed By: eugenis, vitalybuka
Differential Revision: https://reviews.llvm.org/D105703
Balázs Kéri [Thu, 22 Jul 2021 09:43:13 +0000 (11:43 +0200)]
[clang][AST] Add support for DecompositionDecl to ASTImporter.
BindingDecl was added recently but the related DecompositionDecl is needed
to make C++17 structured bindings importable.
Import of BindingDecl was changed to avoid infinite import loop.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D105354
Jan Svoboda [Thu, 22 Jul 2021 10:21:12 +0000 (12:21 +0200)]
[clang][lex] NFC: Add explicit cast to silence -Wsign-compare
Simon Pilgrim [Thu, 22 Jul 2021 09:58:31 +0000 (10:58 +0100)]
[InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069)
As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead.
I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst.
https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!)
Differential Revision: https://reviews.llvm.org/D106450
Jon Chesterfield [Thu, 22 Jul 2021 09:29:30 +0000 (10:29 +0100)]
[libomptarget][amdgpu][nfc] Drop dead signal pool setup
This class is instantiated once in rtl.cpp before hsa_init is
called. The hsa_signal_create call therefore fails leaving the pool empty.
This signal pool is a legacy from ATMI where it was constructed after hsa_init.
Moving the state into the rtl.cpp global class disabled the initial populating
of the pool without noticeably changing performance. Just rechecked with a fix
that allocates the signals after hsa_init and that also doesn't noticeably
change performance.
This patch therefore drops the initialisation. Only change from main is to
drop a DEBUG_PRINT statement that would say the pool initial size is zero.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106515
Simon Tatham [Thu, 22 Jul 2021 09:08:06 +0000 (10:08 +0100)]
[clang] Use i64 for the !srcloc metadata on asm IR nodes.
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.
!srcloc is generated in clang codegen, and pulled back out by llvm
functions like AsmPrinter::emitInlineAsm that need to report errors in
the inline asm. From there it goes to LLVMContext::emitError, is
stored in DiagnosticInfoInlineAsm, and ends up back in clang, at
BackendConsumer::InlineAsmDiagHandler(), which reconstitutes a true
clang::SourceLocation from the integer cookie.
Throughout this code path, it's now 64-bit rather than 32, which means
that if SourceLocation is expanded to a 64-bit type, this error report
won't lose half of the data.
The compiler will tolerate both of i32 and i64 !srcloc metadata in
input IR without faulting. Test added in llvm/MC. (The semantic
accuracy of the metadata is another matter, but I don't know of any
situation where that matters: if you're reading an IR file written by
a previous run of clang, you don't have the SourceManager that can
relate those source locations back to the original source files.)
Original version of the patch by Mikhail Maltsev.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D105491
David Green [Thu, 22 Jul 2021 09:22:42 +0000 (10:22 +0100)]
[AArch64] Add and update reduction and shuffle costs. NFC
Dmitry Vyukov [Wed, 21 Jul 2021 13:39:01 +0000 (15:39 +0200)]
sanitizers: increase .clang-format columns to 100
The current (default) line length is 80 columns.
That's based on old hardware and historical conventions.
There are no existent reasons to keep line length that small,
especially provided that our coding style uses quite lengthy
identifiers. The Linux kernel recently switched to 100,
let's start with 100 as well.
This change intentionally does not re-format code.
Re-formatting is intended to happen incrementally,
or on dir-by-dir basis separately.
Reviewed By: vitalybuka, melver, MaskRay
Differential Revision: https://reviews.llvm.org/D106436
Fraser Cormack [Thu, 20 May 2021 16:28:45 +0000 (17:28 +0100)]
[RISCV] Fix a crash when lowering split float arguments
Lowering certain float vectors without legal vector types could cause a
crash due to a bad interaction between passing floats via GPRs and
argument splitting. Split vector floats appear just like scalar floats.
Under certain situations we choose to pass these float arguments via
GPRs and use an XLenVT location and set the 'BCvt' info to track how
they must be converted back to floating-point values. However, later
logic for handling split arguments may take over, in which case we lose
the previous information and set the 'Indirect' info, thus incorrectly
lowering to integer types.
I don't believe that we would have come across the notion of split
floating-point arguments before. This patch addresses the issue by
updating the lowering so that split arguments are only passed indirectly
when they are scalar integer types.
This has some change to how we lower some larger illegal float vectors,
as can be seen in 'fastcc-float.ll' where the vector is now passed
partly in registers and partly on the stack.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D102852
Fraser Cormack [Fri, 16 Jul 2021 14:12:46 +0000 (15:12 +0100)]
[RISCV] Lower more BUILD_VECTOR sequences to RVV's VID
This relands
a6ca88e908b5befcd9b0f8c8cb40f53095cc17bc which was originally
reverted due to overflow bugs in
e3fa2b1eab60342dc882b7b888658b03c472fa2b.
This patch teaches the compiler to identify a wider variety of
`BUILD_VECTOR`s which form integer arithmetic sequences, and to lower
them to `vid.v` with modifications for non-unit steps and non-zero
addends.
The sequences handled by this optimization must either be monotonically
increasing or decreasing. Consecutive elements holding the same value
indicate a fractional step which, while simple mathematically,
becomes more complex to handle both in the realm of lossy integer
division and in the presence of `undef`s.
For example, a common "interleaving" shuffle index will be lowered by
LLVM to both `<0,u,1,u,2,...>` and `<u,0,u,1,u,...>` `BUILD_VECTOR`
nodes. Either of these would ideally be lowered to `vid.v` shifted right
by 1. Detection of this sequence in presence of general `undef` values
is more complicated, however: `<0,u,u,1,>` could match either
`<0,0,0,1,>` or `<0,0,1,1,>` depending on later values in the sequence.
Both are possible, so backtracking or multiple passes is inevitable.
Sticking to monotonic sequences keeps the logic simpler as it can be
done in one pass. Fractional steps will likely be a separate
optimization in a future patch.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D104921
Whisperity [Tue, 20 Jul 2021 14:05:54 +0000 (16:05 +0200)]
[clang-tidy] Fix crash and handle AttributedType in 'bugprone-easily-swappable-parameters'
@vabridgers identified a way to crash the check by running on code that
involve `AttributedType`s. This patch fixes the check to first and
foremost not crash, but also improves the logic handling qualifiers.
If the types contain any additional (not just CVR) qualifiers that are
not the same, they will not be deemed mixable. The logic for CVR-Mixing
and the `QualifiersMix` check option remain unchanged.
Reviewed By: aaron.ballman, vabridgers
Differential Revision: http://reviews.llvm.org/D106361
Jason Molenda [Thu, 22 Jul 2021 08:02:54 +0000 (01:02 -0700)]
Read and write a LC_NOTE "addrable bits" for addressing mask
This patch adds code to process save-core for Mach-O files which
embeds an "addrable bits" LC_NOTE when the process is using a
code address mask (e.g. AArch64 v8.3 with ptrauth aka arm64e).
Add code to ObjectFileMachO to read that LC_NOTE from corefiles,
and ProcessMachCore to set the process masks based on it when reading
a corefile back in.
Also have "process status --verbose" print the current address masks
that lldb is using internally to strip ptrauth bits off of addresses.
Differential Revision: https://reviews.llvm.org/D106348
rdar://
68630113
Timm Bäder [Wed, 21 Jul 2021 10:03:05 +0000 (12:03 +0200)]
[llvm][tools] Hide remaining unrelated llvm- tool options
Differential Revision: https://reviews.llvm.org/D106430
Nathan Ridge [Tue, 29 Jun 2021 05:54:12 +0000 (01:54 -0400)]
[clangd] Ensure Ref::Container refers to an indexed symbol
Fixes https://github.com/clangd/clangd/issues/806
Differential Revision: https://reviews.llvm.org/D105083
Hsiangkai Wang [Wed, 21 Jul 2021 02:27:35 +0000 (10:27 +0800)]
[llvm-mc-assemble-fuzzer] Initialize MCTargetOptions.
When run the command in the llvm-mc-assemble-fuzzer document,
```
llvm-mc-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4
```
it triggers the following assertion:
```
llvm-mc-assemble-fuzzer:
llvm-project/llvm/lib/MC/MCTargetOptionsCommandFlags.cpp:38:
bool llvm::mc::getRelaxAll(): Assertion `RelaxAllView &&
"RegisterMCTargetOptionsFlags not created."' failed.
```
It is caused by no global RegisterMCTargetOptionsFlags object to initialize
the MC target options.
Differential Revision: https://reviews.llvm.org/D106417
Jun Ma [Mon, 19 Jul 2021 13:10:17 +0000 (21:10 +0800)]
[AArch64][SVE] Handle svbool_t VLST <-> VLAT/GNUT conversion
According to https://godbolt.org/z/q5rME1naY and acle, we found that
there are different SVE conversion behaviours between clang and gcc. It turns
out that llvm does not handle SVE predicates width properly.
This patch 1) checks SVE predicates width rightly with svbool_t type.
2) removes warning on svbool_t VLST <-> VLAT/GNUT conversion.
3) disables VLST <-> VLAT/GNUT conversion between SVE vectors and predicates
due to different width.
Differential Revision: https://reviews.llvm.org/D106333
Johannes Doerfert [Fri, 16 Jul 2021 19:14:37 +0000 (14:14 -0500)]
[Attributor][FIX] Improve call graph updating
If we remove a non-intrinsic instruction we need to tell the (old) call
graph about it. This caused problems with some features down the line as
they allowed to removed calls more aggressively.
Johannes Doerfert [Wed, 21 Jul 2021 20:11:44 +0000 (15:11 -0500)]
[Attributor][FIX] Do not introduce multiple instances of SSA values
If we have a recursive function we could create multiple instantiations
of an SSA value, one per recursive invocation of the function. This is a
problem as we use SSA value equality in various places. The basic idea
follows from this test:
```
static int r(int c, int *a) {
int X;
return c ? r(false, &X) : a == &X;
}
int test(int c) {
return r(c, undef);
}
```
If we look through the argument `a` we will end up with `X`. Using SSA
value equality we will fold `a == &X` to true and return true even
though it should have been false because `a` and `&X` are from different
instantiations of the function.
Various tests for this have been placed in value-simplify-instances.ll
and this commit fixes them all by avoiding to produce simplified values
that could be non-unique at runtime. Thus, the result of a simplify
value call will always be unique at runtime or the original value, both
do not allow to accidentally compare two instances of a value with each
other and conclude they are equal statically (pointer equivalence) while
they are unequal at runtime.
Johannes Doerfert [Thu, 22 Jul 2021 02:58:00 +0000 (21:58 -0500)]
[Attributor] Improve the Attributor::getAssumedConstant interface
Similar to Attributor::getAssumedSimplified we need to allow IRPs
directly to get the right simplification callback (and context).
ShihPo Hung [Thu, 22 Jul 2021 03:51:18 +0000 (11:51 +0800)]
[RegisterCoalescer] Make resolveConflicts aware of earlyclobber
Prior to this patch, it skipped the instruction defining VNI when checking if the tainted lanes are used.
In the given example, VRGATHER is an illegal instruction because its DstReg overlaps with SrcReg.
Therefore we need to check the defining instruction as well when there is an earlyclobber constraint.
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D105684
Johannes Doerfert [Wed, 21 Jul 2021 04:00:32 +0000 (23:00 -0500)]
[Attributor][NFC] Precommit tests exposing a conceptual simplification problem
Value simplification works under the implicit assumption that two SSA
values (`llvm::Value`) that are pointer equal are also equal at runtime.
This is mostly true except for values that are instantiated multiple
times. These test cases expose the problems we currently have when it
comes to recursion and multiple instances of values.
Johannes Doerfert [Tue, 20 Jul 2021 03:31:51 +0000 (22:31 -0500)]
[OpenMP][FIX] Use name + type checks not only name checks for calls
A call that is analyzed in an optimization needs to be verified against
the name and type of the runtime function to avoid that we look at
arguments that do not exist (anymore). This can happen if the signature
was rewritten. Since we will not set RFI.Declaration if the type doesn't
match we can use it (if it's not null) to determine if the signature is
as expected.
Differential Revision: https://reviews.llvm.org/D106341
Johannes Doerfert [Thu, 22 Jul 2021 03:01:02 +0000 (22:01 -0500)]
[Attributor][NFC] Clang format
rdzhabarov [Thu, 22 Jul 2021 00:06:43 +0000 (00:06 +0000)]
[mlir] Fix various issues in TimerImpl.
More specifically:
1) Use variable after move.
2) steady_clock needs to be used for measuring time intervals, but not the system_clock.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106513
Uday Bondhugula [Thu, 22 Jul 2021 01:39:53 +0000 (07:09 +0530)]
[MLIR] Fix affine.for empty loop body folder
Fix affine.for empty loop body folder in the presence of yield values.
The existing pattern ignored iter_args/yield values and thus crashed
when yield values had uses.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D106121
Ben Shi [Thu, 22 Jul 2021 02:26:52 +0000 (10:26 +0800)]
[RISCV] Optimize multiplication in the zba extension with SH*ADD
This patch make the following optimization.
(mul x, 3 * power_of_2) -> (SLLI (SH1ADD x, x), bits)
(mul x, 5 * power_of_2) -> (SLLI (SH2ADD x, x), bits)
(mul x, 9 * power_of_2) -> (SLLI (SH3ADD x, x), bits)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D105796
Carl Ritson [Thu, 22 Jul 2021 01:22:02 +0000 (10:22 +0900)]
[AMDGPU] Add VReg_192/VReg_224 support for MIMG instructions
Allow MIMG instructions to be selected with 6/7 VGPRs for vaddr.
Previously these were rounded up to VReg_256 this saves VGPRs.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D103800
thomasraoux [Mon, 19 Jul 2021 23:05:23 +0000 (16:05 -0700)]
[mlir] Extend scf pipeling to support loop carried dependencies
Differential Revision: https://reviews.llvm.org/D106325
Hsiangkai Wang [Tue, 8 Jun 2021 06:29:40 +0000 (14:29 +0800)]
[Clang][RISCV] Implement vsoxseg and vsuxseg.
Differential Revision: https://reviews.llvm.org/D103873
Hsiangkai Wang [Tue, 8 Jun 2021 05:29:51 +0000 (13:29 +0800)]
[Clang][RISCV] Implement vssseg.
Differential Revision: https://reviews.llvm.org/D103872
Hsiangkai Wang [Tue, 8 Jun 2021 05:09:07 +0000 (13:09 +0800)]
[Clang][RISCV] Implement vsseg.
Differential Revision: https://reviews.llvm.org/D103871
Hsiangkai Wang [Mon, 7 Jun 2021 13:53:49 +0000 (21:53 +0800)]
[Clang][RISCV] Add vloxseg and vluxseg test cases.
Hsiangkai Wang [Mon, 7 Jun 2021 13:53:37 +0000 (21:53 +0800)]
[Clang][RISCV] Implement vloxseg and vluxseg.
Differential Revision: https://reviews.llvm.org/D103809
Hsiangkai Wang [Mon, 7 Jun 2021 09:54:00 +0000 (17:54 +0800)]
[Clang][RISCV] Implement vlsseg.
Differential Revision: https://reviews.llvm.org/D103796
Carl Ritson [Thu, 22 Jul 2021 00:59:35 +0000 (09:59 +0900)]
[AMDGPU] Allow frontends to disable null export for pixel shaders
Disable null export (for kills) when a frontend defines a pixel
shader as not exporting using amdgpu-color-export and
amdgpu-depth-export function attrbutes.
This allows the generation of export free pixel shaders.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D105683
Joseph Huber [Wed, 21 Jul 2021 19:52:04 +0000 (15:52 -0400)]
[OpenMP] Strip NoInline from known OpenMP runtime functions
This patch strips the NoInline attribute from known OpenMP runtime functions.
This is done so that we can denote certain runtime functions as NoInline to
ensure their call sites are intact so they can be checked by OpenMPOpt. We
don't wan't this noinline attribute to remain for any functions after OpenMPOpt
has been run however.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106482
Joseph Huber [Tue, 20 Jul 2021 21:55:08 +0000 (17:55 -0400)]
[OpenMP] Fold `__kmpc_is_generic_main_thread_id` if possible
This patch adds the ability to fold `__kmpc_is_generic_main_thread_id` if we
know for a fact that it is executed by the initial thread using
AAExecutionDomain. This combined with folding `__kmpc_is_spmd_exec_mode` will
allow us to fully fold `__kmpc_is_generic_main_thread`.
Depends on D106438 D106437
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106439
Joseph Huber [Wed, 21 Jul 2021 12:57:34 +0000 (08:57 -0400)]
[OpenMP] Add an option to disable function internalization
Function internalization can sometimes occur in situations where we want to
keep the call sites intact. This patch adds an option to disable function
internalization and prevents the device runtime from being internalized while
creating the bitcode library.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106438
Joseph Huber [Tue, 20 Jul 2021 20:07:52 +0000 (16:07 -0400)]
[Libomptarget] Introduce new main thread ID runtime function
This patch introduces `__kmpc_is_generic_main_thread_id` which splits the old
comparison into its own runtime function. The purpose of this is so we can fold
this part independently, so when both this and `is_spmd_mode` are folded the
final function will be folded as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106437
Joseph Huber [Wed, 21 Jul 2021 16:48:39 +0000 (12:48 -0400)]
[OpenMP] Add new execution mode for SPMD execution with Generic semantics
Qualified kernels can be transformed from generic-mode to SPMD mode using an
optimization in OpenMPOpt. This patch introduces a new execution mode to
indicate kernels that have been transformed from generic-mode to SPMD-mode.
These kernels have SPMD-mode execution, but need generic-mode semantics for
scheduling the blocks and threads. Without this far too few blocks will be
scheduled for a generic region as SPMD mode expects the trip count to be
divided by the number of threads.
Reviewed By: ggeorgakoudis
Differential Revision: https://reviews.llvm.org/D106460
Joseph Huber [Wed, 21 Jul 2021 21:13:46 +0000 (17:13 -0400)]
[OpenMP] Change `__kmpc_free_shared` to include the paired allocation size
This patch changes `__kmpc_free_shared` to take an additional argument
corresponding to the associated allocation's size. This makes it easier to
implement the allocator in the runtime.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106496
Lang Hames [Thu, 22 Jul 2021 00:45:24 +0000 (10:45 +1000)]
Re-re-revert "[ORC][ORC-RT] Add initial native-TLV support to MachOPlatform."
This reverts commit
6b2a96285b9bbe92d2c5e21830f21458f8be976d.
The ccache builders are still failing. Looks like they need to be updated to
get the llvm-zorg config change in
490633945677656ba75d42ff1ca9d4a400b7b243.
I'll re-apply this as soon as the builders are updated.
David Blaikie [Mon, 19 Jul 2021 04:44:08 +0000 (21:44 -0700)]
Fix assigned-but-unused (except in an assert) warning with a void cast
Hedin Garca [Wed, 21 Jul 2021 18:12:29 +0000 (18:12 +0000)]
[libc] Rename FEnv.h and refactor subsequent files
Because Windows's pathnames are not case sensitive,
to avoid include conflicts between our header file FEnv.h and the
one from the C Standard library, <fenv.h>, the prior file was renamed.
The motive for the relabel came to fix this include error in
TestHelpers.cpp since a conflict arose with a file in the same
directory when #include <fenv.h> was being used.
Reviewed By: sivachandra, aeubanks
Differential Revision: https://reviews.llvm.org/D106470
Jacob Hegna [Fri, 16 Jul 2021 21:25:33 +0000 (21:25 +0000)]
[NFC] Code cleanups in InlineCost.cpp.
- annotate const functions with "const"
- replace C-style casts with static_cast
Differential Revision: https://reviews.llvm.org/D105362
Lang Hames [Wed, 21 Jul 2021 07:38:35 +0000 (17:38 +1000)]
Re-re-apply "[ORC][ORC-RT] Add initial native-TLV support to MachOPlatform."
This reapplies commit
a7733e9556b5a6334c910f88bcd037e84e17e3fc ("Re-apply
[ORC][ORC-RT] Add initial native-TLV support to MachOPlatform."), and
d4abdefc998a1ee19d5edc79ec233774cbf64f6a ("[ORC-RT] Rename macho_tlv.x86-64.s
to macho_tlv.x86-64.S (uppercase suffix)").
These patches were reverted in
48aa82cacbff10e1c5395a03f86488bf449ba4da while I
investigated bot failures (e.g.
https://lab.llvm.org/buildbot/#/builders/109/builds/18981). The fix was to
disable building of the ORC runtime on buliders using ccache (which is the same
fix used for other compiler-rt projects containing assembly code). This fix was
commited to llvm-zorg in
490633945677656ba75d42ff1ca9d4a400b7b243.
Thomas Lively [Wed, 21 Jul 2021 23:45:54 +0000 (16:45 -0700)]
[WebAssembly] Replace @llvm.wasm.popcnt with @llvm.ctpop.v16i8
Use the standard target-independent intrinsic to take advantage of standard
optimizations.
Differential Revision: https://reviews.llvm.org/D106506
Fangrui Song [Wed, 21 Jul 2021 23:16:20 +0000 (16:16 -0700)]
[mlir] Add workaround for false positive in -Wfree-nonheap-object
Restore
499571ea835daf786626a0db1e12f890b6cd8f8d
reverted by
0082764605cc0e7e0363a41ffa77d214c3157aa6.
A compiler slightly older than
"[clang][Sema] removes -Wfree-nonheap-object reference param false positive"
may report the false positive.
We need to retain the workaround a bit longer so that such compilers
can be used to compile MLIR in a warning-free way.
Thomas Lively [Wed, 21 Jul 2021 23:11:00 +0000 (16:11 -0700)]
[WebAssembly] Remove clang builtins for extract_lane and replace_lane
These builtins were added to capture the fact that the underlying Wasm
instructions return i32s and implicitly sign or zero extend the extracted lanes
in the case of the i8x16 and i16x8 variants. But we do sufficient optimizations
during code gen that these low-level details do not need to be exposed to users.
This commit replaces the use of the builtins in wasm_simd128.h with normal
target-independent vector code. As a result, we can switch the relevant
intrinsics to use functions rather than macros and can use more user-friendly
return types rather than trying to precisely expose the underlying Wasm types.
Note, however, that the generated LLVM IR is no different after this change.
Differential Revision: https://reviews.llvm.org/D106500
John Ericson [Sat, 3 Jul 2021 05:18:37 +0000 (05:18 +0000)]
Remove `LIBC_INSTALL_PREFIX`
This matches the decision made in D99697.
It also shouldn't reintroduce the issue fixed in D99636.
The variable was originally introduced in
b22f448c21e718a3b6219df89169f38d436189c6 but is not essential to that
change.
Once we finish adding `GnuInstallDirs` support in D100810 and D99484,
setting `CMAKE_INSTALL_LIBDIR` would also work to change the
installation directory (though for more than libc).
`GnuInstallDirs` support also brings up an issue which is avoided if
variables like `LIBC_INSTALL_PREFIX` don't exist. Because the
`GnuInstallDirs` variables can be absolute paths, it is a bit unclear
how the per-project prefixes would work: does the project-agnostic
role-specific variable (e.g. `CMAKE_INSTALL_LIBDIR`), or project-specfic
role-agnostic (e.g. `LIBC_INSTALL_PREFIX`) take priority? Each is more
specific than the other on one axis, but not the other.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D105740
Mehdi Amini [Wed, 21 Jul 2021 22:28:45 +0000 (22:28 +0000)]
Add verifier for insert/extract element/value on type match between container and inserted/extracted value, and fix vector.shuffle lowering
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106398
Stanislav Mekhanoshin [Tue, 20 Jul 2021 20:53:14 +0000 (13:53 -0700)]
Prevent dead uses in register coalescer after rematerialization
The coalescer does not check if register uses are available
at the point of rematerialization. If it attempts to rematerialize
an instruction with such uses it can end up with use without a def.
LiveRangeEdit does such check during rematerialization, so just
call LiveRangeEdit::allUsesAvailableAt() to avoid the problem.
Differential Revision: https://reviews.llvm.org/D106396
Nicolas Vasilache [Wed, 21 Jul 2021 22:01:51 +0000 (22:01 +0000)]
[mlir][LLVM] Revert bareptr calling convention handling as an argument materialization.
Type conversion and argument materialization are context-free: there is no available information on which op / branch is currently being converted.
As a consequence, bare ptr convention cannot be handled as an argument materialization: it would apply irrespectively of the parent op.
This doesn't typecheck in the case of non-funcOp and we would see cases where a memref descriptor would be inserted in place of the pointer in another memref descriptor.
For now the proper behavior is to revert to a specific BarePtrFunc implementation and drop the blanket argument materialization logic.
This reverts the relevant piece of the conversion to LLVM to what it was before https://reviews.llvm.org/D105880 and adds a relevant test and documentation to avoid the mistake by whomever attempts this again in the future.
Reviewed By: arpith-jacob
Differential Revision: https://reviews.llvm.org/D106495
Jessica Paquette [Wed, 21 Jul 2021 21:57:31 +0000 (14:57 -0700)]
[AArch64][GlobalISel] Change | -> || in an if
I wrote the wrong type of OR by mistake.
LLVM GN Syncbot [Wed, 21 Jul 2021 21:45:33 +0000 (21:45 +0000)]
[gn build] Port
74fd3cb8cd3e
Stanislav Mekhanoshin [Thu, 15 Jul 2021 23:17:02 +0000 (16:17 -0700)]
[AMDGPU] Mark relevant rematerializable VOP3 instructions
Differential Revision: https://reviews.llvm.org/D106110
Omar Emara [Wed, 21 Jul 2021 21:39:59 +0000 (14:39 -0700)]
[LLDB][GUI] Add required property to text fields
This patch adds a required property to text fields and their
derivatives. Additionally, the Process Name and PID fields in the attach
form were marked as required.
Differential Revision: https://reviews.llvm.org/D106458
Omar Emara [Wed, 21 Jul 2021 21:34:40 +0000 (14:34 -0700)]
[LLDB][GUI] Add Process Plugin Field
This patch adds a new Process Plugin Field. It is a choices field that
lists all the available process plugins and can retrieve the name of the
selected plugin or an empty string if the default is selected.
The Attach form now uses that field instead of manually creating a
choices field.
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D106467
Christopher Di Bella [Thu, 8 Jul 2021 21:01:19 +0000 (21:01 +0000)]
[libcxx][ranges] implements dangling, borrowed_iterator_t, borrowed_subrange_t
* Implements part of P0896 'The One Ranges Proposal'
* Implements http://wg21.link/range.dangling
Reviewed By: zoecarver
Differential Revision: https://reviews.llvm.org/D105205
Christopher Di Bella [Wed, 21 Jul 2021 21:29:24 +0000 (21:29 +0000)]
Revert "Add workaround for false positive in -Wfree-nonheap-object"
This reverts commit
499571ea835daf786626a0db1e12f890b6cd8f8d.
Christopher Di Bella [Tue, 13 Jul 2021 02:02:17 +0000 (02:02 +0000)]
[clang][Sema] removes -Wfree-nonheap-object reference param false positive
Taking the address of a reference parameter might be valid, and without
CFA, false positives are going to be more trouble than they're worth.
Differential Revision: https://reviews.llvm.org/D102728
Stanislav Mekhanoshin [Thu, 15 Jul 2021 23:10:36 +0000 (16:10 -0700)]
[AMDGPU] Mark relevant rematerializable VOP2 instructions
Differential Revision: https://reviews.llvm.org/D106023
Bill Wendling [Thu, 8 Jul 2021 09:08:45 +0000 (02:08 -0700)]
[llvm-diff] Check for recursive initialiers
We need to check for recursive initializers in the "ConstantStruct"
case.
Differential Revision: https://reviews.llvm.org/D105616
David Green [Wed, 21 Jul 2021 21:11:09 +0000 (22:11 +0100)]
[ARM] Pass SelectionDAG to methods that dont require DCI. NFC
In these methods DCI is never used, only the DAG from it. Pass the DAG
directly, cleaning up the code a little.
Walter Erquinigo [Wed, 21 Jul 2021 21:09:25 +0000 (14:09 -0700)]
[intel pt] fix builds
https://reviews.llvm.org/D105649 broke intel pt builds. Fortunately the
fix is super easy.
Stanislav Mekhanoshin [Tue, 13 Jul 2021 17:47:30 +0000 (10:47 -0700)]
[AMDGPU] Mark all relevant VOP1 instructions rematerializable
Differential Revision: https://reviews.llvm.org/D105919
Fangrui Song [Wed, 21 Jul 2021 21:03:26 +0000 (14:03 -0700)]
[sanitizer] Place module_ctor/module_dtor in llvm.used
This removes an abuse of ELF linker behaviors while keeping Mach-O/COFF linker
behaviors unchanged.
ELF: when module_ctor is in a comdat, this patch removes reliance on a linker
abuse (an SHT_INIT_ARRAY in a section group retains the whole group) by using
SHF_GNU_RETAIN. No linker behavior difference when module_ctor is not in a comdat.
Mach-O: module_ctor gets `N_NO_DEAD_STRIP`. No linker behavior difference
because module_ctor is already referenced by a `S_MOD_INIT_FUNC_POINTERS`
section (GC root).
PE/COFF: no-op. SanitizerCoverage already appends module_ctor to `llvm.used`.
Other sanitizers: llvm.used for local linkage is not implemented in
`TargetLoweringObjectFileCOFF::emitLinkerDirectives` (once implemented or
switched to a non-local linkage, COFF can use module_ctor in comdat (i.e.
generalize ELF-specific rL301586)).
There is no object file size difference.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D106246
Peter Steinfeld [Mon, 19 Jul 2021 18:22:45 +0000 (11:22 -0700)]
[flang] Implement the runtime portion of the CSHIFT intrinsic
This change fixes a bug in the runtime portion of the CSHIFT intrinsic
that happens when the value of the SHIFT argument is negative.
Differential Revision: https://reviews.llvm.org/D106292
Alex Langford [Wed, 7 Jul 2021 18:51:16 +0000 (11:51 -0700)]
[LLDB] Move Trace-specific classes into separate library
These two classes, TraceSessionFileParser and ThreadPostMortemTrace,
seem to be useful primarily for tracing. Currently it looks like
intel-pt is the sole user of these, but that other tracing plugins could
be written in the future that take advantage of these. Unfortunately
with them in Target, there is a dependency on PluginProcessUtility. I'd
like to sever that dependency, so I moved them into a `TraceCommon`
plugin.
Differential Revision: https://reviews.llvm.org/D105649
Nikita Popov [Wed, 21 Jul 2021 20:22:26 +0000 (22:22 +0200)]
[SimplifyCFG] Fix if conversion with opaque pointers
We need to make sure that the value types are the same. Otherwise
we both may not have the necessary dereferenceability implication,
nor can we directly form the desired select pattern.
Without opaque pointers this is enforced implicitly through the
pointer comparison.
Nikita Popov [Wed, 21 Jul 2021 20:17:46 +0000 (22:17 +0200)]
[SimplifyCFG] Regenerate test checks (NFC)
Stanislav Mekhanoshin [Thu, 8 Jul 2021 18:00:57 +0000 (11:00 -0700)]
[AMDGPU] Move perfhint analysis
This is SCC pass, moving it to the end of SCC PM saves one
Function PM. This needs the analysis to take into account
memory access width since it is now places after the
load/store optimizer (D105651).
Differential Revision: https://reviews.llvm.org/D105652
Jessica Paquette [Wed, 21 Jul 2021 00:28:45 +0000 (17:28 -0700)]
[AArch64][GlobalISel] Widen s2 and s4 G_IMPLICIT_DEF + G_FREEZE
These had
```
.clampScalar(0, s1, 64)
.widenScalarToNextPow2(0, 8)
```
If you have s2 or s4, then `widenScalarToNextPow2` does nothing.
This changes the `widenScalarToNextPow2` rule to use s8 as the minimum type
instead, allowing us to correctly widen s2 and s4.
This does not impact s1, since it's marked as legal already.
Differential Revision: https://reviews.llvm.org/D106413
Douglas Yung [Wed, 21 Jul 2021 19:51:05 +0000 (12:51 -0700)]
Change requires line from arm to aarch64 since the test uses arm64_32 which is AArch64.