Nikolas Klauser [Tue, 24 May 2022 08:32:50 +0000 (10:32 +0200)]
[libc++] Implement ranges::reverse
Reviewed By: var-const, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D125752
Laramie Leavitt [Tue, 24 May 2022 08:27:37 +0000 (10:27 +0200)]
[libc++] Replace modulus operations in std::seed_seq::generate with conditional checks.
Abseil benchmarks suggest that the conditional checks result in faster code (4-5x)
as they are compiled into conditional move instructions (cmov on x86).
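As a rough illustration of the idea (not the code from this patch), an index that advances by a bounded step can be wrapped with a compare instead of `%`. `wrapAdd` is a hypothetical helper; it assumes `i < n` and `step <= n`, so the sum stays below `2 * n` and a single conditional subtraction is enough.

```cpp
#include <cstddef>

// Hypothetical helper, not taken from libc++: computes (i + step) % n
// without a division. Precondition: i < n and step <= n.
constexpr std::size_t wrapAdd(std::size_t i, std::size_t step, std::size_t n) {
  // A select like this is the kind of code that can become a conditional move.
  return i + step >= n ? i + step - n : i + step;
}

static_assert(wrapAdd(3, 2, 4) == (3 + 2) % 4, "matches the modulus form");
```

Whether the compiler actually emits a conditional move depends on the target and optimization level.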
Reviewed By: #libc, philnik, Mordante
Spies: pengfei, Mordante, philnik, libcxx-commits
Differential Revision: https://reviews.llvm.org/D125329
Fraser Cormack [Tue, 24 May 2022 08:16:18 +0000 (09:16 +0100)]
[LegalizeTypes][NFC] Fix node name in assertion message
This was probably copy/pasted from the MSCATTER widening.
Aaron Jacobs [Tue, 24 May 2022 08:21:05 +0000 (10:21 +0200)]
[libc++] type_traits: use __is_core_convertible in __invokable_r.
This fixes incorrect handling of non-moveable types, adding tests for this case.
See [issue 55346](https://github.com/llvm/llvm-project/issues/55346).
The current implementation is based on is_convertible, which is
[defined](https://timsong-cpp.github.io/cppwp/n4659/meta.rel#5) in terms of
validity of the following function:
```
To test() {
  return declval<From>();
}
```
But this doesn't work if To and From are both some non-moveable type, which the
[definition](https://timsong-cpp.github.io/cppwp/n4659/conv#3) of implicit
conversions says should work due to guaranteed copy elision:
```
To to = E; // E has type From
```
It is this latter definition that is used in the
[definition](https://timsong-cpp.github.io/cppwp/n4659/function.objects#func.require-2)
of INVOKE<R>. Make __invokable_r use __is_core_convertible, which
captures the ability to use guaranteed copy elision, making the
definition correct for non-moveable types.
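The difference can be sketched in standalone C++17. `NonMovable` and `is_core_convertible_sketch` are illustrative names, not the libc++ internals; the trait only models the idea of initializing a by-value parameter from a prvalue, which needs no move for matching types under guaranteed copy elision.

```cpp
#include <type_traits>

// A type that can be neither copied nor moved.
struct NonMovable {
  NonMovable() = default;
  NonMovable(const NonMovable&) = delete;
  NonMovable& operator=(const NonMovable&) = delete;
};

// Illustrative trait (hypothetical name): true when a prvalue of type From
// can initialize a by-value parameter of type To.
template <class From, class To, class = void>
struct is_core_convertible_sketch : std::false_type {};

template <class From, class To>
struct is_core_convertible_sketch<
    From, To,
    decltype(static_cast<void (*)(To)>(nullptr)(
        static_cast<From (*)()>(nullptr)()))> : std::true_type {};

// is_convertible goes through declval<From>(), an xvalue, so it needs a move.
static_assert(!std::is_convertible<NonMovable, NonMovable>::value,
              "is_convertible rejects the non-movable type");
// The prvalue-based check accepts it in C++17.
static_assert(is_core_convertible_sketch<NonMovable, NonMovable>::value,
              "prvalue initialization needs no move");
```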
Fixes llvm/llvm-project#55346.
Reviewed By: #libc, philnik, EricWF
Spies: EricWF, jloser, ldionne, philnik, libcxx-commits
Differential Revision: https://reviews.llvm.org/D125300
Nikita Popov [Tue, 24 May 2022 08:07:00 +0000 (10:07 +0200)]
[InstCombine] Handle logical and/or in recursive and/or of icmps fold
The and/or of icmps fold is also applied in reassociated form.
However, this currently only happens for bitwise and of bitwise
and, but not for bitwise and of logical and (or other combinations,
but this is the one being addressed here).
We can do this for bitwise+logical combinations as well, but need
to be a bit careful about which of the resulting ands are logical:
https://alive2.llvm.org/ce/z/WYSjGh
https://alive2.llvm.org/ce/z/guxYnz
https://alive2.llvm.org/ce/z/S5SYxY
https://alive2.llvm.org/ce/z/2rAWeW
Nikita Popov [Tue, 24 May 2022 08:04:24 +0000 (10:04 +0200)]
[InstCombine] Use different icmp pattern in test (NFC)
Use an and/or of icmp pattern that produces different code
depending on whether it is part of a logical or bitwise and/or.
Markus Lavin [Tue, 24 May 2022 07:42:07 +0000 (09:42 +0200)]
llvm-reduce: improve basic-blocks removal pass
When the single branch target of a block has been removed try updating
it to target a block that is kept (by scanning forward in the sequence)
instead of replacing the branch with a return instruction. Doing so
reduces the risk of breaking loop structures meaning that when the loop
is 'interesting' these reductions should have more blocks eliminated.
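The forward scan can be sketched as follows. This is a simplified model, not the llvm-reduce code: blocks are identified by their position in the function, `kept[i]` says whether block `i` survives, and `findKeptSuccessor` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Returns the first kept block after `removedTarget`, or std::nullopt if no
// later block is kept (a caller could then fall back to a return instruction).
inline std::optional<std::size_t>
findKeptSuccessor(const std::vector<bool>& kept, std::size_t removedTarget) {
  assert(removedTarget < kept.size() && "target must be a block of the function");
  for (std::size_t i = removedTarget + 1; i < kept.size(); ++i)
    if (kept[i])
      return i;
  return std::nullopt;
}
```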
Differential Revision: https://reviews.llvm.org/D125766
Nikita Popov [Wed, 18 May 2022 15:09:36 +0000 (17:09 +0200)]
[LoopUnroll] Freeze tripcount rather than condition
This is a followup to D125754. We introduce two branches, one
before the unrolled loop and one before the epilogue (and similar
for the prologue case). The previous patch only froze the
condition on the first branch.
Rather than independently freezing the second condition, this patch
instead freezes TripCount and bases BECount on it. These are the
two quantities involved in the conditions, and this ensures that
both work on a consistent, non-poisonous trip count.
Differential Revision: https://reviews.llvm.org/D125896
Lian Wang [Tue, 24 May 2022 07:12:31 +0000 (07:12 +0000)]
[LegalizeTypes][VP] Fix OpNo in WidenVecOp_VP_SCATTER
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D126276
Fraser Cormack [Thu, 19 May 2022 13:47:40 +0000 (14:47 +0100)]
[RISCV] Ensure the entire stack is aligned to the RVV stack alignment
This patch fixes another bug in the RVV frame lowering. While some frame
objects with non-default stack IDs (such as scalable-vector alloca
instructions) are considered in the target-independent max alignment
calculations, others (for example, during calling-convention lowering)
are not. This means we'd occasionally align the base of the stack to
only 16 bytes, with no way to ensure that the RVV section contained
within that is aligned to anything higher.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D125973
Fraser Cormack [Mon, 16 May 2022 09:57:32 +0000 (10:57 +0100)]
[RISCV] Fix RVV stack frame alignment bugs
This patch addresses several alignment issues in the stack frame when
RVV objects are taken into account.
One bug is that the RVV stack was never guaranteed to keep the alignment
of the stack *as a whole*. We must maintain a 16-byte aligned stack at
all times, especially when calling other functions. With the standard V
extension, this is conveniently happening since VLEN is at least 128 and
always 16-byte aligned. However, we support Zvl64b which does not
guarantee this. To fix this, the RVV stack size is rounded up to be
aligned to 16 bytes. This in practice generally makes us allocate a
stack sized at least 2*VLEN in size, and a multiple of 2.
|------------------------------| -- <-- FP
| 8-byte callee-save | | |
|------------------------------| | |
| one VLENB-sized RVV object | | |
|------------------------------| | |
| 8-byte local variable | | |
|------------------------------| -- <-- SP (must be aligned to 16)
In the example above, with Zvl64b we are decrementing SP by 24 bytes
which does not leave SP correctly aligned. We therefore introduce an
extra VLENB-sized amount used for alignment. This ensures that
the total stack size is 32 bytes (48 for Zvl128b, 80 for Zvl256b, etc):
|------------------------------| -- <-- FP
| 8-byte callee-save | | |
|------------------------------| | |
| one VLENB-sized padding obj | | |
| one VLENB-sized RVV object | | |
|------------------------------| | |
| 8-byte local variable | | |
|------------------------------| -- <-- SP
A new RVV invariant has been introduced in this patch, which is that the
base of the RVV stack itself is now always aligned to 16 bytes, not 8 as
before. This keeps us more in line with the scalar stack and should be
easier to reason about. The calculation of the RVV padding has thus
changed to be the amount required to align the scalar local variable
section to the RVV section's alignment. This amount is further rounded
up when setting up the initial stack to keep everything aligned:
|------------------------------| -- <-- FP
| 8-byte callee-save |
|------------------------------|
| |
| RVV objects |
| (aligned to at least 16) |
| |
|------------------------------|
| RVV padding of 8 bytes |
|------------------------------|
| 8-byte local variable |
|------------------------------| -- <-- SP
In the example above, it's clear that we need 8 bytes of padding to keep
the RVV section aligned to 16 when using SP. But to keep SP *itself*
aligned to 16 we can't decrement the initial stack pointer by 24 - we
have to round up to 32.
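The rounding itself is the usual power-of-two round-up. A minimal sketch, with `alignTo` defined locally for illustration rather than taken from the patch:

```cpp
#include <cstdint>

// Round `value` up to the next multiple of `align` (must be a power of two).
constexpr std::uint64_t alignTo(std::uint64_t value, std::uint64_t align) {
  return (value + align - 1) & ~(align - 1);
}

// The 24-byte decrement from the example becomes 32 to keep SP 16-byte aligned.
static_assert(alignTo(24, 16) == 32, "24 rounds up to 32");
// With Zvl64b (VLENB = 8), a single VLENB-sized RVV object rounds up to 16.
static_assert(alignTo(8, 16) == 16, "one 8-byte object occupies 16 bytes");
```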
With the RVV section correctly aligned, the second bug fixed by
this patch is that RVV objects themselves are now correctly aligned. We
were previously only guaranteeing an alignment of 8 bytes, even if they
required a higher alignment. This is relatively simple and in practice
we see more rounding up of VLEN amounts to account for alignment in
between objects:
|------------------------------|
| RVV object (aligned to 16) |
|------------------------------|
| no padding necessary |
|------------------------------|
| 2*VLENB RVV object (align 16)|
|------------------------------|
| VLENB alignment padding |
|------------------------------|
| RVV object (align 32) |
|------------------------------|
| 3*VLENB alignment padding |
|------------------------------|
| VLENB RVV object (align 32) |
|------------------------------| -- <-- base of RVV section
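The padding in the diagram can be reproduced with a small offset calculation. This is only a model: it assumes VLENB = 8, treats the objects whose size the diagram does not state as one VLENB each, and assigns offsets upward from the base of the RVV section, whereas the real frame lowering works in scalable units. `Object` and `layoutOffsets` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Object {
  std::uint64_t size;   // size in bytes
  std::uint64_t align;  // required alignment, a power of two
};

// Place objects one after another, padding each up to its own alignment.
inline std::vector<std::uint64_t> layoutOffsets(const std::vector<Object>& objs) {
  std::vector<std::uint64_t> offsets;
  std::uint64_t cursor = 0;
  for (const Object& o : objs) {
    assert(o.align != 0 && (o.align & (o.align - 1)) == 0 && "power of two");
    cursor = (cursor + o.align - 1) & ~(o.align - 1);  // alignment padding
    offsets.push_back(cursor);
    cursor += o.size;
  }
  return offsets;
}
```

For the objects in the diagram, listed from the base upward as `{8, 32}, {8, 32}, {16, 16}, {8, 16}`, this gives offsets 0, 32, 48 and 64, i.e. paddings of 3*VLENB, VLENB and none.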
Note that a lot of the regressions in codegen owing to the new alignment
rules are correct but actually only strictly necessary for Zvl64b (and
Zvl32b but that's not really supported). I plan a follow-up patch to
take the known VLEN into account when padding for alignment.
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D125787
LLVM GN Syncbot [Tue, 24 May 2022 05:44:48 +0000 (05:44 +0000)]
[gn build] Port 496156ac57da
Luo, Yuanke [Tue, 3 May 2022 10:57:25 +0000 (18:57 +0800)]
[X86][AMX] Multiple configure for AMX register.
The previous solution depended on variable names to record the shape
information. However, this is not reliable, because in release builds the
compiler does not set variable names. This can be worked around with the
additional option `fno-discard-value-names`, but that is not acceptable
for users.
This patch preconfigures the tile registers with machine
instructions. It follows the same approach as single configure. In the
future we can fall back to multiple configure when single configure
fails due to the shape dependency issue.
The algorithm to configure the tile registers is simple in this patch; we
may improve it in the future. It configures tile registers per basic
block. The compiler spills a tile register if it is live out of the basic
block. After the configure there should be no spill across a tile
configure during register allocation. Just like fast register allocation,
the algorithm walks the instructions in reverse order. When the shape
dependency is not met, it inserts ldtilecfg after the last instruction
that defines the shape.
In post configuration the compiler also walks the basic block to collect the
physical tile register numbers and generates instructions to fill the stack
slot with the corresponding shape information.
TODO: There is some follow-up work in D125602. The risk is that modifying
fast RA may cause regressions, as fast RA is used for different targets.
We may create an independent RA for tile registers.
Differential Revision: https://reviews.llvm.org/D125075
Chen Zheng [Tue, 19 Apr 2022 07:40:17 +0000 (03:40 -0400)]
[MachineSink] replace MachineLoop with MachineCycle
MachineCycle can handle irreducible loops. Natural loop
analysis (MachineLoop) cannot return the correct loop depth if
the loop is irreducible, and MachineSink is sensitive
to the loop depth; see MachineSinking::isProfitableToSinkTo().
This patch tries to use MachineCycle so that we can handle
irreducible loops better.
Reviewed By: sameerds, MatzeB
Differential Revision: https://reviews.llvm.org/D123995
Shraiysh Vaishay [Tue, 24 May 2022 04:23:33 +0000 (09:53 +0530)]
[OpenMP][IRBuilder] `omp task` support
This patch adds basic support for `omp task` to the OpenMPIRBuilder.
The outlined function after code extraction is called from a wrapper function with appropriate arguments. This wrapper function is passed to the runtime calls for task allocation.
This approach is different from the Clang approach - clang directly emits the runtime call to the outlined function. The outlining utility (OutlineInfo) simply outlines the code and generates a function call to the outlined function. After the function has been generated by the outlining utility, there is no easy way to alter the function arguments without meddling with the outlining itself. Hence the wrapper function approach is taken.
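The wrapper idea can be pictured in plain C++. Everything here is a hypothetical stand-in (`TaskEntry`, `outlinedBody`, `taskWrapper`, `runTask`); it shows only the adapter pattern, not the IR that OpenMPIRBuilder emits or the real runtime entry points.

```cpp
#include <cassert>
#include <cstdint>

// Signature a task runtime might expect for a task entry point (assumed).
using TaskEntry = std::int32_t (*)(std::int32_t threadId, void* task);

// Counts calls so the forwarding can be observed.
inline int& outlinedCallCount() {
  static int count = 0;
  return count;
}

// Stand-in for the outlined function; its signature is fixed by the outliner
// and does not match TaskEntry.
inline void outlinedBody() { ++outlinedCallCount(); }

// Wrapper with the runtime's signature that forwards to the outlined function.
inline std::int32_t taskWrapper(std::int32_t /*threadId*/, void* /*task*/) {
  outlinedBody();
  return 0;
}

// Stand-in for a runtime that later invokes the entry point it was given.
inline std::int32_t runTask(TaskEntry entry) {
  assert(entry != nullptr);
  return entry(0, nullptr);
}
```

The point of the adapter is that the outlined function's signature stays untouched; only the wrapper has to match what the runtime calls.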
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D71989
Peter Klausler [Tue, 17 May 2022 01:10:27 +0000 (18:10 -0700)]
[flang] Allow more forward references to ENTRY names
Forward references to ENTRY names to pass them as actual procedure arguments
don't work in all cases, exposing some basic ordering problems in
name resolution for these symbols. Refactor; create all the
necessary procedure symbols, and either function result or host association
symbols (for subroutines), at the time that the subprogram scope is
created, so that the names exist in the scope as text "before"
the ENTRY is processed in name resolution. Some processing
remains in PostEntryStmt() so that we can check that an ENTRY with
an explicit distinct RESULT doesn't also have declarations for the
ENTRY name.
Differential Revision: https://reviews.llvm.org/D126142
Sotiris Apostolakis [Tue, 24 May 2022 04:02:00 +0000 (00:02 -0400)]
Revert "[SelectOpti][5/5] Optimize select-to-branch transformation"
This reverts commit a111fb960108df910a864500f3b98d75d37f083c.
Sotiris Apostolakis [Tue, 24 May 2022 03:04:20 +0000 (23:04 -0400)]
[SelectOpti][5/5] Optimize select-to-branch transformation
This patch optimizes the transformation of selects to a branch when the heuristics deem it profitable.
It aggressively sinks eligible instructions to the newly created true/false blocks to prevent their
execution on the common path and interleaves dependence slices to maximize ILP.
Depends on D120232
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D120233
Brad Smith [Tue, 24 May 2022 03:26:14 +0000 (23:26 -0400)]
[Hexagon] Fix test on OpenBSD
The test specifies a CPU arch but not a particular OS. So if run on
OpenBSD it acts as if it's an OpenBSD/hexagon system. OpenBSD uses
__guard_local instead of __stack_chk_guard, so the test fails. Specifying
an OS other than OpenBSD fixes the test.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D126265
Hyoun Kyu Cho [Tue, 24 May 2022 03:04:40 +0000 (03:04 +0000)]
Exposes interface to free up caching data structure in DWARFDebugLine and DWARFUnit for memory management
These are the minimal changes extracted from https://reviews.llvm.org/D78950. The old patch tried to add LRU eviction of the caching data structures. Because users could be going through multiple layers of interfaces, it was not clear where to put the functionality. While we work out where to put it, it would be useful to add this minimal interface change so that users can implement their own memory management. More specifically:
* Add a clearLineTable method for DWARFDebugLine which erases the given offset from the LineTableMap.
* DWARFDebugContext adds the clearLineTableForUnit method that leverages clearLineTable to remove the object corresponding to a given compile unit, for memory management purposes. When it is referred to again, the line table object will be repopulated.
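The clear-then-repopulate behaviour can be modelled with a small offset-keyed cache. `LineTableCache` and its members are hypothetical and far simpler than the LLVM classes; the string value stands in for a parsed line table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

class LineTableCache {
public:
  // Return the cached table for `offset`, "parsing" it on first use.
  const std::string& getOrParse(std::uint64_t offset) {
    auto it = tables_.find(offset);
    if (it == tables_.end()) {
      ++parseCount_;
      it = tables_.emplace(offset, "table@" + std::to_string(offset)).first;
    }
    return it->second;
  }
  // Drop the entry for `offset`; a later getOrParse() repopulates it.
  void clear(std::uint64_t offset) { tables_.erase(offset); }
  std::size_t size() const { return tables_.size(); }
  int parseCount() const { return parseCount_; }

private:
  std::map<std::uint64_t, std::string> tables_;
  int parseCount_ = 0;
};

// Usage: clearing an entry releases it; the next lookup parses it again.
inline void demoClearAndReparse() {
  LineTableCache cache;
  cache.getOrParse(0x40);
  cache.getOrParse(0x40);  // served from the cache
  assert(cache.parseCount() == 1);
  cache.clear(0x40);
  assert(cache.size() == 0);
  cache.getOrParse(0x40);  // repopulated on demand
  assert(cache.parseCount() == 2);
}
```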
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D90006
usama hameed [Mon, 23 May 2022 23:05:15 +0000 (16:05 -0700)]
updated canResolveToExpr to accept both statements and expressions. Removed unnecessary code
usama hameed [Mon, 23 May 2022 21:52:14 +0000 (14:52 -0700)]
bugfix in InfiniteLoopCheck to not print warnings for unevaluated loops
Added a separate check for unevaluated statements. Updated InfiniteLoopCheck to use new check
Differential Revision: https://reviews.llvm.org/D126246
usama hameed [Thu, 19 May 2022 23:51:34 +0000 (16:51 -0700)]
bugfix in InfiniteLoopCheck to not print warnings for unevaluated loops
Differential Revision: https://reviews.llvm.org/D126034
Wolfgang Pieb [Tue, 24 May 2022 00:08:01 +0000 (17:08 -0700)]
[NFC][Metadata] Define move constructor and move assignment operator for MDOperand.
This is a preparatory patch for the MDNode resize functionality.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D125994
Sotiris Apostolakis [Tue, 24 May 2022 02:05:41 +0000 (22:05 -0400)]
[SelectOpti][4/5] Loop Heuristics
This patch adds the loop-level heuristics for determining whether branches are more profitable than conditional moves.
These heuristics apply to only inner-most loops.
Depends on D120231
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D120232
Sotiris Apostolakis [Mon, 23 May 2022 20:26:09 +0000 (16:26 -0400)]
[SelectOpti][3/5] Base Heuristics
This patch adds the base heuristics for determining whether branches are more profitable than conditional moves.
Base heuristics apply to all code apart from inner-most loops.
Depends on D122259
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D120231
Vy Nguyen [Tue, 24 May 2022 00:59:18 +0000 (07:59 +0700)]
[lld-macho][nfc] Run clang-format on lld/MachO/*.{h,cpp}
- fixed inconsistent indents and spaces
- prevent extraneous formatting changes in other patches
Differential Revision: https://reviews.llvm.org/D126262
Peter Klausler [Wed, 11 May 2022 21:32:59 +0000 (14:32 -0700)]
[flang] Ignore BIND(C) binding name conflicts of inner procedures
The binding names of inner procedures with BIND(C) are not exposed
to the loader and should be ignored for potential conflict errors.
Differential Revision: https://reviews.llvm.org/D126141
Peter Klausler [Wed, 11 May 2022 21:13:50 +0000 (14:13 -0700)]
[flang] Allow global scope names that clash with intrinsic modules
Intrinsic module names are not in the user's namespace, so they
are free to declare global names that conflict with intrinsic
modules.
Differential Revision: https://reviews.llvm.org/D126140
Xeonacid [Tue, 24 May 2022 00:58:23 +0000 (02:58 +0200)]
[RISCV] Make old JIT ExecutionEngine tests unsupported
Mark the old JIT ExecutionEngine tests as unsupported for RISCV, as is already done for many other architectures.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D126188
Peter Klausler [Wed, 11 May 2022 20:15:59 +0000 (13:15 -0700)]
[flang] Fix character length calculation for Unicode component
The character length value in the derived type component information table
entry is already in units of characters, not bytes, so don't divide by the
per-character byte size.
Differential Revision: https://reviews.llvm.org/D126139
Sam Clegg [Fri, 20 May 2022 21:39:33 +0000 (14:39 -0700)]
[lld][WebAssembly] Allow use of statically allocated TLS region.
It turns out we were already allocating static address space for TLS
data along with the non-TLS static data, but this space was going
unused/ignored.
With this change, we include the TLS segment in `__wasm_init_memory`
(which does the work of loading the passive segments into memory when a
module is first loaded). We also set the `__tls_base` global to point
to the start of this segment.
This means that the runtime can use this static copy of the TLS data for
the first/primary thread if it chooses, rather than doing a runtime
allocation prior to calling `__wasm_init_tls`.
Practically speaking, this will allow emscripten to avoid dynamic
allocation of TLS region on the main thread.
Differential Revision: https://reviews.llvm.org/D126107
Hendrik Greving [Fri, 13 May 2022 17:53:13 +0000 (10:53 -0700)]
[BasicBlockUtils] Do not move loop metadata if outer loop header.
Fixes a bug by preventing the loop's metadata from being moved to an outer
loop's header, which happens if the loop's exit is also the header of an outer loop.
Adjusts test for above.
Fixes #55416.
Differential Revision: https://reviews.llvm.org/D125574
Hendrik Greving [Mon, 16 May 2022 14:34:04 +0000 (07:34 -0700)]
[BasicBlockUtils] Add corner case test for loop metadata.
Adds a test to expose #55416.
Differential Revision: https://reviews.llvm.org/D125696
Mehdi Amini [Mon, 16 May 2022 10:33:00 +0000 (10:33 +0000)]
Apply clang-tidy fixes for modernize-use-bool-literals in Parser.cpp (NFC)
Mehdi Amini [Mon, 16 May 2022 10:24:43 +0000 (10:24 +0000)]
Apply clang-tidy fixes for modernize-use-override in SparseTensorUtils.cpp (NFC)
Mehdi Amini [Mon, 16 May 2022 10:09:28 +0000 (10:09 +0000)]
Apply clang-tidy fixes for performance-unnecessary-value-param in Utils.cpp (NFC)
Vitaly Buka [Mon, 23 May 2022 22:56:35 +0000 (15:56 -0700)]
[test][clang] Move -O3 in command line
Jamie Schmeiser [Thu, 7 Oct 2021 19:02:19 +0000 (15:02 -0400)]
Add new hidden option -print-on-crash that prints out IR that caused opt pipeline to crash
A new hidden option -print-on-crash that prints the IR as it was upon entering
the last pass when there is a crash.
The IR is saved in its print form before each pass is started and a
signal handler is registered. If the compilation crashes, the signal
handler will print the saved IR to dbgs(). This option
can be modified using -print-module-scope to get the IR for the complete
module. Note that this option only works with the new pass manager.
Reviewed By: yrouban
Differential Revision: https://reviews.llvm.org/D86657
Tom Stellard [Mon, 23 May 2022 22:09:26 +0000 (15:09 -0700)]
github: Switch release PR repository to llvm/llvm-project-release-prs
As discussed in https://discourse.llvm.org/t/creating-a-new-repository-for-release-branch-pull-requests/61339
Reviewed By: asl
Differential Revision: https://reviews.llvm.org/D125851
Alex Brachet [Mon, 23 May 2022 21:47:22 +0000 (21:47 +0000)]
[libc][docs] Use same formatting for headers in source_layout
utils looks different from the other directory names
in the docs, see
https://libc.llvm.org/source_layout.html#the-utils-directory
Differential revision: https://reviews.llvm.org/D126211
NAKAMURA Takumi [Sat, 21 May 2022 23:52:03 +0000 (08:52 +0900)]
[TableGen] emitStringLiteralDef: Pad trailing '\0' at the end of char array.
Fixup for https://reviews.llvm.org/D73044
A string literal has an implicit terminator '\0'. This commit adjusts the char
array to match the long literal.
The missing terminator caused a difference in artifacts between
-long-string-literals=true and false.
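The size difference is easy to see in plain C++. The array names below are made up for illustration and are unrelated to what TableGen emits.

```cpp
// A string literal carries an implicit terminator; a brace-initialized char
// array only contains the characters that are written out.
constexpr char asLiteral[] = "abc";
constexpr char asArrayNoTerminator[] = {'a', 'b', 'c'};
constexpr char asArrayPadded[] = {'a', 'b', 'c', '\0'};

static_assert(sizeof(asLiteral) == 4, "literal includes the trailing '\\0'");
static_assert(sizeof(asArrayNoTerminator) == 3, "no implicit terminator");
static_assert(sizeof(asArrayPadded) == sizeof(asLiteral),
              "padding the array makes both forms the same size");
```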
Differential Revision: https://reviews.llvm.org/D126136
Jeffrey Tan [Mon, 23 May 2022 17:17:44 +0000 (10:17 -0700)]
Fix lldb-vscode frame test failure
The previous patch (https://reviews.llvm.org/D126013) added a new "optimized"
attribute to the DAP stack frame. This caused some tests, like
lldb-vscode/coreFile/TestVSCode_coreFile.py,
to fail because the tests explicitly check for all attributes.
To fix the test failure I decided to remove this attribute.
Differential Revision: https://reviews.llvm.org/D126225
NAKAMURA Takumi [Sat, 21 May 2022 23:56:30 +0000 (08:56 +0900)]
emitStringLiteralDef: Return earlier here. NFC.
Differential Revision: https://reviews.llvm.org/D126135
Mitch Phillips [Mon, 23 May 2022 20:11:01 +0000 (13:11 -0700)]
[symbolizer] Parse DW_TAG_variable DIs to show line info for globals
Currently, llvm-symbolizer doesn't like to parse .debug_info in order to
show the line info for global variables. addr2line does this. In the
future, I'm looking to migrate AddressSanitizer off of internal metadata
over to using debuginfo, and this is predicated on being able to get the
line info for global variables.
This patch adds the requisite support for getting the line info from the
.debug_info section for symbolizing global variables. This only happens
when you ask for a global variable to be symbolized as data.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D123538
Sotiris Apostolakis [Mon, 23 May 2022 14:47:32 +0000 (10:47 -0400)]
[SelectOpti][2/5] Select-to-branch base transformation
This patch implements the actual transformation of selects to branches.
It includes only the base transformation without any sinking.
Depends on D120230
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D122259
Qunyan Mangus [Mon, 23 May 2022 19:48:06 +0000 (12:48 -0700)]
Remove duplicate fields in RAGreedy
RAGreedy has two fields of type RegisterClassInfo: one called RCI, and another, RegClassInfo, inherited from its base class.
RCI is initialized without calling freezeReservedRegs first, while RegClassInfo is initialized after it. Therefore, if the reserved-register
information changes between the last call to freezeReservedRegs and RAGreedy, the change is not picked up by RCI.
Instead of having both fields in RAGreedy, remove RCI and use RegClassInfo. Also removed is the TRI field,
which is already present in the base class.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D125926
Paul Robinson [Mon, 23 May 2022 19:49:20 +0000 (12:49 -0700)]
[PS5] Make driver's PIC behavior match PS4
The new test is a copy of the corresponding PS4 test, with the triple
etc updated, because there's currently no good way to make one lit test
"iterate" with multiple targets.
Louis Dionne [Mon, 23 May 2022 19:36:35 +0000 (15:36 -0400)]
[libc++] Remove duplicate tests for callable concepts
This is essentially a revert of c7ad02009. Indeed, it seems that both
96dbdd75 and c7ad02009 were committed, but c7ad02009 seems to be only
an older version of 96dbdd75's tests.
Stella Stamenova [Mon, 23 May 2022 19:38:02 +0000 (12:38 -0700)]
[mlir] Use 'native' instead of 'llvm_has_native_target' in the mlir tests
The tests actually require the target triple to match the host, rather than just having the host in the list of available targets. This change removes `llvm_has_native_target` and instead uses the `native` feature from the lit configuration.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D126011
Florian Hahn [Mon, 23 May 2022 19:27:42 +0000 (20:27 +0100)]
[AArch64] Add tests with free shuffles for indexed fma variants.
The new tests contain examples where shuffles are free, because indexed
fma instructions can be used.
Paul Walker [Mon, 23 May 2022 18:07:10 +0000 (19:07 +0100)]
[SVEInstrFormats] Ensure scatter instructions are named consistently.
Alexey Bataev [Mon, 23 May 2022 15:09:55 +0000 (08:09 -0700)]
[SLP][NFC]Improve compile time, NFC.
Builds the UserIgnore list only once as a SmallDenseSet without rebuilding
it between the runs, iterates over gathers instead of the list of reduction ops,
and does some checks in buildTree_rec only if the corresponding containers
are not empty.
Julian Lettner [Mon, 23 May 2022 18:32:38 +0000 (11:32 -0700)]
[Sanitizer][Darwin] Add explanation for Apple platform macros
Differential Revision: https://reviews.llvm.org/D126229
LLVM GN Syncbot [Mon, 23 May 2022 18:52:16 +0000 (18:52 +0000)]
[gn build] Port eebc1fb772c5
Nikolas Klauser [Sun, 22 May 2022 11:43:37 +0000 (13:43 +0200)]
[libc++] Add ranges::max_element to the synopsis and ADL-proof the __min_element_impl calls
Reviewed By: ldionne, #libc
Spies: sstefan1, libcxx-commits
Differential Revision: https://reviews.llvm.org/D126167
Nikolas Klauser [Sun, 22 May 2022 11:34:22 +0000 (13:34 +0200)]
[libc++] Add auto to the list of required extensions in C++03
We use `auto` in C++03, so we shouldn't say that we don't.
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D126165
Nikolas Klauser [Fri, 20 May 2022 15:11:58 +0000 (17:11 +0200)]
[libc++] Assume that push_macro and pop_macro are available
All compilers that libc++ supports support `push_macro` and `pop_macro`, so let's remove the availability check.
Reviewed By: ldionne, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D126073
Nikolas Klauser [Thu, 12 May 2022 13:46:18 +0000 (15:46 +0200)]
[libc++] Always enable the ranges concepts
The ranges concepts were already available in libc++13, so we shouldn't guard them with `_LIBCPP_HAS_NO_INCOMPLETE_RANGES`.
Fixes https://github.com/llvm/llvm-project/issues/54765
Reviewed By: #libc, ldionne
Spies: ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D124011
Nikolas Klauser [Fri, 20 May 2022 21:31:13 +0000 (23:31 +0200)]
[libc++] Granularize parts of <type_traits>
`<type_traits>` is quite a large header, so I'll granularize it in a few steps.
Reviewed By: ldionne, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D124755
Jorge Gorbe Moya [Mon, 23 May 2022 18:29:10 +0000 (11:29 -0700)]
Remove `friend` classes from TypeCategoryMap
As far as I can tell, the only thing those friend classes access is the
`ValueSP` typedef.
Given that this is a map-ish class, with "Map" in its name, it doesn't
seem like a stretch to make `KeyType`, `ValueType` and `ValueSP` public.
More so when the public methods of the class have `KeyType` and
`ValueSP` arguments and clearly `ValueSP` needs to be accessed from the
outside.
`friend` complicates local reasoning about who can access private
members, which is valuable in a class like this that has every method
locking a mutex to prevent concurrent access.
Differential Revision: https://reviews.llvm.org/D126103
natashaknk [Mon, 23 May 2022 17:58:33 +0000 (10:58 -0700)]
[mlir][tosa] Change tosa.depthwise_conv2d's ending reshape to a collapse.
TOSA's depthwise_conv2d operation includes a reshape to add the implicit x1 dimension.
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D126212
Julian Lettner [Mon, 23 May 2022 18:18:15 +0000 (11:18 -0700)]
[Sanitizer][Darwin] Add SANITIZER_DRIVERKIT platform macro
Sanjay Patel [Mon, 23 May 2022 17:31:00 +0000 (13:31 -0400)]
[IR] add and use pattern match specialization for sqrt intrinsic; NFC
This was included in D126190 originally, but it's
independent and a useful change for readability.
Craig Topper [Mon, 23 May 2022 05:38:04 +0000 (22:38 -0700)]
[DAGCombiner][AArch64] Don't fold (smulo x, 2) -> (saddo x, x) if VT is i2.
If the VT is i2, then 2 is really -2.
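A quick way to see the problem is to model i2 arithmetic with ordinary integers. The helpers below are hypothetical and only model the overflow flags, not the DAGCombiner logic.

```cpp
// Read the low two bits of `v` as a signed i2 value (range -2..1).
constexpr int asI2(unsigned v) {
  return (v & 3u) >= 2u ? int(v & 3u) - 4 : int(v & 3u);
}

// True if `v` is representable in a signed 2-bit integer.
constexpr bool fitsInI2(int v) { return v >= -2 && v <= 1; }

// Overflow flags of smulo/saddo, computed on the exact mathematical result.
constexpr bool smuloOverflows(int x, int y) { return !fitsInI2(x * y); }
constexpr bool saddoOverflows(int x, int y) { return !fitsInI2(x + y); }

// The constant 2 has the bit pattern 0b10, which is -2 as an i2.
static_assert(asI2(2) == -2, "2 wraps to -2 in i2");
// For x = 1: 1 * -2 = -2 fits, but 1 + 1 = 2 does not, so the flags differ.
static_assert(!smuloOverflows(1, asI2(2)), "smulo does not overflow");
static_assert(saddoOverflows(1, 1), "saddo overflows");
```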
Test has not been committed yet, but the diff shows the change.
Fixes PR55644.
Differential Revision: https://reviews.llvm.org/D126213
Craig Topper [Mon, 23 May 2022 05:32:16 +0000 (22:32 -0700)]
[AArch64] Add test case for pr55644. NFC
Dave Lee [Mon, 23 May 2022 18:00:22 +0000 (11:00 -0700)]
[lldb] Specify arguments of `image list`
Register positional argument details in `CommandObjectTargetModulesList`.
I recently learned that `image list` takes a module name, but the help info
does not indicate this. With this change, `help image list` will show that it
accepts zero or more module names.
This makes it easier to get info about specific modules, without having to
find/grep through the full image list.
Reviewed By: DavidSpickett
Differential Revision: https://reviews.llvm.org/D125154
Stephen Long [Mon, 23 May 2022 14:01:55 +0000 (07:01 -0700)]
[MSVC, ARM64] Add __readx18 intrinsics
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170
unsigned char __readx18byte(unsigned long)
unsigned short __readx18word(unsigned long)
unsigned long __readx18dword(unsigned long)
unsigned __int64 __readx18qword(unsigned long)
Given the lack of documentation of the intrinsics, we chose to align the offset with just
`CharUnits::One()` when calling `IRBuilderBase::CreateAlignedLoad()`
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126024
Dave Lee [Wed, 18 May 2022 23:31:49 +0000 (16:31 -0700)]
[lldb] Improve formatting of dlopen error messages (NFC)
Ensures there's a space between "utility" and "function", which also makes
it easier to grep/search for "utility function".
While making this change, I also re-formatted the other dlopen error messages
(with clang-format). This fixes other instances of spaces missing between words,
and makes each of these strings fit on a single line, making them greppable.
Differential Revision: https://reviews.llvm.org/D126078
Benjamin Kramer [Mon, 23 May 2022 17:53:40 +0000 (19:53 +0200)]
Fix an unused variable warning in no-asserts build mode
Paul Robinson [Mon, 23 May 2022 17:43:12 +0000 (10:43 -0700)]
[PS5] Disable a test, same as PS4
Philip Reames [Mon, 23 May 2022 17:10:08 +0000 (10:10 -0700)]
[RISCV] Add basic fault-first load coverage for VSETVLI insertion
Simplified version of a test taken from D123581.
Jeffrey Tan [Tue, 17 May 2022 16:21:10 +0000 (09:21 -0700)]
Show error message for optimized variables
This fixes an issue where the error message for optimized variables is not
shown to end users in lldb-vscode.
Differential Revision: https://reviews.llvm.org/D126014
Jeffrey Tan [Tue, 17 May 2022 16:17:26 +0000 (09:17 -0700)]
Add [opt] suffix to optimized stack frame in lldb-vscode
To help users identify optimized code, this diff adds a "[opt]" suffix to
optimized stack frames in lldb-vscode. This provides an experience consistent
with command-line lldb.
It also adds a new "optimized" attribute to the DAP stack frame object so that
optimized frames are easier to identify from telemetry than by parsing the trailing "[opt]".
Differential Revision: https://reviews.llvm.org/D126013
Fangrui Song [Mon, 23 May 2022 16:58:54 +0000 (09:58 -0700)]
[llvm-nm][docs] Document -W and -U
Latest GNU nm (milestone: 2.39) has added -W/--no-weak and changed -U to mean
--defined-only (instead of --unicode=). The changes match our semantics.
Close #55297
Reviewed by: jhenderson, keith
Differential Revision: https://reviews.llvm.org/D126133
Christopher Bate [Tue, 17 May 2022 21:42:47 +0000 (15:42 -0600)]
[mlir][NvGpuToNVVM] Fix byte size calculation in async copy lowering
AsyncCopyOp lowering converted "size in elements" to "size in bytes"
assuming the element type size is at least one byte. This removes
that restriction, allowing for types such as i4 and b1 to be handled
correctly.
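A minimal sketch of the idea, assuming the size is derived from the element
bit width rather than a per-element byte count. The helper name `sizeInBytes`
is hypothetical and this is not the code from the patch; it also assumes the
total bit count is a multiple of 8:
```cpp
#include <cstdint>

// Hypothetical helper, not the actual lowering code: converts a copy size
// given in elements to a size in bytes using the element bit width.
constexpr std::int64_t sizeInBytes(std::int64_t numElements,
                                   std::int64_t elementBitWidth) {
  // Multiply first, then divide. Dividing the bit width by 8 first would
  // truncate to 0 for sub-byte element types such as i4.
  return (numElements * elementBitWidth) / 8;
}

// Byte-sized and sub-byte element types go through the same formula.
static_assert(sizeInBytes(4, 32) == 16, "4 x 32-bit elements are 16 bytes");
static_assert(sizeInBytes(32, 4) == 16, "32 x 4-bit elements are 16 bytes");
```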
Differential Revision: https://reviews.llvm.org/D125838
Matthias Springer [Mon, 23 May 2022 16:49:45 +0000 (18:49 +0200)]
[mlir][bufferize][NFC] Update One-Shot Bufferize pass documentation
Differential Revision: https://reviews.llvm.org/D125637
Christopher Bate [Fri, 20 May 2022 20:41:55 +0000 (14:41 -0600)]
[mlir][NvGpuToNVVM] Fix missing i4 support for nvgpu.mma.sync
This change adds missing support for the i4 data type. Tests are added
to ensure proper lowering of an nvgpu.mma.sync operation targeting the
16x8x64xi4 and 16x8x32xi4 MMA variants in the NVVM dialect.
Differential Revision: https://reviews.llvm.org/D126092
Matthias Springer [Mon, 23 May 2022 16:37:26 +0000 (18:37 +0200)]
[mlir][bufferize] Support fully dynamic layout maps in BufferResultsToOutParams
Also fixes integration of the pass into One-Shot Bufferize and adds additional test cases.
BufferResultsToOutParams can be used with "identity-layout-map" and "fully-dynamic-layout-map". "infer-layout-map" is not supported.
Differential Revision: https://reviews.llvm.org/D125636
Jonas Devlieghere [Mon, 23 May 2022 16:07:54 +0000 (09:07 -0700)]
[lldb] Fix should_skip_simulator_test decorator
Currently simulator tests get skipped when the reported platform is
macosx rather than darwin. Update the decorator to match both.
Matthias Springer [Mon, 23 May 2022 16:10:12 +0000 (18:10 +0200)]
[mlir][bufferization] Fix Python bindings
Differential Revision: https://reviews.llvm.org/D126179
Nathan Sidwell [Tue, 22 Mar 2022 17:49:08 +0000 (10:49 -0700)]
[clang] Module global init mangling
C++20 modules require emission of an initializer function, which is
called by importers of the module. This implements the mangling for
that function. It is the one place the ABI exposes partition names in
symbols -- but fortunately only needed by other TUs of that same module.
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D122741
Stella Laurenzo [Sun, 22 May 2022 04:30:01 +0000 (21:30 -0700)]
NFC: Silence two warnings for unused bufferization symbols in release mode.
Differential Revision: https://reviews.llvm.org/D126182
Richard [Fri, 20 May 2022 23:19:11 +0000 (17:19 -0600)]
[clang-tidy] Improve add_new_check.py to recognize more checks
When looking for whether or not a check provides fixits, the script
examines the implementation of the check. Some checks are not
implemented in source files that correspond one-to-one with the check
name, e.g. cert-dcl21-cpp. So if we can't find the check implementation
directly from the check name, open up the corresponding module file and
look for the class name that is registered with the check. Then consult
the file corresponding to the class name.
Some checks are derived from a base class that implements fixits. So if
we can't find fixits in the implementation file for a check, scrape out
the name of its base class. If it's not ClangTidyCheck, then consult
the base class implementation to look for fixit support.
Differential Revision: https://reviews.llvm.org/D126134
Fixes #55630
Nikita Popov [Mon, 23 May 2022 15:29:33 +0000 (17:29 +0200)]
[InstCombine] Change operand order in recursive and/or of icmps fold
The order obviously doesn't matter for bitwise and/or, but would
matter for logical and/or, so change it to preserve the original
order.
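As a rough C++ analogy only (this is not LLVM IR, and `isPositive` is a
hypothetical name): a logical and evaluates its right operand only when the
left one is true, so the left operand can guard the right one, and swapping
them would remove that guard.
```cpp
// Logical and: the dereference on the right is only evaluated when the null
// check on the left succeeds. Writing the operands in the opposite order
// would dereference the pointer before checking it.
constexpr bool isPositive(const int *p) {
  return p != nullptr && *p > 0;
}
```
With a bitwise and, both operands are always evaluated, which is why the
order is irrelevant there.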
Nikita Popov [Mon, 23 May 2022 15:24:19 +0000 (17:24 +0200)]
[InstCombine] Add tests for recursive and/or of icmp folds (NFC)
Add variations with bitwise and logical and/or, as well as
commuted operands.
Jingu Kang [Mon, 23 May 2022 11:33:48 +0000 (12:33 +0100)]
Revert "Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth""
This reverts commit
42ebfa8269470e6b1fe2de996d3f1db6d142e16a.
The commit from https://reviews.llvm.org/D125918 has fixed the stage 2 build
failure.
Differential Revision: https://reviews.llvm.org/D118979
Matthias Springer [Mon, 23 May 2022 14:53:17 +0000 (16:53 +0200)]
[mlir][bufferization][NFC] Improve assembly format of AllocTensorOp
No longer pass static dim sizes as an attribute. This was redundant and required extra checks in the verifier. This change also makes the op symmetrical to memref::AllocOp.
Differential Revision: https://reviews.llvm.org/D126178
PeixinQiao [Mon, 23 May 2022 14:50:06 +0000 (22:50 +0800)]
[NFC][flang] Change the OpenMP atomic read/write test cases
Remove the integration tests and rename the file.
Reviewed By: shraiysh, NimishMishra
Differential Revision: https://reviews.llvm.org/D126169
Alexander Belyaev [Mon, 23 May 2022 14:29:02 +0000 (16:29 +0200)]
[mlir] Add Expm1 to ComplexOps.td.
Differential Revision: https://reviews.llvm.org/D126206
Jay Foad [Mon, 23 May 2022 14:18:34 +0000 (15:18 +0100)]
[TableGen] Remove an untrue statement from the docs
You can't use foreach in a record body. This was a mistake in the
documentation dating from when it was first written in D85838.
Alexander Belyaev [Mon, 23 May 2022 14:10:20 +0000 (16:10 +0200)]
[mlir] Add RSqrt to ComplexOps.td.
Differential Revision: https://reviews.llvm.org/D126202
Alexey Bataev [Wed, 4 Aug 2021 17:58:37 +0000 (10:58 -0700)]
[SLP]Do not emit extract elements for insertelements users, replace with shuffles directly.
The SLP vectorizer emits extracts for externally used vectorized scalars and
estimates the cost for each such extract. But in many cases these
scalars are inputs to insertelement instructions, forming a buildvector,
and instead of an extractelement/insertelement pair we can estimate the
cost of shuffle(s) and generate a series of shuffles, which can be
further optimized.
Tested using test-suite (+SPEC2017): the tests passed, SLP was able to
generate/vectorize more instructions in many cases, and the number of
re-vectorization attempts (where we could try to vectorize
buildvector insertelements again and again) was reduced.
Differential Revision: https://reviews.llvm.org/D107966
Stephen Long [Mon, 23 May 2022 14:00:54 +0000 (07:00 -0700)]
[MSVC, ARM64] Add __writex18 intrinsics
https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170
void __writex18byte(unsigned long, unsigned char)
void __writex18word(unsigned long, unsigned short)
void __writex18dword(unsigned long, unsigned long)
void __writex18qword(unsigned long, unsigned __int64)
Given the lack of documentation of the intrinsics, we chose to align the offset with just
`CharUnits::One()` when calling `IRBuilderBase::CreateAlignedStore()`.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D126023
Sanjay Patel [Mon, 23 May 2022 13:26:43 +0000 (09:26 -0400)]
[InstCombine] fold icmp of zext bool based on limited range
X <u (zext i1 Y) --> (X == 0) && Y
https://alive2.llvm.org/ce/z/avQDRY
This is a generalization of
4069cccf3b4ff4a based on the post-commit suggestion.
This also adds the i1 type check and tests that were missing from the earlier
attempt; that commit caused several bot fails and was reverted.
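The fold can be sanity-checked outside of LLVM with a small exhaustive test.
This is only an illustrative sketch that models X as an 8-bit unsigned value;
the function names are hypothetical and the Alive2 link above remains the
actual proof:
```cpp
#include <cstdint>

// Left-hand side of the fold: X <u (zext i1 Y), modelled on 8-bit values.
constexpr bool original(std::uint8_t x, bool y) {
  return x < static_cast<std::uint8_t>(y ? 1 : 0);
}

// Right-hand side of the fold: (X == 0) && Y.
constexpr bool folded(std::uint8_t x, bool y) {
  return x == 0 && y;
}

// Compares both forms for every 8-bit X and both values of Y.
constexpr bool foldHoldsForAllI8() {
  for (int x = 0; x <= 255; ++x)
    for (int y = 0; y <= 1; ++y)
      if (original(static_cast<std::uint8_t>(x), y != 0) !=
          folded(static_cast<std::uint8_t>(x), y != 0))
        return false;
  return true;
}

static_assert(foldHoldsForAllI8(), "both forms agree for all i8 inputs");
```
The zext of an i1 can only be 0 or 1, so the unsigned comparison can only be
true when Y is 1 and X is 0.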
Differential Revision: https://reviews.llvm.org/D126171
Sanjay Patel [Sun, 22 May 2022 17:16:19 +0000 (13:16 -0400)]
[InstCombine] add tests for icmp of zext i1; NFC
Alexey Bataev [Mon, 23 May 2022 13:43:02 +0000 (06:43 -0700)]
[SLP][NFC]Add a test for extracting scalar from undef result vector,
NFC.
Nikita Popov [Mon, 23 May 2022 13:12:15 +0000 (15:12 +0200)]
[InstCombine] Reuse icmp of and/or folds for logical and/or
Similarly to a change recently done for fcmps, add a flag that
indicates whether the and/or is logical to foldAndOrOfICmps, and
reuse the function when folding logical and/or.
We were already calling some parts of it, but this gives us a
clearer indication of which parts may need poison-safe variants,
and would also allow us to fold combinations of bitwise and logical
and/or.
This change should be close to NFC, because all folds this enables
were either already called previously, or can make use of implied
poison reasoning.
Anastasia Stulova [Mon, 23 May 2022 13:03:54 +0000 (14:03 +0100)]
[SPIR-V] Allow setting SPIR-V version via target triple.
Currently added versions are from v1.0 to v1.5, other versions
can be added as needed.
This change also adds documentation about SPIR-V target support
in LLVM.
Differential Revision: https://reviews.llvm.org/D124776
Timm Bäder [Mon, 23 May 2022 13:22:27 +0000 (15:22 +0200)]
Revert "[clang][driver] Dynamically select gcc-toolset/devtoolset version"
This reverts commit
8717b492dfcd12d6387543a2f8322e0cf9059982.
The new unittest fails on Windows buildbots, e.g.
https://lab.llvm.org/buildbot/#/builders/119/builds/8647