Samuel Parker [Tue, 17 Jan 2023 10:34:43 +0000 (10:34 +0000)]
[NFC][WebAssembly] Update test
Run update_llc_test_checks.py on address-offsets.ll
Jean Perier [Tue, 17 Jan 2023 10:24:40 +0000 (11:24 +0100)]
[flang][hlfir] Add hlfir.destroy operation.
Add the operation to mark the end of life of hlfir.expr.
As described in its description, this is the easiest solution
to deploy given that lowering "knows" where expression values are last
used.
However, inserting these points in lowering will probably make
it harder to do some IR transformation that would move the code
using or creating hlfir.expr (no use should be moved after an
hlfir.destroy).
Once the dust settles with the HLFIR change, it will be worth assessing
the situation and seeing if an analysis could do a better and safer job at
finding those destruction points.
Depends on D141832
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D141839
Jean Perier [Tue, 17 Jan 2023 10:22:33 +0000 (11:22 +0100)]
[flang][hlfir] Add move semantics to hlfir.as_expr.
hlfir.as_expr allows turning an array, character, or derived type
variable into a value when the usage requires an hlfir.expr (e.g.,
when returning the element value inside an hlfir.elemental).
The default implementation of this operation in bufferization is to
make a copy of the variable into a temporary buffer.
This adds a time and memory overhead in cases where such a copy is not
needed because the variable is already a temporary that was created
in lowering to compute the expression value, and the "as_expr" is
the sole usage of the variable.
This is for instance the case for many transformational intrinsics
that do not have an hlfir.expr operation (at least for now, but some may
never benefit from having one) and must be implemented "on memory"
in lowering.
This patch adds a way to "move" the variable storage along with its value.
It allows the bufferization to re-use the variable storage for the
hlfir.expr created by hlfir.as_expr, and in exchange, the
responsibility of deallocating the buffer (if the variable was heap
allocated) is passed along to the hlfir.expr, and will need to be
done after the last hlfir.expr usage.
Differential Revision: https://reviews.llvm.org/D141832
Nikita Popov [Tue, 17 Jan 2023 10:20:03 +0000 (11:20 +0100)]
[MLIR] Convert some tests to opaque pointers (NFC)
Nikita Popov [Tue, 17 Jan 2023 09:59:32 +0000 (10:59 +0100)]
[MLIR] Convert test to opaque pointers (NFC)
Nikita Popov [Tue, 17 Jan 2023 10:12:54 +0000 (11:12 +0100)]
[MLIR] Don't verify opaque pointer type in cmpxchg
We should not check the element type for opaque pointers. We should
still check that the value operands have the same type though.
This causes a verifier error when converting instructions.ll to
opaque pointers.
Nikita Popov [Tue, 17 Jan 2023 10:06:23 +0000 (11:06 +0100)]
[MLIR] Don't verify opaque pointer type in atomicrmw
If the pointer type is opaque, we should not check the element type.
This causes a verifier failure when converting instructions.ll to
opaque pointers.
Nikita Popov [Tue, 17 Jan 2023 09:53:22 +0000 (10:53 +0100)]
[MLIR] Don't verify call signature for indirect opaque ptr call
Fixes a crash when converting the instructions.ll test to opaque
pointers.
Chuanqi Xu [Tue, 17 Jan 2023 09:26:48 +0000 (17:26 +0800)]
[C++20] [Modules] Only diagnose the non-inline external variable
definitions in header units
Address part of https://github.com/llvm/llvm-project/issues/60079.
Since the declaration of a non-inline static data member in its
class definition is not a definition, the following form:
```
class A {
public:
static const int value = 43;
};
```
should be fine to appear in a header unit. From the perspective of
implementation, it looks like we simply forgot to check if the variable
is a definition...
Reviewed By: iains
Differential Revision: https://reviews.llvm.org/D141905
Nikita Popov [Tue, 17 Jan 2023 09:35:31 +0000 (10:35 +0100)]
[MLIR] Convert test to opaque pointers (NFC)
Nikita Popov [Tue, 17 Jan 2023 09:12:02 +0000 (10:12 +0100)]
[Polly] Convert some tests to opaque pointers (NFC)
Nikita Popov [Tue, 17 Jan 2023 09:08:33 +0000 (10:08 +0100)]
[Clang] Convert test to opaque pointers (NFC)
Florian Hahn [Tue, 17 Jan 2023 09:08:33 +0000 (09:08 +0000)]
[VPlan] Remove unneeded VPUser::classof(const VPDef *) (NFC).
This specialization is not needed any longer as VPRecipeBase inherits
from VPUser and getDefiningRecipe returns a VPRecipeBase.
Mariusz Sikora [Mon, 16 Jan 2023 11:54:56 +0000 (12:54 +0100)]
[AMDGPU] v_fmac_f64 encoding tests for gfx940
Differential Revision: https://reviews.llvm.org/D141857
Nikita Popov [Mon, 16 Jan 2023 11:57:26 +0000 (12:57 +0100)]
[Clang] Convert test to opaque pointers (NFC)
Nikita Popov [Mon, 16 Jan 2023 14:03:35 +0000 (15:03 +0100)]
[Support] Fix alternation support in backreferences (PR60073)
backref() always performs a full match on the remaining string,
and as such also needs to be matched against the whole remaining
strip. For alternations, the match was performed against just the
sub-strip for one alternative, which would of course fail to match
the whole string.
This can be done by skipping the part of the strip between OOR1
and O_CH, so that only the first alternative in the strip is
matched, and the remaining ones are skipped. Indeed, the necessary
OOR1 skipping code was already implemented in the easy-path of
backref(), so this is clearly how it was supposed to work.
However, there were two bugs: First, under this scheme we should
be passing the stop point of the original strip, not just the
alternative sub-strip. Second, while skipping for OOR1 was
implemented, handling for O_CH was missing. This would occur when
the last alternative matches, as O_CH is preceded by an implicit
OOR1 only.
Fixes https://github.com/llvm/llvm-project/issues/60073.
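For illustration only, here is the kind of pattern involved: a group
containing an alternation, followed by a backreference to that group. This
sketch uses std::regex (ECMAScript grammar) as a stand-in, not llvm::Regex,
and the pattern is an assumption rather than the reproducer from the issue.
```cpp
#include <regex>
#include <string>

// Returns true when `text` matches `pattern` in full.
// Uses std::regex purely as a stand-in engine for illustration.
bool fullMatch(const std::string &pattern, const std::string &text) {
  return std::regex_match(text, std::regex(pattern));
}

// Group 1 is an alternation; \1 must repeat whichever alternative matched.
// (Hypothetical pattern, chosen only to show alternation plus backreference.)
const char *const kAltBackref = "(a|ab)\\1";
```
With this pattern, "aa" and "abab" are expected to match in full, while
"aab" is not.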
Rainer Orth [Tue, 17 Jan 2023 08:41:00 +0000 (09:41 +0100)]
[sanitizer_common] Don't intercept __tls_get_addr on Solaris
When building/testing ASan inside the GCC tree on Solaris while using GNU
`ld` instead of Solaris `ld`, a large number of tests SEGV on both sparc
and x86 like this:
Thread 2 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0xfe014cfc in __sanitizer::atomic_load<__sanitizer::atomic_uintptr_t>
(a=0xfc602a58, mo=__sanitizer::memory_order_acquire) at
sanitizer_common/sanitizer_atomic_clang_x86.h:46
46 v = a->val_dont_use;
1: x/i $pc
=> 0xfe014cfc
<_ZN11__sanitizer11atomic_loadINS_16atomic_uintptr_tEEENT_4TypeEPVKS2_NS_12memory_orderE+62>:
mov (%eax),%eax
(gdb) bt
#0 0xfe014cfc in __sanitizer::atomic_load<__sanitizer::atomic_uintptr_t>
(a=0xfc602a58, mo=__sanitizer::memory_order_acquire) at
sanitizer_common/sanitizer_atomic_clang_x86.h:46
#1 0xfe0bd1d7 in __sanitizer::DTLS_NextBlock (cur=0xfc602a58) at
sanitizer_common/sanitizer_tls_get_addr.cpp:53
#2 0xfe0bd319 in __sanitizer::DTLS_Find (id=1) at
sanitizer_common/sanitizer_tls_get_addr.cpp:77
#3 0xfe0bd466 in __sanitizer::DTLS_on_tls_get_addr (arg_void=0xfeffd068,
res=0xfe602a18, static_tls_begin=0, static_tls_end=0) at
sanitizer_common/sanitizer_tls_get_addr.cpp:116
#4 0xfe063f81 in __interceptor___tls_get_addr (arg=0xfeffd068) at
sanitizer_common/sanitizer_common_interceptors.inc:5501
#5 0xfe0a3054 in __sanitizer::CollectStaticTlsBlocks (info=0xfeffd108,
size=40, data=0xfeffd16c) at
sanitizer_common/sanitizer_linux_libcdep.cpp:366
#6 0xfe6ba9fa in dl_iterate_phdr () from /usr/lib/ld.so.1
#7 0xfe0a3132 in __sanitizer::GetStaticTlsBoundary (addr=0xfe608020,
size=0xfeffd244, align=0xfeffd1b0) at
sanitizer_common/sanitizer_linux_libcdep.cpp:382
#8 0xfe0a33f7 in __sanitizer::GetTls (addr=0xfe608020, size=0xfeffd244)
at sanitizer_common/sanitizer_linux_libcdep.cpp:482
#9 0xfe0a34b1 in __sanitizer::GetThreadStackAndTls (main=true,
stk_addr=0xfe608010, stk_size=0xfeffd240, tls_addr=0xfe608020,
tls_size=0xfeffd244) at sanitizer_common/sanitizer_linux_libcdep.cpp:565
The address being accessed is unmapped. However, even when the tests
`PASS` with Solaris `ld`, `ASAN_OPTIONS=verbosity=2` shows
==6582==__tls_get_addr: Can't guess glibc version
Given that the code is strictly `glibc`-specific according to
`sanitizer_tls_get_addr.h`, there seems little point in using the
interceptor on non-`glibc` targets.
That's what this patch does. Tested on `i386-pc-solaris2.11` and
`sparc-sun-solaris2.11` inside the GCC tree.
Differential Revision: https://reviews.llvm.org/D141385
Sergey Kachkov [Thu, 22 Dec 2022 13:59:06 +0000 (16:59 +0300)]
[GVN] Refactor handling of pointer-select in GVN pass
This patch extends Def memory dependency with support of select
instructions to consistently handle pointer-select conversion.
Differential Revision: https://reviews.llvm.org/D141619
Kadir Cetinkaya [Tue, 17 Jan 2023 08:08:46 +0000 (09:08 +0100)]
[clangd] Disable ScopedMemoryLimit on tsan builds
This is causing flakiness, see https://lab.llvm.org/buildbot/#/builders/131/builds/39272
Fangrui Song [Tue, 17 Jan 2023 07:57:44 +0000 (23:57 -0800)]
[ARM] Properly fix -Wsign-compare after D141791
chendewen [Tue, 17 Jan 2023 07:24:06 +0000 (15:24 +0800)]
Revert "[AArch64][SVE] Add more intrinsics in 'isZeroingInactiveLanes'."
This reverts commit
6ef6b2b5162ef48a63fb2697d77cffa6d7b1f7e7.
Noah Goldstein [Tue, 17 Jan 2023 02:51:08 +0000 (18:51 -0800)]
Transform AtomicRMW logic operations to BT{R|C|S} if only changing/testing a single bit.
This is essentially expanding on the optimizations added on: D120199
but applies the optimization to cases where the bit being changed /
tested is not an IMM but is a provable power of 2.
The only case currently added is for patterns like:
`__atomic_fetch_xor(p, 1 << c, __ATOMIC_RELAXED) & (1 << c)`
Which instead of using a `cmpxchg` loop can be done with `btcl; setcc; shl`.
There are still a variety of missed cases that could/should be
addressed in the future. This commit documents many of those
cases with Todos.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D140939
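A minimal sketch of the source-level shape described above, written with
std::atomic instead of the `__atomic_fetch_xor` builtin. The function name is
hypothetical, and whether a given compiler version and target actually emits
`btc` for it is not guaranteed by this sketch.
```cpp
#include <atomic>

// Toggles bit `c` of `p` and reports whether that bit was set beforehand.
// `c` is assumed to be smaller than the bit width of unsigned.
bool toggleBitAndTestOld(std::atomic<unsigned> &p, unsigned c) {
  // The mask is provably a power of two, but it is not an immediate.
  const unsigned mask = 1u << c;
  return (p.fetch_xor(mask, std::memory_order_relaxed) & mask) != 0;
}
```
Only the toggled bit of the old value is inspected, which is what lets the
operation avoid a `cmpxchg` loop.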
Noah Goldstein [Tue, 17 Jan 2023 02:50:15 +0000 (18:50 -0800)]
Add tests for BMI patterns across non-adjacent and associative instructions.
I.e., for blsi we match (and (sub 0, x), x) but currently miss valid
patterns like (and (and (sub 0, x), y), x).
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D141178
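The two shapes mentioned above can be written out as follows. This is a
sketch with hypothetical function names; it only shows that both expressions
contain the same lowest-set-bit computation, not what any backend emits.
```cpp
#include <cstdint>

// Canonical BLSI form: isolate the lowest set bit, (and (sub 0, x), x).
constexpr std::uint32_t blsi(std::uint32_t x) { return (0u - x) & x; }

// Same computation with an unrelated AND in between:
// (and (and (sub 0, x), y), x), which equals blsi(x) & y.
constexpr std::uint32_t blsiThenMask(std::uint32_t x, std::uint32_t y) {
  return ((0u - x) & y) & x;
}

static_assert(blsi(12u) == 4u, "lowest set bit of 12 is 4");
static_assert(blsiThenMask(12u, 6u) == (blsi(12u) & 6u),
              "AND is associative and commutative");
```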
Jonathan Peyton [Mon, 5 Dec 2022 15:06:01 +0000 (09:06 -0600)]
[OpenMP][libomp] Add topology information to thread structure
Each time a thread gets a new affinity assigned, it will not
only assign its mask, but also topology information including
which socket, core, thread and core-attributes (if available)
it is now assigned. This occurs for all non-disabled KMP_AFFINITY
values as well as OMP_PLACES/OMP_PROC_BIND.
The information regarding which socket, core, etc. can take on three
values:
1) The actual ID of the unit (0 - (N-1)), given N units
2) UNKNOWN_ID (-1) which indicates it does not know which ID
3) MULTIPLE_ID (-2) which indicates the thread is spread across
multiple of this unit (e.g., affinity mask is spread across
multiple hardware threads)
This new information is stored in the th_topology_ids[] array. For example,
to get the socket ID, one would read th_topology_ids[KMP_HW_SOCKET].
This could be expanded in the future to something more descriptive for
the "multiple" case, like a range of values. For now, the single
value suffices.
The information regarding the core attributes can take on two values:
1) The actual core-type or core-eff
2) KMP_HW_CORE_TYPE_UNKNOWN if the core type is unknown, and
UNKNOWN_CORE_EFF (-1) if the core eff is unknown.
This new information is stored in th_topology_attrs. For example,
to get the core type, one would read
th_topology_attrs.core_type.
Differential Revision: https://reviews.llvm.org/D139854
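A rough sketch of how a reader of th_topology_ids[] might distinguish the
three kinds of value. Everything here is a self-contained mock: the enum
layout, the struct, and the classify() helper are assumptions for
illustration and are not the libomp definitions.
```cpp
// Hypothetical stand-ins; the real definitions live in the libomp sources.
enum HwUnit { KMP_HW_SOCKET = 0, KMP_HW_CORE, KMP_HW_THREAD, KMP_HW_LAST };
constexpr int UNKNOWN_ID = -1;  // the ID is not known
constexpr int MULTIPLE_ID = -2; // the thread spans several units of this kind

// Mock of the per-thread structure holding one ID per topology level.
struct ThreadTopology {
  int th_topology_ids[KMP_HW_LAST];
};

enum class IdKind { Actual, Unknown, Multiple };

// Classifies the stored value for one topology level.
IdKind classify(const ThreadTopology &t, HwUnit unit) {
  const int id = t.th_topology_ids[unit];
  if (id == UNKNOWN_ID)
    return IdKind::Unknown;
  if (id == MULTIPLE_ID)
    return IdKind::Multiple;
  return IdKind::Actual; // a real ID in the range 0 to N-1
}
```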
Shilei Tian [Tue, 17 Jan 2023 04:55:17 +0000 (23:55 -0500)]
[OpenMP] Fix the wrong format string used in `__kmpc_error`
This patch fixes the wrong format string used in `__kmpc_error`, which could
cause a segmentation fault at runtime.
Reviewed By: jlpeyton
Differential Revision: https://reviews.llvm.org/D141889
Jonathan Peyton [Mon, 12 Dec 2022 17:33:52 +0000 (11:33 -0600)]
[OpenMP][libomp] Fix macOS 12 library destruction
When building the library with icc and using it on macOS 12,
the library destruction process is skipped, which causes many OMPT tests
to fail on macOS 12. This change registers the
__kmp_internal_end_library() call for atexit() which will be a
harmless, redundant call for macOS 11 and below and the only destructor
called for macOS 12+.
Differential Revision: https://reviews.llvm.org/D139857
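The general technique can be sketched as below: register a shutdown routine
with atexit() and make it idempotent, so that a second call from another
destructor path is harmless. The names are hypothetical stand-ins, not the
libomp code.
```cpp
#include <cstdlib>

bool g_shutdown_done = false; // set once the shutdown work has run
int g_shutdown_runs = 0;      // counts how often the real work was done

// Hypothetical stand-in for __kmp_internal_end_library(): safe to call twice.
void shutdownLibrary() {
  if (g_shutdown_done)
    return; // redundant call, nothing left to do
  g_shutdown_done = true;
  ++g_shutdown_runs;
}

// Registers the shutdown routine with atexit(); returns true on success.
bool registerShutdown() { return std::atexit(shutdownLibrary) == 0; }
```
Because of the guard, it does not matter whether the atexit() handler is the
only destructor that runs or one of several.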
chenglin.bi [Tue, 17 Jan 2023 04:01:41 +0000 (12:01 +0800)]
[AArch64] fold subs ugt/ult to ands when the second operand is a mask
https://alive2.llvm.org/ce/z/pLhHI9
Fix: https://github.com/llvm/llvm-project/issues/59598
Reviewed By: samtebbs
Differential Revision: https://reviews.llvm.org/D141829
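The log does not spell out the exact patterns handled. As an assumption about
the arithmetic behind the title, an unsigned greater-than comparison against
a low-bit mask (2^n - 1) is equivalent to testing the bits above the mask,
which is the shape an `ands` can compute. Function names are hypothetical.
```cpp
#include <cstdint>

// Comparison form: x >u mask, where mask is assumed to be 2^n - 1.
constexpr bool ugtViaCompare(std::uint32_t x, std::uint32_t mask) {
  return x > mask;
}

// AND form: x exceeds the mask exactly when a bit above the mask is set.
constexpr bool ugtViaAnd(std::uint32_t x, std::uint32_t mask) {
  return (x & ~mask) != 0;
}

static_assert(ugtViaCompare(16u, 15u) == ugtViaAnd(16u, 15u), "above mask");
static_assert(ugtViaCompare(15u, 15u) == ugtViaAnd(15u, 15u), "equal to mask");
```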
Chuanqi Xu [Tue, 17 Jan 2023 03:31:24 +0000 (11:31 +0800)]
[C++20] [Coroutines] Disable to take the address of labels in coroutines
Closing https://github.com/llvm/llvm-project/issues/56436
We can't support the GNU address-of-label extension well in coroutines
with the current architecture. Coroutines are split into pieces in the
middle end, so the addresses of labels are ambiguous at that
point.
To avoid any further misunderstanding, we emit an error here.
Differential Revision: https://reviews.llvm.org/D131938
Shilei Tian [Tue, 17 Jan 2023 03:34:14 +0000 (22:34 -0500)]
[Clang][OpenMP] Fix the issue that a functor is not captured properly in a task region
This patch fixes the issue that a functor is not captured properly if
that is used in a task region. It was introduced by https://reviews.llvm.org/D114546
where `CallExpr` is treated specially, but the callee itself is not properly visited.
https://reviews.llvm.org/D115902 already did some fix for one case. This patch
fixes another case.
Fix #57757.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D141873
Freddy Ye [Tue, 17 Jan 2023 02:29:47 +0000 (10:29 +0800)]
[NFC][X86] clang-format change for avx512vlbwintrin.h
chendewen [Tue, 17 Jan 2023 01:47:35 +0000 (09:47 +0800)]
[AArch64][SVE] Add more intrinsics in 'isZeroingInactiveLanes'.
The REINTERPRET_CAST operation generates `and` and `ptrue` instructions.
For some instructions these are redundant, because their inactive lanes are zeroed by construction.
For example, codegen before:
```
facgt p2.d, p0/z, z4.d, z1.d
ptrue p1.d
and p1.b, p2/z, p2.b, p1.b
```
After:
```
facgt p1.d, p0/z, z4.d, z1.d
```
ref: https://reviews.llvm.org/D129851
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D141469
Arthur Eubanks [Tue, 17 Jan 2023 01:50:46 +0000 (17:50 -0800)]
[bolt][test] Add REQUIRES: asserts to jt-symbol-disambiguation-3.s
Or else it unexpectedly passes in non-assert builds of bolt.
Arthur Eubanks [Wed, 11 Jan 2023 00:16:04 +0000 (16:16 -0800)]
[docs][NewPM] Clarify more status of legacy PM + optimization pipeline
Reviewed By: asbirlea, nikic
Differential Revision: https://reviews.llvm.org/D141443
James Y Knight [Mon, 16 Jan 2023 23:19:29 +0000 (18:19 -0500)]
ExceptionHandling documentation tweaks.
Delete mention of the llvm.eh.begincatch/llvm.eh.endcatch intrinsics,
and remove them from a few remaining test-cases. These intrinsics were
from a previous attempt at implementing Windows exception-handling,
but were removed from LLVM in 2015.
Also mention that dynamic exception specifications ("throw filters")
were removed from the spec in C++17.
Paul Walker [Mon, 16 Jan 2023 17:58:09 +0000 (17:58 +0000)]
[NFC][Clang] Regenerate test output for SVE ACLE tests.
Just a rerun of update_cc_test_checks.py to capture some changes to
variable names after their reliance on instcombine was removed.
Luo, Yuanke [Mon, 16 Jan 2023 14:39:06 +0000 (22:39 +0800)]
[X86] Don't fold select for vXi1 on X86 target.
There is no mask instruction for vXi1 with avx512f on the X86 target, so
folding select for vXi1 doesn't help to reduce instructions.
Differential Revision: https://reviews.llvm.org/D141782
Mehdi Amini [Mon, 16 Jan 2023 23:26:28 +0000 (23:26 +0000)]
Revert "Revert "Refactor OperationName to use virtual tables for dispatch (NFC)""
This streamlines the implementation and makes it so that the virtual
tables are in the binary instead of dynamically assembled during initialization.
The dynamic allocation size of op registration is also smaller with this
change.
This reverts commit
7bf1e441da6b59a25495fde8e34939f93548cc6d
and re-introduces
e055aad5ffb348472c65dfcbede85f39efe8f906
after fixing the Windows crash by making ParseAssemblyFn a
unique_function again
Differential Revision: https://reviews.llvm.org/D141492
James Y Knight [Mon, 16 Jan 2023 23:15:01 +0000 (18:15 -0500)]
Move Personalities array from MachineModuleInfo to DwarfCFIException.
It was only ever used there, already. The previous location seems
left-over from when the personality function was specified on a
per-landingpad basis, instead of per-function.
James Y Knight [Mon, 16 Jan 2023 23:15:00 +0000 (18:15 -0500)]
FastISel: remove EH_LABEL skipping code.
This was intended to skip past the EH_LABEL which is added at the top
of a landingpad block. But, it is unnecessary because `LastLocalValue`
is already set to point past the EH_LABEL in that case.
Thus, currently, this is dead-code. I am removing it because it _also_
attempts to skip over EH_LABELs emitted around a call. Currently, this
situation never arises, but it becomes harmful after a future
in-progress commit.
Mehdi Amini [Mon, 16 Jan 2023 23:11:12 +0000 (23:11 +0000)]
Revert "Refactor OperationName to use virtual tables for dispatch (NFC)"
This reverts commit
e055aad5ffb348472c65dfcbede85f39efe8f906.
This currently crashes on Windows for some reason.
Martin Storsjö [Tue, 22 Nov 2022 14:12:39 +0000 (16:12 +0200)]
[clang] [MinGW] Avoid adding <base>/include and <base>/lib when cross compiling
The MinGW compiler driver first tries to deduce the root of
the toolchain installation (either clang itself or a separate
cross mingw gcc installation). On top of this root, a number
of include and lib paths are added (some added unconditionally,
some only if they exist):
- <base>/x86_64-w64-mingw32/include
- <base>/include
- <base>/include/x86_64-w64-windows-gnu
(Some more are also added for libstdc++ and/or libc++.)
The first one is the one commonly used for MinGW targets so
far. For LLVM runtimes installed with the
LLVM_ENABLE_PER_TARGET_RUNTIME_DIR option, the latter two are
used though (this is currently not the default, not yet at least).
For cross compiling, if base is a separate dedicated directory,
this is fine, but when using the sysroots of a distro-installed
cross mingw toolchain, base is /usr - and having /usr/include
in the include path for cross compilation is a potential
source for problems; see
https://github.com/llvm/llvm-project/issues/59871.
If not cross compiling though, <base>/include needs to be included
too. E.g. in the case of msys2, most headers are in e.g.
/mingw64/include while the compiler is /mingw64/bin/clang.
When cross compiling, if the sysroot has been explicitly set
by the user, keep <base>/include too. (In the case of a distro
provided cross gcc toolchain in /usr, the sysroot needs to be set
to /usr and not /usr/x86_64-w64-mingw32 though, to be able to find
libgcc files under /usr/lib/gcc/x86_64-w64-mingw32. So with such a
toolchain, setting the sysroot explicitly does retain the problem.)
All in all - this avoids adding /usr/include and /usr/lib to the
include/lib paths when doing mingw cross compilation with a
distro-provided sysroot in /usr/x86_64-w64-mingw32.
Test that the include directory is omitted in the mingw-sysroot.cpp
tests, when cross compiling. That test is only ever executed on
non-Windows hosts, since it uses symlinks to set up fake environments
with colocated compilers and header/lib directories. There aren't
really any current corresponding tests for the same implicit behaviours
when actually running on Windows.
Differential Revision: https://reviews.llvm.org/D141206
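The rule described above for <base>/include and <base>/lib can be
summarized in a small predicate. This is a sketch with a hypothetical helper
name, not the driver's actual code.
```cpp
// Hypothetical helper: decides whether <base>/include and <base>/lib
// should be added to the search paths.
constexpr bool addBaseIncludeAndLib(bool crossCompiling,
                                    bool sysrootSetExplicitly) {
  // Native builds always need them (e.g. msys2 keeps headers in
  // /mingw64/include). Cross builds keep them only when the user
  // set the sysroot explicitly.
  return !crossCompiling || sysrootSetExplicitly;
}

static_assert(addBaseIncludeAndLib(false, false), "native build keeps them");
static_assert(!addBaseIncludeAndLib(true, false), "implicit cross sysroot drops them");
static_assert(addBaseIncludeAndLib(true, true), "explicit sysroot keeps them");
```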
Joe Loser [Mon, 16 Jan 2023 04:39:16 +0000 (21:39 -0700)]
[llvm][ADT] Replace uses of `makeMutableArrayRef` with deduction guides
Similar to how `makeArrayRef` is deprecated in favor of deduction guides, do the
same for `makeMutableArrayRef`.
Once all of the places in-tree are using the deduction guides for
`MutableArrayRef`, we can mark `makeMutableArrayRef` as deprecated.
Differential Revision: https://reviews.llvm.org/D141814
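To show the difference between a factory function and a deduction guide
without depending on LLVM headers, here is a miniature view class. The
MutableView type and makeMutableView function are hypothetical stand-ins for
`MutableArrayRef` and `makeMutableArrayRef`; the sketch assumes C++17.
```cpp
#include <cstddef>

// Hypothetical miniature of a mutable array view; not llvm::MutableArrayRef.
template <typename T> class MutableView {
public:
  MutableView(T *data, std::size_t size) : Data(data), Size(size) {}
  template <std::size_t N> MutableView(T (&arr)[N]) : Data(arr), Size(N) {}
  T &operator[](std::size_t i) const { return Data[i]; }
  std::size_t size() const { return Size; }

private:
  T *Data;
  std::size_t Size;
};

// Deduction guide: `MutableView(arr)` deduces T from a C array.
template <typename T, std::size_t N>
MutableView(T (&)[N]) -> MutableView<T>;

// Old style: a factory function whose only job is to deduce T.
template <typename T>
MutableView<T> makeMutableView(T *data, std::size_t size) {
  return MutableView<T>(data, size);
}
```
With the guide in place, `MutableView v(buf);` replaces
`auto v = makeMutableView(buf, n);` at call sites.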
Mehdi Amini [Mon, 16 Jan 2023 20:59:09 +0000 (20:59 +0000)]
Fix crash in Spirv -lower-host-to-llvm pass
When given a spirv module as input where no conversion happens,
the code didn't defend against a broken invariant.
We'll fail the pass here, but it's not clear whether that is the right thing
or whether the module should just be ignored.
Fixes #59971
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D141856
Mahesh Ravishankar [Mon, 16 Jan 2023 20:50:59 +0000 (20:50 +0000)]
[mlir][TilingInterface] Fix use after free error from D141028.
The `candidateSliceOp` was replaced and then used in a subsequent
call. Instead just replace its uses. The op is dead and will be
removed with CSE.
Differential Revision: https://reviews.llvm.org/D141869
Slava Zakharin [Mon, 16 Jan 2023 20:38:31 +0000 (12:38 -0800)]
[mlir][llvmir] Fixed MDNode uniquing during TBAA translation.
In the process of creating the MDNodes for the TBAA tag operations
we used to produce incomplete MDNodes like:
```
@__tbaa::@tbaa_tag_4 => !{!null, !null, i64 0}
@__tbaa::@tbaa_tag_7 => !{!null, !null, i64 0}
```
This caused the two tags to map to the same incomplete MDNode due to uniquing.
To prevent this, we have to use temporary MDNodes instead of !null's.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D141726
Hendrik Greving [Wed, 11 Jan 2023 17:12:48 +0000 (09:12 -0800)]
[mlir:LLVM] Fix minor bug, missing cconv translation
Fixes translating the calling convention to LLVM-IR, possibly missed
by https://reviews.llvm.org/D126161, and adds a test.
Lei Huang [Fri, 13 Jan 2023 15:42:38 +0000 (09:42 -0600)]
[P10] Fix the implementation for BRH
Fixes the patterns for the brh instruction to include a clrldi when emitted.
Reviewed By: amyk
Differential Revision: https://reviews.llvm.org/D141697
Ram-NK [Mon, 16 Jan 2023 19:23:46 +0000 (14:23 -0500)]
[LoopInterchange] Correcting the profitability check
Before D135808, loop interchange could go on endlessly (the
profitability check had no proper priority, so any profitable check
could lead to a loop interchange). With this patch, there is no endless
interchange (the profitability checks have a defined priority: the order of
decision is 'Cache cost' check, 'InstrOrderCost', 'Vectorization'). Corrected the
dependency checking inside isProfitableForVectorization(), and corrected the
checking of bad order loops in isProfitablePerInstrOrderCost().
Reviewed By: Meinersbur, bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D135808
Alex Zinenko [Mon, 16 Jan 2023 15:01:02 +0000 (15:01 +0000)]
[mlir] accept values with result numbers in gpu.launch_func
The parser of gpu.launch_func was incorrectly rejecting SSA values with
result numbers (`%0#0`) in the list of function arguments by using the
`parseArgument` function intended for region argument declarations, not
operands. Fix this by directly parsing comma-separated operands and
types.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D141851
Joseph Huber [Mon, 16 Jan 2023 17:18:54 +0000 (11:18 -0600)]
[nvptx-arch] Dynamically load the CUDA runtime if not found during the build
Much like the changes in D141859, this patch allows the `nvptx-arch`
tool to be built and provided with every distribution of LLVM / Clang.
This will make it more reliable for our toolchains to depend on. The
changes here configure a version that dynamically loads CUDA if it was
not found at build time.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D141861
Joseph Huber [Mon, 16 Jan 2023 16:56:24 +0000 (10:56 -0600)]
[amdgpu-arch] Dynamically load the HSA runtime if not found during the build
We use the `amdgpu-arch` tool to query the installed GPUs at runtime.
One problem is that this tool is currently not built if the person
building the LLVM binary does not have the HSA runtime on their system.
This means that if someone built and distributed an installation of LLVM
without HSA, then the user will not be able to use it even if they have
it on their system.
This patch makes us build this tool unconditionally and adds extra logic
to dynamically load HSA if it's present.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D141859
Kelvin Li [Mon, 16 Jan 2023 19:03:55 +0000 (14:03 -0500)]
[flang][NFC] Fix typo in Cray pointee error message
David Green [Mon, 16 Jan 2023 18:55:10 +0000 (18:55 +0000)]
[AArch64] Add tests for forming abd from wrap flags and min/max. NFC
Simon Pilgrim [Mon, 16 Jan 2023 18:52:04 +0000 (18:52 +0000)]
Silence signed/unsigned comparison warnings. NFC.
Nikolas Klauser [Sun, 15 Jan 2023 18:53:49 +0000 (19:53 +0100)]
[libc++] Remove <type_traits> includes from <atomic> and <ratio>
Reviewed By: Mordante, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D141799
Mahesh Ravishankar [Thu, 5 Jan 2023 00:57:50 +0000 (00:57 +0000)]
[mlir][TilingInterface] Add an option to tile and fuse to yield replacement for the fused producer.
This patch adds an option to the method that fuses a producer with a
tiled consumer, to also yield from the tiled loops a value that can be
used to replace the original producer. This is only valid if it can be
ascertained that the slice of the producer computed within each
iteration of the tiled loop nest does not compute slices of the
producer redundantly. The analysis to derive this is very involved, so
this is left to the caller to ascertain. A test is added that mimics
the `scf::tileConsumerAndFuseProducersGreedilyUsingSCFForOp`, but also
yields the values of all fused producers. This can be used as a
reference for how a caller could use this functionality.
Differential Revision: https://reviews.llvm.org/D141028
Simon Pilgrim [Mon, 16 Jan 2023 17:59:38 +0000 (17:59 +0000)]
[Thumb2][MVE] Recognise shuffle truncation patterns suitable for ARMISD::MVETRUNC
I'm helping with the remaining regressions on D127115, and one of my candidate fixes caused some regressions with MVE interleaved shuffles due to poor handling of 'truncation' style shuffle masks (0,2,4,6,...).
This patch attempts to use the ARMISD::MVETRUNC node to handle these cases, based off existing code in LowerTruncate.
It handles both (0,2,4,6,...) and (1,3,5,7,....) 'top' style patterns (assuming no endian problems). I shift down the 'top' patterns - a basic search of ARM docs suggests MVE has some top/bottom truncation/narrowing instructions but I don't seem to be able to get them to be used.
Differential Revision: https://reviews.llvm.org/D141791
Sanjay Patel [Mon, 16 Jan 2023 17:34:26 +0000 (12:34 -0500)]
[InstCombine] canonicalize a signum (spaceship) that ends in add
(A s>> (BW - 1)) + (zext (A s> 0)) --> (A s>> (BW - 1)) | (zext (A != 0))
https://alive2.llvm.org/ce/z/V-nM8N
This is not the form that we currently match as m_Signum(),
but I'm not sure if one is better than the other, so there's
a follow-up patch needed either way.
For this patch, it should be better for analysis to use a
not-null test and bitwise logic rather than >0 with add.
Codegen doesn't seem significantly different on any targets
that I looked at.
Also note that none of these variants is shown in issue #60012 -
those generally include at least one 'select', so that's likely
where these patterns will end up.
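The equivalence of the two forms can be checked for 32-bit values with a
small sketch (function names are hypothetical). It assumes `>>` on a negative
signed value is an arithmetic shift, which C++20 guarantees and earlier
standards leave implementation-defined.
```cpp
#include <cstdint>

// Before: (A s>> (BW - 1)) + (zext (A s> 0)), with BW = 32.
constexpr std::int32_t signumAdd(std::int32_t a) {
  return (a >> 31) + static_cast<std::int32_t>(a > 0);
}

// After: (A s>> (BW - 1)) | (zext (A != 0)).
constexpr std::int32_t signumOr(std::int32_t a) {
  return (a >> 31) | static_cast<std::int32_t>(a != 0);
}
```
For negative A the shift yields -1 and both forms stay at -1; for zero both
give 0; for positive A the shift yields 0 and both forms give 1.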
Sanjay Patel [Mon, 16 Jan 2023 16:46:12 +0000 (11:46 -0500)]
[InstCombine] add tests for signum (spaceship) variant; NFC
Matt Kulukundis [Mon, 16 Jan 2023 17:24:10 +0000 (17:24 +0000)]
Fix format for `case` in .proto files
Reviewed By: krasimir, echristo
Differential Revision: https://reviews.llvm.org/D141547
serge-sans-paille [Fri, 13 Jan 2023 13:25:54 +0000 (14:25 +0100)]
[lld][COFF] Provide unwinding information for Chunk injected by /delayloaded
For each symbol in a /delayloaded library, lld injects a small piece of
code to handle the symbol lazy loading. This code doesn't have unwind
information, which may be troublesome.
Provide this information for AMD64.
Thanks to Yannis Juglaret <yjuglaret@mozilla.com> for contributing the
unwinding info and for his support while crafting this patch.
Fix #59639
Differential Revision: https://reviews.llvm.org/D141691
Michael Buch [Mon, 16 Jan 2023 10:24:22 +0000 (10:24 +0000)]
[libcxx] Add missing includes
This fixes the remaining errors when building the llvm-project
with `LLVM_ENABLE_MODULES=ON` (and `LLVM_ENABLE_LOCAL_SUBMODULE_VISIBILITY=ON`,
which currently is the LLVM default).
Previously this would fail in the `CXX_SUPPORTS_MODULES` check.
Differential Revision: https://reviews.llvm.org/D141833
David Green [Mon, 16 Jan 2023 16:58:18 +0000 (16:58 +0000)]
[AArch64] Move default extensions from clang Driver to TargetParser
The default extensions would be better added in the TargetParser, not by
the driver. This removes the addition of +i8mm and +bf16 features in the
driver as they are already added in 8.6/9.1 architectures. AEK_MOPS and
AEK_HBC have been added to 8.8/9.3 architectures to replace the need for
+hbc and +mops features.
Differential Revision: https://reviews.llvm.org/D141518
Frank (Fang) Gao [Fri, 13 Jan 2023 15:47:44 +0000 (10:47 -0500)]
[mlir][vector] Add scalable vectors support to OuterProductOp
This will probably be the first in a series of patches that tries to
enable code generation for ARM SME (extension of SVE).
Since SME's core operation is the outer product instruction, I figured
that it would probably be a good idea to enable the outer product
operation to properly accept and generate scalable vectors.
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D138718
Prabhdeep Singh Soni [Mon, 16 Jan 2023 16:46:33 +0000 (11:46 -0500)]
Revert "[mlir][vector] Add scalable vectors support to OuterProductOp"
This reverts commit
be4c5ad54c929f2d817ab4a55707f0beda73a05f.
This patch did not include the test case.
Mehdi Amini [Mon, 16 Jan 2023 16:38:43 +0000 (16:38 +0000)]
Check for FunctionOpInterface when looking up a parent function in GPU lowering
This makes it more robust when expanding code in functions other than
func.func, like spv.func for example.
Fixes #60072
Ivan Kosarev [Mon, 16 Jan 2023 16:11:52 +0000 (16:11 +0000)]
[AMDGPU][AsmParser][NFC] Refine defining single-bit custom operands.
Reviewed By: dp
Differential Revision: https://reviews.llvm.org/D141301
Felipe de Azevedo Piovezan [Mon, 16 Jan 2023 16:05:22 +0000 (13:05 -0300)]
Revert "[codegen] Store address of indirect arguments on the stack"
This reverts commit
7e4447a17db4a070f01c8f8a87505a4b2a1b0e3a.
Nikita Popov [Mon, 16 Jan 2023 14:19:22 +0000 (15:19 +0100)]
[Support] Fix incorrect assertion in backref compilation
These should be == rather than !=.
Nikita Popov [Mon, 16 Jan 2023 14:11:12 +0000 (15:11 +0100)]
[Support] Fix REDEBUG compilation
Benjamin Maxwell [Mon, 9 Jan 2023 16:24:07 +0000 (16:24 +0000)]
[DebugInfo] Add CIE::getAugmentationData() and FDE::getCIEPointer()
Public getters are provided for other similar members of both the CIE
and FDE, and these fields are also displayed by the llvm-dwarfdump tool,
so it seems that not exposing them was an oversight.
These are needed for tools based on LLVM that need access to all the
parsed DWARF data.
Differential Revision: https://reviews.llvm.org/D141475
Matthias Springer [Mon, 16 Jan 2023 15:23:58 +0000 (16:23 +0100)]
[mlir][NFC] GreedyPatternRewriteDriver: Consistent return values
All `apply...` functions now return a LogicalResult indicating whether the iterative process converged or not.
Differential Revision: https://reviews.llvm.org/D141845
Matthias Springer [Mon, 16 Jan 2023 15:18:23 +0000 (16:18 +0100)]
[mlir][NFC] GreedyPatternRewriteDriver: Remove overridden eraseOp
It is not necessary to override `eraseOp`; we can use the existing `notifyOperationRemoved`.
Differential Revision: https://reviews.llvm.org/D141844
Mehdi Amini [Mon, 16 Jan 2023 15:07:46 +0000 (15:07 +0000)]
Explicitly move Error when returning it (NFC)
This is an attempt to fix a build failure:
llvm/lib/Object/ELFObjectFile.cpp:300:12: error: call to deleted constructor of 'llvm::Error'
return E;
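A minimal sketch of why the explicit move helps, assuming the failure came from a compiler that does not implicitly move a local when the return type differs from the local's type. `Err` and `Result` are hypothetical stand-ins for a move-only error type and a wrapper constructed from it; they are not the LLVM classes. `constexpr` is used only so the behaviour can be checked at compile time.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical move-only error type (stand-in, not llvm::Error).
class Err {
public:
  constexpr Err() = default;
  Err(const Err &) = delete;                 // copying is forbidden
  constexpr Err(Err &&) noexcept = default;  // moving is allowed
};

// Hypothetical wrapper holding either a value or an error
// (stand-in, not llvm::Expected<T>).
template <typename T> class Result {
public:
  constexpr Result(T Value) : Val(std::move(Value)), Failed(false), Error() {}
  // Takes the error by value, so the argument must be moved in.
  constexpr Result(Err E) : Val(), Failed(true), Error(std::move(E)) {}
  constexpr bool failed() const { return Failed; }

private:
  T Val;
  bool Failed;
  Err Error;
};

constexpr Result<int> parse(bool Fail) {
  if (Fail) {
    Err E{};
    // A plain `return E;` needs the compiler to treat E as an rvalue
    // while converting it to Result<int>. A compiler that does not do so
    // selects the deleted copy constructor. The explicit move avoids that.
    return std::move(E);
  }
  return 42;
}
```

With the explicit `std::move`, the conversion to the wrapper uses the move constructor regardless of how the compiler applies implicit moves on return.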
Francesco Petrogalli [Mon, 16 Jan 2023 09:29:11 +0000 (10:29 +0100)]
[docs] Expand example on stand-alone builds.
1. Make explicit that the folder in which a subproject is built in stand-alone mode cannot be the same folder where LLVM was built.
2. Add a cut 'n paste example for building stand-alone `clang`.
Differential Revision: https://reviews.llvm.org/D141825
Freddy Ye [Mon, 16 Jan 2023 14:16:02 +0000 (22:16 +0800)]
[X86] Prefer fpext(splat(X)) to splat(fpext(x)).
This patch fixes a regression from D122875. X86 has fpext instructions
supporting the rmb form, so fpext(splat(X)) is more profitable than
splat(fpext(X)).
Reviewed By: RKSimon, skan
Differential Revision: https://reviews.llvm.org/D141657
Luo, Yuanke [Mon, 16 Jan 2023 14:03:39 +0000 (22:03 +0800)]
[X86] Add more test cases for folding select on vXi1
Guillaume Chatelet [Mon, 16 Jan 2023 12:34:40 +0000 (12:34 +0000)]
Deprecate MemIntrinsicBase::getDestAlignment() and MemTransferBase::getSourceAlignment()
Differential Revision: https://reviews.llvm.org/D141840
Felipe de Azevedo Piovezan [Fri, 6 Jan 2023 18:52:22 +0000 (15:52 -0300)]
[codegen] Store address of indirect arguments on the stack
With codegen prior to this patch, truly indirect arguments -- i.e.
those that are not `byval` -- can have their debug information lost even
at O0. Because indirect arguments are passed by pointer, and this
pointer is likely placed in a register as per the function call ABI,
debug information is lost as soon as the register gets clobbered.
This patch solves the issue by storing the address of the parameter on
the stack, using a strategy similar to the one employed when C++
references are passed. In other words, this patch changes codegen from:
```
define @foo(ptr %arg) {
call void @llvm.dbg.declare(%arg, [...], metadata !DIExpression())
```
To:
```
define @foo(ptr %arg) {
%ptr_storage = alloca ptr
store ptr %arg, ptr %ptr_storage
call void @llvm.dbg.declare(%ptr_storage, [...], metadata !DIExpression(DW_OP_deref))
```
Some common cases where this may happen with C or C++ function calls:
1. "Big enough" trivial structures passed by value under the ARM ABI.
2. Structures that are non-trivial for the purposes of call (as per
the Itanium ABI) when passed by value.
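As an illustration of the second case, the sketch below shows a type that is non-trivial for the purposes of calls because of its user-provided copy constructor. The names `Tracked`, `readValue` and `callReadValue` are hypothetical and do not come from the patch or its tests.

```cpp
#include <type_traits>

// Hypothetical type: the user-provided copy constructor makes it
// non-trivial for the purposes of calls under the Itanium C++ ABI.
struct Tracked {
  int Value;
  explicit Tracked(int V) : Value(V) {}
  Tracked(const Tracked &Other) : Value(Other.Value) {}
};

// Written as pass-by-value in the source. Under the Itanium ABI the caller
// creates a temporary copy and passes its address, so `T` reaches the
// callee as an indirect (non-byval) pointer argument.
int readValue(Tracked T) { return T.Value; }

// The copy constructor runs in the caller before the call is made.
int callReadValue() {
  Tracked Local(7);
  return readValue(Local);
}
```

At O0, a debugger inspecting `T` inside `readValue` depends on the address of that temporary staying available, which is the situation described above.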
A few tests were matching the wrong alloca (matching against the new
alloca, instead of the old one), so they were updated to either match
both allocas or include a `,` right after the alloca type, to prevent
matching against a pointer type.
Differential Revision: https://reviews.llvm.org/D141381
Elena Lepilkina [Wed, 7 Dec 2022 06:47:51 +0000 (09:47 +0300)]
[llvm-objdump][RISCV] Use new common method to parse ARCH RISCV attribute
Differential Revision: https://reviews.llvm.org/D139553
Ed Maste [Fri, 13 Jan 2023 01:05:42 +0000 (20:05 -0500)]
[libc++] allow redefined macro in non_trivial_copy_move_ABI test
__config defines _LIBCPP_DEPRECATED_ABI_DISABLE_PAIR_TRIVIAL_COPY_CTOR
on FreeBSD, which conflicts with a command-line definition used by the
non_trivial_copy_move_ABI test.
Add -Wno-macro-redefined to ADDITIONAL_COMPILE_FLAGS in this test.
Reviewed By: philnik
Differential Revision: https://reviews.llvm.org/D141774
Alexey Lapshin [Sun, 15 Jan 2023 21:31:35 +0000 (22:31 +0100)]
This patch allows llvm-dwarfutil to utilize accelerator tables
generation code from DWARFLinker. It adds command line option:
--build-accelerator [none,DWARF]
Build accelerator tables(default: none)
=none - Do not build accelerators
=DWARF - Build accelerator tables according to the resulting DWARF version
DWARFv4: .debug_pubnames and .debug_pubtypes
DWARFv5: .debug_names
Differential Revision: https://reviews.llvm.org/D139638
Sergei Barannikov [Sun, 15 Jan 2023 11:38:34 +0000 (14:38 +0300)]
[NFC] Use `llvm::enumerate` in llvm/unittests/Object
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D141788
Guillaume Chatelet [Mon, 16 Jan 2023 12:57:13 +0000 (12:57 +0000)]
[NFC] Remove dead code
Guillaume Chatelet [Mon, 16 Jan 2023 12:43:52 +0000 (12:43 +0000)]
Deprecate Argument::getParamAlignment()
Florian Hahn [Mon, 16 Jan 2023 12:25:34 +0000 (12:25 +0000)]
[LoopUnroll] Don't update DT for changeToUnreachable.
There is no need to update the DT here, because there must be a unique
latch. Hence if the latch is not exiting it must directly branch back
to the original loop header and does not dominate any nodes.
Skipping a DT update here simplifies D141487.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D141810
Haojian Wu [Mon, 16 Jan 2023 12:13:20 +0000 (13:13 +0100)]
[bazel] Another blank-line format fix for the utils/bazel/configure.bzl, NFC
Sergey Kachkov [Mon, 16 Jan 2023 12:13:17 +0000 (15:13 +0300)]
Revert "[GVN] Refactor handling of pointer-select in GVN pass"
This reverts commit
fc7cdaa373308ce3d72218b4d80101ae19850a6c.
Zain Jaffal [Mon, 16 Jan 2023 12:04:39 +0000 (12:04 +0000)]
[AArch64] Add tests for dotreduce to check for wider vectors.
Currently we only reduce vector.reduce.add to sdot if the vectors are either <8 x i8> or <16 x i8>.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D141692
Max Kazantsev [Mon, 16 Jan 2023 11:53:34 +0000 (18:53 +0700)]
[JumpThreading] Preserve profile metadata during select unfolding, take 2
Jump threading can replace select and unconditional branch with
conditional branch, but when doing so loses profile information.
This destructive transform can eventually lead to a performance
degradation due to folding of branches in
shouldFoldCondBranchesToCommonDestination as branch probabilities
are no longer known.
The first version was reverted due to an assertion failure caused by i32
overflow, which is fixed in this version.
Patch by Roman Paukner!
Differential Revision: https://reviews.llvm.org/D138132
Reviewed By: mkazantsev
Haojian Wu [Mon, 16 Jan 2023 11:57:57 +0000 (12:57 +0100)]
[bazel] Fix the format of utils/bazel/configure.bzl, NFC
Kerry McLaughlin [Mon, 16 Jan 2023 11:36:37 +0000 (11:36 +0000)]
[AArch64][SME] Add an instruction mapping for SME pseudos
Adds an instruction mapping to SMEInstrFormats which matches SME
pseudos with the real instructions they are transformed to.
A new flag is also added to AArch64Inst (SMEMatrixType), which is
used to indicate the base register required when emitting many
of the SME instructions.
This reduces the number of pseudos handled by the switch statement
in EmitInstrWithCustomInserter.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D136856
Jolanta Jensen [Tue, 10 Jan 2023 13:53:28 +0000 (13:53 +0000)]
[NFC] Fixed a typo in clang help docs
Fixed minor typo in clang help docs.
Differential Revision: https://reviews.llvm.org/D141507
Sven van Haastregt [Mon, 16 Jan 2023 11:32:12 +0000 (11:32 +0000)]
[OpenCL] Allow undefining header-only features
`opencl-c-base.h` always defines 5 particular feature macros for
SPIR-V, making it impossible to disable those features.
To allow disabling any of those features, let the header recognize
`__undef_<feature>` macros. The user can then pass the
`-D__undef_<feature>` flag on the command line to disable a specific
feature. The __undef macro could potentially also be set from
`-cl-ext=-feature`, but for now only change the header and only
provide __undef macros for the 5 features that are always enabled in
`opencl-c-base.h`.
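A rough sketch of one way a header could honor such a macro, written as plain C++ preprocessor code rather than OpenCL C. `my_feature` and `MyFeatureEnabled` are hypothetical names used for illustration; they are not among the five real feature macros, and the actual header may be structured differently.

```cpp
// Header side: define the feature macro unless the user asked for it to be
// left undefined by passing -D__undef_my_feature on the command line.
#if !defined(__undef_my_feature)
#define my_feature 1
#endif

// User side: code can test the feature macro as usual.
#ifdef my_feature
constexpr bool MyFeatureEnabled = true;
#else
constexpr bool MyFeatureEnabled = false;
#endif
```

Compiled without extra flags, `MyFeatureEnabled` is true; compiling the same file with `-D__undef_my_feature` leaves `my_feature` undefined.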
Differential Revision: https://reviews.llvm.org/D141297
Utkarsh Saxena [Mon, 16 Jan 2023 06:29:38 +0000 (07:29 +0100)]
Add test for an invalid requirement in requires expr.
The one introduced in D140547 was brittle. Fixing max template depth to
a small value would still test the same issue without causing actual
stack exhaustion.
Differential Revision: https://reviews.llvm.org/D141818
Nikita Popov [Mon, 16 Jan 2023 11:03:29 +0000 (12:03 +0100)]
[Clang] Convert test to opaque pointers (NFC)
A very annoying update, because some but not all of the zero-index
GEPs are omitted with opaque pointers.
Sergey Kachkov [Thu, 22 Dec 2022 13:59:06 +0000 (16:59 +0300)]
[GVN] Refactor handling of pointer-select in GVN pass
This patch introduces a new type of memory dependency - Select - to
handle it consistently, like a Def/Clobber dependency.
Differential Revision: https://reviews.llvm.org/D141619
Markus Böck [Sun, 15 Jan 2023 13:52:05 +0000 (14:52 +0100)]
[mlir][NFC] Set `useFoldAPI` to `kEmitRawAttributesFolder` value for some dialects missed previously
Found these while working on https://reviews.llvm.org/D141604. These were previously not found due to the old implementation only emitting warnings if an Op has a `fold`.
Changing these values both avoids the deprecation warning and ensures that any new `fold`s added to ops of these dialects already use the new API.
Differential Revision: https://reviews.llvm.org/D141795
David Green [Mon, 16 Jan 2023 10:44:38 +0000 (10:44 +0000)]
[AArch64] Sink to umull if we know top bits are zero.
This is an extension to the code for sinking splats to multiplies, where
if we can detect that the top bits are known-zero we can treat the
instruction like a zext. The existing code was also adjusted in the
process to make it more precise about only sinking if both operands are
zext or sext.
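The arithmetic behind treating a known-zero value like a zext can be sketched on scalars. This models only the identity, not the AArch64 lowering or the patch's code; the helper names are hypothetical and a 32 x 32 -> 64 bit multiply stands in for umull.

```cpp
#include <cstdint>

// True when the upper 32 bits of V are zero, i.e. V == zext(trunc(V)).
constexpr bool topBitsKnownZero(uint64_t V) { return (V >> 32) == 0; }

// Models a widening unsigned multiply: 32 x 32 -> 64 bits.
constexpr uint64_t wideningMul(uint32_t A, uint32_t B) {
  return static_cast<uint64_t>(A) * static_cast<uint64_t>(B);
}

// Ordinary 64-bit multiply of the full-width operands.
constexpr uint64_t fullMul(uint64_t A, uint64_t B) { return A * B; }

// When both operands have zero top halves, the full multiply and the
// widening multiply of the truncated operands give the same result.
constexpr bool sameResult(uint64_t A, uint64_t B) {
  return fullMul(A, B) ==
         wideningMul(static_cast<uint32_t>(A), static_cast<uint32_t>(B));
}
```

If either operand has a set bit in its top half, truncating it changes the value and the equivalence no longer holds in general, which is why the check on known-zero bits is required.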
Differential Revision: https://reviews.llvm.org/D141275
Florian Hahn [Mon, 16 Jan 2023 10:23:51 +0000 (10:23 +0000)]
[VPlan] Use VPDef prefix for VPDef IDs instead of VPRecipeBase (NFC).
Various places in the code were still using the VPRecipeBase:: prefix
for VPDef IDs or no prefix at all. Now that the VPDef IDs have been
moved to VPDef, use this prefix instead and consistently use it.