Tom Eccles [Fri, 20 Jan 2023 12:51:11 +0000 (04:51 -0800)]
[mlir][Linalg] Fix ignoring nodiscard return value
ff94419a287c changed the return value of appendMangledType() to
LogicalResult, which is marked as nodiscard. Ignoring the result
generates a warning when building with clang.
Reviewed By: nicolasvasilache, chelini
Differential Revision: https://reviews.llvm.org/D142202
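A rough illustration of the warning being fixed, assuming a stand-in status type: `Result`, `appendType`, and `mangle` below are hypothetical names for this sketch and are not MLIR's `LogicalResult` or `appendMangledType()`. Calling `appendType(...)` as a bare statement would trigger a discarded-result warning, so the caller inspects the value instead.

```cpp
#include <string>

// Hypothetical stand-in for a nodiscard status type (not mlir::LogicalResult).
struct [[nodiscard]] Result {
  bool ok;
  static Result success() { return Result{true}; }
  static Result failure() { return Result{false}; }
};

// Hypothetical helper: appends a mangled fragment and fails on empty input.
inline Result appendType(const std::string &type, std::string &out) {
  if (type.empty())
    return Result::failure();
  out += "_" + type;
  return Result::success();
}

// Inspects the returned status instead of discarding it, so no warning is
// emitted and the failure is propagated to the caller.
inline Result mangle(const std::string &name, const std::string &type,
                     std::string &out) {
  out = name;
  if (!appendType(type, out).ok)
    return Result::failure();
  return Result::success();
}
```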
Kevin Sala [Thu, 19 Jan 2023 18:47:57 +0000 (19:47 +0100)]
[OpenMP][libomptarget] Fix deinit of NextGen AMDGPU plugin
This patch fixes a segfault that was appearing when the plugin fails to
initialize and then is deinitialized. Also, do not call hsa_shut_down if
the hsa_init failed.
Differential Revision: https://reviews.llvm.org/D142145
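The guard described above can be sketched as follows. This is a minimal model under stated assumptions: `FakeRuntime` and `Plugin` are hypothetical types standing in for the HSA runtime and the plugin, and none of this is the actual libomptarget code.

```cpp
// Hypothetical runtime handle; stands in for an HSA-like init/shutdown API.
struct FakeRuntime {
  bool failInit = false;
  int shutdownCalls = 0;
  bool init() { return !failInit; }
  void shutDown() { ++shutdownCalls; }
};

// Hypothetical plugin that remembers whether initialization succeeded.
class Plugin {
public:
  explicit Plugin(FakeRuntime &runtime) : runtime_(runtime) {}

  bool init() {
    initialized_ = runtime_.init();
    return initialized_;
  }

  // Safe to call even when init() failed or was never called.
  void deinit() {
    if (!initialized_)
      return; // nothing was set up, so there is nothing to tear down
    runtime_.shutDown();
    initialized_ = false;
  }

private:
  FakeRuntime &runtime_;
  bool initialized_ = false;
};
```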
Tobias Gysi [Fri, 20 Jan 2023 12:03:36 +0000 (13:03 +0100)]
[mlir][llvm] Drop cyclic dependencies during debug metadata import.
This revision fixes the import of LLVM IR to handle debug metadata with
cyclic dependencies. It deletes the elements list of the composite type
if a cyclic dependency is detected. The revision is meant as a band aid
to avoid infinite recursion during the import of cyclic debug metadata.
Long term solutions are currently discussed here:
https://discourse.llvm.org/t/handling-cyclic-dependencies-in-debug-info/67526/4
Reviewed By: Dinistro
Differential Revision: https://reviews.llvm.org/D142086
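As a simplified sketch of the band aid, assuming composite types are modelled as names mapped to element-type names: `TypeGraph` and `importType` are hypothetical and do not reflect the actual MLIR importer. A type that closes a cycle loses its element list, which is enough to stop the recursion.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each composite type names the types of its elements.
using TypeGraph = std::map<std::string, std::vector<std::string>>;

// Imports `name` and everything reachable from it. Returns false if `name` is
// already on the current traversal path, i.e. a cycle was found; the caller
// then drops its own element list.
inline bool importType(const TypeGraph &input, const std::string &name,
                       std::set<std::string> &onPath, TypeGraph &output) {
  if (onPath.count(name))
    return false; // cycle detected
  if (output.count(name))
    return true; // already imported
  onPath.insert(name);

  std::vector<std::string> elements;
  auto it = input.find(name);
  if (it != input.end())
    elements = it->second;

  bool cyclic = false;
  for (const std::string &element : elements) {
    if (!importType(input, element, onPath, output)) {
      cyclic = true;
      break;
    }
  }
  if (cyclic)
    elements.clear(); // band aid: drop the elements of the cyclic type

  onPath.erase(name);
  output[name] = elements;
  return true;
}
```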
Sven van Haastregt [Thu, 19 Jan 2023 13:42:43 +0000 (13:42 +0000)]
[OpenCL] Always add nounwind attribute for OpenCL
Neither OpenCL nor C++ for OpenCL supports exceptions, so add the
`nounwind` attribute unconditionally for those languages.
Differential Revision: https://reviews.llvm.org/D142033
Nikita Popov [Fri, 20 Jan 2023 11:46:31 +0000 (12:46 +0100)]
[InstCombine] Add multi-use tests for gep of gep fold (NFC)
Kerry McLaughlin [Fri, 20 Jan 2023 11:17:17 +0000 (11:17 +0000)]
[AArch64][SME2] Add multi-vector multiply-add long intrinsics.
Adds (single, multi & indexed) intrinsics for the following:
- bfmlal/bfmlsl
- fmlal/fmlsl
- smlal/smlsl
- umlal/umlsl
This patch also extends SelectSMETileSlice to handle scaled vector select offsets.
NOTE: These intrinsics are still in development and are subject to future changes.
Reviewed By: CarolineConcatto
Differential Revision: https://reviews.llvm.org/D142004
Nikita Popov [Thu, 19 Jan 2023 14:34:35 +0000 (15:34 +0100)]
[ValueTracking] Take poison-generating metadata into account (PR59888)
In canCreateUndefOrPoison(), take not only poison-generating flags,
but also poison-generating metadata into account. The helpers are
written generically, but I believe the only case that can actually
matter is !range on calls -- !nonnull and !align are only valid on
loads, and those can create undef/poison anyway.
Unfortunately, this negatively impacts logical to bitwise and/or
conversion: For ctpop/ctlz/cttz we always attach !range metadata,
which will now block the transform, because it might introduce
poison. It would be possible to recover this regression by supporting
a ConsiderFlagsAndMetadata=false mode in impliesPoison() and clearing
flags/metadata on visited instructions.
Fixes https://github.com/llvm/llvm-project/issues/59888.
Differential Revision: https://reviews.llvm.org/D142115
Kerry McLaughlin [Fri, 20 Jan 2023 10:44:39 +0000 (10:44 +0000)]
[AArch64][SME2] Add multi-vector fused multiply-add/subtract intrinsics
Adds intrinsics for the following:
- fmla (single, multi & indexed)
- fmls (single, multi & indexed)
NOTE: These intrinsics are still in development and are subject
to future changes.
Reviewed By: CarolineConcatto
Differential Revision: https://reviews.llvm.org/D141946
Guray Ozen [Thu, 19 Jan 2023 09:09:02 +0000 (10:09 +0100)]
[mlir][nvvm] Introduce redux op
The PTX model has `redux.sync`, which performs a reduction operation on the data from each predicated active thread in the thread group. It is only available on sm80+.
This revision adds redux as an op to the nvvm dialect.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D142088
Nicholas Guy [Wed, 18 Jan 2023 14:03:56 +0000 (14:03 +0000)]
[ReleaseNotes] Add mention of complex number support for ARM and AArch64 backends.
Differential Revision: https://reviews.llvm.org/D142012
Anshil Gandhi [Fri, 20 Jan 2023 10:52:17 +0000 (11:52 +0100)]
[InstCombine] Add tests for constant memcpy with select (NFC)
Tests for D136524.
Marco Elver [Fri, 20 Jan 2023 09:20:41 +0000 (10:20 +0100)]
tsan: Consider SI_TIMER signals always asynchronous
A POSIX timer can be configured to send any kind of signal; however, it
fundamentally does not make sense to consider a timer a synchronous
signal. Teach TSan that timers are never synchronous.
The tricky bit here is correctly defining compiler-rt's siginfo
replacement, which is a rather complex struct. Extend it in a limited
way that is mostly cross-platform compatible and add offset tests in
sanitizer_platform_limits_posix.cpp.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D142117
Jean Perier [Fri, 20 Jan 2023 10:30:39 +0000 (11:30 +0100)]
[flang][hlfir] Lower pointer and allocatable sub-part references
The previous patches dealt with allocatable and pointer symbol
and component whole references.
This one deals with the remaining sub-part case where a dereference
must be created before applying the sub-part reference on the target.
With this patch the support to designate allocatable and pointer in
HLFIR is complete, but some use points will need to be updated to
use HLFIR designator lowering (at least allocate/deallocate statement
and whole allocatable assignment).
The partInfo.base had to be turned into an std::optional<hlfir::Entity>
because loads of allocatable/pointers do not create a
fir::FortranVariableOpInterface (there is no need to). The optional part
comes from the fact that the partInfo.base is not set when creating the
partInfo, but later when visiting the designator parts.
There are three cases when dereferences must be inserted:
- The pointer/allocatable is a symbol followed by a sub-part that is not
a component ref. This is done in visit(Symbol).
- The pointer/allocatable is a component followed by a sub-part that is
not another component ref. This is done in visit(Component).
- The pointer/allocatable is followed by a component ref. This case is
special since it does not call the above "visit" but instead calls "gen"
to break the visit and generate an hlfir.designate for the component
base (since one hlfir.designate can only represent one Fortran part-ref,
and must be chained to implement a Fortran designator with several part
refs). This is done in visitComponentImpl().
Differential Revision: https://reviews.llvm.org/D142124
Benjamin Kramer [Fri, 20 Jan 2023 10:05:29 +0000 (11:05 +0100)]
[bazel] Add missing dependencies for 790f237012
LLVM GN Syncbot [Fri, 20 Jan 2023 09:52:08 +0000 (09:52 +0000)]
[gn build] Port 0e13ccc69cf2
Florian Hahn [Fri, 20 Jan 2023 09:51:06 +0000 (09:51 +0000)]
[VPlan] Add initial VPDT test. (NFC)
Nikita Popov [Fri, 20 Jan 2023 09:11:01 +0000 (10:11 +0100)]
[libomp] Explicitly include <string> header (NFC)
This is required to build against libstdc++ 13. Debug.h uses
std::stoi() from <string> without explicitly including it.
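A minimal sketch of the pattern, assuming a hypothetical helper `parseDebugLevel` (not the actual Debug.h code): the translation unit includes `<string>` itself rather than relying on another standard header to pull it in.

```cpp
#include <string> // declares std::stoi; do not rely on transitive includes

// Hypothetical helper: parses a debug level from text, for example the value
// of an environment variable.
inline int parseDebugLevel(const std::string &text) {
  return std::stoi(text);
}
```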
v1nh1shungry [Mon, 2 Jan 2023 05:13:38 +0000 (13:13 +0800)]
[clang] fix crash on generic lambda with lambda in decltype
Relevant issue: https://github.com/llvm/llvm-project/issues/59771
During the instantiation of a generic lambda, a non-generic lambda in
the trailing `decltype` is a `DeclContext` but not a dependent context,
so we shouldn't call `PerformDependentDiagnostics` on it.
Differential Revision: https://reviews.llvm.org/D140838
Matthias Springer [Fri, 20 Jan 2023 09:01:27 +0000 (10:01 +0100)]
[mlir] GreedyPatternRewriteDriver: Add new strict mode option
There are now three options:
* `AnyOp` (previously `false`)
* `ExistingAndNewOps` (previously `true`)
* `ExistingOps`: this one is new.
The last option corresponds to what the `applyOpPatternsAndFold(Operation*, ...)` overload is doing. It is now also supported on the `applyOpPatternsAndFold(ArrayRef<Operation *>, ...)` overload.
Differential Revision: https://reviews.llvm.org/D141904
Timm Bäder [Fri, 20 Jan 2023 08:09:55 +0000 (09:09 +0100)]
[clang][Interp] Initialize remaining InlineDescriptor fields
for local variables. Hoping this will please msan.
Nikita Popov [Thu, 19 Jan 2023 13:28:58 +0000 (14:28 +0100)]
[Flang] Explicitly include cstdint (NFC)
This header uses std::int8_t, but does not include cstdint.
This fixes the build against libstdc++ 13, where some indirect
header includes have been removed.
Sergey Kachkov [Fri, 13 Jan 2023 13:02:21 +0000 (16:02 +0300)]
[GVN] Refactor findDominatingLoad function
Improve findDominatingLoad implementation:
1. Result is saved into gvn::AvailableValue struct
2. Search is done in extended BB (while there is a single predecessor or
limit is reached)
Differential Revision: https://reviews.llvm.org/D141680
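The extended-basic-block search can be sketched roughly as below. `Block`, `Available`, and `findDominatingLoadSketch` are hypothetical simplifications (in particular, whether the real search inspects the starting block itself is an assumption here); they are not the GVN data structures.

```cpp
#include <vector>

// Hypothetical block: at most one interesting load, plus its predecessors.
struct Block {
  std::vector<const Block *> preds;
  bool hasLoad = false;
  int loadedValue = 0;
};

// Hypothetical result struct, loosely analogous to an "available value".
struct Available {
  bool found = false;
  int value = 0;
};

// Walks up the chain of single predecessors, scanning at most `maxBlocks`
// blocks, and records the first load found in the result struct.
inline Available findDominatingLoadSketch(const Block *start,
                                          unsigned maxBlocks) {
  Available result;
  const Block *bb = start;
  for (unsigned scanned = 0; bb && scanned < maxBlocks; ++scanned) {
    if (bb->hasLoad) {
      result.found = true;
      result.value = bb->loadedValue;
      return result;
    }
    // Continue only while the current block has exactly one predecessor.
    bb = bb->preds.size() == 1 ? bb->preds.front() : nullptr;
  }
  return result;
}
```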
Viktoriia Bakalova [Mon, 16 Jan 2023 16:17:47 +0000 (16:17 +0000)]
[include-mapping] Parse zombie_names.html into a removed symbols map.
Differential Revision: https://reviews.llvm.org/D141855
Kristof Beyls [Fri, 20 Jan 2023 08:49:16 +0000 (09:49 +0100)]
Add security group 2022 transparency report.
Craig Topper [Fri, 20 Jan 2023 08:41:14 +0000 (00:41 -0800)]
Revert "[X86][WIP] Change precision control to FP80 during u64->fp32 conversion on Windows."
This reverts commit 928a1764d6bdf84073c9d85875f45c1716d6ff12.
Committed accidentally
Craig Topper [Fri, 20 Jan 2023 08:20:13 +0000 (00:20 -0800)]
[RISCV][TableGen] Use getAllDerivedDefinitions in RISCVTargetDefEmitter to simplify the code. NFC
Craig Topper [Fri, 20 Jan 2023 05:34:23 +0000 (21:34 -0800)]
[X86][WIP] Change precision control to FP80 during u64->fp32 conversion on Windows.
This is an alternative to D141074 to fix the problem by adjusting
the precision control dynamically.
This isn't quite complete yet. I want to support fadd with a load
folded into it too. That's the code we will usually generate.
Posting for early review so we can do some testing of this solution.
Differential Revision: https://reviews.llvm.org/D142178
Nicolas Vasilache [Wed, 18 Jan 2023 20:26:13 +0000 (12:26 -0800)]
[mlir][Linalg] Add a structured.pack_transpose transform op
This transform is complementary to the `structured.pack` op which
allows packing a whole op but does not allow transposes on the individual
operands.
`structured.pack_transpose` allows transposing single operands connected to
pack or unpack ops after the fact.
This makes the system overall more composable than e.g. a giant transform
op with all permutation specified at once.
Differential Revision: https://reviews.llvm.org/D142053
Nicolas Vasilache [Fri, 20 Jan 2023 08:06:34 +0000 (00:06 -0800)]
[mlir][Linalg] Fix crash in LinalgToStandard
Properly handle `appendMangledType` failure instead of asserting.
Fixes #59986.
Nicolas Vasilache [Fri, 20 Jan 2023 08:06:34 +0000 (00:06 -0800)]
[mlir][Linalg] Add missing test
c3f0efe753e27105b519ae9283796d41fe574741 lacked a test, added here.
Kadir Cetinkaya [Fri, 20 Jan 2023 07:57:08 +0000 (08:57 +0100)]
[clangd] Fix shared lib builds
Uday Bondhugula [Fri, 20 Jan 2023 06:53:32 +0000 (12:23 +0530)]
NFC. Refactor affine fusion code for readability
Replace a couple of check instances with llvm::any_of (clang-tidy
warnings). Factor out "canCreatePrivateMemRef" and
"performFusionsIntoDest" into separate methods to reduce the
length/indent of the containing methods. Add doc comments and debug messages.
Mark as const some of the methods that should have been const.
NFC.
Reviewed By: vinayaka-polymage
Differential Revision: https://reviews.llvm.org/D142076
Nicolas Vasilache [Fri, 20 Jan 2023 07:29:16 +0000 (23:29 -0800)]
[mlir][Linalg] Fix crash in LinalgToStandard
Use rewriter.notifyMatchFailure instead of assert.
Fixes #59986.
LLVM GN Syncbot [Fri, 20 Jan 2023 06:58:57 +0000 (06:58 +0000)]
[gn build] Port 21f4232dd963
Nikolas Klauser [Mon, 21 Nov 2022 11:41:15 +0000 (12:41 +0100)]
[libc++] Enable segmented iterator optimizations for join_view::iterator
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D138413
Kazu Hirata [Fri, 20 Jan 2023 06:49:31 +0000 (22:49 -0800)]
[ADT,Support] Include compiler.h
This restores builds with gcc-9, which does not have __has_builtin.
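For context, a common way to stay compatible with compilers that lack `__has_builtin` is a fallback macro like the one sketched below. That the included header provides exactly this fallback is an assumption here, and `popcountPortable` is a hypothetical function used only to exercise the macro.

```cpp
// Fallback for compilers that do not provide __has_builtin (such as gcc-9):
// treat every builtin query as "not available".
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Uses the builtin when the compiler reports it, otherwise a portable loop.
inline int popcountPortable(unsigned value) {
#if __has_builtin(__builtin_popcount)
  return __builtin_popcount(value);
#else
  int bits = 0;
  for (; value != 0; value &= value - 1) // clears the lowest set bit
    ++bits;
  return bits;
#endif
}
```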
Lang Hames [Fri, 20 Jan 2023 05:33:41 +0000 (21:33 -0800)]
[ORC][ORC-RT] Add support for callback-based lookup of JIT'd MachO unwind info.
In LLVM the MachOPlatform class is modified to identify unwind info sections
and the address ranges of the functions these sections cover. These address
ranges are then communicated to the ORC runtime by attaching them to the
register-object-platform-sections allocation action.
In the ORC runtime the unwind-info section addresses are recorded and used to
support lookup of unwind info via the new `findDynamicUnwindSections` function.
At bootstrap time the ORC runtime checks for the presence of new
unwind-info-lookup-registration functions in libunwind (see
https://reviews.llvm.org/D142176), and if available uses them to register the
`findDynamicUnwindSections` function with libunwind to enable callback-based
lookup. If the new unwind-info-lookup-registration functions are not available
then the ORC runtime falls back to using the existing libunwind registration
APIs.
The callback-based scheme is intended to address three shortcomings in the
current registration scheme for JIT'd unwind info on Darwin: (1) Lack of
compact-unwind support, (2) inability to describe the subarchitecture of JIT'd
frames, and (3) lack of efficient address-based lookup data structures in
libunwind.
For more details see the proposed libunwind changes in
https://reviews.llvm.org/D142176.
Nikolas Klauser [Fri, 20 Jan 2023 06:06:40 +0000 (07:06 +0100)]
[libc++] Mark LWG3349 as complete
Craig Topper [Fri, 20 Jan 2023 05:36:07 +0000 (21:36 -0800)]
Revert "[X86] Avoid converting u64 to f32 using x87 on Windows"
This reverts commit a6e3027db7ebe6863e44bafcfeaacc16bdc88a3f.
Chrome and Halide are both reporting issues with importing builtins.
Maybe the better direction is to manually adjust FPCW for the inline
sequence on Windows.
Kazu Hirata [Fri, 20 Jan 2023 05:15:39 +0000 (21:15 -0800)]
[llvm] Move bit counting functions to bit.h (NFC)
This patch provides C++20-style countl_zero, countr_zero, countl_one,
and countr_one in bit.h. Existing functions like countLeadingZeros
become wrappers around the new functions.
Note that I cannot quite declare countLeadingZeros as:
template <class T> using countLeadingZeros = countl_zero<T>;
because countl_zero returns int, whereas countLeadingZeros returns
unsigned.
Differential Revision: https://reviews.llvm.org/D142078
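The wrapper arrangement can be sketched as follows. Both functions are hand-written illustrations with hypothetical names (`countl_zero_sketch`, `countLeadingZerosSketch`), not the code in bit.h; the point is only that a thin wrapper function can keep the old `unsigned` return type while delegating to an `int`-returning C++20-style function.

```cpp
#include <limits>
#include <type_traits>

// Hand-written sketch of a C++20-style countl_zero; returns int, matching the
// return type of std::countl_zero.
template <typename T>
int countl_zero_sketch(T value) {
  static_assert(std::is_unsigned<T>::value, "only unsigned integer types");
  int count = 0;
  // Scan from the most significant bit down until a set bit is found.
  for (T mask = T(1) << (std::numeric_limits<T>::digits - 1); mask != 0;
       mask >>= 1) {
    if (value & mask)
      break;
    ++count;
  }
  return count;
}

// Legacy-style wrapper: a function template, so the return type can stay
// unsigned even though the underlying function returns int.
template <typename T>
unsigned countLeadingZerosSketch(T value) {
  return static_cast<unsigned>(countl_zero_sketch(value));
}
```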
LLVM GN Syncbot [Fri, 20 Jan 2023 05:06:41 +0000 (05:06 +0000)]
[gn build] Port b40a3d73dc9c
Nikolas Klauser [Tue, 6 Sep 2022 17:01:17 +0000 (19:01 +0200)]
[libc++] Remove old CI configurations and update the supported compiler versions
`_LIBCPP_REMOVE_TRANSITIVE_INCLUDES` doesn't do anything anymore in C++23 mode, so it's now just a duplicate of the C++23 configuration.
Also add new steps to the post-release checklist for updating the supported compilers.
Reviewed By: ldionne, #libc
Spies: arichardson, libcxx-commits, arphaman
Differential Revision: https://reviews.llvm.org/D133364
Nikolas Klauser [Sun, 11 Dec 2022 10:32:54 +0000 (11:32 +0100)]
[libc++] Implement P2446R2 (views::as_rvalue)
Reviewed By: ldionne, var-const, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D137637
Advenam Tacet [Fri, 20 Jan 2023 04:52:18 +0000 (20:52 -0800)]
Adding missing colon
Simple typo fix.
The absence of this colon may be confusing and result in misinterpretation of the result.
In normal libfuzzer mode, that colon is present.
You can compare with:
https://github.com/llvm/llvm-project/blob/aa0e9046c16bf27a8affbd903e2e3cad924a5217/compiler-rt/lib/fuzzer/FuzzerLoop.cpp#L356
Reviewed By: #sanitizers, vitalybuka
Differential Revision: https://reviews.llvm.org/D142171
Shilei Tian [Fri, 20 Jan 2023 03:24:23 +0000 (22:24 -0500)]
[Clang][OpenMP] Allow `f16` literal suffix when compiling OpenMP target offloading for NVPTX
Fix #58087.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D142075
Yaxun (Sam) Liu [Thu, 19 Jan 2023 15:37:55 +0000 (10:37 -0500)]
[HIP] Unbundler allows missing host entry
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D142118
Charles Magahern [Mon, 9 Jan 2023 22:28:29 +0000 (14:28 -0800)]
[clang] Don't short-circuit constant evaluation for array or record types
`FastEvaluateAsRValue` returns `true` without setting a result value when a
given constant expression is an array or record type.
Clang attributes must be able to support constant expressions that are array or
record types, so proceed with the slower path for evaluation in the case where
`FastEvaluateAsRValue` does not yield an evaluation result.
Differential Revision: https://reviews.llvm.org/D141745
Matt Arsenault [Sat, 14 Jan 2023 13:50:55 +0000 (08:50 -0500)]
llvm-reduce: Trim includes and avoid using namespace in a header
Ben Shi [Thu, 19 Jan 2023 10:06:27 +0000 (18:06 +0800)]
[AVR] Fix incorrectly printed global symbol operands in inline-asm
Fixes https://github.com/llvm/llvm-project/issues/58879
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D142096
River Riddle [Fri, 20 Jan 2023 01:42:30 +0000 (17:42 -0800)]
[mlir][LLVM] Tidy up DebugTranslation casting
Add a specific class for local scope attributes and remove
some unnecessary casts.
Matt Arsenault [Thu, 19 Jan 2023 21:21:05 +0000 (17:21 -0400)]
llvm-reduce: Fix typo in help text
Matt Arsenault [Thu, 19 Jan 2023 20:13:37 +0000 (16:13 -0400)]
llvm-reduce: Use WithColor in another error message
Matt Arsenault [Tue, 17 Jan 2023 19:31:17 +0000 (14:31 -0500)]
llvm-reduce: Account for initializer complexity
Matt Arsenault [Tue, 17 Jan 2023 00:41:49 +0000 (19:41 -0500)]
llvm-reduce: Account for aliases and ifuncs in IR complexity score
Matt Arsenault [Mon, 16 Jan 2023 16:00:01 +0000 (11:00 -0500)]
llvm-reduce: Use consistent ReductionFunc types
Some of these were relying on ReducerWorkItem's operator Module&.
Jonas Devlieghere [Fri, 20 Jan 2023 01:07:12 +0000 (17:07 -0800)]
[lldb] Re-enable xmm/ymm/zmm tests with the system debugserver
Re-enable the xmm/ymm/zmm tests now that the system debugserver used by
our CI is capable of writing xmm/ymm/zmm registers.
Nico Weber [Fri, 20 Jan 2023 01:06:44 +0000 (20:06 -0500)]
Revert "[gn] port a033dbbe5c43 (clang-stat-cache)"
This reverts commit 8d498e08deaf6e06a578cfedb4eb259b722ac7f6.
a033dbbe5c43 was reverted in cf12709222a4.
Nico Weber [Fri, 20 Jan 2023 01:04:38 +0000 (20:04 -0500)]
Revert "[clang][Darwin] Try to guess the SDK root with xcrun when unspecified"
This reverts commit ecade80d93960ad01d8665db02c23841e055a80f.
Breaks tests on macOS and tries to run xcrun on non-mac platforms,
see comments on https://reviews.llvm.org/D136315
Arthur Eubanks [Fri, 20 Jan 2023 00:59:36 +0000 (16:59 -0800)]
Revert "[LoopUnroll] Directly update DT instead of DTU."
This reverts commit d0907ce7ed9f159562ca3f4cfd8d87e89e93febe.
Causes `opt -passes=loop-unroll-full` to crash on
```
define void @foo() {
bb:
br label %bb1
bb1: ; preds = %bb1, %bb1, %bb
switch i1 true, label %bb1 [
i1 true, label %bb2
i1 false, label %bb1
]
bb2: ; preds = %bb1
ret void
}
```
Benjamin Kramer [Thu, 19 Jan 2023 16:58:54 +0000 (17:58 +0100)]
[Linalg] Don't create complex vectors when vectorizing copies
vector<complex<...>> is currently not valid. This is a reduced version
of https://reviews.llvm.org/D141578
Differential Revision: https://reviews.llvm.org/D142131
Argyrios Kyrtzidis [Thu, 19 Jan 2023 18:40:11 +0000 (10:40 -0800)]
[Lex] For dependency directive lexing, angled includes in `__has_include` should be lexed as string literals
rdar://104386604
Differential Revision: https://reviews.llvm.org/D142143
Diego Caballero [Thu, 19 Jan 2023 22:59:02 +0000 (22:59 +0000)]
[mlir][vector] Disable folding for masked reductions
Reductions can't be folded into plain arith ops until we can mask
those arith ops.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D141645
Daniele Castagna [Sun, 25 Dec 2022 20:36:41 +0000 (12:36 -0800)]
CUDA/HIP: Use kernel name to map to symbol
Currently CGCUDANV uses an llvm::Function as a key to map kernels to a
symbol in host code. HIP adds one level of indirection and uses the
llvm::Function to map to a global variable that will be initialized to
the kernel stub ptr.
Unfortunately there is no guarantee that the llvm::Function created
by GetOrCreateLLVMFunction will be the same. In fact, the first
time we encounter GetOrCreateLLVMFunction for a kernel, the type
might not be completed yet, and the type of llvm::Function will be
a generic {}, since the complete type is not required to get a symbol
to a function. In this case we end up creating two global variables,
one for the llvm::Function with the incomplete type and one for the
function with the complete type. The first global variable will be
declared but not defined, resulting in a linking error.
This change uses the mangled name of the llvm::Function as key in the
KernelHandles map. This way the same llvm::Function will be
associated to the same kernel handle even if their types are different.
Reviewed By: yaxunl
Differential Revision: https://reviews.llvm.org/D140663
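The keying change can be illustrated with a small model. `Function`, `KernelHandle`, and `KernelHandles` below are hypothetical stand-ins for this sketch, not the CGCUDANV types; the idea shown is only that two function objects with the same mangled name resolve to one handle.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a function object that may be created twice with
// different (incomplete vs. complete) types.
struct Function {
  std::string mangledName;
  std::string type;
};

// Hypothetical handle associated with a kernel.
struct KernelHandle {
  int id;
};

class KernelHandles {
public:
  // Returns the handle for the kernel, creating it on first use. The map is
  // keyed by mangled name, so the object identity and type do not matter.
  KernelHandle &getOrCreate(const Function &fn) {
    auto it = handles_.find(fn.mangledName);
    if (it == handles_.end())
      it = handles_.emplace(fn.mangledName, KernelHandle{nextId_++}).first;
    return it->second;
  }

  std::size_t size() const { return handles_.size(); }

private:
  std::unordered_map<std::string, KernelHandle> handles_;
  int nextId_ = 0;
};
```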
Jeffrey Byrnes [Thu, 12 Jan 2023 01:06:48 +0000 (17:06 -0800)]
[AMDGPU] Further reduce attaching of implicit operands to spills
Extension of https://reviews.llvm.org/D141101 to further reduce the number of implicit operands we attach. The main benefit is to improve the capability of the post-RA scheduler and to reduce unneeded dependency resolution (e.g. inserting snops).
Unfortunately, if we (naively) minimize the number of implicit operands completely, we run into some regressions (e.g. dual_movs are replaced with multiple calls to v_mov). This is even more reason to switch to LiveRegUnits.
Nonetheless, this patch removes the operands which we can for free (more or less).
Change-Id: Ib4f409202b36bdbc59eed615bc2d19fa8bd8c057
Differential Revision: https://reviews.llvm.org/D141557
Change-Id: I8b039e3c0d39436b384083f8beb947ee1b1730b2
Geoffrey Martin-Noble [Thu, 19 Jan 2023 21:57:33 +0000 (13:57 -0800)]
[Bazel] Fix layering issues
These are caught by clang-16, which we're using in our project.
Reviewed By: rupprecht
Differential Revision: https://reviews.llvm.org/D142158
serge-sans-paille [Thu, 19 Jan 2023 22:00:53 +0000 (23:00 +0100)]
[llvm] Cleanup edit_distance short circuiting
Also prevent integer overflow if MaximumDistance == UINT_MAX.
This is a follow-up to 6ad1b4095172373590134afff19a7fbad9d7889d
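One way the overflow hazard can arise and be avoided is sketched below. This is an illustrative bounded Levenshtein implementation under stated assumptions: `exceeded` and `boundedEditDistance` are hypothetical names, and the "return limit + 1 when over the limit" convention is assumed for the sketch rather than taken from the patch.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <string>
#include <vector>

// Returns limit + 1 without wrapping around when limit == UINT_MAX.
inline unsigned exceeded(unsigned limit) {
  return limit == UINT_MAX ? UINT_MAX : limit + 1;
}

// Levenshtein distance with short circuiting: once every entry of a row is
// above `limit`, the final distance must be above it as well.
inline unsigned boundedEditDistance(const std::string &a, const std::string &b,
                                    unsigned limit) {
  // The length difference is a lower bound on the distance.
  const std::size_t diff =
      a.size() > b.size() ? a.size() - b.size() : b.size() - a.size();
  if (diff > limit)
    return exceeded(limit);

  std::vector<unsigned> row(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j)
    row[j] = static_cast<unsigned>(j);

  for (std::size_t i = 1; i <= a.size(); ++i) {
    unsigned diagonal = row[0]; // previous row, previous column
    row[0] = static_cast<unsigned>(i);
    unsigned best = row[0];
    for (std::size_t j = 1; j <= b.size(); ++j) {
      const unsigned above = row[j];
      const unsigned substitute = diagonal + (a[i - 1] == b[j - 1] ? 0u : 1u);
      row[j] = std::min({substitute, above + 1, row[j - 1] + 1});
      diagonal = above;
      best = std::min(best, row[j]);
    }
    if (best > limit)
      return exceeded(limit); // short circuit
  }
  const unsigned distance = row[b.size()];
  return distance > limit ? exceeded(limit) : distance;
}
```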
Evgenii Stepanov [Wed, 18 Jan 2023 00:17:03 +0000 (16:17 -0800)]
Revert "[AArch64][v8.3A] Avoid inserting implicit landing pads (PACI*SP)"
The Linux kernel sets SCTLR_EL1.BT0 and BT1 to 1 unconditionally, which
makes PACIASP equivalent to BTI C + PACIA LR,SP.
Use the shorter instruction sequence by default.
I'm not aware of anyone who needs the opposite. They are welcome to
revert to the current behavior under a subtarget feature or an
environment check.
This reverts commit 571c8c5263a79293aaadae07b11feb36726eaf53.
Differential Revision: https://reviews.llvm.org/D141978
Fred Riss [Thu, 19 Jan 2023 21:51:25 +0000 (13:51 -0800)]
Revert "[Clang] Give Clang the ability to use a shared stat cache"
This reverts commit c5abe893120b115907376359a5809229a9f9608a.
This reverts commit a033dbbe5c43247b60869b008e67ed86ed230eaa.
This broke the build with -DLLVM_LINK_LLVM_DYLIB=ON. Reverting while I
investigate.
Volodymyr Sapsai [Tue, 6 Dec 2022 02:03:08 +0000 (18:03 -0800)]
[ODRHash] Detect mismatches in anonymous `RecordDecl`.
Allow completing a redeclaration check for anonymous structs/unions
inside `RecordDecl`, so we deserialize and compare anonymous entities
from different modules.
Completing the redeclaration chain for `RecordDecl` in
`ASTContext::getASTRecordLayout` mimics the behavior in
`CXXRecordDecl::dataPtr`. Instead of completing the redeclaration chain
every time we request a definition, do that right before we need a
complete definition in `ASTContext::getASTRecordLayout`.
Such code is required only for anonymous `RecordDecl` because we
deserialize named decls when we look them up by name. But it doesn't
work for anonymous decls as they don't have a name. That's why we need to
force deserialization of anonymous decls in a different way.
rdar://81864186
Differential Revision: https://reviews.llvm.org/D140055
Volodymyr Sapsai [Fri, 2 Dec 2022 02:39:23 +0000 (18:39 -0800)]
[ODRHash] Hash `RecordDecl` and diagnose discovered mismatches.
When two modules contain struct/union with the same name, check the
definitions are equivalent and diagnose if they are not. This is similar
to `CXXRecordDecl` where we already discover and diagnose mismatches.
rdar://problem/56764293
Differential Revision: https://reviews.llvm.org/D71734
Alexey Bataev [Tue, 10 Jan 2023 21:49:05 +0000 (13:49 -0800)]
[SLP]Improve isGatherShuffledEntry by looking deeper through the reused scalars.
The compiler may produce better results if it does not look for
constants, uses an extra analysis of phi nodes, and looks through all tree
nodes without skipping the cases where the very first set of nodes is
empty. Also, it tries to reshuffle the nodes if doing so is definitely
profitable, i.e. at least 2 scalars are used for single node permutation and at
least 3 scalars are used for the permutation of 2 nodes.
Part of D110978
Differential Revision: https://reviews.llvm.org/D141512
Ye Luo [Thu, 19 Jan 2023 21:39:05 +0000 (15:39 -0600)]
[OpenMP] Build device runtimes for sm_89 and sm_90
Paul Robinson [Thu, 19 Jan 2023 21:08:03 +0000 (13:08 -0800)]
Reland "[lit] Stop supporting triple substrings in UNSUPPORTED and XFAIL"
This reverts commit 2f8b920f95aa1e308193cf5803df7912025e8400.
Forgot to update lit's own tests.
Alexander Shaposhnikov [Thu, 19 Jan 2023 20:57:24 +0000 (20:57 +0000)]
[Clang] Add lifetimebound attribute to std::move/std::forward
Clang now automatically adds [[clang::lifetimebound]] to the parameters of
std::move, std::forward et al. This enables Clang to diagnose more cases
where the returned reference outlives the object.
Associated GitHub issue: https://github.com/llvm/llvm-project/issues/60020
Test plan: ninja check-clang check-all
Differential revision: https://reviews.llvm.org/D141744
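To show what the attribute expresses, here is a sketch that applies it by hand to a user-defined function. `SKETCH_LIFETIMEBOUND` and `pickLonger` are hypothetical names for this example; the macro expands to nothing on compilers other than Clang.

```cpp
#include <string>

// Only Clang understands this attribute; other compilers get an empty macro.
#if defined(__clang__)
#define SKETCH_LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define SKETCH_LIFETIMEBOUND
#endif

// Returns a reference to one of its arguments, so the result must not outlive
// either of them. The attribute tells Clang about that relationship.
inline const std::string &pickLonger(const std::string &a SKETCH_LIFETIMEBOUND,
                                     const std::string &b SKETCH_LIFETIMEBOUND) {
  return a.size() >= b.size() ? a : b;
}

// With the attribute in place, Clang can warn about code such as
//   const std::string &r = pickLonger(std::string("temporary"), other);
// where `r` would refer to a temporary destroyed at the end of the statement.
```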
Florian Hahn [Thu, 19 Jan 2023 20:34:22 +0000 (20:34 +0000)]
[VPlan] Add vp_depth_first_deep (NFC)
Similar to vp_depth_first_shallow (D140512) add vp_depth_first_deep to
make existing code clearer and more compact.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142055
Gilles Gouaillardet [Thu, 19 Jan 2023 20:24:00 +0000 (14:24 -0600)]
[OpenMP][libomp] Insert correct HWLOC version guards
Put needed HWLOC version guards around relevant HWLOC API.
Tested OpenMP host runtime build with HWLOC 1.11.13, 2.0-2.9.
Differential Revision: https://reviews.llvm.org/D142152
Fix #54951
Thomas Raoux [Thu, 19 Jan 2023 13:38:38 +0000 (13:38 +0000)]
[mlir] Update VectorToGPU to new memory space
GPU memory spaces have changed to new attributes. Update VectorToGPU pass
to use those.
Differential Revision: https://reviews.llvm.org/D142105
Frederik Gossen [Thu, 19 Jan 2023 20:05:05 +0000 (15:05 -0500)]
Paul Kirth [Thu, 19 Jan 2023 16:14:43 +0000 (16:14 +0000)]
[llvm][codegen] Fix non-determinism in StackFrameLayoutAnalysisPass output
We were iterating over a SmallPtrSet when outputting slot variables.
This is still correct but made the test fail under reverse iteration.
This patch replaces the SmallPtrSet with a SmallVector.
Also remove the "Stack Frame Layout" lines from arm64-opt-remarks-lazy-bfi test,
since those also break under reverse iteration.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D142127
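The ordering issue can be illustrated with standard containers. `SlotVar` and `SlotVarList` are hypothetical names, and the duplicate check is an assumption of this sketch; the point is that a vector is iterated in insertion order, whereas a pointer-keyed set is iterated in an order that depends on addresses or hashing.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical slot variable; only the name matters for printing.
struct SlotVar {
  std::string name;
};

// Records variables in first-seen order. Iteration order is the insertion
// order, independent of pointer values, so the printed output is stable.
class SlotVarList {
public:
  void insert(const SlotVar *var) {
    if (std::find(vars_.begin(), vars_.end(), var) == vars_.end())
      vars_.push_back(var); // skip duplicates, keep first position
  }

  std::string print() const {
    std::string out;
    for (const SlotVar *var : vars_) {
      out += var->name;
      out += '\n';
    }
    return out;
  }

private:
  std::vector<const SlotVar *> vars_;
};
```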
Jim Ingham [Thu, 19 Jan 2023 19:59:54 +0000 (11:59 -0800)]
Remove the undocumented `help` subcommand.
This is processed by hand in CommandObjectMultiword, is undiscoverable,
and doesn't work in all cases. Because it is a bare word, it can't really be
extended w/o introducing the possibility of collisions as well. If we did
want to do something like this, we should add a --help flag to CommandObject. That
way the feature would be consistent and documented.
Differential Revision: https://reviews.llvm.org/D142067
Mark de Wever [Thu, 5 May 2022 16:57:32 +0000 (18:57 +0200)]
[libc++][format] range-default-formatter for map
Implements the range-default-formatter specialization range_format::map.
Implements parts of
- P2286R8 Formatting Ranges
- P2585R0 Improving default container formatting
Depends on D140653
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D140801
Mehdi Amini [Tue, 17 Jan 2023 11:36:54 +0000 (11:36 +0000)]
Remove useless / untested verifier in scf.foreach_thread (NFC)
Arvind Sudarsanam [Thu, 19 Jan 2023 18:24:46 +0000 (10:24 -0800)]
[opt] Fix static code analysis concerns
This is an issue reported inside the NewPMDriver module. The static analyzer reported that null pointer 'P' may be dereferenced at line 371 and two more sites. The proposed change guards these uses.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D142047
Noah Goldstein [Thu, 19 Jan 2023 19:02:40 +0000 (11:02 -0800)]
Removing 'TuningSlow3OpsLEA' from ICL config
According to https://uops.info/ ICL and newer have fast 3-term LEA.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D141974
Noah Goldstein [Thu, 19 Jan 2023 19:02:26 +0000 (11:02 -0800)]
Add transform ctpop(X) -> 1 iff X is non-zero power of 2
Definitionally a non-zero power of 2 will only have 1 bit set so this
is a freebee.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D141990
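The identity behind the transform can be checked with a small sketch. `popcountSketch` and `isNonZeroPowerOfTwo` are hypothetical helper names written for this example and are not InstCombine code.

```cpp
#include <cstdint>

// Portable population count: each iteration clears the lowest set bit.
inline unsigned popcountSketch(std::uint32_t x) {
  unsigned bits = 0;
  for (; x != 0; x &= x - 1)
    ++bits;
  return bits;
}

// A non-zero power of two has exactly one bit set, so clearing the lowest set
// bit leaves zero.
inline bool isNonZeroPowerOfTwo(std::uint32_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}
```

For every `x` where `isNonZeroPowerOfTwo(x)` holds, `popcountSketch(x)` is 1, which is why the ctpop call can be replaced by the constant.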
Noah Goldstein [Thu, 19 Jan 2023 19:01:25 +0000 (11:01 -0800)]
Add tests for ctpop(X) where X is a power of 2; NFC
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D141989
Frederik Gossen [Thu, 19 Jan 2023 19:25:55 +0000 (14:25 -0500)]
LLVM GN Syncbot [Thu, 19 Jan 2023 19:13:26 +0000 (19:13 +0000)]
[gn build] Port
c90801457f7c
Nikolas Klauser [Sun, 20 Nov 2022 22:16:20 +0000 (23:16 +0100)]
[libc++] Refactor deque::iterator algorithm optimizations
This has multiple benefits:
- The optimizations are also performed for the `ranges::` versions of the algorithms
- Code duplication is reduced
- It is simpler to add this optimization for other segmented iterators,
like `ranges::join_view::iterator`
- Algorithm code is removed from `<deque>`
Reviewed By: ldionne, huixie90, #libc
Spies: mstorsjo, sstefan1, EricWF, libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D132505
Stanislav Mekhanoshin [Wed, 18 Jan 2023 23:20:36 +0000 (15:20 -0800)]
[AMDGPU] Treat WMMA the same as MFMA for sched_barrier
MFMA and WMMA are essentially the same thing, but appear on different ASICs.
Differential Revision: https://reviews.llvm.org/D142062
Stanislav Mekhanoshin [Wed, 18 Jan 2023 19:58:50 +0000 (11:58 -0800)]
[AMDGPU] Introduce separate register limit bias in scheduler
Current implementation abuses ErrorMargin to apply an additional
bias to VGPR and SGPR limits under a high register pressure. The
ErrorMargin exists to account for inaccuracies of the RP tracker
and not to tackle an excess pressure. Introduce separate bias for
this purpose and also make it different for SGPRs and VGPRs as we
may want to use different values in the future.
This is supposed to be NFC, however there is a subtle difference
when subtracting a margin overflows the limit. Doing two subtractions
makes it less probable, although it manifests only in mir tests with
an artificially small register budget.
Differential Revision: https://reviews.llvm.org/D142051
Joseph Huber [Thu, 19 Jan 2023 18:48:28 +0000 (12:48 -0600)]
[Clang][NFC] Tweak error message for GPU architecture tools
Summary:
There shouldn't be an extra newline in these messages.
Joseph Huber [Thu, 19 Jan 2023 17:20:07 +0000 (11:20 -0600)]
[LinkerWrapper] Use `clang` to perform the device linking
Right now in the linker wrapper we manually invoke a lot of the
toolchain programs. This reproduces a lot of logic that is already
handled in clang. Since D140158 we can now target all supported
toolchains directly via cross-compilation.
This patch changes the linker wrapper to consolidate all the alternate
linking and assembler steps into a generic call to `clang` and let clang
handle the argument handling. This heavily simplifies the interface.
Reviewed By: tra, JonChesterfield
Differential Revision: https://reviews.llvm.org/D142133
Krzysztof Drewniak [Thu, 19 Jan 2023 18:41:00 +0000 (18:41 +0000)]
Revert "[mlir][Index] Implement InferIntRangeInterface"
This reverts commit 455305624884cf9237143e2ba0635fcc5ba5206a.
Linker error, unbreak build while I work out how to fix it.
Differential Revision: https://reviews.llvm.org/D142142
Gulfem Savrun Yeniceri [Sat, 14 Jan 2023 00:48:32 +0000 (00:48 +0000)]
[IRLinker] Replace CallInstr with CallBase
This patch replaces CallInstr with CallBase to cover InvokeInstr
besides CallInstr while removing nocallback attribute on a call site.
It also extends drop-attribute.ll test to include a case for an invoke
instruction.
Differential Revision: https://reviews.llvm.org/D141740
Arthur Eubanks [Thu, 19 Jan 2023 18:19:27 +0000 (10:19 -0800)]
Revert "Reland [pgo] Avoid introducing relocations by using private alias"
This reverts commit da5a8d14b8cc6cea16ee0929413c0672b47c93d9.
Causes more duplicate symbol errors, see https://bugs.chromium.org/p/chromium/issues/detail?id=1408161.
Frederik Gossen [Thu, 19 Jan 2023 18:18:22 +0000 (13:18 -0500)]
[MLIR] Add InferTypeOpInterface to scf.if op
Differential Revision: https://reviews.llvm.org/D142049
Erich Keane [Tue, 17 Jan 2023 19:29:04 +0000 (11:29 -0800)]
Forbid implicit conversion of constraint expression to bool
As reported in https://github.com/llvm/llvm-project/issues/54524, and
later in https://github.com/llvm/llvm-project/issues/60038, we were not
properly implementing temp.constr.atomic P3. This patch stops implicitly
converting constraints to bool, and ensures the Rvalue conversion takes
place as needed.
Differential Revision: https://reviews.llvm.org/D141954
Florian Hahn [Thu, 19 Jan 2023 18:10:51 +0000 (18:10 +0000)]
[LoopUnroll] Directly update DT instead of DTU.
The scope of DT updates is very limited when unrolling loops: the DT
should only need updating for
* new blocks added
* exiting blocks whose branches we simplified
This can be done manually without too much extra work.
MergeBlockIntoPredecessor also needs to be updated to support direct
DT updates.
This fixes excessive time spent in DTU for some cases. In an internal
example, time spent in LoopUnroll with this patch goes from ~200s to 2s.
It also is slightly positive for CTMark:
* NewPM-O3: -0.13%
* NewPM-ReleaseThinLTO: -0.11%
* NewPM-ReleaseLTO-g: -0.13%
Notable improvements are mafft (~ -0.50%) and lencod (~ -0.30%), with no
workload regressed.
https://llvm-compile-time-tracker.com/compare.php?from=78a9ee7834331fb4360457cc565fa36f5452f7e0&to=687e08d011b0dc6d3edd223612761e44225c7537&stat=instructions:u
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D141487
Matthias Springer [Thu, 19 Jan 2023 18:01:22 +0000 (19:01 +0100)]
[mlir][SCF] Fix crash in loop peeling
Upper bound and step size should be symbols instead of dims.
Differential Revision: https://reviews.llvm.org/D142136