Joseph Huber [Mon, 23 Jan 2023 13:06:53 +0000 (07:06 -0600)]
[Clang] Remove flaky test line from linker wrapper test
Summary:
This test is a little flaky and isn't as necessary anymore now that we
only generate one temporary file.
Nikita Popov [Mon, 23 Jan 2023 13:14:19 +0000 (14:14 +0100)]
[InstCombine] Make worklist check in memcpy from constant fold more precise
The phi operands need to be either in the worklist or be the
alloca itself, because that one does not require replacement.
Sander de Smalen [Thu, 12 Jan 2023 12:30:12 +0000 (12:30 +0000)]
[AArch64][SME2] MOVA tile-to-vector and vector-to-tile should not accept VG suffix
Reviewed By: MattDevereau
Differential Revision: https://reviews.llvm.org/D141601
Sander de Smalen [Fri, 20 Jan 2023 11:46:49 +0000 (11:46 +0000)]
[AArch64][SME2] NFC: Simplify multiclasses for mova/movaz.
Reviewed By: CarolineConcatto
Differential Revision: https://reviews.llvm.org/D142198
Sander de Smalen [Thu, 12 Jan 2023 13:03:44 +0000 (13:03 +0000)]
[AArch64][SME] Allow predicate-as-counter operands for psel
The specification says:
For programmer convenience, an assembler must also accept
predicate-as-counter register names for the destination predicate
register and the first source predicate register
Reviewed By: CarolineConcatto, MattDevereau
Differential Revision: https://reviews.llvm.org/D141603
Jay Foad [Mon, 23 Jan 2023 12:54:38 +0000 (12:54 +0000)]
[BOLT] Fix build error after D142214
Max Kazantsev [Mon, 23 Jan 2023 12:24:41 +0000 (19:24 +0700)]
[Test] Add test exercising scenarios of widening into loop-invariant condition
Max Kazantsev [Mon, 23 Jan 2023 12:19:40 +0000 (19:19 +0700)]
[Test] Add test for PR60234
https://github.com/llvm/llvm-project/issues/60234 explains how widening
of a branch by loop-invariant condition is causing a miscompile.
David Sherwood [Tue, 17 Jan 2023 15:44:09 +0000 (15:44 +0000)]
[AArch64][SVE2p1] Add SVE2.1 fclamp intrinsic
Adds an intrinsic for the following instruction:
* fclamp
Differential Revision: https://reviews.llvm.org/D141942
Anton Bikineev [Wed, 4 Jan 2023 23:51:21 +0000 (00:51 +0100)]
[X86][ABI] Don't preserve return regs for preserve_all/preserve_most CCs
Currently both calling conventions preserve registers that are used to
store a return value. This causes the returned value to be lost:
define i32 @bar() {
%1 = call preserve_mostcc i32 @foo()
ret i32 %1
}
define preserve_mostcc i32 @foo() {
ret i32 2
; preserve_mostcc will restore %rax,
; whatever it was before the call.
}
This contradicts the current documentation (preserve_allcc "behaves
identical to the `C` calling conventions on how arguments and return
values are passed") and also breaks [[clang::preserve_most]].
This change makes CSRs be preserved iff they are not used to store a
return value (e.g. %rax for scalars, {%rax:%rdx} for __int128, %xmm0
for double). For void functions no additional registers are
preserved, i.e. the behaviour is backward compatible with existing
code.
Differential Revision: https://reviews.llvm.org/D141020
Jay Foad [Mon, 23 Jan 2023 12:27:50 +0000 (12:27 +0000)]
[LLDB] Fix build error after D142214
Jannik Silvanus [Thu, 19 Jan 2023 15:04:45 +0000 (16:04 +0100)]
[IR] Avoid creation of GEPs into vectors (in one place)
The method DataLayout::getGEPIndexForOffset(Type *&ElemTy, APInt &Offset)
makes it possible to generate GEP indices for a given byte-based offset.
This allows generating "natural" GEPs using the given type structure
if the byte offset happens to match a nested element object.
With opaque pointers and a general move towards byte-based GEPs [1],
this function may be questionable in the future.
This patch avoids creation of GEPs into vectors in routines that use
DataLayout::getGEPIndexForOffset by not returning indices in that case.
The reason is that A) GEPs into vectors have been discouraged for a long
time [2], and B) that GEPs into vectors are currently broken if the element
type is overaligned [1]. This is also demonstrated by a lit test where
previously InstCombine replaced valid loads by poison. Note that
the result of InstCombine on that test is *still* invalid, because
padding bytes are assumed.
Moreover, GEPs into vectors may be outright forbidden in the future [1].
[1]: https://discourse.llvm.org/t/67497
[2]: https://llvm.org/docs/GetElementPtr.html
The test case is new. It will be precommitted if this patch is accepted.
Differential Revision: https://reviews.llvm.org/D142146
Jannik Silvanus [Thu, 19 Jan 2023 17:56:11 +0000 (18:56 +0100)]
[Transforms] Add lit test for instcombine on load into vector of overaligned elements.
The result is currently broken in two ways:
- Valid loads are replaced by poison
- An array-like layout with padding bytes is assumed
This commit serves as precommit for a patch that addresses the first issue.
The second issue will remain a TODO.
Contributors:
Sebastian Neubauer <sebastian.neubauer@amd.com>
Jeremy Morse [Mon, 23 Jan 2023 12:08:34 +0000 (12:08 +0000)]
[DebugInfo][CSInfo] Don't use clobbered registers as locations
When finding call-site argument locations, don't consider registers to be
location candidates if they will be clobbered between the copy to/from them
and call site. Doing so would present overwritten register values as entry
values in called functions.
This patch adds a collection of register units defined as we walk back from
the call site, and prevents the acceptance of a call-site parameter
location if it will be clobbered on that path.
Fixes https://github.com/llvm/llvm-project/issues/57444
Differential Revision: https://reviews.llvm.org/D141279
Nikita Popov [Mon, 23 Jan 2023 12:10:42 +0000 (13:10 +0100)]
[InstCombine] Add additional memcpy from constant test with phi (NFC)
This is the case that is safe to handle, but currently isn't.
Akash Banerjee [Fri, 13 Jan 2023 15:45:06 +0000 (15:45 +0000)]
[MLIR][OpenMP] Added target data, exit data, and enter data operation definition for MLIR
This includes a basic implementation of the OpenMP 5.1 Target Data, Target Exit Data and Target Enter Data construct operations.
TODO:
- Depend clause support for Target Enter and Exit Data.
- Mapper and Iterator value support for Map Type Modifiers.
- Verifier for the operations.
Co-authored-by: abidmalikwaterloo <amalik@bnl.gov>
Co-authored-by: raghavendra <Raghu.Maddhipatla@amd.com>
Differential Revision: https://reviews.llvm.org/D131915
Simon Pilgrim [Mon, 23 Jan 2023 12:05:49 +0000 (12:05 +0000)]
[X86] Add test coverage for and(ext(and(x, c1)),c2) patterns
This shows the failure to merge to and(ext(x),and(c1,ext(c2))) if the outer and has already been folded to a clear shuffle mask.
Similar to the v8i1-masks.ll regression from D127115.
Jay Foad [Fri, 13 Jan 2023 17:06:41 +0000 (17:06 +0000)]
[MC] Do not copy MCInstrDescs. NFC.
Avoid copying MCInstrDesc instances because a future patch will change
them to find their implicit operands and operand info array based on
their own "this" pointer, so it will only work for MCInstrDescs in the
TargetInsts table, not for a copy of an MCInstrDesc at a different
address.
Differential Revision: https://reviews.llvm.org/D142214
Luca Di Sera [Mon, 23 Jan 2023 11:52:36 +0000 (12:52 +0100)]
Revert "Add clang_CXXMethod_isExplicit to libclang"
This is currently failing the build due to some test errors.
This reverts commit ddbe14084da7f31d4b4b53e13d9f868d759f3673.
Haojian Wu [Thu, 19 Jan 2023 13:44:14 +0000 (14:44 +0100)]
[clang] Fix the location of UsingTypeLoc.
This was revealed by https://reviews.llvm.org/D141280.
```
namespace ns { class Foo {}; }
using ns::Foo;
// Before the fix, the Location of UsingTypeLoc Foo points to the
// token "class"; selection on ^Foo will result in the VarDecl abc.
class Foo abc;
```
Differential Revision: https://reviews.llvm.org/D142125
Noah Goldstein [Mon, 23 Jan 2023 11:35:26 +0000 (03:35 -0800)]
Fix `FindSingleBitChange` to handle NOT(V) where V is not an Instruction
It was previously a bug to assume that the NOT'd Value was always an
Instruction. If the NOT'd value is not an Instruction, we should just
return, as it's either a constant, in which case we will re-run the logic
after constant-folding, or it's a type we can't evaluate anyway.
This is a follow up to: `D140939`
Reviewed By: pengfei, RKSimon
Differential Revision: https://reviews.llvm.org/D142339
Jay Foad [Fri, 13 Jan 2023 13:56:47 +0000 (13:56 +0000)]
[MC] Make more use of MCInstrDesc::operands. NFC.
Change MCInstrDesc::operands to return an ArrayRef so we can easily use
it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end.
A future patch will remove opInfo_begin and opInfo_end.
Also use it instead of raw access to the OpInfo pointer. A future patch
will remove this pointer.
Differential Revision: https://reviews.llvm.org/D142213
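The change from a begin/end pair to a range accessor can be illustrated with a minimal, self-contained sketch. Every type and name below (`OperandInfo`, `ArrayView`, `InstrDesc`, `sumRegClasses`) is a hypothetical stand-in invented for this example; it is not the MC API, and `ArrayView` only imitates the general idea of an `ArrayRef`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-operand record.
struct OperandInfo {
  int RegClass;
};

// Hypothetical, minimal non-owning view over a contiguous array.
template <typename T> struct ArrayView {
  const T *Data;
  std::size_t Size;
  const T *begin() const { return Data; }
  const T *end() const { return Data + Size; }
  std::size_t size() const { return Size; }
};

// Hypothetical descriptor that exposes its operands as one range
// instead of a separate begin/end accessor pair.
struct InstrDesc {
  const OperandInfo *OpInfo;
  unsigned NumOperands;
  ArrayView<OperandInfo> operands() const { return {OpInfo, NumOperands}; }
};

// A caller can now use a range-based for loop directly.
inline int sumRegClasses(const InstrDesc &Desc) {
  int Sum = 0;
  for (const OperandInfo &Op : Desc.operands())
    Sum += Op.RegClass;
  return Sum;
}

// Small usage example.
inline void demoOperands() {
  const OperandInfo Ops[] = {{1}, {2}, {3}};
  const InstrDesc Desc{Ops, 3};
  assert(sumRegClasses(Desc) == 6);
}
```

Returning a single view keeps the pointer and the length together, so callers cannot pair a begin from one descriptor with an end from another.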
Lucas Prates [Tue, 20 Dec 2022 17:19:30 +0000 (17:19 +0000)]
[AArch64] Make CNTPCTSS_EL0 and CNTVCTSS_EL0 system registers read-only
The `CNTPCTSS_EL0` and `CNTVCTSS_EL0` system registers, part of
Armv8.6-A's Enhanced Counter Virtualization extension (FEAT_ECV), are
described as read-only in the Arm ARM. This updates their implementation
to match the spec.
Original patch by Simon Tatham.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D141398
David Green [Mon, 23 Jan 2023 11:22:11 +0000 (11:22 +0000)]
[ARM] Don't emit Arm speculation hardening thunks under Thumb and vice-versa
Given a patch like D129506, using instructions not valid for the current
target feature set becomes an error. This means that emitting Arm
instructions in a Thumb target (or vice versa) becomes an error. When
running in Thumb mode only thumb thunks will be needed, and in Arm mode
only arm thunks are needed. This patch limits the emitted thunks to just
the ones valid for the current architecture.
Differential Revision: https://reviews.llvm.org/D129693
Timm Bäder [Tue, 20 Dec 2022 14:57:32 +0000 (15:57 +0100)]
[clang][Interp][NFC] Remove InitFn code
This is unused.
Nikita Popov [Mon, 23 Jan 2023 11:11:33 +0000 (12:11 +0100)]
[PassBuilder] Detect loop-mssa for licm with parameters (PR60149)
When auto-detecting loop-mssa for licm/lnicm, also handle the case
where there are pass parameters.
Fixes https://github.com/llvm/llvm-project/issues/60149.
Nikita Popov [Mon, 23 Jan 2023 10:57:39 +0000 (11:57 +0100)]
[LICM] Don't generate crash dialog for missing MSSA
This is a user error, so we should not be asking them to report
an issue.
David Spickett [Fri, 13 Jan 2023 14:23:53 +0000 (14:23 +0000)]
[LLDB] Remove return value from DumpRegisterValue
No one ever checks it. Also convert to early return.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D141687
Guillaume Chatelet [Mon, 23 Jan 2023 10:43:34 +0000 (10:43 +0000)]
Revert D142108 "[libc][NFC] Detect host CPU features using try_compile instead of try_run."
Build bots are failing.
https://lab.llvm.org/buildbot/#/builders/90/builds/44634
This reverts commit 9acc2f37bdfce08ca0c2faec03392db10d1bb7a9.
Luca Di Sera [Mon, 23 Jan 2023 09:05:51 +0000 (10:05 +0100)]
Add clang_CXXMethod_isExplicit to libclang
The new method is a wrapper of `CXXConstructorDecl::isExplicit` and
`CXXConversionDecl::isExplicit`, allowing the user to recognize whether
the declaration pointed to by a cursor was marked with the explicit
specifier.
An export for the function, together with its documentation, was added
to "clang/include/clang-c/Index.h" with an implementation provided in
"clang/tools/libclang/CIndex.cpp".
The implementation is based on similar `clang_CXXMethod`
implementations, returning a falsy unsigned value when the cursor is not
a declaration, is not a declaration for a constructor or conversion
function or is not a relevant declaration that was marked with the
`explicit` specifier.
The new symbol was added to "clang/tools/libclang/libclang.map" to be
exported, under the LLVM16 tag.
"clang/tools/c-index-test/c-index-test.c" was modified to print a
specific tag, "(explicit)", for cursors that are recognized by
`clang_CXXMethod_isExplicit`.
Two new regression files, "explicit-constructor.cpp" and
"explicit-conversion-function.cpp", were added to "clang/test/Index", to
ensure that the behavior of the new function is correct for constructors
and conversion functions, respectively.
The "get-cursor.cpp", "index-file.cpp" and
"recursive-cxx-member-calls.cpp" regression files in "clang/test/Index"
were updated as they were affected by the new "(explicit)" tag.
A binding for the new function was added to libclang's python's
bindings, in "clang/bindings/python/clang/cindex.py", as the
"is_explicit_method" method under `Cursor`.
An accompanying test was added to
"clang/bindings/python/tests/cindex/test_cursor.py", mimicking the
regression tests for the C side.
The current release note for Clang, "clang/docs/ReleaseNotes.rst" was
modified to report the new addition under the "libclang" section.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D140756
Dimitry Andric [Mon, 23 Jan 2023 10:21:28 +0000 (11:21 +0100)]
Revert "[compiler-rt][builtins] Skip building (b)float16 support on i386-freebsd"
This reverts commit 45368c75582f0bded1f06d5c82c1f2ee023fb186.
There were some unexpected failures in aarch64 and arm buildbots; I will
have to investigate why these suddenly fell over.
Nikita Popov [Fri, 20 Jan 2023 16:06:46 +0000 (17:06 +0100)]
[Verifier] Check that !nonnull metadata is empty
!nonnull expects an empty metadata argument, so check that this
is the case in the verifier. This came up as a problem in
https://reviews.llvm.org/D141386.
This requires dropping the verifier call in the compatibility-6.0.ll
test (which is not present in any of the other bitcode compatibility
tests). The original input unfortunately used typo'd nonnull
metadata.
Matt Arsenault [Mon, 19 Dec 2022 16:30:12 +0000 (11:30 -0500)]
DAG: Use getNegatedExpression in combineMinNumMaxNum
Computing the negated RHS expression just to see if it compares equal
and throw it away feels dirty.
Matt Arsenault [Thu, 15 Dec 2022 17:57:10 +0000 (12:57 -0500)]
DAG: Look through fneg when trying to create unsafe minnum/maxnum
This makes most sense for isFNegFree targets, but shouldn't make
things worse without it. This avoids AMDGPU test regressions in a
future patch.
For some reason APFloat::compareAbsoluteValue is private, so compute
the neg of the constants.
Shivam Gupta [Mon, 23 Jan 2023 09:45:22 +0000 (15:15 +0530)]
[MLIR][NFC] Fix a memset in MemRefUtils.h
Found by PVS-Studio - https://pvs-studio.com/en/blog/posts/cpp/1003/, N10.
The memset function expects an int as its second argument, but
receives a pointer. The first and second arguments of the call
were mixed up.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D142310
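For reference, the argument order of `memset` is destination pointer, fill value, byte count. The sketch below shows a correct call; `zeroFill` is a hypothetical helper written for this illustration and is not the code in MemRefUtils.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: zero `size` bytes starting at `buffer`.
inline void zeroFill(void *buffer, std::size_t size) {
  // Correct order: (destination, fill byte as int, number of bytes).
  // Swapping the first two arguments would pass a pointer where an
  // int is expected, which is the kind of mix-up described above.
  std::memset(buffer, 0, size);
}

// Small usage example.
inline void demoZeroFill() {
  unsigned char bytes[4] = {1, 2, 3, 4};
  zeroFill(bytes, sizeof(bytes));
  assert(bytes[0] == 0 && bytes[3] == 0);
}
```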
Michael Buch [Sat, 21 Jan 2023 02:07:24 +0000 (02:07 +0000)]
[clang][DebugInfo] Don't canonicalize names in template argument list for alias templates
**Summary**
This patch customizes the `CGDebugInfo` printing policy to stop canonicalizing
the template argument list in `DW_AT_name` for alias templates. The motivation for
this is that we want to be able to use the `TypePrinter`'s support for
omitting defaulted template arguments when emitting `DW_AT_name`.
For reference, GCC currently completely omits the template arguments
when emitting alias template DIEs.
**Testing**
* Added unit-test
Differential Revision: https://reviews.llvm.org/D142268
Timm Bäder [Sun, 1 Jan 2023 12:22:59 +0000 (13:22 +0100)]
[clang][Interp][NFC] Rename InlineDescptor::IsMutable to IsFieldMutable
Shivam Gupta [Mon, 23 Jan 2023 09:26:59 +0000 (14:56 +0530)]
[Flang] fix a copy-paste error in scope.cpp
found by PVS-Studio.
Reviewed By: jeanPerier, klausler
Differential Revision: https://reviews.llvm.org/D142306
Timm Bäder [Thu, 5 Jan 2023 12:40:26 +0000 (13:40 +0100)]
[clang][Interp][NFC] Add Record::getDestructor()
Unused for now but will be used in later commits.
Timm Bäder [Wed, 21 Dec 2022 09:35:20 +0000 (10:35 +0100)]
[clang][Interp][NFC] Remove unused using alias
Guillaume Chatelet [Thu, 19 Jan 2023 14:02:51 +0000 (14:02 +0000)]
[libc][NFC] Detect host CPU features using try_compile instead of try_run.
This implements the same behavior as D141997 but makes sure that the same detection mechanism is used between CMake and source code.
Differential Revision: https://reviews.llvm.org/D142108
Valentin Clement [Mon, 23 Jan 2023 08:44:12 +0000 (09:44 +0100)]
[flang] Deal with NULL() passed as actual arg to unlimited polymorphic dummy
NULL() passed as actual argument to a procedure with an optional
dummy argument is represented with `fir.box<none>` type. When the dummy
argument is polymorphic or unlimited polymorphic, the SelectOp will complain
if the types of the two arguments are not identical. Add a conversion from
`fir.box<none>` to `fir.class<none>` in that case.
Other situations with optional will require a fir.rebox and will be done in
a follow up patch.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D142203
Jannik Silvanus [Tue, 20 Dec 2022 13:03:12 +0000 (14:03 +0100)]
[LangRef] Require i8s to be naturally aligned
It is widely assumed that i8 is naturally aligned (i8:8),
and that hence i8s can be used to access arbitrary bytes.
As discussed in https://discourse.llvm.org/t/status-of-overaligned-i8,
this patch makes this assumption explicit, by documenting it in
the LangRef, and enforcing it when parsing a data layout string.
Historically, there have been data layouts that violate this requirement,
notably the old DXIL data layout that aligns i8 to 32 bits.
A previous patch (df1a74a) enabled importing modules with invalid data layouts
using override callbacks.
Users who wish to continue importing modules with overaligned i8s (e.g. DXIL)
thus need to provide a data layout override callback that fixes the
data layout, at minimum by setting natural alignment for i8.
Any further adjustments to the module (e.g. adding padding bytes if necessary)
need to be done after module import. In the case of DXIL, this should not be
necessary, because i8 usage in DXIL is very limited and its alignment actually
does not matter, see
https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst#primitive-types
Differential Revision: https://reviews.llvm.org/D142211
Johannes Doerfert [Mon, 23 Jan 2023 07:11:55 +0000 (23:11 -0800)]
[Attributor] Add initial support for vectors in AAPointerInfo
While full support requires more work (see TODOs), this allows us to
handle vector writes with a single constant value properly. For now,
we can handle the same constant values stored to all elements if
everything is of a fixed size.
Johannes Doerfert [Mon, 23 Jan 2023 07:18:55 +0000 (23:18 -0800)]
[Attributor] Multi-range accesses can be exact
Even if we have multiple access ranges, the access can be exact. It is
not a MUST access but that is taken care of elsewhere. The tests were
wrong as they contained uninitialized memory. When the memory is
initialized it works as expected.
Johannes Doerfert [Mon, 23 Jan 2023 03:55:32 +0000 (19:55 -0800)]
[OpenMP] Identify non-aligned barriers executed in an aligned context
Even if a barrier does not enforce aligned execution, it will
effectively be like an aligned barrier if it is executed by all threads
in an aligned way. We lack control flow divergence analysis here so we
can only do (basic block) local reasoning for now.
LLVM GN Syncbot [Mon, 23 Jan 2023 04:09:18 +0000 (04:09 +0000)]
[gn build] Port 7458908f12da
Johannes Doerfert [Mon, 23 Jan 2023 04:05:06 +0000 (20:05 -0800)]
[OpenMP][FIX] Ensure not to dereference a nullptr
Nikolas Klauser [Mon, 9 Jan 2023 02:01:26 +0000 (03:01 +0100)]
[libc++] Refactor clang-query checks to clang-tidy checks to get less obscure error messages
Also remove clang-query related code, since it's unused now.
Reviewed By: ldionne, Mordante, #libc
Spies: libcxx-commits, arichardson
Differential Revision: https://reviews.llvm.org/D141805
Nikolas Klauser [Thu, 8 Dec 2022 08:40:54 +0000 (09:40 +0100)]
[libc++] Improve binary size when using __transaction
__exception_guard is a no-op in -fno-exceptions mode to produce better code-gen. This means that we don't provide the strong exception guarantees. However, Clang doesn't generate cleanup code with exceptions disabled, so even if we wanted to provide the strong exception guarantees we couldn't. This is also only relevant for constructs with a stack of -fexceptions > -fno-exceptions > -fexceptions code, since the exception can't be caught where exceptions are disabled. While -fexceptions > -fno-exceptions is quite common (e.g. libc++.dylib > -fno-exceptions), having another layer with exceptions enabled seems a lot less common, especially one that tries to catch an exception through -fno-exceptions code.
Fixes https://github.com/llvm/llvm-project/issues/56783
Reviewed By: ldionne, Mordante, huixie90, #libc
Spies: EricWF, alexfh, hans, joanahalili, libcxx-commits
Differential Revision: https://reviews.llvm.org/D133661
Wang, Xin10 [Mon, 23 Jan 2023 02:37:26 +0000 (10:37 +0800)]
[DAGCombine]Expand usage of CreateBuildVecShuffle to make full use of vector ops
Currently, when llc encounters a BUILD_VECTOR fed by many
extract_vector_elt nodes, it replaces them with a vector_shuffle to
reduce code size. This is done in createBuildVecShuffle in
DAGCombiner.cpp, but that code cannot yet handle the case where the
source vector register is more than twice the size of the destination.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D139685
Kazu Hirata [Mon, 23 Jan 2023 03:14:33 +0000 (19:14 -0800)]
[Support] Use llvm::byteswap in SwapByteOrder.h (NFC)
This patch defines ByteSwap_{32,64} and getSwappedBytes with
llvm::byteswap.
It's tempting to define something like:
template <typename T,
typename = std::enable_if_t<std::is_integral_v<T>>>
inline T getSwappedBytes(T C) { return llvm::byteswap(C); }
But this doesn't work. The host compiler would issue:
error: call to 'getSwappedBytes' is ambiguous
while compiling lldb/source/Utility/UUID.cpp.
Johannes Doerfert [Mon, 23 Jan 2023 02:30:46 +0000 (18:30 -0800)]
[OpenMP][FIX] Adjust enum size to avoid assertion after D142320
Yaxun (Sam) Liu [Fri, 20 Jan 2023 19:42:58 +0000 (14:42 -0500)]
[HIP] Change default offload arch to gfx906
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D142246
Matt Arsenault [Thu, 15 Dec 2022 18:33:16 +0000 (13:33 -0500)]
ARM: Add baseline test for fneg + fcmp + select combine
Sergei Barannikov [Mon, 23 Jan 2023 00:53:22 +0000 (03:53 +0300)]
[MC] Replace single-case switch with an if (NFC)
Same as e5f746e9 but for MasmParser.
Ben Shi [Sun, 22 Jan 2023 05:47:57 +0000 (13:47 +0800)]
[AVR] Emit 'eicall' for devices with large program memory
Fixes https://github.com/llvm/llvm-project/issues/58856
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D142298
Johannes Doerfert [Wed, 14 Dec 2022 23:08:35 +0000 (15:08 -0800)]
[OpenMP] Merge barrier elimination into AAExecutionDomain
With this patch we track aligned barriers in AAExecutionDomain and also
delete unnecessary barriers there. This allows us to eliminate barriers
across blocks, across functions, and in the presence of complex accesses
that do not force a barrier. Further, we can use the collected
information to enable store-load forwarding in a threaded environment
(follow up patch).
Differential Revision: https://reviews.llvm.org/D140463
Sergei Barannikov [Mon, 23 Jan 2023 00:10:02 +0000 (03:10 +0300)]
[MC] Replace a switch with two 'if's (NFC)
This simplifies logic a bit and helps to reduce the future diff.
Shilei Tian [Mon, 23 Jan 2023 00:10:46 +0000 (19:10 -0500)]
[OpenMP][DeviceRTL][NFC] Use `OMPTgtExecModeFlags` from `llvm/include/llvm/Frontend/OpenMP/OMPDeviceConstants.h`
This patch makes preparation for a series that will enable per-kernel information
used in both host and device runtime. Some variables/enums, such as `OMPTgtExecModeFlags`,
have to be shared by both of them. A new header `OMPDeviceConstants.h` is added,
containing code that will be shared by them. We will introduce more variables soon.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D142320
Johannes Doerfert [Fri, 23 Dec 2022 02:21:00 +0000 (18:21 -0800)]
[OpenMP] Guarding restrictions are required only for guarding
If we do not guard code during SPMDzation, we do not need to check
conditions for successful guarding. That is, even if some code is
executed in different modes, it does not prevent SPMDzation if there is
no guarded code in there.
Johannes Doerfert [Sun, 22 Jan 2023 23:44:30 +0000 (15:44 -0800)]
[OpenMP][FIX] Properly update ParallelLevels tracker
Johannes Doerfert [Sat, 14 Jan 2023 03:10:46 +0000 (19:10 -0800)]
[OpenMP][FIX] Use thread id not team id for masked section
Kazu Hirata [Sun, 22 Jan 2023 22:34:43 +0000 (14:34 -0800)]
[Support] Use llvm::bit_floor in PowerOf2Floor (NFC)
Kazu Hirata [Sun, 22 Jan 2023 22:05:14 +0000 (14:05 -0800)]
[llvm] Use llvm::bit_ceil (NFC)
In both of these cases, the arguments to Log2_32_Ceil are known to be
nonzero.
Kazu Hirata [Sun, 22 Jan 2023 21:41:23 +0000 (13:41 -0800)]
[llvm] Use llvm::bit_floor (NFC)
In all these cases, the arguments to Log2_32 are known to be nonzero,
so we don't have to worry about "1 << -1".
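The "1 << -1" concern can be made concrete with a small sketch. The helpers below (`log2Floor`, `bitFloor`, `bitCeil`) are hypothetical loop-based stand-ins written for this example, assuming a floor-log2 that returns -1 for zero; they are not the LLVM implementations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical floor(log2(v)); returns -1 for v == 0, which is what
// makes "1 << log2Floor(v)" hazardous for a zero argument.
constexpr int log2Floor(std::uint32_t v) {
  int result = -1;
  while (v != 0) {
    v >>= 1;
    ++result;
  }
  return result;
}

// Hypothetical bit_floor: largest power of two <= v, and 0 for v == 0.
constexpr std::uint32_t bitFloor(std::uint32_t v) {
  if (v == 0)
    return 0;
  std::uint32_t result = 1;
  while ((v >>= 1) != 0)
    result <<= 1;
  return result;
}

// Hypothetical bit_ceil: smallest power of two >= v.
// Precondition (assumed by this sketch): v <= 0x80000000.
constexpr std::uint32_t bitCeil(std::uint32_t v) {
  std::uint32_t result = 1;
  while (result < v)
    result <<= 1;
  return result;
}

// For nonzero inputs the shift form and bitFloor agree.
static_assert((1u << log2Floor(5)) == bitFloor(5), "shift form matches");
static_assert(bitFloor(5) == 4 && bitCeil(5) == 8, "rounding down and up");
static_assert(bitFloor(8) == 8 && bitCeil(8) == 8, "powers of two are fixed");
static_assert(log2Floor(0) == -1 && bitFloor(0) == 0, "zero is the odd case");

// Small runtime usage example.
inline void demoBitFloor() {
  for (std::uint32_t v = 1; v < 1000; ++v)
    assert((1u << log2Floor(v)) == bitFloor(v));
}
```

The replacement is only a pure refactoring when the argument is known to be nonzero, since the two forms behave differently at zero.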
Dimitry Andric [Sun, 16 Oct 2022 18:24:42 +0000 (20:24 +0200)]
[compiler-rt][builtins] Skip building (b)float16 support on i386-freebsd
Since bfloat16 and float16 support is not available for i386-freebsd,
the `truncdfbf2.c` and `truncsfbf2.c` builtin sources should be skipped
when targeting that platform, and `COMPILER_RT_HAS_FLOAT16` should not
be defined.
However, the CMake configuration stage runs its tests with the default
target, which normally is amd64-freebsd, so it will detect both bfloat16
and float16 support.
Move adding of the `COMPILER_RT_HAS_FLOAT16` define to the `foreach()`
loop where all the supported architectures are handled, and do not
enable it when targeting i386-freebsd.
Also remove the bfloat16 sources from the `i386_SOURCES` list, when
targeting i386-freebsd.
Differential Revision: https://reviews.llvm.org/D136044
Kazu Hirata [Sun, 22 Jan 2023 20:48:51 +0000 (12:48 -0800)]
Use llvm::popcount instead of llvm::countPopulation(NFC)
Aaron Puchert [Sun, 22 Jan 2023 20:35:09 +0000 (21:35 +0100)]
[CMake] Look up target subcomponents in LLVM_AVAILABLE_LIBS
In an installation using the all-contained libLLVM.so, individual
components are not available as targets, so we have to look them up in
LLVM_AVAILABLE_LIBS just like llvm_map_components_to_libnames does it.
Here I don't think we need the capitalized names though because we know
the right capitalization. But I might be wrong.
This is required by dragonffi, who call llvm_map_components_to_libnames
on a list containing ${LLVM_NATIVE_ARCH}. Downstream bug report:
https://bugzilla.opensuse.org/show_bug.cgi?id=1180748.
Differential Revision: https://reviews.llvm.org/D96670
Roman Lebedev [Sun, 22 Jan 2023 19:25:38 +0000 (22:25 +0300)]
[SCEV] `getRangeRefIter()`: don't forget to recurse into casts
I'm not really sure the problem can be nicely exposed via a lit test,
since we don't give up on range calculation for deeply nested ranges,
but if I add an assertion that those opcodes are never encountered,
the assertion fails in a number of existing tests.
In reality, the iterative approach is still pretty partial:
1. `Seen` should not be there. We want the last instance of expression, not the first one
2. There should be a check that `getRangeRefIter()` does not self-recurse
Roman Lebedev [Sun, 22 Jan 2023 18:49:04 +0000 (21:49 +0300)]
[NFC][SCEV] Reflow `getRangeRefIter()` into an exhaustive switch
And, this shows a bug in the original code:
why do we not recurse into casts?
If I add an assertion that those opcodes are never encountered,
the assertion fails in a number of existing tests.
Roman Lebedev [Sun, 22 Jan 2023 18:55:28 +0000 (21:55 +0300)]
[NFC][SCEV] `GetMinTrailingZerosImpl()`: deduplicate handling
`scPtrToInt` receives the same treatment as normal n-ary ops.
Roman Lebedev [Sun, 22 Jan 2023 18:29:07 +0000 (21:29 +0300)]
[NFC][SCEV] Reflow `GetMinTrailingZerosImpl()` into an exhaustive switch
Florian Hahn [Sun, 22 Jan 2023 20:22:41 +0000 (20:22 +0000)]
[Dominators] Introduce DomTreeNodeTraits to allow customization. (NFC)
This patch introduces DomTreeNodeTraits for customization. Clients can implement
DomTreeNodeTraitsCustom to provide custom ParentPtr, getEntryNode and getParent.
There's also a default specialization, used if DomTreeNodeTraitsCustom is not implemented,
that assumes a Function-like NodeT. This is what is used for the existing DominatorTree
and MachineDominatorTree.
The main motivation for this patch is using DominatorTreeBase across all
regions of a VPlan, see D140513.
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D142162
Piotr Fusik [Sun, 22 Jan 2023 18:59:52 +0000 (19:59 +0100)]
[NFC] Fix "form/from" typos
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D142007
Kazu Hirata [Sun, 22 Jan 2023 18:41:13 +0000 (10:41 -0800)]
[Support] Use functions from bit.h (NFC)
This patch makes the following replacements:
countLeadingZeros -> llvm::countl_zero
countTrailingZeros -> llvm::countr_zero
countPopulation -> llvm::popcount
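As a reminder of what the three operations compute, here is a portable sketch for 32-bit values. The functions `countlZero`, `countrZero` and `popCount` are hypothetical loop-based stand-ins for this illustration only; real implementations typically map to compiler builtins.

```cpp
#include <cassert>
#include <cstdint>

// Number of consecutive zero bits starting at the most significant bit.
constexpr int countlZero(std::uint32_t v) {
  int count = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (v & mask) == 0;
       mask >>= 1)
    ++count;
  return count;
}

// Number of consecutive zero bits starting at the least significant bit.
constexpr int countrZero(std::uint32_t v) {
  int count = 0;
  for (std::uint32_t mask = 1u; mask != 0 && (v & mask) == 0; mask <<= 1)
    ++count;
  return count;
}

// Number of set bits; each iteration clears the lowest set bit.
constexpr int popCount(std::uint32_t v) {
  int count = 0;
  for (; v != 0; v &= v - 1)
    ++count;
  return count;
}

static_assert(countlZero(1u) == 31, "only the lowest bit is set");
static_assert(countrZero(8u) == 3, "0b1000 has three trailing zeros");
static_assert(popCount(0xF0F0u) == 8, "eight bits are set");
static_assert(countlZero(0u) == 32 && countrZero(0u) == 32, "all bits zero");

// Small runtime usage example.
inline void demoBitCounts() {
  assert(popCount(0xFFFFFFFFu) == 32);
  assert(countlZero(0x00010000u) + countrZero(0x00010000u) == 31);
}
```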
Simon Pilgrim [Sun, 22 Jan 2023 18:21:08 +0000 (18:21 +0000)]
[ADT] llvm::bit_cast - use __builtin_bit_cast if available
If the compiler supports __builtin_bit_cast we should try to use it instead of std::memcpy (and avoid including the cstring header).
Differential Revision: https://reviews.llvm.org/D142305
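A minimal sketch of the approach, assuming the compiler exposes `__has_builtin`; `bitCast` is a hypothetical name for this example and the code is not the actual llvm::bit_cast definition. Note how the `<cstring>` include is only needed on the fallback path.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

#ifndef __has_builtin
#define __has_builtin(x) 0 // Compilers without the query take the fallback.
#endif

#if !__has_builtin(__builtin_bit_cast)
#include <cstring> // Only needed by the memcpy fallback.
#endif

// Hypothetical bit_cast sketch: reinterpret the object representation
// of `from` as a value of type To.
template <typename To, typename From> To bitCast(const From &from) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value, "To must be trivial");
  static_assert(std::is_trivially_copyable<From>::value,
                "From must be trivial");
#if __has_builtin(__builtin_bit_cast)
  return __builtin_bit_cast(To, from);
#else
  To to; // The fallback additionally requires To to be default constructible.
  std::memcpy(&to, &from, sizeof(To));
  return to;
#endif
}

// Small usage example: a round trip preserves the value.
inline void demoBitCast() {
  const float original = 1.5f;
  const std::uint32_t bits = bitCast<std::uint32_t>(original);
  assert(bitCast<float>(bits) == original);
}
```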
Kazu Hirata [Sun, 22 Jan 2023 17:29:35 +0000 (09:29 -0800)]
[ADT] Add llvm::byteswap to bit.h
This patch adds C++23-style byteswap to bit.h.
The implementation and tests are largely taken from
llvm/include/llvm/Support/SwapByteOrder.h and
llvm/unittests/Support/SwapByteOrderTest.cpp, respectively.
Differential Revision: https://reviews.llvm.org/D142274
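To illustrate what a byteswap computes, here is a shift-and-mask sketch for fixed-width unsigned integers. The names `byteswap16`, `byteswap32` and `byteswap64` are hypothetical and exist only for this example; the patch itself provides a single `llvm::byteswap`, whose implementation is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Swap the two bytes of a 16-bit value.
constexpr std::uint16_t byteswap16(std::uint16_t v) {
  return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Reverse the four bytes of a 32-bit value.
constexpr std::uint32_t byteswap32(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Reverse the eight bytes of a 64-bit value by swapping each half
// and exchanging the halves.
constexpr std::uint64_t byteswap64(std::uint64_t v) {
  return (static_cast<std::uint64_t>(
              byteswap32(static_cast<std::uint32_t>(v)))
          << 32) |
         byteswap32(static_cast<std::uint32_t>(v >> 32));
}

static_assert(byteswap16(0x1234) == 0x3412, "16-bit swap");
static_assert(byteswap32(0x12345678u) == 0x78563412u, "32-bit swap");
static_assert(byteswap64(0x0123456789ABCDEFull) == 0xEFCDAB8967452301ull,
              "64-bit swap");

// Small usage example: swapping twice is the identity.
inline void demoByteswap() {
  assert(byteswap32(byteswap32(0xDEADBEEFu)) == 0xDEADBEEFu);
}
```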
Sergei Barannikov [Sun, 22 Jan 2023 17:07:16 +0000 (20:07 +0300)]
[MC][test] Fix a typo
Simon Pilgrim [Sun, 22 Jan 2023 17:09:46 +0000 (17:09 +0000)]
[PowerPC] Regenerate vec_absd.ll test checks
Simon Pilgrim [Sun, 22 Jan 2023 15:41:44 +0000 (15:41 +0000)]
[DAG] visitINSERT_VECTOR_ELT - use mergeEltWithShuffle to merge inserted vector element chain into base shuffle node
This allows us to merge insert_elt(insert_elt(shuffle(x,y),extract_elt(x,c1),c2),extract_elt(y,c3),c4) style insertion chains into a new shuffle node.
I had hoped to remove mergeInsertEltWithShuffle entirely, but that case doesn't have the one use limits so we would regress in a few other cases.
Fixes the vector-shuffle-combining.ll regressions in D127115
Shivam Gupta [Sun, 22 Jan 2023 16:11:19 +0000 (21:41 +0530)]
[Flang][NFC] fix a copy-paste in fold-logical.cpp
found by PVS-Studio.
Roman Lebedev [Sun, 22 Jan 2023 15:50:52 +0000 (18:50 +0300)]
[NFC][SCEV] Reflow `impliesPoison()` into an exhaustive switch
Shivam Gupta [Sun, 22 Jan 2023 15:54:52 +0000 (21:24 +0530)]
[PVS-Studio][NFC] fix a typo in ShapeUtils.h
Mark de Wever [Sun, 22 Jan 2023 15:49:39 +0000 (16:49 +0100)]
[libc++][test] Disable parts requiring locales.
This part should be guarded, but there are no proper guards yet.
Therefore disable the offending part. This was reported post commit in
D140653.
Sanjay Patel [Sun, 22 Jan 2023 14:43:35 +0000 (09:43 -0500)]
[InstSimplify] (X || Y) && Y --> Y (for poison-safe logical ops)
https://alive2.llvm.org/ce/z/oT_tEh
This is the conjugate/sibling pattern suggested in post-commit
feedback for:
9444252a674df5952bb5af2b76348ae4b45
issue #60167
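For plain boolean values the identity can be checked exhaustively, as in the sketch below. This only covers the two-valued truth table and does not model poison, which is the part the Alive2 link above verifies; `foldHolds` is a hypothetical name for this example.

```cpp
#include <cassert>

// Exhaustively check (X || Y) && Y == Y over all boolean inputs.
// Assumption of this sketch: ordinary C++ bools, no poison semantics.
constexpr bool foldHolds() {
  for (int x = 0; x < 2; ++x) {
    for (int y = 0; y < 2; ++y) {
      const bool X = x != 0;
      const bool Y = y != 0;
      if (((X || Y) && Y) != Y)
        return false;
    }
  }
  return true;
}

static_assert(foldHolds(), "(X || Y) && Y simplifies to Y");

// Small runtime usage example.
inline void demoFold() { assert(foldHolds()); }
```

The reasoning behind the table: when Y is true the left operand is true, so the result is Y; when Y is false the conjunction is false, which is again Y.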
Sanjay Patel [Sun, 22 Jan 2023 14:19:55 +0000 (09:19 -0500)]
[InstSimplify] add tests for poison-safe variants of (X || Y) && Y; NFC
Mark de Wever [Sun, 22 Jan 2023 15:21:11 +0000 (16:21 +0100)]
[clang][doc] Fixes formatting of a text block.
Simon Pilgrim [Sun, 22 Jan 2023 15:19:17 +0000 (15:19 +0000)]
[X86] avx2-vbroadcast.ll - use X86 check prefix instead of X32
We try to use X32 for tests on gnux32 triples
Markus Böck [Sun, 22 Jan 2023 15:11:27 +0000 (16:11 +0100)]
[mlir][ods] Simplify signature of `custom` printers and parsers of Attributes and Types in presence of default constructible parameters
The vast majority of parameters of C++ types used as parameters for Attributes and Types are likely to be default constructible. Nevertheless, TableGen conservatively generates code for the custom directive, expecting signatures using FailureOr<T> for all parameter types T to accommodate them possibly not being default constructible. This however reduces the ergonomics of the likely case of default constructible parameters.
This patch fixes that issue, while barely changing the generated TableGen code, by using a helper function to pass parameters into custom parser methods. If the type is default constructible, as deemed by the C++ compiler, a default-constructed instance is created and passed into the parser method by reference. In all other cases it is a no-op and a FailureOr is passed as before.
Documentation was also updated to document the new behaviour.
Fixes https://github.com/llvm/llvm-project/issues/60178
Differential Revision: https://reviews.llvm.org/D142301
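The dispatch on default constructibility described in this commit can be sketched with `if constexpr`. Everything below is a hypothetical illustration: `makeParseStorage`, `NoDefault` and the two parser functions are invented names, and `std::optional<T>` stands in for `FailureOr<T>` so that the example needs no MLIR headers.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>

// Hypothetical helper: pick the storage that is handed to a custom parser.
template <typename T> auto makeParseStorage() {
  if constexpr (std::is_default_constructible_v<T>)
    return T(); // The parser can take a plain T&.
  else
    return std::optional<T>(); // Stand-in for FailureOr<T>.
}

// A parameter type without a default constructor.
struct NoDefault {
  explicit NoDefault(int v) : value(v) {}
  int value;
};

// Hypothetical parser for a default constructible parameter.
inline bool parseInt(int &out) {
  out = 42;
  return true;
}

// Hypothetical parser for a parameter that must be constructed in place.
inline bool parseNoDefault(std::optional<NoDefault> &out) {
  out.emplace(7);
  return true;
}

// Small usage example covering both cases.
inline void demoParseStorage() {
  auto simple = makeParseStorage<int>();
  static_assert(std::is_same_v<decltype(simple), int>, "plain value");
  assert(parseInt(simple) && simple == 42);

  auto wrapped = makeParseStorage<NoDefault>();
  assert(parseNoDefault(wrapped) && wrapped->value == 7);
}
```

Letting the compiler decide through a type trait means parser authors with ordinary parameter types can write the simpler signature without any extra annotation.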
Simon Pilgrim [Sun, 22 Jan 2023 14:07:20 +0000 (14:07 +0000)]
[X86] commute-3dnow.ll - use X86 check prefix instead of X32
We try to use X32 for tests on gnux32 triples
Simon Pilgrim [Sun, 22 Jan 2023 14:01:04 +0000 (14:01 +0000)]
[X86] avx-vbroadcastf128.ll - use X86 check prefix instead of X32
We try to use X32 for tests on gnux32 triples
Roman Lebedev [Sun, 22 Jan 2023 14:27:17 +0000 (17:27 +0300)]
[NFC][SCEVExpander] `CmpSelCost`: use the cost of the expression, not operand
Currently, for all invocations, it's equivalent, since that is literally
how `SCEVMinMaxExpr::getType()` is defined. But for e.g. `select`,
we'll want to ask about the hand type, and not the type of the operand
that happens to be first.
Roman Lebedev [Sun, 22 Jan 2023 14:35:25 +0000 (17:35 +0300)]
[NFC][SCEV] Reflow `computeSCEVAtScope()` into an exhaustive switch
Roman Lebedev [Sun, 22 Jan 2023 14:15:16 +0000 (17:15 +0300)]
[NFC][SCEV] `getRelevantLoop()`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 14:08:45 +0000 (17:08 +0300)]
[NFC][SCEV] `getBlockDisposition()`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 13:56:28 +0000 (16:56 +0300)]
[NFC][SCEV] `getLoopDisposition()`: deduplicate handling
Roman Lebedev [Sun, 22 Jan 2023 13:32:02 +0000 (16:32 +0300)]
[NFC][SCEV] `computeSCEVAtScope()`: deduplicate handling
Casts and udiv get exactly the same handling as n-ary,
there is no point in special-handling anything.
Matt Arsenault [Fri, 16 Dec 2022 03:04:36 +0000 (22:04 -0500)]
AMDGPU: Copy a source modifier test for f16/v2f16
This is essentially a modernized copy of
select-fabs-fneg-extract.ll. Stop using kernels with loads and stores,
don't use fsub for fneg, and port the examples to half.
Matt Arsenault [Sun, 18 Dec 2022 12:25:56 +0000 (07:25 -0500)]
AMDGPU: Add modern copy of fneg combines test