Krzysztof Parzyszek [Tue, 8 Feb 2022 23:26:12 +0000 (15:26 -0800)]
[Hexagon] Fix operation actions for v128f16
There were more cases of operations that should have been "Custom" for
v128f16, but ended up "Legal" (e.g. load and store).
Mogball [Tue, 8 Feb 2022 23:03:23 +0000 (23:03 +0000)]
[mlir][ods] Attribute and type formats: support whitespaces
Supports whitespace elements: ` ` and `\\n` as well as the "empty" whitespace `` that removes an otherwise printed space.
Depends on D118208
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118210
Matt Arsenault [Sat, 20 Mar 2021 15:25:49 +0000 (11:25 -0400)]
GlobalISel: Add FoldBinOpIntoSelect combine
This will do the combine in cases that should fold, but don't
now. e.g. we're relying on the CSEMIRBuilder's incomplete constant
folding. For instance it doesn't handle FP operations or vectors (and
we don't have separate constant folding combines either to catch
them).
Matt Arsenault [Sat, 20 Mar 2021 22:19:09 +0000 (18:19 -0400)]
AMDGPU/GlobalISel: Add baseline test for binop fold into select combine
Daniel Thornburgh [Fri, 21 Jan 2022 00:13:52 +0000 (00:13 +0000)]
[Symbolizer] Add Build ID flag to llvm-symbolizer.
This adds a --build-id=<hex build ID> flag to llvm-symbolizer. If --obj
is unspecified, this will attempt to look up the provided build ID using
whatever mechanisms are available to the Symbolizer (typically,
debuginfod). The semantics are then as if the found binary were given
using the --obj flag.
Reviewed By: jhenderson, phosek
Differential Revision: https://reviews.llvm.org/D118633
Jacques Pienaar [Tue, 8 Feb 2022 23:00:39 +0000 (15:00 -0800)]
[mlir][math] Expand coverage of atan2 expansion
Reuse the higher precision F32 approximation for the F16 one (by expanding and
truncating). This is partly RFC as I'm not sure what the expectations are here
(e.g., these are only for F32 and should not be expanded, that reusing
higher-precision ones for lower precision is undesirable due to increased
compute cost and only approximations per exact type is preferred, or this is
appropriate [at least as fallback] but we need to see how to make it more
generic across all the patterns here).
Differential Revision: https://reviews.llvm.org/D118968
Casey Carter [Thu, 30 Dec 2021 00:02:45 +0000 (16:02 -0800)]
[libcxx][test] Disable bad unique_ptr<T[]> to shared_ptr<U[]> conversion test cases
for non-libc++. I've reported allowance of these conversions as a bug at https://llvm.org/PR53368.
Differential Revision: https://reviews.llvm.org/D117996
Casey Carter [Sat, 22 Jan 2022 20:41:30 +0000 (12:41 -0800)]
[libcxx][test] tests for strengthened `noexcept` are non-portable
Differential Revision: https://reviews.llvm.org/D117966
Fangrui Song [Tue, 8 Feb 2022 22:48:34 +0000 (14:48 -0800)]
[sanitizer] Guard the whole ThreadDescriptorSize block with #if !SANITIZER_GO after D119007
The SANITIZER_GO code path reports an undefined symbol error for dlsym.
```
FAILED: projects/compiler-rt/lib/tsan/rtl/CMakeFiles/GotsanRuntimeCheck /tmp/RelA/projects/compiler-rt/lib/tsan/rtl/CMakeFiles/GotsanRuntimeCheck
```
Nikolas Klauser [Mon, 7 Feb 2022 20:54:49 +0000 (21:54 +0100)]
[libc++] Prepare string.{access, capacity, cons} tests for constexpr
Reviewed By: ldionne, #libc
Spies: libcxx-commits, arphaman
Differential Revision: https://reviews.llvm.org/D119123
Ahmed Bougacha [Tue, 8 Feb 2022 21:33:39 +0000 (13:33 -0800)]
[clang] Document objc_unsafeClaimAutoreleasedReturnValue.
This was added a few years ago but wasn't listed here.
Ahmed Bougacha [Tue, 8 Feb 2022 21:29:49 +0000 (13:29 -0800)]
[clang][Driver] Use a VersionTuple for darwin linker version checks.
This unifies a couple spots that did it manually by checking the
flag directly.
It does mean that we're now dropping the 5th component, but that's
not used in any of these checks, and to my knowledge it's never been
used in ld64.
Benjamin Kramer [Tue, 8 Feb 2022 22:01:04 +0000 (23:01 +0100)]
[bazel] Port 216575e58102
Siva Chandra Reddy [Tue, 8 Feb 2022 17:09:12 +0000 (17:09 +0000)]
[libc][Obvious] Fix typo in mkdir and mkdirat implementations.
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D119265
Sylvestre Ledru [Tue, 8 Feb 2022 20:54:32 +0000 (21:54 +0100)]
README: Point to the discourse & discord forums
Differential Revision: https://reviews.llvm.org/D119279
Snehasish Kumar [Fri, 4 Feb 2022 19:11:47 +0000 (11:11 -0800)]
Revert "Revert "[ProfileData] Read and symbolize raw memprof profiles.""
This reverts commit dbf47d227d080e4eb7239b589660f51d7b08afa9.
Reapply https://reviews.llvm.org/D116784 now that
https://reviews.llvm.org/D118413 has landed with a couple of fixes:
* fix raw profile reader unaligned access identified by ubsan
* fix windows build by using MOCK_CONST_METHOD3 instead of MOCK_METHOD.
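The commit message does not say how the unaligned access was fixed. One common way to read a multi-byte value from a byte buffer without tripping ubsan is to copy it out with memcpy, sketched below. The helper name `readU64Unaligned` is hypothetical and is not taken from the memprof reader.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the memprof reader): reads a 64-bit value
// from a pointer that may not be 8-byte aligned. Casting `p` to
// `const std::uint64_t *` and dereferencing it would be undefined behavior
// when `p` is misaligned; memcpy has no alignment requirement.
inline std::uint64_t readU64Unaligned(const unsigned char *p) {
  std::uint64_t value;
  std::memcpy(&value, p, sizeof(value));
  return value;
}
```

The value is returned in host byte order, so a reader for a fixed-endian format would still need a byte swap on mismatching hosts.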
Arthur Eubanks [Thu, 3 Feb 2022 19:56:20 +0000 (11:56 -0800)]
[test] Remove -fno-experimental-new-pass-manager -O1 from sanitize-address-field-padding.cpp
-O1 doesn't seem necessary here.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D118936
Arthur Eubanks [Tue, 8 Feb 2022 05:56:28 +0000 (21:56 -0800)]
[clang] Properly cache member pointer LLVM types
When not going through the main Clang->LLVM type cache, we'd
accidentally create multiple different opaque types for a member pointer
type.
This allows us to remove the -verify-type-cache flag now that
check-clang passes with it on. We can do the verification in expensive
builds. Previously microsoft-abi-member-pointers.cpp was failing with
-verify-type-cache.
I suspect that there may be more issues when we have multiple member
pointer types and we clear the cache, but we can leave that for later.
Followup to D118744.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D119215
Florian Hahn [Tue, 8 Feb 2022 21:18:40 +0000 (21:18 +0000)]
[LV] Move buildScalarSteps out of ILV (NFC).
This makes the function independent of shared state in ILV (ensures no
new dependencies on things like the cost model are introduced) and allows
for use directly in recipe's ::execute functions.
Mogball [Tue, 8 Feb 2022 21:13:58 +0000 (21:13 +0000)]
[mlir][ods] NFC fix tblgen crash with empty assembly format
tyb0807 [Tue, 25 Jan 2022 22:51:49 +0000 (22:51 +0000)]
[AArch64] ACLE feature macro for Armv8.8-A MOPS
This introduces the new __ARM_FEATURE_MOPS ACLE feature test macro,
which signals the availability of the new Armv8.8-A/Armv9.3-A
instructions for standardising memcpy, memset and memmove operations.
This patch supersedes the one from https://reviews.llvm.org/D116160.
Differential Revision: https://reviews.llvm.org/D118199
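As a rough illustration of how a feature test macro like this is typically consumed, the sketch below reports at compile time whether the macro was defined. The function name `builtWithMops` is hypothetical; only `__ARM_FEATURE_MOPS` comes from the commit.

```cpp
// Hypothetical helper: true only when the translation unit was compiled for
// a target where the compiler defines __ARM_FEATURE_MOPS.
inline bool builtWithMops() {
#if defined(__ARM_FEATURE_MOPS)
  return true;
#else
  return false;
#endif
}
```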
Krzysztof Parzyszek [Sat, 5 Feb 2022 00:14:13 +0000 (16:14 -0800)]
[Hexagon] Fix crash with shuffle_vector of v128f16
Benjamin Kramer [Tue, 8 Feb 2022 20:53:30 +0000 (21:53 +0100)]
[Debuginfod][Symbolizer] Cut dependency cycle after 4a6553f4c2be
Florian Weimer [Tue, 8 Feb 2022 20:46:41 +0000 (12:46 -0800)]
[sanitizer] Use _thread_db_sizeof_pthread to obtain struct pthread size
This symbol has been exported (as an internal GLIBC_PRIVATE symbol) from libc.so.6 starting with glibc 2.34. glibc uses it internally for its libthread_db implementation to enable thread debugging on GDB, so it is unlikely to go away for now.
Fixes #52989.
Reviewed By: #sanitizers, MaskRay, vitalybuka
Differential Revision: https://reviews.llvm.org/D119007
Guillaume Chatelet [Mon, 7 Feb 2022 15:40:28 +0000 (15:40 +0000)]
[libc] Replace type punning with bit_cast
Although type punning through a union is defined in C, it is UB in C++.
This patch introduces a bit_cast function to convert between types in a safe way.
This is necessary to get llvm-libc to compile with GCC.
This patch is extracted from D119002.
Differential Revision: https://reviews.llvm.org/D119145
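A minimal sketch of the general technique, assuming a memcpy-based implementation: the name `bit_cast_sketch` and the `floatBits` helper are hypothetical and are not the llvm-libc code, which may differ (for example by using a compiler builtin).

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a bit_cast utility. It copies the object
// representation of `from` into a fresh `To`, which is well defined in C++
// for trivially copyable types of equal size, unlike reading an inactive
// union member.
template <typename To, typename From>
To bit_cast_sketch(const From &from) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value,
                "To must be trivially copyable");
  static_assert(std::is_trivially_copyable<From>::value,
                "From must be trivially copyable");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// Example use: inspect the raw bits of a float (assumes a 32-bit IEEE-754
// float, which holds on the usual targets but is not guaranteed by C++).
inline std::uint32_t floatBits(float f) {
  return bit_cast_sketch<std::uint32_t>(f);
}
```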
Amir Ayupov [Tue, 8 Feb 2022 05:14:32 +0000 (21:14 -0800)]
[BOLT][TEST] Add .so instrumentation test
Summary: Shared object instrumentation test
Test Plan: bin/llvm-lit -a bolt/test/X86/internal-call-instrument-so.s
Reviewers: rafauler
FBD34064557
Sylvestre Ledru [Tue, 8 Feb 2022 20:33:01 +0000 (21:33 +0100)]
Fix a typo (occured => occurred)
Reported:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1005195
Valentin Clement [Tue, 8 Feb 2022 20:26:16 +0000 (21:26 +0100)]
[flang][codegen] Keep primitive type for extractvalue and insertvalue
llvm.insertvalue and llvm.extractvalue need LLVM primitive type
for the indexing operands. While upstreaming the TargetRewrite pass the change
was made from i32 to index without knowing this restriction. This patch reverts
back the types used for indexing in the two ops created in this pass.
The error you will receive when lowering to LLVM IR with the current code
is the following:
```
'llvm.insertvalue' op operand #1 must be primitive LLVM type, but got 'index'
```
Reviewed By: jeanPerier, schweitz
Differential Revision: https://reviews.llvm.org/D119253
Louis Dionne [Mon, 7 Feb 2022 19:52:17 +0000 (14:52 -0500)]
[libc++] Remove _LIBCPP_ABI_UNSTABLE
Previously, _LIBCPP_ABI_UNSTABLE would be used interchangeably with
_LIBCPP_ABI_VERSION >= 2. This was confusing and creating unnecessary
complexity.
This patch removes _LIBCPP_ABI_UNSTABLE -- instead, the LIBCXX_ABI_UNSTABLE
CMake option will result in the LIBCXX_ABI_VERSION being set to '2', the
current unstable ABI. As a result, in the code, we only have _LIBCPP_ABI_VERSION
to check in order to query the current ABI version.
As a fly-by, this also defines the ABI namespace during CMake configuration
to reduce complexity in __config. I believe it was previously done this
way because we used to try to use __config_site as seldom as possible.
Now that we always ship a __config_site, it doesn't really matter and
I think being explicit about how the library is configured in the __config_site
is actually a feature.
Differential Revision: https://reviews.llvm.org/D119173
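A small sketch of what "only `_LIBCPP_ABI_VERSION` to check" can look like from the outside. The function name `usesLibcxxAbiV2OrLater` is hypothetical, and the check simply reports false when the standard library is not libc++.

```cpp
#include <cstddef> // including any standard header makes the library's config macros visible

// Hypothetical helper: true when building against libc++ configured with
// ABI version 2 or later, false otherwise (including non-libc++ libraries,
// where _LIBCPP_ABI_VERSION is not defined).
inline bool usesLibcxxAbiV2OrLater() {
#if defined(_LIBCPP_ABI_VERSION) && _LIBCPP_ABI_VERSION >= 2
  return true;
#else
  return false;
#endif
}
```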
Louis Dionne [Fri, 4 Feb 2022 20:37:01 +0000 (15:37 -0500)]
[libc++] Fix modules and benchmarks CI builds when incomplete features are disabled
Differential Revision: https://reviews.llvm.org/D119036
Mogball [Tue, 8 Feb 2022 16:59:57 +0000 (16:59 +0000)]
[mlir][ods] Optional Attribute or Type Parameters
Implements optional attribute or type parameters, including support for such parameters in the assembly format `struct` directive. Also implements optional groups.
Depends on D117971
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D118208
harsh [Tue, 8 Feb 2022 19:50:00 +0000 (19:50 +0000)]
Add case to handle 0-D vectors in FlattenContiguousRowMajorTransferWritePattern and FlattenContiguousRowMajorTransferReadPattern.
For 0-D as well as 1-D vectors, both these patterns should
return a failure as there is no need to collapse the shape
of the source. Currently, only 1-D vectors were handled. This
patch handles the 0-D case as well.
Reviewed By: Benoit, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D119202
Martin Storsjö [Tue, 8 Feb 2022 08:48:32 +0000 (10:48 +0200)]
[clang] [MinGW] Recognize -lcrtdll as a library replacing -lmsvcrt
Differential Revision: https://reviews.llvm.org/D119234
James Y Knight [Mon, 7 Feb 2022 16:31:22 +0000 (11:31 -0500)]
Revert "[Clang] Propagate guaranteed alignment for malloc and others"
The above change assumed that malloc (and friends) would always
allocate memory to getNewAlign(), even for allocations which have a
smaller size. This is not actually required by spec (a 1-byte
allocation may validly have 1-byte alignment).
Some real-world malloc implementations do not provide this guarantee,
and thus this optimization is breaking programs.
Fixes #53540
This reverts commit c2297544c04764237cedc523083c7be2fb3833d4.
Differential Revision: https://reviews.llvm.org/D118804
LLVM GN Syncbot [Tue, 8 Feb 2022 19:21:13 +0000 (19:21 +0000)]
[gn build] Port 4a6553f4c2be
Daniel Thornburgh [Tue, 25 Jan 2022 22:23:38 +0000 (22:23 +0000)]
[Debuginfod] [Symbolizer] Break debuginfod out of libLLVM.
Debuginfod can pull in libcurl as a dependency, which isn't appropriate
for libLLVM. (See
https://gitlab.freedesktop.org/mesa/mesa/-/issues/5732).
This change breaks out debuginfod into a separate non-component library
that can be used directly in llvm-symbolizer. The tool can inject
debuginfod into the Symbolizer library via an abstract DebugInfoFetcher
interface, breaking the dependency of Symbolizer on debuginfod.
See https://github.com/llvm/llvm-project/issues/52731
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D118413
Amilendra Kodithuwakku [Tue, 8 Feb 2022 19:11:49 +0000 (19:11 +0000)]
[clang][ARM] Re-word PACBTI warning.
The original warning added in D115501 when pacbti is used with an
incompatible architecture was not exactly correct because it was
not really ignored and can affect codegen.
Therefore reword to say that the pacbti option is incompatible with
the given architecture.
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D119166
Leonard Chan [Tue, 8 Feb 2022 18:51:53 +0000 (10:51 -0800)]
[clang][Fuchsia] Ensure static sanitizer libs are only linked in after the -nostdlib check
Differential Revision: https://reviews.llvm.org/D119201
Steffen Larsen [Tue, 8 Feb 2022 18:27:52 +0000 (13:27 -0500)]
Allow parameter pack expansions and initializer lists in annotate attribute
These changes make the Clang parser recognize expression parameter pack
expansion and initializer lists in attribute arguments. Because
expression parameter pack expansion requires additional handling while
creating and instantiating templates, the support for them must be
explicitly supported through the AcceptsExprPack flag.
Handling expression pack expansions may require a delay to when the
arguments of an attribute are correctly populated. To this end,
attributes that are set to accept these - through setting the
AcceptsExprPack flag - will automatically have an additional variadic
expression argument member named DelayedArgs. This member is not
exposed the same way other arguments are but is set through the new
CreateWithDelayedArgs creator function generated for applicable
attributes.
To illustrate how to implement support for expression pack expansion
support, clang::annotate is made to support pack expansions. This is
done by making handleAnnotationAttr delay setting the actual attribute
arguments until after template instantiation if it was unable to
populate the arguments due to dependencies in the parsed expressions.
Alex Brachet [Tue, 8 Feb 2022 18:32:18 +0000 (18:32 +0000)]
[libc][NFC] Remove all Linux specific code to respective linux/ directories
These were all the non-OS-agnostic implementations I could find in general directories.
Currently none of these functions are actually enabled, but when they are, it makes sense for them to be in linux/ directories.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D119164
Roman Lebedev [Tue, 8 Feb 2022 18:17:23 +0000 (21:17 +0300)]
[SimplifyCFG] 'merge compatible invokes': fully support indirect invokes
As long as *all* the invokes in the set are indirect,
we can merge them, but don't merge direct invokes into the set,
even though it would be legal to do.
Roman Lebedev [Tue, 8 Feb 2022 18:16:39 +0000 (21:16 +0300)]
[SimplifyCFG] 'merge compatible invokes': don't create trivial PHI's with all-identical incoming values
Roman Lebedev [Tue, 8 Feb 2022 17:43:26 +0000 (20:43 +0300)]
[NFC][SimplifyCFG] 'merge compatible invokes': tests for indirect invokes.
Stanislav Mekhanoshin [Thu, 13 Jan 2022 00:03:16 +0000 (16:03 -0800)]
[AMDGPU] Select VGPR versions of MFMA if possible
We can select _vgprcd versions of MAI instructions and have no
AGPRs with the whole budget left for VGPRs if:
1. This is a kernel;
2. It has no calls;
3. It runs on at least 2 waves, thus having no more than 256 VGPRs;
4. There is no inline asm requesting AGPRs.
Differential Revision: https://reviews.llvm.org/D117253
Mahesh Ravishankar [Mon, 7 Feb 2022 17:45:28 +0000 (17:45 +0000)]
[mlir][Linalg] NFC: Combine elementwise fusion test passes.
There are a few different test passes that check elementwise fusion in
Linalg. Consolidate them to a single pass controlled by different pass
options (in keeping with how `TestLinalgTransforms` exists).
Josh Mottley [Thu, 3 Feb 2022 21:54:56 +0000 (21:54 +0000)]
[flang] Upstream partial lowering of GET_ENVIRONMENT_VARIABLE intrinsic
This patch adds partial lowering of the "GET_ENVIRONMENT_VARIABLE" intrinsic
to the backend runtime hook implemented in patches D111394 and D112698.
It also renames the `isPresent` lambda to `isAbsent` and moves it out to
its own function in `Command.cpp`. The corresponding comments are
updated as well. Lastly, it adds the i1 type to
`RuntimeCallTestBase.h`.
Differential Revision: https://reviews.llvm.org/D118984
Craig Topper [Tue, 8 Feb 2022 17:20:19 +0000 (09:20 -0800)]
[X86] Update register RCL/RCR by 1 and immediate scheduling for Intel CPUs
Most Intel CPU scheduler files lumped the immediate and 1 instructions
together, but uops.info shows they are quite different.
For the most part the by 1 instructions were pretty accurate to the uops.info
data except the latency was 3 instead of 2 as uops.info indicates.
The by immediate instructions need 7 or 8 uops and have higher latency.
It looks like the 8-bit by immediate instructions may need even more
uops, but I just lumped them with the 16/32/64.
Noticed while checking out PR53648. So mostly I cared about the by 1
instructions.
Reviewed By: RKSimon, pengfei
Differential Revision: https://reviews.llvm.org/D119217
Corentin Jabot [Tue, 8 Feb 2022 17:09:03 +0000 (12:09 -0500)]
[C++2b] Implement multidimensional subscript operator
Implement P2128R6 in C++23 mode.
Unlike GCC's implementation, this doesn't try to recover when a user
meant to use a comma expression.
Because the syntax changes meaning in C++23, the patch is *NOT*
implemented as an extension. Instead, declaring a subscript operator with
not exactly 1 parameter is an error in older language modes. There is an
off-by-default extension warning in C++23 mode.
Unlike the standard, we support default arguments;
i.e., we assume, based on conversations in WG21, that the proposed
resolution to CWG2507 will be accepted.
We allow OpenMP array sections and C++23 multidimensional subscripts to
coexist:
[a, b]     multidimensional subscript
[a : b]    OpenMP array section
[a, b : c] // error
The rest of the patch is relatively straightforward: we take care to
support an arbitrary number of arguments everywhere.
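For illustration, a minimal sketch of a user-defined multidimensional subscript operator in the P2128 style. The `Grid` type and `gridDemo` function are hypothetical. The multi-parameter operator[] is guarded by the `__cpp_multidimensional_subscript` feature test macro so the snippet still compiles in older language modes, where it falls back to a named accessor.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2-D container used only to show the syntax.
struct Grid {
  std::size_t rows, cols;
  std::vector<int> data;

  Grid(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0) {}

  // Row-major element access that works in every language mode.
  int &at(std::size_t r, std::size_t c) { return data[r * cols + c]; }

#if defined(__cpp_multidimensional_subscript)
  // C++23: operator[] may take more than one parameter.
  int &operator[](std::size_t r, std::size_t c) { return at(r, c); }
#endif
};

inline int gridDemo() {
  Grid g(2, 3);
  g.at(1, 2) = 7;
#if defined(__cpp_multidimensional_subscript)
  return g[1, 2]; // parsed as a two-argument subscript in C++23
#else
  return g.at(1, 2);
#endif
}
```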
Joseph Huber [Tue, 8 Feb 2022 15:43:22 +0000 (10:43 -0500)]
[Attributor] Emit fixed-point remark on function list
This patch replaces the function we emit the remark on when we run into
the fix-point limit. Previously we got a function to emit a remark on
from the worklist's associated function. However, the worklist may not
always have an associated function in the case of global variables.
Replace this with the function set, and if there are no functions don't
emit the remark.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D119248
Joseph Huber [Tue, 8 Feb 2022 16:35:57 +0000 (11:35 -0500)]
[Libomptarget] Add header files as a dependency to CMake target
This patch manually adds the runtime include files to the list of
dependencies when we build the bitcode runtime library. Previously if
only the header was changed we would not recompile the source files.
The solution used here isn't optimal because every source file now has a
dependency on each header file regardless of whether it was actually used by
that file.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D119254
Krzysztof Parzyszek [Tue, 8 Feb 2022 16:45:40 +0000 (08:45 -0800)]
[Hexagon] Alter meaning of versionless -mhvx
The documentation for the official (downstream) Qualcomm Hexagon Clang
states that -mhvx sets the HVX version to be the same as the CPU version.
The current implementation upstream would use the most recent versioned
-mhvx= flag first (if present), then the CPU version. Change the upstream
behavior to match the documented behavior of the downstream compiler.
Dawid Jurczak [Tue, 8 Feb 2022 16:23:53 +0000 (17:23 +0100)]
[NFC] Increase initial size of FoldingSets used in ASTContext and CodeGenTypes
Among the many FoldingSet users, the most notable seem to be ASTContext and CodeGenTypes.
The reasons we spend a not-so-tiny amount of time in FoldingSet calls from there are the following:
1. The default FoldingSet capacity of 2^6 items is very often not enough.
For PointerTypes/ElaboratedTypes/ParenTypes it's not unlikely to observe it growing to 256 or 512 items.
FunctionProtoTypes can easily exceed a 1k-item capacity, growing up to 4k or even 8k in size.
2. The cost of FoldingSetBase::GrowBucketCount itself is not very bad (pure reallocations are rather cheap thanks to BumpPtrAllocator).
What matters is the high collision rate when a lot of items end up in the same bucket, slowing down FoldingSetBase::FindNodeOrInsertPos and thrashing the CPU cache
(as items with the same hash are organized in an intrusive linked list which needs to be traversed).
This change addresses both issues by increasing the initial size of the FoldingSets used in ASTContext and CodeGenTypes.
Extracted from: https://reviews.llvm.org/D118385
Differential Revision: https://reviews.llvm.org/D118608
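To give a feel for the numbers above, here is a toy model of a table that doubles its bucket count until it is large enough. It is an assumption for illustration only and does not reproduce FoldingSet's actual growth policy; `doublingsNeeded` is a hypothetical name.

```cpp
#include <cstddef>

// Toy model (not FoldingSet's real policy): the table starts with
// 2^initialLog2 buckets and doubles until it has at least `buckets`
// buckets. Returns the number of doublings, i.e. full rehashes.
inline unsigned doublingsNeeded(unsigned initialLog2, std::size_t buckets) {
  std::size_t size = std::size_t(1) << initialLog2;
  unsigned doublings = 0;
  while (size < buckets) {
    size *= 2;
    ++doublings;
  }
  return doublings;
}
```

Under this model, growing from 2^6 to 8k buckets takes seven rehashes, while starting at 2^13 takes none.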
Benjamin Kramer [Tue, 8 Feb 2022 16:50:45 +0000 (17:50 +0100)]
[MLIR][Presburger] Fix linkage of functions in header
Static functions in a header cause spurious unused function warnings.
Jacques Pienaar [Tue, 8 Feb 2022 16:48:10 +0000 (08:48 -0800)]
[mlir][bazel] Update post 24a1
Bixia Zheng [Fri, 4 Feb 2022 22:21:43 +0000 (14:21 -0800)]
[mlir][taco] Use sparse_tensor.out to write sparse tensors to files.
Add a Python method, output_sparse_tensor, to use sparse_tensor.out to write
a sparse tensor value to a file.
Modify the method that evaluates a tensor expression to return a pointer of the
MLIR sparse tensor for the result to delay the extraction of the coordinates and
non-zero values.
Implement the Tensor to_file method to evaluate the tensor assignment and write
the result to a file.
Add unit tests. Modify test golden files to reflect the change that TNS outputs
now have a comment line and two meta data lines.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D118956
Balazs Benics [Tue, 8 Feb 2022 16:42:46 +0000 (17:42 +0100)]
Revert "[analyzer] Prevent misuses of -analyze-function"
This reverts commit 841817b1ed26c1fbb709957d54c0e2751624fbf8.
Ah, it still fails on build bots for some reason.
Pinning the target triple was not enough.
Mark de Wever [Wed, 2 Feb 2022 18:28:03 +0000 (19:28 +0100)]
[libc++][nfc] Use TEST_SAFE_STATIC.
This avoids using a libc++ internal macro in our tests.
Reviewed By: #libc, philnik, ldionne
Differential Revision: https://reviews.llvm.org/D118874
Mark de Wever [Fri, 4 Feb 2022 07:03:50 +0000 (08:03 +0100)]
[libc++] Removes cpp17_output_iterator's default constructor.
This has been suggested in D117950.
Reviewed By: ldionne, #libc, philnik
Differential Revision: https://reviews.llvm.org/D118971
Andy Yankovsky [Mon, 7 Feb 2022 20:37:38 +0000 (20:37 +0000)]
[Support] Don't print stacktrace if DbgHelp.dll hasn't been loaded yet
On Windows, certain functions from `Signals.h` require that `DbgHelp.dll` is loaded. This typically happens when the main program calls `llvm::InitLLVM`; however, in some cases the main program doesn't do that (e.g. when the application is using LLDB via `liblldb.dll`). This patch adds a safeguard to prevent crashes. More discussion in https://reviews.llvm.org/D119009.
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D119181
Mircea Trofin [Tue, 8 Feb 2022 15:27:11 +0000 (07:27 -0800)]
[nfc][mlgo][regalloc] Stop warnings about unused function
Added a `NoopSavedModelImpl` type which can be used as a mock AOT-ed
saved model, further minimizing conditional compilation cases. This
also removes unused function warnings on gcc.
Krzysztof Drewniak [Mon, 7 Feb 2022 21:45:40 +0000 (21:45 +0000)]
[MLIR][GPU] Update GPUToROCDL to account for ControlFlow dialect
The conversion to the new ControlFlow dialect didn't change the
GPUToROCDL pass - this commit fixes this issue.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D119188
Hongtao Yu [Fri, 28 Jan 2022 23:53:37 +0000 (15:53 -0800)]
[llvm-profgen] On-demand track optimized-away inlinees for preinliner.
Tracking optimized-away inlinees based on all probes in a binary is expensive in terms of memory usage. I'm making the tracking on-demand, based on profiled functions only. This saves about 10% memory overall for a medium-sized benchmark.
Before:
note: After parsePerfTraces
note: Thu Jan 27 18:42:09 2022
note: VM: 8.68 GB RSS: 8.39 GB
note: After computeSizeForProfiledFunctions
note: Thu Jan 27 18:42:41 2022
note: **VM: 10.63 GB RSS: 10.20 GB**
note: After generateProbeBasedProfile
note: Thu Jan 27 18:45:49 2022
note: VM: 25.00 GB RSS: 24.95 GB
note: After postProcessProfiles
note: Thu Jan 27 18:49:29 2022
note: VM: 26.34 GB RSS: 26.27 GB
After:
note: After parsePerfTraces
note: Fri Jan 28 12:04:49 2022
note: VM: 8.68 GB RSS: 7.65 GB
note: After computeSizeForProfiledFunctions
note: Fri Jan 28 12:05:26 2022
note: **VM: 8.68 GB RSS: 8.42 GB**
note: After generateProbeBasedProfile
note: Fri Jan 28 12:08:03 2022
note: VM: 22.93 GB RSS: 22.89 GB
note: After postProcessProfiles
note: Fri Jan 28 12:11:30 2022
note: VM: 24.27 GB RSS: 24.22 GB
This should be a no-diff change in terms of profile quality.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D118515
Mark de Wever [Sat, 29 Jan 2022 13:52:41 +0000 (14:52 +0100)]
[libc++][format][nfc] Use string_view in tests.
This change is a preparation for adapting the tests for
P2216 std::format improvements
Reviewed By: #libc, Quuxplusone, ldionne
Differential Revision: https://reviews.llvm.org/D118717
Balazs Benics [Tue, 8 Feb 2022 16:27:57 +0000 (17:27 +0100)]
[analyzer] Prevent misuses of -analyze-function
Sometimes when I pass the mentioned option I forget to pass the
parameter list for C++ sources.
It would also be useful for newcomers to learn about this.
This patch introduces some logic checking common misuses involving
`-analyze-function`.
Reviewed-By: martong
Differential Revision: https://reviews.llvm.org/D118690
Matt Arsenault [Fri, 4 Feb 2022 19:29:12 +0000 (14:29 -0500)]
AMDGPU: Use reserved VGPR for AGPR spills to memory
Previously, we would reuse the VGPR used for large frame offsets with the
one needed for copying from the AGPR. Fix this by reusing the register
we already reserved for handling AGPR to AGPR copies.
Philip Reames [Tue, 8 Feb 2022 14:56:05 +0000 (06:56 -0800)]
[SCEV] Generalize SCEVEqualsPredicate to any compare [NFC]
PredicatedScalarEvolution has a predicate type for representing A == B. This change generalizes it into something which can represent A <pred> B.
This generality is currently unused, but is motivated by a couple of recent cases which have come up. In particular, I'm currently playing around with using this to simplify the runtime checking code in LoopVectorizer. Regardless of the outcome of that prototyping, generalizing the compare node seemed useful.
Nikita Popov [Tue, 8 Feb 2022 16:14:41 +0000 (17:14 +0100)]
[Mem2Reg] Check that load type matches alloca type
Alloca promotion can only deal with cases where the load/store
types match the alloca type (it explicitly does not support
bitcasted load/stores).
With opaque pointers this is no longer enforced through the pointer
type, so add an explicit check.
Matt Arsenault [Wed, 15 Dec 2021 02:56:48 +0000 (21:56 -0500)]
AMDGPU: Reserve v32 if we may need to copy between AGPRs on gfx908
We need to guarantee cheap copies between AGPRs, and unfortunately
gfx908 cannot directly do this. Theoretically we could set the
scavenger up with an emergency spill slot, but it also feels
unreasonable to pay that cost for what was assumed to be a simple and
cheap copy. Pick a register that doesn't conflict with any ABI
registers.
This does not address the same issue when copying from SGPR to AGPR
for gfx90a (this coincidentally fixes it for gfx908), but that's less
interesting since the register allocator shouldn't be proactively
introducing such copies.
One edge case I'm worried about is respecting the VGPR budget implied
by amdgpu-waves-per-eu. If the theoretical upper bound of a function
is 32 VGPRs, this will force the actual count to be 33.
This is also broken if inline assembly uses/defs something in v32. The
coalescer will eliminate the intermediate vreg between the def and
use, and the introduced copy will clobber the user value.
(cherry picked from commit 3335784ac2d587ff4eac04586e189532ae8b2607)
Matt Arsenault [Fri, 4 Feb 2022 19:56:03 +0000 (14:56 -0500)]
AMDGPU: Regenerate mir test checks to include -NEXT
Louis Dionne [Mon, 7 Feb 2022 22:25:41 +0000 (17:25 -0500)]
[libc++] Add a Lit configuration for running back-deployment tests
This testing configuration links tests against one libc++ shared library,
but runs them against another libc++ shared library. This makes sure that
we can build applications against the libc++ provided in a recent SDK and
back-deploy them to platforms containing older libc++ dylibs.
It also switches the Apple CI script to using that new configuration
instead of the legacy one.
Differential Revision: https://reviews.llvm.org/D119195
zhijian [Tue, 8 Feb 2022 15:57:04 +0000 (10:57 -0500)]
[NFC] Refactor llvm-nm symbol comparing and split sorting
Summary:
1. Added a helper function isSymbolDefined().
2. Split out the sorting code.
3. Refactored the symbol comparison function.
Reviewers: James Henderson,Fangrui Song
Differential Revision: https://reviews.llvm.org/D119028
Sanjay Patel [Tue, 8 Feb 2022 15:41:34 +0000 (10:41 -0500)]
[SDAG] enable binop identity constant folds for fmul/fdiv
The test diffs are identical to D119111.
This only affects x86 currently because no other target
has an override for the TLI hook that controls this transform.
Nikita Popov [Tue, 8 Feb 2022 15:50:11 +0000 (16:50 +0100)]
[AutoUpgrade] Handle remangling upgrade for ptr.annotation
The code assumed that the upgrade would happen due to the argument
count changing from 4 to 5. However, a remangling upgrade is also
possible here.
David Sherwood [Wed, 2 Feb 2022 09:02:16 +0000 (09:02 +0000)]
[AArch64][CodeGen] Always use SVE (when enabled) to lower 64-bit vector multiplies
This patch adds custom lowering support for ISD::MUL with v1i64 and v2i64
types when SVE is enabled, regardless of the minimum SVE vector length. We
do this because NEON simply does not have 64-bit vector multiplies, so we
want to take advantage of these instructions in SVE.
I've updated the 128-bit min SVE vector bits tests here:
CodeGen/AArch64/sve-fixed-length-int-arith.ll
CodeGen/AArch64/sve-fixed-length-int-mulh.ll
CodeGen/AArch64/sve-fixed-length-int-rem.ll
Differential Revision: https://reviews.llvm.org/D118802
Arjun P [Tue, 8 Feb 2022 15:36:24 +0000 (21:06 +0530)]
[MLIR][Presburger] Support computing volumes via hyperrectangular overapproximation
Add support for computing an overapproximation of the number of integer points
in a polyhedron. The returned result is actually the number of integer points
one gets by computing the "rational shadow" obtained by projecting out the
local IDs, finding the minimal axis-parallel hyperrectangular approximation
of the shadow, and returning the number of integer points in that. This does
not currently support symbols.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D119228
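The final counting step described above can be sketched as follows, assuming the per-dimension rational bounds of the hyperrectangle are already known. `Interval` and `boxIntegerPoints` are hypothetical names, doubles stand in for exact rationals, and overflow is ignored. This is not the MLIR Presburger implementation.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical closed interval [lo, hi] bounding one dimension of the box.
struct Interval {
  double lo, hi;
};

// Counts the integer points inside an axis-parallel box: each dimension
// contributes floor(hi) - ceil(lo) + 1 points, and the total is their
// product. Returns 0 if any dimension contains no integer.
inline std::uint64_t boxIntegerPoints(const std::vector<Interval> &box) {
  std::uint64_t count = 1;
  for (const Interval &iv : box) {
    double lo = std::ceil(iv.lo);
    double hi = std::floor(iv.hi);
    if (hi < lo)
      return 0;
    count *= static_cast<std::uint64_t>(hi - lo) + 1;
  }
  return count;
}
```

Because the box contains the shadow, the count is an upper bound on the number of integer points in the original set rather than an exact value.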
Roman Lebedev [Tue, 8 Feb 2022 15:35:22 +0000 (18:35 +0300)]
[ValueTracking] Only check for non-undef/poison if already known to be a self-multiply
https://godbolt.org/z/js9fTTG9h
^ we don't care what `isGuaranteedNotToBeUndefOrPoison()` says
unless we already knew that the operands were equal.
Roman Lebedev [Tue, 8 Feb 2022 15:27:29 +0000 (18:27 +0300)]
[NFC][clang] Autogenerate checklines in CodeGenCXX/nrvo.cpp
It checks IR after optimizations, which is inherently fragile,
and the results are now different after the recent patch.
Arjun P [Tue, 8 Feb 2022 15:23:43 +0000 (20:53 +0530)]
[MLIR][Presburger] Simplex::computeIntegerBounds: support unbounded directions by returning Optionals
Nathan Sidwell [Mon, 7 Feb 2022 18:08:18 +0000 (10:08 -0800)]
[demangler][NFC] Utility header cleanups
a) Using a do...while loop in the number formatter means we do not
have to special case zero.
b) Let's use 'if (auto size = ...) {}' for appending to the output
buffer.
c) We should also be using memcpy there, not memmove -- the string
being appended is never part of the current buffer.
d) Let's put all the operator<< functions together.
e) I find 'if (cond) frob(..., true); else frob(..., false)'
somewhat confusing. Let's just use std::abs in the signed integer
printer and let CSE decide about the duplicate < 0 testing.
f) Let's have as many as possible return *this. That's both more
consistent, and allows tailcalls in some cases (the actual number
formatter has a local array though).
These changes removed around 100 bytes from the demangler's
instructions on x86_64.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D119176
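Points (a), (e) and (f) can be illustrated with a small sketch. OutputSketch and its members are hypothetical stand-ins, not the demangler's actual utility class. The do...while loop always emits at least one digit, so zero needs no special case, and every append function returns *this so calls can be chained. One deliberate difference from the commit: the sketch negates in unsigned arithmetic instead of calling std::abs, so that the most negative value is also handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an output buffer with number formatting.
class OutputSketch {
  std::string Buffer;

  // Formats N in decimal, optionally preceded by a minus sign.
  OutputSketch &writeDecimal(uint64_t N, bool IsNegative) {
    char Temp[21]; // 20 digits for the largest uint64_t, plus a sign.
    char *End = Temp + sizeof(Temp);
    char *Ptr = End;
    // The body runs at least once, so N == 0 yields "0" with no extra branch.
    do {
      *--Ptr = static_cast<char>('0' + N % 10);
      N /= 10;
    } while (N != 0);
    if (IsNegative)
      *--Ptr = '-';
    Buffer.append(Ptr, static_cast<std::size_t>(End - Ptr));
    return *this;
  }

public:
  OutputSketch &appendUnsigned(uint64_t N) { return writeDecimal(N, false); }

  OutputSketch &appendSigned(int64_t N) {
    // Compute the magnitude in unsigned arithmetic to avoid signed overflow.
    uint64_t Magnitude =
        N < 0 ? 0 - static_cast<uint64_t>(N) : static_cast<uint64_t>(N);
    return writeDecimal(Magnitude, N < 0);
  }

  const std::string &str() const { return Buffer; }
};
```

Chaining Out.appendUnsigned(0).appendSigned(-42).appendUnsigned(7) on a fresh OutputSketch leaves "0-427" in the buffer.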
Nikita Popov [Thu, 3 Feb 2022 13:46:57 +0000 (14:46 +0100)]
[OpenCL] Mark kernel arguments as ABI aligned
Following the discussion on D118229, this marks all pointer-typed
kernel arguments as having ABI alignment, per section 6.3.5 of
the OpenCL spec:
> For arguments to a __kernel function declared to be a pointer to
> a data type, the OpenCL compiler can assume that the pointee is
> always appropriately aligned as required by the data type.
Differential Revision: https://reviews.llvm.org/D118894
Nikita Popov [Tue, 8 Feb 2022 12:52:02 +0000 (13:52 +0100)]
[AMDGPURewriteOutArguments] Don't use pointer element type
Instead of using the pointer element type, look at how the pointer
is actually being used in store instructions, while looking through
bitcasts. This makes the transform compatible with opaque pointers
and a bit more general.
It's worth noting that I have dropped the 3-vector to 4-vector
shufflevector special case, because this is now handled in a
different way: If the value is actually used as a 4-vector, then
we're directly going to use that type, instead of shuffling to a
3-vector in between.
Differential Revision: https://reviews.llvm.org/D119237
Simon Pilgrim [Tue, 8 Feb 2022 15:09:12 +0000 (15:09 +0000)]
[X86] selectLEAAddr - relax heuristic to only require one operand to be a MathWithFlags op (PR46809)
As suggested by @craig.topper, relaxing LEA matching to only require the ADD to be fed from a single op with EFLAGS helps avoid duplication when the EFLAGS are consumed in a later, dependent instruction.
There was some concern about whether the heuristic is too simple, not taking into account lost loads that can't fold by using a LEA, but some basic tests (included in select-lea.ll) don't suggest that's really a problem.
Differential Revision: https://reviews.llvm.org/D118128
serge-sans-paille [Fri, 4 Feb 2022 11:14:43 +0000 (12:14 +0100)]
Cleanup LLVMDebugInfoCodeView headers
Major user-facing changes:
Many headers in llvm/DebugInfo/CodeView no longer include
llvm/Support/BinaryStreamReader.h or llvm/Support/BinaryStreamWriter.h;
those headers may need to be included manually.
Several headers in llvm/DebugInfo/CodeView no longer include
llvm/DebugInfo/CodeView/EnumTables.h or llvm/DebugInfo/CodeView/CodeView.h;
those headers may need to be included manually.
Some statistics:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/DebugInfo/CodeView/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
after: 2794466
before: 2832765
Discourse thread on the topic: https://discourse.llvm.org/t/include-what-you-use-include-cleanup/
Differential Revision: https://reviews.llvm.org/D119092
Simon Pilgrim [Tue, 8 Feb 2022 14:59:59 +0000 (14:59 +0000)]
[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions.
This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:
__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
// CHECK-LABEL: test_mm256_adds_epi8
// CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
return _mm256_adds_epi8(a, b);
}
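For readers unfamiliar with the semantics, here is a scalar sketch of what one lane of a signed 8-bit saturating add computes. The function name saturatingAddI8 is hypothetical and the code is portable C++ rather than the builtin itself: the sum is formed in a wider type and clamped to the int8_t range instead of wrapping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// One lane of a signed saturating add on 8-bit integers (illustrative only).
int8_t saturatingAddI8(int8_t A, int8_t B) {
  // Widen first so the intermediate sum cannot overflow.
  int Sum = static_cast<int>(A) + static_cast<int>(B);
  // Clamp to [-128, 127] rather than wrapping around.
  return static_cast<int8_t>(std::max(-128, std::min(127, Sum)));
}
```

With this definition 100 + 100 yields 127 and -100 + -100 yields -128, while sums that fit in 8 bits are unchanged.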
Nikita Popov [Tue, 8 Feb 2022 14:57:48 +0000 (15:57 +0100)]
[AutoUpgrade] Also upgrade intrinsics in invokes
We currently don't have any specialized upgrades for intrinsics
that can be used in invokes, but they can still be subject to
a generic remangling upgrade. In particular, this happens when
upgrading statepoint intrinsics under -opaque-pointers.
This patch just changes the upgrade code to work on CallBase
instead of CallInst in particular.
Joseph Huber [Tue, 8 Feb 2022 14:38:33 +0000 (09:38 -0500)]
[OpenMP] Enable new driver tests for AMDGPU
This patch enables running the new driver tests for AMDGPU. Previously
this was disabled because some tests failed. This was only because the
new driver tests hadn't been listed as unsupported or expected to fail.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D119240
Sanjay Patel [Tue, 8 Feb 2022 13:32:14 +0000 (08:32 -0500)]
[SDAG] move x86 select-with-identity-constant fold behind a target hook; NFC
This is no-functional-change-intended because only the
x86 target enables the TLI hook currently.
We can add fmul/fdiv opcodes to the switch similar to the
proposal D119111, but we don't need to make other changes
like enabling target-specific combines.
We can also add integer opcodes (add, or, shl, etc.) to
the switch because this function is called from all of the
generic binary opcodes.
The goal is to incrementally enable the profitable diffs
from D90113 while avoiding regressions.
Differential Revision: https://reviews.llvm.org/D119150
Roman Lebedev [Tue, 8 Feb 2022 13:54:03 +0000 (16:54 +0300)]
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ uses
If the original invokes had uses, the uses must have been in PHI's,
but that immediately results in the incoming values being incompatible.
But we'll replace uses of the original invokes with the use of the
merged invoke, so as long as the incoming values become compatible
after that, we can merge.
Roman Lebedev [Tue, 8 Feb 2022 13:34:34 +0000 (16:34 +0300)]
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ PHIs but no uses
As long as the incoming values for all the invokes in the set
are identical, we can merge the invokes.
Roman Lebedev [Tue, 8 Feb 2022 12:42:03 +0000 (15:42 +0300)]
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ no uses, no PHI's
Even if the invokes have normal destination, iff it's the same block,
we can merge them. For now, require that there are no PHI nodes,
and the returned values of invokes aren't used.
Roman Lebedev [Tue, 8 Feb 2022 12:13:16 +0000 (15:13 +0300)]
[NFC][SimplifyCFG] 'merge compatible invokes': more tests for various edge-cases
Simon Pilgrim [Tue, 8 Feb 2022 14:45:28 +0000 (14:45 +0000)]
Revert rG6c174ab2ad0676b295f11f6c3913eff9289fa6b9 "[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat"
Missed some legacy builtin tests that need cleaning up first
Sheng [Tue, 8 Feb 2022 14:32:48 +0000 (14:32 +0000)]
[GlobalISel] Add big endian support in CallLowering
When splitting values, CallLowering assumes the Lo part goes first. But in a big-endian ISA such as M68k, the Hi part goes first.
This patch fixes this.
Differential Revision: https://reviews.llvm.org/D116877
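The ordering issue can be sketched on a plain 64-bit integer. splitValue is a hypothetical helper, not the CallLowering code: it splits a value into two 32-bit parts and returns them in the order in which they would be assigned for the given endianness.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Splits Value into 32-bit parts, ordered as a target of the given
// endianness would expect them (illustrative only).
std::vector<uint32_t> splitValue(uint64_t Value, bool IsBigEndian) {
  uint32_t Lo = static_cast<uint32_t>(Value);       // Least significant half.
  uint32_t Hi = static_cast<uint32_t>(Value >> 32); // Most significant half.
  if (IsBigEndian)
    return {Hi, Lo}; // Big endian: the Hi part goes first.
  return {Lo, Hi};   // Little endian: the Lo part goes first.
}
```

For 0x1122334455667788 the little-endian order is {0x55667788, 0x11223344}, and the big-endian order is the reverse.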
Nathan Sidwell [Fri, 28 Jan 2022 17:27:28 +0000 (09:27 -0800)]
[demangler] Improve ->* & .* demangling
The demangler treats ->* as a BinaryExpr, but .* as a MemberExpr.
That's inconsistent. This makes the former a MemberExpr too.
However, in order to not regress the paren output, MemberExpr::print
is modified to parenthesize the MemberExpr if the operator ends with
'*'. Printing is affected thusly:
Before:
obj.member
obj->member
obj.*member
(obj) ->* (member)
After:
obj.member # Unchanged
obj->member # Unchanged
obj.*(member) # Added paren member operand
obj->*(member) # Removed paren on object operand, less whitespace
The right solution to the paren problem is to add some notion of
precedence (and associativity) to Nodes, but that's a larger change
that would become simpler once the refactoring I'm doing is completed.
FWIW, binutils' demangler's paren algorithm has a small idea of
precedence, and will generally not emit parens when the operand is
unary.
Reviewed By: bruno
Differential Revision: https://reviews.llvm.org/D118486
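The parenthesization rule described above can be sketched as follows. printMemberExpr is a hypothetical free function operating on strings, not the demangler's MemberExpr::print: the member operand is wrapped in parentheses only when the operator ends with '*'.

```cpp
#include <cassert>
#include <string>

// Prints "LHS <op> RHS" for a member access, parenthesizing the member
// operand when the operator ends with '*' (illustrative only).
std::string printMemberExpr(const std::string &LHS, const std::string &Kind,
                            const std::string &RHS) {
  bool NeedsParen = !Kind.empty() && Kind.back() == '*';
  std::string Out = LHS + Kind;
  if (NeedsParen)
    Out += '(';
  Out += RHS;
  if (NeedsParen)
    Out += ')';
  return Out;
}
```

This reproduces the four "After" lines in the commit message: "." and "->" print unchanged, while ".*" and "->*" wrap the member operand.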
Simon Pilgrim [Tue, 8 Feb 2022 14:21:11 +0000 (14:21 +0000)]
[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat
D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions.
This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes:
__m256i test_mm256_adds_epi8(__m256i a, __m256i b) {
// CHECK-LABEL: test_mm256_adds_epi8
// CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.*}}, <32 x i8> %{{.*}})
return _mm256_adds_epi8(a, b);
}
Nikita Popov [Tue, 8 Feb 2022 14:16:16 +0000 (15:16 +0100)]
[AArch64TargetTransformInfo] Avoid pointer element type access
Use the element type of the gathered/scattered vector instead.
Simon Pilgrim [Tue, 8 Feb 2022 14:13:36 +0000 (14:13 +0000)]
Fix signed/unsigned comparison warnings on ppc buildbots
Corentin Jabot [Tue, 8 Feb 2022 14:13:04 +0000 (09:13 -0500)]
Add core papers adopted at the February plenary.
Two papers are added to the status page, one targeting
C++23, the other added to the batch of C++20 concept papers.
Nikita Popov [Tue, 8 Feb 2022 14:04:23 +0000 (15:04 +0100)]
[AsmPrinter] Avoid pointer element type access
Instead of checking for a bitcast from a function type, check
whether the aliasee is a function after stripping bitcasts. This
is not strictly equivalent, but serves the same purpose.
Simon Pilgrim [Tue, 8 Feb 2022 13:55:01 +0000 (13:55 +0000)]
Fix signed/unsigned comparison warnings on ppc buildbots