Mircea Trofin [Tue, 30 Nov 2021 01:18:29 +0000 (17:18 -0800)]
[NFC][regalloc] Factor accesses to ExtraRegInfo
We'll move ExtraRegInfo to the RegAllocEvictionAdvisor subsequently.
This change prepares for that by factoring all accesses.
RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html
Differential Revision: https://reviews.llvm.org/D114759
Tarique Islam [Tue, 30 Nov 2021 22:41:55 +0000 (22:41 +0000)]
Big-endian version of vpermxor
A big-endian version of vpermxor, named vpermxor_be, is added to LLVM
and Clang. vpermxor_be can be called directly on both the little-endian
and the big-endian platforms.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D114540
Julian Lettner [Tue, 30 Nov 2021 20:12:14 +0000 (12:12 -0800)]
[TSan][Darwin] Avoid crashes due to interpreting non-zero shadow content as a pointer
We would like to use TLS to store the ThreadState object (or at least a
reference to it), but on Darwin accessing TLS via __thread or manually
by using pthread_key_* is problematic, because there are several places
where interceptors are called when TLS is not accessible (early process
startup, thread cleanup, ...).
Previously, we used a "poor man's TLS" implementation, where we use the
shadow memory of the pointer returned by pthread_self() to store a
pointer to the ThreadState object.
The problem with that was that certain operations can populate shadow
bytes unbeknownst to TSan, and we later interpret these non-zero bytes
as the pointer to our ThreadState object and crash when dereferencing
the pointer.
This patch changes the storage location of our reference to the
ThreadState object to "real" TLS. We make this work by artificially
keeping this reference alive in the pthread_key destructor by resetting
the key value with pthread_setspecific().
This change also fixes the issue where the ThreadState object is
re-allocated after DestroyThreadState() because intercepted functions
can still get called on the terminating thread after the
THREAD_TERMINATE event.
Radar-Id: rdar://problem/72010355
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D110236
Jonathan Peyton [Fri, 19 Nov 2021 22:22:21 +0000 (16:22 -0600)]
[OpenMP][libomp][doc] Add environment variables documentation
Add documentation for the environment variables for libomp
Differential Revision: https://reviews.llvm.org/D114269
Peter Klausler [Fri, 26 Nov 2021 19:39:31 +0000 (11:39 -0800)]
[flang] Define & implement a lowering support API IsContiguous() in runtime
Create a new flang/runtime/support.cpp module to hold miscellaneous
runtime APIs to support lowering, and define an API IsContiguous() to
wrap the member function predicate Descriptor::IsContiguous().
And do a little clean-up of other API headers that don't need to expose
Runtime/descriptor.h.
Differential Revision: https://reviews.llvm.org/D114752
Schuyler Eldridge [Tue, 30 Nov 2021 05:47:08 +0000 (00:47 -0500)]
[ADT] Remove 0-width Asserts in APInt.getZExtValue
Remove assertion that disallows getting a zero-extended value from a
zero-width APInt. This check is too restrictive and makes it difficult
to use APInt to model zero-width things, e.g., zero-width wires in the
CIRCT project.
Signed-off-by: Schuyler Eldridge <schuyler.eldridge@sifive.com>
Reviewed By: lattner, darthscsi, nikic
Differential Revision: https://reviews.llvm.org/D114768
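To illustrate the zero-width case this change permits, here is a minimal Python sketch. It is not LLVM's APInt: the class `FixedWidthInt` and the method `zext_value` are hypothetical names used only to show that a value with no bits can reasonably zero-extend to 0.

```python
class FixedWidthInt:
    """Hypothetical model of an unsigned fixed-width integer (not LLVM's APInt)."""

    def __init__(self, width, value=0):
        if width < 0:
            raise ValueError("width must be non-negative")
        self.width = width
        # Keep only the low `width` bits; a zero-width value holds no bits at all.
        self.value = value & ((1 << width) - 1)

    def zext_value(self):
        """Return the stored bits zero-extended to an unbounded Python int."""
        return self.value


zero_width_wire = FixedWidthInt(0, 5)   # e.g. a zero-width wire in a hardware model
byte = FixedWidthInt(8, 0x1FF)          # value is truncated to 8 bits
```

With this model there is no special case to reject: the mask for width 0 is 0, so the zero-extended value is always 0.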
Vitaly Buka [Mon, 29 Nov 2021 21:07:11 +0000 (13:07 -0800)]
[NFC][sanitizer] Fail test quickly
Srividya Karumuri [Tue, 30 Nov 2021 00:25:21 +0000 (16:25 -0800)]
[InstCombine] Allow fake vector insert folding to bit-logic only if the insert element is integer type
The commit below causes an assertion failure when the insert element type is not
an integer type, such as half. This is because the transformation creates a zext
before doing the bitwise OR, and zext is supported only for integer types.
https://github.com/llvm/llvm-project/commit/80ab06c599a0f5a90951c36a57b2a9b492b19d61
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D114734
Greg Clayton [Thu, 18 Nov 2021 05:18:24 +0000 (21:18 -0800)]
[NFC] Refactor symbol table parsing.
Symbol table parsing has evolved over the years and many plug-ins contained duplicate code in the ObjectFile::GetSymtab() that used to be pure virtual. With this change, the "Symtab *ObjectFile::GetSymtab()" is no longer virtual and will end up calling a new "void ObjectFile::ParseSymtab(Symtab &symtab)" pure virtual function to actually do the parsing. This helps centralize the code for parsing the symbol table and allows the ObjectFile base class to do all of the common work, like taking the necessary locks and creating the symbol table object itself. Plug-ins now just need to parse when they are asked to parse as the ParseSymtab function will only get called once.
This is a retry of the original patch https://reviews.llvm.org/D113965 which was reverted. There was a deadlock in the Manual DWARF indexing code during symbol preloading where the module was asked on the main thread to preload its symbols, and this would in turn cause the DWARF manual indexing to use a thread pool to index all of the compile units, and if there were relocations on the debug information sections, these threads could ask the ObjectFile to load section contents, which could cause a call to ObjectFileELF::RelocateSection() which would ask for the symbol table from the module and it would deadlock. We can't lock the module in ObjectFile::GetSymtab(), so the solution I am using is to use a llvm::once_flag to create the symbol table object once and then lock the Symtab object. Since all APIs on the symbol table use this lock, this will prevent anyone from using the symbol table before it is parsed and finalized and will avoid the deadlock I mentioned. ObjectFileELF::GetSymtab() was never locking the module lock before and would put off creating the symbol table until somewhere inside ObjectFileELF::GetSymtab(). Now we create it one time inside of the ObjectFile::GetSymtab() and immediately lock it which should be safe enough. This avoids the deadlocks and still provides safety.
Differential Revision: https://reviews.llvm.org/D114288
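The create-once-then-lock pattern described in this commit can be sketched roughly as follows. This is a Python analogy, not LLDB code: the classes, `get_symtab`, `parse_symtab`, and `parse_calls` are hypothetical stand-ins, and a small creation lock plays the role of llvm::once_flag.

```python
import threading


class Symtab:
    """Toy symbol table; every real API would take `lock` before touching data."""

    def __init__(self):
        self.lock = threading.RLock()
        self.symbols = []


class ObjectFile:
    """Base class does the common work; subclasses only implement parse_symtab()."""

    def __init__(self):
        self._create_lock = threading.Lock()  # stands in for a once flag
        self._symtab = None
        self.parse_calls = 0

    def get_symtab(self):
        with self._create_lock:
            created = self._symtab is None
            if created:
                self._symtab = Symtab()
                # Lock the table before anyone else can see it, so other
                # threads block on the table itself until parsing is done.
                self._symtab.lock.acquire()
        symtab = self._symtab
        if created:
            try:
                self.parse_calls += 1
                self.parse_symtab(symtab)
            finally:
                symtab.lock.release()
        return symtab


class ToyObjectFile(ObjectFile):
    def parse_symtab(self, symtab):
        symtab.symbols.extend(["main", "foo"])


obj = ToyObjectFile()
first = obj.get_symtab()
second = obj.get_symtab()
```

Note that the module lock is never taken here; callers that arrive during parsing wait on the symbol table's own lock, which is the property the commit relies on to avoid the deadlock.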
Peter Klausler [Thu, 25 Nov 2021 00:05:37 +0000 (16:05 -0800)]
[flang] Correct INQUIRE(POSITION= & PAD=)
INQUIRE(POSITION=)'s results need to reflect the POSITION=
specifier used for the OPEN statement until the unit has been
repositioned. Preserve the POSITION= from OPEN and use it
for INQUIRE(POSITION=) until it becomes obsolete.
INQUIRE(PAD=) is implemented here in the case of an unconnected unit
with Fortran 2018 semantics; i.e., "UNDEFINED", rather than Fortran 90's
"YES"/"NO" (see 4.3.6 para 2). Apparent failures with F'90-only tests
will persist with INQUIRE(PAD=); these discrepancies don't seem to warrant
an option or environment variable.
To make the implementation of INQUIRE more closely match the language
in the standard, rename IsOpen() to IsConnected(), and use it explicitly
for the various INQUIRE specifiers.
Differential Revision: https://reviews.llvm.org/D114755
Aart Bik [Tue, 30 Nov 2021 18:58:13 +0000 (10:58 -0800)]
[mlir][sparse] refine simply dynamic sparse tensor outputs
Proper test for sparse tensor outputs is a single condition throughout
the whole tensor index expression (not a general conjunction, since this
may include other conditions that cause cancellation).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D114810
Peter Klausler [Tue, 23 Nov 2021 02:37:25 +0000 (18:37 -0800)]
[flang] Re-fold bounds expressions in DATA implied DO loops
To accommodate triangular implied DO loops in DATA statements, in which
the bounds of nested implied DO loops might depend on the values of the
indices of outer implied DO loops in the same DATA statement set, it
is necessary to run them through constant folding each time they are
encountered.
Differential Revision: https://reviews.llvm.org/D114754
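A triangular implied DO such as ((A(I,J), J=1,I), I=1,N) shows why the bounds need to be evaluated again on each encounter. The following Python sketch is only an illustration of that idea; `expand_implied_do` and `inner_upper` are hypothetical names, and the real implementation works on flang's expression trees rather than Python callables.

```python
def expand_implied_do(n, inner_upper):
    """Expand ((A(I,J), J=1,inner_upper(I)), I=1,n) into a list of (I, J) pairs."""
    pairs = []
    for i in range(1, n + 1):
        # The inner bound depends on the outer index, so it is re-evaluated
        # (re-folded) on every outer iteration instead of being folded once.
        upper = inner_upper(i)
        for j in range(1, upper + 1):
            pairs.append((i, j))
    return pairs


triangle = expand_implied_do(3, lambda i: i)
```

Folding the inner bound only once, with whatever value I had at that time, would produce a rectangle rather than a triangle.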
Elizabeth Andrews [Tue, 30 Nov 2021 21:15:51 +0000 (13:15 -0800)]
[clang-repl][NFC] Fix calling convention mismatch in test
Test failed on x86 platforms due to a calling convention mismatch
when member function was called like a free function. In this patch,
member function is marked static to address this.
Jonas Devlieghere [Tue, 30 Nov 2021 20:54:31 +0000 (12:54 -0800)]
[lldb] Fix broken skipUnlessUndefinedBehaviorSanitizer decorator
727bd89b605b broke the UBSan decorator. The decorator compiles a custom
source code snippet that exposes UB and verifies the presence of a UBSan
symbol in the generated binary. The aforementioned commit broke both by
compiling a snippet without UB and discarding the result.
Peter Klausler [Tue, 23 Nov 2021 20:45:39 +0000 (12:45 -0800)]
[flang] Fix usage & catch errors for MAX/MIN with keyword= arguments
MAX(), MIN(), and their specific variants are defined with an unlimited
number of dummy arguments named A1=, A2=, &c. whose names are almost never
used in practice but should be allowed for and properly checked for the
usual errors when they do appear. The intrinsic table's entries otherwise
have fixed numbers of dummy argument definitions, so add some special
case handling in a few spots for MAX/MIN/&c. checking and procedure
characteristics construction.
Differential Revision: https://reviews.llvm.org/D114750
Jonas Devlieghere [Tue, 30 Nov 2021 20:41:45 +0000 (12:41 -0800)]
[lldb] Fix TypeError: argument of type 'NoneType' is not iterable
Check if we have an apple_sdk before checking if it contains "internal".
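The kind of guard described here can be sketched as below. Only the name apple_sdk comes from the commit message; the helper `uses_internal_sdk` is a hypothetical wrapper, not the actual test-suite code.

```python
def uses_internal_sdk(apple_sdk):
    """Return True only if an SDK name is set and it mentions "internal"."""
    # `"internal" in None` raises TypeError ("argument of type 'NoneType' is
    # not iterable"), so check that a value is present first.
    return bool(apple_sdk) and "internal" in apple_sdk
```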
Jonas Devlieghere [Tue, 30 Nov 2021 19:33:09 +0000 (11:33 -0800)]
[lldb] Mark TestTsanBasic and TestUbsanBasic as "no debug info" tests
Speed up testing by not rerunning the test for all debug info variants.
Nicolas Vasilache [Mon, 29 Nov 2021 16:22:45 +0000 (16:22 +0000)]
[mlir][tensor] InsertSliceOp verification.
This revision reintroduces tensor.insert_slice verification which seems
to have vanished over time: a verifier was initially introduced in
cf9503c1b752062d9abfb2c7922a50574d9c5de4 but for some reason the invalid.mlir was not properly updated; as time passed the verifier was not called anymore and later the code was deleted.
As a consequence, a non-negligible portion of tests has run astray using invalid
tensor.insert_slice semantics and needed to be fixed.
Also, extract isRankReducedType from TensorOps for better reuse
Originally, this facility was used by both tensor and memref forms but
it got copied around as dialects were split.
Differential Revision: https://reviews.llvm.org/D114715
MaheshRavishankar [Tue, 30 Nov 2021 15:46:21 +0000 (15:46 +0000)]
[mlir][MemRef] Fix SubViewOp canonicalization when a subset of unit-dims are dropped.
The canonical type of the result of the `memref.subview` needs to make
sure that the previously dropped unit-dimensions are the ones dropped
for the canonicalized type as well. This means the generic
`inferRankReducedResultType` cannot be used. Instead the current
dropped dimensions need to be queried and the same need to be dropped.
Reviewed By: nicolasvasilache, ThomasRaoux
Differential Revision: https://reviews.llvm.org/D114751
Jameson Nash [Tue, 30 Nov 2021 19:59:49 +0000 (14:59 -0500)]
AArch64 GIsel: legalize lshr operands, even if it is poison
Previously, this caused GlobalISel to emit invalid IR (a gpr32 to gpr64
copy) and fail during verification.
While this shift is not defined (returns poison), it should not crash
codegen, as it may appear inside dead code (for example, a select
instruction), and it is legal IR input, as long as the value is unused.
Discovered while trying to build Julia with LLVM v13:
https://github.com/JuliaLang/julia/pull/42602.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D114389
Snehasish Kumar [Tue, 30 Nov 2021 20:19:27 +0000 (12:19 -0800)]
[memprof] Disallow memprof profile reader tests on non-x86 archs.
The memprof profile reader tests rely on binary data which is generated
from and meant to be interpreted on little endian architectures. Add a
REQUIRES: x86_64-linux clause to both tests to ensure they don't fail on big
endian targets such as ppc.
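The endianness dependence can be seen with a small Python example using the standard struct module. The constant below is arbitrary and is not taken from the memprof format.

```python
import struct

# Eight bytes as a little-endian writer would produce them.
raw = struct.pack("<Q", 0x0102030405060708)

# Reading with the matching byte order recovers the value ...
little = struct.unpack("<Q", raw)[0]
# ... while a big-endian interpretation of the same bytes does not.
big = struct.unpack(">Q", raw)[0]
```

A reader that interprets raw bytes in host byte order sees the second result on a big-endian machine, which is why checked-in binary inputs generated on x86 are only meaningful on little-endian targets.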
Nikita Popov [Tue, 30 Nov 2021 20:07:31 +0000 (21:07 +0100)]
[SCEV] Verify integrity of ValuesAtScopes and users (NFC)
Make sure that ValuesAtScopes and ValuesAtScopesUsers are
consistent during SCEV verification.
Zarko Todorovski [Tue, 30 Nov 2021 20:06:46 +0000 (15:06 -0500)]
[clang][docs] Inclusive language: remove use of sanity check in option description
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D114562
Zarko Todorovski [Tue, 30 Nov 2021 19:48:53 +0000 (14:48 -0500)]
[NFC][Clang]Inclusive language: Replace uses of whitelist in clang/test
Snehasish Kumar [Tue, 30 Nov 2021 19:48:53 +0000 (11:48 -0800)]
[memprof] Disable pedantic warnings, suppress variadic macro warning.
The memprof unit tests use an older version of gmock (included in the
repo) which does not build cleanly with -pedantic:
https://github.com/google/googletest/issues/2650
For now just silence the warning by disabling pedantic and add the
appropriate flags for gcc and clang.
not-jenni [Tue, 30 Nov 2021 19:56:23 +0000 (11:56 -0800)]
[mlir][tosa] Add tosa.conv2d as fully_connected canonicalization
For a 1x1 weight and stride of 1, the input/weight can be reshaped and passed into a fully connected op then reshaped back
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D114757
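The equivalence behind this canonicalization can be checked with a small pure-Python sketch. It assumes NHWC input, a weight laid out as [OC][1][1][IC], stride 1, no padding and no dilation; all function names are hypothetical and nothing here is MLIR or TOSA API.

```python
def fully_connected(rows, weight, bias):
    """rows: [M][IC], weight: [OC][IC], bias: [OC] -> [M][OC]."""
    return [[sum(x * w for x, w in zip(row, w_row)) + b
             for w_row, b in zip(weight, bias)]
            for row in rows]


def conv2d_1x1(inp, weight, bias):
    """Direct 1x1, stride-1 convolution over an NHWC input."""
    return [[[[sum(px[c] * weight[o][0][0][c] for c in range(len(px))) + bias[o]
               for o in range(len(weight))]
              for px in row]
             for row in img]
            for img in inp]


def conv2d_1x1_via_fc(inp, weight, bias):
    """Same result via reshape -> fully connected -> reshape back."""
    n, h, w = len(inp), len(inp[0]), len(inp[0][0])
    flat = [px for img in inp for row in img for px in row]  # [N*H*W][IC]
    weight_2d = [w_out[0][0] for w_out in weight]            # [OC][IC]
    out = iter(fully_connected(flat, weight_2d, bias))       # [N*H*W][OC]
    return [[[next(out) for _ in range(w)] for _ in range(h)] for _ in range(n)]


inp = [[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]]        # N=1, H=2, W=2, IC=2
weight = [[[[1, 0]]], [[[0, 1]]], [[[1, 1]]]]       # OC=3
bias = [0, 10, 100]
```

Because a 1x1 kernel only mixes channels at a single pixel, every pixel can be treated as one row of a matrix multiply, which is what the reshape expresses.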
Jameson Nash [Thu, 25 Nov 2021 05:15:05 +0000 (00:15 -0500)]
fix inverted logic for HideUnrelatedOptions
It seems clearer to me that this would check for *any of* instead of
*all of* these option categories, as it looks to me like that was the
intent. But apparently this logic has always been inverted, and
possibly never fully used?
Differential Revision: https://reviews.llvm.org/D114572
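The "any of" reading of the check can be sketched as follows. This is an assumption about the intended behaviour based on the message above, written in Python with hypothetical names; it is not the CommandLine library code.

```python
def should_hide(option_categories, keep_categories):
    """Hide an option unless at least one of its categories is in the keep set."""
    return not any(cat in keep_categories for cat in option_categories)
```

Under this reading, an option that belongs to several categories stays visible as soon as one of them is related, rather than requiring all of them to be.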
Michael Jones [Mon, 15 Nov 2021 23:03:30 +0000 (15:03 -0800)]
[libc][clang-tidy] fix namespace check for externals
Up until now, all references to `errno` were marked with `NOLINT`, since
it was technically calling an external function. This fixes the lint
rules so that `errno`, as well as `malloc`, `calloc`, `realloc`, and
`free` are all allowed to be called as external functions. All of the
relevant `NOLINT` comments have been removed, and the documentation has
been updated.
Reviewed By: sivachandra, lntue, aaron.ballman
Differential Revision: https://reviews.llvm.org/D113946
Snehasish Kumar [Tue, 30 Nov 2021 19:33:37 +0000 (11:33 -0800)]
[memprof] Fix unit test build after refactoring shared header.
The memprof unittest also needs to include the MemProfData.inc header
directly to have access to MEMPROF_RAW_MAGIC and MEMPROF_RAW_VERSION
globals.
Fangrui Song [Tue, 30 Nov 2021 19:33:16 +0000 (11:33 -0800)]
[ELF][PPC64] Remove unneeded PPC64PCRelLongBranchThunk
This reverts the PPC64PCRelLongBranchThunk part from D86706.
PPC64PCRelLongBranchThunk is the same as PPC64R12SetupStub.
Use `__gep_setup_` instead of `__long_branch_pcrel_` for the stub symbol name
as it more closely indicates the operation.
(Note: GNU ld uses `*.long_branch.*` and `*.plt_branch.*`).
Reviewed By: NeHuang, nemanjai
Differential Revision: https://reviews.llvm.org/D114656
Jonas Devlieghere [Tue, 30 Nov 2021 19:28:52 +0000 (11:28 -0800)]
[lldb] Fix indentation in builders/darwin.py
Jonas Devlieghere [Tue, 30 Nov 2021 19:28:19 +0000 (11:28 -0800)]
[lldb] Search PrivateFrameworks when using an internal SDK
Make sure to add the PrivateFrameworks directory to the frameworks path
when using an internal SDK. This is necessary for the "on-device" test
suite.
rdar://84519268
Differential revision: https://reviews.llvm.org/D114742
Sanjay Patel [Tue, 30 Nov 2021 18:59:39 +0000 (13:59 -0500)]
[InstSimplify] add logic fold for 'or'
https://alive2.llvm.org/ce/z/4PaPDy
There's a related fold where the inner 'or' is replaced by 'and',
but that needs to be more careful about matching a 'not'.
Sanjay Patel [Tue, 30 Nov 2021 18:17:30 +0000 (13:17 -0500)]
[InstSimplify] reduce code duplication for 'or' logic folds; NFC
Sanjay Patel [Tue, 30 Nov 2021 18:08:12 +0000 (13:08 -0500)]
[InstSimplify] make 'or' test names more descriptive; NFC
Also, vary the types in a couple of tests for better coverage.
Fangrui Song [Tue, 30 Nov 2021 19:06:28 +0000 (11:06 -0800)]
[ELF] Change -z unknown from error to warning
There is a trend of having more optional options (usually security
hardening related) like -z cet-report=, -z bti-report=, -z force-bti.
If ld.lld 14.0.0 uses a warning, in 15/16/17/... timeframe when people
add new options to software, they can worry less about linker errors on ld.lld 14.0.0.
In some cases `-z foo` does essential work where a silent ignore can be
problematic, but the user has received a warning. From my observation,
`-z foo` options that do essential work are far fewer than those that do not. In addition,
the user who cares can use `--fatal-warnings` (Note: GNU ld doesn't upgrade warnings to errors).
It is unclear whether we need something like `clang -Wunknown-warning-option`.
If we ever run into unfortunate transition like `-z start-stop-gc`, the
affected software (e.g. ldc is a compiler which passes linker options to the underlying ld)
can blindly add the `-z` option, without worrying it may cause a linker error to LLD 14.0.0.
Reviewed By: jrtc27, peter.smith
Differential Revision: https://reviews.llvm.org/D114748
LLVM GN Syncbot [Tue, 30 Nov 2021 18:46:43 +0000 (18:46 +0000)]
[gn build] Port 7cca33b40f77
Snehasish Kumar [Fri, 19 Nov 2021 22:02:41 +0000 (14:02 -0800)]
[memprof] Extend llvm-profdata to display MemProf profile summaries.
This commit adds initial support to llvm-profdata to read and print
summaries of raw memprof profiles.
Summary of changes:
* Refactor shared defs to MemProfData.inc
* Extend show_main to display memprof profile summaries.
* Add a simple raw memprof profile reader.
* Add a couple of tests to tools/llvm-profdata.
Differential Revision: https://reviews.llvm.org/D114286
Peter Klausler [Fri, 26 Nov 2021 20:40:11 +0000 (12:40 -0800)]
[flang] Address TODO from previous changes to IsSaved()
An earlier fix to evaluate::IsSaved() needed to preserve its
treatment of named constants in modules and main programs -- i.e.
they would appear to be saved -- until a correction was added
to the lowering code. This TODO can now be resolved.
Differential Revision: https://reviews.llvm.org/D114756
Hans Wennborg [Tue, 30 Nov 2021 18:26:50 +0000 (19:26 +0100)]
Typo fix
Alexey Bataev [Wed, 17 Nov 2021 19:14:38 +0000 (11:14 -0800)]
[SLP]Improve isFixedVectorShuffle and its use.
Extended support for undefined source vector/extract indices/non-fixed
vector types, also no need to check for the parent of the extractelement
instructions with the constant indices.
Differential Revision: https://reviews.llvm.org/D114121
Sanjay Patel [Tue, 30 Nov 2021 17:45:09 +0000 (12:45 -0500)]
[InstSimplify] reduce code duplication for 'or' logic fold; NFC
Sanjay Patel [Tue, 30 Nov 2021 17:29:45 +0000 (12:29 -0500)]
[InstSimplify] adjust tests for 'or' of logic ops; NFC
Half of the tests had an extra instruction so were not testing the minimal patterns.
Sanjay Patel [Tue, 30 Nov 2021 16:47:57 +0000 (11:47 -0500)]
[InstSimplify] refactor 'or' logic folds; NFC
Reduce duplication for handling the top-level commuted operands.
There are several other folds that should be moved in here, but
we need to make sure there's good test coverage.
Sanjay Patel [Tue, 30 Nov 2021 16:20:23 +0000 (11:20 -0500)]
[InstSimplify] add tests for 'or' with logic ops; NFC
The code for these transforms can be refactored,
but the existing tests are incomplete.
Sanjay Patel [Tue, 30 Nov 2021 15:15:43 +0000 (10:15 -0500)]
[InstSimplify] add tests for 'or' logic folds; NFC
The tests are adapted from the xor patterns used with:
892648b18a8c
b326c058146f
Alexey Bataev [Tue, 30 Nov 2021 16:36:45 +0000 (08:36 -0800)]
[SLP][NFC]Move static function to make it visible in member function, NFC.
Nikita Popov [Tue, 30 Nov 2021 17:27:27 +0000 (18:27 +0100)]
Revert "Use VersionTuple for parsing versions in Triple. This makes it possible to distinguish between "16" and "16.0" after parsing, which previously was not possible."
This reverts commit 1e8286467036d8ef1a972de723f805a4981b2692.
llvm/test/Transforms/LoopStrengthReduce/X86/2009-11-10-LSRCrash.ll fails
with assertion failure:
llc: /home/nikic/llvm-project/llvm/include/llvm/ADT/Optional.h:196: T& llvm::optional_detail::OptionalStorage<T, true>::getValue() & [with T = unsigned int]: Assertion `hasVal' failed.
...
#8 0x00005633843af5cb llvm::MCStreamer::emitVersionForTarget(llvm::Triple const&, llvm::VersionTuple const&)
#9 0x0000563383b47f14 llvm::AsmPrinter::doInitialization(llvm::Module&)
kpyzhov [Tue, 30 Nov 2021 17:30:15 +0000 (12:30 -0500)]
[RegionPass] Added check for -filter-print-funcs option to the region IR dumps.
Differential Revision: https://reviews.llvm.org/D114310
Nikita Popov [Tue, 30 Nov 2021 11:02:23 +0000 (12:02 +0100)]
[SCEV] Track and invalidate ValuesAtScopes users
ValuesAtScopes maps a SCEV and a Loop to another SCEV. While we
invalidate entries if the left-hand SCEV is invalidated, we
currently don't do this for the right-hand SCEV. Fix this by
tracking users in a reverse map and using it for invalidation.
This is conceptually the same change as D114738, but using the
reverse map to avoid performance issues.
Differential Revision: https://reviews.llvm.org/D114788
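The reverse-map idea can be sketched in Python as below. The class and method names are hypothetical and strings stand in for SCEV and Loop objects; the actual data structures in ScalarEvolution differ.

```python
from collections import defaultdict


class ValuesAtScopesCache:
    """Toy cache mapping (expr, loop) -> value, with a reverse users map."""

    def __init__(self):
        self.values = {}                # (expr, loop) -> value expr
        self.users = defaultdict(set)   # value expr -> {(expr, loop), ...}

    def insert(self, expr, loop, value):
        self.values[(expr, loop)] = value
        self.users[value].add((expr, loop))

    def lookup(self, expr, loop):
        return self.values.get((expr, loop))

    def invalidate(self, expr):
        # Left-hand side: drop entries keyed by the invalidated expression.
        for key in [k for k in self.values if k[0] == expr]:
            value = self.values.pop(key)
            self.users[value].discard(key)
        # Right-hand side: use the reverse map to drop entries whose cached
        # value is the invalidated expression, without scanning all values.
        for key in self.users.pop(expr, set()):
            self.values.pop(key, None)


cache = ValuesAtScopesCache()
cache.insert("A", "L1", "B")
cache.insert("C", "L1", "D")
cache.invalidate("B")   # "B" only appears as a cached value
```

The reverse map trades a little memory for avoiding a full scan over all cached values on every invalidation.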
Steven Wu [Tue, 30 Nov 2021 17:12:08 +0000 (09:12 -0800)]
[JITLink][ELF] Don't skip sections of size 0
Size 0 sections can have symbols that have size 0. Build those sections
and symbols into the LinkGraph so they can be used properly if needed.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D114749
Steven Wu [Tue, 30 Nov 2021 17:18:27 +0000 (09:18 -0800)]
[JITLink][ELF] Add support for reading extended table
Add support for reading extended table in ELF object file. This allows
JITLink to support ELF object files with many sections.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D114747
Hongtao Yu [Mon, 29 Nov 2021 22:28:27 +0000 (14:28 -0800)]
[CSSPGO] Sorting nodes in a cycle of profiled call graph.
For nodes that are in a cycle of a profiled call graph, the current order the underlying scc_iter computes purely depends on how those nodes are reached from outside the SCC and inside the SCC, based on the Tarjan algorithm. This does not honor profile edge hotness, thus does not guarantee hot callsites to be inlined prior to cold callsites. To mitigate that, I'm adding an extra sorter on top of scc_iter to sort scc functions in the order of callsite hotness, instead of changing the internal of scc_iter.
Sorting on callsite hotness can be optimally based on detecting cycles on a directed call graph, i.e., to remove the coldest edge until a cycle is broken. However, detecting cycles isn't cheap. I'm using an MST-based approach which is faster and appears to deliver some performance wins.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D114204
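One way to picture an MST-based ordering is the simplified Python sketch below. It is an assumption-laden illustration, not the code from this patch: the function name, the edge representation, and the choice to emit callers before callees are all hypothetical.

```python
def sort_scc_by_hotness(nodes, edges):
    """nodes: list of names; edges: list of (caller, callee, weight)."""
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Kruskal-style pass from hottest to coldest edge: keep an edge only if
    # it joins two components, so cold edges that close a cycle are dropped.
    kept = []
    for src, dst, _weight in sorted(edges, key=lambda e: -e[2]):
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_src] = root_dst
            kept.append((src, dst))

    # The kept edges form a forest, so a topological sort always succeeds.
    indegree = {n: 0 for n in nodes}
    for _, dst in kept:
        indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        for src, dst in kept:
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return order


order = sort_scc_by_hotness(
    ["A", "B", "C"],
    [("A", "B", 100), ("B", "C", 50), ("C", "A", 1)],
)
```

In this example the cold C -> A edge is the one that gets dropped, so the hot A -> B -> C chain determines the order without any explicit cycle detection.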
Paul Robinson [Tue, 30 Nov 2021 16:34:33 +0000 (08:34 -0800)]
[PS4][DWARF] Explicitly set default DWARF version to 4
Benjamin Kramer [Tue, 30 Nov 2021 16:53:19 +0000 (17:53 +0100)]
[clang][dataflow] Make header parse
Looks like this is actually dead code?
Philip Reames [Tue, 30 Nov 2021 16:45:03 +0000 (08:45 -0800)]
[LV] Remove unneeded cast to Operator [NFC]
Ryan Mansfield [Tue, 30 Nov 2021 16:17:30 +0000 (17:17 +0100)]
Fix file extension of alignment-assumption-ignorelist.cppp test
During the renaming of blacklist to ignorelist this test got renamed
incorrectly.
Differential revision: https://reviews.llvm.org/D114710
Mateja Marjanovic [Mon, 20 Sep 2021 14:05:45 +0000 (16:05 +0200)]
Code quality: Combine V_RSQ
Combine V_RCP and V_SQRT into V_RSQ on AMDGPU for GlobalISel.
Change-Id: I93c5dcb412483156a6e8b68c4085cbce83ac9703
Valentin Clement [Tue, 30 Nov 2021 16:15:42 +0000 (17:15 +0100)]
Revert "[fir] Add fir reduction builder"
This reverts commit cf3422d3df5b00d771bba837b9f51f67ab07eb64.
This fails on some buildbots
Valentin Clement [Tue, 30 Nov 2021 16:01:39 +0000 (17:01 +0100)]
[fir] Remove unused fct recordTypeCanBeMemCopied
Remove unused fct added with 47f759309eeaf9bd77debe4f6c3e1fe52913b537
gysit [Tue, 30 Nov 2021 15:47:57 +0000 (15:47 +0000)]
[mlir][linalg] Add decompose to CodegenStrategy.
Add the decompose patterns that lower higher dimensional convolutions to lower dimensional ones to CodegenStrategy and use CodegenStrategy to test the decompose patterns. Additionally, remove the assertion that checks the anchor op name is set in the CodegenStrategyTest pass. Removing the assertion allows us to simplify the pipelines used in the interchange and decompose tests.
Depends On D114797
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114798
gysit [Tue, 30 Nov 2021 15:41:21 +0000 (15:41 +0000)]
[mlir][linalg] Adapt the decompose patterns to use a filter (NFC).
The revision updates the convolution decomposition patterns to take a linalg transformation filter. The transformation filter allows a later revision to use the patterns from CodegenStrategy.
Depends On D114690
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114797
James Farrell [Tue, 16 Nov 2021 22:52:24 +0000 (22:52 +0000)]
Use VersionTuple for parsing versions in Triple. This makes it possible to distinguish between "16" and "16.0" after parsing, which previously was not possible.
See also https://github.com/android/ndk/issues/1455.
Differential Revision: https://reviews.llvm.org/D114163
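The distinction between "16" and "16.0" survives parsing only if absent components are kept distinct from zero. The Python sketch below shows that idea with a hypothetical `parse_version` helper; it is not the llvm::VersionTuple API.

```python
def parse_version(text):
    """Parse "major[.minor[.subminor]]", keeping absent components as None."""
    parts = [int(p) for p in text.split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected one to three dot-separated components")
    # Pad with None rather than 0 so "16" and "16.0" stay distinguishable.
    return tuple(parts + [None] * (3 - len(parts)))
```

A parser that fills missing components with 0 would map both spellings to the same result, which is the loss of information the commit message refers to.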
Florian Hahn [Tue, 30 Nov 2021 15:40:14 +0000 (15:40 +0000)]
[DSE] Use optimized access if available for redundant store elimination.
Using the optimized access enables additional optimizations in cases
where the defining access is a non-aliasing store.
Alternatively we could also walk upwards and skip non-aliasing defs
here, but my experiments so far showed that this will noticeably
increase compile-time for little extra gain compared to just using the
optimized access.
Improvements of dse.NumRedundantStores on MultiSource/CINT2006/CFP2006
on X86 with -O3:
test-suite...-typeset/consumer-typeset.test 1.00 76.00 7500.0%
test-suite.../Benchmarks/Bullet/bullet.test 3.00 12.00 300.0%
test-suite...006/453.povray/453.povray.test 3.00 6.00 100.0%
test-suite...telecomm-gsm/telecomm-gsm.test 1.00 2.00 100.0%
test-suite...ediabench/gsm/toast/toast.test 1.00 2.00 100.0%
test-suite...marks/7zip/7zip-benchmark.test 1.00 2.00 100.0%
test-suite...ications/JM/lencod/lencod.test 7.00 10.00 42.9%
test-suite...6/464.h264ref/464.h264ref.test 6.00 8.00 33.3%
test-suite...ications/JM/ldecod/ldecod.test 6.00 7.00 16.7%
test-suite...006/447.dealII/447.dealII.test 33.00 33.00 0.0%
test-suite...6/471.omnetpp/471.omnetpp.test NaN 1.00 nan%
test-suite...006/450.soplex/450.soplex.test NaN 2.00 nan%
test-suite.../CINT2006/403.gcc/403.gcc.test NaN 7.00 nan%
test-suite...lications/ClamAV/clamscan.test NaN 1.00 nan%
test-suite...CI_Purple/SMG2000/smg2000.test NaN 3.00 nan%
Follow-up to D111727.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D112315
gysit [Tue, 30 Nov 2021 15:31:56 +0000 (15:31 +0000)]
[mlir][linalg] Support the empty anchor op string when padding.
Add support for an empty anchor op string in vectorization. An empty anchor op string is useful after fusion when there are multiple different operations to vectorize.
Depends On D114689
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114690
Yitzhak Mandelbaum [Tue, 30 Nov 2021 15:29:20 +0000 (15:29 +0000)]
[clang][dataflow] Fix broken build in ClangStaticAnalyzer
Adds a missing virtual destructor.
gysit [Tue, 30 Nov 2021 15:23:25 +0000 (15:23 +0000)]
[mlir][linalg] Use top down traversal for padding.
Pad the operation using a top down traversal. The top down traversal unlocks folding opportunities and dim op canonicalizations due to the introduced extract slice operation after the padded operation.
Depends On D114585
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114689
David Green [Tue, 30 Nov 2021 15:29:14 +0000 (15:29 +0000)]
[DAG] Create fptosi.sat from clamped fptosi
This adds a fold in DAGCombine to create fptosi_sat from sequences for
smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
to be handled similarly.
A shouldConvertFpToSat method was added to control when converting may
be profitable. The original fptosi will have less strict semantics
than the fptosisat, with fewer values that need to produce defined
behaviour.
This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.
Differential Revision: https://reviews.llvm.org/D111976
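The relationship between the clamped pattern and a saturating conversion can be sketched in Python. The function names are hypothetical; the saturating model follows the documented behaviour of llvm.fptosi.sat (NaN maps to 0, out-of-range values clamp), and the clamped form assumes the input is finite so that the plain conversion is defined.

```python
import math


def fptosi_sat(x, bits):
    """Saturating float -> signed integer conversion of the given bit width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0
    if x <= lo:
        return lo
    if x >= hi:
        return hi
    return math.trunc(x)


def clamped_fptosi(x, bits):
    """smin(smax(fptosi(x), lo), hi); assumes x is finite so fptosi is defined."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return min(max(math.trunc(x), lo), hi)


samples = [-1e6, -129.5, -128.0, -0.5, 0.0, 127.9, 200.0, 1e6]
```

On every input where the clamped form is defined the two agree; the saturating form additionally defines results for NaN and infinities, which is the stricter semantics mentioned above.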
gysit [Tue, 30 Nov 2021 15:17:38 +0000 (15:17 +0000)]
[mlir][linalg] Fix windows build issue in hoist padding.
Iterating backwardSlice and removing elements at the same time can fail on windows for specific build configurations (the code was introduced in https://reviews.llvm.org/D114420). This revision introduces a second vector to collect all operations and removes them after finishing the reverse iteration.
Reviewed By: hpmorgan
Differential Revision: https://reviews.llvm.org/D114775
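The collect-then-remove approach can be sketched in Python as follows. The helper name `prune_reverse` is hypothetical, and the sketch assumes the container holds unique elements, as a set-like vector would.

```python
def prune_reverse(items, should_remove):
    """Remove matching elements without mutating `items` while iterating it."""
    to_remove = []
    # First pass: iterate in reverse and only record what needs to go.
    for item in reversed(items):
        if should_remove(item):
            to_remove.append(item)
    # Second pass: mutate the container once iteration has finished.
    for item in to_remove:
        items.remove(item)
    return items
```

Separating the two passes means no iterator is ever live while the container changes, so the result does not depend on how a particular implementation handles invalidated iterators.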
Joseph Huber [Tue, 30 Nov 2021 15:15:44 +0000 (10:15 -0500)]
[OpenMP] Add RTL function to externalization RAII
This patch adds the `__kmpc_get_warp_size` OpenMP RTL function to the
externalization RAII struct. This was getting optimized out and then
being replaced with an undefined value once added back in, causing bugs
for complex reductions.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D114802
gysit [Tue, 30 Nov 2021 14:48:25 +0000 (14:48 +0000)]
[mlir][linalg] Run CSE after every CodegenStrategy transformation.
Add CSE after every transformation. Transformations such as tiling introduce redundant computation, for example, one AffineMinOp for every operand dimension pair. Follow up transformations such as Padding and Hoisting benefit from CSE since comparing slice sizes simplifies to comparing SSA values instead of analyzing affine expressions.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114585
Vy Nguyen [Mon, 22 Nov 2021 22:14:20 +0000 (17:14 -0500)]
[lld-macho] Mark dylib symbols coming from -weak_framework as weak-ref.
PR:52564
Differential Revision: https://reviews.llvm.org/D114397
Valentin Clement [Tue, 30 Nov 2021 14:49:22 +0000 (15:49 +0100)]
[fir] Add fir reduction builder
This patch introduces a bunch of builder functions
to create function calls to runtime reduction functions.
This patch is part of the upstreaming effort from fir-dev branch.
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: mleair <leairmark@gmail.com>
Differential Revision: https://reviews.llvm.org/D114460
Reviewed By: awarzynski
Tobias Burnus [Tue, 30 Nov 2021 14:33:48 +0000 (14:33 +0000)]
[MC][ELF] Fix accepting abbreviated form with Type change
Follow up to D92052 and D94072, exposed due to D107707
Many assemblers permit that only the first .section contains all
the attributes like '.lds_bss,"w",@nobits' and later sections only
use the name ('.lds_bss'), inheriting those attributes from the first
section. It turned out that the case where Type changed was missed
when implementing it - and D107707 makes it much more likely to hit
that issue. That's fixed by this commit.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D114717
Stanislav Gatev [Mon, 29 Nov 2021 15:11:35 +0000 (15:11 +0000)]
[clang][dataflow] Add base types for building dataflow analyses
This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Reviewed By: ymandel, xazax.hun, gribozavr2
Differential Revision: https://reviews.llvm.org/D114234
Hans Wennborg [Tue, 30 Nov 2021 14:36:56 +0000 (15:36 +0100)]
Revert "[DAG] Create fptosi.sat from clamped fptosi"
It causes builds to fail with this assert:
llvm/include/llvm/ADT/APInt.h:990:
bool llvm::APInt::operator==(const llvm::APInt &) const:
Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed.
See comment on the code review.
> This adds a fold in DAGCombine to create fptosi_sat from sequences for
> smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
> the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
> it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
> ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
> to be handled similarly.
>
> A shouldConvertFpToSat method was added to control when converting may
> be profitable. The original fptosi will have less strict semantics
> than the fptosisat, with fewer values that need to produce defined
> behaviour.
>
> This especially helps on ARM/AArch64 where the vcvt instructions
> naturally saturate the result.
>
> Differential Revision: https://reviews.llvm.org/D111976
This reverts commit
52ff3b009388f1bef4854f1b6470b4ec19d10b0e.
Mateja Marjanovic [Tue, 30 Nov 2021 14:00:16 +0000 (15:00 +0100)]
Test commit
Change-Id: I1d310a860ed673acdc8177232c91025004b1f3d2
Florian Hahn [Tue, 30 Nov 2021 13:50:09 +0000 (13:50 +0000)]
[DSE] Add memset_chk tests.
Florian Hahn [Tue, 30 Nov 2021 13:50:01 +0000 (13:50 +0000)]
[BuildLibCalls] Add memset_chk test.
Jeremy Morse [Tue, 30 Nov 2021 12:41:59 +0000 (12:41 +0000)]
[DebugInfo] Turn instruction referencing on by default for x86
This patch is designed to be reverted -- it activates a reasonably large
block of new-ish code, so some turbulence is likely.
Instruction referencing is summarised, and turning it on by default
is discussed, here:
https://lists.llvm.org/pipermail/llvm-dev/2021-November/153653.html
Differential Revision: https://reviews.llvm.org/D114631
Simon Pilgrim [Tue, 30 Nov 2021 13:43:11 +0000 (13:43 +0000)]
[X86] Add mulh test coverage for extension to illegal type
Part of D113371 - add test coverage for case where we're truncating from an illegal type
Pavel Labath [Thu, 11 Nov 2021 18:54:39 +0000 (19:54 +0100)]
[lldb] Introduce PlatformQemuUser
This adds a new platform class, whose job is to enable running
(debugging) executables under qemu.
(For general information about qemu, I recommend reading the RFC thread
on lldb-dev
<https://lists.llvm.org/pipermail/lldb-dev/2021-October/017106.html>.)
This initial patch implements the necessary boilerplate as well as the
minimal amount of functionality needed to actually be able to do
something useful (which, in this case means debugging a fully statically
linked executable).
The knobs necessary to emulate dynamically linked programs, as well as
to control other aspects of qemu operation (the emulated cpu, for
instance) will be added in subsequent patches. Same goes for the ability
to automatically bind to the executables of the emulated architecture.
Currently only two settings are available:
- architecture: the architecture that we should emulate
- emulator-path: the path to the emulator
Even though this patch is relatively small, it doesn't lack subtleties
that are worth calling out explicitly:
- named sockets: qemu supports tcp and unix socket connections, both of
them in the "forward connect" mode (qemu listening, lldb connecting).
Forward TCP connections are impossible to realise in a race-free way.
This is the reason why I chose unix sockets as they have larger, more
structured names, which can guarantee that there are no collisions
between concurrent connection attempts.
- the above means that this code will not work on windows. I don't think
that's an issue since user mode qemu does not support windows anyway.
- Right now, I am leaving the code enabled for windows, but maybe it
would be better to disable it (otoh, disabling it means windows
developers can't check they don't break it)
- qemu-user also does not support macOS, so one could contemplate
disabling it there too. However, macOS does support named sockets, so
one can even run the (mock) qemu tests there, and I think it'd be a
shame to lose that.
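The named-socket point above can be illustrated with a minimal POSIX
sketch. It is not the code from this patch: the helper names
(makeUniqueSocketPath, listenOn, connectTo), the /tmp template and the
gdb.sock file name are all hypothetical. The idea shown is only that a
socket path inside a freshly created private directory cannot collide
with another concurrent attempt, whereas a TCP port has to be guessed.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Hypothetical helper: mkdtemp() creates a directory with a unique name, so
// a socket path inside it is unique to this connection attempt.
std::string makeUniqueSocketPath() {
  char Template[] = "/tmp/qemu-sketch-XXXXXX";
  char *Dir = mkdtemp(Template);
  assert(Dir && "mkdtemp failed");
  return std::string(Dir) + "/gdb.sock";
}

// Fills in a sockaddr_un for the given path.
static sockaddr_un makeAddress(const std::string &Path) {
  sockaddr_un Addr{};
  Addr.sun_family = AF_UNIX;
  assert(Path.size() < sizeof(Addr.sun_path) && "socket path too long");
  std::strncpy(Addr.sun_path, Path.c_str(), sizeof(Addr.sun_path) - 1);
  return Addr;
}

// "Emulator" side of the forward-connect mode: bind and listen on the path.
// Returns a file descriptor, or -1 on failure.
int listenOn(const std::string &Path) {
  sockaddr_un Addr = makeAddress(Path);
  int Fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (Fd < 0)
    return -1;
  if (bind(Fd, reinterpret_cast<sockaddr *>(&Addr), sizeof(Addr)) != 0 ||
      listen(Fd, 1) != 0) {
    close(Fd);
    return -1;
  }
  return Fd;
}

// "Debugger" side: connect to the listening socket.
// Returns a file descriptor, or -1 on failure.
int connectTo(const std::string &Path) {
  sockaddr_un Addr = makeAddress(Path);
  int Fd = socket(AF_UNIX, SOCK_STREAM, 0);
  if (Fd < 0)
    return -1;
  if (connect(Fd, reinterpret_cast<sockaddr *>(&Addr), sizeof(Addr)) != 0) {
    close(Fd);
    return -1;
  }
  return Fd;
}
```

Two calls to makeUniqueSocketPath() yield different paths, so two
concurrent sessions never race for the same endpoint. The sketch leaves
cleanup of the socket file and directory to the caller.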
Differential Revision: https://reviews.llvm.org/D114509
Pavel Labath [Fri, 26 Nov 2021 08:26:52 +0000 (09:26 +0100)]
[lldb] Inline Platform::LoadCachedExecutable into its (single) caller
Nico Weber [Tue, 30 Nov 2021 13:04:15 +0000 (08:04 -0500)]
[gn build] (semimanually) port
25a7e4b9f7c6
Valentin Clement [Tue, 30 Nov 2021 12:50:32 +0000 (13:50 +0100)]
[fir] Add array value copy pass
This patch upstreams the array value copy pass.
Transform the set of array value primitives to a memory-based array
representation.
The Ops `array_load`, `array_store`, `array_fetch`, and `array_update` are
used to manage abstract aggregate array values. A simple analysis is done
to determine if there are potential dependences between these operations.
If not, these array operations can be lowered to work directly on the memory
representation. If there is a potential conflict, a temporary is created
along with appropriate copy-in/copy-out operations. Here, a more refined
analysis might be deployed, such as using the affine framework.
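As a rough illustration of why a temporary is needed when a dependence
is possible, here is a small C++ sketch. It does not use FIR; the
function names and the interval-overlap check are hypothetical stand-ins
for the pass's analysis. It models an assignment like a(2:n) = a(1:n-1),
where the right-hand side must be read in full before any element is
written.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical conservative dependence check: do the closed index ranges
// read by the load and written by the store overlap?
bool mayConflict(std::size_t LoadLo, std::size_t LoadHi, std::size_t StoreLo,
                 std::size_t StoreHi) {
  assert(LoadLo <= LoadHi && StoreLo <= StoreHi && "ranges must be ordered");
  return LoadLo <= StoreHi && StoreLo <= LoadHi;
}

// Lowering directly onto memory: each store is visible to later loads, so
// the first element is propagated through the whole array.
std::vector<int> shiftInPlace(std::vector<int> A) {
  for (std::size_t I = 1; I < A.size(); ++I)
    A[I] = A[I - 1];
  return A;
}

// Lowering with a temporary: copy-in, update the temporary while reading the
// untouched original, then copy-out.
std::vector<int> shiftWithTemporary(std::vector<int> A) {
  std::vector<int> Tmp = A; // copy-in
  for (std::size_t I = 1; I < A.size(); ++I)
    Tmp[I] = A[I - 1]; // loads still see the original values
  A = Tmp; // copy-out
  return A;
}
```

For the input {1, 2, 3, 4}, the in-place version yields {1, 1, 1, 1}
while the version with a temporary yields the expected {1, 1, 2, 3}.
When mayConflict() returns false, the in-place form is safe and the copy
can be skipped.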
This pass is required before code gen to the LLVM IR dialect.
This patch is part of the upstreaming effort from fir-dev branch. The
pass is bringing quite a lot of files with it.
Reviewed By: kiranchandramohan, schweitz
Differential Revision: https://reviews.llvm.org/D111337
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Jeremy Morse [Tue, 30 Nov 2021 12:38:11 +0000 (12:38 +0000)]
[DebugInfo][InstrRef] Pre-land on-by-default-for-x86 changes
Over in D114631 and [0] there's a plan for turning instruction referencing
on by default for x86. This patch adds / removes all the relevant bits of
code, with the aim that the final patch is extremely small, for an easy
revert. It should just be a condition in CommandFlags.cpp and removing the
XFail on instr-ref-flag.ll.
[0] https://lists.llvm.org/pipermail/llvm-dev/2021-November/153653.html
Alexander Belyaev [Tue, 30 Nov 2021 12:27:24 +0000 (13:27 +0100)]
[mlir] Add bazel build for BufferizationToMemRef.
Alexander Belyaev [Tue, 30 Nov 2021 12:10:37 +0000 (13:10 +0100)]
[mlir] Fix BufferizationToMemRef build.
Jeremy Morse [Tue, 30 Nov 2021 11:47:58 +0000 (11:47 +0000)]
[DebugInfo][InstrRef][X86] Instrument expanded DYN_ALLOCAs
If we have a DYN_ALLOCA_* instruction, it will eventually be expanded to a
stack probe and subtract-from-SP. Add debug-info instrumentation to
X86FrameLowering::emitStackProbe so that it can redirect debug-info for the
DYN_ALLOCA to the lowered stack probe. In practice, this means putting an
instruction number label on either the call instruction to _chkstk for win32,
or more commonly on the subtract from SP instruction. The two tests added
cover both of these cases.
Differential Revision: https://reviews.llvm.org/D114452
Abinav Puthan Purayil [Tue, 30 Nov 2021 11:11:06 +0000 (16:41 +0530)]
[AMDGPU][NFC] Remove unused defvar in AMDGPUInstructions.td.
Jeremy Morse [Tue, 30 Nov 2021 11:19:56 +0000 (11:19 +0000)]
[DebugInfo][InstrRef] Avoid dropping fragment info during PHI elimination
InstrRefBasedLDV used to crash on the added test -- the exit block is not
in scope for the variable being propagated, but is still considered because
it contains an assignment. The failure-mode was vlocJoin ignoring
assign-only blocks and not updating DIExpressions, but pickVPHILoc would
still find a variable location for it. That led to DBG_VALUEs created with
the wrong fragment information.
Fix this by removing a filter inherited from VarLocBasedLDV: vlocJoin will
now consider assign-only blocks and will update their expressions.
Differential Revision: https://reviews.llvm.org/D114727
David Green [Tue, 30 Nov 2021 11:05:32 +0000 (11:05 +0000)]
[DAG] Create fptosi.sat from clamped fptosi
This adds a fold in DAGCombine to create fptosi_sat from sequences for
smin(smax(fptosi(x))) nodes, where the min/max saturate the output of
the fp convert to a specific bitwidth (say INT_MIN and INT_MAX). Because
it is dealing with smin(/smax) in DAG they may currently be ISD::SMIN,
ISD::SETCC/ISD::SELECT, ISD::VSELECT or ISD::SELECT_CC nodes which need
to be handled similarly.
A shouldConvertFpToSat method was added to control when converting may
be profitable. The original fptosi will have less strict semantics
than the fptosisat, with fewer values that need to produce defined
behaviour.
This especially helps on ARM/AArch64 where the vcvt instructions
naturally saturate the result.
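The equivalence the fold relies on can be sketched in plain C++ for an
8-bit saturation width. This is not the DAGCombine code; both function
names are hypothetical, and the i32 intermediate type and the i8 bounds
are assumptions chosen for the example. The saturating form returns 0
for NaN, matching the documented behaviour of llvm.fptosi.sat.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Clamped form: smin(smax(fptosi(x), -128), 127) with an i32 intermediate.
// The plain conversion is only defined when x fits in i32, which the assert
// models; NaN and out-of-range inputs are not allowed here.
int32_t clampedFpToSI8(float X) {
  assert(std::fabs(X) < 2147483648.0f && "plain fptosi is undefined here");
  int32_t V = static_cast<int32_t>(X);
  return std::min<int32_t>(std::max<int32_t>(V, -128), 127);
}

// Saturating form: defined for every input, including NaN and infinities.
int32_t fpToSISat8(float X) {
  if (std::isnan(X))
    return 0;
  if (X <= -128.0f)
    return -128;
  if (X >= 127.0f)
    return 127;
  return static_cast<int32_t>(X); // truncates toward zero
}
```

On every input where the clamped form is defined, the two functions
agree, which is why the replacement is legal. The saturating form
additionally gives defined results for the remaining inputs, so it is
the stricter of the two.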
Differential Revision: https://reviews.llvm.org/D111976
Louis Dionne [Mon, 29 Nov 2021 21:20:48 +0000 (16:20 -0500)]
[libc++][ABI BREAK] Do not use the C++03 emulation for std::nullptr_t by default
We only support Clangs that implement nullptr as an extension in C++03 mode,
and we don't support GCC in C++03 mode. Hence, this patch disables the
use of the std::nullptr_t emulation in C++03 mode by default. Doing that
is technically an ABI break since it changes the mangling for std::nullptr_t.
However:
(1) The only affected users are those compiling in C++03 mode that have
std::nullptr_t as part of their ABI, which should be reasonably rare.
(2) Those users already have a lingering problem in that their code will
be incompatible in C++03 and C++11 modes because of that very ABI break.
Hence, the only users that could really be inconvenienced by this
change are those that planned on compiling in C++03 mode forever - for
other users, we're just breaking them now instead of letting them break
themselves later on when they try to upgrade to C++11.
(3) The ABI break will cause a linker error since the mangling changed,
and will not result in an obscure runtime error.
Furthermore, if anyone is broken by this, they can define the
_LIBCPP_ABI_USE_CXX03_NULLPTR_EMULATION macro to return to the
previous behavior. We will then remove that macro after shipping
this for one release if we haven't seen widespread issues.
Concretely, the motivation for making this change is to make our own ABI
consistent in C++03 and C++11 modes and to remove complexity around the
definition of nullptr.
Furthermore, we could investigate making nullptr a keyword in C++03 mode
as a Clang extension -- I don't think that would break anyone, since
libc++ already defines nullptr as a macro to something else. Only users
that do not use libc++ and compile in C++03 mode could potentially be
broken by that.
Differential Revision: https://reviews.llvm.org/D109459
Guillaume Chatelet [Tue, 30 Nov 2021 10:52:34 +0000 (10:52 +0000)]
[libc] Add a reasonably optimized version for bcmp
This is based on current memcmp implementation.
Differential Revision: https://reviews.llvm.org/D114432
Guillaume Chatelet [Tue, 30 Nov 2021 10:46:16 +0000 (10:46 +0000)]
[libc] Add memmove benchmarks
This patch enables the benchmarking of `memmove`.
Ideally, this should be submitted before D114637.
Differential Revision: https://reviews.llvm.org/D114694
Jeremy Morse [Tue, 30 Nov 2021 10:21:31 +0000 (10:21 +0000)]
[DebugInfo][InstrRef] "final final" test cleanups for x86 tests
Two "totally definitely the last ones" instruction referencing test
updates:
* fp-stack.ll: this test targets i686, and so it won't be getting
instruction referencing, or at least not right now,
* X86/live-debug-values.ll: instruction referencing will produce entry
values in this test, add check lines to account for this. It's not clear
what the test is supposed to be testing anyway, but the entry values
appear to be correct.
Differential Revision: https://reviews.llvm.org/D114626
Florian Hahn [Tue, 30 Nov 2021 10:32:44 +0000 (10:32 +0000)]
[LV] Move code from widenSelectInstruction to VPWidenSelectRecipe. (NFC)
The code in widenSelectInstruction has already been transitioned
to only rely on information provided by VPWidenSelectRecipe directly.
Moving the code directly to VPWidenSelectRecipe::execute completes
the transition for the recipe.
It provides the following advantages:
1. Less indirection, easier to see what's going on.
2. Removes accesses to fields of ILV.
2) in particular ensures that no dependencies on
fields in ILV for vector code generation are re-introduced.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D114323
Gabor Marton [Thu, 11 Nov 2021 16:12:24 +0000 (17:12 +0100)]
[Analyzer][Core] Make SValBuilder to better simplify svals with 3 symbols in the tree
Add the capability to simplify more complex constraints where there are 3
symbols in the tree. In this change I extend simplifySVal to query constraints
of children sub-symbols in a symbol tree. (The constraint for the parent is
asked in getKnownValue.)
Differential Revision: https://reviews.llvm.org/D103317
Gabor Marton [Fri, 26 Nov 2021 09:59:09 +0000 (10:59 +0100)]
[Analyzer][solver] Do not remove the simplified symbol from the eq class
Currently, during symbol simplification we remove the original member symbol
from the equivalence class (`ClassMembers` trait). However, we keep the
reverse link (`ClassMap` trait), in order to be able to query the
related constraints even for the old member. This asymmetry can lead to
a problem when we merge equivalence classes:
```
ClassA: [a, b] // ClassMembers trait,
a->a, b->a // ClassMap trait, a is the representative symbol
```
Now let's delete `a`:
```
ClassA: [b]
a->a, b->a
```
Let's merge the trivial class `c` into ClassA:
```
ClassA: [c, b]
c->c, b->c, a->a
```
Now after the merge operation, `c` and `a` are actually in different
equivalence classes, which is inconsistent.
One solution to this problem is to simply avoid removing the original
member and this is what this patch does.
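The scenario can be replayed with a small toy model in C++. This is not
the analyzer's data structure: ToyEqClasses, its two maps and the helper
aAndCAgreeAfterMerge are hypothetical names that only mirror the
`ClassMembers` and `ClassMap` traits described above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model: Members plays the role of ClassMembers (representative ->
// members), ClassOf plays the role of ClassMap (symbol -> representative).
struct ToyEqClasses {
  std::map<std::string, std::set<std::string>> Members;
  std::map<std::string, std::string> ClassOf;

  void addClass(const std::string &Rep, const std::set<std::string> &Syms) {
    assert(Syms.count(Rep) && "representative must be a member");
    Members[Rep] = Syms;
    for (const std::string &S : Syms)
      ClassOf[S] = Rep;
  }

  // Old behaviour: drop the symbol from the member list only, keeping the
  // reverse link so that its constraints can still be queried.
  void removeMember(const std::string &Sym) {
    Members[ClassOf.at(Sym)].erase(Sym);
  }

  // Merge the class of OldRep into the class of NewRep. Only symbols that
  // are still listed as members get their reverse link updated.
  void mergeInto(const std::string &NewRep, const std::string &OldRep) {
    const std::set<std::string> Old = Members[OldRep];
    Members.erase(OldRep);
    for (const std::string &S : Old) {
      Members[NewRep].insert(S);
      ClassOf[S] = NewRep;
    }
  }

  bool sameClass(const std::string &A, const std::string &B) const {
    return ClassOf.at(A) == ClassOf.at(B);
  }
};

// Replays the example: ClassA = [a, b], trivial class [c], then a merge
// that makes c the representative.
bool aAndCAgreeAfterMerge(bool RemoveSimplifiedMember) {
  ToyEqClasses Classes;
  Classes.addClass("a", {"a", "b"});
  Classes.addClass("c", {"c"});
  if (RemoveSimplifiedMember)
    Classes.removeMember("a");
  Classes.mergeInto("c", "a");
  return Classes.sameClass("a", "c");
}
```

With the member removed first, `a` keeps pointing at the stale class and
aAndCAgreeAfterMerge(true) returns false. Without the removal, the merge
re-points `a` as well and the function returns true.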
Other options I have considered:
1) Always merge the trivial class into the non-trivial class. This might
work most of the time, however, will fail if we have to merge two
non-trivial classes (in that case we no longer can track equivalences
precisely).
2) In `removeMember`, update the reverse link as well. This would remove
the inconsistency, but we'd lose precision since we could not query
the constraints for the removed member.
Differential Revision: https://reviews.llvm.org/D114619
Pavel Labath [Mon, 22 Nov 2021 15:32:44 +0000 (16:32 +0100)]
[lldb] Remove 'extern "C"' from the lldb-swig-python interface
The LLDBSWIGPython functions had (at least) two problems:
- There wasn't a single source of truth (a header file) for the
prototypes of these functions. This meant that subtle differences
in copies of function declarations could go by undetected. And
not-so-subtle differences would result in strange runtime failures.
- All of the declarations had to have an extern "C" interface, because
the function definitions were being placed inside an extern "C" block
generated by swig.
This patch fixes both problems by moving the function definitions to the
%header block of the swig files. This block is not surrounded by extern
"C", and seems more appropriate anyway, as swig docs say it is meant for
"user-defined support code" (whereas the previous %wrapper code was for
automatically-generated wrappers).
It also puts the declarations into the SWIGPythonBridge header file
(which seems to have been created for this purpose), and ensures it is
included by all code wishing to define or use these functions. This
means that any differences in the declaration become a compiler error
instead of a runtime failure.
Differential Revision: https://reviews.llvm.org/D114369