Chuanqi Xu [Wed, 20 Jul 2022 02:37:23 +0000 (10:37 +0800)]
Don't treat readnone call in presplit coroutine as not access memory
To solve the readnone problems in coroutines. See
https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015
for details.
According to the discussion, we decided to fix the problem by inserting
isPresplitCoroutine() checks in different passes instead of
wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes.
In this direction, we might not be able to cover every case at first.
Let's take a "find and fix" strategy.
Reviewed By: nikic, nhaehnle, jyknight
Differential Revision: https://reviews.llvm.org/D127383
Jez Ng [Wed, 20 Jul 2022 01:54:58 +0000 (21:54 -0400)]
[lld-macho] Simplify archive loading logic
This is a follow-on to {D129556}. I've refactored the code such that
`addFile()` no longer needs to take an extra parameter. Additionally,
the "do we force-load or not" policy logic is now fully contained within
addFile, instead of being split between `addFile` and
`parseLCLinkerOptions`. This also allows us to move the `ForceLoad` (now
`LoadType`) enum out of the header file.
Additionally, we can now correctly report loads induced by
`LC_LINKER_OPTION` in our `-why_load` output.
I've also added another test to check that CLI library non-force-loads
take precedence over `LC_LINKER_OPTION` + `-force_load_swift_libs`. (The
existing logic is correct, just untested.)
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D130137
Alex Brachet [Wed, 20 Jul 2022 01:42:56 +0000 (01:42 +0000)]
[llvm-driver] Generate symlinks instead of executables for tools
When LLVM_TOOL_LLVM_DRIVER_BUILD is On, create symlinks
to llvm instead of creating the executables. Currently
this only works for install and not
install-distribution; the work for the latter will be
split out into a second patch.
Differential Revision: https://reviews.llvm.org/D127800
Sanjay Patel [Wed, 20 Jul 2022 01:25:41 +0000 (21:25 -0400)]
[x86] use zero-extending load of a byte outside of loops too (2nd try)
The first attempt missed changing test files for tools
(update_llc_test_checks.py).
Original commit message:
This implements the main suggested change from issue #56498.
Using the shorter (non-extending) instruction with only
-Oz ("minsize") rather than -Os ("optsize") is left as a
possible follow-up.
As noted in the bug report, the zero-extending load may have
shorter latency/better throughput across a wide range of x86
micro-arches, and it avoids a potential false dependency.
The cost is an extra instruction byte.
This could cause perf ups and downs from secondary effects,
but I don't think it is possible to account for those in
advance, and that will likely also depend on exact micro-arch.
This does bring LLVM x86 codegen more in line with existing
gcc codegen, so if problems are exposed they are more likely
to occur for both compilers.
Differential Revision: https://reviews.llvm.org/D129775
Jez Ng [Wed, 20 Jul 2022 01:22:27 +0000 (21:22 -0400)]
[lld-macho] Read in new addrsig format
The new format uses symbol relocations, as described in {D127637}.
Reviewed By: #lld-macho, alx32
Differential Revision: https://reviews.llvm.org/D128938
Jez Ng [Wed, 20 Jul 2022 01:22:23 +0000 (21:22 -0400)]
[MC][MachO] Change addrsig format + ensure its size is properly set
There were two problems with the previous setup:
1. We weren't setting its size, which caused problems when `__llvm_addrsig`
wasn't the last section. In particular, `__debug_line` (if created) is
generated and placed after `__llvm_addrsig`, and would result in an
invalid object file w/ overlapping sections being emitted.
2. The symbol indices could be invalidated if e.g. `llvm-strip` ran on
the object file. See discussion [here][1].
To fix both these issues, we use symbol relocations instead of encoding
symbol indices directly in the section contents. The section itself
doesn't contain any data. That sidesteps the layout problem in addition
to solving the second issue.
The corresponding LLD change to read in this new format: {D128938}.
It will fix the icf-safe.ll test failure on this diff.
[1]: https://discourse.llvm.org/t/problems-with-mach-o-address-significance-table-generation/63392/
Reviewed By: #lld-macho, alx32
Differential Revision: https://reviews.llvm.org/D127637
Konstantin Varlamov [Wed, 20 Jul 2022 01:14:44 +0000 (18:14 -0700)]
[libc++][ranges] Fix broken CI.
Hui Xie [Wed, 20 Jul 2022 00:24:23 +0000 (17:24 -0700)]
[libc++][ranges] fix `std::search_n` incorrect `static_assert`
see more detail in https://reviews.llvm.org/D124079?#3661721
Differential Revision: https://reviews.llvm.org/D130124
Lang Hames [Wed, 20 Jul 2022 00:19:58 +0000 (17:19 -0700)]
[ORC] Fix serialization / deserialization of default-constructed StringRef.
Avoids accessing the data field on zero-length strings. This is the StringRef
counterpart to the ArrayRef<char> fix in 67220c2ad72e3.
rdar://97285294
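The zero-length guard described in this message can be sketched roughly as follows. This is an illustration only, not the ORC SimplePackedSerialization code: `serializeString` and `deserializeString` are hypothetical helpers, and `std::string_view` stands in for `llvm::StringRef`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>
#include <vector>

// Hypothetical helper (not the ORC API): appends a 64-bit length followed by
// the string bytes to Out.
inline void serializeString(std::vector<char> &Out, std::string_view S) {
  const std::uint64_t Size = S.size();
  const std::size_t Offset = Out.size();
  Out.resize(Offset + sizeof(Size) + S.size());
  std::memcpy(Out.data() + Offset, &Size, sizeof(Size));
  // A default-constructed string_view has data() == nullptr. Passing a null
  // pointer to memcpy is undefined behavior even for a zero-byte copy, so the
  // call is skipped when there is nothing to copy.
  if (!S.empty())
    std::memcpy(Out.data() + Offset + sizeof(Size), S.data(), S.size());
}

// Hypothetical counterpart: reads one string starting at Offset and advances
// Offset past it. Returns false if the buffer is too short. The result points
// into In, so In must outlive S.
inline bool deserializeString(const std::vector<char> &In, std::size_t &Offset,
                              std::string_view &S) {
  std::uint64_t Size = 0;
  if (Offset > In.size() || In.size() - Offset < sizeof(Size))
    return false;
  std::memcpy(&Size, In.data() + Offset, sizeof(Size));
  Offset += sizeof(Size);
  if (In.size() - Offset < Size)
    return false;
  // Zero-length strings never touch the data pointer.
  S = Size ? std::string_view(In.data() + Offset,
                              static_cast<std::size_t>(Size))
           : std::string_view();
  Offset += static_cast<std::size_t>(Size);
  return true;
}
```

The same pattern presumably applies to the ArrayRef<char> case mentioned above: check for emptiness before handing the data pointer to memcpy.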
Konstantin Varlamov [Wed, 20 Jul 2022 00:20:56 +0000 (17:20 -0700)]
[libc++][ranges][NFC] Test that range algorithms support iterators requiring `iter_move`.
Differential Revision: https://reviews.llvm.org/D130057
Joe Loser [Fri, 3 Jun 2022 23:05:23 +0000 (17:05 -0600)]
[libc++] Define ostream nullptr inserter for >= C++17 only
The `ostream` `nullptr` inserter implemented in 3c125fe is missing a C++ version
guard. Normally, `libc++` takes the stance of backporting LWG issues to older
standards modes as was done in 3c125fe. However, backporting to older standards
modes breaks existing code in popular libraries such as `Boost.Test` and
`Google Test`, which define their own overloads for `nullptr_t`.
Instead, only apply this `operator<<` overload in C++17 or later.
Fixes https://github.com/llvm/llvm-project/issues/55861.
Differential Revision: https://reviews.llvm.org/D127033
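The shape of such a version guard can be sketched on a toy type. Everything here is hypothetical (`ToyStream`, `TOY_HAS_NULLPTR_INSERTER`, `describeNull`); it only illustrates declaring a `nullptr_t` inserter for C++17 and later, and is not the libc++ source.

```cpp
#include <cstddef>
#include <string>

// Hypothetical toy stream standing in for std::basic_ostream.
struct ToyStream {
  std::string Text;
};

inline ToyStream &operator<<(ToyStream &OS, const char *S) {
  OS.Text += S;
  return OS;
}

// "Library side": the nullptr_t inserter is only declared in C++17 or later,
// so older standards modes leave room for user-provided overloads.
#if __cplusplus >= 201703L
inline ToyStream &operator<<(ToyStream &OS, std::nullptr_t) {
  OS.Text += "nullptr";
  return OS;
}
#define TOY_HAS_NULLPTR_INSERTER 1
#else
#define TOY_HAS_NULLPTR_INSERTER 0
#endif

// "User side": code that behaves the same in either standards mode.
inline std::string describeNull() {
  ToyStream OS;
#if TOY_HAS_NULLPTR_INSERTER
  OS << nullptr; // exact match on nullptr_t beats the const char* overload
#else
  OS << "nullptr"; // no library overload here; spell the text out
#endif
  return OS.Text;
}
```

A library that ships its own `nullptr_t` overload could guard it the opposite way, so that exactly one definition is visible in each mode.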
Qwinci [Wed, 20 Jul 2022 00:02:24 +0000 (20:02 -0400)]
Argument name support for function pointer signature hints
Fixes https://github.com/clangd/clangd/issues/1068
Reviewed By: nridge
Differential Revision: https://reviews.llvm.org/D125120
River Riddle [Tue, 19 Jul 2022 20:21:36 +0000 (13:21 -0700)]
[mlir][NFC] Split out various tests from IR/invalid.mlir
This file contains a huge number of tests that should really be in
different dialect/files. It is monolithic because of the legacy
surrounding the old standard dialect, affine operations, etc. Splitting
this up makes the tests much more maintainable given that they are now
grouped with other similar tests.
Volodymyr Sapsai [Mon, 27 Jun 2022 23:53:36 +0000 (16:53 -0700)]
[ODRHash diagnostics] Preparation to minimize subsequent diffs. NFC.
Specifically, making the following changes:
* Turn lambdas calculating ODR hashes into static functions.
* Move `ODRCXXRecordDifference` where it is used.
* Rename some variables and move some lines of code.
* Replace `auto` with explicit type when the deduced type is not mentioned.
* Add `const` for unmodified objects, so we can pass them to more functions.
Differential Revision: https://reviews.llvm.org/D128690
Johannes Doerfert [Sat, 9 Jul 2022 18:52:49 +0000 (13:52 -0500)]
[Attributor] Teach checkForAllUses to follow returns into callers
If we can determine all call sites we can follow a use in a return
instruction into the caller. AAPointerInfo utilizes this feature.
Johannes Doerfert [Sat, 9 Jul 2022 00:54:04 +0000 (19:54 -0500)]
[Attributor][NFC] Improve debug messages
Sriraman Tallam [Tue, 19 Jul 2022 23:09:11 +0000 (16:09 -0700)]
[bolt] std::atomic_uint64_t to std::atomic<uint64_t>
Differential Revision: https://reviews.llvm.org/D129903
Sriraman Tallam [Fri, 15 Jul 2022 22:08:08 +0000 (15:08 -0700)]
Bazel BUILD file for BOLT.
Differential Revision: https://reviews.llvm.org/D129899
Slava Zakharin [Tue, 19 Jul 2022 20:26:35 +0000 (13:26 -0700)]
[mlir] Fixed ordering of pass statistics.
The change makes sure the plain C string statistics names
are properly ordered.
Differential Revision: https://reviews.llvm.org/D130122
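The message does not say what went wrong with the ordering. One common pitfall with plain C string names, assumed here purely for illustration, is ordering by pointer value instead of by contents. A minimal sketch of sorting by contents, with a hypothetical `Stat` record that is not the MLIR type:

```cpp
#include <algorithm>
#include <cstring>
#include <vector>

// Hypothetical statistic record; the name is a plain C string.
struct Stat {
  const char *Name;
  unsigned Value;
};

// Orders statistics by the characters of their names. Comparing the
// `const char *` values directly would order by address, which is unrelated
// to the text.
inline void sortByName(std::vector<Stat> &Stats) {
  std::sort(Stats.begin(), Stats.end(), [](const Stat &L, const Stat &R) {
    return std::strcmp(L.Name, R.Name) < 0;
  });
}
```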
LLVM GN Syncbot [Tue, 19 Jul 2022 22:44:22 +0000 (22:44 +0000)]
[gn build] Port 1b1f1c778695
Siva Chandra Reddy [Tue, 19 Jul 2022 19:19:52 +0000 (19:19 +0000)]
[libc] Add a method `find_last_of` to StringView.
Reviewed By: jeffbailey
Differential Revision: https://reviews.llvm.org/D130112
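The message does not show the new method's signature. As a rough sketch of what a single-character `find_last_of` typically does, here is a hypothetical free function over `std::string_view`. The name `findLastOf`, the inclusive `End` parameter, and the `NPos` constant are all assumptions, not the LLVM libc StringView API.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical "not found" marker, mirroring std::string_view::npos.
constexpr std::size_t NPos = static_cast<std::size_t>(-1);

// Returns the index of the last occurrence of C in S at or before index End,
// or NPos if there is none. End defaults to "search the whole string".
constexpr std::size_t findLastOf(std::string_view S, char C,
                                 std::size_t End = NPos) {
  // One past the last index to inspect, clamped to the string length.
  std::size_t I = End < S.size() ? End + 1 : S.size();
  while (I > 0) {
    --I;
    if (S[I] == C)
      return I;
  }
  return NPos;
}
```

A typical use is splitting a path at its last separator, e.g. calling `findLastOf(Path, '/')` and taking the substring after the returned index.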
Zequan Wu [Tue, 19 Jul 2022 19:58:56 +0000 (12:58 -0700)]
[LLDB][NativePDB] Add MSInheritanceAttr when creating pointer type that is a pointer to member.
Differential Revision: https://reviews.llvm.org/D129807
Anubhab Ghosh [Tue, 19 Jul 2022 22:28:55 +0000 (15:28 -0700)]
Re-re-apply 5acd47169884, Add a shared-memory based orc::MemoryMapper...
...with more fixes.
The original patch was reverted in 3e9cc543f22 due to bot failures caused by
a missing dependence on librt. That issue was fixed in 32d8d23cd0, but that
commit also broke sanitizer bots due to a bug in SimplePackedSerialization:
empty ArrayRef<char>s triggered a zero-byte memcpy from a null source. The
ArrayRef<char> serialization issue was fixed in 67220c2ad7, and this patch has
also been updated with a new custom SharedMemorySegFinalizeRequest message that
should avoid serializing empty ArrayRefs in the first place.
https://reviews.llvm.org/D128544
Nick Desaulniers [Tue, 19 Jul 2022 21:59:07 +0000 (14:59 -0700)]
Revert "[Local] Allow creating callbr with duplicate successors"
This reverts commit 08860f525a2363ccd697ebb3ff59769e37b1be21.
Crashes during PPC64LE linux kernel builds as reported by @nathanchance.
https://reviews.llvm.org/D129997#3663632
Lang Hames [Tue, 19 Jul 2022 22:00:32 +0000 (15:00 -0700)]
[JITLink] Hook up prebuilt cache in DWARFRecordSectionSplitter::processBlock.
DWARFRecordSectionSplitter pre-builds a splitBlock cache, but wasn't passing it
to the call to splitBlock. This was an oversight in the original patch.
Kaining Zhong [Tue, 19 Jul 2022 21:43:30 +0000 (17:43 -0400)]
[lld-macho] Fix loading same libraries from both LC_LINKER_OPTION and command line
This fixes https://github.com/llvm/llvm-project/issues/56059 and
https://github.com/llvm/llvm-project/issues/56440. This is inspired by
tapthaker's patch (https://reviews.llvm.org/D127941), and has reused his
test cases. This patch adds a bool "isCommandLineLoad" to indicate
where archives are from. If lld tries to load the same library loaded
previously by LC_LINKER_OPTION from CLI, it will use this
isCommandLineLoad to determine if it should be affected by -all_load &
-ObjC flags. This also prevents -force_load from affecting archives
loaded previously from CLI without such flag, whereas tapthaker's patch
will fail such test case (introduced by
https://reviews.llvm.org/D128025).
Reviewed By: int3, #lld-macho
Differential Revision: https://reviews.llvm.org/D129556
Jacques Pienaar [Tue, 19 Jul 2022 21:42:57 +0000 (14:42 -0700)]
[mlir] Flip LinAlg dialect to _Both
This one required more changes than ideal due to overlapping generated name
with different return types. Changed getIndexingMaps to getIndexingMapsArray to
move it out of the way/highlight that it returns (more expensively) a
SmallVector and uses the prefixed name for the Attribute.
Differential Revision: https://reviews.llvm.org/D129919
Sanjay Patel [Tue, 19 Jul 2022 21:36:27 +0000 (17:36 -0400)]
Revert "[x86] use zero-extending load of a byte outside of loops too"
This reverts commit 9d1ea1774c51c44ddf0b5065bf600919988d7015.
There are tests of update_llc_test_checks.py that missed being updated.
Johannes Doerfert [Tue, 21 Jun 2022 15:30:10 +0000 (10:30 -0500)]
[Attributor] Replace AAValueSimplify with AAPotentialValues
For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
only as an afterthought. `genericValueTraversal` did offer an option
but `AAValueSimplify` did not. Thus, we might end up with "too much"
simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
problems like the infinite recursion bug (#54981) as well as code
duplication.
This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.
`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.
We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good even if some tests look like they regress.
Fixes: https://github.com/llvm/llvm-project/issues/54981
Note: A previous version was flawed and consequently reverted in
6555558a80589d1c5a1154b92cc3af9495f8f86c.
Philip Reames [Tue, 19 Jul 2022 21:15:03 +0000 (14:15 -0700)]
[LV] Autogen a partially autogened test for ease of update
Louis Dionne [Tue, 19 Jul 2022 21:16:53 +0000 (17:16 -0400)]
[libc++][NFC] Add commit SHA for ABI change
Louis Dionne [Tue, 19 Jul 2022 15:04:31 +0000 (11:04 -0400)]
[libc++] Drop the legacy debug mode symbols by default
Leave the escape hatch in place with a note, but don't include the
debug mode symbols by default since we don't support the debug mode
in the normal library anymore.
This is technically an ABI break for users who were depending on
those debug mode symbols in the dylib, however those users will
already be broken at compile-time because they must have been using
_LIBCPP_DEBUG=2, which is now an error.
Differential Revision: https://reviews.llvm.org/D127360
Slava Zakharin [Mon, 18 Jul 2022 22:49:24 +0000 (15:49 -0700)]
[flang] Support late math lowering for intrinsics from the llvm table.
mathOperations should now support all intrinsics that are handled
by the llvmIntrinsics table + `tan` lowered as Math dialect operation +
f128 flavor of abs.
I am going to flip the default to late math lowering after this change,
but still keep the fallback via pgmath. This will allow getting rid
of the llvmIntrinsics table and continue populating
only the mathOperations table, otherwise, updating both tables
seems to be inconvenient.
Differential Revision: https://reviews.llvm.org/D130048
Keith Smiley [Sat, 16 Jul 2022 18:26:44 +0000 (11:26 -0700)]
[lld-macho] Add support for -alias
This creates a symbol alias similar to --defsym in the elf linker. This
is used by swiftpm for all executables, so it's useful to support. This
doesn't implement -alias_list but that could be done pretty easily as
needed.
Differential Revision: https://reviews.llvm.org/D129938
Nico Weber [Tue, 19 Jul 2022 20:50:53 +0000 (16:50 -0400)]
[gn build] (manually) port c91ce941448 (HTMLForestResources.inc)
Sanjay Patel [Tue, 19 Jul 2022 20:11:14 +0000 (16:11 -0400)]
[x86] use zero-extending load of a byte outside of loops too
This implements the main suggested change from issue #56498.
Using the shorter (non-extending) instruction with only
-Oz ("minsize") rather than -Os ("optsize") is left as a
possible follow-up.
As noted in the bug report, the zero-extending load may have
shorter latency/better throughput across a wide range of x86
micro-arches, and it avoids a potential false dependency.
The cost is an extra instruction byte.
This could cause perf ups and downs from secondary effects,
but I don't think it is possible to account for those in
advance, and that will likely also depend on exact micro-arch.
This does bring LLVM x86 codegen more in line with existing
gcc codegen, so if problems are exposed they are more likely
to occur for both compilers.
Differential Revision: https://reviews.llvm.org/D129775
Sanjay Patel [Tue, 19 Jul 2022 18:54:57 +0000 (14:54 -0400)]
[x86] add tests for fixup-bw with size optimization attrs; NFC
Sam McCall [Mon, 18 Jul 2022 12:52:11 +0000 (14:52 +0200)]
[pseudo] Add `clang-pseudo -html-forest=<output.html>`, an HTML forest browser
It generates a standalone HTML file with all needed JS/CSS embedded.
This allows navigating the tree both with a tree widget and in the code,
inspecting nodes, and selecting ambiguous alternatives.
Demo: https://htmlpreview.github.io/?https://gist.githubusercontent.com/sam-mccall/03882f7499d293196594e8a50599a503/raw/ASTSignals.cpp.html
Differential Revision: https://reviews.llvm.org/D130004
Arthur Eubanks [Tue, 19 Jul 2022 20:10:37 +0000 (13:10 -0700)]
[test] Convert some tests to use opaque pointers
Denys Petrov [Sat, 16 Jul 2022 09:05:22 +0000 (12:05 +0300)]
[analyzer][NFC] Use `SValVisitor` instead of explicit helper functions
Summary: Get rid of explicit function splitting in favor of specifically designed Visitor. Move logic from a family of `evalCastKind` and `evalCastSubKind` helper functions to `SValVisitor`.
Differential Revision: https://reviews.llvm.org/D130029
Benoit Jacob [Tue, 19 Jul 2022 14:11:18 +0000 (14:11 +0000)]
Don't combine if there would remain no true reduction dim.
Differential Revision: https://reviews.llvm.org/D130109
Rajas Vanjape [Tue, 19 Jul 2022 19:18:55 +0000 (19:18 +0000)]
[mlir][docs] Fix pass manager document
The code example for pass manager incorrectly uses nestedFunctionPM
instead of nestedAnyPm for adding CSE and Canonicalize Passes. This diff fixes
it by changing it to nestedAnyPm.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D130110
Philip Reames [Tue, 19 Jul 2022 19:29:07 +0000 (12:29 -0700)]
[LV] Add test for generic predicated sdiv
Louis Dionne [Wed, 6 Jul 2022 22:10:10 +0000 (18:10 -0400)]
[clang] Add a new flag -fexperimental-library to enable experimental library features
Based on the discussion at [1], this patch adds a Clang flag called
-fexperimental-library that controls whether experimental library
features are provided in libc++. In essence, it links against the
experimental static archive provided by libc++ and defines a feature
that can be picked up by libc++ to enable experimental features.
This ensures that users don't start depending on experimental
(and hence unstable) features unknowingly.
[1]: https://discourse.llvm.org/t/rfc-a-compiler-flag-to-enable-experimental-unstable-language-and-library-features
Differential Revision: https://reviews.llvm.org/D121141
David Green [Tue, 19 Jul 2022 18:36:08 +0000 (19:36 +0100)]
[ARM] Update atomic tests for D129695. NFC
Philip Reames [Tue, 19 Jul 2022 18:16:34 +0000 (11:16 -0700)]
[LV] Add test coverage for a bug in srem handling
Kamau Bridgeman [Mon, 18 Jul 2022 15:18:35 +0000 (10:18 -0500)]
[TSAN] Disable clone_setns test case on PPC64 RHEL 7.9 Targets
The compiler-rt test case tsan/Linux/clone_setns.cpp fails on
PowerPC64 RHEL 7.9 targets.
Unshare fails with errno code EINVAL.
It is unclear why this happens specifically on RHEL 7.9 and no other
operating system like Ubuntu 18 or RHEL 8.4 for example.
This patch uses macros to disable the test case for ppc64 rhel7.9
because there are no XFAIL directives to target rhel 7.9 specifically.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D130086
Nico Weber [Tue, 19 Jul 2022 18:23:46 +0000 (14:23 -0400)]
[gn build] fix typo
Nico Weber [Tue, 19 Jul 2022 18:23:26 +0000 (14:23 -0400)]
[gn build] (manually) port e939bf67e340
Jonathan Peyton [Wed, 18 May 2022 20:38:14 +0000 (15:38 -0500)]
[OpenMP][libomp] Fix affinity warnings and unify under one macro
Warnings that occur during affinity initialization are supposed
to be guarded by KMP_AFFINITY=nowarnings,noverbose, but some had been
missed by this logic. Create one macro for affinity warnings that takes
these settings into account.
Differential Revision: https://reviews.llvm.org/D125991
AndreyChurbanov [Wed, 18 May 2022 21:06:20 +0000 (16:06 -0500)]
[OpenMP][libomp] Allow reset affinity mask after parallel
Added control to reset affinity of primary thread after outermost parallel
region to initial affinity encountered before OpenMP runtime was initialized.
KMP_AFFINITY environment variable reset/noreset modifier introduced.
Default behavior is unchanged.
Differential Revision: https://reviews.llvm.org/D125993
Jonathan Peyton [Thu, 19 May 2022 17:17:43 +0000 (12:17 -0500)]
[OpenMP][libomp] Fix fallthrough attribute detection for Intel compilers
icc does not properly detect the lack of the fallthrough attribute since it
defines __GNUC__ > 7, and icc's __has_cpp_attribute/__has_attribute
feature detectors also fail to detect the lack of the fallthrough attribute.
Differential Revision: https://reviews.llvm.org/D126001
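A detection chain along these lines can be sketched as below. This is not the libomp macro: `SKETCH_FALLTHROUGH` and `countDown` are hypothetical, and the branch order (classic Intel compiler first, identified by its predefined `__INTEL_COMPILER` macro) is an assumption based on the description above.

```cpp
// Fallback for compilers without the feature-test macro.
#ifndef __has_cpp_attribute
#define __has_cpp_attribute(x) 0
#endif

#if defined(__INTEL_COMPILER)
// Classic icc: the feature checks below cannot be trusted, so force the
// annotation off.
#define SKETCH_FALLTHROUGH() ((void)0)
#elif __cplusplus > 201402L && __has_cpp_attribute(fallthrough)
#define SKETCH_FALLTHROUGH() [[fallthrough]]
#elif defined(__GNUC__) && __GNUC__ >= 7
#define SKETCH_FALLTHROUGH() __attribute__((fallthrough))
#else
#define SKETCH_FALLTHROUGH() ((void)0)
#endif

// Counts how many case labels are visited when falling through from N.
inline int countDown(int N) {
  int Steps = 0;
  switch (N) {
  case 2:
    ++Steps;
    SKETCH_FALLTHROUGH();
  case 1:
    ++Steps;
    SKETCH_FALLTHROUGH();
  default:
    break;
  }
  return Steps;
}
```

The annotation only silences fallthrough warnings; the behavior of the switch is the same on every branch of the macro.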
Philip Reames [Tue, 19 Jul 2022 17:56:59 +0000 (10:56 -0700)]
[LV] Add test coverage for scalable div/rem patterns
AndreyChurbanov [Thu, 19 May 2022 17:06:01 +0000 (12:06 -0500)]
[OpenMP][libomp] Fix /dev/shm pollution after forked child process terminates
Made library registration conditional and skip it in the __kmp_atfork_child
handler, postponing it until middle initialization in the child.
This fixes the problem for applications that use e.g. popen/pclose,
which terminate the forked child process.
Differential Revision: https://reviews.llvm.org/D125996
Yusra Syeda [Tue, 19 Jul 2022 17:34:09 +0000 (13:34 -0400)]
[SystemZ][z/OS] Introduce CCAssignToRegAndStack to calling convention
Differential Revision: https://reviews.llvm.org/D127328
Cole Kissane [Tue, 19 Jul 2022 17:54:35 +0000 (10:54 -0700)]
[llvm] add zstd to `llvm::compression` namespace
- add zstd to `llvm::compression` namespace
- add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB`
- add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp`
- Debian users should install libzstd when using `LLVM_ENABLE_ZSTD=FORCE_ON` from source due to this bug https://bugs.launchpad.net/ubuntu/+source/libzstd/+bug/1941956
Reviewed By: leonardchan, MaskRay
Differential Revision: https://reviews.llvm.org/D128465
Jez Ng [Tue, 19 Jul 2022 17:18:54 +0000 (13:18 -0400)]
[lld-macho] Support folding of functions with identical LSDAs
To do this, we need to slice away the LSDA pointer, just like we are
slicing away the functionAddress pointer.
No observable difference in perf on chromium_framework:
             base           diff           difference (95% CI)
sys_time     1.769 ± 0.068  1.761 ± 0.065  [ -2.7% .. +1.8%]
user_time    9.517 ± 0.110  9.528 ± 0.116  [ -0.6% .. +0.8%]
wall_time    8.291 ± 0.174  8.307 ± 0.183  [ -1.1% .. +1.5%]
samples      21             25
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D129830
Jeff Niu [Tue, 19 Jul 2022 17:25:24 +0000 (10:25 -0700)]
Revert "[mlir][ods] (NFC) Remove warning in `AttrOrTypeDef`"
This reverts commit e45ef5ebf4402e553c9a0b10e8765811cc33bbdd.
Jon Chesterfield [Tue, 19 Jul 2022 16:59:45 +0000 (17:59 +0100)]
Revert "[Libomptarget] Make libomptarget an LLVM library"
This reverts commit 70039be62774ae8fc53bb3b8f1bdbd2b0efb3355.
Jon Chesterfield [Tue, 19 Jul 2022 16:46:17 +0000 (17:46 +0100)]
[amdgpu] Implement lds kernel id intrinsic
Implement an intrinsic for use in lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.
There are a number of implicit arguments accessed by intrinsics already,
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.
It is necessary in the general case to put variables at different addresses
such that they can be compactly allocated and thus necessary for an
indirect function call to have some means of determining where a
given variable was allocated. Claiming an arbitrary SGPR into which
an integer can be written by the kernel, in this implementation based
on metadata associated with that kernel, which is then passed on to
indirect call sites is sufficient to determine the variable address.
The intent is to emit a __const array of LDS addresses and index into it.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D125060
Tarun Prabhu [Tue, 19 Jul 2022 16:24:29 +0000 (16:24 +0000)]
Bitwise comparison intrinsics
This patch implements lowering for the F08 bitwise comparison intrinsics
(BGE, BGT, BLE and BLT).
This does not create any runtime functions since the functionality is
simple enough to carry out in IR.
The existing semantic check has been changed because it unconditionally
converted the arguments to the largest possible integer type. This
resulted in the argument with the smaller bit-size being sign-extended.
However, the standard requires the argument with the smaller bit-size to
be zero-extended.
Reviewed By: klausler, jeanPerier
Differential Revision: https://reviews.llvm.org/D127805
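The difference between the two extensions can be illustrated outside of Fortran. The sketch below models BGE(I, J) for one fixed pair of widths (8-bit I, 32-bit J); `bge` and `bgeSignExtended` are hypothetical helpers written for this note, not code from the patch.

```cpp
#include <cstdint>

// Models BGE(I, J): compare the bit patterns as unsigned values, with the
// narrower argument zero-extended, as the standard requires.
inline bool bge(std::int8_t I, std::int32_t J) {
  // int8_t -> uint8_t keeps the 8 bits; uint8_t -> uint32_t zero-extends.
  const std::uint32_t WideI = static_cast<std::uint8_t>(I);
  const std::uint32_t WideJ = static_cast<std::uint32_t>(J);
  return WideI >= WideJ;
}

// Models the behavior described as incorrect: the narrower argument is
// sign-extended before the unsigned comparison.
inline bool bgeSignExtended(std::int8_t I, std::int32_t J) {
  const std::uint32_t WideI =
      static_cast<std::uint32_t>(static_cast<std::int32_t>(I));
  const std::uint32_t WideJ = static_cast<std::uint32_t>(J);
  return WideI >= WideJ;
}
```

With I = -1 (bit pattern 0xFF) and J = 256, zero-extension compares 255 against 256 and yields false, whereas sign-extension compares 0xFFFFFFFF against 256 and yields true.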
Joseph Huber [Fri, 15 Jul 2022 16:10:18 +0000 (12:10 -0400)]
[Libomptarget] Make libomptarget an LLVM library
This patch makes libomptarget depend on LLVM libraries to be built. The
reason is that we already have an implicit dependency on
LLVM headers for ELF identification and extraction as well as an
optional dependency on the LLVMSupport library for time tracing
information. Furthermore, there are changes in the future that require
using more LLVM libraries, and will heavily simplify some future code as
well as open up the large amount of useful LLVM libraries to
libomptarget.
This will make "standalone" builds of `libomptarget` more difficult for
vendors wishing to ship their own. This will require a sufficiently new
version of LLVM to be installed on the system that should be picked up
by the existing handling for the implicit headers.
The things this patch changes are as follows:
- `libomptarget.so` links against LLVMSupport and LLVMObject
- `libomptarget.so` is a symbolic link to `libomptarget.so.15`
- If using a shared library build, user applications will depend on LLVM
libraries as well
- We can now use LLVM resources in Libomptarget.
Note that this patch only changes this to apply to libomptarget itself,
not the plugins. Additional patches will be necessary for that.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D129875
Arthur Eubanks [Tue, 12 Apr 2022 23:22:49 +0000 (16:22 -0700)]
[DSE] Revisit pointers that may no longer escape after removing another store
In dependent-capture, previously we'd see that %tmp4 is captured due to
the first store. We'd cache this info in CapturedBeforeReturn and
InvisibleToCallerAfterRet. The first store is then removed, causing
the cached values to be wrong.
We also need to revisit everything because normally we work backwards
when removing stores at the end of the function, but in this case
removing an earlier store causes a later store to be removable.
No compile time impact:
https://llvm-compile-time-tracker.com/compare.php?from=56796ae1a8db4c85dada28676f8303a5a3609c63&to=21b7e5248ffc423cd36c9d4a020085e363451465&stat=instructions
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D123686
Sanjay Patel [Tue, 19 Jul 2022 15:55:30 +0000 (11:55 -0400)]
[SimplifyLibCalls] avoid converting pow() to powi() with no FMF
powi() is not a standard math library function; it is specified
with non-strict semantics in the LangRef. We currently require
'afn' to do this transform when it needs a sqrt(), so I just
extended that requirement to the whole-number exponent too.
This bug was introduced with:
b17754bcaa14
...where we deferred expansion of pow() to later passes.
Jon Chesterfield [Tue, 19 Jul 2022 16:17:22 +0000 (17:17 +0100)]
[nfc][amdgpu] LDS. Move selection logic up the stack.
Akira Hatanaka [Mon, 11 Jul 2022 17:01:23 +0000 (10:01 -0700)]
[libclang][ObjC] Inherit availability attribute from containing decls or interface decls
This patch teaches getCursorPlatformAvailabilityForDecl to look for
availability attributes on the containing decls or interface decls if
the current decl doesn't have any availability attributes.
Differential Revision: https://reviews.llvm.org/D129504
Jeff Niu [Tue, 19 Jul 2022 16:14:52 +0000 (09:14 -0700)]
[mlir][ods] (NFC) Remove warning in `AttrOrTypeDef`
This warning was added because using attribute or type assembly formats
with `skipDefaultBuilders` set could cause compilation errors, since the
required builder prototype may not necessarily be generated and would
need to be checked by hand. This patch removes the warning because a
warning that the generated C++ "might" not compile is not particularly
useful. Attempting to address the TODO (i.e. detect whether a builder of
the correct prototype is provided) would be fragile since it would not
be possible to account for implicit conversions, etc.
In general, ODS should not be emitting warnings in cases like these.
bhatuzdaname [Tue, 19 Jul 2022 15:54:24 +0000 (08:54 -0700)]
[mlir][tblgen] Add support for extraClassDefinition in AttrDef
For AttrDef declarations, place specified code in extraClassDefinition into the generated *.cpp.inc file.
Reviewed By: Mogball, rriddle
Differential Revision: https://reviews.llvm.org/D129574
David Truby [Tue, 19 Jul 2022 13:13:10 +0000 (14:13 +0100)]
[llvm][SVE] Remove redundant and when comparing against extending load
When determining if an `and` should be merged into an extending load
the constant argument to the `and` is currently not checked if the
argument requires truncation. This prevents the combine happening when
the vector width is half the normal available vector width for SVE VLA
vectors.
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D129281
Arthur Eubanks [Thu, 16 Jun 2022 20:30:12 +0000 (13:30 -0700)]
[NewPM] Print function/SCC size with -debug-pass-manager
This is helpful for debugging issues with very large functions or SCCs.
It is also helpful when function names are very large and it's hard to tell the number of nodes in an SCC.
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D128003
lipracer [Tue, 19 Jul 2022 15:48:34 +0000 (08:48 -0700)]
[mlir][NFC] Use proper c++ namespaces in .td files
td files:
mlir::ArrayRef => llvm::ArrayRef
mlir::Optional => llvm::Optional
mlir::SmallVector => llvm::SmallVector
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D128537
Nico Weber [Tue, 19 Jul 2022 15:50:54 +0000 (11:50 -0400)]
[gn build] (manually) port 4539b44148918 (llvm-dwarfutil)
Nico Weber [Tue, 19 Jul 2022 15:38:00 +0000 (11:38 -0400)]
[gn build] (manually) port 8711fcae276a593
Benjamin Kramer [Tue, 19 Jul 2022 15:34:39 +0000 (17:34 +0200)]
Benjamin Kramer [Tue, 19 Jul 2022 15:12:09 +0000 (17:12 +0200)]
[bazel] Remove libraries that don't build anymore after 5e83a5b4752da6631d79c446f21e5d128b5c5495
I don't know who uses these python extensions, probably nobody.
Louis Dionne [Thu, 30 Jun 2022 15:57:52 +0000 (11:57 -0400)]
[libc++] Treat incomplete features just like other experimental features
In particular remove the ability to expel incomplete features from the
library at configure-time, since this can now be done through the
_LIBCPP_ENABLE_EXPERIMENTAL macro.
Also, never provide symbols related to incomplete features inside the
dylib, instead provide them in c++experimental.a (this changes the
symbols list, but not for any configuration that should have shipped).
Differential Revision: https://reviews.llvm.org/D128928
Louis Dionne [Tue, 19 Jul 2022 14:44:06 +0000 (10:44 -0400)]
[libc++] Re-apply "Always build c++experimental.a"
This re-applies bb939931a1ad, which had been reverted by 09cebfb978de
because it broke Chromium. The issues seen by Chromium should be
addressed by 1d0f79558ca4.
Differential Revision: https://reviews.llvm.org/D128927
Louis Dionne [Tue, 19 Jul 2022 14:40:26 +0000 (10:40 -0400)]
[libc++] Make sure cxx_experimental links against libc++ headers
This should fix builds where we build neither the static nor the shared
library.
Nicolai Hähnle [Tue, 19 Jul 2022 14:39:05 +0000 (16:39 +0200)]
Revert "Update some more tests with update_cc_test_checks.py"
This reverts commit 9fb33d52b045b6cc97f2f56fe5cd23b41de86ffe.
Buildbots are showing a number of regressions that don't reproduce
locally. Needs more investigating.
Arnold Schwaighofer [Mon, 18 Jul 2022 17:46:57 +0000 (10:46 -0700)]
[coro async] Add missing llvm.coro.id.async intrinsic to declaresCoroCleanupIntrinsics
rdar://97214593
Differential Revision: https://reviews.llvm.org/D130038
Daniil Dudkin [Tue, 19 Jul 2022 14:22:39 +0000 (17:22 +0300)]
[flang][NFC] Drop `AbstractResultOptions` structure
`AbstractResultOptions` is an obsolete structure because `newArg` is used
only in `ReturnOpConversion`.
This change removes the struct, making the dependencies of the conversions
more straightforward.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D129485
Nicolai Hähnle [Tue, 19 Jul 2022 06:58:31 +0000 (08:58 +0200)]
Update some more tests with update_cc_test_checks.py
Joe Nash [Fri, 15 Jul 2022 17:49:02 +0000 (13:49 -0400)]
[AMDGPU] Remove old operand from VOPC DPP
For most DPP instructions, the old operand stores the value that was in
the current lane before the DPP operation, and is tied to the
destination. For VOPC DPP, this is unnecessary and incorrect.
There appears to have been a latent bug related to D122737 with
SIInstrInfo::isOperandLegal. If you checked if a register operand was legal
when the InstructionDesc expected an immediate, it reported that it is valid.
The fix for that bug is necessary for, and tested in, this patch.
Reviewed By: foad, rampitec
Differential Revision: https://reviews.llvm.org/D130040
Andrew Turner [Wed, 18 May 2022 11:02:26 +0000 (12:02 +0100)]
Add the FreeBSD AArch64 memory layout
Use the FreeBSD AArch64 memory layout values when building for it.
These are based on the x86_64 values, scaled to take into account the
larger address space on AArch64.
Reviewed by: vitalybuka
Differential Revision: https://reviews.llvm.org/D125883
Andrew Turner [Mon, 16 May 2022 16:20:52 +0000 (17:20 +0100)]
Add the FreeBSD AArch64 shadow offset to llvm
AArch64 has a larger address space than 64-bit x86. Use the larger
shadow offset on FreeBSD AArch64.
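For readers unfamiliar with shadow offsets, the following is a minimal sketch of an AddressSanitizer-style mapping from an application address to its shadow address. It assumes the usual scale of 3 (one shadow byte per 8 application bytes). The constant `kIllustrativeShadowOffset` and the function name `memToShadow` are hypothetical and are not taken from this patch.

```cpp
#include <cassert>
#include <cstdint>

// One shadow byte describes 2^3 = 8 application bytes (assumed scale).
constexpr unsigned kShadowScale = 3;

// Hypothetical offset, for illustration only; not the value added by the patch.
constexpr uint64_t kIllustrativeShadowOffset = 1ULL << 47;

// Map an application address to the address of its shadow byte.
uint64_t memToShadow(uint64_t addr,
                     uint64_t shadowOffset = kIllustrativeShadowOffset) {
  // Guard against wrap-around of the 64-bit sum.
  assert((addr >> kShadowScale) <= UINT64_MAX - shadowOffset);
  return (addr >> kShadowScale) + shadowOffset;
}
```

A larger address space needs a larger offset so that the shadow region does not overlap the addresses the application itself can use.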
Reviewed by: vitalybuka
Differential Revision: https://reviews.llvm.org/D125873
Andrew Turner [Mon, 16 May 2022 16:36:40 +0000 (17:36 +0100)]
Add the FreeBSD AArch64 memory layout
Use the FreeBSD AArch64 memory layout values when building for it.
These are based on the x86_64 values, scaled to take into account the
larger address space on AArch64.
Reviewed by: vitalybuka
Differential Revision: https://reviews.llvm.org/D125758
Dmitry Vyukov [Sat, 16 Jul 2022 09:48:18 +0000 (11:48 +0200)]
tsan: optimize DenseSlabAlloc
If lots of threads do lots of malloc/free and they overflow
per-pthread DenseSlabAlloc cache, it causes lots of contention:
31.97% race.old race.old [.] __sanitizer::StaticSpinMutex::LockSlow
17.61% race.old race.old [.] __tsan_read4
10.77% race.old race.old [.] __tsan::SlotLock
Optimize DenseSlabAlloc to use a lock-free stack of batches of nodes.
This way we don't take any locks in steady state at all and do only
1 push/pop per Refill/Drain.
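The idea of a lock-free stack of batches can be sketched as follows. This is a simplified illustration, not the DenseSlabAlloc code: the names `Node`, `BatchStack`, `pushBatch` and `popBatch` are hypothetical, and the layout (a 32-bit node index packed with a 32-bit ABA counter into one 64-bit atomic head) is an assumption made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical pool node. Index 0 is reserved as "null".
struct Node {
  uint32_t next = 0;                   // next node inside the same batch
  std::atomic<uint32_t> batchNext{0};  // next batch on the shared stack
};

class BatchStack {
public:
  explicit BatchStack(uint32_t capacity) : nodes_(capacity + 1) {}

  Node &node(uint32_t idx) {
    assert(idx != 0 && idx < nodes_.size());
    return nodes_[idx];
  }

  // Publish a whole batch (a chain linked through Node::next) with one CAS.
  void pushBatch(uint32_t batchHead) {
    uint64_t old = head_.load(std::memory_order_relaxed);
    for (;;) {
      nodes_[batchHead].batchNext.store(indexOf(old),
                                        std::memory_order_relaxed);
      uint64_t desired = pack(counterOf(old) + 1, batchHead);
      if (head_.compare_exchange_weak(old, desired,
                                      std::memory_order_release,
                                      std::memory_order_relaxed))
        return;
    }
  }

  // Take a whole batch with one CAS. Returns 0 if the stack is empty.
  uint32_t popBatch() {
    uint64_t old = head_.load(std::memory_order_acquire);
    for (;;) {
      uint32_t idx = indexOf(old);
      if (idx == 0)
        return 0;
      uint32_t next = nodes_[idx].batchNext.load(std::memory_order_relaxed);
      // The counter changes on every update, so a stale head never matches.
      uint64_t desired = pack(counterOf(old) + 1, next);
      if (head_.compare_exchange_weak(old, desired,
                                      std::memory_order_acquire,
                                      std::memory_order_acquire))
        return idx;
    }
  }

private:
  static uint32_t indexOf(uint64_t v) { return static_cast<uint32_t>(v); }
  static uint32_t counterOf(uint64_t v) {
    return static_cast<uint32_t>(v >> 32);
  }
  static uint64_t pack(uint32_t counter, uint32_t idx) {
    return (static_cast<uint64_t>(counter) << 32) | idx;
  }

  std::vector<Node> nodes_;
  std::atomic<uint64_t> head_{0};
};
```

In this sketch a per-thread cache would call `popBatch` once when it runs empty and `pushBatch` once when it overflows, so the shared state is touched once per batch rather than once per node.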
Effect on the added benchmark:
$ TIME="%e %U %S %M" time ./test.old 36 5 2000000
34.51 978.22 175.67 5833592
32.53 891.73 167.03 5790036
36.17 1005.54 201.24 5802828
36.94 1004.76 226.58 5803188
$ TIME="%e %U %S %M" time ./test.new 36 5 2000000
26.44 720.99 13.45 5750704
25.92 721.98 13.58 5767764
26.33 725.15 13.41 5777936
25.93 713.49 13.41 5791796
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D130002
Simon Pilgrim [Tue, 19 Jul 2022 13:10:53 +0000 (14:10 +0100)]
[DAG] Call SimplifyDemandedBits from ISD::MUL nodes
Noticed while triaging D129765.
William Schmidt [Mon, 18 Jul 2022 20:55:27 +0000 (13:55 -0700)]
Don't vectorize PHIs in catchswitch blocks
We currently assert in vectorizeTree(TreeEntry*) when processing a PHI
bundle in a block containing a catchswitch. We attempt to set the
IRBuilder insertion point following the catchswitch, which is invalid.
This is done so that ShuffleBuilder.finalize() knows where to insert
a shuffle if one is needed.
To avoid this occurring, watch out for catchswitch blocks during
buildTree_rec() processing, and avoid adding PHIs in such blocks to
the vectorizable tree. It is unlikely that constraining vectorization
over an exception path will cause a noticeable performance loss, so
this seems preferable to trying to anticipate when a shuffle will and
will not be required.
Nikita Popov [Mon, 18 Jul 2022 10:08:00 +0000 (12:08 +0200)]
[Local] Allow creating callbr with duplicate successors
Since D129288, callbr is allowed to have duplicate successors. This
patch removes a limitation which prevents optimizations from actually
producing such callbrs.
Differential Revision: https://reviews.llvm.org/D129997
Alexey Lapshin [Sun, 10 Jul 2022 17:11:55 +0000 (20:11 +0300)]
[Reland][Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF.
This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html
llvm-dwarfutil is a tool for processing the debug info (DWARF) in built binary files to improve debug info quality and reduce debug info size. The patch currently implements a smaller set of command-line options than the proposal:
```
./llvm-dwarfutil [options] <input file> <output file>
--garbage-collection Do garbage collection for debug info(default)
-j <value> Alias for --num-threads
--no-garbage-collection Don`t do garbage collection for debug info
--no-odr-deduplication Don`t do ODR deduplication for debug types
--no-odr Alias for --no-odr-deduplication
--no-separate-debug-file
Create single output file, containing debug tables(default)
--num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine
--odr-deduplication Do ODR deduplication for debug types(default)
--odr Alias for --odr-deduplication
--separate-debug-file Create two output files: file w/o debug tables and file with debug tables
--tombstone [bfd,maxpc,exec,universal]
Tombstone value used as a marker of invalid address(default: universal)
=bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec
=maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges
=exec - Match with address ranges of executable sections
=universal - Both: bfd and maxpc
```
Reviewed By: clayborg
Differential Revision: https://reviews.llvm.org/D86539
serge-sans-paille [Tue, 19 Jul 2022 10:53:01 +0000 (06:53 -0400)]
[flang] Fix flang-to-external-fc --version
Substitution of @FLANG_VERSION@ wasn't correctly performed.
Differential Revision: https://reviews.llvm.org/D130074
Evgeniy Brevnov [Tue, 19 Jul 2022 11:46:43 +0000 (18:46 +0700)]
Additional regression test for a crash when reordering masked gather nodes
Nicolas Vasilache [Tue, 19 Jul 2022 08:33:21 +0000 (01:33 -0700)]
[mlir][Linalg] Add a TileToForeachThread transform.
This revision adds a new transformation to tile a TilingInterface `op` to a tiled `scf.foreach_thread`, applying
tiling by `num_threads`.
If non-empty, the `threadDimMapping` is added as an attribute to the resulting `scf.foreach_thread`.
0-tile sizes (i.e. tile by the full size of the data) are used to encode
that a dimension is not tiled.
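As a rough illustration of tiling by a thread count, the sketch below derives a tile size from `num_threads` and computes the slice each thread would own. The ceiling division, the clamping of the last tile, and the treatment of a zero thread count as "not tiled" are assumptions made for the example; `Tile`, `tileSizeFor` and `tileForThread` are hypothetical names, not part of the MLIR API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Half-open slice [offset, offset + size) of one dimension.
struct Tile {
  int64_t offset;
  int64_t size;
};

// Assumed rule: split the dimension evenly, rounding up.
// A result of 0 encodes "this dimension is not tiled".
int64_t tileSizeFor(int64_t dimSize, int64_t numThreads) {
  assert(dimSize >= 0 && numThreads >= 0);
  if (numThreads == 0)
    return 0;
  return (dimSize + numThreads - 1) / numThreads;
}

// Slice owned by one thread; a tile size of 0 means the full dimension.
Tile tileForThread(int64_t dimSize, int64_t tileSize, int64_t threadId) {
  assert(dimSize >= 0 && tileSize >= 0 && threadId >= 0);
  if (tileSize == 0)
    return {0, dimSize};
  // Clamp so that trailing threads get a short or empty tile.
  int64_t offset = std::min(threadId * tileSize, dimSize);
  int64_t size = std::min(tileSize, dimSize - offset);
  return {offset, size};
}
```

For a dimension of size 10 and 4 threads, this yields tiles of sizes 3, 3, 3 and 1.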
Differential Revision: https://reviews.llvm.org/D129577
Benjamin Kramer [Tue, 19 Jul 2022 10:35:47 +0000 (12:35 +0200)]
[LegalizeDAG] Propagate alignment in ExpandExtractFromVectorThroughStack
Contrary to what the name suggests, this can reuse any store as a base for a
memory-based vector extract. If that store is underaligned, the loads
created for the extract will have an invalid alignment. Since most CPUs are
forgiving wrt alignment, this is almost never an issue; on x86 it is
only reproducible by extracting a 128-bit vector out of a wider vector.
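To illustrate why the alignment of the base store matters, here is a small standalone sketch of how the safe alignment of a load at `base + offset` can be derived from the base alignment. It follows the same idea as LLVM's `commonAlignment` helper, but whether the patch uses that helper is an assumption; the function name `commonAlignmentSketch` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Largest power of two that is guaranteed to divide (base + offset),
// given that baseAlign (a power of two) divides base.
uint64_t commonAlignmentSketch(uint64_t baseAlign, uint64_t offset) {
  assert(baseAlign != 0 && (baseAlign & (baseAlign - 1)) == 0);
  if (offset == 0)
    return baseAlign;
  // Lowest set bit of the offset is the largest power of two dividing it.
  uint64_t offsetAlign = offset & (~offset + 1);
  return baseAlign < offsetAlign ? baseAlign : offsetAlign;
}
```

With a base store aligned to only 1 byte, a 16-byte load at offset 16 may claim no more than 1-byte alignment; assuming the natural 16-byte alignment would be incorrect.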
I tried making a test case in the context of
https://reviews.llvm.org/D127982 but it's really really fragile, as the
output pretty much looks like a missed optimization.
David Green [Tue, 19 Jul 2022 10:53:47 +0000 (11:53 +0100)]
[ARM] Remove VBICimm if no cleared bits are demanded
If none of the bits cleared by a VBICimm are demanded, we can remove the
node entirely and use the input operand instead.
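The reasoning can be checked on a single scalar lane. In this sketch the bit clear is modelled as `x & ~imm`; the real node works on vector lanes with an encoded immediate, and the names `vbicImm` and `canDropVbic` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of a bit clear with an immediate: clear the bits set in imm.
uint32_t vbicImm(uint32_t x, uint32_t imm) { return x & ~imm; }

// The node is redundant when no cleared bit is also a demanded bit.
bool canDropVbic(uint32_t imm, uint32_t demanded) {
  return (imm & demanded) == 0;
}
```

For example, with `imm = 0xFF00` and only the low byte demanded, `vbicImm(x, imm)` and `x` agree on every demanded bit, so the input can be used directly.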
Differential Revision: https://reviews.llvm.org/D129966
Florian Hahn [Tue, 19 Jul 2022 10:23:24 +0000 (11:23 +0100)]
[LV] Remove unnecessary cast in widenCallInstruction. (NFC)
Simon Pilgrim [Tue, 19 Jul 2022 10:13:31 +0000 (11:13 +0100)]
Fix signed/unsigned comparison mismatch warning
Simon Pilgrim [Tue, 19 Jul 2022 09:58:27 +0000 (10:58 +0100)]
[DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits
The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this so that the mask only needs to match under the demanded bits.
This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115.
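The relaxed condition can be checked exhaustively on 8-bit values. The sketch below assumes a logical right shift and compares both sides of the fold under a demanded-bits mask; `xorMatchesUnderDemanded` and `foldHoldsForAllX` are hypothetical helper names, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>

// The fold applies when XorC agrees with the shifted all-ones mask
// on every demanded bit.
bool xorMatchesUnderDemanded(uint8_t xorC, unsigned shiftC, uint8_t demanded) {
  assert(shiftC < 8);
  uint8_t shiftedAllOnes = static_cast<uint8_t>(0xFFu >> shiftC);
  return ((xorC ^ shiftedAllOnes) & demanded) == 0;
}

// Compare ((X >> ShiftC) ^ XorC) with ((not X) >> ShiftC) on the demanded
// bits, for every 8-bit X.
bool foldHoldsForAllX(uint8_t xorC, unsigned shiftC, uint8_t demanded) {
  assert(shiftC < 8);
  for (unsigned x = 0; x < 256; ++x) {
    uint8_t before = static_cast<uint8_t>(((x >> shiftC) ^ xorC) & demanded);
    uint8_t notX = static_cast<uint8_t>(~x);
    uint8_t after = static_cast<uint8_t>((notX >> shiftC) & demanded);
    if (before != after)
      return false;
  }
  return true;
}
```

For instance, with `ShiftC = 4` and `XorC = 0x01`, the mask is not a shifted all-bits mask (that would be `0x0F`), yet the fold is still valid when only bit 0 is demanded.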
Alive2: https://alive2.llvm.org/ce/z/fl7T7K
Differential Revision: https://reviews.llvm.org/D129933
Abinav Puthan Purayil [Wed, 13 Jul 2022 06:40:02 +0000 (12:10 +0530)]
[AMDGPU] Set amdgpu-memory-bound if a basic block has dense global memory access
AMDGPUPerfHintAnalysis doesn't set the memory bound attribute if
FuncInfo::InstCost outweighs MemInstCost even if we have a basic block
with relatively high global memory access. GCNSchedStrategy could revert
optimal scheduling in favour of occupancy which seems to degrade
performance for some kernels. This change introduces the
HasDenseGlobalMemAcc metric in the heuristic that makes the analysis
more conservative in these cases.
This fixes SWDEV-334259/SWDEV-343932
Differential Revision: https://reviews.llvm.org/D129759