Brad Smith [Thu, 27 Aug 2020 21:17:38 +0000 (17:17 -0400)]
[SSP] Restore setting the visibility of __guard_local to hidden for better code generation.
Patch by: Philip Guenther
Alexey Bataev [Thu, 27 Aug 2020 20:06:28 +0000 (16:06 -0400)]
[OPENMP] Do not crash for globals in inner regions with outer target region.
If a global variable is used in the target region, it is always
captured unless it is marked as declare target.
Azharuddin Mohammed [Thu, 27 Aug 2020 20:57:07 +0000 (13:57 -0700)]
[Driver][XRay][test] Update the macOS support check
According to the code, the XRay flag is only supported on x86_64 for macOS.
Update the test to make that check explicit.
Differential Revision: https://reviews.llvm.org/D85773
Shinji Okumura [Thu, 27 Aug 2020 20:54:01 +0000 (05:54 +0900)]
[Attributor] Do not manifest noundef for dead positions
Even if noundef is deduced for a position, we should not manifest it when the position is dead.
This is because the values associated with dead positions are replaced with undef values by AAIsDead.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86565
Saiyedul Islam [Thu, 27 Aug 2020 20:54:42 +0000 (20:54 +0000)]
[OpenMP] Fix a failing test after D85214
Removed version 45 testing from a failing test for now.
Matt Arsenault [Thu, 27 Aug 2020 16:40:03 +0000 (12:40 -0400)]
GlobalISel: Implement known bits for min/max
Matt Arsenault [Thu, 27 Aug 2020 17:25:12 +0000 (13:25 -0400)]
AArch64/GlobalISel: Fix missing function begin marker in test
Matt Arsenault [Sun, 28 Jun 2020 15:34:42 +0000 (11:34 -0400)]
MIR: Infer not-SSA for subregister defs
It's possible to have a single virtual register def with a subreg
index that would pass the previous check, but it's not possible to
have a subregister def in SSA.
This is in preparation for adding stricter checks for SSA MIR.
Vitaly Buka [Thu, 27 Aug 2020 20:53:28 +0000 (13:53 -0700)]
[StackSafety] Ignore allocas with partial lifetime markers
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D86672
Vitaly Buka [Thu, 27 Aug 2020 20:45:39 +0000 (13:45 -0700)]
[NFC][ValueTracking] Add OffsetZero into findAllocaForValue
For StackLifetime, after finding the alloca we need to check that
the values point to the beginning of the alloca.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D86692
Matt Arsenault [Thu, 27 Aug 2020 15:14:59 +0000 (11:14 -0400)]
AMDGPU: Use caller subtarget, not intrinsic declaration
Intrinsic declarations use the default subtarget, but this should be
using the subtarget for the calling function. I haven't been able to
come up with a case where it matters though.
Matt Arsenault [Thu, 27 Aug 2020 17:15:46 +0000 (13:15 -0400)]
GlobalISel: Add and_trivial_mask to all_combines
Also make up a new category of combines.
Krzysztof Parzyszek [Thu, 27 Aug 2020 20:16:39 +0000 (15:16 -0500)]
[Hexagon] Emit better 32-bit multiplication sequence for HVXv62+
Eli Friedman [Thu, 27 Aug 2020 20:17:47 +0000 (13:17 -0700)]
[RegisterScavenging] Delete dead function unprocess().
Shinji Okumura [Thu, 27 Aug 2020 20:16:18 +0000 (05:16 +0900)]
[Attributor] Do not add AA to dependency graph after the update stage
If an AA is registered to the dependency graph in the manifest stage, Attributor aborts in `::manifestAttributes()`.
This patch prevents such termination.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86734
Craig Topper [Thu, 27 Aug 2020 19:32:17 +0000 (12:32 -0700)]
[CodeGen] Use an AttrBuilder to bulk remove 'target-cpu', 'target-features', and 'tune-cpu' before re-adding in CodeGenModule::setNonAliasAttributes.
I think the removeAttributes interface should be faster than
calling removeAttribute 3 times.
Saiyedul Islam [Thu, 27 Aug 2020 19:35:36 +0000 (19:35 +0000)]
[OpenMP] Ensure testing for versions 4.5 and default - Part 3
This third patch in the series removes the version 5.0 string from
test cases, making them check for the default version. It also adds test
cases for version 4.5.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D85214
Roman Lebedev [Thu, 27 Aug 2020 19:31:40 +0000 (22:31 +0300)]
[InstSimplify] SimplifyPHINode(): check that instruction is in basic block first
As pointed out in post-commit review, this can legally be called
on instructions that are not inserted into basic blocks,
so don't blindly assume that there is basic block.
Christopher Tetreault [Thu, 27 Aug 2020 19:04:39 +0000 (12:04 -0700)]
[SVE] Remove bad call to VectorType::getNumElements() from HeapProfiler
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D86727
Yang Fan [Thu, 27 Aug 2020 19:09:17 +0000 (12:09 -0700)]
[analyzer] NFC: Fix wrong parameter name in printFormattedEntry.
Parameters were in a different order in the header and in the implementation.
Fix surrounding comments a bit.
Differential Revision: https://reviews.llvm.org/D86691
Yang Fan [Thu, 27 Aug 2020 18:45:12 +0000 (11:45 -0700)]
[analyzer] Fix the debug print about debug egraph dumps requiring asserts.
There's no need to remind people about that when clang *is* built with asserts.
Differential Revision: https://reviews.llvm.org/D86334
Adam Balogh [Thu, 27 Aug 2020 15:06:10 +0000 (08:06 -0700)]
[analyzer] pr47037: CastValueChecker: Support for the new variadic isa<>.
llvm::isa<>() and llvm::isa_and_not_null<>() template functions recently became
variadic. Unfortunately this causes crashes in case of isa_and_not_null<>()
and incorrect behavior in isa<>(). This patch fixes this issue.
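For readers unfamiliar with the variadic form, here is a minimal sketch of what "isa<> over several types" means. It uses a toy hierarchy, not LLVM's Casting.h; `Shape`, `Circle`, `Square`, `Triangle`, `isaOne`, `isaAny` and `isaAnyAndNotNull` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Toy hierarchy with LLVM-style kind tags (hypothetical, illustration only).
struct Shape {
  enum Kind { K_Circle, K_Square, K_Triangle };
  explicit Shape(Kind K) : TheKind(K) {}
  Kind getKind() const { return TheKind; }

private:
  Kind TheKind;
};

struct Circle : Shape {
  Circle() : Shape(K_Circle) {}
  static bool classof(const Shape *S) { return S->getKind() == K_Circle; }
};

struct Square : Shape {
  Square() : Shape(K_Square) {}
  static bool classof(const Shape *S) { return S->getKind() == K_Square; }
};

struct Triangle : Shape {
  Triangle() : Shape(K_Triangle) {}
  static bool classof(const Shape *S) { return S->getKind() == K_Triangle; }
};

// Single-type check; the pointer must not be null.
template <typename To> bool isaOne(const Shape *S) {
  assert(S && "isaOne<> used on a null pointer");
  return To::classof(S);
}

// Variadic check: true if S is an instance of any of the listed types.
// The C++17 fold expression expands to isaOne<T1>(S) || isaOne<T2>(S) || ...
template <typename... Tos> bool isaAny(const Shape *S) {
  return (isaOne<Tos>(S) || ...);
}

// Null-tolerant variant: a null pointer is never an instance of anything.
template <typename... Tos> bool isaAnyAndNotNull(const Shape *S) {
  return S != nullptr && isaAny<Tos...>(S);
}
```

A checker that models such calls has to consider every listed type rather than only the first template argument, which is presumably where a single-type model goes wrong.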
Differential Revision: https://reviews.llvm.org/D85728
Adam Balogh [Thu, 27 Aug 2020 15:01:43 +0000 (08:01 -0700)]
[analyzer] NFC: Store the pointee/referenced type for dynamic type tracking.
The successfulness of a dynamic cast depends only on the C++ class, not the pointer or reference. Thus if *A is a *B, then &A is a &B,
const *A is a const *B etc. This patch changes DynamicCastInfo to store
and check the cast between the unqualified pointed/referenced types.
It also removes e.g. SubstTemplateTypeParmType from both the pointer
and the pointed type.
Differential Revision: https://reviews.llvm.org/D85752
Dokyung Song [Wed, 19 Aug 2020 20:21:05 +0000 (20:21 +0000)]
Recommit "[libFuzzer] Fix value-profile-load test."
value-profile-load.test needs adjustment with a mutator change in
bb54bcf84970c04c9748004f3a4cf59b0c1832a7, which is reverted as of now, but will be
recommitted after landing this patch.
This patch makes value-profile-load.test more friendly to (and aware of) the
current value profiling strategy, which is based on the hamming as well as the
absolute distance. To this end, this patch adjusts the set of input values that
trigger an expected crash. More specifically, this patch now uses a single value
0x01effffe as a crashing input, because this value is close to values like
{0x1ffffff, 0xffffff, ...}, which are very likely to be added to the corpus per
the current hamming- and absolute-distance-based value profiling strategy. Note
that previously the crashing input values were {1234567 * {1, 2, ...}, s.t. <
INT_MAX}.
Every byte in the chosen value 0x01effffe is intentionally different; this was
to make it harder to find the value without the intermediate inputs added to the
corpus by the value profiling strategy.
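To make the distance argument concrete, a minimal sketch follows. It only computes the two distances for the values quoted above; it is not libFuzzer's value-profile code, and `hammingDistance`, `absoluteDistance` and `allBytesDiffer` are hypothetical helper names.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Number of bit positions in which two 32-bit values differ.
inline unsigned hammingDistance(uint32_t A, uint32_t B) {
  return static_cast<unsigned>(std::bitset<32>(A ^ B).count());
}

// Absolute numeric difference between two 32-bit values.
inline uint32_t absoluteDistance(uint32_t A, uint32_t B) {
  return A > B ? A - B : B - A;
}

// True if the four bytes of V are pairwise different.
inline bool allBytesDiffer(uint32_t V) {
  uint8_t Bytes[4];
  for (int I = 0; I < 4; ++I)
    Bytes[I] = static_cast<uint8_t>((V >> (8 * I)) & 0xff);
  for (int I = 0; I < 4; ++I)
    for (int J = I + 1; J < 4; ++J)
      if (Bytes[I] == Bytes[J])
        return false;
  return true;
}
```

For example, 0x01effffe and 0x1ffffff differ in only two bit positions, so an input holding the latter is a close stepping stone towards the crashing value.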
Also note that LoadTest.cpp now uses a narrower condition (Size != 8) for
initial pruning of inputs, effectively preventing libFuzzer from generating
inputs longer than necessary and spending time on mutating such long inputs in
the corpus - a functionality not meant to be tested by this specific test.
Differential Revision: https://reviews.llvm.org/D86247
Haojian Wu [Thu, 27 Aug 2020 18:43:20 +0000 (20:43 +0200)]
[libcxx] Fix the broken test after D82657.
Differential Revision: https://reviews.llvm.org/D86685
Shinji Okumura [Thu, 27 Aug 2020 18:29:39 +0000 (03:29 +0900)]
[Attributor] Guarantee getAAFor not to update AA in the manifestation stage
If we query an AA with `Attributor::getAAFor` in `AbstractAttribute::manifest`, the AA may be updated.
This patch makes use of the phase flag in Attributor, and handles `getAAFor` behavior according to the flag.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86635
Vincent Zhao [Thu, 27 Aug 2020 18:57:36 +0000 (00:27 +0530)]
[MLIR] Fixed missing constraint append when adding an AffineIfOp domain
The prior diff that introduced `addAffineIfOpDomain` missed appending
constraints from the ifOp domain. This revision fixes this problem.
Differential Revision: https://reviews.llvm.org/D86421
Christopher Tetreault [Thu, 27 Aug 2020 18:19:46 +0000 (11:19 -0700)]
[SVE] Remove calls to VectorType::getNumElements from Transforms/Vectorize
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D82056
Saiyedul Islam [Thu, 27 Aug 2020 18:50:34 +0000 (18:50 +0000)]
[OpenMP] Ensure testing for versions 4.5 and default - Part 2
Many OpenMP Clang tests do not RUN for version 4.5 and the default
version. This second patch in the series handles test cases which
require updates to CHECK lines along with adding RUN lines for
the default version. It involves updating the line numbers of pragmas.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D85150
Kiran Chandramohan [Thu, 13 Aug 2020 08:03:04 +0000 (09:03 +0100)]
[OpenMP][MLIR] Conversion pattern for OpenMP to LLVM
Adding a conversion pattern for the parallel Operation. This will
help the conversion of parallel operation with standard dialect to
parallel operation with llvm dialect. The type conversion of the block
arguments in a parallel region are controlled by the pattern for the
parallel Operation. Without this pattern, a parallel Operation with
block arguments cannot be converted from standard to LLVM dialect.
Other OpenMP operations without regions are marked as legal. When
translation of OpenMP operations with regions are added then patterns
for these operations can also be added.
Also uses all the standard to llvm patterns. Patterns of other dialects
can be added later if needed.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D86273
Jez Ng [Thu, 27 Aug 2020 18:10:59 +0000 (11:10 -0700)]
[lld-macho] Disable invalid/stub-link.s test for Mac
It seems to be failing on some Google Buildbots.
This diff also includes a minor fix for the install name of one of
libSystem's re-exports. I don't think it's the cause of the test
failure, though. The wrong install name just meant that the symbol
lookup failure would still happen, but it would have been caused by the
re-export not being found, instead of the arch failing to match.
Differential Revision: https://reviews.llvm.org/D86728
Louis Dionne [Thu, 27 Aug 2020 17:09:23 +0000 (13:09 -0400)]
[libc++][NFC] Define functor's call operator inline
This fixes a mismatched visibility attribute on the call operator in
addition to making the code clearer. Given this is a simple lambda
in essence, the intent has always been to give it inline visibility.
Christopher Tetreault [Thu, 27 Aug 2020 17:39:18 +0000 (10:39 -0700)]
[SVE] Remove calls to VectorType::getNumElements from IR
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D81500
Matt Arsenault [Thu, 27 Aug 2020 16:32:49 +0000 (12:32 -0400)]
GlobalISel: Use & operator on KnownBits
Avoid repeating for zero and one
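A minimal sketch of the idea, assuming the operator models the known bits of a bitwise AND of two values. `KnownBits32` is a simplified, hypothetical stand-in and not llvm::KnownBits.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a known-bits record over 32-bit values.
struct KnownBits32 {
  uint32_t Zero = 0; // bits known to be 0
  uint32_t One = 0;  // bits known to be 1

  // A bit cannot be known 0 and known 1 at the same time.
  bool hasConflict() const { return (Zero & One) != 0; }
};

// Known bits of (L & R): a result bit is known 0 if it is known 0 in either
// operand, and known 1 only if it is known 1 in both operands.
inline KnownBits32 operator&(const KnownBits32 &L, const KnownBits32 &R) {
  KnownBits32 Res;
  Res.Zero = L.Zero | R.Zero;
  Res.One = L.One & R.One;
  return Res;
}
```

With an operator like this, a caller writes one expression instead of updating the zero and one masks separately.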
Matt Arsenault [Thu, 27 Aug 2020 16:15:16 +0000 (12:15 -0400)]
GlobalISel: Implement known bits for G_MERGE_VALUES
Mikhail Maltsev [Thu, 27 Aug 2020 17:52:59 +0000 (18:52 +0100)]
[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics
Add bitcode files which got truncated to 0 length in phabricator.
Differential Revision: https://reviews.llvm.org/D86146
Matt Arsenault [Thu, 27 Aug 2020 17:47:38 +0000 (13:47 -0400)]
GlobalISel: Remove leftover lit.local.cfg
The global-isel feature has been required for a long time and was
removed in c9455d3c579292e7ae5b7559ad0302d459e69a95, so this was
causing all tests to be skipped.
Mikhail Maltsev [Thu, 27 Aug 2020 17:43:16 +0000 (18:43 +0100)]
[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics
This patch adjusts the following ARM/AArch64 LLVM IR intrinsics:
- neon_bfmmla
- neon_bfmlalb
- neon_bfmlalt
so that they take and return bf16 and float types. Previously these
intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from
implementation lacking bf16 IR type).
The neon_vbfdot[q] intrinsics are adjusted similarly. This change
required some additional selection patterns for vbfdot itself and
also for vector shuffles (in a previous patch) because of SelectionDAG
transformations kicking in and mangling the original code.
This patch makes the generated IR cleaner (fewer useless bitcasts are
produced), but it does not affect the final assembly.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D86146
Craig Topper [Thu, 27 Aug 2020 17:30:47 +0000 (10:30 -0700)]
[X86] Don't call hasFnAttribute and getFnAttribute for 'prefer-vector-width' and 'min-legal-vector-width' in getSubtargetImpl
We only need to call getFnAttribute and then check if the Attribute
is None or not.
Owen Anderson [Wed, 26 Aug 2020 19:36:13 +0000 (19:36 +0000)]
Reapply D70800: Fix AArch64 AAPCS frame record chain
Original Commit Message:
After the commit r368987 (rG643adb55769e) was landed, the frame record (FP and LR register)
may be placed in the middle of a stack frame if a function has both callee-saved
general-purpose registers and floating point registers. This will break the stack unwinders
that simply walk through the frame records (based on the guarantee from AAPCS64
"The Frame Pointer" section). This commit fixes the problem by adding the frame record offset.
Patch By: logan
Differential Revision: D70800
Teresa Johnson [Thu, 27 Aug 2020 16:38:45 +0000 (09:38 -0700)]
[HeapProf] Fix bot failures from instrumentation pass
Fix bot failure from 7ed8124d46f94601d5f1364becee9cee8538265e:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/8533
Since we are always using dynamic shadow,
insertDynamicShadowAtFunctionEntry should always return true for
modifying the function.
LLVM GN Syncbot [Thu, 27 Aug 2020 17:08:02 +0000 (17:08 +0000)]
[gn build] Port 7ed8124d46f
Arthur Eubanks [Thu, 27 Aug 2020 17:05:34 +0000 (10:05 -0700)]
[gn build] Manually port c9455d3
Arthur Eubanks [Wed, 26 Aug 2020 23:55:46 +0000 (16:55 -0700)]
[test][Inliner] Make always-inline.ll work with NPM
The NPM doesn't support call-site alwaysinline as described in the comments.
Also make NPM runs more similar to legacy PM runs.
Reviewed By: ychen, asbirlea
Differential Revision: https://reviews.llvm.org/D86663
Aditya Nandakumar [Thu, 27 Aug 2020 16:38:48 +0000 (09:38 -0700)]
[GISel] Add new GISel combiners for G_SELECT
https://reviews.llvm.org/D83833
Patch adds two new GICombinerRules for G_SELECT. The rules include:
combining selects with undef comparisons into their first selectee value,
and combining away selects with constant comparisons. Patch additionally
adds a new combiner test for the AArch64 target to test these new G_SELECT
combiner rules and the existing select_same_val combiner rule.
Patch by mkitzan
Jonas Devlieghere [Thu, 27 Aug 2020 16:29:47 +0000 (09:29 -0700)]
[lldb] Make lldb-argdumper a dependency of liblldb
Always make lldb-argdumper a dependency of liblldb. Currently it is only
a dependency of the python swig target because of the relative symlink
in the python resource directory. That means that the dependency won't
be there when LLDB_ENABLE_PYTHON is disabled.
Differential revision: https://reviews.llvm.org/D86722
Jonas Devlieghere [Wed, 26 Aug 2020 18:56:30 +0000 (11:56 -0700)]
[lldb] Move triple construction out of getArchCFlags in DarwinBuilder (NFC)
Move the construction of the triple out of getArchCFlags in the
DarwinBuilder.
Arthur Eubanks [Thu, 27 Aug 2020 16:29:17 +0000 (09:29 -0700)]
[OCaml] Remove add_constant_propagation
After https://reviews.llvm.org/D85159.
Simon Moll [Thu, 27 Aug 2020 16:22:47 +0000 (18:22 +0200)]
[sda][nfc] clang-formatting
Shinji Okumura [Thu, 27 Aug 2020 16:16:38 +0000 (01:16 +0900)]
[Attributor] Add a phase flag to Attributor
Add a new flag that indicates which stage in the process we are in.
This flag is introduced for handling behavior of `getAAFor` according to the stage. (discussed in D86635)
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86678
Aditya Nandakumar [Thu, 27 Aug 2020 15:54:33 +0000 (08:54 -0700)]
[GISel]: Fix one more CSE Non determinism
https://reviews.llvm.org/D86676
Sometimes we can have the following code
x:gpr(s32) = G_OP
Say we build G_OP2 to the same x and then delete the previous instruction. Using something like
Register X = ...;
auto NewMIB = CSEBuilder.buildOp2(X, ... args);
Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (i.e. it doesn't consider register bank/register class along with type). Unify the profiling by refactoring and calling the common method.
This was found by turning on CSEInfo::verify at the end of each of our GISel passes, which turns inconsistent state/non-determinism in CSEing into crashes; such crashes usually indicate missing calls to Observer on mutations (the most common case). Here non-determinism usually means sometimes not CSEing, and almost never means producing incorrect code.
Also this patch adds this verification at the end of the combiners as well.
Lucas Prates [Thu, 27 Aug 2020 14:31:40 +0000 (15:31 +0100)]
[CodeGen] Properly propagating Calling Convention information when lowering vector arguments
When joining the legal parts of vector arguments into their original value
during the lowering of Formal Arguments in SelectionDAGBuilder, the Calling
Convention information was not being propagated for the handling of each
individual part. The same did not happen when lowering calls, causing a
mismatch.
This patch fixes the issue by properly propagating the Calling
Convention details.
This fixes Bugzilla #47001.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D86715
Benjamin Kramer [Thu, 27 Aug 2020 15:57:11 +0000 (17:57 +0200)]
[MLIR][GPUToSPIRV] Fix use-after-free. Found by asan.
Teresa Johnson [Thu, 13 Aug 2020 23:29:38 +0000 (16:29 -0700)]
[HeapProf] Clang and LLVM support for heap profiling instrumentation
See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html
Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).
This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.
The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.
Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.
Differential Revision: https://reviews.llvm.org/D85948
Mikhail Maltsev [Thu, 27 Aug 2020 15:47:18 +0000 (16:47 +0100)]
Revert "[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY"
This reverts commit 3b71f91558ff8b569199547efe800cb501c3cf94.
The commit is breaking some build bots.
Roman Lebedev [Wed, 26 Aug 2020 08:11:04 +0000 (11:11 +0300)]
[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block
Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify,
nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places..
While i could teach EarlyCSE how to hash PHI nodes,
we can't really do much (anything?) even if we find two identical
PHI nodes in different basic blocks, same-BB case is the interesting one,
and if we teach InstSimplify about it (which is what i wanted originally,
https://reviews.llvm.org/D86530), we get EarlyCSE support for free.
So i would think this is pretty uncontroversial.
On vanilla llvm test-suite + RawSpeed, this has the following effects:
```
| statistic name | baseline | proposed | Δ | % | \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instsimplify.NumPHICSE | 0 | 23779 | 23779 | 0.00% | 0.00% |
| asm-printer.EmittedInsts | 7942328 | 7942392 | 64 | 0.00% | 0.00% |
| assembler.ObjectBytes | 273069192 | 273084704 | 15512 | 0.01% | 0.01% |
| correlated-value-propagation.NumPhis | 18412 | 18539 | 127 | 0.69% | 0.69% |
| early-cse.NumCSE | 2183283 | 2183227 | -56 | 0.00% | 0.00% |
| early-cse.NumSimplify | 550105 | 542090 | -8015 | -1.46% | 1.46% |
| instcombine.NumAggregateReconstructionsSimplified | 73 | 4506 | 4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined | 3640264 | 3664769 | 24505 | 0.67% | 0.67% |
| instcombine.NumDeadInst | 1778193 | 1783183 | 4990 | 0.28% | 0.28% |
| instcount.NumCallInst | 1758401 | 1758799 | 398 | 0.02% | 0.02% |
| instcount.NumInvokeInst | 59478 | 59502 | 24 | 0.04% | 0.04% |
| instcount.NumPHIInst | 330557 | 330533 | -24 | -0.01% | 0.01% |
| instcount.TotalInsts | 8831952 | 8832286 | 334 | 0.00% | 0.00% |
| simplifycfg.NumInvokes | 4300 | 4410 | 110 | 2.56% | 2.56% |
| simplifycfg.NumSimpl | 1019808 | 999607 | -20201 | -1.98% | 1.98% |
```
I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call`
transforms, and counter-intuitively results in *more* instructions total.
That being said, the PHI count doesn't decrease that much,
and looking at some examples, it seems at least some of them
were previously getting PHI CSE'd in SimplifyCFG of all places..
I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time.
As a comment in `InstCombinerImpl::visitPHINode()` already stated,
there are no guarantees on the ordering of the operands of a PHI node,
so if we just naively compare them, we may false-negatively say that
the nodes are not equal when the only difference is operand order,
which is especially important since the fold is in InstSimplify,
so we can't rely on InstCombine sorting them beforehand.
Fixing this for the general case is costly (geomean +0.02%),
and does not appear to catch anything in test-suite, but for
the same-BB case, it's trivial, so let's fix at least that.
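As an illustration of the operand-order problem only, here is a minimal sketch of an order-insensitive comparison of two PHI nodes in the same block. It models a PHI as a list of (predecessor block id, incoming value id) pairs; this toy model and the name `samePhiIgnoringOperandOrder` are assumptions and are not the code in this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// One incoming edge of a toy PHI node: (predecessor block id, value id).
using Incoming = std::pair<int, int>;

// Two PHIs in the same block are equivalent if they carry the same value for
// each predecessor, regardless of the order in which the operands are listed.
// Both lists are taken by value so they can be sorted without touching the
// caller's data.
inline bool samePhiIgnoringOperandOrder(std::vector<Incoming> A,
                                        std::vector<Incoming> B) {
  if (A.size() != B.size())
    return false;
  std::sort(A.begin(), A.end());
  std::sort(B.begin(), B.end());
  return A == B;
}
```

A naive element-by-element comparison would report the first two lists in a case like {(1,10),(2,20)} versus {(2,20),(1,10)} as different, which is the false negative described above.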
As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions
this appears to cause geomean +0.03% compile time increase (regression),
but geomean -0.01%..-0.04% code size decrease (improvement).
Roman Lebedev [Thu, 27 Aug 2020 12:03:53 +0000 (15:03 +0300)]
[NFC][EarlyCSE][InstSimplify] Add tests for CSE of PHI nodes
PHI nodes depend on the block they're in,
so we can only deal with the most basic case of same-BB PHI's.
Russell Gallop [Thu, 27 Aug 2020 14:50:25 +0000 (15:50 +0100)]
[Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL
This hasn't been allowed as a build option since r309990
Remove leftover REQUIRES: global-isel
Differential Revision: https://reviews.llvm.org/D86714
Louis Dionne [Thu, 27 Aug 2020 15:21:37 +0000 (11:21 -0400)]
[libc++] Install a more recent CMake on libc++ builders
David Nicuesa [Thu, 27 Aug 2020 15:21:35 +0000 (16:21 +0100)]
[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY
Fix compilation with -DLIBCXX_BUILD_EXTERNAL_THREAD_LIBRARY when using clang. Now linking target 'cxx_external_threads' with 'cxx-headers'. Fix mismatching visibility for `libcpp_timed_backoff_policy` function in file <__threading_support>.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D86598
Cullen Rhodes [Tue, 11 Aug 2020 14:30:02 +0000 (14:30 +0000)]
[CodeGen][AArch64] Support arm_sve_vector_bits attribute
This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.
VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:
* Implicit casting between VLA <-> VLS types.
* Coercion of VLS types in function args/return.
* Mangling of VLS types.
Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.
Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.
The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:
__SVE_VLS<typename, unsigned>
where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:
#if __ARM_FEATURE_SVE_BITS==512
// Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
// Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
#endif
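The two mangled names in the comments above can be reproduced with a small string-building sketch. This is not Clang's mangler; `mangleSveVls` is a hypothetical helper, and the breakdown into pieces assumes the usual Itanium-style encoding (length-prefixed name, `I...E` template arguments, `u` vendor type, `Lj...E` unsigned literal).

```cpp
#include <cassert>
#include <string>

// Builds the mangled name of a fixed-length SVE type following the
// __SVE_VLS<typename, unsigned> scheme. BuiltinName is the underlying
// builtin type name, e.g. "__SVInt32_t"; Bits is the vector length in bits.
inline std::string mangleSveVls(const std::string &BuiltinName, unsigned Bits) {
  assert(!BuiltinName.empty() && "expected a builtin type name");
  const std::string Template = "__SVE_VLS";
  return std::to_string(Template.size()) + Template +  // 9__SVE_VLS
         "I" +                                         // begin template args
         "u" + std::to_string(BuiltinName.size()) +    // vendor type, length
         BuiltinName +                                 // vendor type, name
         "Lj" + std::to_string(Bits) + "E" +           // unsigned literal
         "E";                                          // end template args
}
```

For instance, `mangleSveVls("__SVInt32_t", 512)` yields the string shown for `vec` above.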
The latest ACLE specification (00bet5) does not contain details of this
mangling scheme; it will be specified in the next revision. The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.
[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D85743
Alexandre Ganea [Thu, 27 Aug 2020 15:09:20 +0000 (11:09 -0400)]
[Support] On Windows, add optional support for {rpmalloc|snmalloc|mimalloc}
This patch optionally replaces the CRT allocator (i.e., malloc and free) with rpmalloc (mixed public domain licence/MIT licence) or snmalloc (MIT licence) or mimalloc (MIT licence). Please note that the source code for these allocators must be available outside of LLVM's tree.
To enable, use `cmake ... -DLLVM_INTEGRATED_CRT_ALLOC=D:/git/rpmalloc -DLLVM_USE_CRT_RELEASE=MT` where `D:/git/rpmalloc` has already been git clone'd from `https://github.com/mjansson/rpmalloc`. The same applies to snmalloc and mimalloc.
When enabled, the allocator will be embedded (statically linked) into the LLVM tools & libraries. This currently only works with the static CRT (/MT), although using the dynamic CRT (/MD) could potentially work as well in the future.
When enabled, this changes the memory stack from:
new/delete -> MS VC++ CRT malloc/free -> HeapAlloc -> VirtualAlloc
to:
new/delete -> {rpmalloc|snmalloc|mimalloc} -> VirtualAlloc
The goal of this patch is to bypass the application's global heap - which is thread-safe thus inducing locking - and instead take advantage of a modern lock-free, thread cache, allocator. On a 6-core Xeon Skylake we observe a 2.5x decrease in execution time when linking a large scale application with LLD and ThinLTO (12 min 20 sec -> 5 min 34 sec), when all hardware threads are being used (using LLD's flag /opt:lldltojobs=all). On a dual 36-core Xeon Skylake with all hardware threads used, we observe a 24x decrease in execution time (1 h 2 min -> 2 min 38 sec) when linking a large application with LLD and ThinLTO. Clang build times also see a decrease in the range 5-10% depending on the configuration.
Differential Revision: https://reviews.llvm.org/D71786
diggerlin [Thu, 27 Aug 2020 15:07:58 +0000 (11:07 -0400)]
Revert "[AIX][XCOFF] emit symbol visibility for xcoff object file."
This reverts commit a0818689213234d5a078641432d10eccccf61a13.
Based on Hubert Tong's comment https://reviews.llvm.org/D84265#inline-799085
Alexandre E. Eichenberger [Thu, 27 Aug 2020 05:17:33 +0000 (10:47 +0530)]
[MLIR] MemRef Normalization for Dialects
When dealing with dialects that will result in function calls to
external libraries, it is important to be able to handle maps as some
dialects may require mapped data. Before this patch, to detect
whether normalization can apply, operations were compared to an
explicit list of operations (`alloc`, `dealloc`, `return`) or checked for the
presence of specific operation interfaces (`AffineReadOpInterface`,
`AffineWriteOpInterface`, `AffineDMAStartOp`, or `AffineDMAWaitOp`).
This patch adds a trait, `MemRefsNormalizable`, to determine if an
operation can have its `memrefs` normalized.
This trait can be used in turn by dialects to assert that such
operations are compatible with normalization of `memrefs` with
nontrivial memory layout specification. An example is given in the
literal tests.
Differential Revision: https://reviews.llvm.org/D86236
Benjamin Kramer [Thu, 27 Aug 2020 14:52:34 +0000 (16:52 +0200)]
[Hexagon] Fold another layer of single-use variable into assert. NFCI.
Benjamin Kramer [Thu, 27 Aug 2020 14:40:20 +0000 (16:40 +0200)]
[Hexagon] Fold single-use variable into assert. NFCI.
Pavel Labath [Thu, 27 Aug 2020 14:36:55 +0000 (16:36 +0200)]
[lldb/cmake] Fix linking of lldbSymbolHelpers for 9cb222e7
I didn't find this locally because I have a /usr/include/gtest which is
similar enough to the bundled one to make things appear to work.
Matt Arsenault [Wed, 12 Aug 2020 12:53:35 +0000 (08:53 -0400)]
AMDGPU: Hoist subtarget lookup
Krzysztof Parzyszek [Thu, 27 Aug 2020 02:00:49 +0000 (21:00 -0500)]
[Hexagon] Widen short vector stores to HVX vectors using masked stores
Also invent a flag -hexagon-hvx-widen=N to set the minimum threshold
for widening short vectors to HVX vectors.
Florian Hahn [Thu, 27 Aug 2020 13:13:07 +0000 (14:13 +0100)]
[SimplifyLibCalls] Remove over-eager early return in strlen optzns.
Currently we bail out early for strlen calls with a GEP operand, if none
of the GEP specific optimizations fire. But there could be later
optimizations that still apply, which we currently miss out on.
An example is that we do not apply the following optimization
strlen(x) == 0 --> *x == 0
Unless I am missing something, there seems to be no reason for bailing
out early there.
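The fold mentioned above can be checked with a minimal sketch; `isEmptyViaStrlen` and `isEmptyViaFirstByte` are hypothetical names for the source-level forms before and after the optimization.

```cpp
#include <cassert>
#include <cstring>

// Original form: walks the whole string just to test for emptiness.
inline bool isEmptyViaStrlen(const char *X) { return std::strlen(X) == 0; }

// Folded form: only the first byte needs to be inspected.
inline bool isEmptyViaFirstByte(const char *X) { return *X == '\0'; }
```

Both forms agree for any valid NUL-terminated string, including a pointer into the middle of a buffer (the GEP-operand case).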
Fixes PR47149.
Reviewed By: lebedev.ri, xbolva00
Differential Revision: https://reviews.llvm.org/D85886
Pavel Labath [Thu, 27 Aug 2020 14:06:59 +0000 (16:06 +0200)]
[lldb/cmake] Fix linking of lldbUtilityHelpers for 9cb222e74
Pavel Labath [Mon, 24 Aug 2020 09:20:25 +0000 (11:20 +0200)]
[lldb] Fix Type::GetByteSize for pointer types
The function was returning an incorrect (empty) value on the first
invocation. Given that this only affected the first invocation, this
bug/typo went mostly unnoticed. DW_AT_const_value attributes were particularly
badly affected by this as the GetByteSize call in
SymbolFileDWARF::ParseVariableDIE is likely to be the first call of this
function, and its effects cannot be undone by retrying.
Depends on D86348.
Differential Revision: https://reviews.llvm.org/D86436
Pavel Labath [Wed, 12 Aug 2020 10:02:37 +0000 (12:02 +0200)]
[cmake] Make gtest include directories a part of the library interface
This applies the same fix that D84748 did for macro definitions.
Appropriate include path is now automatically set for all libraries
which link against gtest targets, which avoids the need to set
include_directories in various parts of the project.
Differential Revision: https://reviews.llvm.org/D86616
Sam McCall [Wed, 26 Aug 2020 08:33:25 +0000 (10:33 +0200)]
[Tooling][Format] Treat compound extensions (foo.bar.cc) as matching foo.h
Motivating use case is ".cu.cc" extensions used in some bazel projects.
Alternative is to work around this with IncludeIsMainRegex in styles.
I proposed this approach because it seems like a better default.
Differential Revision: https://reviews.llvm.org/D86597
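A minimal sketch of the matching idea, assuming it amounts to comparing file names with every extension stripped. The functions stem and is_main_header are hypothetical names for this illustration; the real clang logic has more moving parts (for example IncludeIsMainRegex) that are not modeled here.

```python
import os


def stem(path: str) -> str:
    """Basename with all extensions removed: 'dir/foo.cu.cc' -> 'foo'."""
    return os.path.basename(path).split(".", 1)[0]


def is_main_header(source_path: str, header_path: str) -> bool:
    # Assumption: a header is the "main" include when the stems match.
    return stem(source_path) == stem(header_path)


compound = is_main_header("src/foo.cu.cc", "foo.h")   # compound extension
simple = is_main_header("src/foo.cc", "foo.h")        # ordinary extension
unrelated = is_main_header("src/foo.cu.cc", "bar.h")  # different stem
```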
Pavel Labath [Wed, 26 Aug 2020 12:13:54 +0000 (14:13 +0200)]
[lldb/DWARF] Fix handling of variables with both location and const_value attributes
Class-level static constexpr variables can have both DW_AT_const_value
(in the "declaration") and a DW_AT_location (in the "definition")
attributes. Our code was trying to handle this, but it was brittle and
hard to follow (and broken) because it was processing the attributes in
the order in which they were found.
Refactor the code to make the intent clearer -- DW_AT_location trumps
DW_AT_const_value, and fix the bug which meant that we were not
displaying these variables properly (the culprit was the delayed parsing
of the const_value attribute due to a need to fetch the variable type).
Differential Revision: https://reviews.llvm.org/D86615
Pavel Labath [Mon, 24 Aug 2020 12:03:17 +0000 (14:03 +0200)]
[lldb/Utility] Use APSInt in the Scalar class
This enables us to further simplify some code because it no longer needs
to switch on the signedness of the type (APSInt handles that).
serge-sans-paille [Thu, 27 Aug 2020 12:34:54 +0000 (14:34 +0200)]
Fix OpenMP deduplicateRuntimeCalls return status
Differential Revision: https://reviews.llvm.org/D86705
serge-sans-paille [Thu, 27 Aug 2020 12:34:32 +0000 (14:34 +0200)]
Fix Attributor return status
Differential Revision: https://reviews.llvm.org/D86703
Eduardo Caldas [Thu, 27 Aug 2020 06:00:44 +0000 (06:00 +0000)]
[SyntaxTree][NFC][Style] Functions start with lowercase
Differential Revision: https://reviews.llvm.org/D86682
Eduardo Caldas [Thu, 27 Aug 2020 05:41:26 +0000 (05:41 +0000)]
[SyntaxTree][NFC] Append "get" to syntax Nodes accessor names
Differential Revision: https://reviews.llvm.org/D86679
Raul Tambre [Thu, 20 Aug 2020 17:17:47 +0000 (20:17 +0300)]
[CMake][compiler-rt][libunwind] Compile assembly files as ASM not C, unify workarounds
It isn't very wise to pass an assembly file to the compiler and tell it to compile as a C file and hope that the compiler recognizes it as assembly instead.
Simply don't mark the file as C and CMake will recognize the rest.
This was attempted earlier in https://reviews.llvm.org/D85706, but reverted due to architecture issues on Apple.
Subsequent digging revealed a similar change was done earlier for libunwind in https://reviews.llvm.org/rGb780df052dd2b246a760d00e00f7de9ebdab9d09.
Afterwards workarounds were added for MinGW and Apple:
* https://reviews.llvm.org/rGb780df052dd2b246a760d00e00f7de9ebdab9d09
* https://reviews.llvm.org/rGd4ded05ba851304b26a437896bc3962ef56f62cb
The workarounds in libunwind and compiler-rt are unified and comments added pointing to each other.
The workaround is updated to only be used for MinGW for CMake versions before 3.17, which fixed the issue (https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4287).
Additionally fixed Clang not being passed as the assembly compiler for compiler-rt runtime build.
Example error:
[525/634] Building C object lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o
FAILED: lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o
/opt/tooling/drive/host/bin/clang --target=aarch64-linux-gnu -I/opt/tooling/drive/llvm/compiler-rt/lib/tsan/.. -isystem /opt/tooling/drive/toolchain/opt/drive/toolchain/include -x c -Wall -Wno-unused-parameter -fno-lto -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fno-sanitize=safe-stack -fvisibility=hidden -fno-lto -O3 -gline-tables-only -Wno-gnu -Wno-variadic-macros -Wno-c99-extensions -Wno-non-virtual-dtor -fPIE -fno-rtti -Wframe-larger-than=530 -Wglobal-constructors --sysroot=. -MD -MT lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o -MF lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o.d -o lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o -c /opt/tooling/drive/llvm/compiler-rt/lib/tsan/rtl/tsan_rtl_aarch64.S
/opt/tooling/drive/llvm/compiler-rt/lib/tsan/rtl/tsan_rtl_aarch64.S:29:1: error: expected identifier or '('
.section .text
^
1 error generated.
Differential Revision: https://reviews.llvm.org/D86308
Jay Foad [Wed, 26 Aug 2020 16:06:54 +0000 (17:06 +0100)]
[AMDGPU] Remove unused variable introduced in r251860
Drew Wock [Wed, 26 Aug 2020 19:17:17 +0000 (15:17 -0400)]
[FPEnv] Allow fneg + strict_fadd -> strict_fsub in DAGCombiner
This is the first of a set of DAGCombiner changes enabling strictfp
optimizations. I want to test the waters with this to make sure changes
like these are acceptable for the strictfp case; this particular change
should preserve exception ordering and result precision perfectly, and
many other possible changes appear to be able to as well.
Copied from regular fadd combines but modified to preserve ordering via
the chain, this change allows strict_fadd x, (fneg y) to become
strict_fsub x, y and strict_fadd (fneg x), y to become strict_fsub y, x.
Differential Revision: https://reviews.llvm.org/D85548
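The result-precision claim can be spot-checked on IEEE-754 doubles. The Python sketch below compares x + (-y) against x - y, and (-x) + y against y - x, over a handful of special values. It only checks result values; it says nothing about exception flags or ordering via the chain, and the helper names bits and same are hypothetical.

```python
import math
import struct


def bits(x: float) -> int:
    """Raw 64-bit pattern of a double, so that +0.0 and -0.0 are distinguished."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def same(a: float, b: float) -> bool:
    # NaN payloads are not compared; any NaN matches any NaN.
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return bits(a) == bits(b)


values = [0.0, -0.0, 1.0, -1.5, 1e308, -1e308, 5e-324,
          math.inf, -math.inf, math.nan]

mismatches = [
    (x, y)
    for x in values
    for y in values
    if not same(x + (-y), x - y) or not same((-x) + y, y - x)
]
```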
Martin Storsjö [Wed, 26 Aug 2020 12:47:04 +0000 (15:47 +0300)]
[LLD] [COFF] Check the aux section definition size for IMAGE_COMDAT_SELECT_SAME_SIZE
Binutils generated sections seem to be padded to a multiple of 16 bytes,
but the aux section definition contains the original, unpadded section
length.
The size check used for IMAGE_COMDAT_SELECT_SAME_SIZE previously
only checked the size of the section itself. When checking the
currently processed object file against the previously chosen
comdat section, we easily have access to the aux section definition
of the currently processed section, but we have to iterate over the
symbols of the previously selected object file to find the section
definition of the previously picked section. (We don't want to
inflate SectionChunk to carry more data, for something that is only
needed in corner cases.) Only do this when the mingw flag is set.
This fixes statically linking clang-built C++ object files against
libstdc++ built with GCC, if the object files contain e.g. typeinfo.
Differential Revision: https://reviews.llvm.org/D86659
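A sketch of the size comparison described above, written as a Python model rather than LLD code. ComdatSection, pad16 and same_size are hypothetical names, and the exact decision order (raw sizes first, aux lengths only as a mingw fallback) is an assumption based on this description.

```python
from dataclasses import dataclass


@dataclass
class ComdatSection:
    raw_size: int    # size of the section data as stored in the object file
    aux_length: int  # length recorded in the aux section definition


def pad16(n: int) -> int:
    """Round up to a multiple of 16, as binutils-generated sections appear to be."""
    return (n + 15) // 16 * 16


def same_size(leader: ComdatSection, current: ComdatSection, mingw: bool) -> bool:
    if leader.raw_size == current.raw_size:
        return True
    # Assumed fallback: only in mingw mode, compare the unpadded lengths.
    return mingw and leader.aux_length == current.aux_length


gcc_built = ComdatSection(raw_size=pad16(20), aux_length=20)   # padded to 32
clang_built = ComdatSection(raw_size=20, aux_length=20)        # unpadded
```

With these two sections, the raw sizes differ (32 vs 20), so the check fails unless the mingw fallback is enabled.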
Martin Storsjö [Wed, 26 Aug 2020 13:02:52 +0000 (16:02 +0300)]
[LLD] [MinGW] Enable dynamicbase by default
This matches lld-link's own default.
Add a new command line option --no-dynamicbase for disabling it.
(Unfortunately, GNU ld doesn't yet have a matching --no-dynamicbase
option, as that's the default there.)
Differential Revision: https://reviews.llvm.org/D86654
Russell Gallop [Thu, 27 Aug 2020 11:46:49 +0000 (12:46 +0100)]
Florian Hahn [Thu, 27 Aug 2020 11:39:22 +0000 (12:39 +0100)]
[DSE,MemorySSA] Remove short-cut to check if all paths are covered.
The post-order number early continue does not work in some cases, e.g.
if a path from EarlierAccess to an exit includes a node that dominates
EarlierAccess in a cycle.
The short-cut only has very minor impact on compile-time, so it seems
straight-forward to remove it for now:
http://llvm-compile-time-tracker.com/compare.php?from=062412e79fcfedf2cf004433e42036b0333e3f83&to=d7386016a77ce1387bdbbf360f1de157faea9d31&stat=instructions
Fixes PR47285.
Anatoly Trosinenko [Thu, 27 Aug 2020 10:19:08 +0000 (13:19 +0300)]
[NFC][compiler-rt] Factor out __mulo[sdt]i4 implementations to .inc file
The existing implementations are almost identical except for width of the
integer type.
Factor them out to int_mulo_impl.inc for better maintainability.
This patch is almost identical to D86277.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D86289
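To show what "identical except for width" means, here is a Python model of an overflow-reporting signed multiply with the bit width as a parameter. This is not the contents of int_mulo_impl.inc; mulo, mulo_si, mulo_di and mulo_ti are hypothetical names, and returning the two's complement wrapped product on overflow is an assumption.

```python
def mulo(a: int, b: int, bits: int):
    """Signed multiply at a given width: returns (wrapped product, overflowed)."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    exact = a * b                       # Python ints never overflow
    overflowed = not (lo <= exact <= hi)
    wrapped = (exact - lo) % (1 << bits) + lo   # two's complement wrap-around
    return wrapped, overflowed


# One shared implementation, three widths (32, 64 and 128 bits).
def mulo_si(a, b):
    return mulo(a, b, 32)


def mulo_di(a, b):
    return mulo(a, b, 64)


def mulo_ti(a, b):
    return mulo(a, b, 128)
```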
Anatoly Trosinenko [Thu, 27 Aug 2020 10:19:00 +0000 (13:19 +0300)]
[NFC][compiler-rt] Factor out __mulv[sdt]i3 implementations to .inc file
The existing implementations are almost identical except for width of the
integer type.
Factor them out to int_mulv_impl.inc for better maintainability.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D86277
Andrew Ng [Wed, 26 Aug 2020 16:06:41 +0000 (17:06 +0100)]
[ELF][test] Add test coverage of TLS to gc-sections.s
Differential Revision: https://reviews.llvm.org/D86639
OCHyams [Thu, 27 Aug 2020 09:38:28 +0000 (10:38 +0100)]
Revert "[DWARF] Add cuttoff guarding quadratic validThroughout behaviour"
This reverts commit b9d977b0ca60c54f11615ca9d144c9f08b29fd85.
This cutoff is no longer required. The commit 34ffa7fc501 (D86153)
introduces a performance improvement which was tested against the
motivating case for this patch.
Discussed in differential revision: https://reviews.llvm.org/D86153
OCHyams [Thu, 27 Aug 2020 08:40:53 +0000 (09:40 +0100)]
[DwarfDebug] Improve validThroughout performance (4/4)
Almost NFC (see end).
The backwards scan in validThroughout significantly contributed to compile time
for a pathological case, causing the 'X86 Assembly Printer' pass to account for
roughly 70% of the run time. This patch guards the loop against running
unnecessarily, bringing the pass contribution down to 4%.
Almost NFC: There is a hack in validThroughout which promotes single constant
value DBG_VALUEs in the prologue to be live throughout the function. We're more
likely to hit this code path with this patch applied. Similarly to the parent
patches there is a small coverage change reported in the order of 10s of bytes.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D86153
OCHyams [Thu, 27 Aug 2020 08:40:52 +0000 (09:40 +0100)]
[DwarfDebug] Improve multi-BB single location detection in validThroughout (3/4)
With the changes introduced in D86151 we can now check for single locations
which span multiple blocks for inlined scopes and blocks.
D86151 introduced the InstructionOrdering parameter, replacing a scan through
MBB instructions. The functionality to compare instruction positions across
blocks was added there, and this patch just removes the exit checks that were
previously (but no longer) required.
CTMark shows a geomean binary size reduction of 2.2% for RelWithDebInfo builds.
llvm-locstats (using D85636) shows a very small variable location coverage
change in 5 of 10 binaries, but just like in D86151 it is only in the order of
10s of bytes.
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D86152
OCHyams [Thu, 27 Aug 2020 08:40:50 +0000 (09:40 +0100)]
[DwarfDebug] Improve single location detection in validThroughout (2/4)
With this patch we're now accounting for two more cases which should be
considered 'valid throughout': First, where RangeEnd is ScopeEnd. Second, where
RangeEnd comes before ScopeEnd when including meta instructions, but both are
preceded by the same non-meta instruction.
CTMark shows a geomean binary size reduction of 1.5% for RelWithDebInfo builds.
`llvm-locstats` (using D85636) shows a very small variable location coverage
change in 2 of 10 binaries, but it is in the order of 10s of bytes which lines
up with my expectations.
I've added a test which checks both of these new cases. The first check in the
test isn't strictly necessary for this patch. But I'm not sure that it is
explicitly tested anywhere else, and is useful for the final patch in the
series.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D86151
OCHyams [Thu, 27 Aug 2020 08:40:48 +0000 (09:40 +0100)]
[NFC][DebugInfo] Create InstructionOrdering helper class (1/4)
Group the map and methods used to query instruction ordering for trimVarLocs
(D82129) into a class. This will make it easier to reuse the functionality in
upcoming patches.
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D86150
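A minimal sketch of the idea, assuming the helper boils down to a map from instruction to a function-wide ordinal plus a comparison query. The method names and the list-of-blocks input below are assumptions for illustration, not the interface of the actual C++ class.

```python
class InstructionOrdering:
    """Numbers instructions in program order so positions can be compared."""

    def __init__(self):
        self._index = {}

    def initialize(self, blocks):
        # blocks: iterable of basic blocks, each an iterable of instructions.
        self.clear()
        ordinal = 0
        for block in blocks:
            for inst in block:
                self._index[inst] = ordinal
                ordinal += 1

    def clear(self):
        self._index.clear()

    def is_before(self, a, b) -> bool:
        # Works across block boundaries because numbering is function-wide.
        return self._index[a] < self._index[b]


ordering = InstructionOrdering()
ordering.initialize([["a", "b"], ["c"]])
```

Numbering once and answering queries from the map avoids rescanning a block's instructions on every comparison.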
Cullen Rhodes [Tue, 11 Aug 2020 13:04:21 +0000 (13:04 +0000)]
[Sema][AArch64] Support arm_sve_vector_bits attribute
This patch implements the semantics for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.
The semantics were already implemented by D83551, although the
implementation approach has since changed to represent VLSTs as
VectorType in the AST and fixed-length vectors in the IR everywhere
except in function args/returns. This is described in the prototype
patch D85128 demonstrating the new approach.
The semantic changes added in D83551 are changed since the
AttributedType is replaced by VectorType in the AST. Minimal changes
were necessary in the previous patch as the canonical type for both VLA
and VLS was the same (i.e. sizeless), except in constructs such as
globals and structs where sizeless types are unsupported. This patch
reverts the changes that permitted VLS types that were represented as
sizeless types in such circumstances, and adds support for implicit
casting between VLA <-> VLS types as described in section 3.7.3.2 of the
ACLE.
Since the SVE builtin types for bool and uint8 are both represented as
BuiltinType::UChar in VLSTs, two new vector kinds are implemented to
distinguish predicate and data vectors.
[1] https://developer.arm.com/documentation/100987/latest
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D85736
Florian Hahn [Thu, 27 Aug 2020 10:01:20 +0000 (11:01 +0100)]
[DSE,MemorySSA] Add test for PR47285.
Vitaly Buka [Thu, 27 Aug 2020 10:24:21 +0000 (03:24 -0700)]
[NFC][ValueTracking] Cleanup a test
Mikhail Maltsev [Thu, 27 Aug 2020 10:06:45 +0000 (11:06 +0100)]
[AArch64] Optimize instruction selection for certain vector shuffles
This patch adds code to recognize vector shuffles which can be
represented as VDUP (splat) of a vector lane of a different
(wider) type than the original vector lane type.
For example:
shufflevector <4 x i16> %v, <4 x i16> undef, <4 x i32> <i32 0, i32 1, i32 0, i32 1>
is essentially:
shufflevector <2 x i32> %v, <2 x i32> undef, <2 x i32> <i32 0, i32 0>
Such patterns are generated by the SelectionDAG machinery in some cases
(see DAGCombiner::visitBITCAST in DAGCombiner.cpp, the "Remove double
bitcasts from shuffles" part).
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D86225
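The mask check behind the example above can be sketched in Python. The function wider_splat_lane is a hypothetical name, not the AArch64 backend code, and undef mask elements are not handled in this sketch.

```python
def wider_splat_lane(mask, factor):
    """Return the wide lane index if `mask` (over narrow lanes) is a splat of
    one lane that is `factor` times wider; otherwise return None."""
    if factor <= 0 or not mask or len(mask) % factor:
        return None
    first = mask[0]
    if first % factor:
        return None  # the splatted group must start on a wide-lane boundary
    for i, m in enumerate(mask):
        # Each group of `factor` narrow lanes must repeat first, first+1, ...
        if m != first + i % factor:
            return None
    return first // factor


lane = wider_splat_lane([0, 1, 0, 1], 2)        # <4 x i16> mask viewed as <2 x i32>
not_a_splat = wider_splat_lane([0, 1, 2, 3], 2)  # identity shuffle
```

For the mask <0, 1, 0, 1> with factor 2 the result is lane 0, matching the <2 x i32> shuffle with mask <0, 0> shown above.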
Vitaly Buka [Thu, 27 Aug 2020 10:00:56 +0000 (03:00 -0700)]
[NFC][ValueTracking] Fix typo in test