Mogball [Tue, 16 Nov 2021 23:59:37 +0000 (23:59 +0000)]
[mlir][lsp] Use ResultGroupDefinition struct
This struct was added and was intended to be used, but it was missed in the original patch.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D114041
Mircea Trofin [Wed, 17 Nov 2021 00:35:06 +0000 (16:35 -0800)]
Revert "Reland "[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'"""
This reverts commit 1ee32055ea1dd4db70d1939cbd4f5105c2dce160.
We hit additional bot failures; in particular, Fuchsia's seems to be
related to how CMakeLists are ingested, see https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8830380874445931681/overview
Shoaib Meenai [Sat, 13 Nov 2021 00:07:07 +0000 (16:07 -0800)]
[MachO] Shrink reloc from 32 bytes to 24 bytes
The `r_address` field of `relocation_info` is only 4 bytes, so our
offset field (which is the `r_address` field adjusted for subsection
splitting) also only needs to be 4 bytes. This reduces the structure
size from 32 bytes to 24 bytes.
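A minimal sketch of why narrowing one field can save a whole 8-byte word. The two structs and their field names are hypothetical stand-ins, not the lld definitions, and the 32/24 figures assume a typical 64-bit ABI with 8-byte pointers:

```cpp
#include <cstdint>

// Hypothetical relocation record with an 8-byte offset. The three 1-byte
// fields are followed by 5 bytes of padding so that `offset` is 8-aligned.
struct RelocWide {
  uint8_t type;
  bool pcrel;
  uint8_t length;
  uint64_t offset;
  int64_t addend;
  void *referent;
};

// Same record with a 4-byte offset. The small fields and `offset` now share
// the first 8-byte word, so one word disappears from the layout.
struct RelocNarrow {
  uint8_t type;
  bool pcrel;
  uint8_t length;
  uint32_t offset;
  int64_t addend;
  void *referent;
};

static_assert(sizeof(RelocNarrow) < sizeof(RelocWide),
              "narrowing the offset should shrink the record");
// Only checked where pointers are 8 bytes wide (assumption of this sketch).
static_assert(sizeof(void *) != 8 ||
                  (sizeof(RelocWide) == 32 && sizeof(RelocNarrow) == 24),
              "expected 32 -> 24 bytes on a typical 64-bit ABI");
```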
Combined with https://reviews.llvm.org/D113813, this is a minor perf
improvement for linking an internal app, tested on two machines:
```
             smol-relocs       baseline          difference (95% CI)
sys_time     7.367 ± 0.138     7.543 ± 0.157     [ +0.9% .. +3.8%]
user_time    21.843 ± 0.351    21.861 ± 0.450    [ -1.3% .. +1.4%]
wall_time    20.301 ± 0.307    20.556 ± 0.324    [ +0.1% .. +2.4%]
samples      16                16

             smol-relocs       baseline          difference (95% CI)
sys_time     2.923 ± 0.050     2.992 ± 0.018     [ +1.4% .. +3.4%]
user_time    10.345 ± 0.039    10.448 ± 0.023    [ +0.8% .. +1.2%]
wall_time    12.068 ± 0.071    12.229 ± 0.021    [ +1.0% .. +1.7%]
samples      15                12
```
More importantly though, this change by itself reduces our maximum
resident set size by 220 MB (2.75%, from 7.85 GB to 7.64 GB) on the
first machine. On the second machine, it reduces it by 125 MB (1.94%,
from 6.31 GB to 6.19 GB).
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113818
Shoaib Meenai [Fri, 12 Nov 2021 23:33:21 +0000 (15:33 -0800)]
[MachO] Reduce size of Symbol and Defined
We can lay out Symbol more optimally to reduce its size from 56 bytes to
48 bytes by eliminating unnecessary padding, and we can lay out Defined
such that its bitfield members are placed in the tail padding of Symbol
(on ABIs which support this), to reduce it from 96 bytes to 80 bytes (8
bytes from the Symbol reduction, and 8 bytes from the tail padding
reuse).
This is perf-neutral for an internal app (results from two different
machines):
```
             smol-syms         baseline          difference (95% CI)
sys_time     7.430 ± 0.202     7.440 ± 0.193     [ -2.6% .. +2.9%]
user_time    21.443 ± 0.513    21.206 ± 0.396    [ -3.3% .. +1.1%]
wall_time    20.453 ± 0.534    20.222 ± 0.488    [ -3.7% .. +1.5%]
samples      9                 8

             smol-syms         baseline          difference (95% CI)
sys_time     3.011 ± 0.050     3.040 ± 0.052     [ -0.4% .. +2.3%]
user_time    10.416 ± 0.075    10.496 ± 0.091    [ +0.1% .. +1.4%]
wall_time    12.229 ± 0.144    12.354 ± 0.192    [ -0.1% .. +2.1%]
samples      14                13
```
However, on the first machine, it reduces maximum resident set size by
65.9 MB (0.8%, from 7.92 GB to 7.85 GB). On the second machine, it
reduces it by 92 MB (1.4%, from 6.40 GB to 6.31 GB).
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113813
Shoaib Meenai [Fri, 12 Nov 2021 23:10:59 +0000 (15:10 -0800)]
[MachO] Fix struct size assertion
It was checking for 64-bit builds incorrectly. Unfortunately,
ConcatInputSection has grown a bit in the meantime, and I don't see any
obvious way to shrink it. Perhaps icfEqClass could use 32-bit hashes
instead of 64-bit ones, but xxHash64 is supposed to be much faster than
xxHash32 (https://github.com/Cyan4973/xxHash#benchmarks), so that sounds
like a loss. (Unrelatedly, we should really look at using XXH3 instead
of xxHash64 now.)
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D113809
Jim Ingham [Tue, 9 Nov 2021 22:25:03 +0000 (14:25 -0800)]
Make it possible for lldb to launch a remote binary with no local file.
We don't actually need a local copy of the main executable to debug
a remote process. So instead of treating "no local module" as an error,
see if the LaunchInfo has an executable it wants lldb to use, and if so
use it. Then report whatever error the remote server returns.
Differential Revision: https://reviews.llvm.org/D113521
Nico Weber [Wed, 17 Nov 2021 00:00:23 +0000 (19:00 -0500)]
[gn build] (manually) port 1ee32055ea1d more (benchmark move)
The new benchmark dep apparently has more source files than the old one.
Nico Weber [Tue, 16 Nov 2021 23:33:32 +0000 (18:33 -0500)]
[gn build] (manually) port 1ee32055ea1d (benchmark move)
Matthias Springer [Tue, 16 Nov 2021 23:07:12 +0000 (08:07 +0900)]
[ADT] Add unit test for EquivalenceClasses comparator
This unit test covers new functionality added by D112052.
Differential Revision: https://reviews.llvm.org/D113461
Aaron Puchert [Tue, 16 Nov 2021 22:57:41 +0000 (23:57 +0100)]
Don't add irrelevant items to queue in DwarfCompileUnit::createScopeChildrenDIE (NFC)
Instead of popping them and then immediately throwing them away, we can
just filter out globals and items in different scopes before adding them
to WorkList. Shouldn't change anything but keep the queue smaller.
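The idea of filtering before enqueueing can be sketched as follows. `Item`, `buildWorkList`, and the scope identifier are hypothetical stand-ins for illustration and are not the types used in DwarfCompileUnit:

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical stand-in for a debug-info node; not an LLVM type.
struct Item {
  int scopeId;
  bool isGlobal;
};

// Filter while enqueueing, so the queue never holds items that would be
// discarded immediately after being popped.
std::deque<Item> buildWorkList(const std::vector<Item> &candidates,
                               int scopeId) {
  std::deque<Item> workList;
  for (const Item &item : candidates)
    if (!item.isGlobal && item.scopeId == scopeId)
      workList.push_back(item);
  // The queue can only be as large as the candidate list, usually smaller.
  assert(workList.size() <= candidates.size());
  return workList;
}
```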
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D113864
Aaron Puchert [Tue, 16 Nov 2021 22:56:21 +0000 (23:56 +0100)]
[DebugInfo] Use DbgEntityKind in DbgEntity interface (NFC)
It was being used occasionally already, and using it on the constructor
and getDbgEntityID has obvious type safety benefits.
Also use llvm_unreachable in the switch as usual, but since only these
two values are used in constructor calls I think it's still NFC.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D113862
Benjamin Kramer [Tue, 16 Nov 2021 22:51:45 +0000 (23:51 +0100)]
[PowerPC] Fix a nullptr dereference
LiMI1/LiMI2 can be null, so don't call a method on them before checking.
Found by ubsan.
David Green [Tue, 16 Nov 2021 22:48:45 +0000 (22:48 +0000)]
[ARM] Update test comments after D114018. NFC
The TODOs are now fixed as of D114018, so can be removed. Which is nice.
Leonard Chan [Tue, 16 Nov 2021 22:46:02 +0000 (14:46 -0800)]
Limit test to x86 for now.
Duncan P. N. Exon Smith [Tue, 16 Nov 2021 22:33:10 +0000 (14:33 -0800)]
Coverage: Fix iterated type for LineCoverageIterator
LineCoverageIterator is not providing access to a mutable object. Fix it
to iterate over `const LineCoverageStats` so that `operator->()`
compiles again after 6b9b86db9dd974585a5c71cf2e5231d1e3385f7c.
Lawrence D'Anna [Tue, 16 Nov 2021 22:32:03 +0000 (14:32 -0800)]
[lldb] use EXT_SUFFIX for python extension
LLDB doesn't use only the python stable ABI, which means loading
it into an incompatible python can cause the process to crash.
_lldb.so should be named with the full EXT_SUFFIX from sysconfig
-- such as _lldb.cpython-39-darwin.so -- so this doesn't happen.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D112972
Siva Chandra Reddy [Tue, 16 Nov 2021 22:23:33 +0000 (22:23 +0000)]
[libc][NFC][Obvious] Fix the benchmarks after the switch to llvm/third-party
Leonard Chan [Wed, 20 Oct 2021 20:03:49 +0000 (13:03 -0700)]
[llvm-objcopy] Add --update-section
This is another attempt at D59351 which attempted to add --update-section, but
with some heuristics for adjusting segment/section offsets/sizes in the event
the data copied into the section is larger than the original size of the section.
We are opting to not support this case. GNU's objcopy was able to do this because
the linker and objcopy are tightly coupled enough that segment reformatting was
simpler. This is not the case with llvm-objcopy and lld, which are kept separate.
This will attempt to copy data into the section without changing any other
properties of the parent segment (if the section is part of one).
Differential Revision: https://reviews.llvm.org/D112116
Lawrence D'Anna [Tue, 16 Nov 2021 21:50:18 +0000 (13:50 -0800)]
[lldb] fix -print-script-interpreter-info on windows
Apparently "{sys.prefix}/bin/python3" isn't where you find the
python interpreter on windows, so the test I wrote for
-print-script-interpreter-info is failing.
We can't rely on sys.executable at runtime, because that will point
to lldb.exe not python.exe.
We can't just record sys.executable from build time, because python
could have been moved to a different location.
But it should be OK to apply relative path from sys.prefix to sys.executable
from build-time to the sys.prefix at runtime.
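A minimal sketch of the path arithmetic described above, assuming C++17 `std::filesystem`. The helper name `rebaseInterpreter` and the example paths are hypothetical, and this is not the code from the patch:

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: take the position of the interpreter relative to the
// prefix as recorded at build time, and re-apply it to the runtime prefix.
fs::path rebaseInterpreter(const fs::path &buildPrefix,
                           const fs::path &buildExecutable,
                           const fs::path &runtimePrefix) {
  // Purely lexical: no filesystem access, so the build-time paths do not
  // need to exist on the machine running this code.
  fs::path relative = buildExecutable.lexically_relative(buildPrefix);
  assert(!relative.empty() && "executable and prefix must be comparable");
  return (runtimePrefix / relative).lexically_normal();
}
```

With a build-time prefix of `/opt/py` and executable `/opt/py/bin/python3`, a runtime prefix of `/usr/local/py` yields `/usr/local/py/bin/python3`; a layout where the interpreter sits directly in the prefix works the same way.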
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D113650
Evgenii Stepanov [Tue, 16 Nov 2021 20:15:32 +0000 (12:15 -0800)]
[scudo] Regression test for the MTE crash in storeEndMarker.
The original problem was fixed in D105261.
Differential Revision: https://reviews.llvm.org/D114022
Martin Storsjö [Sat, 23 Oct 2021 22:11:20 +0000 (01:11 +0300)]
[runtimes] Fix building initial libunwind+libcxxabi+libcxx with compiler implied -lunwind
This does mostly the same as D112126, but for the runtimes cmake files.
Most of that is straightforward, but the interdependency between
libcxx and libunwind is tricky:
Libunwind is built at the same time as libcxx, but libunwind is not
installed yet. LIBCXXABI_USE_LLVM_UNWINDER makes libcxx link directly
against the just-built libunwind, but the compiler implicit -lunwind
isn't found. This patch avoids that by adding --unwindlib=none if
supported, if we are going to link explicitly against a newly built
unwinder anyway.
Differential Revision: https://reviews.llvm.org/D113253
Louis Dionne [Tue, 16 Nov 2021 16:43:42 +0000 (11:43 -0500)]
[runtimes] Fix incorrect comment about the purpose of LLVM_DEFAULT_TARGET_TRIPLE
5beec6fb04e7 added LLVM_DEFAULT_TARGET_TRIPLE to the runtimes build with
a comment, however I believe that comment had been copied from the LLVM
build tree. In the context of the runtimes, LLVM_DEFAULT_TARGET_TRIPLE
is used to set what targets we are building for, not the target for which
we "generate code".
Differential Revision: https://reviews.llvm.org/D114007
Danila Kutenin [Tue, 16 Nov 2021 20:48:59 +0000 (15:48 -0500)]
[libc++] Unspecified behavior randomization in libc++
This effort aims to deflake users' tests that depend on the unspecified
behavior of algorithms and containers. It might also help with updating
the sorting algorithm in libcxx, which has a quadratic worst case, in the
future, or at least with creating a new one under a flag.
For detailed design, please see the design doc I provide in the patch.
Differential Revision: https://reviews.llvm.org/D96946
Simon Pilgrim [Tue, 16 Nov 2021 20:46:04 +0000 (20:46 +0000)]
[X86] Add shift by splat modulo amount vector tests
Shows failure to fold zero_extend_vector_inreg(and(x, c)) -> bitcast(and(x,c')) when we're only demanding the 0'th extended element, such as with the SSE variable shift ops.
Louis Dionne [Tue, 16 Nov 2021 20:32:15 +0000 (15:32 -0500)]
[libc++] Adjust comment about ABI change and std::bad_function_call
Valentin Clement [Tue, 16 Nov 2021 19:37:08 +0000 (20:37 +0100)]
[fir] Add fir.string_lit conversion
This patch adds the conversion pattern for
the fir.string_lit operation.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D113992
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Philip Reames [Tue, 16 Nov 2021 19:53:14 +0000 (11:53 -0800)]
[SCEV] Canonicalize X - urem X, Y patterns
There are multiple possible ways to represent the X - urem X, Y pattern. SCEV was not canonicalizing, and thus, depending on which you analyzed, you could get different results. The sub representation appears to produce strictly inferior results in practice, so I decided to canonicalize to the Y * X/Y version.
The motivation here is that runtime unroll produces the sub X - (and X, Y-1) pattern when Y is a power of two. SCEV is thus unable to recognize that an unrolled loop exits because we don't figure out that the new unrolled step evenly divides the trip count of the unrolled loop. After instcombine runs, we convert to the andn form, which SCEV recognizes, so essentially, this is just fixing a nasty pass ordering dependency.
The ARM loop hardware interaction in the test diff is opaque to me, but the review comments from others with knowledge of the infrastructure appear to indicate these are improvements in loop recognition, not regressions.
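The equivalence of the representations discussed above can be checked by brute force on unsigned integers. This is only an arithmetic sketch with hypothetical function names; it does not use or model the SCEV API:

```cpp
#include <cstdint>

// X - (X urem Y): the "sub" representation.
constexpr uint32_t subForm(uint32_t x, uint32_t y) { return x - x % y; }

// Y * (X udiv Y): the form the message picks as canonical.
constexpr uint32_t mulForm(uint32_t x, uint32_t y) { return y * (x / y); }

// X - (X & (Y - 1)): the pattern the message attributes to runtime
// unrolling when Y is a power of two.
constexpr uint32_t maskForm(uint32_t x, uint32_t y) {
  return x - (x & (y - 1));
}

constexpr bool isPowerOfTwo(uint32_t y) { return y != 0 && (y & (y - 1)) == 0; }

// Brute-force comparison over a small range. y starts at 1 because the
// remainder and quotient are undefined for a zero divisor.
constexpr bool formsAgree(uint32_t limit) {
  for (uint32_t x = 0; x < limit; ++x)
    for (uint32_t y = 1; y < limit; ++y) {
      if (subForm(x, y) != mulForm(x, y))
        return false;
      if (isPowerOfTwo(y) && maskForm(x, y) != subForm(x, y))
        return false;
    }
  return true;
}

static_assert(formsAgree(64), "all three forms should agree");
```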
Differential Revision: https://reviews.llvm.org/D114018
River Riddle [Tue, 16 Nov 2021 19:48:26 +0000 (19:48 +0000)]
[mlir] Fix clang5 build after D113641
Pirama Arumuga Nainar [Tue, 16 Nov 2021 01:15:58 +0000 (17:15 -0800)]
[compiler-rt/profile] Reland mark __llvm_profile_raw_version as hidden
Since libclang_rt.profile is added later in the command line, a
definition of __llvm_profile_raw_version is not included if it is
provided from an earlier object, e.g. from a shared dependency.
This causes an extra dependence edge where if libA.so depends on libB.so
and both are coverage-instrumented, libA.so uses libB.so's definition of
__llvm_profile_raw_version. This leads to a runtime link failure if the
libB.so available at runtime does not provide this symbol (but provides
the other dependent symbols). Such a scenario can occur in Android's
mainline modules.
E.g.:
ld -o libB.so libclang_rt.profile-x86_64.a
ld -o libA.so -l B libclang_rt.profile-x86_64.a
libB.so has a global definition of __llvm_profile_raw_version. libA.so
uses libB.so's definition of __llvm_profile_raw_version. At runtime,
libB.so may not be coverage-instrumented (i.e. not export
__llvm_profile_raw_version) so runtime linking of libA.so will fail.
Marking this symbol as hidden forces each binary to use the definition
of __llvm_profile_raw_version from libclang_rt.profile. The visibility
is unchanged for Apple platforms where its presence is checked by the
TAPI tool.
Reviewed By: MaskRay, phosek, davidxl
Differential Revision: https://reviews.llvm.org/D111759
V Donaldson [Tue, 16 Nov 2021 18:25:28 +0000 (10:25 -0800)]
[flang] Fix a bug in INQUIRE(IOLENGTH=) output
The inquire by output list form of the INQUIRE statement calculates the
number of file storage units that would be required to store the data
of an output list in an unformatted file. Currently, the result is
incorrectly multiplied by the number of bytes for a data type. A query
for "INTEGER(KIND=4) A(10)" should be 40, not 160.
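The arithmetic behind the example can be written out as below. The function names are hypothetical, and the sketch assumes one file storage unit is one byte, which is what the 40 versus 160 figures imply:

```cpp
#include <cstddef>

// Assumption for this sketch: one file storage unit equals one byte.
constexpr std::size_t ioLength(std::size_t elementCount,
                               std::size_t bytesPerElement) {
  return elementCount * bytesPerElement;
}

// The behavior described as the bug: the element size is applied twice.
constexpr std::size_t ioLengthWithBug(std::size_t elementCount,
                                      std::size_t bytesPerElement) {
  return ioLength(elementCount, bytesPerElement) * bytesPerElement;
}

// INTEGER(KIND=4) A(10): ten elements of four bytes each.
static_assert(ioLength(10, 4) == 40, "expected result");
static_assert(ioLengthWithBug(10, 4) == 160, "value quoted as incorrect");
```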
Update formatting.
Tue Ly [Sat, 13 Nov 2021 00:01:48 +0000 (19:01 -0500)]
[libc] Correct rounding for hexadecimalStringToFloat with long inputs.
Update binaryExpTofloat so that it will round correctly for long inputs when converting hexadecimal strings to floating points.
Differential Revision: https://reviews.llvm.org/D113815
Konstantin Varlamov [Mon, 15 Nov 2021 20:40:55 +0000 (12:40 -0800)]
[libc++] Always define a key function for std::bad_function_call in the dylib
However, whether applications rely on the std::bad_function_call vtable
being in the dylib is still controlled by the ABI macro, since changing
that would be an ABI break.
Also separate preprocessor definitions for whether to use a key function
and whether to use a `bad_function_call`-specific `what` message
(`what` message is mandated by [LWG2233](http://wg21.link/LWG2233)).
Differential Revision: https://reviews.llvm.org/D92397
Arthur Eubanks [Tue, 16 Nov 2021 19:05:18 +0000 (11:05 -0800)]
[Loads] Handle addrspacecast constant expressions when determining dereferenceability
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D114015
Nikolas Klauser [Sun, 14 Nov 2021 17:37:27 +0000 (18:37 +0100)]
[libc++] [NFC] Disable clang-tidy's readability-identifier-naming check
In libc++ most of the names are not conforming to the llvm style. Removing the readability-identifier-naming check removes almost all clang-tidy warnings. For example in `<string>` the warning count goes from 1001 warnings down to 7.
Reviewed By: #libc, Mordante, ldionne
Spies: Mordante, Quuxplusone, aheejin, libcxx-commits, carlosgalvezp
Differential Revision: https://reviews.llvm.org/D113849
Victor Huang [Tue, 16 Nov 2021 15:59:05 +0000 (09:59 -0600)]
[PowerPC] PPC backend optimization on conditional trap instructions
This patch adds PPC back end optimization to analyze the arguments of a
conditional trap instruction to execute one of the following:
1. Delete it if it never traps
2. Replace it if it always traps
3. Otherwise keep it
Reviewed By: nemanjai, amyk, PowerPC
Differential revision: https://reviews.llvm.org/D111434
Arthur Eubanks [Tue, 16 Nov 2021 19:07:03 +0000 (11:07 -0800)]
[test] Precommit test for D114015
Hongtao Yu [Thu, 11 Nov 2021 17:52:41 +0000 (09:52 -0800)]
[CSSPGO] Fix a hash code truncating issue in ContextTrieNode.
std::hash returns a 64bit hash code while previously we were using only lower 32 bits which caused hash collision for large workloads.
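A small illustration of how truncation creates collisions. The two hash values are made up for the example, and `truncatedHash` is a hypothetical helper rather than code from ContextTrieNode:

```cpp
#include <cstdint>

// Keeping only the lower 32 bits discards everything that distinguishes
// values differing solely in their upper half.
constexpr uint32_t truncatedHash(uint64_t hash) {
  return static_cast<uint32_t>(hash);
}

// Illustrative values only: same low 32 bits, different high 32 bits.
constexpr uint64_t hashA = 0x00000001DEADBEEFULL;
constexpr uint64_t hashB = 0x00000002DEADBEEFULL;

static_assert(hashA != hashB, "distinct as 64-bit hash codes");
static_assert(truncatedHash(hashA) == truncatedHash(hashB),
              "identical once truncated to 32 bits");
```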
Reviewed By: wenlei, wlei
Differential Revision: https://reviews.llvm.org/D113688
River Riddle [Tue, 16 Nov 2021 17:59:45 +0000 (17:59 +0000)]
[llvm] Add a SFINAE template parameter to DenseMapInfo
This allows for using SFINAE partial specialization for DenseMapInfo.
In MLIR, this is particularly useful as it will allow for defining partial
specializations that support all Attribute, Op, and Type classes without
needing to specialize DenseMapInfo for each individual class.
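The general technique looks roughly like this. `MapInfo`, `AttributeBase`, and the derived classes are hypothetical names for illustration; this is not the actual `llvm::DenseMapInfo` definition:

```cpp
#include <type_traits>

// Hypothetical trait with an extra defaulted parameter that exists only so
// partial specializations can be constrained with SFINAE.
template <typename T, typename Enable = void>
struct MapInfo {
  static constexpr bool isSpecialized = false;
};

// Hypothetical class family standing in for Attribute, Op, or Type classes.
struct AttributeBase {};
struct FooAttr : AttributeBase {};
struct BarAttr : AttributeBase {};
struct Unrelated {};

// One partial specialization covers every class derived from AttributeBase,
// so no per-class specialization is required.
template <typename T>
struct MapInfo<T, std::enable_if_t<std::is_base_of<AttributeBase, T>::value>> {
  static constexpr bool isSpecialized = true;
};

static_assert(MapInfo<FooAttr>::isSpecialized, "covered by the family rule");
static_assert(MapInfo<BarAttr>::isSpecialized, "covered by the family rule");
static_assert(!MapInfo<Unrelated>::isSpecialized, "falls back to primary");
```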
Differential Revision: https://reviews.llvm.org/D113641
Mircea Trofin [Mon, 15 Nov 2021 21:27:00 +0000 (13:27 -0800)]
[NFC][Regalloc] Factor out eviction decision from eviction attempt
This splits tryEvict into a const tryFindEvictionCandidate, which
attempts to find a candidate, and the actual eviction (should the former
be successful)
The newly introduced tryFindEvictionCandidate will move subsequently
into the RegAllocEvictionAdvisor.
RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html
Differential Revision: https://reviews.llvm.org/D113941
Geoffrey Martin-Noble [Tue, 16 Nov 2021 18:40:48 +0000 (10:40 -0800)]
[Bazel] Update .bazelignore for moved google/benchmark
We need to avoid directly processing the Bazel config in LLVM's copy of
google/benchmark, which was moved in
https://github.com/llvm/llvm-project/commit/1ee32055ea.
Differential Revision: https://reviews.llvm.org/D114014
Mircea Trofin [Tue, 16 Nov 2021 17:44:26 +0000 (09:44 -0800)]
Reland "[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'""
This reverts commit e7568b68da8a216dc22cdc1c6d8903c94096c846 and relands
c6f7b720ecfa6db40c648eb05e319f8a817110e9.
The culprit: we missed that libc also had a dependency on one of the
copies of `google-benchmark`.
Also opportunistically fixed indentation from prev. change.
Differential Revision: https://reviews.llvm.org/D112012
Duncan P. N. Exon Smith [Tue, 16 Nov 2021 01:22:30 +0000 (17:22 -0800)]
DebugInfo: Make DWARFExpression::iterator a const iterator
3d1d8c767be5537eb5510ee0522e2f3475fe7c04 made
DWARFExpression::iterator's Operation member `mutable`. After a few prep
commits, the iterator can instead be made a `const` iterator since no
caller can change the Operation.
Differential Revision: https://reviews.llvm.org/D113958
Duncan P. N. Exon Smith [Tue, 16 Nov 2021 01:15:41 +0000 (17:15 -0800)]
DebugInfo: Stop modifying Operation::Error inside of verify()
The only caller of Operation::verify() is DWARFExpression::verify(),
which iterates past the (ephemeral) Operation immediately after.
- Stop setting Operation::Error because the mutation will never be
observed.
- Change verify() to a static function to be sure all callers are
updated.
Differential Revision: https://reviews.llvm.org/D113957
Quinn Pham [Mon, 8 Nov 2021 20:02:06 +0000 (14:02 -0600)]
[NFC][clang] Inclusive language: Rename myMaster in testcase
[NFC] As part of using inclusive language within the llvm project, this patch
replaces `_myMaster` with `_myLeader` in these testcases.
Reviewed By: ZarkoCA
Differential Revision: https://reviews.llvm.org/D113433
Louis Dionne [Tue, 16 Nov 2021 17:43:24 +0000 (12:43 -0500)]
[libc++] Perform the bootstrapping build before legacy builds in CI
This is to help reduce latency by running longer jobs before shorter ones.
River Riddle [Tue, 16 Nov 2021 17:21:15 +0000 (17:21 +0000)]
[mlir][NFC] Replace references to Identifier with StringAttr
This is part of the replacement of Identifier with StringAttr.
Differential Revision: https://reviews.llvm.org/D113953
Mircea Trofin [Tue, 16 Nov 2021 17:28:50 +0000 (09:28 -0800)]
Revert "[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'"
This reverts commit c6f7b720ecfa6db40c648eb05e319f8a817110e9.
Some buildbots are failing, will investigate and reland.
Example:
https://lab.llvm.org/buildbot#builders/138/builds/14067
https://lab.llvm.org/buildbot#builders/73/builds/20159
Philip Reames [Tue, 16 Nov 2021 17:23:41 +0000 (09:23 -0800)]
[tests] Add coverage for different forms of X - urem X, Y
Philip Reames [Tue, 16 Nov 2021 17:22:18 +0000 (09:22 -0800)]
autogen a SCEV test file
Adrian Prantl [Tue, 16 Nov 2021 17:17:32 +0000 (09:17 -0800)]
fix decorator
Mircea Trofin [Thu, 14 Oct 2021 05:43:08 +0000 (22:43 -0700)]
[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'
under third-party
This change:
- moves the libcxx copy of `google/benchmark` to
`third-party/benchmark`
- points the 2 uses of the library (libcxx and llvm/utils) to this copy
We picked the libcxx copy because it is the most up to date.
Differential Revision: https://reviews.llvm.org/D112012
Adrian Prantl [Tue, 16 Nov 2021 17:09:49 +0000 (09:09 -0800)]
Skip tests on older versions of clang
Adrian Prantl [Tue, 16 Nov 2021 16:45:40 +0000 (08:45 -0800)]
Increase gdbremote timeout.
I'm still seeing random timeouts on green dragon.
Adrian Prantl [Tue, 16 Nov 2021 16:44:15 +0000 (08:44 -0800)]
typo
Kazu Hirata [Tue, 16 Nov 2021 17:01:56 +0000 (09:01 -0800)]
[llvm] Use range-for loops (NFC)
William S. Moses [Tue, 16 Nov 2021 01:13:33 +0000 (20:13 -0500)]
[MLIR][LLVM] Permit integer types in switch other than i32
LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types, leading to an illegal state when lowering an i8 switch from MLIR Standard.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D113955
Philip Reames [Tue, 16 Nov 2021 16:48:16 +0000 (08:48 -0800)]
Add a hasPoisonGeneratingFlags proxy wrapper to Instruction [NFC]
This just cuts down on casts to Operator.
Nilay Vaish [Tue, 16 Nov 2021 16:37:55 +0000 (11:37 -0500)]
[libc++] Add introsort to avoid O(n^2) behavior
This commit adds a benchmark that tests std::sort on adversarial inputs,
and uses introsort in std::sort to avoid O(n^2) behavior on adversarial
inputs.
For inputs where partitions are unbalanced even after 2 log(n) pivots have
been selected, the algorithm switches to heap sort to avoid the
possibility of spending O(n^2) time on sorting the input.
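The switch-over can be sketched as below. This is not the libc++ implementation: the last-element pivot, the `sketch` namespace, and the helper names are assumptions chosen to keep the example short, and the naive pivot makes already-sorted input the adversarial case:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch {

// Lomuto partition of v[lo, hi) around its last element; returns the
// pivot's final index.
inline std::size_t partitionLast(std::vector<int> &v, std::size_t lo,
                                 std::size_t hi) {
  int pivot = v[hi - 1];
  std::size_t store = lo;
  for (std::size_t i = lo; i + 1 < hi; ++i)
    if (v[i] < pivot)
      std::swap(v[i], v[store++]);
  std::swap(v[store], v[hi - 1]);
  return store;
}

// Sorts v[lo, hi). depthBudget is the number of pivot selections still
// allowed on this path before falling back to heap sort.
inline void introsortImpl(std::vector<int> &v, std::size_t lo, std::size_t hi,
                          unsigned depthBudget) {
  while (hi - lo > 1) {
    if (depthBudget == 0) {
      // Budget exhausted: heap sort bounds this range at O(n log n).
      auto first = v.begin() + static_cast<std::ptrdiff_t>(lo);
      auto last = v.begin() + static_cast<std::ptrdiff_t>(hi);
      std::make_heap(first, last);
      std::sort_heap(first, last);
      return;
    }
    --depthBudget;
    std::size_t pivotIndex = partitionLast(v, lo, hi);
    introsortImpl(v, lo, pivotIndex, depthBudget); // left part: recurse
    lo = pivotIndex + 1;                           // right part: iterate
  }
}

inline void introsort(std::vector<int> &v) {
  unsigned log2n = 0;
  for (std::size_t n = v.size(); n > 1; n >>= 1)
    ++log2n;
  introsortImpl(v, 0, v.size(), 2 * log2n); // budget of 2 log(n) pivots
  assert(std::is_sorted(v.begin(), v.end()));
}

} // namespace sketch
```

Calling `sketch::introsort` on an already-sorted vector exhausts the budget after a bounded number of unbalanced partitions and finishes the remaining range with heap sort, instead of continuing the quadratic partitioning.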
Benchmark results show that the intro sort implementation does
significantly better.
Benchmarking results before this change. Time represents the sorting
time required per element:
------------------------------------------------------------------------------
Benchmark                                        Time          CPU  Iterations
------------------------------------------------------------------------------
BM_Sort_uint32_QuickSortAdversary_1           3.75 ns      3.74 ns   187432960
BM_Sort_uint32_QuickSortAdversary_4           3.05 ns      3.05 ns   231211008
BM_Sort_uint32_QuickSortAdversary_16          2.45 ns      2.45 ns   288096256
BM_Sort_uint32_QuickSortAdversary_64          32.8 ns      32.8 ns    21495808
BM_Sort_uint32_QuickSortAdversary_256          132 ns       132 ns     5505024
BM_Sort_uint32_QuickSortAdversary_1024         498 ns       497 ns     1572864
BM_Sort_uint32_QuickSortAdversary_16384       3846 ns      3845 ns      262144
BM_Sort_uint32_QuickSortAdversary_262144     61431 ns     61400 ns      262144
BM_Sort_uint64_QuickSortAdversary_1           3.93 ns      3.92 ns   181141504
BM_Sort_uint64_QuickSortAdversary_4           3.10 ns      3.09 ns   222560256
BM_Sort_uint64_QuickSortAdversary_16          2.50 ns      2.50 ns   283639808
BM_Sort_uint64_QuickSortAdversary_64          33.2 ns      33.2 ns    21757952
BM_Sort_uint64_QuickSortAdversary_256          132 ns       132 ns     5505024
BM_Sort_uint64_QuickSortAdversary_1024         478 ns       477 ns     1572864
BM_Sort_uint64_QuickSortAdversary_16384       3932 ns      3930 ns      262144
BM_Sort_uint64_QuickSortAdversary_262144     61646 ns     61615 ns      262144
Benchmarking results after this change:
------------------------------------------------------------------------------
Benchmark                                        Time          CPU  Iterations
------------------------------------------------------------------------------
BM_Sort_uint32_QuickSortAdversary_1           6.31 ns      6.30 ns   107741184
BM_Sort_uint32_QuickSortAdversary_4           4.51 ns      4.50 ns   158859264
BM_Sort_uint32_QuickSortAdversary_16          3.00 ns      3.00 ns   223608832
BM_Sort_uint32_QuickSortAdversary_64          44.8 ns      44.8 ns    15990784
BM_Sort_uint32_QuickSortAdversary_256         69.0 ns      68.9 ns     9961472
BM_Sort_uint32_QuickSortAdversary_1024         118 ns       118 ns     6029312
BM_Sort_uint32_QuickSortAdversary_16384        175 ns       175 ns     4194304
BM_Sort_uint32_QuickSortAdversary_262144       210 ns       210 ns     3407872
BM_Sort_uint64_QuickSortAdversary_1           6.75 ns      6.73 ns   103809024
BM_Sort_uint64_QuickSortAdversary_4           4.53 ns      4.53 ns   160432128
BM_Sort_uint64_QuickSortAdversary_16          2.98 ns      2.97 ns   234356736
BM_Sort_uint64_QuickSortAdversary_64          44.3 ns      44.3 ns    15990784
BM_Sort_uint64_QuickSortAdversary_256         69.2 ns      69.2 ns    10223616
BM_Sort_uint64_QuickSortAdversary_1024         119 ns       119 ns     6029312
BM_Sort_uint64_QuickSortAdversary_16384        173 ns       173 ns     4194304
BM_Sort_uint64_QuickSortAdversary_262144       212 ns       212 ns     3407872
Differential Revision: https://reviews.llvm.org/D113413
Louis Dionne [Tue, 16 Nov 2021 16:26:56 +0000 (11:26 -0500)]
[libc++] Add missed comment in https://reviews.llvm.org/D113910
Mark de Wever [Tue, 16 Nov 2021 15:04:29 +0000 (16:04 +0100)]
[libc++][nfc] Improve standard conformance.
The return type of the deleted functions doesn't match the synopsis in
the standard.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D114000
David Spickett [Thu, 11 Nov 2021 12:28:39 +0000 (12:28 +0000)]
[libcxxabi/runtimes] Set LLVM_HOST_TRIPLE in runtimes build
This allows tests to tell if they're running natively.
Those tests are libcxxabi/test/native/arm-linux-eabi.
Which were running on Linaro's bots but became unsupported
when we switched to the runtimes build.
Reviewed By: #libc_abi, phosek
Differential Revision: https://reviews.llvm.org/D113663
Mark de Wever [Tue, 16 Nov 2021 16:29:40 +0000 (17:29 +0100)]
[libc++][doc] Update format implementation status.
Nilay Vaish [Tue, 16 Nov 2021 16:25:42 +0000 (11:25 -0500)]
[libc++] Remove not needed call to __is_long()
The string is known to be long since __grow_by unconditionally calls
__set_long_cap().
Differential Revision: https://reviews.llvm.org/D113910
David Sherwood [Tue, 16 Nov 2021 14:19:18 +0000 (14:19 +0000)]
[AArch64] Fix TypeSize->uint64_t implicit conversion in AArch64ISelLowering::hasAndNot
For now I've just changed the code to only return true from
AArch64ISelLowering::hasAndNot if the vector is fixed-length.
Once we have the right patterns or DAG combines to use bic/bif
we can also enable this for SVE.
Test added here:
CodeGen/AArch64/vselect-constants.ll
Differential Revision: https://reviews.llvm.org/D113994
Matheus Izvekov [Mon, 15 Nov 2021 23:32:48 +0000 (00:32 +0100)]
[libcxx] CI: only build native target for bootstrapping-build
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D113950
Quinn Pham [Mon, 15 Nov 2021 22:22:00 +0000 (16:22 -0600)]
[NFC][clang] Inclusive language: replace master with main in convert_arm_neon.py
[NFC] As part of using inclusive language within the llvm project and to
match the renamed master branch, this patch replaces master with main in
`convert_arm_neon.py`.
Reviewed By: kristof.beyls
Differential Revision: https://reviews.llvm.org/D113942
Dmitry Vyukov [Tue, 16 Nov 2021 13:45:07 +0000 (14:45 +0100)]
tsan: disable bench_threads.cpp on aarch64
The new test started failing on bots with:
CHECK failed: tsan_rtl.cpp:327 "((addr + size)) <= ((TraceMemEnd()))"
(0xf06200e03010, 0xf06200000000) (tid=4073872)
https://lab.llvm.org/buildbot#builders/179/builds/1761
This is a latent bug in aarch64 virtual address space layout,
there is not enough address space to fit traces for all threads.
But since the trace space is going away with the new tsan runtime
(D112603), disable the test.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113990
Dmitry Vyukov [Tue, 16 Nov 2021 13:29:02 +0000 (14:29 +0100)]
tsan: fix crash during thread exit
Use of gethostent provokes caching of some resources inside of libc.
They are freed in __libc_thread_freeres very late in thread lifetime,
after our ThreadFinish. __libc_thread_freeres calls free which
previously crashed in malloc hooks.
Fix it by setting ignore_interceptors for finished threads,
which in turn prevents malloc hooks.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113989
Andrzej Warzynski [Thu, 11 Nov 2021 14:54:00 +0000 (14:54 +0000)]
[flang][fir] Add missing `HasParent` in `fir_DTEntryOp`
Differential Revision: https://reviews.llvm.org/D113674
Florian Hahn [Tue, 16 Nov 2021 15:39:12 +0000 (15:39 +0000)]
[llvm-reduce] Move code to check chunk to function, to enable reuse (NFC).
This patch moves the logic to clone and check a new chunk into a new
function, to allow re-use in a follow-up patch that implements parallel
reductions.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D113856
Christian Kühnel [Mon, 15 Nov 2021 14:37:20 +0000 (14:37 +0000)]
[NFC] disabling clang-tidy check readability-identifier-naming in Protocol.h
The file follows the LSP syntax, so we're intentionally deviating
from the LLVM coding standard.
Differential Revision: https://reviews.llvm.org/D113889
Ahsan Saghir [Fri, 29 Oct 2021 12:05:11 +0000 (07:05 -0500)]
[PowerPC] Allow MMA built-ins to accept non-void pointers and arrays
Calls to MMA builtins that take pointer to void
do not accept other pointers/arrays whereas normal
functions with the same parameter do. This patch
allows MMA built-ins to accept non-void pointers
and arrays.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D113306
Christian Kühnel [Mon, 15 Nov 2021 15:10:13 +0000 (15:10 +0000)]
[NFC][clangd] fix llvm-namespace-comment finding
Fixing the clang-tidy finding.
Differential Revision: https://reviews.llvm.org/D113895
Mark de Wever [Tue, 16 Nov 2021 15:02:26 +0000 (16:02 +0100)]
[libc++][format][nfc] Remove dead code.
This was an early part of the prototype. This has never been shipped
enabled and the final version of this code looks completely different.
Dmitry Vyukov [Tue, 16 Nov 2021 08:18:00 +0000 (09:18 +0100)]
tsan: de-hardcode number of unused bits in trace events
Precisely specifying the unused parts of the bitfield is critical for
performance. If we don't specify them, compiler will generate code to load
the old value and shuffle it to extract the unused bits to apply to the new
value. If we specify the unused part and store 0 in there, all that
unnecessary code goes away (store of the 0 const is combined with other
constant parts).
I don't see a good way to ensure we cover all of u64 bits with fields.
So at least introduce named kUnusedBits consts and check that bits
sum up to 64.
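A minimal sketch of the named-constant check described above. The struct, its field names, and the bit widths are hypothetical and are not the ones used by the tsan runtime:

```cpp
#include <cstdint>

// Hypothetical 64-bit trace event. Every bit is accounted for, including an
// explicitly named unused field that is always stored as zero.
struct TraceEvent {
  static constexpr uint64_t kKindBits = 3;
  static constexpr uint64_t kSizeBits = 2;
  static constexpr uint64_t kAddrBits = 44;
  static constexpr uint64_t kUnusedBits = 15;
  static_assert(kKindBits + kSizeBits + kAddrBits + kUnusedBits == 64,
                "fields must cover all 64 bits");

  uint64_t kind : kKindBits;
  uint64_t size : kSizeBits;
  uint64_t addr : kAddrBits;
  uint64_t unused : kUnusedBits;
};

static_assert(sizeof(TraceEvent) == sizeof(uint64_t),
              "event must stay a single 64-bit word");

// Writing every field, including the unused one, means no old bits have to
// be preserved when the event is stored.
inline TraceEvent makeEvent(uint64_t kind, uint64_t size, uint64_t addr) {
  TraceEvent ev{};
  ev.kind = kind;
  ev.size = size;
  ev.addr = addr;
  ev.unused = 0;
  return ev;
}
```

If a field width changes without adjusting `kUnusedBits`, the in-class `static_assert` fails at compile time rather than silently leaving bits unspecified.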
Depends on D113978.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113979
Dmitry Vyukov [Tue, 16 Nov 2021 08:14:09 +0000 (09:14 +0100)]
tsan: use smaller trace parts for Go
In the old runtime we used to use different number of trace parts
for C++ and Go to reduce trace memory consumption for Go.
But now it's easier and better to use smaller parts because
we already use minimal possible number of parts for C++ (3).
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D113978
Mark de Wever [Tue, 16 Nov 2021 14:56:59 +0000 (15:56 +0100)]
[libc++][doc] Fix copy pasted comment.
Mark de Wever [Tue, 16 Nov 2021 14:55:17 +0000 (15:55 +0100)]
[libc++][doc] Add a todo.
As suggested in D113831.
LLVM GN Syncbot [Tue, 16 Nov 2021 14:52:00 +0000 (14:52 +0000)]
[gn build] Port
5baa4ee30b5c
Mark de Wever [Sat, 13 Nov 2021 18:43:32 +0000 (19:43 +0100)]
[libc++][NFC] Move format_to_n_result.
Moves `format_to_n_result` into its own file. While working on D112361 it
turned out the type will be used outside the format header.
Reviewed By: #libc, Quuxplusone, Mordante
Differential Revision: https://reviews.llvm.org/D113831
Nicolas Vasilache [Tue, 16 Nov 2021 10:06:15 +0000 (10:06 +0000)]
[mlir][LLVM] Fix folding of LLVM::ExtractValueOp
Limit the backtracking along def-use chains when a prefix is encountered as it would generate incorrect foldings.
Differential Revision: https://reviews.llvm.org/D113975
Matt Devereau [Tue, 16 Nov 2021 14:36:13 +0000 (14:36 +0000)]
[AArch64][SVE] Remove arm-registered-target requirement on bfloat tests
Changes in https://reviews.llvm.org/D113489 caused buildbot failures
Jon Chesterfield [Tue, 16 Nov 2021 14:36:06 +0000 (14:36 +0000)]
[amdgpu] Don't crash on empty global ctor/dtor
Global ctor/dtor can be an empty array, which is a Constant not a
ConstantArray. The cast<ConstantArray> therefore asserts / crashes.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D113800
Sanjay Patel [Tue, 16 Nov 2021 14:18:18 +0000 (09:18 -0500)]
[InstCombine] canonicalize icmp with trunc op into mask and cmp, part 2
If C is a high-bit mask:
(trunc X) u< C --> (X & C) != C (are any masked-high-bits clear?)
If C is low-bit mask:
(trunc X) u> C --> (X & ~C) != 0 (are any masked-high-bits set?)
If C is not-of-power-of-2 (one clear bit):
(trunc X) u> C --> (X & (C+1)) == C+1 (are all masked-high-bits set?)
This extends the fold added with:
acabad9ff6bf (https://alive2.llvm.org/ce/z/aFr7qV)
Using decomposeBitTestICmp() to generalize this is a planned follow-up, but that requires removing an inverse fold.
Here are Alive2 generalizations for these folds:
https://alive2.llvm.org/ce/z/u-ZpC_ (ult, the previous patch)
https://alive2.llvm.org/ce/z/YsuAu2 (ult, this patch)
https://alive2.llvm.org/ce/z/ekktQP (ugt, low bitmask)
https://alive2.llvm.org/ce/z/pJY9wR (ugt, one clear bit)
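As an informal complement to the Alive2 proofs, the three folds listed above can be brute-force checked for one concrete width pair. The sketch below assumes a 16-bit X truncated to 8 bits and assumes the narrow constant is zero-extended to the width of X before it is used as a mask; both are choices made for this illustration, and the function name is hypothetical.

```cpp
#include <cstdint>

// Exhaustively checks the three folds for a 16-bit X truncated to 8 bits.
// Returns true if every fold agrees with the original comparison.
bool TruncIcmpFoldsHold() {
  for (std::uint32_t x = 0; x <= 0xFFFF; ++x) {
    const std::uint8_t t = static_cast<std::uint8_t>(x);  // trunc X
    for (unsigned k = 0; k < 8; ++k) {
      const std::uint32_t pow2 = 1u << k;

      // High-bit mask C = -(2^k) in 8 bits: 0xFF, 0xFE, 0xFC, ..., 0x80.
      // (trunc X) u< C --> (X & C) != C
      const std::uint32_t high = 0x100u - pow2;
      if ((t < high) != ((x & high) != high)) return false;

      // Low-bit mask C = 2^(k+1) - 1: 0x01, 0x03, ..., 0xFF.
      // (trunc X) u> C --> (X & ~C) != 0, with ~C taken in the narrow type.
      const std::uint32_t low = (pow2 << 1) - 1;
      const std::uint32_t not_low = ~low & 0xFFu;
      if ((t > low) != ((x & not_low) != 0)) return false;

      // One clear bit, C = ~(2^k) in 8 bits: 0xFE, 0xFD, ..., 0x7F.
      // (trunc X) u> C --> (X & (C+1)) == C+1
      const std::uint32_t c = ~pow2 & 0xFFu;
      const std::uint32_t c1 = c + 1;  // c is never 0xFF, so no 8-bit overflow
      if ((t > c) != ((x & c1) == c1)) return false;
    }
  }
  return true;
}
```

This covers only one width pair, so it is a sanity check rather than a proof; the Alive2 links remain the authoritative verification.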
Differential Revision: https://reviews.llvm.org/D112634
Alexey Bataev [Fri, 12 Nov 2021 14:21:47 +0000 (06:21 -0800)]
[SLP]Improve cost of the gather nodes.
No need to count the final shuffle cost for the constants, gathering of
the constants is just a constant vector + extra inserts, if required.
Differential Revision: https://reviews.llvm.org/D113770
Jake Egan [Tue, 16 Nov 2021 14:14:43 +0000 (09:14 -0500)]
[AIX] XFAIL lto-comp-dir.ll for lack of .file directive support
This test explicitly checks for .file directives, which are not currently supported on AIX. This patch sets this test to XFAIL on AIX for now.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D113640
Alexey Bataev [Tue, 16 Nov 2021 14:08:48 +0000 (06:08 -0800)]
[SLP]Fix windows build, NFC.
Need to add the `IndexIdx` variable to the list of captures.
Greg McGary [Fri, 5 Nov 2021 03:55:31 +0000 (20:55 -0700)]
[lld-macho][nfc] rename parsed-section types & variables
This is an NFC diff that prepares for pruning & relocating `__eh_frame`.
Along the way, I made the following changes to ...
* clarify usage of `section` vs. `subsection`
* remove `map` & `vec` from type names
* disambiguate class `Section` from template parameter `SectionHeader`.
Differential Revision: https://reviews.llvm.org/D113241
Jean Perier [Tue, 16 Nov 2021 13:52:35 +0000 (14:52 +0100)]
[flang] Allow write after non advancing read in IO runtime
1. To avoid overwriting the part of the record read in the non advancing read,
the furtherPositionInRecord field must be set to the max of the
furtherPositionInRecord and the positionInRecord at the beginning of the
IO write.
2. To allow any further read to succeed after the write, the unit
beganReadingRecord_ must be set to false when resetting the recordLength
during the write, otherwise, recordLength will not be computed in further
read and an assert is hit (at unit.cpp(398)).
The added unit test exercises both of these scenarios.
Differential Revision: https://reviews.llvm.org/D113740
Zbigniew Sarbinowski [Fri, 29 Oct 2021 18:35:17 +0000 (18:35 +0000)]
[SystemZ][z/OS] Fix warnings from unsigned int to long in 32-bit mode
This patch fixes the warnings that show up when the libcxx library is compiled in 32-bit mode on z/OS.
More specifically, the assignment from unsigned int to time_t (aka long) was flagged as follows:
```
libcxx/include/c++/v1/__support/ibm/nanosleep.h:31:11: warning: implicit conversion changes signedness: 'unsigned int' to 'time_t' (aka 'long') [-Wsign-conversion]
__sec = sleep(static_cast<unsigned int>(__sec));
~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libcxx/include/c++/v1/__support/ibm/nanosleep.h:36:36: warning: implicit conversion changes signedness: 'unsigned int' to 'long' [-Wsign-conversion]
__rem->tv_nsec = __micro_sec * 1000;
~ ~~~~~~~~~~~~^~~~~~
libcxx/include/c++/v1/__support/ibm/nanosleep.h:47:36: warning: implicit conversion changes signedness: 'unsigned int' to 'long' [-Wsign-conversion]
__rem->tv_nsec = __micro_sec * 1000;
~ ~~~~~~~~~~~~^~~~~~
3 warnings generated.
```
Here is a small test case illustrating the issue:
```
typedef long time_t ;
unsigned int sleep(unsigned int );
int main() {
time_t sec = 0;
#ifdef FIX
sec = static_cast<time_t>(sleep(static_cast<unsigned int>(sec)));
#else
sec = sleep(static_cast<unsigned int>(sec));
#endif
}
```
clang++ -c -Wsign-conversion -m32 t.C
```
t.C:8:9: warning: implicit conversion changes signedness: 'unsigned int' to 'time_t' (aka 'long') [-Wsign-conversion]
sec = sleep(static_cast<unsigned int>(sec));
~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reviewed By: ldionne, #libc, Quuxplusone, Mordante
Differential Revision: https://reviews.llvm.org/D112837
Andrzej Warzynski [Fri, 12 Nov 2021 13:25:42 +0000 (13:25 +0000)]
[flang][CodeGen] Transform `fir.boxchar_len` to a sequence of LLVM MLIR
This patch extends the `FIRToLLVMLowering` pass in Flang by adding a
hook to transform `fir.boxchar_len` to a sequence of LLVM MLIR
instructions.
This is part of the upstreaming effort from the `fir-dev` branch in [1].
[1] https://github.com/flang-compiler/f18-llvm-project
Differential Revision: https://reviews.llvm.org/D113763
Originally written by:
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Alexey Bataev [Fri, 12 Nov 2021 19:22:12 +0000 (11:22 -0800)]
[SLP]Adjust GEP indices types when trying to build entries.
Need to adjust the types of GEPs indices when building the tree
entries/operands. Otherwise some of the nodes might differ and
vectorizer is unable to correctly find them and count their cost.
Differential Revision: https://reviews.llvm.org/D113792
Alexey Bataev [Tue, 16 Nov 2021 13:41:32 +0000 (05:41 -0800)]
[SLP][NFC]Add more tests for shuffles that can be optimized after SLP,
NFC.
Valentin Clement [Tue, 16 Nov 2021 13:37:05 +0000 (14:37 +0100)]
[fir] Add fir.gentypedesc conversion
Add conversion pattern for the GenTypeDescOp.
Convert to a global constant with an addressof.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D113766
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Joseph Huber [Tue, 16 Nov 2021 04:04:55 +0000 (23:04 -0500)]
[OpenMP] Fix initializer not working on AMDGPU
The RAII class used for debugging RTL entry used a shared variable to
keep track of the current depth. This used a global initializer, which
isn't supported on AMDGPU. This patch removes the initializer and
instead sets it to zero when the state is initialized in the runtime.
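A host-side sketch of the pattern described above, under stated assumptions: `RuntimeState`, `InitState`, and `DebugEntryRAII` are hypothetical names, not the actual OpenMP device runtime types. The depth counter has no initializer of its own and is zeroed by an explicit state-initialization step, after which the RAII object increments it on entry and decrements it on exit.

```cpp
#include <cassert>

// Hypothetical runtime state. The counter deliberately has no initializer;
// it must be set up by InitState() before first use.
struct RuntimeState {
  unsigned DebugDepth;
};

// Stands in for the runtime's state initialization, which zeroes the depth.
inline void InitState(RuntimeState &State) { State.DebugDepth = 0; }

// Tracks nesting depth of runtime entry points for debugging output.
class DebugEntryRAII {
 public:
  explicit DebugEntryRAII(RuntimeState &State) : State(State) {
    ++State.DebugDepth;
  }
  ~DebugEntryRAII() {
    assert(State.DebugDepth > 0 && "unbalanced debug entry");
    --State.DebugDepth;
  }
  DebugEntryRAII(const DebugEntryRAII &) = delete;
  DebugEntryRAII &operator=(const DebugEntryRAII &) = delete;

 private:
  RuntimeState &State;
};
```

The point of the design is that correctness no longer depends on a static initializer running; it depends only on `InitState` being called before any entry is traced.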
Reviewed By: jdoerfert, JonChesterfield
Differential Revision: https://reviews.llvm.org/D113963
Christian Kühnel [Mon, 15 Nov 2021 14:52:07 +0000 (14:52 +0000)]
[NFC][clangd] cleaning up unused "using"
Cleaning up unused "using" declarations.
This patch was generated by automatically applying clang-tidy fixes.
Differential Revision: https://reviews.llvm.org/D113891
Florian Hahn [Tue, 16 Nov 2021 12:48:21 +0000 (12:48 +0000)]
[llvm-reduce] Add new BitWriter dependency after
28d95a26109e.
Sander.DeSmalen@arm.com [Tue, 16 Nov 2021 11:14:04 +0000 (11:14 +0000)]
[IndVarSimplify] Reduce nondeterministic behaviour in visitIVCast.
rGf39978b84f1d3a1da6c32db48f64c8daae64b3ad led to and/or exposed
an issue with IndVarSimplification for a loop where an i32 phi node is
no longer replaced by a widened (i64) phi node, because the SCEVs of a
sign-extend no longer folded the same way. I'm unsure how to properly
explain this because it's all rather complicated, but in short: SCEVs
don't fold as nicely as they used to and this caused a difference.
While investigating this, I found that IndVarSimplify can actually
optimise the case in the way we want to if it chooses the widened IV to
be 'signed' (the i32 IV is both sign and zero-extended). Oddly enough,
there is some level of nondeterminism in the way the algorithm works:
it just picks the sign of the 'first' zext/sext user, where the order of
the users-iterator is not guaranteed to be the same on each invocation
of the pass (e.g. shown by first running loop-rotate, which puts the
users in a different order).
While I think the fix is valid in the sense that consistently picking
_any_ order is better than having a nondeterministic order, I could
use a bit of advice from people more familiar with this area of the
code-base.
For example, I'm not sure if this fix is hiding another issue where the
IndVarSimplify pass could actually draw the same conclusions (i.e. that
it only needs an i64 phi node) if it does a bit more work, regardless
of whether it chooses the induction variable to be signed or unsigned.
I'm also not sure if choosing signed is better than unsigned, or whether
that just happens to be beneficial only in this individual case.
Any feedback would be much appreciated!
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D112573
Florian Hahn [Tue, 16 Nov 2021 12:39:32 +0000 (12:39 +0000)]
[llvm-reduce] Allow writing temporary files as bitcode.
Textual LLVM IR files are much bigger and take longer to write to disk.
To avoid the extra cost incurred by serializing to text, this patch adds
an option to save temporary files as bitcode instead.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D113858
Diana Picus [Tue, 16 Nov 2021 10:17:29 +0000 (10:17 +0000)]
[fir] Add fir.cmpc conversion
This patch adds the codegen for fir.cmpc. The real and imaginary parts
are extracted and compared separately. For the .EQ. predicate the
results are AND'd, for the .NE. predicate the results are OR'd, and for
other predicates we keep only the result on the real parts.
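The comparison rule described above can be sketched in plain C++ as follows. This is an illustration of the semantics only, not the generated LLVM IR: the `Pred` enum and `CompareComplex` function are hypothetical, and only a few predicates are shown.

```cpp
#include <complex>

// Hypothetical subset of comparison predicates.
enum class Pred { EQ, NE, LT, GT };

// Compares the real and imaginary parts separately and combines the results
// as described: AND for EQ, OR for NE, real parts only for anything else.
inline bool CompareComplex(Pred P, std::complex<double> A,
                           std::complex<double> B) {
  switch (P) {
    case Pred::EQ:
      return (A.real() == B.real()) && (A.imag() == B.imag());
    case Pred::NE:
      return (A.real() != B.real()) || (A.imag() != B.imag());
    case Pred::LT:
      return A.real() < B.real();  // imaginary parts are ignored
    case Pred::GT:
      return A.real() > B.real();  // imaginary parts are ignored
  }
  return false;
}
```

For example, `CompareComplex(Pred::LT, {0.0, 9.0}, {1.0, 0.0})` is true because only the real parts 0.0 and 1.0 are compared.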
This patch is part of the upstreaming effort from fir-dev.
Differential Revision: https://reviews.llvm.org/D113976
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>