Simon Pilgrim [Sat, 14 May 2022 12:25:30 +0000 (13:25 +0100)]
[ARM] Regenerate combine-movc-sub.ll test checks
Christian Sigg [Fri, 6 May 2022 18:31:36 +0000 (20:31 +0200)]
[MLIR][GPU] NFC: simplify kernel operand accessor implementations.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D125112
Simon Pilgrim [Sat, 14 May 2022 11:58:50 +0000 (12:58 +0100)]
[X86] rotate-extract-vector.ll - use avx512bw+avx512vl target for more useful codegen checks
This is a much more realistic target than just avx512bw, which has never existed as a real world cpu target
Mark de Wever [Sat, 27 Feb 2021 15:52:39 +0000 (16:52 +0100)]
[libc++] Improve std::to_chars for base != 10.
This improves the speed of `to_chars` for bases 2, 8, and 16.
These bases are common and used in `<format>`. This change
uses a lookup table, as is done for base 10, and causes an increase
in code size. The change has a small overhead for the other bases.
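As a rough illustration of the lookup-table idea for base 16, here is a minimal sketch. It is not the libc++ implementation: `to_hex_lut` and the 256-entry pair table are names and choices made up for this example, and it returns a `std::string` rather than writing into a caller-provided range as `to_chars` does.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not libc++ code): format `value` in base 16,
// emitting two digits per table lookup instead of one digit per division.
std::string to_hex_lut(std::uint32_t value) {
  static constexpr char digits[] = "0123456789abcdef";
  // 256 two-character entries: "00", "01", ..., "ff".
  static const auto table = [] {
    std::array<char, 512> t{};
    for (int i = 0; i < 256; ++i) {
      t[2 * i] = digits[i >> 4];
      t[2 * i + 1] = digits[i & 15];
    }
    return t;
  }();

  char buf[8]; // a 32-bit value needs at most 8 hex digits
  char *p = buf + 8;
  // Peel off one byte (two hex digits) at a time from the low end.
  while (value >= 0x100) {
    p -= 2;
    const unsigned idx = value & 0xff;
    p[0] = table[2 * idx];
    p[1] = table[2 * idx + 1];
    value >>= 8;
  }
  // The remaining top part is either two digits or a single digit.
  if (value >= 0x10) {
    p -= 2;
    p[0] = table[2 * value];
    p[1] = table[2 * value + 1];
  } else {
    *--p = digits[value];
  }
  assert(p >= buf && "wrote past the start of the buffer");
  return std::string(p, buf + 8);
}
```

The table costs 512 bytes in this sketch, which is the kind of code-size increase the message mentions in exchange for fewer loop iterations.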
```
Benchmark Time CPU Time Old Time New CPU Old CPU New
------------------------------------------------------------------------------------------------------------------
BM_to_chars_good/2 -0.9476 -0.9476 252 13 252 13
BM_to_chars_good/3 +0.0018 +0.0018 145 145 145 145
BM_to_chars_good/4 +0.0108 +0.0108 104 105 104 105
BM_to_chars_good/5 +0.0159 +0.0160 89 91 89 91
BM_to_chars_good/6 +0.0162 +0.0162 80 81 80 81
BM_to_chars_good/7 +0.0117 +0.0117 72 73 72 73
BM_to_chars_good/8 -0.8643 -0.8643 64 9 64 9
BM_to_chars_good/9 +0.0095 +0.0095 60 60 60 60
BM_to_chars_good/10 +0.0540 +0.0540 6 6 6 6
BM_to_chars_good/11 +0.0299 +0.0299 55 57 55 57
BM_to_chars_good/12 +0.0060 +0.0060 48 49 49 49
BM_to_chars_good/13 +0.0102 +0.0102 48 48 48 48
BM_to_chars_good/14 +0.0184 +0.0185 47 48 47 48
BM_to_chars_good/15 +0.0269 +0.0269 44 45 44 45
BM_to_chars_good/16 -0.8207 -0.8207 37 7 37 7
BM_to_chars_good/17 +0.0241 +0.0241 37 38 37 38
BM_to_chars_good/18 +0.0221 +0.0221 37 38 37 38
BM_to_chars_good/19 +0.0222 +0.0223 37 38 37 38
BM_to_chars_good/20 +0.0317 +0.0317 38 39 38 39
BM_to_chars_good/21 +0.0342 +0.0341 38 39 38 39
BM_to_chars_good/22 +0.0336 +0.0336 36 38 36 38
BM_to_chars_good/23 +0.0222 +0.0222 34 35 34 35
BM_to_chars_good/24 +0.0185 +0.0185 31 32 31 32
BM_to_chars_good/25 +0.0157 +0.0157 32 32 32 32
BM_to_chars_good/26 +0.0181 +0.0181 32 32 32 32
BM_to_chars_good/27 +0.0153 +0.0153 32 32 32 32
BM_to_chars_good/28 +0.0179 +0.0179 32 32 32 32
BM_to_chars_good/29 +0.0189 +0.0189 32 33 32 33
BM_to_chars_good/30 +0.0212 +0.0212 32 33 32 33
BM_to_chars_good/31 +0.0221 +0.0221 32 33 32 33
BM_to_chars_good/32 +0.0292 +0.0292 32 33 32 33
BM_to_chars_good/33 +0.0319 +0.0319 32 33 32 33
BM_to_chars_good/34 +0.0411 +0.0410 33 34 33 34
BM_to_chars_good/35 +0.0515 +0.0515 33 34 33 34
BM_to_chars_good/36 +0.0502 +0.0502 32 34 32 34
BM_to_chars_bad/2 -0.8752 -0.8752 40 5 40 5
BM_to_chars_bad/3 +0.1952 +0.1952 21 26 21 26
BM_to_chars_bad/4 +0.3626 +0.3626 16 22 16 22
BM_to_chars_bad/5 +0.2267 +0.2268 17 21 17 21
BM_to_chars_bad/6 +0.3560 +0.3559 14 19 14 19
BM_to_chars_bad/7 +0.4599 +0.4600 12 18 12 18
BM_to_chars_bad/8 -0.5074 -0.5074 11 5 11 5
BM_to_chars_bad/9 +0.4814 +0.4814 10 15 10 15
BM_to_chars_bad/10 +0.7761 +0.7761 2 4 2 4
BM_to_chars_bad/11 +0.3948 +0.3948 12 16 12 16
BM_to_chars_bad/12 +0.3203 +0.3203 10 13 10 13
BM_to_chars_bad/13 +0.3067 +0.3067 11 14 11 14
BM_to_chars_bad/14 +0.2235 +0.2235 12 14 12 14
BM_to_chars_bad/15 +0.2675 +0.2675 11 14 11 14
BM_to_chars_bad/16 -0.1801 -0.1801 7 5 7 5
BM_to_chars_bad/17 +0.5651 +0.5651 7 11 7 11
BM_to_chars_bad/18 +0.5407 +0.5406 7 11 7 11
BM_to_chars_bad/19 +0.5593 +0.5593 8 12 8 12
BM_to_chars_bad/20 +0.5823 +0.5823 8 13 8 13
BM_to_chars_bad/21 +0.6032 +0.6032 9 15 9 15
BM_to_chars_bad/22 +0.6407 +0.6408 9 14 9 14
BM_to_chars_bad/23 +0.6292 +0.6292 7 12 7 12
BM_to_chars_bad/24 +0.5784 +0.5784 6 10 6 10
BM_to_chars_bad/25 +0.5784 +0.5784 6 10 6 10
BM_to_chars_bad/26 +0.5713 +0.5713 7 10 7 10
BM_to_chars_bad/27 +0.5969 +0.5969 7 11 7 11
BM_to_chars_bad/28 +0.6131 +0.6131 7 11 7 11
BM_to_chars_bad/29 +0.6937 +0.6937 7 11 7 11
BM_to_chars_bad/30 +0.7655 +0.7656 7 12 7 12
BM_to_chars_bad/31 +0.8939 +0.8939 6 12 6 12
BM_to_chars_bad/32 +1.0157 +1.0157 6 13 6 13
BM_to_chars_bad/33 +1.0279 +1.0279 7 14 7 14
BM_to_chars_bad/34 +1.0388 +1.0388 7 14 7 14
BM_to_chars_bad/35 +1.0990 +1.0990 7 15 7 15
BM_to_chars_bad/36 +1.1503 +1.1503 7 15 7 15
```
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D97705
Simon Pilgrim [Sat, 14 May 2022 10:54:08 +0000 (11:54 +0100)]
[X86] Regenerate pull-binop-through-shift.ll showing stack address math
Also rename X32 check prefixes to X86 as we try to use X32 for gnux32 targets
Chris Lattner [Sat, 14 May 2022 10:48:17 +0000 (11:48 +0100)]
[DenseElementsAttr] Teach isValidRawBuffer that 1-elt values are splats.
We want getRaw() on tensors with i1 element type with a zero or 1 value
to be treated as a splat. This fixes:
https://github.com/llvm/llvm-project/issues/55440
Aaron Puchert [Sat, 14 May 2022 10:37:35 +0000 (12:37 +0200)]
Resolve overload ambiguity on Mac OS when printing size_t in diagnostics
Precommit builds cover Linux and Windows, but this ambiguity would only
show up on Mac OS: there we have int32_t = int, int64_t = long long and
size_t = unsigned long. So printing a size_t, while successful on the
other two architectures, cannot be unambiguously resolved on Mac OS.
This is not really meant to support printing arguments of type long or
size_t, but more as a way to prevent build breakage that would not be
detected in precommit builds, as happened in D125429.
Technically we have no guarantee that one of these types has the 64 bits
that afdac5fbcb6a3 wanted to provide, so proposals are welcome. We do
have a guarantee though that these three types are different, so we
should be fine with overload resolution.
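The overload-resolution point can be sketched with a small free-standing overload set. This is an illustration only: `describe` and `describeSize` are hypothetical names, not the diagnostics API touched by this commit. With an overload for every standard integer type of rank `int` and above, a `size_t` argument always finds an exact match, whichever underlying type the platform uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// The standard guarantees these are distinct types even when they have the
// same size, so overloading on all of them is never a redefinition.
static_assert(!std::is_same<long, long long>::value, "distinct types");
static_assert(!std::is_same<int, long>::value, "distinct types");

// Hypothetical stand-in for a printer's integer overload set.
std::string describe(int) { return "int"; }
std::string describe(unsigned) { return "unsigned int"; }
std::string describe(long) { return "long"; }
std::string describe(unsigned long) { return "unsigned long"; }
std::string describe(long long) { return "long long"; }
std::string describe(unsigned long long) { return "unsigned long long"; }

// size_t is an alias for one of the unsigned types above, so this call
// resolves to an exact match. If the `unsigned long` overload were missing,
// it would be ambiguous on platforms where size_t is `unsigned long`.
std::string describeSize(std::size_t n) { return describe(n); }
```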
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D125580
Andrzej Warzynski [Fri, 29 Apr 2022 17:36:26 +0000 (17:36 +0000)]
[flang][driver] Switch to the MLIR coding style in the driver (nfc)
This patch re-factors the driver code in LLVM Flang (frontend +
compiler) to use the MLIR style. For more context, please see:
https://discourse.llvm.org/t/rfc-coding-style-in-the-driver/
Most changes here are rather self-explanatory. Accessors are renamed to
be more consistent with the rest of LLVM (e.g. allSource -->
getAllSources). Additionally, MLIR clang-tidy files are added in the
affected directories.
clang-tidy and clang-format files were copied from MLIR. Small
additional changes are made to silence clang-tidy/clang-format
warnings.
[1] https://mlir.llvm.org/getting_started/DeveloperGuide/
Differential Revision: https://reviews.llvm.org/D125007
Benjamin Kramer [Sat, 14 May 2022 10:11:58 +0000 (12:11 +0200)]
[bazel] Port ae8bbc43f470
Weining Lu [Sun, 24 Apr 2022 06:59:19 +0000 (14:59 +0800)]
[LoongArch] Add privilege instructions definition
These instructions are added by following the `LoongArch Reference
Manual Volume 1: Basic Architecture Version 1.00`.
Differential Revision: https://reviews.llvm.org/D124826
Xiaodong Liu [Sat, 14 May 2022 09:40:13 +0000 (17:40 +0800)]
[llvm] Fix comment nits in Module class, NFC.
There is no member called "GlobalValRefMap" in Module class.
It has been changed to "GlobalList".
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D125187
Mark de Wever [Sat, 14 May 2022 09:25:15 +0000 (11:25 +0200)]
[lib++][doc] Fixes a link in the status paper.
Kristina Bessonova [Sat, 14 May 2022 09:09:43 +0000 (11:09 +0200)]
[DebugInfo][Test] Simplify 'llvm/test/CodeGen/ARM/*-MergedGlobalDbg.ll'. NFC
Differential Revision: https://reviews.llvm.org/D125531
Jay Foad [Wed, 6 Oct 2021 08:49:51 +0000 (09:49 +0100)]
[APInt] Allow extending and truncating to the same width
Allow zext, sext, trunc, truncUSat and truncSSat to extend or truncate
to the same bit width, which is a no-op.
Disallowing this forced clients to use workarounds like using
zextOrTrunc (even though they never wanted truncation) or zextOrSelf
(even though they did not want its strange behaviour of allowing a
*smaller* bit width, which is also treated as a no-op).
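A toy model of the relaxed precondition, for illustration only. `ToyInt` is a made-up type limited to 64 bits and is not `llvm::APInt`; the point is just that the width check becomes `>=` (or `<=` for truncation) instead of a strict comparison, so the same-width case is a plain no-op.

```cpp
#include <cassert>
#include <cstdint>

// Toy fixed-width integer (hypothetical, NOT llvm::APInt): width in bits,
// value stored in the low `width` bits of `bits`.
struct ToyInt {
  unsigned width;
  std::uint64_t bits;
};

// Zero-extend to `newWidth`. The same width is accepted and is a no-op;
// only a smaller width is rejected.
ToyInt zext(ToyInt v, unsigned newWidth) {
  assert(newWidth >= v.width && newWidth <= 64 && "zext cannot shrink");
  return ToyInt{newWidth, v.bits};
}

// Truncate to `newWidth`. The same width is accepted and is a no-op;
// only a larger width is rejected.
ToyInt trunc(ToyInt v, unsigned newWidth) {
  assert(newWidth >= 1 && newWidth <= v.width && "trunc cannot grow");
  const std::uint64_t mask =
      newWidth == 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << newWidth) - 1);
  return ToyInt{newWidth, v.bits & mask};
}
```

With this shape a caller that only ever extends can call `zext` unconditionally and no longer needs an "extend or truncate" helper just to cover the equal-width case.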
Differential Revision: https://reviews.llvm.org/D125556
Simon Pilgrim [Sat, 14 May 2022 08:49:55 +0000 (09:49 +0100)]
[DAG] Enable ISD::SHL SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
Pulled out of D77804 as it's going to be easier to address the regressions individually.
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.
The lost RISCV gorc2 fold shouldn't be a problem - instcombine would have already destroyed that pattern - see https://github.com/llvm/llvm-project/issues/50553
Differential Revision: https://reviews.llvm.org/D124839
Cassie Jones [Sat, 14 May 2022 08:47:41 +0000 (01:47 -0700)]
[clang] Require including config.h for CLANG_DEFAULT_STD_C
This makes CLANG_DEFAULT_STD_C(XX) always be defined, defaulting to
lang_unspecified, so you are forced to check its value instead of using
an #ifdef. This should help avoid accidentally omitting the include in
places where that's important, so that the default language version bug
isn't re-introduced.
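The difference between an `#ifdef` check and a value check can be sketched as follows. The names here (`DEMO_DEFAULT_STD`, `LangStd`, `pickStandard`) are invented for the example and are not the clang identifiers; the `#ifndef` block stands in for a generated config header that always defines the macro.

```cpp
#include <cassert>

enum class LangStd { Unspecified, C17, Cxx17 };

// Stand-in for the generated config header (hypothetical): the macro is
// always defined, with "unspecified" as the default value.
#ifndef DEMO_DEFAULT_STD
#define DEMO_DEFAULT_STD LangStd::Unspecified
#endif

// The consumer compares the macro's value. If the header above were not
// included, this would fail to compile, whereas an `#ifdef DEMO_DEFAULT_STD`
// test would silently take the "not configured" branch.
LangStd pickStandard(LangStd builtinDefault) {
  if (DEMO_DEFAULT_STD != LangStd::Unspecified)
    return DEMO_DEFAULT_STD;
  return builtinDefault;
}
```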
Reviewed By: hokein, dexonsmith
Differential Revision: https://reviews.llvm.org/D124974
Cassie Jones [Sat, 14 May 2022 08:47:41 +0000 (01:47 -0700)]
[clang] Include clang config.h in LangStandards.cpp
This is necessary in order to pick up the default C/C++ standard from
the CLANG_DEFAULT_STD_C(XX) defines. This fixes a bug that was
introduced when this default language standard code was moved from
Frontend to Basic, making compilers ignore the configured default
language version override.
Fixes a bug introduced by D121375.
Reviewed By: hokein, dexonsmith
Differential Revision: https://reviews.llvm.org/D124974
Petr Hosek [Sat, 14 May 2022 02:23:52 +0000 (02:23 +0000)]
[libcxxabi] Copy headers into build location
Prior to D120727, the libcxx build was responsible for copying libcxxabi
headers into the right location, both in the build and install trees,
but now it's the responsibility of the libcxxabi build. While the build
already did the right thing for the install tree, it wouldn't copy
headers into the build tree, resulting in errors when trying to use the
just built toolchain as is the case in the runtimes build when building
compiler-rt runtimes.
Differential Revision: https://reviews.llvm.org/D125597
Chenbing Zheng [Sat, 14 May 2022 02:54:15 +0000 (10:54 +0800)]
[InstCombine] [NFC] separate a function foldICmpBinOpWithConstant
There is a long function foldICmpInstWithConstant,
we can separate a function foldICmpBinOpWithConstant from it.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D125457
Rafael Auler [Fri, 13 May 2022 22:51:36 +0000 (15:51 -0700)]
[BOLT] Fix merge-fdata handling of BAT profiles
When a profile is collected in a BOLTed binary, the generated
profile is tagged with a header string "boltedcollection" in the first
line of the fdata file. Fix merge-fdata to recognize this header
string and preserve it into the output.
Reviewed By: Amir
Differential Revision: https://reviews.llvm.org/D125591
Nico Weber [Sat, 14 May 2022 01:35:09 +0000 (21:35 -0400)]
[gn build] (semi-manually) port 512273833136
Med Ismail Bennani [Fri, 13 May 2022 23:50:52 +0000 (16:50 -0700)]
[lldb/API] Turn SBCompileUnit::GetIndexForLineEntry into FindLineEntryIndex (NFC)
This patch renames the `SBCompileUnit::GetIndexForLineEntry` api to be
an overload of `SBCompileUnit::FindLineEntryIndex`
Differential Revision: https://reviews.llvm.org/D125594
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Richard [Sat, 23 Apr 2022 23:47:30 +0000 (17:47 -0600)]
[clang-tidy] Support expressions of literals in modernize-macro-to-enum
Add a recursive descent parser to match macro expansion tokens against
fully formed valid expressions of integral literals. Partial
expressions will not be matched -- they can't be valid initializing
expressions for an enum.
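For readers unfamiliar with the technique, a minimal recursive descent matcher might look like the sketch below. It is not the clang-tidy code: it works on a plain string instead of macro expansion tokens, handles only decimal literals and a small operator set, and `isLiteralExpr` is a hypothetical name. It does show the key property described above: a partial expression is rejected because the whole input must be consumed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Grammar assumed for this sketch:
//   expr    := term (('+' | '-') term)*
//   term    := unary (('*' | '/' | '%') unary)*
//   unary   := ('+' | '-' | '~') unary | primary
//   primary := DIGITS | '(' expr ')'
class LiteralExprMatcher {
public:
  explicit LiteralExprMatcher(std::string text) : s(std::move(text)) {}

  // True only if the entire input is one well-formed expression.
  bool match() {
    pos = 0;
    if (!expr())
      return false;
    skipSpaces();
    return pos == s.size();
  }

private:
  void skipSpaces() {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
      ++pos;
  }
  // Consume the next non-space character if it is one of `ops`.
  bool eatAnyOf(const std::string &ops) {
    skipSpaces();
    if (pos < s.size() && ops.find(s[pos]) != std::string::npos) {
      ++pos;
      return true;
    }
    return false;
  }
  bool integer() {
    skipSpaces();
    const std::size_t start = pos;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
      ++pos;
    return pos > start;
  }
  bool primary() {
    if (eatAnyOf("("))
      return expr() && eatAnyOf(")");
    return integer();
  }
  bool unary() { return eatAnyOf("+-~") ? unary() : primary(); }
  bool term() {
    if (!unary())
      return false;
    while (eatAnyOf("*/%"))
      if (!unary())
        return false;
    return true;
  }
  bool expr() {
    if (!term())
      return false;
    while (eatAnyOf("+-"))
      if (!term())
        return false;
    return true;
  }

  std::string s;
  std::size_t pos = 0;
};

bool isLiteralExpr(const std::string &text) {
  return LiteralExprMatcher(text).match();
}
```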
Differential Revision: https://reviews.llvm.org/D124500
Fixes #55055
Mogball [Sat, 14 May 2022 00:37:34 +0000 (00:37 +0000)]
[mlir][ods] (NFC) remove erroneous trait
Mogball [Sat, 14 May 2022 00:20:01 +0000 (00:20 +0000)]
[mlir] Rename Zero* traits to Zero*s
Rename
ZeroResult -> ZeroResults
ZeroSuccessor -> ZeroSuccessors
ZeroRegion -> ZeroRegions
to be in line with ZeroOperands and grammatically correct.
owenca [Fri, 13 May 2022 01:03:24 +0000 (18:03 -0700)]
[clang-format][NFC] Format unit tests with insert/remove braces
This patch is the result of running clang-format version 753fe33 in
clang/unittests/Format/:
clang-format -style="{InsertBraces: true, RemoveBracesLLVM: true}" -i *.cpp *.h
Differential Revision: https://reviews.llvm.org/D125510
Wolfgang Pieb [Fri, 29 Apr 2022 21:39:40 +0000 (14:39 -0700)]
[NFC][Metadata] Refactor allocation, initialization and deletion of MDNodes.
This patch is refactoring the allocation, initialization and deletion
of MDNodes. It is intended as a preparatory patch for the upcoming
addition of dynamic resizability of MDNodes. It is fundamentally NFC,
but removes the necessity for suppressing the memory sanitizer for
MDNode's operator delete.
Reviewers: dexonsmith
Differential Revision: https://reviews.llvm.org/D125489
Ben Dunbobbin [Fri, 13 May 2022 22:37:16 +0000 (23:37 +0100)]
[llvm-ar][mri] Ensure CREATE commands overwrite the output file
The CREATE/CREATETHIN commands should overwrite the output file:
https://sourceware.org/binutils/docs/binutils/ar-scripts.html.
This fixes a regression for MRI scripts introduced in:
https://reviews.llvm.org/D123142 which put logic into
performWriteOperation. performWriteOperation is called for all MRI
commands that write an archive out (ones with a SAVE command).
performWriteOperation is unaware of MRI semantics and loads an
existing archive if present. If an existing archive is loaded, llvm-ar
checks the properties of the existing archive for decisions about the
output archive (for example making the output archive thin if the
existing one was). https://reviews.llvm.org/D123142 adds the following
logic...
  if (OldArchive) {
    if (Thin && !OldArchive->isThin())
      fail("cannot convert a regular archive to a thin one");
    if (OldArchive->isThin())
      Thin = true;
  }
... which errors for a script with CREATETHIN in effect if there is an
existing regular archive, and causes CREATE to output a thin archive
if there is an existing thin archive.
Differential Revision: https://reviews.llvm.org/D125439
bzcheeseman [Fri, 13 May 2022 22:41:22 +0000 (18:41 -0400)]
[LLVM][Casting.h] Add ForwardToPointerCast trait
Addresses use cases in Clang/MLIR that need pointer-to-pointer, reference-to-reference, and value-to-value casts from/to the same types. This should reduce boilerplate by allowing the user to simply specify the pointer cast and forward the reference cast directly to the pointer cast.
This cast trait DOES NOT implement `castFailed` and `doCastIfPossible` because in the general case doing so could result in a nullptr dereference. Users can use `NullableValueCastFailed` and `DefaultDoCastIfPossible` as desired for those cases where `nullptr` is acceptable.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125576
bzcheeseman [Fri, 13 May 2022 19:57:03 +0000 (15:57 -0400)]
[LLVM][Casting.h] Remove CastInfo pointer partial specialization.
Since cast_convert_val now has pointer specializations, we don't need the pointer partial specialization for CastInfo. We want to trim these down when possible to avoid future ambiguous partial specialization errors.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D125578
Chris Lattner [Fri, 13 May 2022 22:29:21 +0000 (23:29 +0100)]
[ParseResult] Fix warning in flang build, incorporate feedback from River.
The warning caused build errors on a couple flang testers that are
building with -Werror. The diagnostic change makes the generated
error correct.
This is a followup to https://reviews.llvm.org/D125549
Differential Revision: https://reviews.llvm.org/D125587
Roger Ferrer Ibanez [Fri, 13 May 2022 15:45:05 +0000 (15:45 +0000)]
[RISCV] Use the new chain when converting a fixed RVV load
When building the final merged node, we were using the original chain
rather than the output chain of the new operation. After some collapsing
of the chain this could cause the loads to be incorrectly scheduled with
respect to later stores.
This was uncovered by SingleSource/Regression/C/gcc-c-torture/execute/pr36038.c
of the llvm testsuite.
https://reviews.llvm.org/D125560
Roger Ferrer Ibanez [Fri, 13 May 2022 15:46:40 +0000 (15:46 +0000)]
[RISCV][NFC] Test showing wrong scheduling of expansion of memmove for fixed RVV
After a memmove is expanded, we load the higher addresses after we have
stored onto the lower ones, as if we could do a forward copy.
Differential Revision: https://reviews.llvm.org/D125553
Alexander Shaposhnikov [Fri, 13 May 2022 20:44:49 +0000 (20:44 +0000)]
[GlobalOpt] Enable optimization of constructors with different priorities
Adjust `optimizeGlobalCtorsList` to handle the case of different priorities.
This addresses the issue https://github.com/llvm/llvm-project/issues/55083.
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D125278
Joseph Huber [Fri, 13 May 2022 22:04:29 +0000 (18:04 -0400)]
[Cuda] Add the features using the last argument
Summary:
We should use the last argument so this flag can be overridden properly.
Alexey Bataev [Wed, 16 Mar 2022 20:18:54 +0000 (13:18 -0700)]
[SLP]Do not vectorize non-profitable alternate nodes.
If alternate node has only 2 instructions and the tree is already big
enough, better to skip the vectorization of such nodes, they are not
very profitable (the resulting code contains 3 instructions instead of
original 2 scalars). SLP can try to vectorize the buildvector sequence
in the next attempt, if it is profitable.
Metric: SLP.NumVectorInstructions
Program SLP.NumVectorInstructions
results results0 diff
test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C/miniAMR/miniAMR.test 72.00 73.00 1.4%
test-suite :: MultiSource/Benchmarks/Prolangs-C/TimberWolfMC/timberwolfmc.test 1186.00 1198.00 1.0%
test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test 241.00 242.00 0.4%
test-suite :: MultiSource/Applications/JM/lencod/lencod.test 2131.00 2139.00 0.4%
test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test 6377.00 6384.00 0.1%
test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test 6377.00 6384.00 0.1%
test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test 12650.00 12658.00 0.1%
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 26169.00 26147.00 -0.1%
test-suite :: MultiSource/Benchmarks/Trimaran/enc-3des/enc-3des.test 99.00 86.00 -13.1%
Gains:
526.blender_r - more vectorized trees.
enc-3des - same.
Others:
510.parest_r - no changes.
miniFE - same
623.xalancbmk_s - some (non-profitable) parts of the trees are not
vectorized.
523.xalancbmk_r - same
lencod - same
timberwolfmc - same
miniAMR - same
Differential Revision: https://reviews.llvm.org/D125571
Alan Zhao [Fri, 13 May 2022 20:09:11 +0000 (16:09 -0400)]
[llvm-ml] Add support for extern proc
EXTERN PROC isn't really well documented in MSVC, so after poking around
it seems as if it's just a regular extern symbol.
Interestingly enough, under MSVC the following is allowed:
extern foo:proc
mov eax, foo
MSVC will output:
mov eax, 0
while llvm-ml will currently output:
mov eax, dword ptr [foo]
(since foo is an extern)
Arguably, llvm-ml's output makes more sense, even though it's
inconsistent with MSVC ml. However, since moving an extern proc symbol
to a register doesn't really make sense in the first place, we'll treat
it as undefined behavior for now.
Reviewed By: epastor
Differential Revision: https://reviews.llvm.org/D125582
Eli Friedman [Fri, 13 May 2022 20:36:16 +0000 (13:36 -0700)]
[GlobalIsel] Fix fallback if stack protector isn't supported.
When GlobalISel fails, we need to report the error, and we need to set
the FailedISel property. We skipped those steps if stack protector
insertion failed, which led to a very strange miscompile.
Differential Revision: https://reviews.llvm.org/D125584
Egor Zhdan [Fri, 13 May 2022 21:05:28 +0000 (22:05 +0100)]
[Clang] Fix DriverKit tests on Linux
Some new DriverKit tests were added in https://reviews.llvm.org/D121911, and unfortunately they fail on Linux build bots.
Alexey Bataev [Fri, 13 May 2022 19:08:08 +0000 (12:08 -0700)]
[SLP]Do not look for buildvector sequence, if the index is reused.
If the insert index was used already or is not constant, we should stop
looking for a unique buildvector sequence; it must be split into
2 different buildvectors.
Joseph Huber [Fri, 13 May 2022 20:29:39 +0000 (16:29 -0400)]
[Libomptarget] Build the static library without CUDA installed
Summary:
This patch allows users to compile the static library without CUDA
installed on the system. This requires the new flag `--cuda-feature` to
indicate that we need `+ptx61` in order to compile the runtime.
Joseph Huber [Fri, 13 May 2022 20:16:15 +0000 (16:16 -0400)]
[CUDA] Add a flag to manually specify the target feature to use with CUDA
Summary:
Normally we parse through the CUDA installation to discover the needed
features. However, we may want to build libraries on targets that do not
currently have CUDA installed but still need to know which features to
make use of when creating the PTX or bitcode. This flag is a simple way
to specify this so we can compile certain codes without a valid CUDA
installation.
Ideally this could be done via an -Xarch or similar flag but currently
they cannot handle this. We would need to support using an -Xarch flag
that takes multiple arguments that then pass them to the -Xclang
functionality.
Amir Ayupov [Fri, 13 May 2022 20:14:45 +0000 (13:14 -0700)]
[BOLT][CMAKE] Fix DYLIB build
Move BOLT libraries out of `LLVM_LINK_COMPONENTS` to `target_link_libraries`.
Addresses issue #55432.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D125568
River Riddle [Wed, 11 May 2022 22:49:24 +0000 (15:49 -0700)]
[TableGen] Add a new json textmate description for syntax highlighting
There isn't really a good pre-existing syntax highlighter for tablegen, so this
commit adds a textmate version that covers nearly everything in the current
spec.
Differential Revision: https://reviews.llvm.org/D125427
Amir Ayupov [Fri, 13 May 2022 14:12:40 +0000 (07:12 -0700)]
[BOLT][TEST] Fix testing on macos
- Fix common (arch-independent) tests to explicitly target -linux triple.
- Override the triple inside arch-specific tests.
- Add cflags to common tests.
- Update individual tests.
- Expand pipe stderr `|&` shorthand.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D125548
Egor Zhdan [Wed, 16 Mar 2022 19:31:04 +0000 (19:31 +0000)]
[Clang] Add DriverKit support
This is the second patch that upstreams the support for Apple's DriverKit.
The first patch: https://reviews.llvm.org/D118046.
Differential Revision: https://reviews.llvm.org/D121911
Jonas Devlieghere [Fri, 13 May 2022 19:15:05 +0000 (12:15 -0700)]
[lldb] Parallelize fetching symbol files in crashlog.py
When using dsymForUUID, the majority of the time symbolicating a crashlog
with crashlog.py is spent waiting for it to complete. Currently, we're
calling dsymForUUID sequentially when iterating over the modules. We can
drastically cut down this time by calling dsymForUUID in parallel. This
patch uses Python's ThreadPoolExecutor (introduced in Python 3.2) to
parallelize this IO-bound operation.
The performance improvement is hard to benchmark, because even with an
empty local cache, consecutive calls to dsymForUUID for the same UUID
complete faster. With warm caches, I'm seeing a ~30% performance
improvement (~90s -> ~60s). I suspect the gains will be much bigger for
a cold cache.
dsymForUUID supports batching up multiple UUIDs. I considered going that
route, but that would require more intrusive changes. It would require
hoisting the logic out of locate_module_and_debug_symbols which we
explicitly document [1] as a feature of Symbolication.py to locate
symbol files.
[1] https://lldb.llvm.org/use/symbolication.html
Differential revision: https://reviews.llvm.org/D125107
Amara Emerson [Thu, 5 May 2022 20:55:49 +0000 (13:55 -0700)]
[GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef.
Differential Revision: https://reviews.llvm.org/D125041
Amir Ayupov [Fri, 13 May 2022 19:14:17 +0000 (20:14 +0100)]
[BOLT][NFC] Use refs for loop variables to avoid copies
Addresses warnings when built with Apple Clang.
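A small sketch of why the reference form matters. `Tracked`, `makeSample` and the two counting functions are hypothetical and exist only to make the copies observable; they are not BOLT code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Element type that counts how often it is copy-constructed.
struct Tracked {
  static int copies;
  std::string name;
  explicit Tracked(std::string n) : name(std::move(n)) {}
  Tracked(const Tracked &other) : name(other.name) { ++copies; }
  Tracked(Tracked &&) noexcept = default;
};
int Tracked::copies = 0;

// Build the sample in place so that construction itself makes no copies.
std::vector<Tracked> makeSample() {
  std::vector<Tracked> v;
  v.reserve(3);
  v.emplace_back("alpha");
  v.emplace_back("beta");
  v.emplace_back("gamma");
  return v;
}

// By-value loop variable: every iteration copy-constructs an element.
int copiesWhenIteratingByValue(const std::vector<Tracked> &v) {
  Tracked::copies = 0;
  std::size_t total = 0;
  for (const Tracked t : v)
    total += t.name.size();
  assert(total > 0 || v.empty());
  return Tracked::copies;
}

// By-reference loop variable: no copies are made.
int copiesWhenIteratingByRef(const std::vector<Tracked> &v) {
  Tracked::copies = 0;
  std::size_t total = 0;
  for (const Tracked &t : v)
    total += t.name.size();
  assert(total > 0 || v.empty());
  return Tracked::copies;
}
```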
Reviewed By: yota9
Differential Revision: https://reviews.llvm.org/D125483
Amir Ayupov [Fri, 13 May 2022 18:56:45 +0000 (19:56 +0100)]
[BOLT][NFC] Suppress unused variable warnings
Address warnings in Release build without assertions.
Tip @tschuett for reporting the issue #55404.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D125475
Fangrui Song [Fri, 13 May 2022 19:02:14 +0000 (12:02 -0700)]
[ELF][test] Add an input section description test with "()" in the filename
Fangrui Song [Fri, 13 May 2022 18:53:03 +0000 (11:53 -0700)]
[ELF][test] Clean up linkerscript/{filename-spec.s,group.s}
Amir Ayupov [Thu, 12 May 2022 17:29:41 +0000 (18:29 +0100)]
[BOLT][CMAKE] Add missing clauses to bolt/runtime/CMakeLists.txt
Fix build with Apple Clang.
Tip @tschuett for reporting the issue #55404.
Reviewed By: rafauler
Differential Revision: https://reviews.llvm.org/D125480
Louis Dionne [Fri, 13 May 2022 18:43:43 +0000 (14:43 -0400)]
[runtimes] Fix how we trigger CI
For example, we used to trigger CI even for commits that touched a file
whose path contained 'cmake', even if it's not the root cmake directory.
Fix that.
Joseph Huber [Tue, 10 May 2022 19:00:53 +0000 (15:00 -0400)]
[OpenMP] Use the new OpenMP device static library when doing LTO
The previous patches allowed us to create a static library containing
all the device code. This patch uses that library to perform the device
runtime linking late when performing LTO. This in addition to
simplifying the libraries, allows us to transparently handle the runtime
library as-needed without needing Clang to manually pass the necessary
library in the linker wrapper job.
Depends on D125315
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125333
Joseph Huber [Tue, 10 May 2022 12:49:08 +0000 (08:49 -0400)]
[Libomptarget] Build the device runtime as a static library
This patch adds the necessary CMake configuration to build a static
library version of the device runtime, `libomptarget.devicertl.a`.
Various improvements in how we handle static libraries and generating
offloading code should allow us to treat the device library as a regular
project without needing to invoke the clang front-end directly. Here we
generate a job for each offloading architecture supported. Each
offloading architecture will be embedded into the static library and
used as-needed by the host.
This library will primarily be used to replace the bitcode library when
performing LTO. Currently, we need to manually pass in the bitcode
library which requires foreknowledge of the offloading architecture.
This approach lets us handle that in the linker wrapper instead.
Furthermore this should improve our interface to the device runtime. We
can now build it fully under a release build and have all the expected
entry points, as well as supporting debug builds.
Depends on D125265 D125256 D125260 D125314 D125563
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D125315
Joseph Huber [Fri, 13 May 2022 17:07:46 +0000 (13:07 -0400)]
[Libomptarget] Remove global include directory from libomptarget
We used to globally include the libomptarget include directory for all
projects. This caused some conflicts with the other files named
"Debug.h". This patch changes the cmake to include these files via the
target include instead.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D125563
Joseph Huber [Tue, 10 May 2022 13:45:39 +0000 (09:45 -0400)]
[OpenMP] Don't set device runtime debugging flags if using '-nogpulib'
We use globals to configure debugging at compile-time for the device
runtime. Because these are only used by the OpenMP runtime we shouldn't
define them if we aren't using the device runtime. When a user passes in
'-nogpulib' this indicates that we are not using the device runtime, so
we should check for the presence of this flag and not emit these globals
if used.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125314
Joseph Huber [Mon, 9 May 2022 19:48:58 +0000 (15:48 -0400)]
[OpenMP] Don't include the device wrappers if -nostdinc is used
OpenMP uses several wrapper headers to provide the definitions of
needed symbols contained in the host. However, some users may use the
`-nostdinc` option to override these definitions themselves. The OpenMP
wrapper headers are stored in the same location as the clang install. If
the user passes `-nostdinc` then this include directory is never looked
at by default which means that including these wrappers will always
fail. These headers should instead be included manually if they are
needed with a `-nostdinc` build.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D125265
Joseph Huber [Mon, 9 May 2022 19:10:04 +0000 (15:10 -0400)]
[OpenMP] Add `__CUDA_ARCH__` definition when offloading with OpenMP
Currently we define the `__CUDA_ARCH__` macro only in CUDA mode. This
patch allows us to use this macro in OpenMP-offloading mode when
targeting NVPTX.
Reviewed By: tra, tianshilei1992
Differential Revision: https://reviews.llvm.org/D125256
Joseph Huber [Tue, 10 May 2022 21:33:41 +0000 (17:33 -0400)]
[Libomptarget] Address existing warnings in the device runtime library
This patch attempts to address the current warnings in the OpenMP
offloading device runtime. Previously we did not see these because we
compiled the runtime without the standard warning flags enabled.
However, these warnings are used when we now build the static library
version of this runtime. This became extremely noisy when coupled with
the fact that we compile each file roughly 32 times when all the
architectures are considered. So it would be ideal to not have all these
warnings show up when building.
Most of these warnings were simply implicit switch-case fallthroughs,
which can be addressed using C++17's fallthrough attribute. Additionally
there was a volatile variable that was being cast away. This is most
likely safe to remove because we cast it away before it's even used and
it didn't seem to affect anything in testing.
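For reference, the C++17 attribute looks like this in use. The function below is a made-up example, not code from the device runtime; it marks each intentional fallthrough so the compiler does not warn about it.

```cpp
#include <cassert>

// Hypothetical example: count how many stages run when starting at `stage`.
// Each earlier stage deliberately falls through into the later ones.
int stagesFrom(int stage) {
  int ran = 0;
  switch (stage) {
  case 0:
    ++ran;
    [[fallthrough]]; // intentional: stage 0 continues into stage 1
  case 1:
    ++ran;
    [[fallthrough]]; // intentional: stage 1 continues into stage 2
  case 2:
    ++ran;
    break;
  default:
    break;
  }
  return ran;
}
```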
Depends on D125260
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125339
Joseph Huber [Mon, 9 May 2022 18:22:59 +0000 (14:22 -0400)]
[Libomptarget] Allow the device runtime to be compiled for the host
Currently the OpenMP offloading device runtime is only expected to be
compiled for the specific architecture it's targeting. This is
problematic if we want to make compiling the device runtime more general
via the standard `clang` driver rather than invoking the clang front-end
directly. This patch addresses this by primarily changing the declare
type to `nohost` so the host will not contain any of this code.
Additionally we forward declare the functions that are defined via
variants, otherwise these would cause problems on the host.
Reviewed By: jdoerfert, tianshilei1992
Differential Revision: https://reviews.llvm.org/D125260
Louis Dionne [Fri, 13 May 2022 16:15:15 +0000 (12:15 -0400)]
[runtimes][NFC] Remove dead code for Standalone builds
Standalone builds have been deprecated and then removed for a while now.
Trying to use standalone builds leads to a fatal CMake error, so this
code is all dead. Remove it to clean things up.
Differential Revision: https://reviews.llvm.org/D125561
Simon Pilgrim [Fri, 13 May 2022 18:06:51 +0000 (19:06 +0100)]
Fix implicit double -> float truncation warnings. NFCI.
Fangrui Song [Fri, 13 May 2022 18:06:01 +0000 (11:06 -0700)]
[ELF] Disallow input section description without a filename
GNU ld does not allow `.foo : { (*foo) }`, but we may recognize it as three
input section descriptions: file "(" with any section name, file "*foo" with
any section name, file ")" with any section name. Disallow the error-prone usage.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D125523
Craig Topper [Fri, 13 May 2022 17:57:35 +0000 (10:57 -0700)]
Revert "[RISCV] Enable subregister liveness tracking for RVV."
This reverts most of
ed242b54c9c2aa84a47f66af5b8497d93646b68d
I'm seeing failures in our intrinsic testing on qemu that seem
related to this. Reverting while I investigate.
I've left the command line option in place for directed testing.
It defaults to off.
Petr Hosek [Fri, 13 May 2022 17:45:08 +0000 (10:45 -0700)]
[CMake] Disable libedit in Fuchsia toolchain
We don't need libedit in our toolchain build.
Differential Revision: https://reviews.llvm.org/D125570
Simon Pilgrim [Fri, 13 May 2022 17:31:15 +0000 (18:31 +0100)]
DAGCombiner.cpp - break if-else chains that always return (style)
Louis Dionne [Fri, 13 May 2022 17:24:57 +0000 (13:24 -0400)]
[libunwind] Remove -Wsign-conversion warning
Philip Reames [Fri, 13 May 2022 16:51:23 +0000 (09:51 -0700)]
[RISCV] Address post-commit feedback from af5e09b
Philip Reames [Fri, 13 May 2022 16:29:35 +0000 (09:29 -0700)]
[RISCV] Precommit tests showing missed vlenb optimizations
Louis Dionne [Tue, 10 May 2022 14:35:49 +0000 (10:35 -0400)]
[libc++abi][NFCI] Refactor demangling_terminate_handler to reduce nesting
This keeps the same logic, but uses early return to avoid multiple layers
of nested ifs and make the code simpler to follow.
Differential Revision: https://reviews.llvm.org/D125476
Hongtao Yu [Mon, 9 May 2022 20:23:59 +0000 (13:23 -0700)]
[CSSPGO][CSProfileConverter] Remove call target samples when including callee samples into caller.
When a flat CS profile is converted to a nested profile, the call target samples for inlined callee contexts are left over in the callsite target map. This could cause indirect call promotion to function improperly. One issue is that the inlined callsites are counted with double the number of samples. The other is that the inlined callsites are reconsidered for subsequent PGO ICP.
I'm fixing this by excluding call targets from the callsite for inlined targets. While fixing this I found that the callsite target sum and the number of body samples for that callsite could be mismatched. {D122609} has an explanation and a fix for that on the llvm-profgen side. For now I'm tolerating it in this change.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D125266
Philip Reames [Fri, 13 May 2022 16:01:19 +0000 (09:01 -0700)]
[RISCV] Add llvm.read.register support for vlenb
This patch adds minimal support for lowering a read.register intrinsic with vlenb as the argument. Note that vlenb is an implementation constant, so it is never allocatable.
This was split off a patch to eventually replace PseudoReadVLENB with a COPY MI because doing so revealed a couple of optimization opportunities which really seemed to warrant individual patches and tests. To write those patches, I need a way to write the tests involving vlenb, and read.register seemed like the right testing hook.
Differential Revision: https://reviews.llvm.org/D125552
Chris Lattner [Fri, 13 May 2022 14:38:50 +0000 (15:38 +0100)]
[ParseResult] Mark this as LLVM_NODISCARD (like LogicalResult) and fix issues.
There are a lot of cases where we accidentally ignored the result of some
parsing hook. Mark ParseResult as LLVM_NODISCARD just like LogicalResult is.
This exposed some stuff to clean up, so do.
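A minimal sketch of the effect, assuming LLVM_NODISCARD behaves like the standard `[[nodiscard]]` attribute. `ParseStatus`, `parseToken` and `parseTwoTokens` are hypothetical stand-ins, not the MLIR classes or hooks.

```cpp
#include <cassert>

// Hypothetical stand-in for a result type marked as "must be checked".
class [[nodiscard]] ParseStatus {
public:
  static ParseStatus success() { return ParseStatus(false); }
  static ParseStatus failure() { return ParseStatus(true); }
  bool failed() const { return IsError; }

private:
  explicit ParseStatus(bool Err) : IsError(Err) {}
  bool IsError;
};

// Hypothetical parsing hook: succeeds only on non-empty input.
ParseStatus parseToken(const char *Text) {
  return (Text && *Text) ? ParseStatus::success() : ParseStatus::failure();
}

bool parseTwoTokens(const char *A, const char *B) {
  // A bare `parseToken(A);` statement here would now be diagnosed,
  // because the returned ParseStatus would be silently dropped.
  if (parseToken(A).failed())
    return false;
  return !parseToken(B).failed();
}
```

The attribute does not change behavior at run time; it only turns an ignored result into a compile-time diagnostic.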
Differential Revision: https://reviews.llvm.org/D125549
Mike Rice [Thu, 12 May 2022 16:33:53 +0000 (09:33 -0700)]
[OpenMP] Fix declare simd use on in-class member template function
Return the Decl when parsing the template member declaration so the
'omp declare simd' pragma can be applied to it. Previously a nullptr
was returned causing an error applying the pragma.
Fixes #52700.
Differential Revision: https://reviews.llvm.org/D125493
Nikita Popov [Fri, 13 May 2022 15:19:19 +0000 (17:19 +0200)]
[InstSimplify] Add additional implied condition tests (NFC)
Balazs Benics [Fri, 13 May 2022 15:06:55 +0000 (17:06 +0200)]
Revert "[clang-tidy] modernize-deprecated-headers check should respect extern "C" blocks"
This reverts commit
7e3ea55da88a9d7feaa22f29d51f89fd0152a189.
Looks like this breaks tests: http://45.33.8.238/linux/76033/step_8.txt
Balazs Benics [Fri, 13 May 2022 10:34:45 +0000 (12:34 +0200)]
[analyzer] Introduce clang_analyzer_dumpSvalType introspection function
In some rare cases the type of an SVal might be interesting.
This introspection function exposes this information in tests.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D125532
Balazs Benics [Fri, 13 May 2022 15:04:34 +0000 (17:04 +0200)]
[analyzer][NFC] Tighten some of the SValBuilder return types
This is purely a cosmetic change.
Reviewed By: martong
Differential Revision: https://reviews.llvm.org/D125463
Sanjay Patel [Fri, 13 May 2022 14:30:58 +0000 (10:30 -0400)]
[SDAG] freeze operand when expanding urem
This is a potential miscompile as discussed in issue #55291.
The related IR transform was patched with:
d428f09b2c9d49f6a32
Sanjay Patel [Fri, 13 May 2022 14:28:22 +0000 (10:28 -0400)]
[x86] add test to show potential miscompile with undef value; NFC
This is based on:
c2a5a87500d92c
Balazs Benics [Fri, 13 May 2022 14:54:13 +0000 (16:54 +0200)]
[clang-tidy] modernize-deprecated-headers check should respect extern "C" blocks
The check should not report includes wrapped by `extern "C" { ... }` blocks,
such as:
```lang=C++
#ifdef __cplusplus
extern "C" {
#endif
#include "assert.h"
#ifdef __cplusplus
}
#endif
```
This pattern comes up sometimes in header files designed to be consumed
by both C and C++ source files.
The check currently emits false positives when the header file is consumed by
a C++ translation unit.
In this change, I'm not emitting the reports immediately from the
`PPCallback`, rather aggregating them for further processing.
After all preprocessing is done, the matcher will be called on the
`TranslationUnitDecl`, ensuring that the check callback is called only
once.
Within that callback, I'm recursively visiting each decl, looking for
`LinkageSpecDecl`s which represent the `extern "C"` specifier.
After this, I'm dropping all the reports coming from inside of it.
After the visitation is done, I'm emitting the reports I'm left with.
For performance reasons, I'm sorting the `IncludeMarkers` by their
corresponding locations.
This makes the scan `O(log(N))` when looking up the `IncludeMarkers`
affected by the given `extern "C"` block. For this, I'm using
`lower_bound()` and `upper_bound()`.
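The sorted lookup described above can be sketched as follows. This is a simplified illustration, not the check's code: `Marker` and `markersInside` are hypothetical, and a plain offset stands in for a real source location.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified marker: a plain offset replaces the source
// location and diagnostic data a real marker would carry.
struct Marker {
  unsigned Offset;
};

// Returns the half-open index range [First, Last) of markers whose offset
// lies in [BlockBegin, BlockEnd]. Requires `Sorted` to be ordered by Offset
// and BlockBegin <= BlockEnd.
std::pair<std::size_t, std::size_t>
markersInside(const std::vector<Marker> &Sorted, unsigned BlockBegin,
              unsigned BlockEnd) {
  // First marker at or after the start of the block.
  auto First = std::lower_bound(
      Sorted.begin(), Sorted.end(), BlockBegin,
      [](const Marker &M, unsigned Pos) { return M.Offset < Pos; });
  // First marker strictly after the end of the block.
  auto Last = std::upper_bound(
      Sorted.begin(), Sorted.end(), BlockEnd,
      [](unsigned Pos, const Marker &M) { return Pos < M.Offset; });
  return {static_cast<std::size_t>(First - Sorted.begin()),
          static_cast<std::size_t>(Last - Sorted.begin())};
}
```

Both searches are binary searches, so locating the affected range is logarithmic in the number of markers; dropping the markers inside the range is then linear in the size of that range.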
Reviewed By: whisperity
Differential Revision: https://reviews.llvm.org/D125209
PeixinQiao [Fri, 13 May 2022 14:43:12 +0000 (22:43 +0800)]
[flang] Warn for the limit on name length
Per Fortran 2018 C601, the maximum length of a name is 63 characters.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D125371
Nikita Popov [Fri, 13 May 2022 14:41:27 +0000 (16:41 +0200)]
[LoopVectorize] Regenerate test checks (NFC)
zhijian [Fri, 13 May 2022 14:40:15 +0000 (10:40 -0400)]
[AIX] support write operation of big archive.
SUMMARY
1. Enable support for the write operation of big archives.
2. The first commit comes from https://reviews.llvm.org/D104367
3. Refactor the first commit and implement writing the symbol table.
4. Fix bugs and add more test cases in the second commit.
Reviewers: James Henderson
Differential Revision: https://reviews.llvm.org/D123949
Simon Pilgrim [Fri, 13 May 2022 14:29:51 +0000 (15:29 +0100)]
[X86] LowerStore - use is64BitVector() wrapper. NFCI.
Kristof Beyls [Fri, 13 May 2022 14:28:06 +0000 (16:28 +0200)]
Update my office hours
Aaron Puchert [Fri, 13 May 2022 14:26:31 +0000 (16:26 +0200)]
Try to disambiguate between overloads on Mac
Presumably Mac has a different understanding of how long `long` is.
Should fix a build error introduced by D125429 that's not visible on
other architectures.
Jay Foad [Fri, 13 May 2022 13:17:23 +0000 (14:17 +0100)]
[APInt] Fix documentation of *OrSelf methods
Document that truncOrSelf, zextOrSelf and sextOrSelf only enforce
an upper or lower bound on the bitwidth of the result.
Aaron Ballman [Fri, 13 May 2022 14:23:23 +0000 (10:23 -0400)]
Remove a stale FIXME comment; NFC
Sanjay Patel [Fri, 13 May 2022 13:15:22 +0000 (09:15 -0400)]
[ValueTracking] recognize sub X, (X % Y) as not overflowing
I fixed some poison-safety violations on related patterns in InstCombine
and noticed that we missed adding nsw/nuw on them, so this adds clauses
to the underlying analysis for that.
We need the undef input restriction to make this safe according to Alive2:
https://alive2.llvm.org/ce/z/48g9K8
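As an informal sanity check of the arithmetic (not a substitute for the Alive2 proof, and not modelling undef or poison), the claim can be brute-forced for 8-bit values. The function name is hypothetical; operands are widened to `int` so the check itself cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks, for 8-bit operands, that X - (X % Y) stays in range:
// the signed result fits in int8_t (no signed wrap) and the unsigned
// remainder never exceeds X (no unsigned wrap).
bool subOfRemNeverOverflows() {
  for (int X = INT8_MIN; X <= INT8_MAX; ++X) {
    for (int Y = INT8_MIN; Y <= INT8_MAX; ++Y) {
      if (Y == 0)
        continue; // remainder by zero is undefined, so skip it
      int Wide = X - (X % Y); // evaluated in int, cannot overflow here
      if (Wide < INT8_MIN || Wide > INT8_MAX)
        return false;
    }
  }
  for (unsigned X = 0; X <= UINT8_MAX; ++X)
    for (unsigned Y = 1; Y <= UINT8_MAX; ++Y)
      if ((X % Y) > X)
        return false; // subtraction would wrap below zero
  return true;
}
```

The intuition is that the remainder has the same sign as X and a magnitude no larger than X's, so the difference always lies between 0 and X.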
Differential Revision: https://reviews.llvm.org/D125500
Sanjay Patel [Thu, 12 May 2022 20:08:51 +0000 (16:08 -0400)]
[InstCombine] add tests for sub with rem operand; NFC
Nico Weber [Fri, 13 May 2022 13:48:01 +0000 (09:48 -0400)]
Revert "In MSVC compatibility mode, friend function declarations behave as function declarations"
This reverts commit
ad47114ad8500c78046161d492ac13a8e3e610eb.
See discussion on https://reviews.llvm.org/D124613.
Stephen Long [Fri, 13 May 2022 13:39:19 +0000 (06:39 -0700)]
[MSVC] Add support for pragma function
MSVC pragma function tells the compiler to generate calls to functions in the pragma function list, instead of using the builtin. Needs https://reviews.llvm.org/D124701
https://docs.microsoft.com/en-us/cpp/preprocessor/function-c-cpp?view=msvc-170
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D124702
Jonas Paulsson [Tue, 8 Mar 2022 20:51:43 +0000 (15:51 -0500)]
[SystemZ] Patchset for expanding memcpy/memset using at most two stores.
* Set MaxStoresPerMemcpy and MaxStoresPerMemset to 2.
* Optimize stores of replicated values in SystemZ::combineSTORE(). This
handles the now expanded memory operations as well as some other
pre-existing cases.
* Reject a big displacement in isLegalAddressingMode() for a vector type.
* Return true from shouldConsiderGEPOffsetSplit().
Reviewed By: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D122105
Nikita Popov [Fri, 13 May 2022 13:29:20 +0000 (15:29 +0200)]
[ControlHeightReduction] Simplify addToMergedCondition() (NFC)
Ken Matsui [Fri, 13 May 2022 13:09:32 +0000 (09:09 -0400)]
Suggest typo corrections for preprocessor directives
When a preprocessor directive is unknown outside of a skipped
conditional block, we give an error diagnostic because we don't know
how to proceed with preprocessing. But when the directive is in a
skipped conditional block, we would not diagnose it on the theory that
the directive may be known to an implementation other than Clang.
Now, for unknown directives inside a skipped conditional block, we
diagnose the unknown directive as a warning if it is sufficiently
similar to a directive specific to preprocessor conditional blocks. For
example, we'll warn about `#esle` and suggest `#else` but we won't warn
about `#progma` because it's not a directive specific to preprocessor
conditional blocks.
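A rough sketch of the idea, assuming a plain Levenshtein distance and a fixed cutoff of 2. The candidate list, the cutoff, and the names `editDistance` and `suggestDirective` are assumptions made for illustration and may differ from what Clang does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Classic two-row Levenshtein distance (insert, delete, substitute).
std::size_t editDistance(const std::string &A, const std::string &B) {
  std::vector<std::size_t> Prev(B.size() + 1), Cur(B.size() + 1);
  for (std::size_t J = 0; J <= B.size(); ++J)
    Prev[J] = J;
  for (std::size_t I = 1; I <= A.size(); ++I) {
    Cur[0] = I;
    for (std::size_t J = 1; J <= B.size(); ++J) {
      std::size_t Subst = Prev[J - 1] + (A[I - 1] == B[J - 1] ? 0 : 1);
      Cur[J] = std::min({Subst, Prev[J] + 1, Cur[J - 1] + 1});
    }
    std::swap(Prev, Cur);
  }
  return Prev[B.size()];
}

// Returns the closest conditional directive within the assumed cutoff,
// or an empty string when nothing is similar enough.
std::string suggestDirective(const std::string &Unknown) {
  // Assumed candidate set: directives tied to conditional blocks.
  static const char *const Candidates[] = {"if",      "ifdef",    "ifndef",
                                           "elif",    "elifdef",  "elifndef",
                                           "else",    "endif"};
  const std::size_t MaxDistance = 2; // assumed cutoff, for illustration only
  std::string Best;
  std::size_t BestDistance = MaxDistance + 1;
  for (const char *Candidate : Candidates) {
    std::size_t D = editDistance(Unknown, Candidate);
    if (D < BestDistance) {
      BestDistance = D;
      Best = Candidate;
    }
  }
  return Best;
}
```

With these assumptions `esle` maps to `else`, while `progma` yields no suggestion because it shares no letters with any conditional directive.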
Fixes #51598
Differential Revision: https://reviews.llvm.org/D124726
David Sherwood [Fri, 6 May 2022 14:14:06 +0000 (15:14 +0100)]
[LoopVectorize] Add overflow checks when tail-folding with scalable vectors
In InnerLoopVectorizer::getOrCreateVectorTripCount there is an
assert that the known minimum value for the VF is a power of 2
when tail-folding is enabled. However, for scalable vectors the
value of vscale may not be a power of 2, which means we have
to worry about the possibility of overflow. I have solved this
problem by adding preheader checks that prevent us from entering
the vector body if the canonical IV would overflow, i.e.
if ((IntMax - TripCount) < (VF * UF)) ... skip vector loop ...
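The guard above can be written out as a small stand-alone function. This is a sketch under assumptions: a 32-bit induction variable, `VF` already multiplied by the runtime vscale, and a hypothetical name `mustSkipVectorLoop`. It is not the code the vectorizer emits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true when the vector loop must be skipped because advancing the
// canonical IV in steps of VF * UF could wrap past IntMax.
bool mustSkipVectorLoop(std::uint32_t TripCount, std::uint32_t VF,
                        std::uint32_t UF) {
  const std::uint32_t IntMax = std::numeric_limits<std::uint32_t>::max();
  // Widen the step so the product itself cannot wrap in this sketch.
  const std::uint64_t Step = static_cast<std::uint64_t>(VF) * UF;
  // IntMax - TripCount is the headroom left above the trip count.
  return static_cast<std::uint64_t>(IntMax - TripCount) < Step;
}
```

When the step is a power of 2 the rounded-up trip count wraps harmlessly to zero, which is why the extra check only matters once vscale may be a non-power-of-2 value.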
Differential Revision: https://reviews.llvm.org/D125235
Nikita Popov [Fri, 13 May 2022 08:38:33 +0000 (10:38 +0200)]
[InstSimplify] Fold and/or using implied conditions
This adds two conjugated folds:
* A | B -> B if A implies B (https://alive2.llvm.org/ce/z/R6GU4j)
* A & B -> A if A implies B (https://alive2.llvm.org/ce/z/EGMqyy)
If A and B are icmps themselves, we will usually fold this through
other logic already (though the tests show a couple additional cases
we previously missed). However, isImpliedCond() also supports A
being of the form X & Y, which allows us to handle cases like
(X & Y) | B where X implies B. This addresses the regression from
D125398.
Something that notably doesn't work yet is the (X | Y) & B case.
This is due to an asymmetry in the isImpliedCondition()
implementation that will have to be addressed separately.
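The two folds can be sanity-checked on plain booleans. The sketch below is illustrative only: the function names are hypothetical, and the integer comparisons are just one example of a pair where A implies B.

```cpp
#include <cassert>

// Checks both folds over every boolean pair in which A implies B.
bool foldsHoldUnderImplication() {
  for (int AI = 0; AI <= 1; ++AI) {
    for (int BI = 0; BI <= 1; ++BI) {
      const bool A = AI != 0, B = BI != 0;
      if (A && !B)
        continue; // A does not imply B for this pair
      if ((A || B) != B)
        return false; // A | B -> B
      if ((A && B) != A)
        return false; // A & B -> A
    }
  }
  return true;
}

// Concrete instance: A is (X > 10), B is (X > 5), so A implies B.
// The last check mirrors the (X & Y) | B case with an arbitrary second
// operand (X is even).
bool icmpExampleHolds() {
  for (int X = -20; X <= 20; ++X) {
    const bool A = X > 10, B = X > 5, Even = (X % 2) == 0;
    if ((A || B) != B || (A && B) != A)
      return false;
    if (((A && Even) || B) != B)
      return false;
  }
  return true;
}
```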
Differential Revision: https://reviews.llvm.org/D125530