Roman Lebedev [Wed, 7 Dec 2022 23:27:17 +0000 (02:27 +0300)]
[NFC] Port all ConstraintElimination tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 23:27:16 +0000 (02:27 +0300)]
[NFC] Port all ConstantHoisting tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 23:27:16 +0000 (02:27 +0300)]
[NFC] Port all CodeExtractor tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 23:27:16 +0000 (02:27 +0300)]
[NFC] Port all CanonicalizeFreezeInLoops tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 23:27:15 +0000 (02:27 +0300)]
[NFC] Port all CallSiteSplitting tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 23:27:15 +0000 (02:27 +0300)]
[NFC] Port all BlockExtractor tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 23:27:14 +0000 (02:27 +0300)]
[NFC] Port all Attributor tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 23:27:14 +0000 (02:27 +0300)]
[NFC] Port all ArgumentPromotion tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 23:27:13 +0000 (02:27 +0300)]
[NFC] Port all ADCE tests to `-passes=` syntax
Tue Ly [Wed, 7 Dec 2022 19:24:46 +0000 (14:24 -0500)]
[libc] Fix undefined behavior in UInt<>::shift_right.
Fix undefined behavior of left-shifting uint64_t by 64 in
`UInt<>::shift_right` implementation.
Reviewed By: michaelrj, sivachandra
Differential Revision: https://reviews.llvm.org/D139566
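A minimal sketch of the kind of undefined behavior described above, assuming a two-word 128-bit value. The `U128` struct and this `shift_right` helper are hypothetical and are not the libc `UInt<>` code. When the shift amount is 0, carrying bits down from the high word would be computed as `hi << 64`, which is undefined for `uint64_t`, so that case is guarded separately.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-word unsigned integer (not the libc UInt<> type).
struct U128 {
  uint64_t lo;
  uint64_t hi;
};

// Shift right by s bits (0 <= s < 128) without ever shifting a
// uint64_t by 64 or more, which would be undefined behavior.
inline U128 shift_right(U128 v, unsigned s) {
  assert(s < 128 && "shift amount out of range");
  if (s >= 64) {
    // Whole-word shift: the high word moves into the low word.
    unsigned r = s - 64; // r is in [0, 63]
    return U128{v.hi >> r, 0};
  }
  if (s == 0) {
    // Guard: `v.hi << (64 - 0)` would shift by 64, which is undefined.
    return v;
  }
  // 1 <= s <= 63, so both shift amounts below are in [1, 63].
  return U128{(v.lo >> s) | (v.hi << (64 - s)), v.hi >> s};
}
```

The design choice in this sketch is to branch on the problematic shift amounts instead of masking them, which keeps every shift amount visibly in range.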
Akira Hatanaka [Wed, 16 Nov 2022 22:20:23 +0000 (14:20 -0800)]
Add support for a backdoor driver option that enables emitting header
usage information in JSON to a file
Each line in the file is a JSON object that has the name of the main
source file followed by the list of system header files included
directly or indirectly from that file.
For example:
{"source":"/tmp/foo.c",
"includes":["/usr/include/stdio.h", "/usr/include/stdlib.h"]}
To reduce the amount of data written to the file, only the system
headers that are directly included from a non-system header file are
recorded.
In order to emit the header information in JSON, it is necessary to set
the following environment variables:
CC_PRINT_HEADERS_FORMAT=json CC_PRINT_HEADERS_FILTERING=only-direct-system
The following combination is equivalent to setting CC_PRINT_HEADERS=1:
CC_PRINT_HEADERS_FORMAT=textual CC_PRINT_HEADERS_FILTERING=none
Differential Revision: https://reviews.llvm.org/D137996
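The filtering rule described above (record only system headers that are included directly from a non-system file) can be sketched as follows. This is an illustration only: the `IncludeEdge` struct and `directSystemHeaders` function are hypothetical names and do not reflect how clang tracks includes internally.

```cpp
#include <string>
#include <vector>

// Hypothetical record of one #include edge (not a clang data structure).
struct IncludeEdge {
  std::string includer;
  bool includerIsSystem;
  std::string included;
  bool includedIsSystem;
};

// Keep only system headers that are included directly from a
// non-system file. Duplicates are not removed in this sketch.
inline std::vector<std::string>
directSystemHeaders(const std::vector<IncludeEdge> &edges) {
  std::vector<std::string> out;
  for (const IncludeEdge &e : edges)
    if (e.includedIsSystem && !e.includerIsSystem)
      out.push_back(e.included);
  return out;
}
```

Under this rule, a system header pulled in only by another system header is dropped, which is what reduces the amount of data written.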
Krzysztof Parzyszek [Wed, 7 Dec 2022 18:04:25 +0000 (10:04 -0800)]
[Bitcode(Reader|Writer)] Convert Optional to std::optional
bixia1 [Wed, 7 Dec 2022 20:54:50 +0000 (12:54 -0800)]
[mlir][sparse] Add dependence on bufferization.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D139571
Mahesh Ravishankar [Wed, 7 Dec 2022 01:00:37 +0000 (01:00 +0000)]
[mlir][Transforms] Simplify region before simplifying operation in CSE.
This covers more options for CSE. It also ensures that two operations
that have same operands but different regions to begin with, but same
regions after `simplifyRegions`, don't get both added to the list of
`knownValues`.
Fixes #59135
Differential Revision: https://reviews.llvm.org/D139490
Leonard Chan [Wed, 7 Dec 2022 23:09:53 +0000 (23:09 +0000)]
[compiler-rt][hwasan] Let CheckAddressSized eventually call HandleTagMismatch on Fuchsia
Any hwasan tag checking done through runtime calls like __hwasan_mem* or
__hwasan_load/store* currently raises a sigtrap on a tag mismatch. Hwasan
dumps as much information as it knows about the tag mismatch by placing
important values in specific registers before the brk and encoding the
access information in the optional argument supplied to the brk. If the
platform hwasan runs on uses signal handlers, then users can see the
typical pretty hwasan error report, but Fuchsia doesn't use signal
handlers, so it's left up to the platform exception handler to print all
this encoded information.
This patch attempts to enter the regular error reporting path via
HandleTagMismatch if a new macro CAN_GET_REGISTERS is set. For now this
is only defined for Fuchsia + aarch64, but can be expanded for other
platforms.
Differential Revision: https://reviews.llvm.org/D139377
Johannes Doerfert [Tue, 4 Oct 2022 12:45:21 +0000 (05:45 -0700)]
[AMDGPU] Annotate the intrinsics to be default and nocallback
Differential Revision: https://reviews.llvm.org/D135155
Jakub Kuderski [Wed, 7 Dec 2022 22:21:41 +0000 (17:21 -0500)]
[mlir][arith] Fix comment typo. NFC.
Jakub Kuderski [Wed, 7 Dec 2022 22:15:55 +0000 (17:15 -0500)]
[mlir][arith] Rename addui_carry to addui_extended
The goal is to make the naming of the future `_extended` ops more
consistent. With unsigned addition, the carry value/flag and overflow
bit are the same, but this is not true when it comes to signed addition.
Also rename the second result from `carry` to `overflow`.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D139569
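To illustrate why "carry" and "overflow" coincide for unsigned addition but not for signed addition, here is a small sketch on 32-bit values. The function names `addExtendedUnsigned` and `signedAddOverflows` are hypothetical and only model the arithmetic; they are not MLIR APIs.

```cpp
#include <cstdint>
#include <utility>

// Returns {sum, overflow}. For unsigned addition the overflow bit is
// exactly the carry out of the top bit.
inline std::pair<uint32_t, bool> addExtendedUnsigned(uint32_t a, uint32_t b) {
  uint32_t sum = a + b;    // wraps modulo 2^32 (well defined for unsigned)
  bool overflow = sum < a; // true exactly when the addition wrapped
  return {sum, overflow};
}

// Signed overflow, computed in a wider type to avoid undefined behavior.
inline bool signedAddOverflows(int32_t a, int32_t b) {
  int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  return wide < INT32_MIN || wide > INT32_MAX;
}
```

For example, the bit pattern 0xFFFFFFFF added to itself produces a carry when read as unsigned, but read as signed it is -1 + -1 = -2, which does not overflow.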
Jon Chesterfield [Wed, 7 Dec 2022 22:02:53 +0000 (22:02 +0000)]
[amdgpu] Reimplement LDS lowering
Renames the current lowering scheme to "module" and introduces two new
ones, "kernel" and "table", plus a "hybrid" that chooses between those three
on a per-variable basis.
Unit tests are set up to pass with the default lowering of "module" or "hybrid"
with this patch defaulting to "module", which will be a less dramatic codegen
change relative to the current. This reflects the sparsity of test coverage for
the table lowering method. Hybrid is better than module in every respect and
will be default in a subsequent patch.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139433
Alexander Belyaev [Wed, 7 Dec 2022 17:30:36 +0000 (18:30 +0100)]
[mlir] Make patterns for folding tensor.empty optional.
At the moment, they are a part of EmptyOp::getCanonicalizationPatterns. When
extract_slice(tensor.empty) is rewritten as a new tensor.empty, it could
happen that we end up with two tensor.empty ops, since the original
tensor.empty can have two users. After bufferization such cases result in two
allocations.
Differential Revision: https://reviews.llvm.org/D139308
Simon Pilgrim [Wed, 7 Dec 2022 21:52:06 +0000 (21:52 +0000)]
[llvm-exegesis][x86] Add test coverage for Issue #38507
Ensure that the PBLENDVBrr0 destination register is never xmm0
Bran Hagger [Wed, 7 Dec 2022 10:00:15 +0000 (12:00 +0200)]
Enable kmpc_atomic functions for arm64
Define the same kmpc_atomic functions for arm and arm64 that are defined for x86 and x64.
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D139139
Chris Bieneman [Mon, 5 Dec 2022 20:21:41 +0000 (14:21 -0600)]
Generate DXIL Shader hash
DXIL shader bitcode is hashed and the hash is placed into the final
output object file in its own data part.
This change modifies the DXContainerGlobals pass to compute the shader
hash (just an MD5 of the bitcode) and put the shader hash data into a
global for the HASH part.
This also sets the hash flag as appropriate for if the hashed shader
contained debug information. There is additional handling required to
get debug information in shaders working correctly with our tooling,
but that will be addressed in subsequent patches.
Reviewed By: python3kgae
Differential Revision: https://reviews.llvm.org/D139357
Alexander Yermolovich [Wed, 7 Dec 2022 21:14:23 +0000 (13:14 -0800)]
Revert "[llvm][dwwarf] Change CU/TU index to 64-bit"
This reverts commit
5ebd28f3e56c00a739fda46c72c9e0f6528add87.
Alexander Yermolovich [Wed, 7 Dec 2022 21:14:11 +0000 (13:14 -0800)]
Revert "[DWARFLibrary] Add support to re-construct cu-index"
This reverts commit
a5bd76a6e3119af9dd9c1d8af89e2b89f5267deb.
Alexander Yermolovich [Wed, 7 Dec 2022 20:22:58 +0000 (12:22 -0800)]
[BOLT][DWARF] Don't create extra .debug_str_offsets contributions
With ThinLTO, multiple CUs can share the same .debug_str_offsets contribution. We
were creating a new one for each CU. This led to a binary size increase.
Reviewed By: maksfb
Differential Revision: https://reviews.llvm.org/D139214
Alexander Yermolovich [Wed, 7 Dec 2022 00:07:59 +0000 (16:07 -0800)]
[DWARFLibrary] Add support to re-construct cu-index
Summary:
According to DWARF5 specification and gnu specification for DWARF4 the offset
entry in the CU/TU Index is 32 bits. This presents a problem when
.debug_info.dwo in DWP file grows beyond 4GB. The CU Index becomes partially
corrupted.
This diff adds manual parsing of .debug_info.dwo/.debug_abbrev.dwo to
reconstruct CU index in general, and TU index for DWARF5. This is a workaround
until DWARF6 spec is finalized.
Next patch will change internal CU/TU struct to 64 bit, and change uses as
necessary. The plan is to land all the patches in one go after all are approved.
This patch originates from the discussion in: https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
Differential Revision: https://reviews.llvm.org/D137882
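The 4GB limit mentioned above follows directly from storing an offset in 32 bits. A minimal sketch of the arithmetic, using hypothetical helper names (`fitsIn32`, `truncateTo32`) that are not part of DWARFLibrary:

```cpp
#include <cstdint>

// True if the offset can be represented in a 32-bit index entry.
inline bool fitsIn32(uint64_t offset) { return offset <= UINT32_MAX; }

// What a 32-bit field would actually hold for a given 64-bit offset:
// the value modulo 2^32, so offsets past 4GB silently wrap around.
inline uint32_t truncateTo32(uint64_t offset) {
  return static_cast<uint32_t>(offset);
}
```

An offset of 4GB + 16 bytes would be stored as 16, which is why index entries for units past the 4GB mark end up pointing at the wrong data.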
Alexander Yermolovich [Tue, 6 Dec 2022 00:37:26 +0000 (16:37 -0800)]
[llvm][dwwarf] Change CU/TU index to 64-bit
Summary:
Changed contribution data structure to 64 bit. I added the 32bit and 64bit
accessors to make it explicit where we use 32bit and where we use 64bit. Also to
make sure we catch all the cases where this data structure is used.
Craig Topper [Wed, 7 Dec 2022 20:57:04 +0000 (12:57 -0800)]
Revert "[RISCV] Return InstSeq from generateInstSeqImpl instead of using an output parameter. NFC"
This reverts commit
d24915207c631b7cf637081f333b41bc5159c700.
Thinking about this more this probably chewed up 100+ bytes of stack
for each recursive call. So this probably needs more thought. The
code simplification wasn't that much.
Matt Arsenault [Tue, 6 Dec 2022 23:51:16 +0000 (18:51 -0500)]
NVPTX: Cleanup check for denormal mode
Go through the common query and be explicit about the supported flush
type.
Matt Arsenault [Wed, 7 Dec 2022 00:45:17 +0000 (19:45 -0500)]
AMDGPU: Rename test functions and add some cases for consistency
Test all the permutations.
David Tenty [Wed, 7 Dec 2022 20:40:22 +0000 (15:40 -0500)]
Revert "[libunwind] Use .irp directives. NFC"
This reverts commit
8482e95f75d02227fbf51527680c0b5424bacb69, which breaks on AIX
due to unsupported pseudo-ops in the assembly.
Differential Revision: https://reviews.llvm.org/D139368
Nicolai Hähnle [Thu, 1 Dec 2022 11:24:01 +0000 (12:24 +0100)]
GISel/Combiner: maintain created instructions in a SetVector
This is not a correctness fix because the set is only used for debug
output. However, it helps avoid noise when looking at diffs between
compiler runs.
The set is only maintained with debug output enabled, so the added cost
should be acceptable.
Differential Revision: https://reviews.llvm.org/D139465
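The reason a SetVector reduces noise is that it iterates in insertion order, while a plain set of pointers iterates in an order that can depend on addresses and so can differ between runs. A minimal sketch of the idea, assuming nothing about the real llvm::SetVector implementation (the `SimpleSetVector` class below is hypothetical):

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical insertion-ordered set: a hash set for membership tests
// plus a vector that remembers the order of first insertion.
template <typename T> class SimpleSetVector {
  std::unordered_set<T> Seen;
  std::vector<T> Order;

public:
  // Returns true if the value was newly inserted.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false; // already present, order unchanged
    Order.push_back(V);
    return true;
  }

  // Iteration order is the order of first insertion, so it is
  // deterministic regardless of hash values or addresses.
  const std::vector<T> &items() const { return Order; }
};
```

The extra vector is the "added cost" of this approach: each element is stored twice.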
Koakuma [Wed, 7 Dec 2022 20:31:31 +0000 (15:31 -0500)]
[SPARC] Lower SELECT_CC to MOVr on 64-bit target whenever possible
On 64-bit target, when doing i64 SELECT_CC where one of the comparison operands
is a constant zero, try to fold the compare and MOVcc into a MOVr instruction.
For all integers, EQ and NE comparisons are available; additionally, for signed
integers, GT, GE, LT, and LE are also available.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138922
Shilei Tian [Wed, 7 Dec 2022 20:28:20 +0000 (15:28 -0500)]
[NFC][Object] Include header `BitcodeReader.h` instead of using forward declaration for BitcodeModule
`BitcodeModule` is used as element of a vector in `IRSymtabFile`, while in the
header there is only a forward declaration. It will work if the header `BitcodeReader.h`
is included before including `IRObjectFile.h`. However, that is not always the case,
which causes a compilation error. This patch simply includes the header and removes the
forward declaration.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D139556
Brad Smith [Wed, 7 Dec 2022 20:22:28 +0000 (15:22 -0500)]
Revert "[SPARC] Mark the %g0 register as constant & use it to materialize zeros"
2 of the Sparc tests are now failing.
This reverts commit
2c41310fc146a1f609147c65ac5f30e5a57e84a8.
Craig Topper [Wed, 7 Dec 2022 20:25:30 +0000 (12:25 -0800)]
[RISCV] Without Zfh, promote f16 inputs before creating RISCVISD::FCVT_W(U)_RV64 nodes.
This allows us to remove a couple more Zfhmin isel patterns.
Roman Lebedev [Wed, 7 Dec 2022 19:53:08 +0000 (22:53 +0300)]
[NFC] Port all SimpleLoopUnswitch tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 19:42:54 +0000 (22:42 +0300)]
[NFC] Port all (but one) HotColdSplit tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 19:37:54 +0000 (22:37 +0300)]
[NFC] Port all CodeExtractor tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 19:31:04 +0000 (22:31 +0300)]
[NFC] Port all LoopIdiom tests to `-passes=` syntax
Nico Weber [Wed, 7 Dec 2022 20:14:36 +0000 (15:14 -0500)]
[gn build] Reformat all build files
Ran:
git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format
Javier Setoain [Mon, 7 Nov 2022 21:27:36 +0000 (21:27 +0000)]
[mlir] Add hoisting of transfer ops in affine loops
The only way to do this with the current hoisting strategy is by
lowering Affine to Scf first, but that prevents further passes on
Affine.
Differential Revision: https://reviews.llvm.org/D137600
bixia1 [Wed, 7 Dec 2022 16:57:40 +0000 (08:57 -0800)]
[mlir][sparse] Improve concatenate operation conversion for the case with annotated all dense result.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D139345
Nico Weber [Wed, 7 Dec 2022 20:03:16 +0000 (15:03 -0500)]
[clang] Tweak test to tolerate clang being called "clang" instead of "clang-15"
See also
af95441ba7f3ca6.
Craig Topper [Wed, 7 Dec 2022 19:58:30 +0000 (11:58 -0800)]
[RISCV] Promote f16 fp_to_int_sat with Zfhmin during lowering instead of isel.
We already have a custom handler for FP_TO_(S/U)INT_SAT. It's easy
enough to inject an FP_EXTEND in there.
Jacob Lambert [Wed, 7 Dec 2022 19:50:11 +0000 (11:50 -0800)]
[Driver][test] Fix test by creating empty archive instead of empty file
Differential Revision: https://reviews.llvm.org/D137275
James Y Knight [Mon, 21 Nov 2022 01:41:42 +0000 (20:41 -0500)]
[SPARC] Simplify instruction decoder.
After https://reviews.llvm.org/D137653 named sub-operands can be used
in the auto-generated instruction decoders. This allows the
auto-generated decoders to work properly, so all the hand-coded
decoders in the sparc target can be removed.
In some instances, a manually-written decoder had not been implemented
for an instruction, and thus that instruction was not decoded
properly. These have been fixed (and tests added).
Differential Revision: https://reviews.llvm.org/D137727
James Y Knight [Tue, 8 Nov 2022 22:11:04 +0000 (17:11 -0500)]
[TableGen] More named sub-operands work.
Commit
a538d1f13a13 first added support for named sub-operands in
CodeEmitterGen. We now add a few more features to that, enabling
further target cleanups.
1. Adds support for handling an EncoderMethod in a sub-operand in
CodeEmitterGen. Previously, the specified encoder of a sub-operand was
ignored, and only the default used.
2. Adds support for sub-operands in DecoderEmitter, along with support
for tied sub-operands.
The changes to the decoder required a few minor tweaks to a few
targets, where existing brokenness was exposed. In order to keep this
patch small, I left FIXMEs which will be addressed in upcoming
patches. (Except MIPS16, since its object file emission/decoding is
totally broken).
Differential Revision: https://reviews.llvm.org/D137653
Craig Topper [Wed, 7 Dec 2022 19:26:58 +0000 (11:26 -0800)]
[RISCV] Consolidate identical (fcopysign FPR32:, FPR16:) isel patterns. NFC
Rob Suderman [Wed, 7 Dec 2022 19:11:51 +0000 (11:11 -0800)]
[mlir][tosa] Fix tosa.resize for i48 accumulator
Implementation assumed an i32 accumulator. Fixed the implementation to
work with an i48 accumulator.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D139365
Keith Smiley [Wed, 7 Dec 2022 19:23:42 +0000 (11:23 -0800)]
Fix @llvm.global_ctors docs (NFC)
Roman Lebedev [Wed, 7 Dec 2022 19:10:05 +0000 (22:10 +0300)]
[NFC] Port all IndVarSimplify tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 19:01:38 +0000 (22:01 +0300)]
[NFC] Port all RewriteStatepointsForGC tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 18:59:26 +0000 (21:59 +0300)]
[NFC] Port all MergeFunc tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 18:57:37 +0000 (21:57 +0300)]
[NFC] Port all GVN tests to `-passes=` syntax
Roman Lebedev [Wed, 7 Dec 2022 18:52:06 +0000 (21:52 +0300)]
[NFC] Port all IROutliner tests to `-passes=` syntax
Tim Northover [Thu, 10 Nov 2022 09:22:40 +0000 (09:22 +0000)]
AArch64: emit `fcmp ord %a, zeroinitializer` as a single fcmeq.
Most "ord" checks need two real-world compares to implement, but this is the
canonical form of a "!isnan" check, which is equivalent to comparing the input
for equality against itself.
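The equivalence relied on above can be checked on scalar doubles. This is a sketch with hypothetical helper names; it models the comparison semantics only and says nothing about the AArch64 lowering itself. It assumes strict IEEE semantics (no fast-math style flags).

```cpp
#include <cmath>

// "Ordered" comparison: true when neither operand is NaN.
inline bool ordered(double A, double B) {
  return !std::isnan(A) && !std::isnan(B);
}

// `fcmp ord x, 0.0`: zero is never NaN, so this reduces to !isnan(x).
inline bool ordWithZero(double X) { return ordered(X, 0.0); }

// Comparing a value for equality against itself is false only for NaN.
inline bool selfEqual(double X) { return X == X; }
```

Because `ordWithZero` and `selfEqual` agree on every input, a single equality compare of the input with itself is enough.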
Roman Lebedev [Wed, 7 Dec 2022 18:35:57 +0000 (21:35 +0300)]
[NFC] Port all SLPVectorizer tests to `-passes=` syntax
Thomas Raoux [Tue, 6 Dec 2022 23:14:00 +0000 (23:14 +0000)]
[mlir][linalg] Add extra parameter to tiling reduction to foreach_thread
This adds a tile_size parameter, when it is used the tiles are
cyclically distributed onto the threads of the scf.foreach_thread op.
Differential Revision: https://reviews.llvm.org/D139474
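One plausible reading of "cyclically distributed" is a round-robin assignment: thread t handles tiles t, t + numThreads, t + 2 * numThreads, and so on. The sketch below illustrates that reading only; the function `tilesForThread` and its parameters are hypothetical and are not taken from the linalg transform.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: indices of the tiles assigned to `Thread` when
// ceil(Size / TileSize) tiles are dealt out round-robin to NumThreads threads.
inline std::vector<std::size_t> tilesForThread(std::size_t Size,
                                               std::size_t TileSize,
                                               std::size_t NumThreads,
                                               std::size_t Thread) {
  assert(TileSize > 0 && NumThreads > 0 && Thread < NumThreads);
  std::size_t NumTiles = (Size + TileSize - 1) / TileSize; // ceiling division
  std::vector<std::size_t> Tiles;
  for (std::size_t T = Thread; T < NumTiles; T += NumThreads)
    Tiles.push_back(T);
  return Tiles;
}
```

With a size of 100, a tile size of 10, and 4 threads, there are 10 tiles; thread 1 gets tiles 1, 5, and 9, and thread 3 gets tiles 3 and 7.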
Koakuma [Wed, 7 Dec 2022 18:25:38 +0000 (13:25 -0500)]
[SPARC] Mark the %g0 register as constant & use it to materialize zeros
Materialize zeros by copying from %g0, which is now marked as constant.
This makes it possible for some common operations (like integer negation) to be
performed in fewer instructions.
This continues @arichardson's patch at D132561.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D138887
Joe Nash [Fri, 2 Dec 2022 20:36:39 +0000 (15:36 -0500)]
[AMDGPU] Enable OMod on more VOP3 instructions
OMod was disabled if OpSel was enabled, but that restriction is more
specific than necessary. Any VOP3 with float operands can use OMod.
On GFX11, FMAC_F16_e64 can use op_sel.
Previously, SIFoldOperands and convertToThreeAddress were accidentally correct when
they reinterpreted the zero OMod operand on V_FMAC_F16_e64 as the OpSel operand on
V_FMA_F16_gfx9_e64. Now we explicitly add op_sel if required.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D139469
Alex Richardson [Sun, 20 Nov 2022 11:38:16 +0000 (11:38 +0000)]
Overload all llvm.annotation intrinsics for globals argument
The global constant arguments could be in a different address space
than the first argument, so we have to add another overloaded argument.
This patch was originally made for CHERI LLVM (where globals can be in
address space 200), but it also appears to be useful for in-tree targets
as can be seen from the test diffs.
Differential Revision: https://reviews.llvm.org/D138722
Alex Richardson [Fri, 25 Nov 2022 16:00:04 +0000 (16:00 +0000)]
Add a baseline test for llvm.annotation IR upgrade
This will be overloaded in the next commit.
Amara Emerson [Sun, 13 Nov 2022 09:43:04 +0000 (01:43 -0800)]
[GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug.
We currently have a bug where the legalizer, when dealing with phi operands,
may create instructions in the phi's incoming blocks at points which are effectively
dead due to a possible exception throw.
Say we have:
  throwbb:
    EH_LABEL
    x0 = %callarg1
    BL @may_throw_call
    EH_LABEL
    B returnbb
  bb:
    %v = phi i1 %true, throwbb, %false....
When legalizing we may need to widen the i1 %true value, and to do that we need
to create new extension instructions in the incoming block. Our insertion point
currently is the MBB::getFirstTerminator() which puts the IP before the unconditional
branch terminator in throwbb. These extensions may never be executed if the call
throws, and therefore we need to emit them before the call (but not too early, since
our new instruction may need values defined within throwbb as well).
  throwbb:
    EH_LABEL
    x0 = %callarg1
    BL @may_throw_call
    EH_LABEL
    %true = G_CONSTANT i32 1 ; <<<-- ruh'roh, this never executes if may_throw_call() throws!
    B returnbb
  bb:
    %v = phi i32 %true, throwbb, %false....
To fix this, I've added two new instructions. The main idea is that G_INVOKE_REGION_START
is a terminator, which tries to model the fact that in the IR, the original invoke inst
is actually a terminator as well. By using that as the new insertion point, we
make sure to place new instructions on always executing paths.
Unfortunately we still need to make the legalizer use a new insertion point API
that I've added, since the existing `getFirstTerminator()` method does a reverse
walk up the block, and any non-terminator instructions cause it to bail out. To
avoid impacting compile time for all `getFirstTerminator()` uses, I've added a new
method that does a forward walk instead.
Differential Revision: https://reviews.llvm.org/D137905
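The difference between the reverse walk and the forward walk described above can be sketched on a toy block. Everything here is hypothetical: `Inst` is a stand-in with a single flag, and the block layout in the usage note (a terminator placed before the call) is an assumption made for illustration, not the exact MIR the patch produces.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine instruction.
struct Inst {
  bool IsTerminator;
};

// Reverse walk: scan back from the end while instructions are
// terminators, and stop at the first non-terminator.
inline std::size_t firstTerminatorReverse(const std::vector<Inst> &BB) {
  std::size_t I = BB.size();
  while (I > 0 && BB[I - 1].IsTerminator)
    --I;
  return I; // start of the trailing run of terminators
}

// Forward walk: the first terminator seen from the top of the block.
inline std::size_t firstTerminatorForward(const std::vector<Inst> &BB) {
  for (std::size_t I = 0; I < BB.size(); ++I)
    if (BB[I].IsTerminator)
      return I;
  return BB.size();
}
```

For a block laid out as non-terminator, non-terminator, terminator, non-terminator, non-terminator, terminator, the reverse walk stops at the final branch (index 5), while the forward walk finds the earlier terminator (index 2). Inserting at the earlier point keeps new instructions ahead of the call that may throw.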
Philip Reames [Wed, 7 Dec 2022 18:24:48 +0000 (10:24 -0800)]
[RISCV][InsertVSETVLI] Generalize scalar move rule for when AVL is unchanged
By definition, if the scalar move's AVL is the same value as the prior AVL, it is zero exactly when the prior AVL is zero. This generalizes the existing code to the case where the scalar move has a register AVL which is unknown, but unchanged from the preceding instruction.
This doesn't cause any interesting diffs on its own, but another patch makes this case much more common. Split off to reduce a future diff.
Brett Wilson [Wed, 7 Dec 2022 18:22:05 +0000 (10:22 -0800)]
Revert "[clang-doc] Add template support."
Causes a build failure in YAML specializations.
This reverts commit
0f6dbb5f164662c3e6a167a89e7a89f07c60e32b.
Craig Topper [Wed, 7 Dec 2022 18:03:51 +0000 (10:03 -0800)]
[RISCV] Remove pseudos for whole register load, store, and move.
The MC layer instructions have the correct register classes, and
the pseudos don't have any additional operands. So there doesn't
seem to be any reason for them to exist.
The pseudos were incorrectly going through code in RISCVMCInstLower
that converted LMUL>1 register classes to LMUL1 register class.
This makes the MCInst technically malformed, and prevented the
vl2r.v, vl4r.v, and vl8r.v InstAliases from matching. This accounts
for all of the .ll test diffs.
Differential Revision: https://reviews.llvm.org/D139511
Haojian Wu [Wed, 7 Dec 2022 17:59:17 +0000 (18:59 +0100)]
Fix an -Wunused-variable warning in release build, NFC
Simon Pilgrim [Wed, 7 Dec 2022 17:50:28 +0000 (17:50 +0000)]
[llvm-exegesis][x86] Add option to prevent use of xmm8-xmm15 upper SSE registers
Noticed while trying to use llvm-exegesis to get some accurate capture numbers on some old Atom/Silvermont hardware as part of the work with D103695.
These targets' frontends are particularly poor and the use of the xmm8-xmm15 SSE registers results in longer instruction encodings which were affecting the latency/throughput estimates.
Thanks to @lebedev.ri for the --skip-measurements command line argument which made testing much easier!
Differential Revision: https://reviews.llvm.org/D138832
Roman Lebedev [Wed, 7 Dec 2022 17:42:33 +0000 (20:42 +0300)]
[NFC] Port all (but one) LICM tests to `-passes=` syntax
Nikolas Klauser [Thu, 1 Dec 2022 22:17:53 +0000 (23:17 +0100)]
[libc++] Implement P0339R6 (polymorphic_allocator<> as a vocabulary type)
Reviewed By: ldionne, #libc
Spies: LRFLEW, libcxx-commits, arichardson, krytarowski, jdoerfert
Differential Revision: https://reviews.llvm.org/D137739
Brett Wilson [Mon, 7 Nov 2022 23:07:56 +0000 (15:07 -0800)]
[clang-doc] Add template support.
Reads template information from the AST and adds template parameters and
specialization information to the corresponding clang-doc structures.
Add a "QualName" to the Reference struct which includes the full
qualified type name. The Reference object represents a link in the
HTML/MD generators so is based on the unqualified name. But this does
not encode C-V qualifiers or template information that decorate the
name. The new QualName member encodes all of this information and also
makes it easier for the generators or downstream YAML consumers to
generate the full name (before they had to process the "Path").
In test code that was changed, remove made-up paths to built-in types
like "int". In addition to slightly cleaning up the code, these types
do not have paths in real execution, and generating incorrect references
to nonexistent data may complicate future changes in the generators.
Differential Revision: https://reviews.llvm.org/D139154
Mitch Phillips [Wed, 7 Dec 2022 17:46:22 +0000 (09:46 -0800)]
Disable flaky MLIR async.mlir test on ASan.
Test keeps flaking on the ASan buildbot:
https://github.com/llvm/llvm-project/issues/57231
Craig Topper [Wed, 7 Dec 2022 17:33:40 +0000 (09:33 -0800)]
[RISCV] Replace uses of hasStdExtC with COrZca.
Except MakeCompressible which will need more work.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D139504
Mirko Brkusanin [Wed, 7 Dec 2022 17:20:41 +0000 (18:20 +0100)]
[AMDGPU][GlobalISel] Fix legalizing image intrinsics for new types
We no longer need to increase vector size to 16 for intrinsics that use more
than 8 vgprs for addr. There is no image intrinsic that needs more than 12
so all currently existing cases will be covered. Using incorrect size was
causing an error in instruction selection because instructions were updated
to require new types (9x32, 10x32, 11x32, 12x32).
Differential Revision: https://reviews.llvm.org/D139546
Roman Lebedev [Wed, 7 Dec 2022 17:14:13 +0000 (20:14 +0300)]
[llvm-exegesis] Dry run mode
Sometimes we only want to ensure that we can produce snippets (all the way
through `SnippetRepetitor`!), but don't care for the execution.
E.g. all of our tests are this way.
I've built LLVM without PFM and removed my CPU from `X86PfmCounters.td`,
and this produces the expected results in that configuration.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D139448
Roman Lebedev [Wed, 7 Dec 2022 16:43:17 +0000 (19:43 +0300)]
[NFC] Port all (but one) LoopUnroll tests to `-passes=` syntax
Mitch Phillips [Wed, 7 Dec 2022 16:50:33 +0000 (08:50 -0800)]
Fix dwarf5-lazy-dwo.c for the default c target not being c99.
My host compiler is clang version 15.0.0, which uses -std=c11 by
default. The test asserts that the language is 'c99', and so the test
fails locally.
Update the test to be explicit about compiling with 'c99'.
Reviewed By: Eric
Differential Revision: https://reviews.llvm.org/D139461
LLVM GN Syncbot [Wed, 7 Dec 2022 16:50:20 +0000 (16:50 +0000)]
[gn build] Port
d184958bad5c
Krzysztof Parzyszek [Wed, 7 Dec 2022 16:18:35 +0000 (08:18 -0800)]
[IRReader] Convert Optional in DataLayoutCallbackTy to std::optional
Joe Nash [Wed, 7 Dec 2022 16:34:58 +0000 (11:34 -0500)]
[AMDGPU] Add gfx11 runline to omod test. NFC
Phoebe Wang [Wed, 7 Dec 2022 16:17:45 +0000 (00:17 +0800)]
[X86] Pre-commit test for pr59305
Mark de Wever [Wed, 19 Oct 2022 17:50:48 +0000 (19:50 +0200)]
[libc++][format] Adds range-default-formatter.
This adds an incomplete version where the specializations for the
format_kinds are disabled dummy formatters.
Implements part of
- P2585R0 Improving default container formatting
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D137271
serge-sans-paille [Wed, 7 Dec 2022 16:15:14 +0000 (17:15 +0100)]
Revert "Store OptTable::Info::Name as a StringRef"
Another revert, for another set of issues I don't reproduce locally...
see https://lab.llvm.org/buildbot/#/builders/139/builds/32327
This reverts commit
bdfa3100dc3ea9e9ce4d3d4100ea6bb4c3fa2b81.
Dmitry Kurtaev [Wed, 7 Dec 2022 16:19:33 +0000 (08:19 -0800)]
[RISCV][JitLink] Propagate error from Expected<T> result during R_RISCV_PCREL_HI20 parsing
related issue: https://github.com/llvm/llvm-project/issues/59139
Differential Revision: https://reviews.llvm.org/D138781
Yitzhak Mandelbaum [Mon, 5 Dec 2022 20:38:55 +0000 (20:38 +0000)]
[clang][dataflow] Support (in)equality operators in `optional` model.
This patch adds interpretation of the overloaded equality and inequality
operators available for the optional types.
Fixes issue #57253.
Differential Revision: https://reviews.llvm.org/D139360
Emilio Cota [Tue, 6 Dec 2022 15:30:06 +0000 (10:30 -0500)]
[mlir][bufferize] lower allocation alignment from 128 to 64 bytes
While it is unlikely to matter in practice, there is no reason
for this value to be larger than it should be. 64 bytes is the
size of a cache line in most machines, and we can fit a full
512-bit vector in it.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D139434
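The arithmetic behind the claim above is that 512 bits / 8 = 64 bytes, so a 64-byte aligned buffer can hold a full 512-bit vector without straddling an alignment boundary. A small sketch, with hypothetical names (`kBufferAlignment`, `Vec512`, `isAligned`) that are not taken from the bufferization code:

```cpp
#include <cstddef>
#include <cstdint>

// Assumed alignment value, matching the 64 bytes named in the commit.
constexpr std::size_t kBufferAlignment = 64;

// Hypothetical 512-bit vector payload: 512 bits / 8 = 64 bytes.
struct alignas(64) Vec512 {
  unsigned char Bytes[64];
};

static_assert(sizeof(Vec512) * 8 == 512, "Vec512 must be 512 bits wide");
static_assert(sizeof(Vec512) <= kBufferAlignment,
              "a full vector fits in one aligned block");

// True if the pointer is a multiple of the given alignment.
inline bool isAligned(const void *P, std::size_t Alignment) {
  return reinterpret_cast<std::uintptr_t>(P) % Alignment == 0;
}
```

Whether 64 bytes matches the cache line size depends on the machine; the commit message only claims this for most machines.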
Doru Bercea [Tue, 6 Dec 2022 19:07:57 +0000 (13:07 -0600)]
Automate tests.
chenglin.bi [Wed, 7 Dec 2022 15:54:23 +0000 (23:54 +0800)]
[InstCombine] fold more icmp + select patterns by distributive laws
follow up D139076, add icmp with not only eq/ne, but also gt/lt/ge/le.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D139253
chenglin.bi [Wed, 7 Dec 2022 02:16:22 +0000 (10:16 +0800)]
[Instcombine] Canonicalize ~((A & B) ^ (A | ?)) -> (A & B) | ~(A | ?)
~((A & B) ^ (A | ?)) -> (A & B) | ~(A | ?)
https://alive2.llvm.org/ce/z/JHN2p4
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D139299
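The fold works because every bit set in (A & B) is also set in (A | ?), so the xor only clears those bits from (A | ?). As a sanity check separate from the Alive2 proof linked above, the identity can be verified exhaustively on 8-bit values. The function name `foldHoldsFor8Bit` is hypothetical, and `C` stands in for the `?` operand.

```cpp
#include <cstdint>

// Exhaustively checks ~((A & B) ^ (A | C)) == (A & B) | ~(A | C)
// for all 8-bit values of A, B, and C.
inline bool foldHoldsFor8Bit() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B)
      for (unsigned C = 0; C < 256; ++C) {
        // Truncate to 8 bits so that ~ only affects the low byte.
        uint8_t Lhs = static_cast<uint8_t>(~((A & B) ^ (A | C)));
        uint8_t Rhs = static_cast<uint8_t>((A & B) | ~(A | C));
        if (Lhs != Rhs)
          return false;
      }
  return true;
}
```

Since the operations are purely bitwise, each bit position behaves independently, so checking 8-bit values covers the same cases as any wider type.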
serge-sans-paille [Sun, 4 Dec 2022 08:33:14 +0000 (09:33 +0100)]
Store OptTable::Info::Name as a StringRef
This is a recommit of
8ae18303f97d5dcfaecc90b4d87effb2011ed82e,
with a few cleanups.
This avoids implicit conversion to StringRef at several points, which in
turn avoids redundant calls to strlen.
As a side effect, this greatly simplifies the implementation of
StrCmpOptionNameIgnoreCase.
It also eventually gives a consistent, humble speedup in compilation
time (timing updated since original commit).
https://llvm-compile-time-tracker.com/compare.php?from=
76fcfea283472a80356d87c89270b0e2d106b54c&to=
b70eb1f347f22fe4d2977360c4ed701eabc43994&stat=instructions:u
Differential Revision: https://reviews.llvm.org/D139274
Daniel Kiss [Wed, 7 Dec 2022 14:45:42 +0000 (15:45 +0100)]
[AArch64] Add __ARM_FEATURE_BTI and __ARM_FEATURE_PAUTH
Macros are added to ACLE[1] and already added to ARM but these two are missing from AArch64.
[1] https://github.com/ARM-software/acle/blob/main/main/acle.md#changes-between-acle-q3-2021-and-acle-q4-2021
Reviewed By: chill
Differential Revision: https://reviews.llvm.org/D139445
Matthias Springer [Wed, 7 Dec 2022 15:22:07 +0000 (16:22 +0100)]
[mlir][tensor] Support parallel_insert_slice in reassociative reshape folder
Differential Revision: https://reviews.llvm.org/D139540
Hans Wennborg [Tue, 6 Dec 2022 14:21:43 +0000 (15:21 +0100)]
Handle char{8,16,32} and wchar_t in ASTContext::getIntegerRank()
They were previously not handled, causing the llvm_unreachable with
"getIntegerRank(): not a built-in integer" message to be hit.
The test case is derived from how we hit it in Chromium
(crbug.com/1396142). I tried to come up with something more direct, but
this is the best I could find.
Differential revision: https://reviews.llvm.org/D139428
Jens Massberg [Thu, 1 Dec 2022 16:49:41 +0000 (17:49 +0100)]
[clang] Correctly handle by-reference capture with an initializer that is a pack expansion in lambdas.
Ensure that the correct information about whether an init-capture of a lambda
is passed by reference or by copy is used. This information is already computed
and has to be passed to the place where `NewInitCaptureType` is
created.
Before this fix it has been checked whether the VarDecl is a reference
type. This doesn't work for pack expansions, as the information
whether it is passed by reference or by copy is stored at the pattern of
a `PackExpansionType` and not at the type itself.
However, as the information has been already computed, we just have to
pass it.
Add tests that lambda captures with var decls which are reference types
are created in the AST and a diagnostics test for pack expansions.
Fixes #49266
Differential Revision: https://reviews.llvm.org/D139125
Guillaume Chatelet [Wed, 7 Dec 2022 14:54:03 +0000 (14:54 +0000)]
[reland][Alignment] Use Align in MCStreamer emitZeroFill/emitLocalCommonSymbol
Before performing this change, I checked that `ByteAlignment` was never `0` inside `MCAsmStreamer::emitZeroFill` and `MCAsmStreamer::emitLocalCommonSymbol`.
I believe it is NFC as `0` values are illegal in `emitZeroFill` anyway, since `Log2(ByteAlignment)` would be undefined.
And currently, all calls to `emitLocalCommonSymbol` are provably `>0`.
Differential Revision: https://reviews.llvm.org/D139439
Guillaume Chatelet [Wed, 7 Dec 2022 14:48:40 +0000 (14:48 +0000)]
Revert D139439 "[Alignment] Use Align in MCStreamer emitZeroFill/emitLocalCommonSymbol"
This breaks Windows bots with
`warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)`
Some shift operators are lacking a proper literal suffix ('1ULL' instead of
'1'). Will reland once fixed.
This reverts commit
c621c1a8e81856e6bf2be79714767d80466e9ede.
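The warning quoted above is about code of the following shape: a plain `1` has type `int`, so `1 << N` is evaluated as a 32-bit shift and only the result is widened. A sketch of the suffix fix, using a hypothetical helper `alignmentFromLog2` that is not the MCStreamer code:

```cpp
#include <cassert>
#include <cstdint>

// Converts a log2 alignment back to a byte alignment.
inline uint64_t alignmentFromLog2(unsigned Log2Value) {
  assert(Log2Value < 64 && "shift amount out of range");
  // Writing `1 << Log2Value` here would perform a 32-bit shift and then
  // convert the result to 64 bits. The ULL suffix makes the shift
  // itself 64 bits wide.
  return 1ULL << Log2Value;
}
```

With the suffix, shift amounts of 32 and above produce the intended value instead of being undefined for a 32-bit `int`.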
Nikita Popov [Wed, 7 Dec 2022 14:31:17 +0000 (15:31 +0100)]
[SCEV] Use umin_seq for symbolic max BE count
We were using umin_seq when computing the exact BE count, but not
when computing the symbolic max BE count.
Will Dietz [Tue, 6 Dec 2022 20:44:12 +0000 (14:44 -0600)]
[mlir][Pass] Fix dropped statistics with nested adaptors.
When running in parallel, nesting more than once caused
statistics to be dropped.
Fix by also preparing "async" pass managers before merging,
as they may also have "async" pass managers within.
Add test checking reported statistics have expected values
with and without threading enabled.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D139459